Prediction of tuberculosis using an automated machine learning platform for models trained on synthetic data
High-quality medical data is critical to the development and implementation of machine learning (ML) algorithms in healthcare; however, security and privacy concerns continue to limit access. We sought to determine the utility of "synthetic data" in training ML algorithms for the detectio...
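Main Authors: | Hooman H Rashidi, Imran H Khan, Luke T Dang, Samer Albahra, Ujjwal Ratan, Nihir Chadderwala, Wilson To, Prathima Srinivas, Jeffery Wajda, Nam K Tran |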
---|---|
Format: | Book |
Published: | Elsevier, 2022-01-01T00:00:00Z. |
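Subjects: | artificial intelligence; biomarkers; data accessibility; electronic medical record; privacy; simulation; Computer applications to medicine. Medical informatics; Pathology |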
Online Access: | Connect to this object online. |
MARC
LEADER | 00000 am a22000003u 4500 | ||
---|---|---|---|
001 | doaj_8eef777f25f34c9d9515c4b0cc05ffb4 | ||
042 | |a dc | ||
100 | 1 | 0 | |a Hooman H Rashidi |e author |
700 | 1 | 0 | |a Imran H Khan |e author |
700 | 1 | 0 | |a Luke T Dang |e author |
700 | 1 | 0 | |a Samer Albahra |e author |
700 | 1 | 0 | |a Ujjwal Ratan |e author |
700 | 1 | 0 | |a Nihir Chadderwala |e author |
700 | 1 | 0 | |a Wilson To |e author |
700 | 1 | 0 | |a Prathima Srinivas |e author |
700 | 1 | 0 | |a Jeffery Wajda |e author |
700 | 1 | 0 | |a Nam K Tran |e author |
245 | 0 | 0 | |a Prediction of tuberculosis using an automated machine learning platform for models trained on synthetic data |
260 | |b Elsevier, |c 2022-01-01T00:00:00Z. | ||
500 | |a 2153-3539 | ||
500 | |a 10.4103/jpi.jpi_75_21 | ||
520 | |a High-quality medical data is critical to the development and implementation of machine learning (ML) algorithms in healthcare; however, security and privacy concerns continue to limit access. We sought to determine the utility of "synthetic data" in training ML algorithms for the detection of tuberculosis (TB) from inflammatory biomarker profiles. A retrospective dataset (A) comprising 278 patients was used to generate synthetic datasets (B, C, and D) for training models prior to secondary validation on a generalization dataset. ML models trained and validated on Dataset A (real) demonstrated an accuracy of 90%, a sensitivity of 89% (95% CI, 83-94%), and a specificity of 100% (95% CI, 81-100%). Models trained using the optimal synthetic dataset B showed an accuracy of 91%, a sensitivity of 93% (95% CI, 87-96%), and a specificity of 77% (95% CI, 50-93%). Synthetic datasets C and D displayed diminished performance measures (respective accuracies of 71% and 54%). This pilot study highlights the promise of synthetic data as an expedited means for ML algorithm development. | ||
546 | |a EN | ||
690 | |a artificial intelligence | ||
690 | |a biomarkers | ||
690 | |a data accessibility | ||
690 | |a electronic medical record | ||
690 | |a privacy | ||
690 | |a simulation | ||
690 | |a Computer applications to medicine. Medical informatics | ||
690 | |a R858-859.7 | ||
690 | |a Pathology | ||
690 | |a RB1-214 | ||
655 | 7 | |a article |2 local | |
786 | 0 | |n Journal of Pathology Informatics, Vol 13, Iss 1, Pp 10-10 (2022) | |
787 | 0 | |n http://www.jpathinformatics.org/article.asp?issn=2153-3539;year=2022;volume=13;issue=1;spage=10;epage=10;aulast=Rashidi | |
787 | 0 | |n https://doaj.org/toc/2153-3539 | |
856 | 4 | 1 | |u https://doaj.org/article/8eef777f25f34c9d9515c4b0cc05ffb4 |z Connect to this object online. |
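
The abstract in field 520 reports accuracy, sensitivity, and specificity with 95% confidence intervals. As a rough illustration of how such figures are typically derived from a confusion matrix, here is a minimal Python sketch. The record does not state which interval method the authors used, so the Wilson score interval is an assumption here, and the function names and the counts in the usage line are hypothetical placeholders, not values from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 gives an approximate 95% interval. The interval method is an
    assumption; the catalog record does not say which one the study used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for illustration only; NOT taken from the study.
metrics = diagnostic_metrics(tp=90, fn=10, tn=18, fp=2)
```

With few negative cases, the specificity interval comes out wide, which is consistent with the broad specificity intervals quoted in the abstract.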