1
Rider NL, Cahill G, Motazedi T, Wei L, Kurian A, Noroski LM, Seeborg FO, Chinn IK, Roberts K. PI Prob: A risk prediction and clinical guidance system for evaluating patients with recurrent infections. PLoS One 2021; 16:e0237285. [PMID: 33591972 PMCID: PMC7886140 DOI: 10.1371/journal.pone.0237285] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/29/2020] [Accepted: 01/16/2021] [Indexed: 12/12/2022] Open
Abstract
Background: Primary immunodeficiency diseases represent an expanding set of heterogeneous conditions which are difficult to recognize clinically. Diagnostic rates outside of the newborn period have not changed appreciably. This concern underscores a need for novel methods of disease detection.
Objective: We built a Bayesian network to provide real-time risk assessment for primary immunodeficiency and to facilitate prescriptive analytics for initiating the most appropriate diagnostic work-up. Our goal is to improve diagnostic rates for primary immunodeficiency and shorten time to diagnosis. We aimed to use readily available health record data and a small training dataset to prove utility in diagnosing patients with relatively rare features.
Methods: We extracted data from the Texas Children’s Hospital electronic health record on a large population of primary immunodeficiency patients (n = 1762) and an appropriately matched set of controls (n = 1698). From the cohorts, clinically relevant prior probabilities were calculated, enabling construction of a Bayesian network probabilistic model (PI Prob). Our model was constructed with clinical-immunology domain expertise, trained on a balanced cohort of 100 cases-controls and validated on an unseen balanced cohort of 150 cases-controls. Performance was measured by area under the receiver operating characteristic curve (AUROC). We also compared our network performance to classic machine learning model performance on the same dataset.
Results: PI Prob was accurate in distinguishing immunodeficiency patients from controls (AUROC = 0.945; p<0.0001) at a risk threshold of ≥6%. Additionally, the model was 89% accurate for categorizing validation cohort members into appropriate International Union of Immunological Societies diagnostic categories. Our network outperformed three other machine learning models and provides superior transparency with a prescriptive output element.
Conclusion: Artificial intelligence methods can classify risk for primary immunodeficiency and guide management. PI Prob enables accurate, objective decision making about risk and guides the user towards the appropriate diagnostic evaluation for patients with recurrent infections. Probabilistic models can be trained with small datasets, underscoring their utility for rare disease detection given appropriate domain expertise for feature selection and network construction.
Affiliation(s)
- Nicholas L. Rider
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Section of Immunology, Allergy and Retrovirology, Texas Children’s Hospital, Houston, Texas, United States of America
  - Department of Information Services, Texas Children’s Hospital, Houston, Texas, United States of America
- Gina Cahill
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Section of Immunology, Allergy and Retrovirology, Texas Children’s Hospital, Houston, Texas, United States of America
- Tina Motazedi
  - Division of Allergy and Immunology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Lei Wei
  - Department of Information Services, Texas Children’s Hospital, Houston, Texas, United States of America
- Ashok Kurian
  - Department of Information Services, Texas Children’s Hospital, Houston, Texas, United States of America
- Lenora M. Noroski
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Section of Immunology, Allergy and Retrovirology, Texas Children’s Hospital, Houston, Texas, United States of America
- Filiz O. Seeborg
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Section of Immunology, Allergy and Retrovirology, Texas Children’s Hospital, Houston, Texas, United States of America
- Ivan K. Chinn
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Section of Immunology, Allergy and Retrovirology, Texas Children’s Hospital, Houston, Texas, United States of America
- Kirk Roberts
  - The University of Texas School of Biomedical Informatics, Houston, Texas, United States of America
2
A Multi-Method Approach for Proteomic Network Inference in 11 Human Cancers. PLoS Comput Biol 2016; 12:e1004765. [PMID: 26928298 PMCID: PMC4771175 DOI: 10.1371/journal.pcbi.1004765] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 06/07/2015] [Accepted: 01/20/2016] [Indexed: 12/27/2022] Open
Abstract
Protein expression and post-translational modification levels are tightly regulated in neoplastic cells to maintain cellular processes known as 'cancer hallmarks'. The first Pan-Cancer initiative of The Cancer Genome Atlas (TCGA) Research Network has aggregated protein expression profiles for 3,467 patient samples from 11 tumor types using the antibody-based reverse-phase protein array (RPPA) technology. The resultant proteomic data can be utilized to computationally infer protein-protein interaction (PPI) networks and to study the commonalities and differences across tumor types. In this study, we compare the performance of 13 established network inference methods in their capacity to retrieve the curated Pathway Commons interactions from RPPA data. We observe that no single method has the best performance in all tumor types, but a group of six methods, including diverse techniques such as correlation, mutual information, and regression, consistently ranks highly among the tested methods. We utilize the high-performing methods to obtain a consensus network and identify four robust and densely connected modules that reveal biological processes as well as suggest antibody-related technical biases. Mapping the consensus network interactions to Reactome gene lists confirms the pan-cancer importance of signal transduction pathways, innate and adaptive immune signaling, cell cycle, metabolism, and DNA repair, and also suggests several biological processes that may be specific to a subset of tumor types. Our results illustrate the utility of the RPPA platform as a tool to study proteomic networks in cancer.