1
Jones SE, Bradwell KR, Chan LE, McMurry JA, Olson-Chen C, Tarleton J, Wilkins KJ, Ly V, Ljazouli S, Qin Q, Faherty EG, Lau YK, Xie C, Kao YH, Liebman MN, Mariona F, Challa AP, Li L, Ratcliffe SJ, Haendel MA, Patel RC, Hill EL. Who is pregnant? Defining real-world data-based pregnancy episodes in the National COVID Cohort Collaborative (N3C). JAMIA Open 2023; 6:ooad067. [PMID: 37600074 PMCID: PMC10432357 DOI: 10.1093/jamiaopen/ooad067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 05/12/2023] [Accepted: 08/08/2023] [Indexed: 08/22/2023] [Open access]
Abstract
Objectives: To define pregnancy episodes and estimate gestational age within electronic health record (EHR) data from the National COVID Cohort Collaborative (N3C).
Materials and Methods: We developed a comprehensive approach, named Hierarchy and rule-based pregnancy episode Inference integrated with Pregnancy Progression Signatures (HIPPS), and applied it to EHR data in the N3C (January 1, 2018-April 7, 2022). HIPPS combines: (1) an extension of a previously published pregnancy episode algorithm, (2) a novel algorithm to detect gestational age-specific signatures of a progressing pregnancy for further episode support, and (3) pregnancy start date inference. Clinicians performed validation of HIPPS on a subset of episodes. We then generated pregnancy cohorts based on gestational age precision and pregnancy outcomes for assessment of accuracy and comparison of COVID-19 and other characteristics.
Results: We identified 628 165 pregnant persons with 816 471 pregnancy episodes, of which 52.3% were live births, 24.4% were other outcomes (stillbirth, ectopic pregnancy, abortions), and 23.3% had unknown outcomes. Clinician validation agreed 98.8% with HIPPS-identified episodes. We were able to estimate start dates within 1 week of precision for 475 433 (58.2%) episodes. 62 540 (7.7%) episodes had incident COVID-19 during pregnancy.
Discussion: HIPPS provides measures of support for pregnancy-related variables such as gestational age and pregnancy outcomes based on N3C data. Gestational age precision allows researchers to find time to events with reasonable confidence.
Conclusion: We have developed a novel and robust approach for inferring pregnancy episodes and gestational age that addresses data inconsistency and missingness in EHR data.
Affiliation(s)
- Sara E Jones
  - Office of Data Science and Emerging Technologies, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD 20852, United States
- Lauren E Chan
  - College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 97331, United States
- Julie A McMurry
  - Department of Biomedical Informatics, University of Colorado, Anschutz Medical Campus, Aurora, CO 80045, United States
- Courtney Olson-Chen
  - Department of Obstetrics and Gynecology, University of Rochester Medical Center, Rochester, NY 14620, United States
- Jessica Tarleton
  - Department of Obstetrics and Gynecology, Medical University of South Carolina, Charleston, SC 29425, United States
- Kenneth J Wilkins
  - Biostatistics Program, Office of the Director, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, United States
- Victoria Ly
  - Department of Obstetrics and Gynecology, University of Rochester Medical Center, Rochester, NY 14620, United States
- Saad Ljazouli
  - Palantir Technologies, Denver, CO 80202, United States
- Qiuyuan Qin
  - Department of Public Health Sciences, University of Rochester Medical Center, Rochester, NY 14618, United States
- Emily Groene Faherty
  - School of Public Health, University of Minnesota, Minneapolis, MN 55455, United States
- Catherine Xie
  - Department of Public Health Sciences, University of Rochester Medical Center, Rochester, NY 14618, United States
- Yu-Han Kao
  - Sema4, Stamford, CT 06902, United States
- Federico Mariona
  - Beaumont Hospital, Dearborn, MI 48124, United States
  - Wayne State University, Detroit, MI 48202, United States
- Anup P Challa
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, TN 37212, United States
- Li Li
  - Sema4, Stamford, CT 06902, United States
- Sarah J Ratcliffe
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22903, United States
- Melissa A Haendel
  - College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 97331, United States
- Rena C Patel
  - Department of Medicine and Global Health, University of Washington, Seattle, WA 98105, United States
- Elaine L Hill
  - Department of Obstetrics and Gynecology, University of Rochester Medical Center, Rochester, NY 14620, United States
  - Department of Public Health Sciences, University of Rochester Medical Center, Rochester, NY 14618, United States
2
Reinhold WC, Wilson K, Elloumi F, Bradwell KR, Ceribelli M, Varma S, Wang Y, Duveau D, Menon N, Trepel J, Zhang X, Klumpp-Thomas C, Micheal S, Shinn P, Luna A, Thomas C, Pommier Y. CellMinerCDB: NCATS Is a Web-Based Portal Integrating Public Cancer Cell Line Databases for Pharmacogenomic Explorations. Cancer Res 2023; 83:1941-1952. [PMID: 37140427 PMCID: PMC10330642 DOI: 10.1158/0008-5472.can-22-2996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 02/27/2023] [Accepted: 04/25/2023] [Indexed: 05/05/2023]
Abstract
Major advances have been made in the field of precision medicine for treating cancer. However, many open questions remain that need to be answered to realize the goal of matching every patient with cancer to the most efficacious therapy. To facilitate these efforts, we have developed CellMinerCDB: National Center for Advancing Translational Sciences (NCATS; https://discover.nci.nih.gov/rsconnect/cellminercdb_ncats/), which makes available activity information for 2,675 drugs and compounds, including multiple nononcology drugs and 1,866 drugs and compounds unique to the NCATS. CellMinerCDB: NCATS comprises 183 cancer cell lines, with 72 unique to NCATS, including some from previously understudied tissues of origin. Multiple forms of data from different institutes are integrated, including single and combination drug activity, DNA copy number, methylation and mutation, transcriptome, protein levels, histone acetylation and methylation, metabolites, CRISPR, and miscellaneous signatures. Curation of cell lines and drug names enables cross-database (CDB) analyses. Comparison of the datasets is made possible by the overlap between cell lines and drugs across databases. Multiple univariate and multivariate analysis tools are built-in, including linear regression and LASSO. Examples have been presented here for the clinical topoisomerase I (TOP1) inhibitors topotecan and irinotecan/SN-38. This web application provides both substantial new data and significant pharmacogenomic integration, allowing exploration of interrelationships. SIGNIFICANCE CellMinerCDB: NCATS provides activity information for 2,675 drugs in 183 cancer cell lines and analysis tools to facilitate pharmacogenomic research and to identify determinants of response.
Affiliation(s)
- William C. Reinhold
  - Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA
- Kelli Wilson
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Fathi Elloumi
  - Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA
- Michele Ceribelli
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Sudhir Varma
  - Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA
  - HiThru Analytics LLC, Princeton, NJ 08540, USA
- Yanghsin Wang
  - Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA
  - ICF International Inc., Fairfax, VA 22031, USA
- Damien Duveau
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Nikhil Menon
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Jane Trepel
  - Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA
- Xiaohu Zhang
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Samuel Micheal
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Paul Shinn
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Augustin Luna
  - cBio Center, Dana-Farber Cancer Institute and Department of Cell Biology, Harvard Medical School, Boston, MA 02215, USA
- Craig Thomas
  - National Center for Advancing Translational Sciences, NIH Bethesda, MD 20892, USA
- Yves Pommier
  - Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA
3
Jones S, Bradwell KR, Chan LE, Olson-Chen C, Tarleton J, Wilkins KJ, Qin Q, Faherty EG, Lau YK, Xie C, Kao YH, Liebman MN, Mariona F, Challa A, Li L, Ratcliffe SJ, McMurry JA, Haendel MA, Patel RC, Hill EL. Who is pregnant? defining real-world data-based pregnancy episodes in the National COVID Cohort Collaborative (N3C). medRxiv 2022:2022.08.04.22278439. [PMID: 35982668 PMCID: PMC9387155 DOI: 10.1101/2022.08.04.22278439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Objective: To define pregnancy episodes and estimate gestational aging within electronic health record (EHR) data from the National COVID Cohort Collaborative (N3C).
Materials and Methods: We developed a comprehensive approach, named Hierarchy and rule-based pregnancy episode Inference integrated with Pregnancy Progression Signatures (HIPPS), and applied it to EHR data in the N3C from 1 January 2018 to 7 April 2022. HIPPS combines: 1) an extension of a previously published pregnancy episode algorithm, 2) a novel algorithm to detect gestational aging-specific signatures of a progressing pregnancy for further episode support, and 3) pregnancy start date inference. Clinicians performed validation of HIPPS on a subset of episodes. We then generated three types of pregnancy cohorts based on the level of precision for gestational aging and pregnancy outcomes for comparison of COVID-19 and other characteristics.
Results: We identified 628,165 pregnant persons with 816,471 pregnancy episodes, of which 52.3% were live births, 24.4% were other outcomes (stillbirth, ectopic pregnancy, spontaneous abortions), and 23.3% had unknown outcomes. We were able to estimate start dates within one week of precision for 431,173 (52.8%) episodes. 66,019 (8.1%) episodes had incident COVID-19 during pregnancy. Across varying COVID-19 cohorts, patient characteristics were generally similar though pregnancy outcomes differed.
Discussion: HIPPS provides support for pregnancy-related variables based on EHR data for researchers to define pregnancy cohorts. Our approach performed well based on clinician validation.
Conclusion: We have developed a novel and robust approach for inferring pregnancy episodes and gestational aging that addresses data inconsistency and missingness in EHR data.
Affiliation(s)
- Sara Jones
  - Office of Data Science and Emerging Technologies, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD
- Lauren E Chan
  - College of Public Health and Human Sciences, Oregon State University, Corvallis, OR
- Courtney Olson-Chen
  - Department of Obstetrics and Gynecology, University of Rochester Medical Center, Rochester, NY
- Jessica Tarleton
  - Department of Obstetrics and Gynecology, Medical University of South Carolina, Charleston, SC
- Kenneth J Wilkins
  - Biostatistics Program, Office of the Director, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD
- Qiuyuan Qin
  - Department of Public Health Sciences, University of Rochester Medical Center, Rochester, NY
- Catherine Xie
  - Department of Public Health Sciences, University of Rochester Medical Center, Rochester, NY
- Federico Mariona
  - Beaumont Hospital, Dearborn, MI
  - Wayne State University, Detroit, MI
- Anup Challa
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, TN
- Sarah J Ratcliffe
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA
- Julie A McMurry
  - Department of Biomedical Informatics, University of Colorado, Anschutz Medical Campus, Aurora, CO
- Melissa A Haendel
  - Department of Biomedical Informatics, University of Colorado, Anschutz Medical Campus, Aurora, CO
- Rena C Patel
  - Department of Medicine and Global Health, University of Washington, Seattle, WA
- Elaine L Hill
  - Department of Obstetrics and Gynecology, University of Rochester Medical Center, Rochester, NY
  - Department of Public Health Sciences, University of Rochester Medical Center, Rochester, NY
4
Reese JT, Coleman B, Chan L, Blau H, Callahan TJ, Cappelletti L, Fontana T, Bradwell KR, Harris NL, Casiraghi E, Valentini G, Karlebach G, Deer R, McMurry JA, Haendel MA, Chute CG, Pfaff E, Moffitt R, Spratt H, Singh JA, Mungall CJ, Williams AE, Robinson PN. NSAID use and clinical outcomes in COVID-19 patients: a 38-center retrospective cohort study. Virol J 2022; 19:84. [PMID: 35570298 PMCID: PMC9107579 DOI: 10.1186/s12985-022-01813-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/23/2021] [Accepted: 05/04/2022] [Indexed: 01/27/2023] [Open access]
Abstract
BACKGROUND: Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use.
METHODS: A 38-center retrospective cohort study was performed that leveraged the harmonized, high-granularity electronic health record data of the National COVID Cohort Collaborative. A propensity-matched cohort of 19,746 COVID-19 inpatients was constructed by matching cases (treated with NSAIDs at the time of admission) and 19,746 controls (not treated) from 857,061 patients with COVID-19 available for analysis. The primary outcome of interest was COVID-19 severity in hospitalized patients, which was classified as: moderate, severe, or mortality/hospice. Secondary outcomes were acute kidney injury (AKI), extracorporeal membrane oxygenation (ECMO), invasive ventilation, and all-cause mortality at any time following COVID-19 diagnosis.
RESULTS: Logistic regression showed that NSAID use was not associated with increased COVID-19 severity (OR: 0.57, 95% CI: 0.53-0.61). Analysis of secondary outcomes using logistic regression showed that NSAID use was not associated with increased risk of all-cause mortality (OR: 0.51, 95% CI: 0.47-0.56), invasive ventilation (OR: 0.59, 95% CI: 0.55-0.64), AKI (OR: 0.67, 95% CI: 0.63-0.72), or ECMO (OR: 0.51, 95% CI: 0.36-0.7). In contrast, the odds ratios indicate reduced risk of these outcomes, but our quantitative bias analysis showed E-values of between 1.9 and 3.3 for these associations, indicating that comparatively weak or moderate confounder associations could explain away the observed associations.
CONCLUSIONS: Study interpretation is limited by the observational design. Recording of NSAID use may have been incomplete. Our study demonstrates that NSAID use is not associated with increased COVID-19 severity, all-cause mortality, invasive ventilation, AKI, or ECMO in COVID-19 inpatients. A conservative interpretation in light of the quantitative bias analysis is that there is no evidence that NSAID use is associated with risk of increased severity or the other measured outcomes. Our results confirm and extend analogous findings in previous observational studies using a large cohort of patients drawn from 38 centers in a nationally representative multicenter database.
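The quantitative bias analysis in this abstract reports E-values. As a rough illustration only, the sketch below computes an E-value for a point estimate with the standard formula E = RR + sqrt(RR × (RR − 1)), inverting estimates below 1 first. Treating a reported odds ratio directly as a risk ratio is an assumption made here for simplicity; the abstract does not say which conversion the authors applied, and `e_value` is a hypothetical helper name.

```python
import math


def e_value(ratio: float) -> float:
    """E-value for a point estimate on the risk-ratio scale.

    Assumption: the caller passes a risk ratio, or an odds ratio that is
    being treated as one. Protective estimates (< 1) are inverted first.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio < 1:
        ratio = 1.0 / ratio  # work on the side of the null above 1
    return ratio + math.sqrt(ratio * (ratio - 1.0))


# Example: the all-cause mortality estimate quoted in the abstract.
print(round(e_value(0.51), 2))
```

Under that simplifying assumption, the mortality estimate of 0.51 gives an E-value of about 3.33, which sits near the upper end of the 1.9 to 3.3 range quoted in the abstract.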
Affiliation(s)
- Justin T Reese
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Ben Coleman
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA
- Lauren Chan
  - Translational and Integrative Sciences Center, Oregon State University, Corvallis, OR, USA
- Hannah Blau
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Tiffany J Callahan
  - Computational Bioscience, University of Colorado Anschutz Medical Campus, Boulder, CO, USA
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Luca Cappelletti
  - AnacletoLab, Dipartimento Di Informatica, Università Degli Studi Di Milano, Milan, Italy
- Tommaso Fontana
  - AnacletoLab, Dipartimento Di Informatica, Università Degli Studi Di Milano, Milan, Italy
- Nomi L Harris
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Elena Casiraghi
  - AnacletoLab, Dipartimento Di Informatica, Università Degli Studi Di Milano, Milan, Italy
  - CINI, National Laboratory in Artificial Intelligence and Intelligent Systems-AIIS, Rome, Italy
- Giorgio Valentini
  - AnacletoLab, Dipartimento Di Informatica, Università Degli Studi Di Milano, Milan, Italy
  - CINI, National Laboratory in Artificial Intelligence and Intelligent Systems-AIIS, Rome, Italy
- Guy Karlebach
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Rachel Deer
  - University of Texas Medical Branch, Galveston, TX, USA
- Julie A McMurry
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Melissa A Haendel
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Christopher G Chute
  - Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, MD, USA
- Emily Pfaff
  - North Carolina Translational and Clinical Sciences Institute (NC TraCS), University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Richard Moffitt
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Heidi Spratt
  - University of Texas Medical Branch, Galveston, TX, USA
- Jasvinder A Singh
  - University of Alabama at Birmingham, Birmingham, AL, USA
  - Medicine Service, VA Medical Center, Birmingham, AL, USA
- Christopher J Mungall
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Andrew E Williams
  - Tufts Medical Center Clinical and Translational Science Institute, Tufts Medical Center, Boston, MA, USA
  - Institute for Clinical Research and Health Policy Studies, Tufts University School of Medicine, Boston, USA
  - OHDSI Center at the Roux Institute, Northeastern University, Boston, USA
- Peter N Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA
5
Bradwell KR, Wooldridge JT, Amor B, Bennett TD, Anand A, Bremer C, Yoo YJ, Qian Z, Johnson SG, Pfaff ER, Girvin AT, Manna A, Niehaus EA, Hong SS, Zhang XT, Zhu RL, Bissell M, Qureshi N, Saltz J, Haendel MA, Chute CG, Lehmann HP, Moffitt RA. Harmonizing units and values of quantitative data elements in a very large nationally pooled electronic health record (EHR) dataset. J Am Med Inform Assoc 2022; 29:1172-1182. [PMID: 35435957 PMCID: PMC9196692 DOI: 10.1093/jamia/ocac054] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/21/2021] [Revised: 03/25/2022] [Accepted: 04/08/2022] [Indexed: 11/24/2022] [Open access]
Abstract
Objective: The goals of this study were to harmonize data from electronic health records (EHRs) into common units, and impute units that were missing.
Materials and Methods: We used the National COVID Cohort Collaborative (N3C) table of laboratory measurement data, which held over 3.1 billion patient records and over 19 000 unique measurement concepts in the Observational Medical Outcomes Partnership (OMOP) common-data-model format from 55 data partners. We grouped ontologically similar OMOP concepts together for 52 variables relevant to COVID-19 research, and developed a unit-harmonization pipeline comprising (1) selecting a canonical unit for each measurement variable, (2) arriving at a formula for conversion, (3) obtaining clinical review of each formula, (4) applying the formula to convert data values in each unit into the target canonical unit, and (5) removing any harmonized value that fell outside of accepted value ranges for the variable. For data with missing units for all the results within a lab test for a data partner, we compared values with pooled values of all data partners, using the Kolmogorov-Smirnov test.
Results: Of the concepts without missing values, we harmonized 88.1% of the values, and imputed units for 78.2% of records where units were absent (41% of contributors’ records lacked units).
Discussion: The harmonization and inference methods developed herein can serve as a resource for initiatives aiming to extract insight from heterogeneous EHR collections. Unique properties of centralized data are harnessed to enable unit inference.
Conclusion: The pipeline we developed for the pooled N3C data enables use of measurements that would otherwise be unavailable for analysis.
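The conversion and range-filtering steps of the pipeline described in this abstract (steps 1, 2, 4, and 5) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `VariableSpec` type, the `harmonize` function, the body-weight variable, and its accepted range are all hypothetical, and the clinical review in step 3 has no code equivalent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class VariableSpec:
    """Hypothetical per-variable harmonization rules."""
    canonical_unit: str                               # step 1: target unit
    conversions: Dict[str, Callable[[float], float]]  # step 2: unit -> formula
    valid_range: Tuple[float, float]                  # step 5: accepted values


# Illustrative spec; the range is an assumption, not a value from the paper.
BODY_WEIGHT = VariableSpec(
    canonical_unit="kg",
    conversions={
        "kg": lambda v: v,
        "g": lambda v: v / 1000.0,
        "lb": lambda v: v * 0.45359237,
    },
    valid_range=(0.1, 500.0),
)


def harmonize(value: float, unit: str, spec: VariableSpec) -> Optional[float]:
    """Convert to the canonical unit (step 4); drop implausible values (step 5)."""
    convert = spec.conversions.get(unit)
    if convert is None:
        return None  # no reviewed formula for this unit
    converted = convert(value)
    low, high = spec.valid_range
    if not (low <= converted <= high):
        return None  # outside the accepted range for the variable
    return converted


print(harmonize(70000, "g", BODY_WEIGHT))  # value converted to kilograms
print(harmonize(2000, "kg", BODY_WEIGHT))  # None: outside the accepted range
```

Keeping the formulas in a per-variable table means each one can be reviewed on its own, which mirrors the review step in the abstract.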
Affiliation(s)
- Jacob T Wooldridge
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Tellen D Bennett
  - Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora, Colorado, USA
- Adit Anand
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Carolyn Bremer
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Yun Jae Yoo
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Zhenglong Qian
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Steven G Johnson
  - Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA
- Emily R Pfaff
  - Department of Medicine, North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Amin Manna
  - Palantir Technologies, Denver, Colorado, USA
- Stephanie S Hong
  - School of Medicine, Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Richard L Zhu
  - Department of Medicine, Johns Hopkins, Baltimore, Maryland, USA
- Joel Saltz
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Christopher G Chute
  - Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland, USA
- Richard A Moffitt
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
6
Pfaff ER, Girvin AT, Gabriel DL, Kostka K, Morris M, Palchuk MB, Lehmann HP, Amor B, Bissell M, Bradwell KR, Gold S, Hong SS, Loomba J, Manna A, McMurry JA, Niehaus E, Qureshi N, Walden A, Zhang XT, Zhu RL, Moffitt RA, Haendel MA, Chute CG, Adams WG, Al-Shukri S, Anzalone A, Baghal A, Bennett TD, Bernstam EV, Bernstam EV, Bissell MM, Bush B, Campion TR, Castro V, Chang J, Chaudhari DD, Chen W, Chu S, Cimino JJ, Crandall KA, Crooks M, Davies SJD, DiPalazzo J, Dorr D, Eckrich D, Eltinge SE, Fort DG, Golovko G, Gupta S, Haendel MA, Hajagos JG, Hanauer DA, Harnett BM, Horswell R, Huang N, Johnson SG, Kahn M, Khanipov K, Kieler C, Luzuriaga KRD, Maidlow S, Martinez A, Mathew J, McClay JC, McMahan G, Melancon B, Meystre S, Miele L, Morizono H, Pablo R, Patel L, Phuong J, Popham DJ, Pulgarin C, Santos C, Sarkar IN, Sazo N, Setoguchi S, Soby S, Surampalli S, Suver C, Vangala UMR, Visweswaran S, von Oehsen J, Walters KM, Wiley L, Williams DA, Zai A. Synergies between centralized and federated approaches to data quality: a report from the national COVID cohort collaborative. J Am Med Inform Assoc 2022; 29:609-618. [PMID: 34590684 PMCID: PMC8500110 DOI: 10.1093/jamia/ocab217] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 07/15/2021] [Revised: 08/19/2021] [Accepted: 09/23/2021] [Indexed: 02/01/2023] [Open access]
Abstract
OBJECTIVE: In response to COVID-19, the informatics community united to aggregate as much clinical data as possible to characterize this new disease and reduce its impact through collaborative analytics. The National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in US history with over 6.4 million patients and is a testament to a partnership of over 100 organizations.
MATERIALS AND METHODS: We developed a pipeline for ingesting, harmonizing, and centralizing data from 56 contributing data partners using 4 federated Common Data Models. N3C data quality (DQ) review involves both automated and manual procedures. In the process, several DQ heuristics were discovered in our centralized context, both within the pipeline and during downstream project-based analysis. Feedback to the sites led to many local and centralized DQ improvements.
RESULTS: Beyond well-recognized DQ findings, we discovered 15 heuristics relating to source Common Data Model conformance, demographics, COVID tests, conditions, encounters, measurements, observations, coding completeness, and fitness for use. Of 56 sites, 37 sites (66%) demonstrated issues through these heuristics. These 37 sites demonstrated improvement after receiving feedback.
DISCUSSION: We encountered site-to-site differences in DQ which would have been challenging to discover using federated checks alone. We have demonstrated that centralized DQ benchmarking reveals unique opportunities for DQ improvement that will support improved research analytics locally and in aggregate.
CONCLUSION: By combining rapid, continual assessment of DQ with a large volume of multisite data, it is possible to support more nuanced scientific questions with the scale and rigor that they require.
Affiliation(s)
- Emily R Pfaff
  - Department of Medicine, UNC Chapel Hill School of Medicine, Chapel Hill, North Carolina, USA
- Davera L Gabriel
  - Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Kristin Kostka
  - The OHDSI Center at the Roux Institute, Northeastern University, Portland, Maine, USA
- Michele Morris
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Harold P Lehmann
  - Department of Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Sigfried Gold
  - Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Stephanie S Hong
  - Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Amin Manna
  - Palantir Technologies, Denver, Colorado, USA
- Julie A McMurry
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Anita Walden
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
- Richard L Zhu
  - Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Richard A Moffitt
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Melissa A Haendel
  - University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Christopher G Chute
  - Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland, USA
7
Martin B, DeWitt PE, Russell S, Anand A, Bradwell KR, Bremer C, Gabriel D, Girvin AT, Hajagos JG, McMurry JA, Neumann AJ, Pfaff ER, Walden A, Wooldridge JT, Yoo YJ, Saltz J, Gersing KR, Chute CG, Haendel MA, Moffitt R, Bennett TD. Characteristics, Outcomes, and Severity Risk Factors Associated With SARS-CoV-2 Infection Among Children in the US National COVID Cohort Collaborative. JAMA Netw Open 2022; 5:e2143151. [PMID: 35133437 PMCID: PMC8826172 DOI: 10.1001/jamanetworkopen.2021.43151] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Received: 08/17/2021] [Accepted: 11/15/2021] [Indexed: 01/20/2023] [Open access]
Abstract
Importance Understanding of SARS-CoV-2 infection in US children has been limited by the lack of large, multicenter studies with granular data. Objective To examine the characteristics, changes over time, outcomes, and severity risk factors of children with SARS-CoV-2 within the National COVID Cohort Collaborative (N3C). Design, Setting, and Participants A prospective cohort study of encounters with end dates before September 24, 2021, was conducted at 56 N3C facilities throughout the US. Participants included children younger than 19 years at initial SARS-CoV-2 testing. Main Outcomes and Measures Case incidence and severity over time, demographic and comorbidity severity risk factors, vital sign and laboratory trajectories, clinical outcomes, and acute COVID-19 vs multisystem inflammatory syndrome in children (MIS-C), and Delta vs pre-Delta variant differences for children with SARS-CoV-2. Results A total of 1 068 410 children were tested for SARS-CoV-2 and 167 262 test results (15.6%) were positive (82 882 [49.6%] girls; median age, 11.9 [IQR, 6.0-16.1] years). Among the 10 245 children (6.1%) who were hospitalized, 1423 (13.9%) met the criteria for severe disease: mechanical ventilation (796 [7.8%]), vasopressor-inotropic support (868 [8.5%]), extracorporeal membrane oxygenation (42 [0.4%]), or death (131 [1.3%]). Male sex (odds ratio [OR], 1.37; 95% CI, 1.21-1.56), Black/African American race (OR, 1.25; 95% CI, 1.06-1.47), obesity (OR, 1.19; 95% CI, 1.01-1.41), and several pediatric complex chronic condition (PCCC) subcategories were associated with higher severity disease. Vital signs and many laboratory test values from the day of admission were predictive of peak disease severity. 
Variables associated with increased odds for MIS-C vs acute COVID-19 included male sex (OR, 1.59; 95% CI, 1.33-1.90), Black/African American race (OR, 1.44; 95% CI, 1.17-1.77), younger than 12 years (OR, 1.81; 95% CI, 1.51-2.18), obesity (OR, 1.76; 95% CI, 1.40-2.22), and not having a pediatric complex chronic condition (OR, 0.72; 95% CI, 0.65-0.80). The children with MIS-C had a more inflammatory laboratory profile and severe clinical phenotype, with higher rates of invasive ventilation (117 of 707 [16.5%] vs 514 of 8241 [6.2%]; P < .001) and need for vasoactive-inotropic support (191 of 707 [27.0%] vs 426 of 8241 [5.2%]; P < .001) compared with those who had acute COVID-19. Comparing children during the Delta vs pre-Delta eras, there was no significant change in hospitalization rate (1738 [6.0%] vs 8507 [6.2%]; P = .18) and lower odds for severe disease (179 [10.3%] vs 1242 [14.6%]) (decreased by a factor of 0.67; 95% CI, 0.57-0.79; P < .001). Conclusions and Relevance In this cohort study of US children with SARS-CoV-2, there were observed differences in demographic characteristics, preexisting comorbidities, and initial vital sign and laboratory values between severity subgroups. Taken together, these results suggest that early identification of children likely to progress to severe disease could be achieved using readily available data elements from the day of admission. Further work is needed to translate this knowledge into improved outcomes.
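As a quick arithmetic check on the Delta vs pre-Delta comparison quoted above, the following minimal sketch recomputes the percentages and a crude (unadjusted) odds ratio from the four counts in the abstract. The function name `crude_odds_ratio` and the variable names are illustrative assumptions, and the published estimate may come from a different (for example, adjusted) model, so agreement here is only approximate confirmation.

```python
def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A relative to group B (hypothetical helper)."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts quoted in the abstract: severe cases among hospitalized children.
delta_severe, delta_hospitalized = 179, 1738
pre_delta_severe, pre_delta_hospitalized = 1242, 8507

delta_pct = 100 * delta_severe / delta_hospitalized
pre_delta_pct = 100 * pre_delta_severe / pre_delta_hospitalized
or_delta_vs_pre = crude_odds_ratio(
    delta_severe, delta_hospitalized, pre_delta_severe, pre_delta_hospitalized
)

print(f"Delta severe: {delta_pct:.1f}%")          # 10.3%
print(f"Pre-Delta severe: {pre_delta_pct:.1f}%")  # 14.6%
print(f"Crude odds ratio: {or_delta_vs_pre:.2f}")  # 0.67
```

The crude ratio rounds to the same 0.67 reported in the abstract, which suggests the quoted factor is consistent with the raw counts.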
Affiliation(s)
- Blake Martin
  - Section of Critical Care Medicine, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
- Peter E. DeWitt
  - Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
- Seth Russell
  - Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
- Adit Anand
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Carolyn Bremer
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Davera Gabriel
  - Johns Hopkins University School of Medicine, Baltimore, Maryland
- Janos G. Hajagos
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Julie A. McMurry
  - Translational and Integrative Sciences Center, University of Colorado, Aurora
  - Center for Health AI, University of Colorado, Aurora
- Andrew J. Neumann
  - Translational and Integrative Sciences Center, University of Colorado, Aurora
  - Center for Health AI, University of Colorado, Aurora
- Emily R. Pfaff
  - North Carolina Translational and Clinical Sciences Institute (NC TraCS), University of North Carolina at Chapel Hill, Chapel Hill
- Anita Walden
  - Center for Health AI, University of Colorado, Aurora
- Jacob T. Wooldridge
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Yun Jae Yoo
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Joel Saltz
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Ken R. Gersing
  - National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland
- Christopher G. Chute
  - Johns Hopkins University School of Medicine, Baltimore, Maryland
  - Schools of Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland
- Richard Moffitt
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Tellen D. Bennett
  - Section of Critical Care Medicine, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
  - Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
8
Zhou AE, Shah ZV, Bradwell KR, Munro JB, Berry AA, Serre D, Takala-Harrison S, O'Connor TD, Silva JC, Travassos MA. STRIDE: a command-line HMM-based identifier and sub-classifier of Plasmodium falciparum RIFIN and STEVOR variant surface antigen families. BMC Bioinformatics 2022; 23:15. [PMID: 34991452 PMCID: PMC8733436 DOI: 10.1186/s12859-021-04515-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/21/2021] [Accepted: 12/06/2021] [Indexed: 01/04/2023] Open
Abstract
Background RIFINs and STEVORs are variant surface antigens expressed by P. falciparum that play roles in severe malaria pathogenesis and immune evasion. These two highly diverse multigene families feature multiple paralogs, making their classification challenging using traditional bioinformatic methods. Results STRIDE (STevor and RIfin iDEntifier) is an HMM-based, command-line program that automates the identification and classification of RIFIN and STEVOR protein sequences in the malaria parasite Plasmodium falciparum. STRIDE is more sensitive in detecting RIFINs and STEVORs than available PFAM and TIGRFAM tools and reports RIFIN subtypes and the number of sequences with a FHEYDER amino acid motif, which has been associated with severe malaria pathogenesis. Conclusions STRIDE will be beneficial to malaria research groups analyzing genome sequences and transcripts of clinical field isolates, providing insight into parasite biology and virulence. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04515-8.
Affiliation(s)
- Albert E Zhou
  - Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Zalak V Shah
  - Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Katie R Bradwell
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- James B Munro
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Andrea A Berry
  - Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- David Serre
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Shannon Takala-Harrison
  - Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Timothy D O'Connor
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Joana C Silva
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
- Mark A Travassos
  - Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
9
Martin B, DeWitt PE, Russell S, Anand A, Bradwell KR, Bremer C, Gabriel D, Girvin AT, Hajagos JG, McMurry JA, Neumann AJ, Pfaff ER, Walden A, Wooldridge JT, Yoo YJ, Saltz J, Gersing KR, Chute CG, Haendel MA, Moffitt R, Bennett TD. Children with SARS-CoV-2 in the National COVID Cohort Collaborative (N3C). medRxiv 2021:2021.07.19.21260767. [PMID: 34341796 PMCID: PMC8328064 DOI: 10.1101/2021.07.19.21260767] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 01/20/2023]
Abstract
IMPORTANCE SARS-CoV-2. OBJECTIVE To determine the characteristics, changes over time, outcomes, and severity risk factors of SARS-CoV-2 affected children within the National COVID Cohort Collaborative (N3C). DESIGN Prospective cohort study of patient encounters with end dates before May 27th, 2021. SETTING 45 N3C institutions. PARTICIPANTS Children <19-years-old at initial SARS-CoV-2 testing. MAIN OUTCOMES AND MEASURES Case incidence and severity over time, demographic and comorbidity severity risk factors, vital sign and laboratory trajectories, clinical outcomes, and acute COVID-19 vs MIS-C contrasts for children infected with SARS-CoV-2. RESULTS 728,047 children in the N3C were tested for SARS-CoV-2; of these, 91,865 (12.6%) were positive. Among the 5,213 (6%) hospitalized children, 685 (13%) met criteria for severe disease: mechanical ventilation (7%), vasopressor/inotropic support (7%), ECMO (0.6%), or death/discharge to hospice (1.1%). Male gender, African American race, older age, and several pediatric complex chronic condition (PCCC) subcategories were associated with higher clinical severity (p ≤ 0.05). Vital signs (all p≤0.002) and many laboratory tests from the first day of hospitalization were predictive of peak disease severity. Children with severe (vs moderate) disease were more likely to receive antimicrobials (71% vs 32%, p<0.001) and immunomodulatory medications (53% vs 16%, p<0.001). Compared to those with acute COVID-19, children with MIS-C were more likely to be male, Black/African American, 1-to-12-years-old, and less likely to have asthma, diabetes, or a PCCC (p < 0.04). MIS-C cases demonstrated a more inflammatory laboratory profile and more severe clinical phenotype with higher rates of invasive ventilation (12% vs 6%) and need for vasoactive-inotropic support (31% vs 6%) compared to acute COVID-19 cases, respectively (p<0.03). CONCLUSIONS In the largest U.S. SARS-CoV-2-positive pediatric cohort to date, we observed differences in demographics, pre-existing comorbidities, and initial vital sign and laboratory test values between severity subgroups. Taken together, these results suggest that early identification of children likely to progress to severe disease could be achieved using readily available data elements from the day of admission. Further work is needed to translate this knowledge into improved outcomes.
Affiliation(s)
- Blake Martin
  - Section of Critical Care Medicine, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora, CO, USA
- Peter E. DeWitt
  - Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora, CO, USA
- Seth Russell
  - Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora, CO, USA
- Adit Anand
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Carolyn Bremer
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Davera Gabriel
  - Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Janos G. Hajagos
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Julie A. McMurry
  - Translational and Integrative Sciences Center, University of Colorado, Aurora, CO, USA
  - Center for Health AI, University of Colorado, Aurora, CO, USA
- Andrew J. Neumann
  - Translational and Integrative Sciences Center, University of Colorado, Aurora, CO, USA
  - Center for Health AI, University of Colorado, Aurora, CO, USA
- Emily R. Pfaff
  - North Carolina Translational and Clinical Sciences Institute (NC TraCS), University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Anita Walden
  - Center for Health AI, University of Colorado, Aurora, CO, USA
- Jacob T. Wooldridge
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Yun Jae Yoo
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Joel Saltz
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Ken R. Gersing
  - National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA
- Christopher G. Chute
  - Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Schools of Public Health, and Nursing, Johns Hopkins University, Baltimore, MD, USA
- Richard Moffitt
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Tellen D. Bennett
  - Section of Critical Care Medicine, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora, CO, USA
  - Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora, CO, USA
10
Bradwell KR, Koparde VN, Matveyev AV, Serrano MG, Alves JMP, Parikh H, Huang B, Lee V, Espinosa-Alvarez O, Ortiz PA, Costa-Martins AG, Teixeira MMG, Buck GA. Genomic comparison of Trypanosoma conorhini and Trypanosoma rangeli to Trypanosoma cruzi strains of high and low virulence. BMC Genomics 2018; 19:770. [PMID: 30355302 PMCID: PMC6201504 DOI: 10.1186/s12864-018-5112-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/08/2018] [Accepted: 09/25/2018] [Indexed: 01/09/2023] Open
Abstract
Background Trypanosoma conorhini and Trypanosoma rangeli, like Trypanosoma cruzi, are kinetoplastid protist parasites of mammals displaying divergent hosts, geographic ranges and lifestyles. Largely nonpathogenic T. rangeli and T. conorhini represent clades that are phylogenetically closely related to the T. cruzi and T. cruzi-like taxa and provide insights into the evolution of pathogenicity in those parasites. T. rangeli, like T. cruzi is endemic in many Latin American countries, whereas T. conorhini is tropicopolitan. T. rangeli and T. conorhini are exclusively extracellular, while T. cruzi has an intracellular stage in the mammalian host. Results Here we provide the first comprehensive sequence analysis of T. rangeli AM80 and T. conorhini 025E, and provide a comparison of their genomes to those of T. cruzi G and T. cruzi CL, respectively members of T. cruzi lineages TcI and TcVI. We report de novo assembled genome sequences of the low-virulent T. cruzi G, T. rangeli AM80, and T. conorhini 025E ranging from ~ 21–25 Mbp, with ~ 10,000 to 13,000 genes, and for the highly virulent and hybrid T. cruzi CL we present a ~ 65 Mbp in-house assembled haplotyped genome with ~ 12,500 genes per haplotype. Single copy orthologs of the two T. cruzi strains exhibited ~ 97% amino acid identity, and ~ 78% identity to proteins of T. rangeli or T. conorhini. Proteins of the latter two organisms exhibited ~ 84% identity. T. cruzi CL exhibited the highest heterozygosity. T. rangeli and T. conorhini displayed greater metabolic capabilities for utilization of complex carbohydrates, and contained fewer retrotransposons and multigene family copies, i.e. trans-sialidases, mucins, DGF-1, and MASP, compared to T. cruzi. Conclusions Our analyses of the T. rangeli and T. conorhini genomes closely reflected their phylogenetic proximity to the T. cruzi clade, and were largely consistent with their divergent life cycles. 
Our results provide a greater context for understanding the life cycles, host range expansion, immunity evasion, and pathogenesis of these trypanosomatids. Electronic supplementary material The online version of this article (10.1186/s12864-018-5112-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Katie R Bradwell
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
  - Present address: Institute for Genome Sciences, University of Maryland, Baltimore, MD, USA
- Vishal N Koparde
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
- Andrey V Matveyev
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA, USA
- Myrna G Serrano
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA, USA
- João M P Alves
  - Department of Parasitology, ICB, University of São Paulo, São Paulo, SP, Brazil
- Hardik Parikh
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA, USA
- Bernice Huang
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA, USA
- Vladimir Lee
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
- Paola A Ortiz
  - Department of Parasitology, ICB, University of São Paulo, São Paulo, SP, Brazil
- Marta M G Teixeira
  - Department of Parasitology, ICB, University of São Paulo, São Paulo, SP, Brazil
- Gregory A Buck
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA, USA
11
Castro J, França A, Bradwell KR, Serrano MG, Jefferson KK, Cerca N. Comparative transcriptomic analysis of Gardnerella vaginalis biofilms vs. planktonic cultures using RNA-seq. NPJ Biofilms Microbiomes 2017; 3:3. [PMID: 28649404 PMCID: PMC5460279 DOI: 10.1038/s41522-017-0012-7] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 07/20/2016] [Revised: 11/17/2016] [Accepted: 11/22/2016] [Indexed: 01/18/2023] Open
Abstract
Bacterial vaginosis is the most common gynecological disorder affecting women of reproductive age. Bacterial vaginosis is frequently associated with the development of a Gardnerella vaginalis biofilm. Recent data indicate that G. vaginalis biofilms are more tolerant to antibiotics and are able to incorporate other bacterial vaginosis-associated species, yielding a multi-species biofilm. However, despite its apparent role in bacterial vaginosis, little is known regarding the molecular determinants involved in biofilm formation by G. vaginalis. To gain insight into the role of G. vaginalis in the pathogenesis of bacterial vaginosis, we carried out comparative transcriptomic analysis between planktonic and biofilm phenotypes, using RNA-sequencing. Significant differences were found in the expression levels of 815 genes. A detailed analysis of the results obtained was performed based on direct and functional gene interactions. Similar to other bacterial species, expression of genes involved in antimicrobial resistance was elevated in biofilm cells. In addition, our data indicate that G. vaginalis biofilms assume a characteristic response to stress and starvation conditions. The abundance of transcripts encoding proteins involved in glucose and carbon metabolism was reduced in biofilms. Surprisingly, transcript levels of vaginolysin were reduced in biofilms relative to planktonic cultures. Overall, our data revealed that gene-regulated processes in G. vaginalis biofilms resulted in a protected form of bacterial growth, characterized by low metabolic activity. This phenotype may contribute towards the chronic and recurrent nature of bacterial vaginosis. This suggests that G. vaginalis is capable of drastically adjusting its phenotype through an extensive change of gene expression.
Affiliation(s)
- Joana Castro
  - Centre of Biological Engineering (CEB), Laboratory of Research in Biofilms Rosário Oliveira (LIBRO), University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
  - Instituto de Ciências Biomédicas Abel Salazar (ICBAS), University of Porto, Rua de Jorge Viterbo Ferreira 228, 4050-313 Porto, Portugal
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23298-0678 USA
- Angela França
  - Centre of Biological Engineering (CEB), Laboratory of Research in Biofilms Rosário Oliveira (LIBRO), University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
- Katie R. Bradwell
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284 USA
- Myrna G. Serrano
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284 USA
- Kimberly K. Jefferson
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23298-0678 USA
- Nuno Cerca
  - Centre of Biological Engineering (CEB), Laboratory of Research in Biofilms Rosário Oliveira (LIBRO), University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal