1
|
Abstract
Machine learning is increasingly important in microbiology where it is used for tasks such as predicting antibiotic resistance and associating human microbiome features with complex host diseases. The applications in microbiology are quickly expanding and the machine learning tools frequently used in basic and clinical research range from classification and regression to clustering and dimensionality reduction. In this Review, we examine the main machine learning concepts, tasks and applications that are relevant for experimental and clinical microbiologists. We provide the minimal toolbox for a microbiologist to be able to understand, interpret and use machine learning in their experimental and translational activities.
Affiliation(s)
- Francesco Asnicar
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Andrew Maltez Thomas
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Andrea Passerini
- Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
| | - Levi Waldron
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy.
- Department of Epidemiology and Biostatistics, City University of New York, New York, NY, USA.
| | - Nicola Segata
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy.
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Milan, Italy.
| |
|
2
|
Bhosle A, Bae S, Zhang Y, Chun E, Avila-Pacheco J, Geistlinger L, Pishchany G, Glickman JN, Michaud M, Waldron L, Clish CB, Xavier RJ, Vlamakis H, Franzosa EA, Garrett WS, Huttenhower C. Integrated annotation prioritizes metabolites with bioactivity in inflammatory bowel disease. Mol Syst Biol 2024; 20:338-361. [PMID: 38467837 PMCID: PMC10987656 DOI: 10.1038/s44320-024-00027-8] [Received: 09/20/2023] [Revised: 02/13/2024] [Accepted: 02/15/2024] [Indexed: 03/13/2024]
Abstract
Microbial biochemistry is central to the pathophysiology of inflammatory bowel diseases (IBD). Improved knowledge of microbial metabolites and their immunomodulatory roles is thus necessary for diagnosis and management. Here, we systematically analyzed the chemical, ecological, and epidemiological properties of ~82k metabolic features in 546 Integrative Human Microbiome Project (iHMP/HMP2) metabolomes, using a newly developed methodology for bioactive compound prioritization from microbial communities. This suggested >1000 metabolic features as potentially bioactive in IBD and associated ~43% of prevalent, unannotated features with at least one well-characterized metabolite, thereby providing initial information for further characterization of a significant portion of the fecal metabolome. Prioritized features included known IBD-linked chemical families such as bile acids and short-chain fatty acids, and less-explored bilirubin, polyamine, and vitamin derivatives, and other microbial products. One of these, nicotinamide riboside, reduced colitis scores in DSS-treated mice. The method, MACARRoN, is generalizable with the potential to improve microbial community characterization and provide therapeutic candidates.
Affiliation(s)
- Amrisha Bhosle
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Sena Bae
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Yancong Zhang
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Eunyoung Chun
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | | | - Ludwig Geistlinger
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA
- Center for Computational Biomedicine, Harvard Medical School, Boston, MA, USA
| | - Gleb Pishchany
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jonathan N Glickman
- Beth Israel Deaconess Medical Center, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| | - Monia Michaud
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA
| | - Clary B Clish
- Metabolomics Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ramnik J Xavier
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gastrointestinal Unit and Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Hera Vlamakis
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eric A Franzosa
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Wendy S Garrett
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Curtis Huttenhower
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA.
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA.
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA, USA.
| |
|
3
|
Björk JR, Bolte LA, Maltez Thomas A, Lee KA, Rossi N, Wind TT, Smit LM, Armanini F, Asnicar F, Blanco-Miguez A, Board R, Calbet-Llopart N, Derosa L, Dhomen N, Brooks K, Harland M, Harries M, Lorigan P, Manghi P, Marais R, Newton-Bishop J, Nezi L, Pinto F, Potrony M, Puig S, Serra-Bellver P, Shaw HM, Tamburini S, Valpione S, Waldron L, Zitvogel L, Zolfo M, de Vries EGE, Nathan P, Fehrmann RSN, Spector TD, Bataille V, Segata N, Hospers GAP, Weersma RK. Longitudinal gut microbiome changes in immune checkpoint blockade-treated advanced melanoma. Nat Med 2024; 30:785-796. [PMID: 38365950 PMCID: PMC10957474 DOI: 10.1038/s41591-024-02803-3] [Received: 06/20/2023] [Accepted: 01/03/2024] [Indexed: 02/18/2024]
Abstract
Multiple clinical trials targeting the gut microbiome are being conducted to optimize treatment outcomes for immune checkpoint blockade (ICB). To improve the success of these interventions, understanding gut microbiome changes during ICB is urgently needed. Here through longitudinal microbiome profiling of 175 patients treated with ICB for advanced melanoma, we show that several microbial species-level genome bins (SGBs) and pathways exhibit distinct patterns from baseline in patients achieving progression-free survival (PFS) of 12 months or longer (PFS ≥12) versus patients with PFS shorter than 12 months (PFS <12). Out of 99 SGBs that could discriminate between these two groups, 20 were differentially abundant only at baseline, while 42 were differentially abundant only after treatment initiation. We identify five and four SGBs that had consistently higher abundances in patients with PFS ≥12 and <12 months, respectively. Constructing a log ratio of these SGBs, we find an association with overall survival. Finally, we find different microbial dynamics in different clinical contexts including the type of ICB regimen, development of immune-related adverse events and concomitant medication use. Insights into the longitudinal dynamics of the gut microbiome in association with host factors and treatment regimens will be critical for guiding rational microbiome-targeted therapies aimed at enhancing ICB efficacy.
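The log-ratio construction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the abundances are combined, so the summed-abundance form, the pseudocount, and all SGB names and values below are assumptions made for illustration, not the authors' method or data.

```python
import math

def log_ratio(abundances, numerator_taxa, denominator_taxa, pseudocount=1e-6):
    """Natural log of summed numerator abundance over summed denominator abundance.

    The pseudocount (an assumption here) avoids taking the log of zero
    when none of the taxa in a group are detected in a sample.
    """
    num = sum(abundances.get(t, 0.0) for t in numerator_taxa) + pseudocount
    den = sum(abundances.get(t, 0.0) for t in denominator_taxa) + pseudocount
    return math.log(num / den)

# Hypothetical relative abundances for a single sample.
sample = {"SGB_A": 0.04, "SGB_B": 0.01, "SGB_X": 0.02, "SGB_Y": 0.005}

# Hypothetical groups: taxa higher with PFS >=12 vs. PFS <12 months.
favourable = ["SGB_A", "SGB_B"]
unfavourable = ["SGB_X", "SGB_Y"]

score = log_ratio(sample, favourable, unfavourable)
```

A positive score means the first group dominates in that sample; swapping the two groups flips the sign. In a study like this one, such a per-sample score would then be carried into a survival model.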
Affiliation(s)
- Johannes R Björk
- Department of Gastroenterology and Hepatology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands.
| | - Laura A Bolte
- Department of Gastroenterology and Hepatology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Andrew Maltez Thomas
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Karla A Lee
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Niccolo Rossi
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Thijs T Wind
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Lotte M Smit
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Federica Armanini
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Francesco Asnicar
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Aitor Blanco-Miguez
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Ruth Board
- Department of Oncology, Lancashire Teaching Hospitals NHS Trust, Preston, UK
| | - Neus Calbet-Llopart
- Department of Dermatology, Melanoma Group, Hospital Clínic Barcelona, IDIBAPS, Universitat de Barcelona, Barcelona, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Raras, Instituto de Salud Carlos III, Barcelona, Spain
| | - Lisa Derosa
- Gustave Roussy Cancer Center, U1015 INSERM and Oncobiome Network, University Paris Saclay, Villejuif-Grand-Paris, France
| | - Nathalie Dhomen
- Division of Immunology, Immunity to Infection and Respiratory Medicine, University of Manchester, Manchester, UK
| | - Kelly Brooks
- Division of Immunology, Immunity to Infection and Respiratory Medicine, University of Manchester, Manchester, UK
| | - Mark Harland
- Division of Haematology and Immunology, Institute of Medical Research at St. James's, University of Leeds, Leeds, UK
| | - Mark Harries
- Department of Medical Oncology, Guys Cancer Centre, Guy's and St Thomas' NHS Trust, London, UK
- Biochemical and Molecular Genetics Department, Hospital Clínic de Barcelona and IDIBAPS, University of Barcelona, Barcelona, Spain
| | - Paul Lorigan
- The Christie NHS Foundation Trust, Manchester, UK
- Division of Cancer Sciences, University of Manchester, Manchester, UK
| | - Paolo Manghi
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Richard Marais
- Molecular Oncology Group, Cancer Research UK Manchester Institute, University of Manchester, Manchester, UK
| | - Julia Newton-Bishop
- Division of Haematology and Immunology, Institute of Medical Research at St. James's, University of Leeds, Leeds, UK
| | - Luigi Nezi
- European Institute of Oncology (Istituto Europeo di Oncologia), Milan, Italy
| | - Federica Pinto
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Miriam Potrony
- Centro de Investigación Biomédica en Red en Enfermedades Raras, Instituto de Salud Carlos III, Barcelona, Spain
- Biochemical and Molecular Genetics Department, Hospital Clínic de Barcelona and IDIBAPS, University of Barcelona, Barcelona, Spain
| | - Susana Puig
- Department of Dermatology, Melanoma Group, Hospital Clínic Barcelona, IDIBAPS, Universitat de Barcelona, Barcelona, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Raras, Instituto de Salud Carlos III, Barcelona, Spain
| | | | - Heather M Shaw
- Department of Medical Oncology, Mount Vernon Cancer Centre, East and North Herts NHS Trust, Northwood, UK
| | - Sabrina Tamburini
- European Institute of Oncology (Istituto Europeo di Oncologia), Milan, Italy
| | - Sara Valpione
- Division of Immunology, Immunity to Infection and Respiratory Medicine, University of Manchester, Manchester, UK
- The Christie NHS Foundation Trust, Manchester, UK
| | - Levi Waldron
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA
| | - Laurence Zitvogel
- Gustave Roussy Cancer Center, U1015 INSERM and Oncobiome Network, University Paris Saclay, Villejuif-Grand-Paris, France
| | - Moreno Zolfo
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Elisabeth G E de Vries
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Paul Nathan
- Biochemical and Molecular Genetics Department, Hospital Clínic de Barcelona and IDIBAPS, University of Barcelona, Barcelona, Spain
- Department of Medical Oncology, Mount Vernon Cancer Centre, East and North Herts NHS Trust, Northwood, UK
| | - Rudolf S N Fehrmann
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Tim D Spector
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Véronique Bataille
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Department of Dermatology, Mount Vernon Cancer Centre, Northwood, UK
- Department of Dermatology, Hemel Hempstead Hospital, West Hertfordshire NHS Trust, Hemel Hempstead, UK
| | - Nicola Segata
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
- European Institute of Oncology (Istituto Europeo di Oncologia), Milan, Italy
| | - Geke A P Hospers
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Rinse K Weersma
- Department of Gastroenterology and Hepatology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands.
| |
|
4
|
Basso M, Gori A, Nardella C, Palviainen M, Holcar M, Sotiropoulos I, Bobis‐Wozowicz S, D'Agostino VG, Casarotto E, Ciani Y, Suetsugu S, Gualerzi A, Martin‐Jaular L, Boselli D, Kashkanova A, Parisse P, Lippens L, Pagliuca M, Blessing M, Frigerio R, Fourniols T, Meliciano A, Fietta A, Fioretti PV, Soroczyńska K, Picciolini S, Salviano‐Silva A, Bergese P, Zocco D, Chiari M, Jenster G, Waldron L, Milosavljevic A, Nolan J, Monopoli MP, Witwer KW, Bussolati B, Di Vizio D, Falcon Perez J, Lenassi M, Cretich M, Demichelis F. International Society for Extracellular Vesicles Workshop. QuantitatEVs: multiscale analyses, from bulk to single extracellular vesicle. J Extracell Biol 2024; 3:e137. [PMID: 38405579 PMCID: PMC10883470 DOI: 10.1002/jex2.137] [Received: 07/11/2023] [Revised: 12/19/2023] [Accepted: 12/23/2023] [Indexed: 02/27/2024]
Abstract
The 'QuantitatEVs: multiscale analyses, from bulk to single vesicle' workshop aimed to discuss quantitative strategies and harmonized wet and computational approaches toward the comprehensive analysis of extracellular vesicles (EVs) from bulk to single vesicle analyses with a special focus on emerging technologies. The workshop covered the key issues in the quantitative analysis of different EV-associated molecular components and EV biophysical features, which are considered the core of EV-associated biomarker discovery and validation for their clinical translation. The in-person-only workshop was held in Trento, Italy, from January 31st to February 2nd, 2023, and continued in Milan on February 3rd with "Next Generation EVs", a satellite event dedicated to early career researchers (ECR). This report summarizes the main topics and outcomes of the workshop.
Affiliation(s)
- Manuela Basso
- Department of Cellular, Computational, and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Alessandro Gori
- National Research Council of Italy, Istituto di Scienze e Tecnologie Chimiche (SCITEC‐CNR), Milan, Italy
| | - Caterina Nardella
- Department of Cellular, Computational, and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Mari Palviainen
- EV group, Molecular and Integrative Biosciences Research Program, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
| | - Marija Holcar
- Institute of Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Ioannis Sotiropoulos
- Institute of Biosciences & Applications, National Center for Scientific Research (NCSR) Demokritos, Paraskevi, Greece
| | - Sylwia Bobis‐Wozowicz
- Faculty of Biochemistry, Biophysics and Biotechnology, Department of Cell Biology, Jagiellonian University, Krakow, Poland
| | - Vito G. D'Agostino
- Department of Cellular, Computational, and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Elena Casarotto
- Dipartimento di Scienze Farmacologiche e Biomolecolari “Rodolfo Paoletti” (DiSFeB), Dipartimento di Eccellenza, Università degli Studi di Milano, Milan, Italy
| | - Yari Ciani
- Department of Cellular, Computational, and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Shiro Suetsugu
- Division of Biological Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Japan
| | | | | | - Daniela Boselli
- FRACTAL (Flow Cytometry Resource, Advanced Cytometry Technical Applications Laboratory), San Raffaele Scientific Institute, Milan, Italy
| | - Anna Kashkanova
- Max Planck Institute for the Science of Light, Erlangen, Germany
| | - Pietro Parisse
- National Research Council of Italy, Istituto Officina dei Materiali (IOM‐CNR), Trieste, Italy
| | - Lien Lippens
- Department of Human Structure and Repair, Laboratory of Experimental Cancer Research, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent, Ghent, Belgium
| | - Martina Pagliuca
- Molecular Predictors and New Targets in Oncology, Gustave Roussy, Villejuif, France
- Clinical and Translational Oncology, Scuola Superiore Meridionale, Naples, Italy
| | - Martin Blessing
- Max Planck Institute for the Science of Light, Erlangen, Germany
| | - Roberto Frigerio
- National Research Council of Italy, Istituto di Scienze e Tecnologie Chimiche (SCITEC‐CNR), Milan, Italy
| | | | - Ana Meliciano
- iBET‐Instituto de Biologia Experimental e Tecnológica, Oeiras, Portugal
| | - Anna Fietta
- Department of Biomedical Sciences (DSB), University of Padua, Padua, Italy
- Fondazione Istituto di Ricerca Pediatrica Città della Speranza (IRP), Padua, Italy
| | - Paolo Vincenzo Fioretti
- Department of Cellular, Computational, and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | | | | | | | - Paolo Bergese
- Department of Molecular and Translational Medicine, Università degli Studi di Brescia, Brescia, Italy
- IRIB ‐ Institute for Research and Biomedical Innovation of CNR, Palermo, Italy
| | | | - Marcella Chiari
- National Research Council of Italy, Istituto di Scienze e Tecnologie Chimiche (SCITEC‐CNR), Milan, Italy
| | - Guido Jenster
- Department of Urology, Erasmus MC Cancer Institute, Erasmus University Medical Center, Rotterdam, The Netherlands
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA
| | - Aleksandar Milosavljevic
- Department of Molecular and Human Genetics, Dan L Duncan Comprehensive Cancer Center, and Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, Texas, USA
| | - John Nolan
- Scintillon Institute, San Diego, California, USA
| | | | - Kenneth W. Witwer
- Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Benedetta Bussolati
- Department of Molecular Biotechnology and Health Sciences, University of Turin, Turin, Italy
| | - Dolores Di Vizio
- Department of Surgery, Division of Cancer Biology and Therapeutics, Cedars‐Sinai Medical Center, Los Angeles, California, USA
| | - Juan Falcon Perez
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Exosomes Laboratory, Derio, Spain
- Centro de Investigación Biomédica en Red de enfermedades hepáticas y digestivas (CIBERehd), Madrid, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
| | - Metka Lenassi
- Institute of Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Marina Cretich
- National Research Council of Italy, Istituto di Scienze e Tecnologie Chimiche (SCITEC‐CNR), Milan, Italy
| | - Francesca Demichelis
- Department of Cellular, Computational, and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| |
|
5
|
Geistlinger L, Mirzayi C, Zohra F, Azhar R, Elsafoury S, Grieve C, Wokaty J, Gamboa-Tuz SD, Sengupta P, Hecht I, Ravikrishnan A, Gonçalves RS, Franzosa E, Raman K, Carey V, Dowd JB, Jones HE, Davis S, Segata N, Huttenhower C, Waldron L. BugSigDB captures patterns of differential abundance across a broad range of host-associated microbial signatures. Nat Biotechnol 2023:10.1038/s41587-023-01872-y. [PMID: 37697152 DOI: 10.1038/s41587-023-01872-y] [Received: 10/24/2022] [Accepted: 06/20/2023] [Indexed: 09/13/2023]
Abstract
The literature of human and other host-associated microbiome studies is expanding rapidly, but systematic comparisons among published results of host-associated microbiome signatures of differential abundance remain difficult. We present BugSigDB, a community-editable database of manually curated microbial signatures from published differential abundance studies accompanied by information on study geography, health outcomes, host body site and experimental, epidemiological and statistical methods using controlled vocabulary. The initial release of the database contains >2,500 manually curated signatures from >600 published studies on three host species, enabling high-throughput analysis of signature similarity, taxon enrichment, co-occurrence and coexclusion and consensus signatures. These data allow assessment of microbiome differential abundance within and across experimental conditions, environments or body sites. Database-wide analysis reveals experimental conditions with the highest level of consistency in signatures reported by independent studies and identifies commonalities among disease-associated signatures, including frequent introgression of oral pathobionts into the gut.
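One common way to quantify the signature similarity mentioned above is the Jaccard index between two sets of taxa. The sketch below is only an illustration of that idea: the abstract does not state which similarity measure BugSigDB analyses use, and the two signatures are made-up examples rather than database records.

```python
def jaccard(sig_a, sig_b):
    """Jaccard index: shared taxa divided by all distinct taxa in either signature."""
    a, b = set(sig_a), set(sig_b)
    union = a | b
    if not union:
        # Two empty signatures share nothing; define their similarity as 0.
        return 0.0
    return len(a & b) / len(union)

# Hypothetical signatures (sets of taxon names) from two studies.
sig_1 = {"Fusobacterium nucleatum", "Parvimonas micra", "Peptostreptococcus stomatis"}
sig_2 = {"Fusobacterium nucleatum", "Parvimonas micra", "Streptococcus anginosus"}

similarity = jaccard(sig_1, sig_2)  # 2 shared taxa out of 4 distinct taxa
```

Computing this for every pair of signatures gives a similarity matrix that can be clustered to find groups of studies reporting overlapping taxa.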
Affiliation(s)
- Ludwig Geistlinger
- Center for Computational Biomedicine, Harvard Medical School, Boston, MA, USA
| | - Chloe Mirzayi
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Fatima Zohra
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Rimsha Azhar
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Shaimaa Elsafoury
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Clare Grieve
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Jennifer Wokaty
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Samuel David Gamboa-Tuz
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Pratyay Sengupta
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai, India
- Robert Bosch Centre for Data Science and Artificial Intelligence, Indian Institute of Technology (IIT) Madras, Chennai, India
- Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
| | | | - Aarthi Ravikrishnan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Rafael S Gonçalves
- Center for Computational Biomedicine, Harvard Medical School, Boston, MA, USA
| | - Eric Franzosa
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Karthik Raman
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai, India
- Robert Bosch Centre for Data Science and Artificial Intelligence, Indian Institute of Technology (IIT) Madras, Chennai, India
- Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
| | - Vincent Carey
- Channing Division of Network Medicine, Mass General Brigham, Harvard Medical School, Boston, MA, USA
| | - Jennifer B Dowd
- Leverhulme Centre for Demographic Science, University of Oxford, Oxford, UK
| | - Heidi E Jones
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA
| | - Sean Davis
- Departments of Biomedical Informatics and Medicine, University of Colorado Anschutz School of Medicine, Denver, CO, USA
| | - Nicola Segata
- Department CIBIO, University of Trento, Trento, Italy
- Istituto Europeo di Oncologia (IEO) IRCCS, Milan, Italy
| | - Curtis Huttenhower
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Levi Waldron
- Institute for Implementation Science in Population Health, City University of New York School of Public Health, New York, NY, USA.
- Department of Epidemiology and Biostatistics, City University of New York School of Public Health, New York, NY, USA.
- Department CIBIO, University of Trento, Trento, Italy.
| |
|
6
|
Eckenrode KB, Righelli D, Ramos M, Argelaguet R, Vanderaa C, Geistlinger L, Culhane AC, Gatto L, Carey V, Morgan M, Risso D, Waldron L. Curated single cell multimodal landmark datasets for R/Bioconductor. PLoS Comput Biol 2023; 19:e1011324. [PMID: 37624866 PMCID: PMC10497156 DOI: 10.1371/journal.pcbi.1011324] [Received: 11/25/2022] [Revised: 09/12/2023] [Accepted: 07/03/2023] [Indexed: 08/27/2023]
Abstract
BACKGROUND: The majority of high-throughput single-cell molecular profiling methods quantify RNA expression; however, recent multimodal profiling methods add simultaneous measurement of genomic, proteomic, epigenetic, and/or spatial information on the same cells. The development of new statistical and computational methods in Bioconductor for such data will be facilitated by easy availability of landmark datasets using standard data classes.
RESULTS: We collected, processed, and packaged publicly available landmark datasets from important single-cell multimodal protocols, including CITE-Seq, ECCITE-Seq, SCoPE2, scNMT, 10X Multiome, seqFISH, and G&T. We integrate data modalities via the MultiAssayExperiment Bioconductor class, document and re-distribute datasets as the SingleCellMultiModal package in Bioconductor's Cloud-based ExperimentHub. The result is single-command actualization of landmark datasets from seven single-cell multimodal data generation technologies, without need for further data processing or wrangling in order to analyze and develop methods within Bioconductor's ecosystem of hundreds of packages for single-cell and multimodal data.
CONCLUSIONS: We provide two examples of integrative analyses that are greatly simplified by SingleCellMultiModal. The package will facilitate development of bioinformatic and statistical methods in Bioconductor to meet the challenges of integrating molecular layers and analyzing phenotypic outputs including cell differentiation, activity, and disease.
Affiliation(s)
- Kelly B. Eckenrode
- Graduate School of Public Health and Health Policy, City University of New York, NY, NY, United States of America
- Institute for Implementation Science in Public Health, City University of New York, NY, NY, United States of America
| | - Dario Righelli
- Department of Statistical Sciences, University of Padova, Padova, Italy
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, NY, NY, United States of America
- Institute for Implementation Science in Public Health, City University of New York, NY, NY, United States of America
- Roswell Park Comprehensive Cancer Center, Buffalo, New York, United States of America
| | - Ricard Argelaguet
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, United Kingdom
| | | | - Ludwig Geistlinger
- Center for Computational Biomedicine, Harvard Medical School, Boston, Massachusetts, United States of America
| | | | - Laurent Gatto
- de Duve Institute, Université catholique de Louvain, Brussels, Belgium
| | - Vincent Carey
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Martin Morgan
- Roswell Park Comprehensive Cancer Center, Buffalo, New York, United States of America
| | - Davide Risso
- Department of Statistical Sciences, University of Padova, Padova, Italy
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, NY, NY, United States of America
- Institute for Implementation Science in Public Health, City University of New York, NY, NY, United States of America
7
Afiaz A, Ivanov AA, Chamberlin J, Hanauer D, Savonen CL, Goldman MJ, Morgan M, Reich M, Getka A, Holmes A, Pati S, Knight D, Boutros PC, Bakas S, Caporaso JG, Del Fiol G, Hochheiser H, Haas B, Schloss PD, Eddy JA, Albrecht J, Fedorov A, Waldron L, Hoffman AM, Bradshaw RL, Leek JT, Wright C. Evaluation of software impact designed for biomedical research: Are we measuring what's meaningful? ArXiv 2023:arXiv:2306.03255v1. [PMID: 37332562 PMCID: PMC10274942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Software is vital for the advancement of biology and medicine. Through analysis of usage and impact metrics of software, developers can help determine user and community engagement. These metrics can be used to justify additional funding, encourage additional use, and identify unanticipated use cases. Such analyses can help define improvement areas and assist with managing project resources. However, there are challenges associated with assessing usage and impact, many of which vary widely depending on the type of software being evaluated. These challenges involve issues of distorted, exaggerated, understated, or misleading metrics, as well as ethical and security concerns. More attention to the nuances, challenges, and considerations involved in capturing impact across the diverse spectrum of biological software is needed. Furthermore, some tools may be especially beneficial to a small audience, yet may not have comparatively compelling metrics of high usage. Although some principles are generally applicable, there is not a single perfect metric or approach to effectively evaluate a software tool's impact, as this depends on aspects unique to each tool, how it is used, and how one wishes to evaluate engagement. We propose more broadly applicable guidelines (such as infrastructure that supports the usage of software and the collection of metrics about usage), as well as strategies for various types of software and resources. We also highlight outstanding issues in the field regarding how communities measure or evaluate software impact. To gain a deeper understanding of the issues hindering software evaluations, as well as to determine what appears to be helpful, we performed a survey of participants involved with scientific software projects for the Informatics Technology for Cancer Research (ITCR) program funded by the National Cancer Institute (NCI). 
We also investigated software among this scientific community and others to assess how often infrastructure that supports such evaluations is implemented and how this impacts rates of papers describing usage of the software. We find that although developers recognize the utility of analyzing data related to the impact or usage of their software, they struggle to find the time or funding to support such analyses. We also find that infrastructure such as social media presence, more in-depth documentation, the presence of software health metrics, and clear information on how to contact developers seem to be associated with increased usage rates. Our findings can help scientific software developers make the most out of the evaluations of their software so that they can more fully benefit from such assessments.
Affiliation(s)
- Awan Afiaz
- Department of Biostatistics, University of Washington, Seattle, WA
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA
| | - Andrey A. Ivanov
- Department of Pharmacology and Chemical Biology, Emory University School of Medicine, Emory University, Atlanta, GA
| | - John Chamberlin
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT
| | - David Hanauer
- Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI
| | - Candace L. Savonen
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA
| | | | - Martin Morgan
- Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | | | | | - Aaron Holmes
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA
- Institute for Precision Health, University of California, Los Angeles, CA
- Department of Human Genetics, University of California, Los Angeles, CA
- Department of Urology, University of California, Los Angeles, CA
| | | | - Dan Knight
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA
- Institute for Precision Health, University of California, Los Angeles, CA
- Department of Human Genetics, University of California, Los Angeles, CA
- Department of Urology, University of California, Los Angeles, CA
| | - Paul C. Boutros
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA
- Institute for Precision Health, University of California, Los Angeles, CA
- Department of Human Genetics, University of California, Los Angeles, CA
- Department of Urology, University of California, Los Angeles, CA
| | | | - J. Gregory Caporaso
- Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, AZ
| | - Guilherme Del Fiol
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT
| | - Harry Hochheiser
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA
| | - Brian Haas
- Methods Development Laboratory, Broad Institute, Cambridge, MA
| | - Patrick D. Schloss
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI
| | | | | | - Andrey Fedorov
- Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, City University of New York Graduate School of Public Health and Health Policy, New York, NY
| | - Ava M. Hoffman
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA
| | - Richard L. Bradshaw
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT
| | - Jeffrey T. Leek
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA
| | - Carrie Wright
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA
8
Ramos M, Morgan M, Geistlinger L, Carey VJ, Waldron L. RaggedExperiment: the missing link between genomic ranges and matrices in Bioconductor. Bioinformatics 2023; 39:btad330. [PMID: 37208161 PMCID: PMC10272705 DOI: 10.1093/bioinformatics/btad330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 04/04/2023] [Accepted: 05/18/2023] [Indexed: 05/21/2023] Open
Abstract
SUMMARY The RaggedExperiment R / Bioconductor package provides lossless representation of disparate genomic ranges across multiple specimens or cells, in conjunction with efficient and flexible calculations of rectangular-shaped summaries for downstream analysis. Applications include statistical analysis of somatic mutations, copy number, methylation, and open chromatin data. RaggedExperiment is compatible with multimodal data analysis as a component of MultiAssayExperiment data objects, and simplifies data representation and transformation for software developers and analysts. MOTIVATION AND RESULTS Measurement of copy number, mutation, single nucleotide polymorphism, and other genomic attributes that may be stored as VCF files produce "ragged" genomic ranges data: i.e. across different genomic coordinates in each sample. Ragged data are not rectangular or matrix-like, presenting informatics challenges for downstream statistical analyses. We present the RaggedExperiment R/Bioconductor data structure for lossless representation of ragged genomic data, with associated reshaping tools for flexible and efficient calculation of tabular representations to support a wide range of downstream statistical analyses. We demonstrate its applicability to copy number and somatic mutation data across 33 TCGA cancer datasets.
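To make the distinction between "ragged" and rectangular data concrete, here is a minimal Python sketch. It is not the RaggedExperiment API (that package is written for R/Bioconductor); the sample names, ranges, and scores are hypothetical, and `to_matrix` is an illustrative helper. It keeps every distinct range as a row and fills `None` wherever a sample has no value for that range.

```python
from typing import Dict, List, Optional, Tuple

# A genomic range: (chromosome, start, end). All values below are made up.
Range = Tuple[str, int, int]

# Ragged data: each sample reports scores on its own set of ranges.
ragged: Dict[str, Dict[Range, float]] = {
    "sampleA": {("chr1", 100, 200): 0.5, ("chr1", 300, 400): -1.2},
    "sampleB": {("chr1", 100, 200): 0.7, ("chr2", 50, 90): 2.0},
}


def to_matrix(data: Dict[str, Dict[Range, float]]):
    """Return (rows, columns, matrix) with one row per distinct range."""
    columns = sorted(data)
    rows = sorted({rng for sample in data.values() for rng in sample})
    # Missing (range, sample) combinations become None.
    matrix: List[List[Optional[float]]] = [
        [data[col].get(rng) for col in columns] for rng in rows
    ]
    return rows, columns, matrix


rows, columns, matrix = to_matrix(ragged)
for rng, values in zip(rows, matrix):
    print(rng, values)
```

The sparse layout loses no information but grows with the number of distinct ranges. Summaries over a fixed set of query ranges would need an additional reduction step, which this sketch does not attempt.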
Affiliation(s)
- Marcel Ramos
- Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, United States
- Institute for Implementation Science and Population Health, City University of New York, New York, NY 10027, United States
- Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14203, United States
| | - Martin Morgan
- Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14203, United States
| | - Ludwig Geistlinger
- Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, United States
- Institute for Implementation Science and Population Health, City University of New York, New York, NY 10027, United States
| | - Vincent J Carey
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, United States
| | - Levi Waldron
- Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, United States
- Institute for Implementation Science and Population Health, City University of New York, New York, NY 10027, United States
9
Nash D, Qasmieh S, Robertson M, Rane M, Zimba R, Kulkarni SG, Berry A, You W, Mirzayi C, Westmoreland D, Parcesepe A, Waldron L, Kochhar S, Maroko AR, Grov C. Household factors and the risk of severe COVID-like illness early in the U.S. pandemic. PLoS One 2022; 17:e0271786. [PMID: 35862418 PMCID: PMC9302833 DOI: 10.1371/journal.pone.0271786] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/11/2021] [Accepted: 07/07/2022] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVE To investigate the role of children in the home and household crowding as risk factors for severe COVID-19 disease. METHODS We used interview data from 6,831 U.S. adults screened for the Communities, Households and SARS/CoV-2 Epidemiology (CHASING) COVID Cohort Study in April 2020. RESULTS In logistic regression models, the adjusted odds ratio [aOR] of hospitalization due to COVID-19 for having (versus not having) children in the home was 10.5 (95% CI:5.7-19.1) among study participants living in multi-unit dwellings and 2.2 (95% CI:1.2-6.5) among those living in single unit dwellings. Among participants living in multi-unit dwellings, the aOR for COVID-19 hospitalization among participants with more than 4 persons in their household (versus 1 person) was 2.5 (95% CI:1.0-6.1), and 0.8 (95% CI:0.15-4.1) among those living in single unit dwellings. CONCLUSION Early in the US SARS-CoV-2 pandemic, certain household exposures likely increased the risk of both SARS-CoV-2 acquisition and the risk of severe COVID-19 disease.
Affiliation(s)
- Denis Nash
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
| | - Saba Qasmieh
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
| | - McKaylee Robertson
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
| | - Madhura Rane
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
| | - Rebecca Zimba
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
| | - Sarah G. Kulkarni
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
| | - Amanda Berry
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
| | - William You
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
| | - Chloe Mirzayi
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
| | - Drew Westmoreland
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
| | - Angela Parcesepe
- Department of Maternal and Child Health, Gillings School of Public Health, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Levi Waldron
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
| | - Shivani Kochhar
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
| | - Andrew R. Maroko
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
| | - Christian Grov
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY), New York City, New York, United States of America
- Department of Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York City, New York, United States of America
10
Khaliq AM, Erdogan C, Kurt Z, Turgut SS, Grunvald MW, Rand T, Khare S, Borgia JA, Hayden DM, Pappas SG, Govekar HR, Kam AE, Reiser J, Turaga K, Radovich M, Zang Y, Qiu Y, Liu Y, Fishel ML, Turk A, Gupta V, Al-Sabti R, Subramanian J, Kuzel TM, Sadanandam A, Waldron L, Hussain A, Saleem M, El-Rayes B, Salahudeen AA, Masood A. Correction: Refining colorectal cancer classification and clinical stratification through a single-cell atlas. Genome Biol 2022; 23:156. [PMID: 35831907 PMCID: PMC9277898 DOI: 10.1186/s13059-022-02724-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022] Open
Affiliation(s)
- Ateeq M Khaliq
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Cihat Erdogan
- Isparta University of Applied Sciences, Isparta, Turkey
| | - Zeyneb Kurt
- Northumbria University, Newcastle upon Tyne, UK
| | | | | | - Tim Rand
- Tempus Labs, Inc., Chicago, IL, USA
| | | | | | | | - Sam G Pappas
- Rush University Medical Center, Chicago, IL, USA
| | | | - Audrey E Kam
- Rush University Medical Center, Chicago, IL, USA
| | | | | | - Milan Radovich
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Yong Zang
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Yingjie Qiu
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Yunlong Liu
- Indiana University School of Medicine, Indianapolis, IN, USA
| | | | - Anita Turk
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Vineet Gupta
- Rush University Medical Center, Chicago, IL, USA
| | - Ram Al-Sabti
- Rush University Medical Center, Chicago, IL, USA
| | | | | | | | - Levi Waldron
- CUNY Graduate School of Public Health and Health Policy, New York, NY, USA
| | - Arif Hussain
- University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center, Baltimore, MD, USA
| | | | - Bassel El-Rayes
- University of Alabama, O'Neil Comprehensive Cancer Institute, Birmingham, AL, USA
| | | | - Ashiq Masood
- Indiana University School of Medicine, Indianapolis, IN, USA.
11
Oh S, Geistlinger L, Ramos M, Blankenberg D, van den Beek M, Taroni JN, Carey VJ, Greene CS, Waldron L, Davis S. GenomicSuperSignature facilitates interpretation of RNA-seq experiments through robust, efficient comparison to public databases. Nat Commun 2022; 13:3695. [PMID: 35760813 PMCID: PMC9237024 DOI: 10.1038/s41467-022-31411-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/26/2021] [Accepted: 06/14/2022] [Indexed: 02/04/2023] Open
Abstract
Millions of transcriptomic profiles have been deposited in public archives, yet remain underused for the interpretation of new experiments. We present a method for interpreting new transcriptomic datasets through instant comparison to public datasets without high-performance computing requirements. We apply Principal Component Analysis on 536 studies comprising 44,890 human RNA sequencing profiles and aggregate sufficiently similar loading vectors to form Replicable Axes of Variation (RAV). RAVs are annotated with metadata of originating studies and by gene set enrichment analysis. Functionality to associate new datasets with RAVs, extract interpretable annotations, and provide intuitive visualization are implemented as the GenomicSuperSignature R/Bioconductor package. We demonstrate the efficient and coherent database search, robustness to batch effects and heterogeneous training data, and transfer learning capacity of our method using TCGA and rare diseases datasets. GenomicSuperSignature aids in analyzing new gene expression data in the context of existing databases using minimal computing resources.
Affiliation(s)
- Sehyun Oh
- Graduate School of Public Health and Health Policy and Institute for Implementation Sciences in Public Health, City University of New York, New York, NY USA
| | - Ludwig Geistlinger
- Center for Computational Biomedicine, Harvard Medical School, Boston, MA USA
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy and Institute for Implementation Sciences in Public Health, City University of New York, New York, NY USA
| | - Daniel Blankenberg
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH USA
| | - Marius van den Beek
- The Pennsylvania State University, State College, PA USA
| | - Jaclyn N. Taroni
- Childhood Cancer Data Lab, Alex’s Lemonade Stand Foundation, Bala Cynwyd, PA USA
| | - Vincent J. Carey
- Channing Division of Network Medicine, Mass General Brigham, Harvard Medical School, Boston, MA USA
| | - Casey S. Greene
- Center for Health AI, University of Colorado Anschutz School of Medicine, Denver, CO USA
| | - Levi Waldron
- Graduate School of Public Health and Health Policy and Institute for Implementation Sciences in Public Health, City University of New York, New York, NY USA
| | - Sean Davis
- Center for Health AI, University of Colorado Anschutz School of Medicine, Denver, CO USA
12
Ghazi AR, Sucipto K, Rahnavard A, Franzosa EA, McIver LJ, Lloyd-Price J, Schwager E, Weingart G, Moon YS, Morgan XC, Waldron L, Huttenhower C. High-sensitivity pattern discovery in large, paired multiomic datasets. Bioinformatics 2022; 38:i378-i385. [PMID: 35758795 PMCID: PMC9235493 DOI: 10.1093/bioinformatics/btac232] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Modern biological screens yield enormous numbers of measurements, and identifying and interpreting statistically significant associations among features are essential. In experiments featuring multiple high-dimensional datasets collected from the same set of samples, it is useful to identify groups of associated features between the datasets in a way that provides high statistical power and false discovery rate (FDR) control. RESULTS Here, we present a novel hierarchical framework, HAllA (Hierarchical All-against-All association testing), for structured association discovery between paired high-dimensional datasets. HAllA efficiently integrates hierarchical hypothesis testing with FDR correction to reveal significant linear and non-linear block-wise relationships among continuous and/or categorical data. We optimized and evaluated HAllA using heterogeneous synthetic datasets of known association structure, where HAllA outperformed all-against-all and other block-testing approaches across a range of common similarity measures. We then applied HAllA to a series of real-world multiomics datasets, revealing new associations between gene expression and host immune activity, the microbiome and host transcriptome, metabolomic profiling and human health phenotypes. AVAILABILITY AND IMPLEMENTATION An open-source implementation of HAllA is freely available at http://huttenhower.sph.harvard.edu/halla along with documentation, demo datasets and a user group. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
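For readers unfamiliar with FDR control, the following Python sketch shows the Benjamini-Hochberg step-up rule applied to a flat list of p-values, as one might obtain from a naive all-against-all test. This is the kind of baseline the abstract compares against, not HAllA's hierarchical procedure. The p-values are hypothetical and the function name is illustrative.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], fdr: float = 0.05) -> List[bool]:
    """Return a reject/keep flag per hypothesis using the BH step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_k = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values, one per feature pair in an all-against-all test.
p_values = [0.041, 0.001, 0.205, 0.008, 0.039, 0.074, 0.042, 0.060]
flags = benjamini_hochberg(p_values, fdr=0.05)
print(flags)
```

With many feature pairs, the per-test threshold `(k / m) * fdr` becomes very strict as `m` grows, which is one motivation for testing blocks of related features rather than every pair separately.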
Affiliation(s)
- Andrew R Ghazi
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
| | - Kathleen Sucipto
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
| | - Ali Rahnavard
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Eric A Franzosa
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
| | - Lauren J McIver
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
| | - Jason Lloyd-Price
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Emma Schwager
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
| | - George Weingart
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
| | - Yo Sup Moon
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
| | - Xochitl C Morgan
- Department of Microbiology and Immunology, University of Otago, Dunedin 9016, New Zealand
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, City University of New York Graduate School of Public Health and Health Policy, New York City, NY 10035, USA
| | - Curtis Huttenhower
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
13
Nash D, Rane MS, Robertson MM, Chang M, Gorrell SK, Zimba R, You W, Berry A, Mirzayi C, Kochhar S, Maroko A, Westmoreland DA, Parcesepe AM, Waldron L, Grov C. Severe Acute Respiratory Syndrome Coronavirus 2 Incidence and Risk Factors in a National, Community-Based Prospective Cohort of US Adults. Clin Infect Dis 2022; 76:e375-e384. [PMID: 35639911 PMCID: PMC9213857 DOI: 10.1093/cid/ciac423] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/27/2021] [Revised: 04/01/2022] [Accepted: 05/24/2022] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Prospective cohort studies of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) incidence complement case-based surveillance and cross-sectional seroprevalence surveys. METHODS We estimated the incidence of SARS-CoV-2 infection in a national cohort of 6738 US adults, enrolled in March-August 2020. Using Poisson models, we examined the association of social distancing and a composite epidemiologic risk score with seroconversion. The risk score was created using least absolute shrinkage selection operator (LASSO) regression to identify factors predictive of seroconversion. The selected factors were household crowding, confirmed case in household, indoor dining, gathering with groups of ≥10, and no masking in gyms or salons. RESULTS Among 4510 individuals with ≥1 serologic test, 323 (7.3% [95% confidence interval (CI), 6.5%-8.1%]) seroconverted by January 2021. Among 3422 participants seronegative in May-September 2020 and retested from November 2020 to January 2021, 161 seroconverted over 1646 person-years of follow-up (9.8 per 100 person-years [95% CI, 8.3-11.5]). The seroincidence rate was lower among women compared with men (incidence rate ratio [IRR], 0.69 [95% CI, .50-.94]) and higher among Hispanic (2.09 [1.41-3.05]) than white non-Hispanic participants. In adjusted models, participants who reported social distancing with people they did not know (IRR for always vs never social distancing, 0.42 [95% CI, .20-1.0]) and with people they knew (IRR for always vs never, 0.64 [.39-1.06]; IRR for sometimes vs never, 0.60 [.38-.96]) had lower seroconversion risk. Seroconversion risk increased with epidemiologic risk score (IRR for medium vs low score, 1.68 [95% CI, 1.03-2.81]; IRR for high vs low score, 3.49 [2.26-5.58]). Only 29% of those who seroconverted reported isolating, and only 19% were asked about contacts. CONCLUSIONS Modifiable risk factors and poor reach of public health strategies drove SARS-CoV-2 transmission across the United States.
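The seroincidence figure in the abstract can be reproduced from the two reported counts. The Python sketch below divides events by person-years and attaches an approximate 95% interval using a normal approximation on the log scale. That interval method is an assumption of this sketch, not necessarily the one the authors used, so the bounds need not match the published interval exactly.

```python
import math


def incidence_rate_per_100(events: int, person_years: float):
    """Rate per 100 person-years with an approximate 95% CI.

    The interval uses a normal approximation on the log scale
    (an assumption of this sketch, not necessarily the authors' method).
    """
    rate = events / person_years * 100
    se_log = 1 / math.sqrt(events)  # standard error of log(rate) for a Poisson count
    lower = rate * math.exp(-1.96 * se_log)
    upper = rate * math.exp(1.96 * se_log)
    return rate, lower, upper


# Counts taken from the abstract: 161 seroconversions over 1646 person-years.
rate, lower, upper = incidence_rate_per_100(161, 1646)
print(f"{rate:.1f} per 100 person-years ({lower:.1f}-{upper:.1f})")
```

The point estimate rounds to the 9.8 per 100 person-years reported in the abstract.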
Affiliation(s)
- Denis Nash
- CORRESPONDING AUTHOR: Denis Nash, Ph.D., MPH CUNY Graduate School of Public Health and Health Policy 55 W. 125th St., 6th Floor New York, NY USA 10027
| | - Madhura S. Rane
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - McKaylee M. Robertson
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Mindy Chang
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Sarah Kulkarni Gorrell
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Rebecca Zimba
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - William You
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Amanda Berry
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Chloe Mirzayi
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Shivani Kochhar
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Andrew Maroko
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA,Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Drew A. Westmoreland
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Angela M. Parcesepe
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA,Department of Maternal and Child Health, Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, USA,Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Levi Waldron
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA,Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Christian Grov
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA,Department of Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
|
14
|
Khaliq AM, Erdogan C, Kurt Z, Turgut SS, Grunvald MW, Rand T, Khare S, Borgia JA, Hayden DM, Pappas SG, Govekar HR, Kam AE, Reiser J, Turaga K, Radovich M, Zang Y, Qiu Y, Liu Y, Fishel ML, Turk A, Gupta V, Al-Sabti R, Subramanian J, Kuzel TM, Sadanandam A, Waldron L, Hussain A, Saleem M, El-Rayes B, Salahudeen AA, Masood A. Refining colorectal cancer classification and clinical stratification through a single-cell atlas. Genome Biol 2022; 23:113. [PMID: 35538548 PMCID: PMC9092724 DOI: 10.1186/s13059-022-02677-z] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 12/15/2021] [Accepted: 04/21/2022] [Indexed: 12/15/2022]
Abstract
BACKGROUND Colorectal cancer (CRC) consensus molecular subtypes (CMS) have different immunological, stromal cell, and clinicopathological characteristics. Single-cell characterization of CMS subtype tumor microenvironments is required to elucidate mechanisms of tumor and stroma cell contributions to pathogenesis, which may advance subtype-specific therapeutic development. We interrogate racially diverse human CRC samples and analyze multiple independent external cohorts for a total of 487,829 single cells, enabling high-resolution depiction of the cellular diversity and heterogeneity within the tumor and microenvironmental cells. RESULTS Tumor cells recapitulate individual CMS subgroups yet exhibit significant intratumoral CMS heterogeneity. Both CMS1 microsatellite instability (MSI-H) CRCs and microsatellite stable (MSS) CRCs demonstrate similar pathway activations at the tumor epithelial level. However, CD8+ cytotoxic T cell phenotype infiltration in MSI-H CRCs may explain why these tumors respond to immune checkpoint inhibitors. Cellular transcriptomic profiles in CRC exist in a tumor-immune-stromal continuum, in contrast to the discrete subtypes proposed by studies utilizing bulk transcriptomics. We note that a dichotomy in tumor microenvironments exists across CMS subgroups, in which patients with high cancer-associated fibroblast (CAF) and C1Q+ TAM content exhibit poor outcomes, providing a higher level of personalization and precision than would distinct subtypes. Additionally, we discover CAF subtypes known to be associated with immunotherapy resistance. CONCLUSIONS Distinct CAFs and C1Q+ TAMs are sufficient to explain CMS predictive ability, and a simpler signature based on these cellular phenotypes could stratify CRC patient prognosis with greater precision. Therapeutically targeting specific CAF subtypes and C1Q+ TAMs may promote immunotherapy responses in CRC patients.
Affiliation(s)
- Ateeq M Khaliq
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Cihat Erdogan
- Isparta University of Applied Sciences, Isparta, Turkey
| | - Zeyneb Kurt
- Northumbria University, Newcastle Upon Tyne, UK
| | | | | | - Tim Rand
- Tempus Labs, Inc., Chicago, IL, USA
| | | | | | | | - Sam G Pappas
- Rush University Medical Center, Chicago, IL, USA
| | | | - Audrey E Kam
- Rush University Medical Center, Chicago, IL, USA
| | | | | | - Milan Radovich
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Yong Zang
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Yingjie Qiu
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Yunlong Liu
- Indiana University School of Medicine, Indianapolis, IN, USA
| | | | - Anita Turk
- Indiana University School of Medicine, Indianapolis, IN, USA
| | - Vineet Gupta
- Rush University Medical Center, Chicago, IL, USA
| | - Ram Al-Sabti
- Rush University Medical Center, Chicago, IL, USA
| | | | | | | | - Levi Waldron
- CUNY Graduate School of Public Health and Health Policy, New York, NY, USA
| | - Arif Hussain
- University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center, Baltimore, MD, USA
| | | | - Bassel El-Rayes
- University of Alabama, O'Neal Comprehensive Cancer Institute, Birmingham, AL, USA
| | | | - Ashiq Masood
- Indiana University School of Medicine, Indianapolis, IN, USA.
|
15
|
Lee KA, Thomas AM, Bolte LA, Björk JR, de Ruijter LK, Armanini F, Asnicar F, Blanco-Miguez A, Board R, Calbet-Llopart N, Derosa L, Dhomen N, Brooks K, Harland M, Harries M, Leeming ER, Lorigan P, Manghi P, Marais R, Newton-Bishop J, Nezi L, Pinto F, Potrony M, Puig S, Serra-Bellver P, Shaw HM, Tamburini S, Valpione S, Vijay A, Waldron L, Zitvogel L, Zolfo M, de Vries EGE, Nathan P, Fehrmann RSN, Bataille V, Hospers GAP, Spector TD, Weersma RK, Segata N. Cross-cohort gut microbiome associations with immune checkpoint inhibitor response in advanced melanoma. Nat Med 2022; 28:535-544. [PMID: 35228751 PMCID: PMC8938272 DOI: 10.1038/s41591-022-01695-5] [Citation(s) in RCA: 134] [Impact Index Per Article: 67.0] [Received: 03/24/2021] [Accepted: 01/13/2022] [Indexed: 12/13/2022]
Abstract
The composition of the gut microbiome has been associated with clinical responses to immune checkpoint inhibitor (ICI) treatment, but there is limited consensus on the specific microbiome characteristics linked to the clinical benefits of ICIs. We performed shotgun metagenomic sequencing of stool samples collected before ICI initiation from five observational cohorts recruiting ICI-naive patients with advanced cutaneous melanoma (n = 165). Integrating the dataset with 147 metagenomic samples from previously published studies, we found that the gut microbiome has a relevant, but cohort-dependent, association with the response to ICIs. A machine learning analysis confirmed the link between the microbiome and overall response rates (ORRs) and progression-free survival (PFS) with ICIs but also revealed limited reproducibility of microbiome-based signatures across cohorts. Accordingly, a panel of species, including Bifidobacterium pseudocatenulatum, Roseburia spp. and Akkermansia muciniphila, associated with responders was identified, but no single species could be regarded as a fully consistent biomarker across studies. Overall, the role of the human gut microbiome in ICI response appears more complex than previously thought, extending beyond differing microbial species simply present or absent in responders and nonresponders. Future studies should adopt larger sample sizes and take into account the complex interplay of clinical factors with the gut microbiome over the treatment course.
Affiliation(s)
- Karla A Lee
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | | | - Laura A Bolte
- Department of Gastroenterology and Hepatology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Johannes R Björk
- Department of Gastroenterology and Hepatology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Laura Kist de Ruijter
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | | | | | | | - Ruth Board
- Department of Oncology, Lancashire Teaching Hospitals NHS Trust, Preston, UK
| | - Neus Calbet-Llopart
- Dermatology Department, Hospital Clínic Barcelona, Universitat de Barcelona, IDIBAPS, Barcelona, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Raras, Instituto de Salud Carlos III, Barcelona, Spain
| | - Lisa Derosa
- U1015 INSERM, University Paris Saclay, Gustave Roussy Cancer Center and Oncobiome Network, Villejuif-Grand-Paris, France
| | - Nathalie Dhomen
- Molecular Oncology Group, CRUK Manchester Institute, University of Manchester, Manchester, UK
| | - Kelly Brooks
- Molecular Oncology Group, CRUK Manchester Institute, University of Manchester, Manchester, UK
| | - Mark Harland
- Division of Haematology and Immunology, Institute of Medical Research at St. James's, University of Leeds, Leeds, UK
| | - Mark Harries
- Biochemical and Molecular Genetics Department, Hospital Clínic de Barcelona, IDIBAPS and University of Barcelona, Barcelona, Spain
- Department of Medical Oncology, Guys Cancer Centre, Guys and St Thomas's NHS Trust, London, UK
| | - Emily R Leeming
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Paul Lorigan
- The Christie NHS Foundation Trust, Manchester, UK
- Division of Cancer Sciences, University of Manchester, Manchester, UK
| | - Paolo Manghi
- Department CIBIO, University of Trento, Trento, Italy
| | - Richard Marais
- Molecular Oncology Group, CRUK Manchester Institute, University of Manchester, Manchester, UK
| | - Julia Newton-Bishop
- Division of Haematology and Immunology, Institute of Medical Research at St. James's, University of Leeds, Leeds, UK
| | - Luigi Nezi
- European Institute of Oncology (Istituto Europeo di Oncologia, IRCCS), Milan, Italy
| | | | - Miriam Potrony
- Centro de Investigación Biomédica en Red en Enfermedades Raras, Instituto de Salud Carlos III, Barcelona, Spain
- Biochemical and Molecular Genetics Department, Hospital Clínic de Barcelona, IDIBAPS and University of Barcelona, Barcelona, Spain
| | - Susana Puig
- Centro de Investigación Biomédica en Red en Enfermedades Raras, Instituto de Salud Carlos III, Barcelona, Spain
- Biochemical and Molecular Genetics Department, Hospital Clínic de Barcelona, IDIBAPS and University of Barcelona, Barcelona, Spain
| | | | - Heather M Shaw
- Department of Medical Oncology, Mount Vernon Cancer Centre, Northwood, UK
| | - Sabrina Tamburini
- European Institute of Oncology (Istituto Europeo di Oncologia, IRCCS), Milan, Italy
| | - Sara Valpione
- Molecular Oncology Group, CRUK Manchester Institute, University of Manchester, Manchester, UK
- The Christie NHS Foundation Trust, Manchester, UK
| | - Amrita Vijay
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Rheumatology & Orthopaedics Division, School of Medicine, University of Nottingham, Nottingham, UK
| | - Levi Waldron
- Department CIBIO, University of Trento, Trento, Italy
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA
| | - Laurence Zitvogel
- U1015 INSERM, University Paris Saclay, Gustave Roussy Cancer Center and Oncobiome Network, Villejuif-Grand-Paris, France
| | - Moreno Zolfo
- Department CIBIO, University of Trento, Trento, Italy
| | - Elisabeth G E de Vries
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Paul Nathan
- Biochemical and Molecular Genetics Department, Hospital Clínic de Barcelona, IDIBAPS and University of Barcelona, Barcelona, Spain
| | - Rudolf S N Fehrmann
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Véronique Bataille
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Department of Dermatology, Mount Vernon Cancer Centre, Northwood, UK
| | - Geke A P Hospers
- Department of Medical Oncology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands
| | - Tim D Spector
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
| | - Rinse K Weersma
- Department of Gastroenterology and Hepatology, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands.
| | - Nicola Segata
- Department CIBIO, University of Trento, Trento, Italy.
- European Institute of Oncology (Istituto Europeo di Oncologia, IRCCS), Milan, Italy.
|
16
|
Schatz MC, Philippakis AA, Afgan E, Banks E, Carey VJ, Carroll RJ, Culotti A, Ellrott K, Goecks J, Grossman RL, Hall IM, Hansen KD, Lawson J, Leek JT, Luria AO, Mosher S, Morgan M, Nekrutenko A, O’Connor BD, Osborn K, Paten B, Patterson C, Tan FJ, Taylor CO, Vessio J, Waldron L, Wang T, Wuichet K. Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space. Cell Genom 2022; 2:100085. [PMID: 35199087 PMCID: PMC8863334 DOI: 10.1016/j.xgen.2021.100085] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Indexed: 12/14/2022]
Abstract
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.
Affiliation(s)
- Michael C. Schatz
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA,Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA,Corresponding author
| | | | - Enis Afgan
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Eric Banks
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Robert J. Carroll
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Alessandro Culotti
- Broad Institute of MIT and Harvard, Cambridge, MA, USA,Center for Translational Data Science, University of Chicago, Chicago, IL, USA
| | - Kyle Ellrott
- Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
| | - Jeremy Goecks
- Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
| | - Robert L. Grossman
- Center for Translational Data Science, University of Chicago, Chicago, IL, USA
| | - Ira M. Hall
- Yale School of Medicine, Yale University, New Haven, CT, USA
| | - Kasper D. Hansen
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
| | | | - Jeffrey T. Leek
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
| | | | - Stephen Mosher
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Martin Morgan
- Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | - Anton Nekrutenko
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, PA, USA
| | | | - Kevin Osborn
- UC Santa Cruz Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA
| | | | - Frederick J. Tan
- Department of Embryology, Carnegie Institution, Baltimore, MD, USA
| | - Casey Overby Taylor
- Departments of Medicine and Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Jennifer Vessio
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, City University of New York Graduate School of Public Health and Health Policy, New York, NY, USA
| | - Ting Wang
- Department of Genetics, Washington University of St. Louis, St. Louis, MO, USA
| | - Kristin Wuichet
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
|
17
|
Zimba R, Romo ML, Kulkarni SG, Berry A, You W, Mirzayi C, Westmoreland DA, Parcesepe AM, Waldron L, Rane MS, Kochhar S, Robertson MM, Maroko AR, Grov C, Nash D. Patterns of SARS-CoV-2 Testing Preferences in a National Cohort in the United States: Latent Class Analysis of a Discrete Choice Experiment. JMIR Public Health Surveill 2021; 7:e32846. [PMID: 34793320 PMCID: PMC8722498 DOI: 10.2196/32846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2021] [Revised: 10/21/2021] [Accepted: 11/15/2021] [Indexed: 02/01/2023]
Abstract
BACKGROUND Inadequate screening and diagnostic testing in the United States throughout the first several months of the COVID-19 pandemic led to undetected cases transmitting disease in the community and an underestimation of cases. Though testing supply has increased, maintaining testing uptake remains a public health priority in the efforts to control community transmission considering the availability of vaccinations and threats from variants. OBJECTIVE This study aimed to identify patterns of preferences for SARS-CoV-2 screening and diagnostic testing prior to widespread vaccine availability and uptake. METHODS We conducted a discrete choice experiment (DCE) among participants in the national, prospective CHASING COVID (Communities, Households, and SARS-CoV-2 Epidemiology) Cohort Study from July 30 to September 8, 2020. The DCE elicited preferences for SARS-CoV-2 test type, specimen type, testing venue, and result turnaround time. We used latent class multinomial logit to identify distinct patterns of preferences related to testing as measured by attribute-level part-worth utilities and conducted a simulation based on the utility estimates to predict testing uptake if additional testing scenarios were offered. RESULTS Of the 5098 invited cohort participants, 4793 (94.0%) completed the DCE. Five distinct patterns of SARS-CoV-2 testing emerged. Noninvasive home testers (n=920, 19.2% of participants) were most influenced by specimen type and favored less invasive specimen collection methods, with saliva being most preferred; this group was the least likely to opt out of testing. Fast-track testers (n=1235, 25.8%) were most influenced by result turnaround time and favored immediate and same-day turnaround time. Among dual testers (n=889, 18.5%), test type was the most important attribute, and preference was given to both antibody and viral tests. 
Noninvasive dual testers (n=1578, 32.9%) were most strongly influenced by specimen type and test type, preferring saliva and cheek swab specimens and both antibody and viral tests. Among hesitant home testers (n=171, 3.6%), the venue was the most important attribute; notably, this group was the most likely to opt out of testing. In addition to variability in preferences for testing features, heterogeneity was observed in the distribution of certain demographic characteristics (age, race/ethnicity, education, and employment), history of SARS-CoV-2 testing, COVID-19 diagnosis, and concern about the pandemic. Simulation models predicted that testing uptake would increase from 81.6% (with a status quo scenario of polymerase chain reaction by nasal swab in a provider's office and a turnaround time of several days) to 98.1% by offering additional scenarios using less invasive specimens, both viral and antibody tests from a single specimen, faster turnaround time, and at-home testing. CONCLUSIONS We identified substantial differences in preferences for SARS-CoV-2 testing and found that offering additional testing options would likely increase testing uptake in line with public health goals. Additional studies may be warranted to understand if preferences for testing have changed since the availability and widespread uptake of vaccines.
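The uptake simulation described in the abstract above predicts how many participants would choose any test rather than opt out when more testing scenarios are offered. One common way to turn utilities into predicted choice shares is the multinomial logit (share-of-preference) rule, sketched below. The abstract does not say which simulation rule the authors used, and every scenario name and utility value here is hypothetical, chosen only to show the mechanics.

```python
import math


def choice_shares(utilities):
    """Multinomial logit shares: exp(U_i) / sum_j exp(U_j)."""
    peak = max(utilities.values())  # subtract the max for numerical stability
    weights = {name: math.exp(u - peak) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical total utilities for one respondent (not from the study).
status_quo = {"pcr_nasal_office_several_days": 1.0, "opt_out": 0.0}
expanded = {**status_quo, "saliva_home_same_day": 2.5}

# Predicted uptake is the share of choices that are not the opt-out option.
uptake_before = 1 - choice_shares(status_quo)["opt_out"]
uptake_after = 1 - choice_shares(expanded)["opt_out"]
print(f"uptake: {uptake_before:.3f} -> {uptake_after:.3f}")
```

Under this rule, adding an attractive scenario can only lower the opt-out share, which is the qualitative pattern the abstract reports; in a latent class model the shares would be computed per class and then weighted by class size.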
Affiliation(s)
- Rebecca Zimba
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
- Department of Epidemiology and Biostatistics, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Matthew L Romo
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
- Department of Epidemiology and Biostatistics, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Sarah G Kulkarni
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Amanda Berry
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - William You
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Chloe Mirzayi
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
- Department of Epidemiology and Biostatistics, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Drew A Westmoreland
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Angela M Parcesepe
- Department of Maternal and Child Health, Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, United States
- Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
| | - Levi Waldron
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
- Department of Epidemiology and Biostatistics, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Madhura S Rane
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Shivani Kochhar
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - McKaylee M Robertson
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Andrew R Maroko
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
- Department of Environmental, Occupational, and Geospatial Health Sciences, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Christian Grov
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
- Department of Community Health and Social Sciences, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
| | - Denis Nash
- Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
- Department of Epidemiology and Biostatistics, CUNY Graduate School of Public Health & Health Policy, New York, NY, United States
|
18
|
Mallick H, Rahnavard A, McIver LJ, Ma S, Zhang Y, Nguyen LH, Tickle TL, Weingart G, Ren B, Schwager EH, Chatterjee S, Thompson KN, Wilkinson JE, Subramanian A, Lu Y, Waldron L, Paulson JN, Franzosa EA, Bravo HC, Huttenhower C. Multivariable association discovery in population-scale meta-omics studies. PLoS Comput Biol 2021; 17:e1009442. [PMID: 34784344 PMCID: PMC8714082 DOI: 10.1371/journal.pcbi.1009442] [Citation(s) in RCA: 540] [Impact Index Per Article: 180.0] [Received: 08/05/2021] [Revised: 12/28/2021] [Accepted: 09/09/2021] [Indexed: 12/13/2022]
Abstract
It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.
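The abstract above mentions controlling false discovery when many microbial features are tested against metadata. A standard way to do this is the Benjamini-Hochberg adjustment of per-feature p-values, sketched below in plain Python. This is a generic illustration and not the MaAsLin 2 code; the function name and the example p-values are made up for the sketch.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, one per microbial feature.
raw = [0.01, 0.04, 0.03, 0.20]
qvalues = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(qvalues) if q < 0.05]
print(qvalues, significant)
```

Features whose adjusted value falls below the chosen threshold are reported as associations; the threshold of 0.05 used here is only an example.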
Affiliation(s)
- Himel Mallick
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Ali Rahnavard
- Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington DC, United States of America
| | - Lauren J. McIver
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Siyuan Ma
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Yancong Zhang
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Long H. Nguyen
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Timothy L. Tickle
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - George Weingart
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Boyu Ren
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Emma H. Schwager
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Suvo Chatterjee
- Epidemiology Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Kelsey N. Thompson
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Jeremy E. Wilkinson
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Ayshwarya Subramanian
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Yiren Lu
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, CUNY School of Public Health, New York City, New York, United States of America
| | - Joseph N. Paulson
- Department of Biostatistics, Product Development, Genentech, Inc., South San Francisco, California, United States of America
| | - Eric A. Franzosa
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Hector Corrada Bravo
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
| | - Curtis Huttenhower
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| |
|
19
|
Mallick H, Rahnavard A, McIver LJ, Ma S, Zhang Y, Nguyen LH, Tickle TL, Weingart G, Ren B, Schwager EH, Chatterjee S, Thompson KN, Wilkinson JE, Subramanian A, Lu Y, Waldron L, Paulson JN, Franzosa EA, Bravo HC, Huttenhower C. Multivariable association discovery in population-scale meta-omics studies. PLoS Comput Biol 2021. [PMID: 34784344 DOI: 10.1101/2021.01.20.427420v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/07/2023] Open
Abstract
It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.
Affiliation(s)
- Himel Mallick
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Ali Rahnavard
- Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington DC, United States of America
| | - Lauren J McIver
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Siyuan Ma
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Yancong Zhang
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Long H Nguyen
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Timothy L Tickle
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - George Weingart
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Boyu Ren
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Emma H Schwager
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Suvo Chatterjee
- Epidemiology Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Kelsey N Thompson
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Jeremy E Wilkinson
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Ayshwarya Subramanian
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Yiren Lu
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, CUNY School of Public Health, New York City, New York, United States of America
| | - Joseph N Paulson
- Department of Biostatistics, Product Development, Genentech, Inc., South San Francisco, California, United States of America
| | - Eric A Franzosa
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| | - Hector Corrada Bravo
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
| | - Curtis Huttenhower
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- The Broad Institute, Cambridge, Massachusetts, United States of America
| |
|
20
|
Nash D, Rane MS, Chang M, Kulkarni SG, Zimba R, You W, Berry A, Mirzayi C, Kochhar S, Maroko A, Robertson MM, Westmoreland DA, Parcesepe AM, Waldron L, Grov C. SARS-CoV-2 incidence and risk factors in a national, community-based prospective cohort of U.S. adults. medRxiv 2021:2021.02.12.21251659. [PMID: 33619505 PMCID: PMC7899475 DOI: 10.1101/2021.02.12.21251659] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/12/2023]
Abstract
BACKGROUND Epidemiologic risk factors for incident SARS-CoV-2 infection as determined via prospective cohort studies greatly augment and complement information from case-based surveillance and cross-sectional seroprevalence surveys. METHODS We estimated the incidence of SARS-CoV-2 infection and risk factors in a well-characterized, national prospective cohort of 6,738 U.S. adults, enrolled March-August 2020, a subset of whom (n=4,510) underwent repeat serologic testing between May 2020 and January 2021. We examined the crude associations of sociodemographic factors, epidemiologic risk factors, and county-level community transmission with the incidence of seroconversion. In multivariable Poisson models we examined the association of social distancing and a composite score of several epidemiologic risk factors with the rate of seroconversion. FINDINGS Among the 4,510 individuals with at least one serologic test, 323 (7.3%, 95% confidence interval [CI] 6.5%-8.1%) seroconverted by January 2021. Among 3,422 participants seronegative in May-September 2020 and tested during November 2020-January 2021, we observed 161 seroconversions over 1,646 person-years of follow-up (incidence rate of 9.8 per 100 person-years [95%CI 8.3-11.5]). In adjusted models, participants who reported always or sometimes social distancing with people they knew (IRRalways vs. never 0.43, 95%CI 0.21-1.0; IRRsometimes vs. never 0.47, 95%CI 0.22-1.2) and people they did not know (IRRalways vs. never 0.64, 95%CI 0.39-1.1; IRRsometimes vs. never 0.60, 95%CI 0.38-0.97) had lower rates of seroconversion. The rate of seroconversion increased across tertiles of the composite score of epidemiologic risk (IRRmedium vs. low 1.5, 95%CI 0.92-2.4; IRRhigh vs. low 3.0, 95%CI 2.0-4.6). Among the 161 observed seroconversions, 28% reported no symptoms of COVID-like illness (i.e., were asymptomatic), and 27% reported a positive SARS-CoV-2 diagnostic test. Ultimately, only 29% reported isolating and 19% were asked about contacts. INTERPRETATION Modifiable epidemiologic risk factors and poor reach of public health strategies drove SARS-CoV-2 transmission across the U.S. during May 2020-January 2021. FUNDING U.S. National Institute of Allergy and Infectious Diseases (NIAID).
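As a worked check of the incidence rate quoted in this abstract (161 seroconversions over 1,646 person-years), the following is a minimal Python sketch. The function name `incidence_rate_per_100py` is hypothetical, and the interval uses a Wald approximation on the log rate; the abstract does not state which interval method the authors used, so the bounds may differ slightly from the reported 8.3-11.5.

```python
import math

def incidence_rate_per_100py(events, person_years, z=1.96):
    """Return (rate, lower, upper) per 100 person-years.

    Assumption: a Wald interval on the log rate of a Poisson count.
    The source abstract does not say how its interval was computed.
    """
    rate = events / person_years
    se_log = 1.0 / math.sqrt(events)  # standard error of log(rate)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return tuple(100.0 * v for v in (rate, lower, upper))

# Counts taken from the abstract above
rate, lower, upper = incidence_rate_per_100py(161, 1646)
print(f"{rate:.1f} per 100 person-years (95% CI {lower:.1f}-{upper:.1f})")
```

The point estimate rounds to the 9.8 per 100 person-years given in the abstract; an exact Poisson interval would be a reasonable alternative to the approximation used here.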
Affiliation(s)
- Denis Nash
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Madhura S. Rane
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Mindy Chang
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Sarah Gorrell Kulkarni
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Rebecca Zimba
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - William You
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Amanda Berry
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Chloe Mirzayi
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Shivani Kochhar
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Andrew Maroko
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - McKaylee M. Robertson
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Drew A. Westmoreland
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Angela M. Parcesepe
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Maternal and Child Health, Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, USA
- Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Levi Waldron
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Christian Grov
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| |
|
21
|
Robertson MM, Kulkarni SG, Rane M, Kochhar S, Berry A, Chang M, Mirzayi C, You W, Maroko A, Zimba R, Westmoreland D, Grov C, Parcesepe AM, Waldron L, Nash D. Cohort profile: a national, community-based prospective cohort study of SARS-CoV-2 pandemic outcomes in the USA-the CHASING COVID Cohort study. BMJ Open 2021; 11:e048778. [PMID: 34548354 PMCID: PMC8458000 DOI: 10.1136/bmjopen-2021-048778] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 01/08/2021] [Accepted: 08/05/2021] [Indexed: 12/13/2022] Open
Abstract
PURPOSE The Communities, Households and SARS-CoV-2 Epidemiology (CHASING) COVID Cohort Study is a community-based prospective cohort study launched during the upswing of the USA COVID-19 epidemic. The objectives of the cohort study are to: (1) estimate and evaluate determinants of the incidence of SARS-CoV-2 infection, disease and deaths; (2) assess the impact of the pandemic on psychosocial and economic outcomes and (3) assess the uptake of pandemic mitigation strategies. PARTICIPANTS We began enrolling participants from 28 March 2020 using internet-based strategies. Adults ≥18 years residing anywhere in the USA or US territories were eligible. 6740 people are enrolled in the cohort, including participants from all 50 US states, the District of Columbia, Puerto Rico and Guam. Participants are contacted regularly to complete study assessments, including interviews and dried blood spot specimen collection for serologic testing. FINDINGS TO DATE Participants are geographically and sociodemographically diverse and include essential workers (19%). 84.2% remain engaged in cohort follow-up activities after enrolment. Data have been used to assess SARS-CoV-2 cumulative incidence, seroincidence and related risk factors at different phases of the US pandemic; the role of household crowding and the presence of children in the household as potential risk factors for severe COVID-19 early in the US pandemic; to describe the prevalence of anxiety symptoms and its relationship to COVID-19 outcomes and other potential stressors; to identify preferences for SARS-CoV-2 diagnostic testing when community transmission is on the rise via a discrete choice experiment and to assess vaccine hesitancy over time and its relationship to vaccine uptake. FUTURE PLANS The CHASING COVID Cohort Study has outlined a research agenda that involves ongoing monitoring of the incidence and determinants of SARS-CoV-2 outcomes, mental health outcomes and economic outcomes. Additional priorities include assessing the incidence, prevalence and correlates of long-haul COVID-19.
Affiliation(s)
- McKaylee M Robertson
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Sarah Gorrell Kulkarni
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Madhura Rane
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Shivani Kochhar
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Amanda Berry
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Mindy Chang
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Chloe Mirzayi
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - William You
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Andrew Maroko
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
- Environmental Health Sciences, Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA
| | - Rebecca Zimba
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Drew Westmoreland
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Christian Grov
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
- Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA
| | - Angela Marie Parcesepe
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
- Maternal and Child Health, University of North Carolina at Chapel Hill Gillings School of Global Public Health, Chapel Hill, North Carolina, USA
| | - Levi Waldron
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
| | - Denis Nash
- City University of New York (CUNY) Institute for Implementation Science in Population Health, New York, New York, USA
- Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA
| |
|
22
|
Carey VJ, Ramos M, Stubbs BJ, Gopaulakrishnan S, Oh S, Turaga N, Waldron L, Morgan M. Global Alliance for Genomics and Health Meets Bioconductor: Toward Reproducible and Agile Cancer Genomics at Cloud Scale. JCO Clin Cancer Inform 2021; 4:472-479. [PMID: 32453635 PMCID: PMC7265787 DOI: 10.1200/cci.19.00111] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
PURPOSE Institutional efforts toward the democratization of cloud-scale data and analysis methods for cancer genomics are proceeding rapidly. As part of this effort, we bridge two major bioinformatic initiatives: the Global Alliance for Genomics and Health (GA4GH) and Bioconductor. METHODS We describe in detail a use case in pancancer transcriptomics conducted by blending implementations of the GA4GH Workflow Execution Services and Tool Registry Service concepts with the Bioconductor curatedTCGAData and BiocOncoTK packages. RESULTS We carried out the analysis with a formally archived workflow and container at dockstore.org and a workspace and notebook at app.terra.bio. The analysis identified relationships between microsatellite instability and biomarkers of immune dysregulation at a finer level of granularity than previously reported. Our use of standard approaches to containerization and workflow programming allows this analysis to be replicated and extended. CONCLUSION Experimental use of dockstore.org and app.terra.bio in concert with Bioconductor enabled novel statistical analysis of large genomic projects without the need for local supercomputing resources but involved challenges related to container design, script archiving, and unit testing. Best practices and cost/benefit metrics for the management and analysis of globally federated genomic data and annotation are evolving. The creation and execution of use cases like the one reported here will be helpful in the development and comparison of approaches to federated data/analysis systems in cancer genomics.
Affiliation(s)
- Vincent J Carey
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Benjamin J Stubbs
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
| | - Shweta Gopaulakrishnan
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
| | - Sehyun Oh
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY
| | - Nitesh Turaga
- Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY
| | - Martin Morgan
- Roswell Park Comprehensive Cancer Center, Buffalo, NY
| |
|
23
|
Ramos M, Geistlinger L, Oh S, Schiffer L, Azhar R, Kodali H, de Bruijn I, Gao J, Carey VJ, Morgan M, Waldron L. Multiomic Integration of Public Oncology Databases in Bioconductor. JCO Clin Cancer Inform 2021; 4:958-971. [PMID: 33119407 PMCID: PMC7608653 DOI: 10.1200/cci.19.00119] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Indexed: 01/04/2023] Open
Abstract
PURPOSE Investigations of the molecular basis for the development, progression, and treatment of cancer increasingly use complementary genomic assays to gather multiomic data, but management and analysis of such data remain complex. The cBioPortal for cancer genomics currently provides multiomic data from > 260 public studies, including The Cancer Genome Atlas (TCGA) data sets, but integration of different data types remains challenging and error prone for computational methods and tools using these resources. Recent advances in data infrastructure within the Bioconductor project enable a novel and powerful approach to creating fully integrated representations of these multiomic, pan-cancer databases. METHODS We provide a set of R/Bioconductor packages for working with TCGA legacy data and cBioPortal data, with special considerations for loading time; efficient representations in and out of memory; analysis platform; and an integrative framework, such as MultiAssayExperiment. Large methylation data sets are provided through out-of-memory data representation to provide responsive loading times and analysis capabilities on machines with limited memory. RESULTS We developed the curatedTCGAData and cBioPortalData R/Bioconductor packages to provide integrated multiomic data sets from the TCGA legacy database and the cBioPortal web application programming interface using the MultiAssayExperiment data structure. This suite of tools provides coordination of diverse experimental assays with clinicopathological data with minimal data management burden, as demonstrated through several greatly simplified multiomic and pan-cancer analyses. CONCLUSION These integrated representations enable analysts and tool developers to apply general statistical and plotting methods to extensive multiomic data through user-friendly commands and documented examples.
Affiliation(s)
- Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY; Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| | - Sehyun Oh
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| | - Lucas Schiffer
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY; Section of Computational Biomedicine, Boston University School of Medicine, Boston, MA
| | - Rimsha Azhar
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY; Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
| | - Hanish Kodali
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| | - Ino de Bruijn
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY
| | - Jianjiong Gao
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY
| | - Vincent J Carey
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
| | - Martin Morgan
- Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| |
|
24
|
Abstract
PURPOSE Allele-specific copy number alteration (CNA) analysis is essential to study the functional impact of single-nucleotide variants (SNVs) and the process of tumorigenesis. However, controversy over whether it can be performed with sufficient accuracy in data without matched normal profiles and a lack of open-source implementations have limited its application in clinical research and diagnosis. METHODS We benchmark allele-specific CNA analysis performance of whole-exome sequencing (WES) data against gold standard whole-genome SNP6 microarray data and against WES data sets with matched normal samples. We provide a workflow based on the open-source PureCN R/Bioconductor package in conjunction with widely used variant-calling and copy number segmentation algorithms for allele-specific CNA analysis from WES without matched normals. This workflow further classifies SNVs by somatic status and then uses this information to infer somatic mutational signatures and tumor mutational burden (TMB). RESULTS Application of our workflow to tumor-only WES data produces tumor purity and ploidy estimates that are highly concordant with estimates from SNP6 microarray data and matched normal WES data. The presence of cancer type–specific somatic mutational signatures was inferred with high accuracy. We also demonstrate high concordance of TMB between our tumor-only workflow and matched normal pipelines. CONCLUSION The proposed workflow provides, to our knowledge, the only open-source option with demonstrated high accuracy for comprehensive allele-specific CNA analysis and SNV classification of tumor-only WES. An implementation of the workflow is available on the Terra Cloud platform of the Broad Institute (Cambridge, MA).
Affiliation(s)
- Sehyun Oh
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| | - Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| | | | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science and Population Health, City University of New York, New York, NY
| | - Markus Riester
- Novartis Institutes for BioMedical Research, Cambridge, MA
| |
|
25
|
Tomic A, Tomic I, Waldron L, Geistlinger L, Kuhn M, Spreng RL, Dahora LC, Seaton KE, Tomaras G, Hill J, Duggal NA, Pollock RD, Lazarus NR, Harridge SD, Lord JM, Khatri P, Pollard AJ, Davis MM. SIMON: Open-Source Knowledge Discovery Platform. Patterns (N Y) 2021; 2:100178. [PMID: 33511368 PMCID: PMC7815964 DOI: 10.1016/j.patter.2020.100178] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/23/2020] [Revised: 10/27/2020] [Accepted: 12/04/2020] [Indexed: 02/06/2023]
Abstract
Data analysis and knowledge discovery have become more and more important in biology and medicine with the increasing complexity of biological datasets, but the sophisticated programming skills and in-depth understanding of algorithms they require pose barriers for most biologists and clinicians who wish to perform such research. We have developed modular open-source software, SIMON, to facilitate the application of 180+ state-of-the-art machine-learning algorithms to high-dimensional biomedical data. With an easy-to-use graphical user interface, standardized pipelines, and an automated approach for machine learning and other statistical analysis methods, SIMON helps to identify optimal algorithms and provides a resource that empowers non-technical and technical researchers to identify crucial patterns in biomedical data.
Affiliation(s)
- Adriana Tomic
- Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, UK; Institute of Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA; Corresponding author
| | - Ivan Tomic
- Deep Medicine, Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, UK; Corresponding author
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA; Institute for Implementation Science and Population Health, City University of New York, New York, NY, USA
| | - Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA; Institute for Implementation Science and Population Health, City University of New York, New York, NY, USA
| | | | | | | | - Kelly E. Seaton
- Duke Human Vaccine Institute, Duke University, Durham, NC, USA
| | - Georgia Tomaras
- Duke Human Vaccine Institute, Duke University, Durham, NC, USA
| | - Jennifer Hill
- Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, UK
| | - Niharika A. Duggal
- MRC-Versus Arthritis Centre for Musculoskeletal Ageing Research, Institute of Inflammation and Ageing, University of Birmingham Research Labs, Birmingham, UK
| | - Ross D. Pollock
- Centre for Human and Applied Physiological Sciences, King's College London, UK
| | - Norman R. Lazarus
- Centre for Human and Applied Physiological Sciences, King's College London, UK
| | | | - Janet M. Lord
- MRC-Versus Arthritis Centre for Musculoskeletal Ageing Research, Institute of Inflammation and Ageing, University of Birmingham Research Labs, Birmingham, UK,NIHR Birmingham Biomedical Research Centre, University Hospital Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
| | - Purvesh Khatri
- Institute of Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA,Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA, USA
| | - Andrew J. Pollard
- Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, UK
| | - Mark M. Davis
- Institute of Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA,Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA,Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA,Corresponding author
26
Zimba R, Kulkarni S, Berry A, You W, Mirzayi C, Westmoreland D, Parcesepe A, Waldron L, Rane M, Kochhar S, Robertson M, Maroko A, Grov C, Nash D. SARS-CoV-2 Testing Service Preferences of Adults in the United States: Discrete Choice Experiment. JMIR Public Health Surveill 2020; 6:e25546. [PMID: 33315584 PMCID: PMC7781587 DOI: 10.2196/25546] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/05/2020] [Revised: 12/09/2020] [Accepted: 12/09/2020] [Indexed: 12/16/2022]
Abstract
Background: Ascertaining preferences for SARS-CoV-2 testing and incorporating findings into the design and implementation of strategies for delivering testing services may enhance testing uptake and engagement, a prerequisite to reducing onward transmission.
Objective: This study aims to determine important drivers of decisions to obtain a SARS-CoV-2 test in the context of increasing community transmission.
Methods: We used a discrete choice experiment to assess preferences for SARS-CoV-2 test type, specimen type, testing venue, and results turnaround time. Participants (n=4793) from the US national longitudinal Communities, Households and SARS-CoV-2 Epidemiology (CHASING) COVID Cohort Study completed our online survey from July 30 to September 8, 2020. We estimated the relative importance of testing method attributes and part-worth utilities of attribute levels, and simulated the uptake of an optimized testing scenario relative to the current typical testing scenario of polymerase chain reaction (PCR) via nasopharyngeal swab in a provider's office or urgent care clinic with results in >5 days.
Results: Test result turnaround time had the highest relative importance (30.4%), followed by test type (28.3%), specimen type (26.2%), and venue (15.0%). In simulations, immediate or same-day test results, both PCR and serology, or oral specimens substantially increased testing uptake over the current typical testing option. Simulated uptake of a hypothetical testing scenario of PCR and serology via a saliva sample at a pharmacy with same-day results was 97.7%, compared to 0.6% for the current typical testing scenario, with 1.8% opting for no test.
Conclusions: Testing strategies that offer both PCR and serology with noninvasive methods and rapid turnaround time would likely have the most uptake and engagement among residents in communities with increasing community transmission of SARS-CoV-2.
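The abstract above reports relative importance percentages derived from part-worth utilities but does not say how they were computed. One common convention in conjoint analysis takes, for each attribute, the range of its part-worth utilities and divides it by the sum of ranges over all attributes. The sketch below illustrates only that convention; the function name, attribute names, and utility values are hypothetical and are not taken from the study.

```python
def relative_importance(part_worths):
    """Convert part-worth utilities into relative importance percentages.

    part_worths: {attribute: {level: utility}} (hypothetical structure).
    Returns {attribute: percent}, where each percent is the attribute's
    utility range divided by the sum of ranges across all attributes.
    """
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in part_worths.items()
    }
    total = sum(ranges.values())
    if total == 0:
        raise ValueError("all attributes have zero utility range")
    return {attr: 100.0 * r / total for attr, r in ranges.items()}


# Hypothetical part-worths chosen so the arithmetic is easy to follow.
example = {
    "turnaround_time": {"same_day": 1.0, "over_5_days": -1.0},   # range 2.0
    "test_type": {"pcr_and_serology": 0.75, "pcr_only": -0.75},  # range 1.5
    "venue": {"pharmacy": 0.25, "clinic": -0.25},                # range 0.5
}
importance = relative_importance(example)
```

With these made-up utilities the ranges sum to 4.0, so the attribute with the widest spread between its best and worst level receives the largest share of importance.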
Affiliation(s)
- Rebecca Zimba
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- Sarah Kulkarni
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- Amanda Berry
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- William You
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- Chloe Mirzayi
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- Drew Westmoreland
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- Angela Parcesepe
  - The Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  - Department of Maternal and Child Health, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Levi Waldron
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
  - Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY, United States
- Madhura Rane
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- Shivani Kochhar
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- McKaylee Robertson
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
- Andrew Maroko
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
  - Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York, New York, NY, United States
- Christian Grov
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
  - Department of Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York, New York, NY, United States
- Denis Nash
  - Institute for Implementation Science in Population Health, City University of New York, New York, NY, United States
  - Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY, United States
27
Zhang Y, Bernau C, Parmigiani G, Waldron L. The impact of different sources of heterogeneity on loss of accuracy from genomic prediction models. Biostatistics 2020; 21:253-268. [PMID: 30202918 DOI: 10.1093/biostatistics/kxy044] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/08/2018] [Revised: 07/22/2018] [Accepted: 08/04/2018] [Indexed: 11/13/2022]
Abstract
Cross-study validation (CSV) of prediction models is an alternative to traditional cross-validation (CV) in domains where multiple comparable datasets are available. Although many studies have noted potential sources of heterogeneity in genomic studies, to our knowledge none have systematically investigated their intertwined impacts on prediction accuracy across studies. We employ a hybrid parametric/non-parametric bootstrap method to realistically simulate publicly available compendia of microarray, RNA-seq, and whole metagenome shotgun microbiome studies of health outcomes. Three types of heterogeneity between studies are manipulated and studied: (i) imbalances in the prevalence of clinical and pathological covariates, (ii) differences in gene covariance that could be caused by batch, platform, or tumor purity effects, and (iii) differences in the "true" model that associates gene expression and clinical factors to outcome. We assess model accuracy while altering these factors. Lower accuracy is seen in CSV than in CV. Surprisingly, heterogeneity in known clinical covariates and differences in gene covariance structure make very limited contributions to the loss of accuracy when validating in new studies. However, forcing identical generative models greatly reduces the within/across study difference. These results, observed consistently for multiple disease outcomes and omics platforms, suggest that the most easily identifiable sources of study heterogeneity are not necessarily the primary ones that undermine the ability to replicate the accuracy of omics prediction models in new studies. Unidentified heterogeneity, such as that arising from unmeasured confounding, may be more important.
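To make the idea of cross-study validation concrete, here is a minimal sketch in which a model is trained on each study in turn and scored on every other study. It assumes a toy nearest-centroid classifier and three made-up studies (study "C" is deliberately shifted to mimic between-study heterogeneity); none of the function names, data, or the classifier come from the paper, which used simulated genomic compendia and its own models.

```python
from statistics import mean


def fit_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*xs)] for y, xs in by_label.items()}


def predict(centroids, x):
    """Assign x to the label whose centroid is closest (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def accuracy(centroids, samples):
    return mean(1.0 if predict(centroids, x) == y else 0.0 for x, y in samples)


def cross_study_validation(studies):
    """Train on each study, test on every other study.

    Returns {(train_name, test_name): accuracy}.
    """
    results = {}
    for train_name, train in studies.items():
        model = fit_centroids(train)
        for test_name, test in studies.items():
            if test_name != train_name:
                results[(train_name, test_name)] = accuracy(model, test)
    return results


# Toy data: studies A and B are comparable; study C is shifted.
studies = {
    "A": [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([2.0, 2.1], 1), ([2.2, 1.9], 1)],
    "B": [([0.1, 0.0], 0), ([0.0, 0.3], 0), ([1.9, 2.0], 1), ([2.1, 2.2], 1)],
    "C": [([1.6, 1.7], 0), ([1.8, 1.5], 0), ([3.6, 3.5], 1), ([3.4, 3.7], 1)],
}
results = cross_study_validation(studies)
```

In this toy setting, models transfer well between the two comparable studies and poorly to or from the shifted one, which is the kind of train-study by test-study pattern a cross-study validation matrix is meant to expose.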
Affiliation(s)
- Yuqing Zhang
  - Graduate Program in Bioinformatics, Boston University, 24 Cummington Mall, Boston, MA, USA
- Christoph Bernau
  - Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, Munich, Germany
- Giovanni Parmigiani
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 3 Blackfan Cir, Boston, MA, USA
  - Department of Biostatistics, Harvard TH Chan School of Public Health, 677 Huntington Ave, Boston, MA, USA
- Levi Waldron
  - Graduate School of Public Health and Health Policy, Institute for Implementation Science in Population Health, City University of New York, 55 W 125th St, New York, NY, USA
28
Romo ML, Zimba R, Kulkarni S, Berry A, You W, Mirzayi C, Westmoreland D, Parcesepe AM, Waldron L, Rane M, Kochhar S, Robertson M, Maroko AR, Grov C, Nash D. Patterns of SARS-CoV-2 testing preferences in a national cohort in the United States. medRxiv 2020:2020.12.22.20248747. [PMID: 33398293 PMCID: PMC7781336 DOI: 10.1101/2020.12.22.20248747] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/05/2022]
Abstract
To understand preferences about SARS-CoV-2 testing, we conducted a discrete choice experiment among 4793 participants in the Communities, Households, and SARS-CoV-2 Epidemiology (CHASING COVID) Cohort Study from July 30 to September 8, 2020. We used latent class analysis to identify distinct patterns of preferences related to testing and conducted a simulation to predict testing uptake if additional testing scenarios were offered. Five distinct patterns of SARS-CoV-2 testing emerged. "Comprehensive testers" (18.9%) ranked specimen type as most important and favored less invasive specimen types, with saliva most preferred, and also ranked venue and result turnaround time as highly important, with preferences for home testing and fast result turnaround time. "Fast track testers" (26.0%) ranked result turnaround time as most important and favored immediate and same-day turnaround time. "Dual testers" (18.5%) ranked test type as most important and preferred both antibody and viral tests. "Non-invasive dual testers" (33.0%) ranked specimen type and test type as similarly most important, preferring cheek swab specimen type and both antibody and viral tests. "Home testers" (3.6%) ranked venue as most important and favored home-based testing. By offering less invasive (saliva specimen type), dual testing (both viral and antibody tests), and at-home testing scenarios in addition to standard testing scenarios, simulation models predicted that testing uptake would increase from 81.7% to 98.1%. We identified substantial differences in preferences for SARS-CoV-2 testing and found that offering additional testing options, which consider this heterogeneity, would likely increase testing uptake.
SIGNIFICANCE: During the COVID-19 pandemic, diagnostic testing has allowed for early detection of cases and implementation of measures to reduce community transmission of SARS-CoV-2 infection. Understanding individuals' preferences about testing and the service models that deliver tests is relevant to efforts to increase and sustain uptake of SARS-CoV-2 testing, which, despite vaccine availability, will be required for the foreseeable future. We identified substantial differences in preferences for SARS-CoV-2 testing in a discrete choice experiment among a large national cohort of adults in the US. Offering additional testing options that account for or anticipate this heterogeneity in preferences (e.g., both viral and antibody tests, at-home testing) would likely increase testing uptake.
CLASSIFICATION: Biological Sciences (major); Psychological and Cognitive Sciences (minor).
Affiliation(s)
- Matthew L. Romo
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
  - Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York, NY, 10027 USA
- Rebecca Zimba
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- Sarah Kulkarni
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- Amanda Berry
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- William You
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- Chloe Mirzayi
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- Drew Westmoreland
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- Angela M. Parcesepe
  - Department of Maternal and Child Health, Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, 27599 USA
  - Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516 USA
- Levi Waldron
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
  - Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York, NY, 10027 USA
- Madhura Rane
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- Shivani Kochhar
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- McKaylee Robertson
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
- Andrew R. Maroko
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
  - Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York, NY, 10027 USA
- Christian Grov
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
  - Department of Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York, NY, 10027 USA
- Denis Nash
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York, NY, 10027 USA
  - Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York, NY, 10027 USA
29
Oh S, Abdelnabi J, Al-Dulaimi R, Aggarwal A, Ramos M, Davis S, Riester M, Waldron L. HGNChelper: identification and correction of invalid gene symbols for human and mouse. F1000Res 2020; 9:1493. [PMID: 33564398 DOI: 10.12688/f1000research.28033.1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 12/10/2020] [Indexed: 11/20/2022]
Abstract
Gene symbols are recognizable identifiers for gene names but are unstable and error-prone due to aliasing, manual entry, and unintentional conversion by spreadsheets to date format. Official gene symbol resources such as HUGO Gene Nomenclature Committee (HGNC) for human genes and the Mouse Genome Informatics project (MGI) for mouse genes provide authoritative sources of valid, aliased, and outdated symbols, but lack a programmatic interface and correction of symbols converted by spreadsheets. We present HGNChelper, an R package that identifies known aliases and outdated gene symbols based on the HGNC human and MGI mouse gene symbol databases, in addition to common mislabeling introduced by spreadsheets, and provides corrections where possible. HGNChelper identified invalid gene symbols in the most recent Molecular Signatures Database (MSigDB 7.0) and in platform annotation files of the Gene Expression Omnibus, with prevalence ranging from ~3% in recent platforms to 30-40% in the earliest platforms from 2002-03. HGNChelper is installable from CRAN.
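As a rough illustration of the spreadsheet problem described above, the following sketch flags strings that look like dates (for example "1-Mar" or "Sep-02") in a list of gene symbols. It is not HGNChelper's implementation and does not use its API: the function name and regular expression are hypothetical, the check only detects date-like strings, and it makes no attempt to map them back to valid symbols, which requires the HGNC or MGI tables.

```python
import re

# Three-letter English month abbreviations, as spreadsheets often render them.
_MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Matches "1-Mar" style (day-month) or "Sep-02" style (month-number) strings.
_DATE_LIKE = re.compile(
    rf"^(?:\d{{1,2}}-(?:{_MONTHS})|(?:{_MONTHS})-\d{{1,2}})$",
    re.IGNORECASE,
)


def flag_date_like(symbols):
    """Return the entries that look like spreadsheet-mangled dates."""
    return [s for s in symbols if _DATE_LIKE.match(s.strip())]


flagged = flag_date_like(["TP53", "1-Mar", "Sep-02", "BRCA1", "2-SEP"])
```

A check like this can only say that an entry is suspicious; choosing the correct replacement symbol is the part that needs an authoritative nomenclature resource.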
Affiliation(s)
- Sehyun Oh
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
- Jasmine Abdelnabi
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
- Ragheed Al-Dulaimi
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
  - School of Medicine, University of Utah, Utah, 84132, USA
- Ayush Aggarwal
  - CSIR-Institute of Genomics and Integrative Biology, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research, Ghaziabad, Uttar Pradesh, 201 002, India
- Marcel Ramos
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
- Sean Davis
  - Center for Cancer Research, National Cancer Institute, Maryland, 20892, USA
- Markus Riester
  - Novartis Institutes for BioMedical Research Incorporation, Massachusetts, 02139, USA
- Levi Waldron
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
30
Oh S, Abdelnabi J, Al-Dulaimi R, Aggarwal A, Ramos M, Davis S, Riester M, Waldron L. HGNChelper: identification and correction of invalid gene symbols for human and mouse. F1000Res 2020; 9:1493. [PMID: 33564398 PMCID: PMC7856679 DOI: 10.12688/f1000research.28033.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 04/28/2022] [Indexed: 11/20/2022]
Abstract
Gene symbols are recognizable identifiers for gene names but are unstable and error-prone due to aliasing, manual entry, and unintentional conversion by spreadsheets to date format. Official gene symbol resources such as HUGO Gene Nomenclature Committee (HGNC) for human genes and the Mouse Genome Informatics project (MGI) for mouse genes provide authoritative sources of valid, aliased, and outdated symbols, but lack a programmatic interface and correction of symbols converted by spreadsheets. We present HGNChelper, an R package that identifies known aliases and outdated gene symbols based on the HGNC human and MGI mouse gene symbol databases, in addition to common mislabeling introduced by spreadsheets, and provides corrections where possible. HGNChelper identified invalid gene symbols in the most recent Molecular Signatures Database (MSigDB 7.0) and in platform annotation files of the Gene Expression Omnibus, with prevalence ranging from ~3% in recent platforms to 30-40% in the earliest platforms from 2002-03. HGNChelper is installable from CRAN.
Affiliation(s)
- Sehyun Oh
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
- Jasmine Abdelnabi
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
- Ragheed Al-Dulaimi
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
  - School of Medicine, University of Utah, Utah, 84132, USA
- Ayush Aggarwal
  - CSIR-Institute of Genomics and Integrative Biology, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research, Ghaziabad, Uttar Pradesh, 201 002, India
- Marcel Ramos
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
- Sean Davis
  - Center for Cancer Research, National Cancer Institute, Maryland, 20892, USA
- Markus Riester
  - Novartis Institutes for BioMedical Research Incorporation, Massachusetts, 02139, USA
- Levi Waldron
  - Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA
  - Institute for Implementation Science and Population Health, New York, 10027, USA
31
Nash D, Qasmieh S, Robertson M, Rane M, Zimba R, Kulkarni S, Berry A, You W, Mirzayi C, Westmoreland D, Parcesepe A, Waldron L, Kochhar S, Maroko AR, Grov C. Household factors and the risk of severe COVID-like illness early in the US pandemic. medRxiv 2020:2020.12.03.20243683. [PMID: 33300008 PMCID: PMC7724676 DOI: 10.1101/2020.12.03.20243683] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/02/2023]
Abstract
OBJECTIVE: To investigate the role of children in the home and household crowding as risk factors for severe COVID-19 disease.
METHODS: We used interview data from 6,831 U.S. adults screened for the Communities, Households and SARS-CoV-2 Epidemiology (CHASING) COVID Cohort Study in April 2020.
RESULTS: In logistic regression models, the adjusted odds ratio (aOR) of hospitalization due to COVID-19 for having (versus not having) children in the home was 10.5 (95% CI: 5.7-19.1) among study participants living in multi-unit dwellings and 2.2 (95% CI: 1.2-6.5) among those living in single-unit dwellings. Among participants living in multi-unit dwellings, the aOR for COVID-19 hospitalization among participants with more than 4 persons in their household (versus 1 person) was 2.5 (95% CI: 1.0-6.1), and 0.8 (95% CI: 0.15-4.1) among those living in single-unit dwellings.
CONCLUSION: Early in the US SARS-CoV-2 pandemic, certain household exposures likely increased the risk of both SARS-CoV-2 acquisition and the risk of severe COVID-19 disease.
Affiliation(s)
- Denis Nash
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
  - Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
- Saba Qasmieh
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- McKaylee Robertson
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Madhura Rane
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Rebecca Zimba
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Sarah Kulkarni
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Amanda Berry
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- William You
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Chloe Mirzayi
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Drew Westmoreland
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Angela Parcesepe
  - Department of Maternal and Child Health, Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, USA
  - Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Levi Waldron
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
  - Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
- Shivani Kochhar
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Andrew R Maroko
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
  - Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
- Christian Grov
  - Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
  - Department of Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
32
Renson A, Kasselman LJ, Dowd JB, Waldron L, Jones HE, Herd P. Gut bacterial taxonomic abundances vary with cognition, personality, and mood in the Wisconsin Longitudinal Study. Brain Behav Immun Health 2020; 9:100155. [PMID: 34589897 PMCID: PMC8474555 DOI: 10.1016/j.bbih.2020.100155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2020] [Accepted: 10/06/2020] [Indexed: 10/30/2022]
Abstract
Animal studies have shown that the gut microbiome can influence memory, social behavior, and anxiety-like behavior. Several human studies show similar results, in which variation in the gut microbiome is associated with dementia, depression, and personality traits, though most of these studies are limited by small sample size and other biases. Here, we analyzed fecal samples from 313 participants in the Wisconsin Longitudinal Study, a randomly selected population-based cohort of older adults, with measured psycho-cognitive dimensions (cognition, mood, and personality) and key confounders. 16S V4 sequencing showed that Megamonas is associated with all measured psycho-cognitive traits, Fusobacterium is associated with cognitive and personality traits, Pseudoramibacter_Eubacterium is associated with mood and personality traits, Butyrivibrio is associated with cognitive traits, and Cloacibacillus is associated with mood traits. These findings are robust to sensitivity analyses and provide novel evidence of shared relationships between the gut microbiome and multiple psycho-cognitive traits in older adults, confirming some of the animal literature while also providing new insights. While we addressed some of the weaknesses in prior studies, further studies are necessary to elucidate temporal and causal relationships between the gut microbiome and multiple psycho-cognitive traits in well-phenotyped, randomly selected population-based samples.
Affiliation(s)
- Audrey Renson
  - Department of Epidemiology and Biostatistics, CUNY School of Public Health, New York, NY, USA
- Lora J. Kasselman
  - Department of Epidemiology and Biostatistics, CUNY School of Public Health, New York, NY, USA
  - NYU Long Island School of Medicine, Mineola, NY, USA
- Jennifer B. Dowd
  - Leverhulme Centre for Demographic Science, University of Oxford, Oxford, UK
- Levi Waldron
  - Department of Epidemiology and Biostatistics, CUNY School of Public Health, New York, NY, USA
- Heidi E. Jones
  - Department of Epidemiology and Biostatistics, CUNY School of Public Health, New York, NY, USA
- Pamela Herd
  - McCourt School of Public Policy, Georgetown University, Washington, DC, 20057, USA
33
Moon JI, Zhang H, Waldron L, Iyer KR. "Stoma or no stoma": First report of intestinal transplantation without stoma. Am J Transplant 2020; 20:3550-3557. [PMID: 32431016 DOI: 10.1111/ajt.16065] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/14/2020] [Revised: 04/05/2020] [Accepted: 05/04/2020] [Indexed: 01/25/2023]
Abstract
Recent data suggest that frequent endoscopy and biopsy without evidence of graft dysfunction do not appear to confer a survival advantage after intestinal transplantation. After we abandoned protocol surveillance, the number of endoscopic examinations decreased significantly at our center. These observations led us to question the need for stoma creation in intestinal transplantation. Herein, we report clinical outcomes of intestinal transplantation without stoma, compared to conventional transplant with stoma. Data analysis was limited to adult intestinal transplantation without liver allograft between 2015 and 2018. We compared patient and graft survival, frequency of endoscopic evaluation, episodes of acute rejection, nutritional therapy, and renal function between "Control group (with stoma)," n = 18 grafts in 16 patients and "Study group (without stoma)," n = 16 grafts in 15 patients. Overall outcome was similar between the 2 groups with respect to graft and patient survival, episodes of acute rejection, and its response to treatment. Nutritional outcomes were similar in both groups. Fewer antidiarrheal medications were required in the study group, but this did not translate into demonstrable gains in preservation of renal function, despite an apparent trend to improvement. Intestinal transplantation without stoma appears to be an acceptable practice model without obvious adverse impact on outcome.
Affiliation(s)
- Jang I Moon
- Department of Surgery, Icahn School of Medicine at Mount Sinai, Recanati Miller Transplantation Institute, New York, New York, USA
| | - Hongbin Zhang
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA
| | - Kishore R Iyer
- Department of Surgery, Icahn School of Medicine at Mount Sinai, Recanati Miller Transplantation Institute, New York, New York, USA
| |
|
34
|
Parcesepe AM, Robertson M, Berry A, Maroko A, Zimba R, Grov C, Westmoreland D, Kulkarni S, Rane M, Salgado-You W, Mirzayi C, Waldron L, Nash D. The relationship between anxiety, health, and potential stressors among adults in the United States during the COVID-19 pandemic. medRxiv 2020. [PMID: 33173880 DOI: 10.1101/2020.10.30.20221440] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]
Abstract
Objective: To estimate the prevalence of anxiety symptoms and the association between moderate or severe anxiety symptoms and health and potential stressors among adults in the U.S. during the COVID-19 pandemic. Methods: This analysis includes data from 5,250 adults in the Communities, Households and SARS/CoV-2 Epidemiology (CHASING) COVID Cohort Study surveyed in April 2020. Poisson models were used to estimate the association between moderate or severe anxiety symptoms and health and potential stressors among U.S. adults during the COVID-19 pandemic. Results: More than one-third (35%) of participants reported moderate or severe anxiety symptoms. Having lost income due to COVID-19 (adjusted prevalence ratio [aPR] 1.27; 95% CI 1.16, 1.30), having recent COVID-like symptoms (aPR 1.17; 95% CI 1.05, 1.31), and having been previously diagnosed with depression (aPR 1.49; 95% CI 1.35, 1.64) were positively associated with anxiety symptoms. Conclusions: Anxiety symptoms were common among adults in the U.S. during the COVID-19 pandemic. Strategies to screen and treat individuals at increased risk of anxiety, such as individuals experiencing financial hardship and individuals with prior diagnoses of depression, should be developed and implemented.
|
35
|
Haibe-Kains B, Adam GA, Hosny A, Khodakarami F, Waldron L, Wang B, McIntosh C, Goldenberg A, Kundaje A, Greene CS, Broderick T, Hoffman MM, Leek JT, Korthauer K, Huber W, Brazma A, Pineau J, Tibshirani R, Hastie T, Ioannidis JPA, Quackenbush J, Aerts HJWL. Transparency and reproducibility in artificial intelligence. Nature 2020; 586:E14-E16. [PMID: 33057217 PMCID: PMC8144864 DOI: 10.1038/s41586-020-2766-y] [Citation(s) in RCA: 140] [Impact Index Per Article: 35.0] [Received: 02/01/2020] [Accepted: 08/10/2020] [Indexed: 01/15/2023]
Abstract
Breakthroughs in artificial intelligence (AI) hold enormous potential, as AI can automate complex tasks and even surpass human performance. In their study, McKinney et al. showed the high potential of AI for breast cancer screening. However, the lack of detail about the methods and the absence of algorithm code undermine the study's scientific value. Here, we identify obstacles hindering transparent and reproducible AI research as faced by McKinney et al., and provide solutions to these obstacles with implications for the broader field.
Affiliation(s)
- Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada.
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada.
| | - George Alexandru Adam
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
| | - Ahmed Hosny
- Artificial Intelligence in Medicine (AIM) Program, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Radiation Oncology and Radiology, Dana-Farber Cancer Institute, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Farnoosh Khodakarami
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Levi Waldron
- Department of Epidemiology and Biostatistics and Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health and Health Policy, New York, NY, USA
| | - Bo Wang
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Ontario, Canada
| | - Chris McIntosh
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada
| | - Anna Goldenberg
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- SickKids Research Institute, Toronto, Ontario, Canada
- Child and Brain Development Program, CIFAR, Toronto, Ontario, Canada
| | - Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Casey S Greene
- Dept. of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, PA, USA
| | - Tamara Broderick
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
| | - Jeffrey T Leek
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Keegan Korthauer
- Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
- BC Children's Hospital Research Institute, Vancouver, British Columbia, Canada
| | - Wolfgang Huber
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Joelle Pineau
- McGill University, Montreal, Quebec, Canada
- Montreal Institute for Learning Algorithms, Quebec, Canada
| | - Robert Tibshirani
- Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
| | - Trevor Hastie
- Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
| | - John P A Ioannidis
- Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Meta-Research Innovation Center at Stanford (METRICS), Stanford, CA, USA
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
| | - John Quackenbush
- Department of Biostatistics, Harvard T.H Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Hugo J W L Aerts
- Artificial Intelligence in Medicine (AIM) Program, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Radiation Oncology and Radiology, Dana-Farber Cancer Institute, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Radiology and Nuclear Medicine, Maastricht University, Maastricht, The Netherlands
- Cardiovascular Imaging Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| |
|
36
|
Zimba R, Kulkarni S, Berry A, You W, Mirzayi C, Westmoreland D, Parcesepe A, Waldron L, Rane M, Kochhar S, Robertson M, Maroko AR, Grov C, Nash D. Testing, Testing: What SARS-CoV-2 testing services do adults in the United States actually want? medRxiv 2020:2020.09.15.20195180. [PMID: 32995800 PMCID: PMC7523137 DOI: 10.1101/2020.09.15.20195180] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Importance: Ascertaining preferences for SARS-CoV-2 testing and incorporating findings into the design and implementation of strategies for delivering testing services may enhance testing uptake and engagement, a prerequisite to reducing onward transmission. Objective: To determine important drivers of decisions to obtain a SARS-CoV-2 test in the context of increasing community transmission. Design: A discrete choice experiment (DCE) was used to assess the relative importance of type of SARS-CoV-2 test, specimen type, testing venue, and results turnaround time. Uptake of an optimized testing scenario was simulated relative to the current typical testing scenario of polymerase chain reaction (PCR) via nasopharyngeal (NP) swab in a provider office or urgent care clinic with results in >5 days. Setting: Online survey, embedded in an existing cohort study, conducted from July 30 to September 8, 2020. Participants: Participants (n=4,793) were enrolled in the CHASING COVID Cohort Study, a national longitudinal cohort of adults >18 years residing in the 50 US states, Washington, DC, Puerto Rico, or Guam. Main Outcome(s) and Measure(s): Relative importance of SARS-CoV-2 testing method attributes, utilities of specific attribute levels, and probability of choosing a testing scenario based on preferences estimated from the DCE, the current typical testing option, or choosing not to test. Results: Turnaround time for test results had the highest relative importance (30.4%), followed by test type (28.3%), specimen type (26.2%), and venue (15.0%). Participants preferred fast results on both past and current infection and using a noninvasive specimen, preferably collected at home. Simulations suggested that providing immediate or same day test results, providing both PCR and serology, or collecting oral specimens would substantially increase testing uptake over the current typical testing option. Simulated uptake of a hypothetical testing scenario of PCR and serology via a saliva sample at a pharmacy with same day results was 97.7%, compared to 0.6% for the current typical testing scenario, with 1.8% opting for no test. Conclusions and Relevance: Testing strategies that offer both PCR and serology with non-invasive methods and rapid turnaround time would likely have the most uptake and engagement among residents in communities with increasing community transmission of SARS-CoV-2.
Affiliation(s)
- Rebecca Zimba
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Sarah Kulkarni
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Amanda Berry
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - William You
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Chloe Mirzayi
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Drew Westmoreland
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Angela Parcesepe
- Department of Maternal and Child Health, Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, USA
- Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Levi Waldron
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Madhura Rane
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Shivani Kochhar
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - McKaylee Robertson
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
| | - Andrew R Maroko
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Environmental, Occupational, and Geospatial Health Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Christian Grov
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Community Health and Social Sciences, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| | - Denis Nash
- Institute for Implementation Science in Population Health (ISPH), City University of New York (CUNY); New York City, New York USA
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY); New York City, New York USA
| |
|
37
|
Geistlinger L, Oh S, Ramos M, Schiffer L, LaRue RS, Henzler CM, Munro SA, Daughters C, Nelson AC, Winterhoff BJ, Chang Z, Talukdar S, Shetty M, Mullany SA, Morgan M, Parmigiani G, Birrer MJ, Qin LX, Riester M, Starr TK, Waldron L. Multiomic Analysis of Subtype Evolution and Heterogeneity in High-Grade Serous Ovarian Carcinoma. Cancer Res 2020; 80:4335-4345. [PMID: 32747365 DOI: 10.1158/0008-5472.can-20-0521] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 02/13/2020] [Revised: 06/13/2020] [Accepted: 07/29/2020] [Indexed: 12/15/2022]
Abstract
Multiple studies have identified transcriptome subtypes of high-grade serous ovarian carcinoma (HGSOC), but their interpretation and translation are complicated by tumor evolution and polyclonality accompanied by extensive accumulation of somatic aberrations, varying cell type admixtures, and different tissues of origin. In this study, we examined the chronology of HGSOC subtype evolution in the context of these factors using a novel integrative analysis of absolute copy-number analysis and gene expression in The Cancer Genome Atlas complemented by single-cell analysis of six independent tumors. Tumor purity, ploidy, and subclonality were reliably inferred from different genomic platforms, and these characteristics displayed marked differences between subtypes. Genomic lesions associated with HGSOC subtypes tended to be subclonal, implying subtype divergence at later stages of tumor evolution. Subclonality of recurrent HGSOC alterations was evident for proliferative tumors, characterized by extreme genomic instability, absence of immune infiltration, and greater patient age. In contrast, differentiated tumors were characterized by largely intact genome integrity, high immune infiltration, and younger patient age. Single-cell sequencing of 42,000 tumor cells revealed widespread heterogeneity in tumor cell type composition that drove bulk subtypes but demonstrated a lack of intrinsic subtypes among tumor epithelial cells. Our findings prompt the dismissal of discrete transcriptome subtypes for HGSOC and replacement by a more realistic model of continuous tumor development that includes mixtures of subclones, accumulation of somatic aberrations, infiltration of immune and stromal cells in proportions correlated with tumor stage and tissue of origin, and evolution between properties previously associated with discrete subtypes. 
SIGNIFICANCE: This study infers whether transcriptome-based groupings of tumors differentiate early in carcinogenesis and are, therefore, appropriate targets for therapy and demonstrates that this is not the case for HGSOC.
Affiliation(s)
- Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science and Population Health, City University of New York, New York, New York
| | - Sehyun Oh
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science and Population Health, City University of New York, New York, New York
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science and Population Health, City University of New York, New York, New York
- Roswell Park Comprehensive Cancer Institute, Buffalo, New York
| | - Lucas Schiffer
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science and Population Health, City University of New York, New York, New York
| | - Rebecca S LaRue
- Minnesota Supercomputing Institute, Minneapolis, Minnesota
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota
| | - Christine M Henzler
- Minnesota Supercomputing Institute, Minneapolis, Minnesota
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota
| | - Sarah A Munro
- Minnesota Supercomputing Institute, Minneapolis, Minnesota
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota
| | - Claire Daughters
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota
| | - Andrew C Nelson
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota
- University of Minnesota Masonic Cancer Center, Minneapolis, Minnesota
| | - Boris J Winterhoff
- University of Minnesota Masonic Cancer Center, Minneapolis, Minnesota
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, Minnesota
| | - Zenas Chang
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, Minnesota
| | - Shobhana Talukdar
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, Minnesota
| | - Mihir Shetty
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, Minnesota
| | - Sally A Mullany
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, Minnesota
| | - Martin Morgan
- Roswell Park Comprehensive Cancer Institute, Buffalo, New York
| | - Giovanni Parmigiani
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Michael J Birrer
- The Winthrop P Rockefeller Cancer Institute, University of Arkansas Medical Sciences, Little Rock, Arkansas
| | - Li-Xuan Qin
- Memorial Sloan Kettering Cancer Center, New York, New York
| | - Markus Riester
- Novartis Institutes for BioMedical Research, Cambridge, Massachusetts
| | - Timothy K Starr
- University of Minnesota Masonic Cancer Center, Minneapolis, Minnesota
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, Minnesota
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York.
- Institute for Implementation Science and Population Health, City University of New York, New York, New York
| |
|
38
|
Calgaro M, Romualdi C, Waldron L, Risso D, Vitulo N. Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data. Genome Biol 2020; 21:191. [PMID: 32746888 PMCID: PMC7398076 DOI: 10.1186/s13059-020-02104-1] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 02/04/2020] [Accepted: 07/14/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: The correct identification of differentially abundant microbial taxa between experimental conditions is a methodological and computational challenge. Recent work has produced methods to deal with the high sparsity and compositionality characteristic of microbiome data, but independent benchmarks comparing these to alternatives developed for RNA-seq data analysis are lacking. RESULTS: We compare methods developed for single-cell and bulk RNA-seq, and specifically for microbiome data, in terms of suitability of distributional assumptions, ability to control false discoveries, concordance, power, and correct identification of differentially abundant genera. We benchmark these methods using 100 manually curated datasets from 16S and whole metagenome shotgun sequencing. CONCLUSIONS: The multivariate and compositional methods developed specifically for microbiome analysis did not outperform univariate methods developed for differential expression analysis of RNA-seq data. We recommend a careful exploratory data analysis prior to application of any inferential model and we present a framework to help scientists make an informed choice of analysis methods in a dataset-specific manner.
Affiliation(s)
- Matteo Calgaro
- Department of Biotechnology, University of Verona, Verona, Italy
| | | | - Levi Waldron
- Graduate School of Public Health and Health Policy and Institute for Implementation Science in Public Health, City University of New York, New York, NY, USA
| | - Davide Risso
- Department of Statistical Sciences, University of Padova, Padova, Italy.
| | - Nicola Vitulo
- Department of Biotechnology, University of Verona, Verona, Italy.
| |
|
39
|
Hartley S, Colas des Francs C, Aussert F, Martinot C, Dagneaux S, Londe V, Waldron L, Royant-Parola S. [The effects of quarantine for SARS-CoV-2 on sleep: An online survey]. Encephale 2020; 46:S53-S59. [PMID: 32475692 PMCID: PMC7211567 DOI: 10.1016/j.encep.2020.05.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/30/2020] [Accepted: 05/07/2020] [Indexed: 12/17/2022]
Abstract
Objective: To determine how sleep changed among the French population during the lockdown prompted by the SARS-CoV-2 pandemic and to identify the behavioral factors associated with worsened sleep. Methods: An online survey was distributed via social networks during the lockdown period. The questions addressed lockdown conditions, sleep-related behaviors, and environmental factors that could disturb sleep (light exposure and physical activity). Results: A total of 1777 participants were included, of whom 77% were women and 72% were aged 25–54 years. The most common lockdown conditions were living as a couple with children (36%) and living in a house with a garden (51%). Forty-seven percent reported a decrease in sleep quality during lockdown. The factors associated with worsened sleep that were retained in the multivariate analysis were reduced sleep duration (OR 15.52, p < 0.001), a later bedtime (OR 1.72, p < 0.001), an earlier rising time (OR 2.18, p = 0.01), more irregular sleep schedules (OR 2.29, p < 0.001), reduced exposure to daylight (OR 1.46, p = 0.01), and increased evening screen use (OR 1.33, p = 0.04). Conclusion: Poor subjective sleep quality during lockdown is associated with changes in sleep-related behaviors and in light exposure (less daylight and more evening screen time). To optimize sleep during lockdown, suitable and regular schedules, more than one hour per day of daylight exposure, and avoidance of screens in the evening are advisable.
Affiliation(s)
- S Hartley
- Réseau Morphée, 2, Grande rue, 92380 Garches, France; Unité du sommeil, EA 4047, université de Versailles Saint-Quentin en Yvelines, hôpital Raymond-Poincaré, AP-HP, 92380 Garches, France.
| | | | - F Aussert
- Réseau Morphée, 2, Grande rue, 92380 Garches, France; Centre des explorations multifonctionnelles, hôpital Antoine-Béclère, AP-HP, Clamart, France
| | - C Martinot
- Réseau Morphée, 2, Grande rue, 92380 Garches, France
| | - S Dagneaux
- Réseau Morphée, 2, Grande rue, 92380 Garches, France
| | - V Londe
- Réseau Morphée, 2, Grande rue, 92380 Garches, France
| | - L Waldron
- Réseau Morphée, 2, Grande rue, 92380 Garches, France
| | | |
|
40
|
Geistlinger L, Csaba G, Santarelli M, Ramos M, Schiffer L, Turaga N, Law C, Davis S, Carey V, Morgan M, Zimmer R, Waldron L. Toward a gold standard for benchmarking gene set enrichment analysis. Brief Bioinform 2020; 22:545-556. [PMID: 32026945 PMCID: PMC7820859 DOI: 10.1093/bib/bbz158] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 08/12/2019] [Revised: 10/11/2019] [Accepted: 11/09/2019] [Indexed: 12/22/2022] Open
Abstract
MOTIVATION: Although gene set enrichment analysis has become an integral part of high-throughput gene expression data analysis, the assessment of enrichment methods remains rudimentary and ad hoc. In the absence of suitable gold standards, evaluations are commonly restricted to selected datasets and biological reasoning on the relevance of resulting enriched gene sets. RESULTS: We develop an extensible framework for reproducible benchmarking of enrichment methods based on defined criteria for applicability, gene set prioritization and detection of relevant processes. This framework incorporates a curated compendium of 75 expression datasets investigating 42 human diseases. The compendium features microarray and RNA-seq measurements, and each dataset is associated with a precompiled GO/KEGG relevance ranking for the corresponding disease under investigation. We perform a comprehensive assessment of 10 major enrichment methods, identifying significant differences in runtime and applicability to RNA-seq data, fraction of enriched gene sets depending on the null hypothesis tested and recovery of the predefined relevance rankings. We make practical recommendations on how methods originally developed for microarray data can efficiently be applied to RNA-seq data, how to interpret results depending on the type of gene set test conducted and which methods are best suited to effectively prioritize gene sets with high phenotype relevance. AVAILABILITY: http://bioconductor.org/packages/GSEABenchmarkeR. CONTACT: ludwig.geistlinger@sph.cuny.edu.
Affiliation(s)
- Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, USA
| | - Gergely Csaba
- Institute for Implementation Science and Population Health, City University of New York, New York, NY 10027, USA
| | - Mara Santarelli
- Institute for Bioinformatics, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
| | - Marcel Ramos
- Roswell Park Cancer Institute, Buffalo, NY 14203, USA
| | - Lucas Schiffer
- Graduate School of Arts and Sciences, Boston University, Boston, MA 02215, USA
| | - Nitesh Turaga
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
| | - Charity Law
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Sean Davis
- Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA
| | | | | | | | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, USA
| |
|
41
|
Amezquita RA, Lun ATL, Becht E, Carey VJ, Carpp LN, Geistlinger L, Marini F, Rue-Albrecht K, Risso D, Soneson C, Waldron L, Pagès H, Smith ML, Huber W, Morgan M, Gottardo R, Hicks SC. Orchestrating single-cell analysis with Bioconductor. Nat Methods 2020; 17:137-145. [PMID: 31792435 PMCID: PMC7358058 DOI: 10.1038/s41592-019-0654-x] [Citation(s) in RCA: 332] [Impact Index Per Article: 83.0] [Received: 03/26/2019] [Revised: 09/13/2019] [Accepted: 10/14/2019] [Indexed: 12/24/2022]
Abstract
Recent technological advancements have enabled the profiling of a large number of genome-wide features in individual cells. However, single-cell data present unique challenges that require the development of specialized methods and software infrastructure to successfully derive biological insights. The Bioconductor project has rapidly grown to meet these demands, hosting community-developed open-source software distributed as R packages. Featuring state-of-the-art computational methods, standardized data infrastructure and interactive data visualization tools, we present an overview and online book (https://osca.bioconductor.org) of single-cell methods for prospective users.
Affiliation(s)
| | - Aaron T L Lun
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Bioinformatics and Computational Biology, Genentech Inc., San Francisco, CA, USA
| | - Etienne Becht
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Vince J Carey
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA
- Institute for Implementation Science in Population Health, City University of New York, New York, NY, USA
| | - Federico Marini
- Center for Thrombosis and Hemostasis, Mainz, Germany
- Institute of Medical Biostatistics, Epidemiology and Informatics, Mainz, Germany
| | | | - Davide Risso
- Department of Statistical Sciences, University of Padua, Padua, Italy
- Division of Biostatistics and Epidemiology, Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY, USA
| | - Charlotte Soneson
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA
- Institute for Implementation Science in Population Health, City University of New York, New York, NY, USA
| | - Hervé Pagès
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Mike L Smith
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Wolfgang Huber
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Martin Morgan
- Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | | | - Stephanie C Hicks
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
|
42
|
Schwede M, Waldron L, Mok SC, Wei W, Basunia A, Merritt MA, Mitsiades CS, Parmigiani G, Harrington DP, Quackenbush J, Birrer MJ, Culhane AC. The Impact of Stroma Admixture on Molecular Subtypes and Prognostic Gene Signatures in Serous Ovarian Cancer. Cancer Epidemiol Biomarkers Prev 2019; 29:509-519. [PMID: 31871106 DOI: 10.1158/1055-9965.epi-18-1359] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 12/19/2018] [Revised: 04/26/2019] [Accepted: 12/06/2019] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND: Recent efforts to improve outcomes for high-grade serous ovarian cancer, a leading cause of cancer death in women, have focused on identifying molecular subtypes and prognostic gene signatures, but existing subtypes have poor cross-study robustness. We tested the contribution of cell admixture in published ovarian cancer molecular subtypes and prognostic gene signatures. METHODS: Gene signatures of tumor and stroma were developed using paired microdissected tissue from two independent studies. Stromal genes were investigated in two molecular subtype classifications and 61 published gene signatures. Prognostic performance of gene signatures of stromal admixture was evaluated in 2,527 ovarian tumors (16 studies). Computational simulations of increasing stromal cell proportion were performed by mixing gene-expression profiles of paired microdissected ovarian tumor and stroma. RESULTS: Recently described ovarian cancer molecular subtypes are strongly associated with the cell admixture. Tumors were classified as different molecular subtypes in simulations where the percentage of stromal cells increased. Stromal gene expression in bulk tumors was associated with overall survival (hazard ratio, 1.17; 95% confidence interval, 1.11-1.23), and in one data set, increased stroma was associated with anatomic sampling location. Five published prognostic gene signatures were no longer prognostic in a multivariate model that adjusted for stromal content. CONCLUSIONS: Cell admixture affects the interpretation and reproduction of ovarian cancer molecular subtypes and gene signatures derived from bulk tissue. Elucidating the role of stroma in the tumor microenvironment and in prognosis is important. IMPACT: Single-cell analyses may be required to refine the molecular subtypes of high-grade serous ovarian cancer.
Affiliation(s)
- Matthew Schwede
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Levi Waldron
- Biostatistics, CUNY Graduate School of Public Health and Health Policy, New York, New York
| | - Samuel C Mok
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Wei Wei
- Pfizer, Andover, Massachusetts
| | - Azfar Basunia
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | | | | | - Giovanni Parmigiani
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - David P Harrington
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Michael J Birrer
- Division of Hematology-Oncology, University of Alabama at Birmingham, Birmingham, Alabama.
| | - Aedín C Culhane
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
|
43
|
Amezquita RA, Lun ATL, Becht E, Carey VJ, Carpp LN, Geistlinger L, Marini F, Rue-Albrecht K, Risso D, Soneson C, Waldron L, Pagès H, Smith ML, Huber W, Morgan M, Gottardo R, Hicks SC. Publisher Correction: Orchestrating single-cell analysis with Bioconductor. Nat Methods 2019; 17:242. [DOI: 10.1038/s41592-019-0700-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]
|
44
|
da Silva V, Ramos M, Groenen M, Crooijmans R, Johansson A, Regitano L, Coutinho L, Zimmer R, Waldron L, Geistlinger L. CNVRanger: association analysis of CNVs with gene expression and quantitative phenotypes. Bioinformatics 2019; 36:972-973. [PMID: 31392308 PMCID: PMC9887538 DOI: 10.1093/bioinformatics/btz632] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/18/2019] [Revised: 07/17/2019] [Accepted: 08/06/2019] [Indexed: 02/02/2023] Open
Abstract
SUMMARY: Copy number variation (CNV) is a major type of structural genomic variation that is increasingly studied across different species for association with diseases and production traits. Established protocols for experimental detection and computational inference of CNVs from SNP array and next-generation sequencing data are available. We present the CNVRanger R/Bioconductor package which implements a comprehensive toolbox for structured downstream analysis of CNVs. This includes functionality for summarizing individual CNV calls across a population, assessing overlap with functional genomic regions, and genome-wide association analysis with gene expression and quantitative phenotypes. AVAILABILITY AND IMPLEMENTATION: http://bioconductor.org/packages/CNVRanger.
Affiliation(s)
- Vinicius da Silva
- Department of Animal Breeding and Genomics, Wageningen University and Research, 6708 PB Wageningen, The Netherlands; Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala 75007, Sweden
| | - Marcel Ramos
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, USA
| | - Martien Groenen
- Department of Animal Breeding and Genomics, Wageningen University and Research, 6708 PB Wageningen, The Netherlands
| | - Richard Crooijmans
- Department of Animal Breeding and Genomics, Wageningen University and Research, 6708 PB Wageningen, The Netherlands
| | - Anna Johansson
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala 75007, Sweden
| | | | - Luiz Coutinho
- Department of Animal Science, University of São Paulo, 13418-900 Piracicaba, Brazil
| | - Ralf Zimmer
- Department of Bioinformatics, Ludwig-Maximilians-Universität München, 80333 München, Germany
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, USA
|
45
|
Renson A, Jones HE, Beghini F, Segata N, Zolnik CP, Usyk M, Moody TU, Thorpe L, Burk R, Waldron L, Dowd JB. Sociodemographic variation in the oral microbiome. Ann Epidemiol 2019; 35:73-80.e2. [PMID: 31151886 PMCID: PMC6626698 DOI: 10.1016/j.annepidem.2019.03.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/25/2018] [Revised: 02/18/2019] [Accepted: 03/15/2019] [Indexed: 12/21/2022]
Abstract
PURPOSE: Variations in the oral microbiome are potentially implicated in social inequalities in oral disease, cancers, and metabolic disease. We describe sociodemographic variation of oral microbiomes in a diverse sample. METHODS: We performed 16S rRNA sequencing on mouthwash specimens in a subsample (n = 282) of the 2013-2014 population-based New York City Health and Nutrition Examination Study. We examined differential abundance of 216 operational taxonomic units, and alpha and beta diversity by age, sex, income, education, nativity, and race/ethnicity. For comparison, we examined differential abundance by diet, smoking status, and oral health behaviors. RESULTS: Sixty-nine operational taxonomic units were differentially abundant by any sociodemographic variable (false discovery rate < 0.01), including 27 by race/ethnicity, 21 by family income, 19 by education, 3 by sex. We found 49 differentially abundant by smoking status, 23 by diet, 12 by oral health behaviors. Genera differing for multiple sociodemographic characteristics included Lactobacillus, Prevotella, Porphyromonas, Fusobacterium. CONCLUSIONS: We identified oral microbiome variation consistent with health inequalities, more taxa differing by race/ethnicity than diet, and more by SES variables than oral health behaviors. Investigation is warranted into possible mediating effects of the oral microbiome in social disparities in oral and metabolic diseases and cancers.
Affiliation(s)
- Audrey Renson
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC.
| | - Heidi E Jones
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY
| | - Francesco Beghini
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Nicola Segata
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Christine P Zolnik
- Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY; Department of Biology, Long Island University, Brooklyn, NY
| | - Mykhaylo Usyk
- Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY
| | - Thomas U Moody
- Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY; Immunology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY
| | - Lorna Thorpe
- Department of Population Health, NYU School of Medicine, New York, NY
| | - Robert Burk
- Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY; Departments of Microbiology and Immunology, Epidemiology and Population Health, and Obstetrics, Gynecology and Women's Health, Albert Einstein College of Medicine, Bronx, NY
| | - Levi Waldron
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Institute for Implementation Science in Population Health, City University of New York, New York, NY
| | - Jennifer B Dowd
- Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY; Department of Global Health and Social Medicine, King's College London, London, UK
|
46
|
Gendoo DMA, Zon M, Sandhu V, Manem VSK, Ratanasirigulchai N, Chen GM, Waldron L, Haibe-Kains B. MetaGxData: Clinically Annotated Breast, Ovarian and Pancreatic Cancer Datasets and their Use in Generating a Multi-Cancer Gene Signature. Sci Rep 2019; 9:8770. [PMID: 31217513 PMCID: PMC6584731 DOI: 10.1038/s41598-019-45165-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 11/19/2018] [Accepted: 05/31/2019] [Indexed: 12/13/2022] Open
Abstract
A wealth of transcriptomic and clinical data on solid tumours are under-utilized due to unharmonized data storage and format. We have developed the MetaGxData package compendium, which includes manually-curated and standardized clinical, pathological, survival, and treatment metadata across breast, ovarian, and pancreatic cancer data. MetaGxData is the largest compendium of curated transcriptomic data for these cancer types to date, spanning 86 datasets and encompassing 15,249 samples. Open access to standardized metadata across cancer types promotes use of their transcriptomic and clinical data in a variety of cross-tumour analyses, including identification of common biomarkers, and assessing the validity of prognostic signatures. Here, we demonstrate that MetaGxData is a flexible framework that facilitates meta-analyses by using it to identify common prognostic genes in ovarian and breast cancer. Furthermore, we use the data compendium to create the first gene signature that is prognostic in a meta-analysis across 3 cancer types. These findings demonstrate the potential of MetaGxData to serve as an important resource in oncology research, and provide a foundation for future development of cancer-specific compendia.
Affiliation(s)
- Deena M A Gendoo
- Centre for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, United Kingdom.
| | - Michael Zon
- Princess Margaret Cancer Center, University Health Network, Toronto, M5G 2C1, Canada; Department of Biomedical Engineering, McMaster University, Toronto, L8S 4L8, Canada
| | - Vandana Sandhu
- Princess Margaret Cancer Center, University Health Network, Toronto, M5G 2C1, Canada
| | - Venkata S K Manem
- Princess Margaret Cancer Center, University Health Network, Toronto, M5G 2C1, Canada; Department of Medical Biophysics, University of Toronto, Toronto, M5S 3H7, Canada; Institut Universitaire de Cardiologie et de Pneumologie de Québec, Université Laval, Québec City, G1V 4G5, Canada
| | | | - Gregory M Chen
- Princess Margaret Cancer Center, University Health Network, Toronto, M5G 2C1, Canada
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, Institute of Implementation Science in Population Health, City University of New York School, New York, 11101, USA.
| | - Benjamin Haibe-Kains
- Princess Margaret Cancer Center, University Health Network, Toronto, M5G 2C1, Canada; Department of Medical Biophysics, University of Toronto, Toronto, M5S 3H7, Canada; Department of Computer Science, University of Toronto, Toronto, M5T 3A1, Canada; Ontario Institute of Cancer Research, Toronto, M5G 0A3, Canada; Vector Institute, Toronto, M5G 1M1, Canada.
|
47
|
Schiffer L, Azhar R, Shepherd L, Ramos M, Geistlinger L, Huttenhower C, Dowd JB, Segata N, Waldron L. HMP16SData: Efficient Access to the Human Microbiome Project Through Bioconductor. Am J Epidemiol 2019; 188:1023-1026. [PMID: 30649166 PMCID: PMC6545282 DOI: 10.1093/aje/kwz006] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 06/14/2018] [Revised: 10/08/2018] [Accepted: 10/11/2018] [Indexed: 12/30/2022] Open
Abstract
Phase 1 of the Human Microbiome Project (HMP) investigated 18 body subsites of 242 healthy American adults to produce the first comprehensive reference for the composition and variation of the "healthy" human microbiome. Publicly available data sets from amplicon sequencing of two 16S ribosomal RNA variable regions, with extensive controlled-access participant data, provide a reference for ongoing microbiome studies. However, utilization of these data sets can be hindered by the complex bioinformatic steps required to access, import, decrypt, and merge the various components in formats suitable for ecological and statistical analysis. The HMP16SData package provides count data for both 16S ribosomal RNA variable regions, integrated with phylogeny, taxonomy, public participant data, and controlled participant data for authorized researchers, using standard integrative Bioconductor data objects. By removing bioinformatic hurdles of data access and management, HMP16SData enables epidemiologists with only basic R skills to quickly analyze HMP data.
Affiliation(s)
- Lucas Schiffer
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
| | - Rimsha Azhar
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
| | - Lori Shepherd
- Roswell Park Cancer Institute, University of Buffalo, Buffalo, New York
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
- Roswell Park Cancer Institute, University of Buffalo, Buffalo, New York
| | - Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
| | - Curtis Huttenhower
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- the Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Jennifer B Dowd
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Department of Global Health and Social Medicine, King’s College London, London, United Kingdom
| | - Nicola Segata
- the Centre for Integrative Biology, University of Trento, Trento, Italy
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
|
48
|
Waldron L, Schiffer L, Azhar R, Ramos M, Geistlinger L, Segata N. Waldron et al. Reply to "Commentary on the HMP16SData Bioconductor Package". Am J Epidemiol 2019; 188:1031-1032. [PMID: 30689687 PMCID: PMC6545274 DOI: 10.1093/aje/kwz008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2018] [Accepted: 01/07/2019] [Indexed: 11/13/2022] Open
Affiliation(s)
- Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
| | - Lucas Schiffer
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
| | - Rimsha Azhar
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
- Roswell Park Cancer Institute, University of Buffalo, Buffalo, New York
| | - Ludwig Geistlinger
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York
- Institute for Implementation Science in Population Health, City University of New York, New York, New York
| | - Nicola Segata
- the Centre for Integrative Biology, University of Trento, Trento, Italy
|
49
|
Pasolli E, Schiffer L, Manghi P, Renson A, Obenchain V, Truong DT, Beghini F, Malik F, Ramos M, Dowd JB, Huttenhower C, Morgan M, Segata N, Waldron L. Accessible, curated metagenomic data through ExperimentHub. Nat Methods 2017; 14:1023-1024. [PMID: 29088129 DOI: 10.1038/nmeth.4468] [Citation(s) in RCA: 211] [Impact Index Per Article: 42.2] [Indexed: 11/09/2022]
Affiliation(s)
- Edoardo Pasolli
- Centre for Integrative Biology, University of Trento, Trento, Italy
| | - Lucas Schiffer
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA; Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA
| | - Paolo Manghi
- Centre for Integrative Biology, University of Trento, Trento, Italy
| | - Audrey Renson
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA; Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA
| | - Valerie Obenchain
- Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA
| | - Duy Tin Truong
- Centre for Integrative Biology, University of Trento, Trento, Italy
| | | | - Faizan Malik
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA
| | - Marcel Ramos
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA; Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA; Roswell Park Cancer Institute, University of Buffalo, Buffalo, New York, USA
| | - Jennifer B Dowd
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA; Department of Global Health and Social Medicine, King's College London, London, UK
| | - Curtis Huttenhower
- Biostatistics Department, Harvard School of Public Health, Boston, Massachusetts, USA; The Broad Institute, Cambridge, Massachusetts, USA
| | - Martin Morgan
- Roswell Park Cancer Institute, University of Buffalo, Buffalo, New York, USA
| | - Nicola Segata
- Centre for Integrative Biology, University of Trento, Trento, Italy
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA; Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA
|
50
|
Thomas AM, Manghi P, Asnicar F, Pasolli E, Armanini F, Zolfo M, Beghini F, Manara S, Karcher N, Pozzi C, Gandini S, Serrano D, Tarallo S, Francavilla A, Gallo G, Trompetto M, Ferrero G, Mizutani S, Shiroma H, Shiba S, Shibata T, Yachida S, Yamada T, Wirbel J, Schrotz-King P, Ulrich CM, Brenner H, Arumugam M, Bork P, Zeller G, Cordero F, Dias-Neto E, Setubal JC, Tett A, Pardini B, Rescigno M, Waldron L, Naccarati A, Segata N. Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation. Nat Med 2019; 25:667-678. [PMID: 30936548 PMCID: PMC9533319 DOI: 10.1038/s41591-019-0405-7] [Citation(s) in RCA: 443] [Impact Index Per Article: 88.6] [Received: 05/29/2018] [Accepted: 02/20/2019] [Indexed: 02/07/2023]
Abstract
Several studies have investigated links between the gut microbiome and colorectal cancer (CRC), but questions remain about the replicability of biomarkers across cohorts and populations. We performed a meta-analysis of five publicly available datasets and two new cohorts and validated the findings on two additional cohorts, considering in total 969 fecal metagenomes. Unlike microbiome shifts associated with gastrointestinal syndromes, the gut microbiome in CRC showed reproducibly higher richness than controls (P < 0.01), partially due to expansions of species typically derived from the oral cavity. Meta-analysis of the microbiome functional potential identified gluconeogenesis and the putrefaction and fermentation pathways as being associated with CRC, whereas the stachyose and starch degradation pathways were associated with controls. Predictive microbiome signatures for CRC trained on multiple datasets showed consistently high accuracy in datasets not considered for model training and independent validation cohorts (average area under the curve, 0.84). Pooled analysis of raw metagenomes showed that the choline trimethylamine-lyase gene was overabundant in CRC (P = 0.001), identifying a relationship between microbiome choline metabolism and CRC. The combined analysis of heterogeneous CRC cohorts thus identified reproducible microbiome biomarkers and accurate disease-predictive models that can form the basis for clinical prognostic tests and hypothesis-driven mechanistic studies.
Affiliation(s)
- Andrew Maltez Thomas
- Department CIBIO, University of Trento, Trento, Italy
- Biochemistry Department, Chemistry Institute, University of São Paulo, São Paulo, Brazil
- Medical Genomics Laboratory, CIPE/A.C. Camargo Cancer Center, São Paulo, Brazil
| | - Paolo Manghi
- Department CIBIO, University of Trento, Trento, Italy
| | | | | | | | - Moreno Zolfo
- Department CIBIO, University of Trento, Trento, Italy
| | | | - Serena Manara
- Department CIBIO, University of Trento, Trento, Italy
| | | | - Chiara Pozzi
- IEO, European Institute of Oncology IRCCS, Milan, Italy
| | - Sara Gandini
- IEO, European Institute of Oncology IRCCS, Milan, Italy
| | | | - Sonia Tarallo
- Italian Institute for Genomic Medicine, Turin, Italy
| | | | - Gaetano Gallo
- Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, Italy
- Department of Colorectal Surgery, Clinica S. Rita, Vercelli, Italy
| | - Mario Trompetto
- Department of Colorectal Surgery, Clinica S. Rita, Vercelli, Italy
| | - Giulio Ferrero
- Department of Computer Science, University of Turin, Turin, Italy
| | - Sayaka Mizutani
- School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- Research Fellow of Japan Society for the Promotion of Science, Tokyo, Japan
| | - Hirotsugu Shiroma
- School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
| | - Satoshi Shiba
- Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan
| | - Tatsuhiro Shibata
- Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Shinichi Yachida
- Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan
- Department of Cancer Genome Informatics, Osaka University, Osaka, Japan
| | - Takuji Yamada
- School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- PRESTO, Japan Science and Technology Agency, Saitama, Japan
| | - Jakob Wirbel
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Petra Schrotz-King
- Division of Preventive Oncology, National Center for Tumor Diseases and German Cancer Research Center, Heidelberg, Germany
| | - Cornelia M Ulrich
- Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT, USA
| | - Hermann Brenner
- Division of Preventive Oncology, National Center for Tumor Diseases and German Cancer Research Center, Heidelberg, Germany
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany
- German Cancer Consortium, German Cancer Research Center, Heidelberg, Germany
| | - Manimozhiyan Arumugam
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark
| | - Peer Bork
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Molecular Medicine Partnership Unit, Heidelberg, Germany
- Max Delbrück Centre for Molecular Medicine, Berlin, Germany
- Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
| | - Georg Zeller
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | | | - Emmanuel Dias-Neto
- Medical Genomics Laboratory, CIPE/A.C. Camargo Cancer Center, São Paulo, Brazil
- Laboratory of Neurosciences, Institute of Psychiatry, University of São Paulo, São Paulo, Brazil
| | - João Carlos Setubal
- Biochemistry Department, Chemistry Institute, University of São Paulo, São Paulo, Brazil
- Biocomplexity Institute of Virginia Tech, Blacksburg, VA, USA
| | - Adrian Tett
- Department CIBIO, University of Trento, Trento, Italy
| | - Barbara Pardini
- Italian Institute for Genomic Medicine, Turin, Italy
- Department of Medical Sciences, University of Turin, Turin, Italy
| | - Maria Rescigno
- Mucosal Immunology and Microbiota Unit, Humanitas Research Hospital, Milan, Italy
| | - Levi Waldron
- Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA
- Institute for Implementation Science in Population Health, City University of New York, New York, NY, USA
| | - Alessio Naccarati
- Italian Institute for Genomic Medicine, Turin, Italy
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Prague, Czech Republic
| | - Nicola Segata
- Department CIBIO, University of Trento, Trento, Italy.
|