1
Suhre K, Venkataraman GR, Guturu H, Halama A, Stephan N, Thareja G, Sarwath H, Motamedchaboki K, Donovan MKR, Siddiqui A, Batzoglou S, Schmidt F. Nanoparticle enrichment mass-spectrometry proteomics identifies protein-altering variants for precise pQTL mapping. Nat Commun 2024; 15:989. [PMID: 38307861 PMCID: PMC10837160 DOI: 10.1038/s41467-024-45233-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 01/16/2024] [Indexed: 02/04/2024] Open
Abstract
Proteogenomics studies generate hypotheses on protein function and provide genetic evidence for drug target prioritization. Most previous work has been conducted using affinity-based proteomics approaches. These technologies face challenges, such as uncertainty regarding target identity, non-specific binding, and handling of variants that affect epitope affinity binding. Mass spectrometry-based proteomics can overcome some of these challenges. Here we report a pQTL study using the Proteograph™ Product Suite workflow (Seer, Inc.) where we quantify over 18,000 unique peptides from nearly 3000 proteins in more than 320 blood samples from a multi-ethnic cohort in a bottom-up, peptide-centric, mass spectrometry-based proteomics approach. We identify 184 protein-altering variants in 137 genes that are significantly associated with their corresponding variant peptides, confirming target specificity of co-associated affinity binders, identifying putatively causal cis-encoded proteins and providing experimental evidence for their presence in blood, including proteins that may be inaccessible to affinity-based proteomics.
Affiliation(s)
- Karsten Suhre
- Bioinformatics Core, Weill Cornell Medicine-Qatar, Education City, 24144, Doha, Qatar.
- Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, 10021, USA.
- Anna Halama
- Bioinformatics Core, Weill Cornell Medicine-Qatar, Education City, 24144, Doha, Qatar
- Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, 10021, USA
- Nisha Stephan
- Bioinformatics Core, Weill Cornell Medicine-Qatar, Education City, 24144, Doha, Qatar
- Gaurav Thareja
- Bioinformatics Core, Weill Cornell Medicine-Qatar, Education City, 24144, Doha, Qatar
- Hina Sarwath
- Proteomics Core, Weill Cornell Medicine-Qatar, Education City, 24144, Doha, Qatar
- Asim Siddiqui
- Seer, Inc., Redwood City, CA, 94065, USA
- Frank Schmidt
- Proteomics Core, Weill Cornell Medicine-Qatar, Education City, 24144, Doha, Qatar
2
Nguyen JP, Arthur TD, Fujita K, Salgado BM, Donovan MKR, Matsui H, Kim JH, D'Antonio-Chronowska A, D'Antonio M, Frazer KA. eQTL mapping in fetal-like pancreatic progenitor cells reveals early developmental insights into diabetes risk. Nat Commun 2023; 14:6928. [PMID: 37903777 PMCID: PMC10616100 DOI: 10.1038/s41467-023-42560-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 10/13/2023] [Indexed: 11/01/2023] Open
Abstract
The impact of genetic regulatory variation active in early pancreatic development on adult pancreatic disease and traits is not well understood. Here, we generate a panel of 107 fetal-like iPSC-derived pancreatic progenitor cells (iPSC-PPCs) from whole genome-sequenced individuals and identify 4065 genes and 4016 isoforms whose expression and/or alternative splicing are affected by regulatory variation. We integrate eQTLs identified in adult islets and whole pancreas samples, which reveal 1805 eQTL associations that are unique to the fetal-like iPSC-PPCs and 1043 eQTLs that exhibit regulatory plasticity across the fetal-like and adult pancreas tissues. Colocalization with GWAS risk loci for pancreatic diseases and traits shows that some putative causal regulatory variants are active only in the fetal-like iPSC-PPCs and likely influence disease by modulating expression of disease-associated genes in early development, while others with regulatory plasticity likely exert their effects in both the fetal and adult pancreas by modulating expression of different disease genes in the two developmental stages.
Affiliation(s)
- Jennifer P Nguyen
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Timothy D Arthur
- Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Kyohei Fujita
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
- Bianca M Salgado
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Margaret K R Donovan
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Hiroko Matsui
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Ji Hyun Kim
- Department of Pediatrics, Dongguk University Ilsan Hospital, Goyang, South Korea
- Matteo D'Antonio
- Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Kelly A Frazer
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA.
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
3
Huang T, Wang J, Stukalov A, Donovan MKR, Ferdosi S, Williamson L, Just S, Castro G, Cantrell LS, Elgierari E, Benz RW, Huang Y, Motamedchaboki K, Hakimi A, Arrey T, Damoc E, Kreimer S, Farokhzad OC, Batzoglou S, Siddiqui A, Van Eyk JE, Hornburg D. Protein Coronas on Functionalized Nanoparticles Enable Quantitative and Precise Large-Scale Deep Plasma Proteomics. bioRxiv 2023:2023.08.28.555225. [PMID: 37693476 PMCID: PMC10491250 DOI: 10.1101/2023.08.28.555225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
Background: The wide dynamic range of circulating proteins coupled with the diversity of proteoforms present in plasma has historically impeded comprehensive and quantitative characterization of the plasma proteome at scale. Automated nanoparticle (NP) protein corona-based proteomics workflows can efficiently compress the dynamic range of protein abundances into a mass spectrometry (MS)-accessible detection range. This enhances the depth and scalability of quantitative MS-based methods, which can elucidate the molecular mechanisms of biological processes, discover new protein biomarkers, and improve comprehensiveness of MS-based diagnostics.
Methods: Using multi-species spike-in experiments and a cohort, we investigated fold-change accuracy, linearity, precision, and statistical power of the Proteograph™ Product Suite, a deep plasma proteomics workflow, in conjunction with multiple MS instruments.
Results: We show that NP-based workflows enable accurate identification (false discovery rate of 1%) of more than 6,000 proteins from plasma (Orbitrap Astral) and, compared to a gold standard neat plasma workflow that is limited to the detection of hundreds of plasma proteins, facilitate quantification of more proteins with accurate fold-changes, high linearity, and precision. Furthermore, we demonstrate high statistical power for the discovery of biomarkers in small- and large-scale cohorts.
Conclusions: The automated NP workflow enables high-throughput, deep, and quantitative plasma proteomics investigation with sufficient power to discover new biomarker signatures with a peptide level resolution.
Affiliation(s)
- Jian Wang
- Seer, Inc., Redwood City, CA, 94065, USA
- Seth Just
- Seer, Inc., Redwood City, CA, 94065, USA
- Eugen Damoc
- Thermo Fisher Scientific (Bremen) GmbH, Germany
- Simion Kreimer
- Advanced Clinical Biosystems Research Institute, Precision Health, Barbra Streisand Women’s Heart Center at the Smidt Heart Institute, Cedars-Sinai Medical Center, 127 S. San Vicente Blvd., Los Angeles, CA, 90048, USA
- Jennifer E. Van Eyk
- Advanced Clinical Biosystems Research Institute, Precision Health, Barbra Streisand Women’s Heart Center at the Smidt Heart Institute, Cedars-Sinai Medical Center, 127 S. San Vicente Blvd., Los Angeles, CA, 90048, USA
4
Donovan MKR, Huang Y, Blume JE, Wang J, Hornburg D, Ferdosi S, Mohtashemi I, Kim S, Ko M, Benz RW, Platt TL, Batzoglou S, Diaz LA, Farokhzad OC, Siddiqui A. Functionally distinct BMP1 isoforms show an opposite pattern of abundance in plasma from non-small cell lung cancer subjects and controls. PLoS One 2023; 18:e0282821. [PMID: 36989217 PMCID: PMC10058078 DOI: 10.1371/journal.pone.0282821] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/23/2022] [Accepted: 02/23/2023] [Indexed: 03/30/2023] Open
Abstract
Advancements in deep plasma proteomics are enabling high-resolution measurement of plasma proteoforms, which may reveal a rich source of novel biomarkers previously concealed by aggregated protein methods. Here, we analyze 188 plasma proteomes from non-small cell lung cancer (NSCLC) subjects and controls to identify NSCLC-associated protein isoforms by examining differentially abundant peptides as a proxy for isoform-specific exon usage. We find four proteins composed of peptides with opposite patterns of abundance between cancer and control subjects. One of these proteins, BMP1, has known isoforms that can explain this differential pattern, for which the abundance of the NSCLC-associated isoform increases with stage of NSCLC progression. The presence of cancer and control-associated isoforms suggests differential regulation of BMP1 isoforms. The identified BMP1 isoforms have known functional differences, which may reveal insights into mechanisms impacting NSCLC disease progression.
Affiliation(s)
- John E Blume
- Seer, Inc., Redwood City, CA, United States of America
- Jian Wang
- Seer, Inc., Redwood City, CA, United States of America
- Shadi Ferdosi
- Seer, Inc., Redwood City, CA, United States of America
- Sangtae Kim
- Seer, Inc., Redwood City, CA, United States of America
- Marwin Ko
- Seer, Inc., Redwood City, CA, United States of America
- Ryan W Benz
- Seer, Inc., Redwood City, CA, United States of America
- Luis A Diaz
- The Ludwig Center and The Howard Hughes Medical Institute at Johns Hopkins Kimmel Cancer Center, Baltimore, MD, United States of America
- Asim Siddiqui
- Seer, Inc., Redwood City, CA, United States of America
5
Jakubosky D, D'Antonio M, Bonder MJ, Smail C, Donovan MKR, Young Greenwald WW, Matsui H, D'Antonio-Chronowska A, Stegle O, Smith EN, Montgomery SB, DeBoever C, Frazer KA. Properties of structural variants and short tandem repeats associated with gene expression and complex traits. Nat Commun 2020; 11:2927. [PMID: 32522982 PMCID: PMC7286898 DOI: 10.1038/s41467-020-16482-4] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 07/26/2019] [Accepted: 05/05/2020] [Indexed: 12/14/2022] Open
Abstract
Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we identify genomic features of SV classes and STRs that are associated with gene expression and complex traits, including their locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We identify a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and show that they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that are associated with gene expression and human traits.
Affiliation(s)
- David Jakubosky
- Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA, 92093-0419, USA
- Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, 92093-0419, USA
- Matteo D'Antonio
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Marc Jan Bonder
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Craig Smail
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Department of Pathology, Stanford University, Stanford, California, 94305, USA
- Margaret K R Donovan
- Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, 92093-0419, USA
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- William W Young Greenwald
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- Hiroko Matsui
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Oliver Stegle
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany
- Erin N Smith
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
- Stephen B Montgomery
- Department of Pathology, Stanford University, Stanford, California, 94305, USA
- Department of Genetics, Stanford University, Stanford, California, 94305, USA
- Christopher DeBoever
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Kelly A Frazer
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA.
6
Donovan MKR, D'Antonio-Chronowska A, D'Antonio M, Frazer KA. Cellular deconvolution of GTEx tissues powers discovery of disease and cell-type associated regulatory variants. Nat Commun 2020; 11:955. [PMID: 32075962 PMCID: PMC7031340 DOI: 10.1038/s41467-020-14561-0] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 06/08/2019] [Accepted: 01/17/2020] [Indexed: 12/31/2022] Open
Abstract
The Genotype-Tissue Expression (GTEx) resource has provided insights into the regulatory impact of genetic variation on gene expression across human tissues; however, it has not yet considered how variation acts at the resolution of the different cell types. Here, using gene expression signatures obtained from mouse cell types, we deconvolute bulk RNA-seq samples from 28 GTEx tissues to quantify cellular composition, which reveals striking heterogeneity across these samples. Conducting eQTL analyses for GTEx liver and skin samples using cell composition estimates as interaction terms, we identify thousands of genetic associations that are cell-type-associated. The skin cell-type associated eQTLs colocalize with skin diseases, indicating that variants which influence gene expression in distinct skin cell types play important roles in traits and disease. Our study provides a framework to estimate the cellular composition of GTEx tissues, enabling the functional characterization of human genetic variation that impacts gene expression in a cell-type-specific manner.
Affiliation(s)
- Margaret K R Donovan
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Matteo D'Antonio
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, 92093, USA.
- Kelly A Frazer
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, 92093, USA.
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, 92093, USA.
7
D'Antonio-Chronowska A, Donovan MKR, Young Greenwald WW, Nguyen JP, Fujita K, Hashem S, Matsui H, Soncin F, Parast M, Ward MC, Coulet F, Smith EN, Adler E, D'Antonio M, Frazer KA. Association of Human iPSC Gene Signatures and X Chromosome Dosage with Two Distinct Cardiac Differentiation Trajectories. Stem Cell Reports 2019; 13:924-938. [PMID: 31668852 PMCID: PMC6895695 DOI: 10.1016/j.stemcr.2019.09.011] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 04/12/2019] [Revised: 09/27/2019] [Accepted: 09/30/2019] [Indexed: 11/30/2022] Open
Abstract
Despite the importance of understanding how variability across induced pluripotent stem cell (iPSC) lines due to non-genetic factors (clone and passage) influences their differentiation outcome, large-scale studies capable of addressing this question have not yet been conducted. Here, we differentiated 191 iPSC lines to generate iPSC-derived cardiovascular progenitor cells (iPSC-CVPCs). We observed cellular heterogeneity across the iPSC-CVPC samples due to varying fractions of two cell types: cardiomyocytes (CMs) and epicardium-derived cells (EPDCs). Comparing the transcriptomes of CM-fated and EPDC-fated iPSCs, we discovered that 91 signature genes and X chromosome dosage differences are associated with these two distinct cardiac developmental trajectories. In an independent set of 39 iPSCs differentiated into CMs, we confirmed that sex and transcriptional differences affect cardiac-fate outcome. Our study provides novel insights into how iPSC transcriptional and X chromosome gene dosage differences influence their response to differentiation stimuli and, hence, cardiac cell fate.
- Cellular heterogeneity across iPSC-CVPCs due to varying fractions of CMs and EPDCs
- iPSC non-genetic factors (clone and passage) associated with cardiac cell fate
- Expression levels of signature genes in iPSCs associated with cardiac lineage fate
- iPSC donor sex plays a role in cardiac lineage fate
Affiliation(s)
- Margaret K R Donovan
- Bioinformatics and Systems Biology Graduate Program, UC San Diego, La Jolla, CA 92093, USA
- Jennifer Phuong Nguyen
- Bioinformatics and Systems Biology Graduate Program, UC San Diego, La Jolla, CA 92093, USA
- Kyohei Fujita
- Department of Pediatrics, UC San Diego, La Jolla, CA 92093, USA
- Sherin Hashem
- Division of Cardiology, Department of Medicine, UC San Diego, La Jolla, CA 92093, USA
- Hiroko Matsui
- Department of Pediatrics, UC San Diego, La Jolla, CA 92093, USA
- Mana Parast
- Department of Pathology, UC San Diego, La Jolla, CA 92093, USA
- Michelle C Ward
- Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Florence Coulet
- Department of Pediatrics, UC San Diego, La Jolla, CA 92093, USA
- Erin N Smith
- Department of Pediatrics, UC San Diego, La Jolla, CA 92093, USA
- Eric Adler
- Division of Cardiology, Department of Medicine, UC San Diego, La Jolla, CA 92093, USA
- Matteo D'Antonio
- Department of Pediatrics, UC San Diego, La Jolla, CA 92093, USA.
- Kelly A Frazer
- Department of Pediatrics, UC San Diego, La Jolla, CA 92093, USA.
8
Benaglio P, D'Antonio-Chronowska A, Ma W, Yang F, Young Greenwald WW, Donovan MKR, DeBoever C, Li H, Drees F, Singhal S, Matsui H, van Setten J, Sotoodehnia N, Gaulton KJ, Smith EN, D'Antonio M, Rosenfeld MG, Frazer KA. Allele-specific NKX2-5 binding underlies multiple genetic associations with human electrocardiographic traits. Nat Genet 2019; 51:1506-1517. [PMID: 31570892 PMCID: PMC6858543 DOI: 10.1038/s41588-019-0499-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 02/21/2018] [Accepted: 08/15/2019] [Indexed: 12/15/2022]
Abstract
The cardiac transcription factor (TF) gene NKX2-5 has been associated with electrocardiographic (EKG) traits through genome-wide association studies (GWASs), but the extent to which differential binding of NKX2-5 at common regulatory variants contributes to these traits has not yet been studied. We analyzed transcriptomic and epigenomic data from induced pluripotent stem cell-derived cardiomyocytes from seven related individuals, and identified ~2,000 single-nucleotide variants associated with allele-specific effects (ASE-SNVs) on NKX2-5 binding. NKX2-5 ASE-SNVs were enriched for altered TF motifs, for heart-specific expression quantitative trait loci and for EKG GWAS signals. Using fine-mapping combined with epigenomic data from induced pluripotent stem cell-derived cardiomyocytes, we prioritized candidate causal variants for EKG traits, many of which were NKX2-5 ASE-SNVs. Experimentally characterizing two NKX2-5 ASE-SNVs (rs3807989 and rs590041) showed that they modulate the expression of target genes via differential protein binding in cardiac cells, indicating that they are functional variants underlying EKG GWAS signals. Our results show that differential NKX2-5 binding at numerous regulatory variants across the genome contributes to EKG phenotypes.
Affiliation(s)
- Paola Benaglio
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Wubin Ma
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Feng Yang
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Margaret K R Donovan
- Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA
- Christopher DeBoever
- Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA
- He Li
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Frauke Drees
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Sanghamitra Singhal
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Jessica van Setten
- Department of Cardiology, University Medical Center Utrecht, University of Utrecht, Utrecht, the Netherlands
- Nona Sotoodehnia
- Department of Medicine, Cardiovascular Health Research Unit, Division of Cardiology, University of Washington, Seattle, WA, USA; Department of Epidemiology, Cardiovascular Health Research Unit, Division of Cardiology, University of Washington, Seattle, WA, USA
- Kyle J Gaulton
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Erin N Smith
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Matteo D'Antonio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Michael G Rosenfeld
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA.
- Kelly A Frazer
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA.
9
D'Antonio M, Benaglio P, Jakubosky D, Greenwald WW, Matsui H, Donovan MKR, Li H, Smith EN, D'Antonio-Chronowska A, Frazer KA. Insights into the Mutational Burden of Human Induced Pluripotent Stem Cells from an Integrative Multi-Omics Approach. Cell Rep 2018; 24:883-894. [PMID: 30044985 PMCID: PMC6467479 DOI: 10.1016/j.celrep.2018.06.091] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 01/10/2018] [Revised: 04/29/2018] [Accepted: 06/21/2018] [Indexed: 12/16/2022] Open
Abstract
To understand the mutational burden of human induced pluripotent stem cells (iPSCs), we sequenced genomes of 18 fibroblast-derived iPSC lines and identified different classes of somatic mutations based on structure, origin, and frequency. Copy-number alterations affected 295 kb in each sample and strongly impacted gene expression. UV-damage mutations were present in ∼45% of the iPSCs and accounted for most of the observed heterogeneity in mutation rates across lines. Subclonal mutations (not present in all iPSCs within a line) composed 10% of point mutations and, compared with clonal variants, showed an enrichment in active promoters and increased association with altered gene expression. Our study shows that, by combining WGS, transcriptome, and epigenome data, we can understand the mutational burden of each iPSC line on an individual basis and suggests that this information could be used to prioritize iPSC lines for models of specific human diseases and/or transplantation therapy.
Affiliation(s)
- Matteo D'Antonio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Paola Benaglio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- David Jakubosky
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA 92093, USA
- William W Greenwald
- Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA 92093, USA
- Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Margaret K R Donovan
- Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA 92093, USA
- He Li
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Erin N Smith
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093, USA
- Kelly A Frazer
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093, USA.
10
Panopoulos AD, D'Antonio M, Benaglio P, Williams R, Hashem SI, Schuldt BM, DeBoever C, Arias AD, Garcia M, Nelson BC, Harismendy O, Jakubosky DA, Donovan MKR, Greenwald WW, Farnam K, Cook M, Borja V, Miller CA, Grinstein JD, Drees F, Okubo J, Diffenderfer KE, Hishida Y, Modesto V, Dargitz CT, Feiring R, Zhao C, Aguirre A, McGarry TJ, Matsui H, Li H, Reyna J, Rao F, O'Connor DT, Yeo GW, Evans SM, Chi NC, Jepsen K, Nariai N, Müller FJ, Goldstein LSB, Izpisua Belmonte JC, Adler E, Loring JF, Berggren WT, D'Antonio-Chronowska A, Smith EN, Frazer KA. iPSCORE: A Resource of 222 iPSC Lines Enabling Functional Characterization of Genetic Variation across a Variety of Cell Types. Stem Cell Reports 2017; 8:1086-1100. [PMID: 28410642 PMCID: PMC5390244 DOI: 10.1016/j.stemcr.2017.03.012] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Received: 07/27/2016] [Revised: 03/08/2017] [Accepted: 03/13/2017] [Indexed: 11/18/2022] Open
Abstract
Large-scale collections of induced pluripotent stem cells (iPSCs) could serve as powerful model systems for examining how genetic variation affects biology and disease. Here we describe the iPSCORE resource: a collection of systematically derived and characterized iPSC lines from 222 ethnically diverse individuals that allows for both familial and association-based genetic studies. iPSCORE lines are pluripotent with high genomic integrity (no or low numbers of somatic copy-number variants) as determined using high-throughput RNA-sequencing and genotyping arrays, respectively. Using iPSCs from a family of individuals, we show that iPSC-derived cardiomyocytes demonstrate gene expression patterns that cluster by genetic background, and can be used to examine variants associated with physiological and disease phenotypes. The iPSCORE collection contains representative individuals for risk and non-risk alleles for 95% of SNPs associated with human phenotypes through genome-wide association studies. Our study demonstrates the utility of iPSCORE for examining how genetic variants influence molecular and physiological traits in iPSCs and derived cell lines.
Affiliation(s)
- Athanasia D Panopoulos
- Gene Expression Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Matteo D'Antonio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Paola Benaglio
- Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Roy Williams
- Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA; Center for Regenerative Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Sherin I Hashem
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Bernhard M Schuldt
- Zentrum für Integrative Psychiatrie, Universitätsklinikum Schleswig-Holstein, 24105 Kiel, Germany
- Christopher DeBoever
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Angelo D Arias
- Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Melvin Garcia
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Bradley C Nelson
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Olivier Harismendy
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- David A Jakubosky
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Margaret K R Donovan
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- William W Greenwald
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- KathyJean Farnam
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Megan Cook
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Victor Borja
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Carl A Miller
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Jonathan D Grinstein
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Frauke Drees
- Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Jonathan Okubo
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Yuriko Hishida
- Gene Expression Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Veronica Modesto
- Stem Cell Core, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Carl T Dargitz
- Stem Cell Core, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Rachel Feiring
- Stem Cell Core, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Chang Zhao
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Aitor Aguirre
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Thomas J McGarry
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- He Li
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Joaquin Reyna
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Fangwen Rao
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Daniel T O'Connor
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Gene W Yeo
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Sylvia M Evans
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Neil C Chi
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Kristen Jepsen
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Naoki Nariai
- Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Franz-Josef Müller
- Zentrum für Integrative Psychiatrie, Universitätsklinikum Schleswig-Holstein, 24105 Kiel, Germany
- Lawrence S B Goldstein
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Eric Adler
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Jeanne F Loring
- Center for Regenerative Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- W Travis Berggren
- Stem Cell Core, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Erin N Smith
- Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Kelly A Frazer
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA.