1. Zeng J, Wang D, Tong Z, Li Z, Wang G, Du Y, Li J, Miao J, Chen S. Development of a prognostic model for osteosarcoma based on macrophage polarization-related genes using machine learning: implications for personalized therapy. Clin Exp Med 2025; 25:146. [PMID: 40343502 PMCID: PMC12064610 DOI: 10.1007/s10238-024-01530-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2024] [Accepted: 11/25/2024] [Indexed: 05/11/2025]
Abstract
While neoadjuvant chemotherapy combined with surgical resection has improved the prognosis for patients with osteosarcoma, its impact on metastatic and recurrent cases remains limited. Immunotherapy is emerging as a promising alternative. However, the relationship between the phenotype of tumor-associated macrophages and the prognosis of osteosarcoma remains unclear. Differentially expressed genes during macrophage polarization were identified using the Monocle package. Weighted gene co-expression network analysis was conducted to select genes regulating macrophage polarization. The least absolute shrinkage and selection operator algorithm and multivariate Cox regression were used to construct long-term survival predictive strategies. Multiple machine learning algorithms identified target genes for pan-cancer analysis. Lentiviral transfection created stable strains with target gene knockdown, and CCK-8 and transwell migration assays verified the target gene's effects. Western blot and flow cytometry assessed the impact of target genes on macrophage polarization. A total of 141 genes regulating macrophage polarization were identified, from which eight genes were selected to construct prognostic models. Significant differences between high-risk and low-risk groups were observed in immune cell activation, immune-related signaling pathways, and immune function. The prognostic model and target gene were validated to provide more precise immunotherapy options for osteosarcoma and other tumors. BNIP3 knockdown decreased osteosarcoma cell proliferation and migration and promoted macrophage polarization to the M2 phenotype. The constructed prognostic model offers precise immunotherapy regimens and valuable insights into mechanisms underlying current studies. Furthermore, BNIP3 may serve as a potential immunotherapeutic target for osteosarcoma and other tumors.
Affiliation(s)
- Jin Zeng
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- Dong Wang
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- ZhaoChen Tong
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- ZiXin Li
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- GuoWei Wang
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- YuMeng Du
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- Jinsong Li
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- Jinglei Miao
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
- Shijie Chen
  - Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China.
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
2. Wang J, Zhou Y, Zhang M, Li X, Liu T, Liu Y, Xie H, Wang K, Li P, Xu Z, Duan B. Resolving floral development dynamics using genome and single-cell temporal transcriptome of Dendrobium devonianum. PLANT BIOTECHNOLOGY JOURNAL 2025. [PMID: 40238860 DOI: 10.1111/pbi.70094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2025] [Revised: 03/21/2025] [Accepted: 04/04/2025] [Indexed: 04/18/2025]
Abstract
Dendrobium devonianum, a species of the Orchidaceae family, is notable for its unique floral characteristics, which include two yellow spots and purple tips on its labellum, as well as fringed edges. However, the molecular mechanisms underlying flower pattern formation in D. devonianum remain poorly understood, hindering advancements in its breeding process. Here, a chromosome-scale genome of D. devonianum is presented for the first time, revealing two significant polyploidization events. Additionally, a high-resolution single-cell transcriptomic atlas was constructed, capturing 11 distinct cell clusters. Expression patterns of MADS-box genes were identified through temporal and spatial bulk RNA-Seq, revealing alignment with the ABCDE model of flower formation. Meanwhile, mass spectrometry imaging and scRNA analyses showed that the yellow spots were primarily associated with carotenoid biosynthesis gene expression, while the purple colour was predominantly linked to anthocyanin biosynthesis gene expression. These genes were mainly expressed in the epidermis and vascular cells. Developmental trajectory analyses of epidermal cells further uncovered a gene regulatory network and several transcription factors likely responsible for fringe formation along the labellum margin. This study provides valuable insights into the molecular mechanisms driving floral colour differentiation and structural traits in D. devonianum, contributing to a deeper understanding of orchid evolution, diversification and breeding.
Affiliation(s)
- Jing Wang
  - College of Pharmaceutical Science, Dali University, Dali, China
  - College of Life Science, Northeast Forestry University, Harbin, China
- Ying Zhou
  - College of Pharmaceutical Science, Dali University, Dali, China
  - Institute of Caulis Dendrobii Longling County, Baoshan, China
- Manchang Zhang
  - Institute of Caulis Dendrobii Longling County, Baoshan, China
  - International Joint Laboratory for the Development and Utilization of Traditional Chinese Medicine Resources in Yunnan Province, Baoshan, Dali, China
  - Baoshan Food and Drug Inspection and Testing Center, Baoshan, China
- Xinping Li
  - College of Pharmaceutical Science, Dali University, Dali, China
  - College of Life Science, Northeast Forestry University, Harbin, China
  - International Joint Laboratory for the Development and Utilization of Traditional Chinese Medicine Resources in Yunnan Province, Baoshan, Dali, China
- Tingxia Liu
  - College of Pharmaceutical Science, Dali University, Dali, China
  - International Joint Laboratory for the Development and Utilization of Traditional Chinese Medicine Resources in Yunnan Province, Baoshan, Dali, China
- Yinglin Liu
  - College of Pharmaceutical Science, Dali University, Dali, China
  - International Joint Laboratory for the Development and Utilization of Traditional Chinese Medicine Resources in Yunnan Province, Baoshan, Dali, China
- He Xie
  - Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming, Yunnan, China
- Kaiying Wang
  - Chinese PLA Center for Disease Control and Prevention, Beijing, China
- Peng Li
  - Chinese PLA Center for Disease Control and Prevention, Beijing, China
- Zhichao Xu
  - College of Life Science, Northeast Forestry University, Harbin, China
  - International Joint Laboratory for the Development and Utilization of Traditional Chinese Medicine Resources in Yunnan Province, Baoshan, Dali, China
- Baozhong Duan
  - College of Pharmaceutical Science, Dali University, Dali, China
  - College of Life Science, Northeast Forestry University, Harbin, China
  - International Joint Laboratory for the Development and Utilization of Traditional Chinese Medicine Resources in Yunnan Province, Baoshan, Dali, China
3. Wang Y, Zheng P, Cheng YC, Wang Z, Aravkin A. WENDY: Covariance dynamics based gene regulatory network inference. Math Biosci 2024; 377:109284. [PMID: 39168402 DOI: 10.1016/j.mbs.2024.109284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 06/25/2024] [Accepted: 08/16/2024] [Indexed: 08/23/2024]
Abstract
Determining gene regulatory network (GRN) structure is a central problem in biology, with a variety of inference methods available for different types of data. For a widely prevalent and challenging use case, namely single-cell gene expression data measured after intervention at multiple time points with unknown joint distributions, there is only one known specifically developed method, which does not fully utilize the rich information contained in this data type. We develop an inference method for the GRN in this case, netWork infErence by covariaNce DYnamics, dubbed WENDY. The core idea of WENDY is to model the dynamics of the covariance matrix, and solve this dynamics as an optimization problem to determine the regulatory relationships. To evaluate its effectiveness, we compare WENDY with other inference methods using synthetic data and experimental data. Our results demonstrate that WENDY performs well across different data sets.
Affiliation(s)
- Yue Wang
  - Irving Institute for Cancer Dynamics and Department of Statistics, Columbia University, New York, 10027, NY, USA.
- Peng Zheng
  - Institute for Health Metrics and Evaluation, Seattle, 98195, WA, USA; Department of Health Metrics Sciences, University of Washington, Seattle, 98195, WA, USA
- Yu-Chen Cheng
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, 02215, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA; Center for Cancer Evolution, Dana-Farber Cancer Institute, Boston, 02215, MA, USA; Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, 02138, MA, USA
- Zikun Wang
  - Laboratory of Genetics, The Rockefeller University, New York, 10065, NY, USA
- Aleksandr Aravkin
  - Department of Applied Mathematics, University of Washington, Seattle, 98195, WA, USA
4. Wang X, Zhang T, Zheng B, Lu Y, Liang Y, Xu G, Zhao L, Tao Y, Song Q, You H, Hu H, Li X, Sun K, Li T, Zhang Z, Wang J, Lan X, Pan D, Fu YX, Yue B, Zheng H. Lymphotoxin-β promotes breast cancer bone metastasis colonization and osteolytic outgrowth. Nat Cell Biol 2024; 26:1597-1612. [PMID: 39147874 DOI: 10.1038/s41556-024-01478-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 07/11/2024] [Indexed: 08/17/2024]
Abstract
Bone metastasis is a lethal consequence of breast cancer. Here we used single-cell transcriptomics to investigate the molecular mechanisms underlying bone metastasis colonization-the rate-limiting step in the metastatic cascade. We identified that lymphotoxin-β (LTβ) is highly expressed in tumour cells within the bone microenvironment and this expression is associated with poor bone metastasis-free survival. LTβ promotes tumour cell colonization and outgrowth in multiple breast cancer models. Mechanistically, tumour-derived LTβ activates osteoblasts through nuclear factor-κB2 signalling to secrete CCL2/5, which facilitates tumour cell adhesion to osteoblasts and accelerates osteoclastogenesis, leading to bone metastasis progression. Blocking LTβ signalling with a decoy receptor significantly suppressed bone metastasis in vivo, whereas clinical sample analysis revealed significantly higher LTβ expression in bone metastases than in primary tumours. Our findings highlight LTβ as a bone niche-induced factor that promotes tumour cell colonization and osteolytic outgrowth and underscore its potential as a therapeutic target for patients with bone metastatic disease.
Affiliation(s)
- Xuxiang Wang
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Tengjiang Zhang
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Bingxin Zheng
  - Department of Orthopedic Oncology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Youxue Lu
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Yong Liang
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Guoyuan Xu
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Luyang Zhao
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Yuwei Tao
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Qianhui Song
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Huiwen You
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Haitian Hu
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Xuan Li
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Keyong Sun
  - Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Tianqi Li
  - School of Life Sciences and Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China
- Zian Zhang
  - Department of Joint Surgery, The Affiliated Hospital of Qingdao University, Qingdao, China
- Jianbin Wang
  - School of Life Sciences and Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China
- Xun Lan
  - State Key Laboratory of Molecular Oncology and Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Deng Pan
  - State Key Laboratory of Molecular Oncology and Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Yang-Xin Fu
  - State Key Laboratory of Molecular Oncology and Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China
- Bin Yue
  - Department of Orthopedic Oncology, The Affiliated Hospital of Qingdao University, Qingdao, China.
- Hanqiu Zheng
  - State Key Laboratory of Molecular Oncology and Center for Cancer Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China.
  - SXMU-Tsinghua Collaborative Innovation Center for Frontier Medicine, Shanxi Medical University, Taiyuan, China.
5. Zhang D, Zhao F, Li J, Guo P, Liu H, Lu T, Li S, Li Z, Li Y. Comprehensive single-cell transcriptomic profiling reveals molecular subtypes and prognostic biomarkers with implications for targeted therapy in esophageal squamous cell carcinoma. Transl Oncol 2024; 44:101948. [PMID: 38582059 PMCID: PMC11004200 DOI: 10.1016/j.tranon.2024.101948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 02/05/2024] [Accepted: 03/26/2024] [Indexed: 04/08/2024] Open
Abstract
BACKGROUND: Esophageal squamous cell carcinoma (ESCC) is a genetically heterogeneous disease with poor clinical outcomes. Identification of biomarkers linked to DNA replication stress may enable improved prognostic risk stratification and guide therapeutic decision making. We performed integrated single-cell RNA sequencing and computational analyses to define the molecular determinants and subtypes underlying ESCC heterogeneity.
METHODS: Single-cell RNA sequencing was performed on ESCC samples and analyzed using Seurat. Differential gene expression analysis was used to identify esophageal cell phenotypes. DNA replication stress-related genes were intersected with single-cell differential expression data to identify potential prognostic genes, which were used to generate a DNA replication stress (DRS) score. This score and associated genes were evaluated in survival analysis. Putative prognostic biomarkers were evaluated by Cox regression and consensus clustering. Mendelian randomization analyses assessed the causal role of PRKCB.
RESULTS: A high DRS score was associated with poor survival. Four genes (CDKN2A, NUP155, PPP2R2A, PRKCB) displayed prognostic utility. Three molecular subtypes were identified with discrete survival and immune properties. A 12-gene signature displayed robust prognostic performance. PRKCB was overexpressed in ESCC, while PRKCB knockdown reduced ESCC cell migration.
CONCLUSIONS: This integrated single-cell sequencing analysis provides new insights into the molecular heterogeneity and prognostic determinants underlying ESCC. The findings identify potential prognostic biomarkers and a gene expression signature that may enable improved patient risk stratification in ESCC. Experimental validation of the role of PRKCB substantiates the potential clinical utility of our results.
Affiliation(s)
- Dengfeng Zhang
  - Department of Thoracic Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang 050000, China
- Fangchao Zhao
  - Department of Thoracic Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang 050000, China
- Jing Li
  - Department of Thoracic Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang 050000, China
- Pengfei Guo
  - Department of Thoracic Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang 050000, China
- Haitao Liu
  - College of Life Science, Inner Mongolia University, Hohhot 010000, China
- Tianxing Lu
  - Department of Thoracic Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang 050000, China
- Shujun Li
  - Department of Thoracic Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang 050000, China.
- Zhirong Li
  - Provincial Center for Clinical Laboratories, The Second Hospital of Hebei Medical University, Shijiazhuang 050000, China.
- Yishuai Li
  - Department of Thoracic Surgery, Hebei Chest Hospital, Shijiazhuang 050000, China; Hebei Provincial Key Laboratory of Pulmonary Diseases, Shijiazhuang 050000, China.
6. Vo HK, Dawes JHP, Kelsh RN. Oscillatory differentiation dynamics fundamentally restricts the resolution of pseudotime reconstruction algorithms. J R Soc Interface 2024; 21:20230537. [PMID: 38503342 PMCID: PMC10950464 DOI: 10.1098/rsif.2023.0537] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/15/2023] [Accepted: 02/20/2024] [Indexed: 03/21/2024] Open
Abstract
The challenge of understanding differentiation and cell lineages in development has resulted in many bioinformatics software tools, notably those working with gene expression data obtained via single-cell RNA sequencing at snapshots in time. Reconstruction methods for trajectories often proceed by dimension reduction, data clustering and then computation of a tree graph in which edges indicate closely related clusters. Cell lineages can then be deduced by following paths through the tree. In the case of multi-potent cells undergoing differentiation, this trajectory reconstruction involves the reconstruction of multiple distinct lineages corresponding to commitment to each of a set of distinct fates. Recent work suggests that there may be cases in which the cell differentiation process involves trajectories that explore, in a dynamic and oscillatory fashion, propensity to differentiate into a number of possible cell fates before commitment finally occurs. Here, we show theoretically that the presence of such oscillations provides intrinsic constraints on the quality and resolution of the trajectory reconstruction process, even for idealized noise-free data. These constraints point to inherent common limitations of current methodologies; they both pose an additional challenge for the development of software tools and may help to explain features observed in recent experiments.
Affiliation(s)
- Huy K. Vo
  - Department of Mathematical Sciences, University of Bath, BA2 7AY Bath, UK
- Robert N. Kelsh
  - Department of Life Sciences, University of Bath, BA2 7AY Bath, UK
7. Sidiropoulos DN, Ho WJ, Jaffee EM, Kagohara LT, Fertig EJ. Systems immunology spanning tumors, lymph nodes, and periphery. CELL REPORTS METHODS 2023; 3:100670. [PMID: 38086385 PMCID: PMC10753389 DOI: 10.1016/j.crmeth.2023.100670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Revised: 10/20/2023] [Accepted: 11/17/2023] [Indexed: 12/21/2023]
Abstract
The immune system defines a complex network of tissues and cell types that orchestrate responses across the body in a dynamic manner. The local and systemic interactions between immune and cancer cells contribute to disease progression. Lymphocytes are activated in lymph nodes, traffic through the periphery, and impact cancer progression through their interactions with tumor cells. As a result, therapeutic response and resistance are mediated across tissues, and a comprehensive understanding of lymphocyte dynamics requires a systems-level approach. In this review, we highlight experimental and computational methods that can leverage the study of leukocyte trafficking through an immunomics lens and reveal how adaptive immunity shapes cancer.
Affiliation(s)
- Dimitrios N Sidiropoulos
  - Johns Hopkins University School of Medicine, Baltimore, MD, USA; Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
- Won Jin Ho
  - Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
- Elizabeth M Jaffee
  - Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
- Luciane T Kagohara
  - Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA.
- Elana J Fertig
  - Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
8. Kang JB, Raveane A, Nathan A, Soranzo N, Raychaudhuri S. Methods and Insights from Single-Cell Expression Quantitative Trait Loci. Annu Rev Genomics Hum Genet 2023; 24:277-303. [PMID: 37196361 PMCID: PMC10784788 DOI: 10.1146/annurev-genom-101422-100437] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Indexed: 05/19/2023]
Abstract
Recent advancements in single-cell technologies have enabled expression quantitative trait locus (eQTL) analysis across many individuals at single-cell resolution. Compared with bulk RNA sequencing, which averages gene expression across cell types and cell states, single-cell assays capture the transcriptional states of individual cells, including fine-grained, transient, and difficult-to-isolate populations at unprecedented scale and resolution. Single-cell eQTL (sc-eQTL) mapping can identify context-dependent eQTLs that vary with cell states, including some that colocalize with disease variants identified in genome-wide association studies. By uncovering the precise contexts in which these eQTLs act, single-cell approaches can unveil previously hidden regulatory effects and pinpoint important cell states underlying molecular mechanisms of disease. Here, we present an overview of recently deployed experimental designs in sc-eQTL studies. In the process, we consider the influence of study design choices such as cohort, cell states, and ex vivo perturbations. We then discuss current methodologies, modeling approaches, and technical challenges as well as future opportunities and applications.
Affiliation(s)
- Joyce B Kang
  - Center for Data Sciences and Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Aparna Nathan
  - Center for Data Sciences and Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Nicole Soranzo
  - Human Technopole, Milan, Italy
  - Department of Human Genetics, Wellcome Sanger Institute, Hinxton, United Kingdom
  - British Heart Foundation Centre of Research Excellence and Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Soumya Raychaudhuri
  - Center for Data Sciences and Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Centre for Genetics and Genomics Versus Arthritis, University of Manchester, Manchester, United Kingdom
9. Mao S, Liu J, Zhao W, Zhou X. LVPT: Lazy Velocity Pseudotime Inference Method. Biomolecules 2023; 13:1242. [PMID: 37627306 PMCID: PMC10452358 DOI: 10.3390/biom13081242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 08/09/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023] Open
Abstract
The emergence of RNA velocity has enriched our understanding of the dynamic transcriptional landscape within individual cells. In light of this breakthrough, we embarked on integrating RNA velocity with cellular pseudotime inference, aiming to improve the prediction of cell orders along biological trajectories beyond existing methods. Here, we developed LVPT, a novel method for pseudotime and trajectory inference. LVPT introduces a lazy probability, the probability that a cell stays in its original state, and calculates the transition matrix based on RNA velocity to provide the probability and direction of cell differentiation. LVPT shows better or comparable performance in pseudotime inference relative to existing methods on both simulated datasets with different structures and real datasets. The validation results were consistent with prior knowledge, indicating that LVPT is an accurate and efficient method for pseudotime inference.
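The "lazy probability" idea in this abstract can be illustrated with the textbook lazy random walk, in which a transition matrix is blended with the identity so that each cell keeps some probability of staying where it is. The sketch below is only that generic construction, not LVPT's published formula, which the abstract does not give. The function name `lazy_transition_matrix`, the parameter `stay_prob`, and the toy three-state matrix are all hypothetical, and the velocity-derived matrix is assumed to be row-stochastic already.

```python
def lazy_transition_matrix(transitions, stay_prob):
    """Blend a row-stochastic matrix with the identity (lazy random walk).

    transitions: square list of lists, each row summing to 1 (assumed to
                 come from some velocity-based estimate; hypothetical here).
    stay_prob:   probability in [0, 1] that a cell remains in its state.
    """
    if not 0.0 <= stay_prob <= 1.0:
        raise ValueError("stay_prob must lie in [0, 1]")
    n = len(transitions)
    lazy = []
    for i, row in enumerate(transitions):
        if len(row) != n:
            raise ValueError("transition matrix must be square")
        # Entry (i, j) = stay_prob * I[i][j] + (1 - stay_prob) * P[i][j]
        lazy.append([
            stay_prob * (1.0 if i == j else 0.0) + (1.0 - stay_prob) * p
            for j, p in enumerate(row)
        ])
    return lazy


# Toy chain of three states: 0 -> 1 -> 2, with state 2 absorbing.
toy_transitions = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
lazy = lazy_transition_matrix(toy_transitions, stay_prob=0.25)
```

Because each row of the input sums to 1, each row of the blended matrix also sums to 1, so it remains a valid transition matrix; a larger `stay_prob` slows movement along the chain without changing its direction.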
Affiliation(s)
- Shuainan Mao
  - The Department of Biotherapy and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
  - Med-X Center for Informatics, Sichuan University, Chengdu 610041, China
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Jiajia Liu
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Weiling Zhao
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA
10. Erbe R, Stein-O’Brien G, Fertig EJ. Transcriptomic forecasting with neural ordinary differential equations. PATTERNS (NEW YORK, N.Y.) 2023; 4:100793. [PMID: 37602211 PMCID: PMC10435954 DOI: 10.1016/j.patter.2023.100793] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/13/2022] [Revised: 04/03/2023] [Accepted: 06/13/2023] [Indexed: 08/22/2023]
Abstract
Single-cell transcriptomics technologies can uncover changes in the molecular states that underlie cellular phenotypes. However, understanding the dynamic cellular processes requires extending from inferring trajectories from snapshots of cellular states to estimating temporal changes in cellular gene expression. To address this challenge, we have developed a neural ordinary differential-equation-based method, RNAForecaster, for predicting gene expression states in single cells for multiple future time steps in an embedding-independent manner. We demonstrate that RNAForecaster can accurately predict future expression states in simulated single-cell transcriptomic data with cellular tracking over time. We then show that by using metabolic labeling single-cell RNA sequencing (scRNA-seq) data from constitutively dividing cells, RNAForecaster accurately recapitulates many of the expected changes in gene expression during progression through the cell cycle over a 3-day period. Thus, RNAForecaster enables short-term estimation of future expression states in biological systems from high-throughput datasets with temporal information.
Affiliation(s)
- Rossin Erbe
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johns Hopkins Convergence Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
- Genevieve Stein-O’Brien
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johns Hopkins Convergence Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neurodiscovery Institute, Baltimore, MD, USA
- Single Cell Training and Analysis Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Elana J. Fertig
- Johns Hopkins Convergence Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
- Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA
|
11
|
Xu X, Zhao Y, Ying Y, Zhu H, Luo J, Mou T, Zhang Z. m7G-related genes-NCBP2 and EIF4E3 determine immune contexture in head and neck squamous cell carcinoma by regulating CCL4/CCL5 expression. Mol Carcinog 2023; 62:1091-1106. [PMID: 37067401 DOI: 10.1002/mc.23548] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/04/2022] [Revised: 03/15/2023] [Accepted: 04/09/2023] [Indexed: 04/18/2023]
Abstract
Aberrant N7-methylguanosine (m7G) levels closely correlate with tumorigenesis and progression. NCBP2 and EIF4E3 are two important m7G-related cap-binding genes. This study aimed to identify the relationship between EIF4E3/NCBP2 function and the immunological characteristics of head and neck squamous cell carcinoma (HNSCC). Hierarchical clustering was used to classify HNSCC patients into two groups based on NCBP2 and EIF4E3 expression. Differentially expressed genes between the two groups were identified, and GO functional enrichment was subsequently performed. Weighted gene co-expression network analysis was conducted to identify the hub genes related to EIF4E3/NCBP2 expression and immunity. The differential infiltration of immune cells and the response to immunotherapy were compared between the two groups. Single-cell sequencing and trajectory analyses were performed to predict cell differentiation and display the expression of EIF4E3/NCBP2 in each state. In addition, quantitative real-time PCR, spatial transcriptome analysis, transwell assay, and western blotting were conducted to verify the biological function of EIF4E3/NCBP2. Group A, which had higher EIF4E3 expression and lower NCBP2 expression, showed higher immune scores, a higher proportion of most immune cells, greater immune activity, higher expression of immunomodulatory targets, and a better response to cancer immunotherapy. In addition, 56 hub molecules with notable immune regulation significance were identified. A risk model containing 17 hub genes and a prognostic nomogram were successfully established. Moreover, HNSCC tissues had lower EIF4E3 expression and higher NCBP2 expression than normal tissues. NCBP2 and EIF4E3 played a vital role in the differentiation of monocytes. Furthermore, CCL4/CCL5 expression can be regulated via EIF4E3 overexpression and NCBP2 knockdown. Collectively, NCBP2 and EIF4E3 can affect downstream gene expression, as well as immune contexture and response to immunotherapy, which could induce "cold-to-hot" tumor transformation in HNSCC patients.
Affiliation(s)
- Xuhui Xu
- Department of Stomatology, Taizhou Central Hospital (Taizhou University Hospital), Taizhou, Zhejiang, China
- Yue Zhao
- Department of Stomatology, Taizhou Central Hospital (Taizhou University Hospital), Taizhou, Zhejiang, China
- Yukang Ying
- Department of Stomatology, Taizhou Central Hospital (Taizhou University Hospital), Taizhou, Zhejiang, China
- Haoran Zhu
- Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Jun Luo
- Department of Stomatology, Taizhou Central Hospital (Taizhou University Hospital), Taizhou, Zhejiang, China
- Tingchen Mou
- Department of Stomatology, Taizhou Central Hospital (Taizhou University Hospital), Taizhou, Zhejiang, China
- Zhenxing Zhang
- Department of Stomatology, Taizhou Central Hospital (Taizhou University Hospital), Taizhou, Zhejiang, China
|
12
|
Yang Y, Li G, Zhong Y, Xu Q, Chen BJ, Lin YT, Chapkin R, Cai JJ. Gene knockout inference with variational graph autoencoder learning single-cell gene regulatory networks. Nucleic Acids Res 2023; 51:6578-6592. [PMID: 37246643 PMCID: PMC10359630 DOI: 10.1093/nar/gkad450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 05/02/2023] [Accepted: 05/11/2023] [Indexed: 05/30/2023] [Open Access]
Abstract
In this paper, we introduce Gene Knockout Inference (GenKI), a virtual knockout (KO) tool for gene function prediction using single-cell RNA sequencing (scRNA-seq) data when only wild-type (WT) samples are available and no KO samples exist. Without using any information from real KO samples, GenKI is designed to capture shifting patterns in gene regulation caused by the KO perturbation in an unsupervised manner and to provide a robust and scalable framework for gene function studies. To achieve this goal, GenKI adapts a variational graph autoencoder (VGAE) model to learn latent representations of genes and interactions between genes from the input WT scRNA-seq data and a derived single-cell gene regulatory network (scGRN). The virtual KO data are then generated by computationally removing all edges of the KO gene (the gene to be knocked out for functional study) from the scGRN. The differences between WT and virtual KO data are discerned using their corresponding latent parameters derived from the trained VGAE model. Our simulations show that GenKI accurately approximates the perturbation profiles upon gene KO and outperforms the state-of-the-art under a series of evaluation conditions. Using publicly available scRNA-seq data sets, we demonstrate that GenKI recapitulates discoveries of real-animal KO experiments and accurately predicts cell type-specific functions of KO genes. Thus, GenKI provides an in-silico alternative to KO experiments that may partially replace the need for genetically modified animals or other genetically perturbed systems.
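The edge-removal step mentioned in the abstract above can be sketched as follows. This is only a minimal illustration of deleting a gene's edges from a regulatory network, assuming the network is stored as a dense weighted adjacency matrix; it does not reproduce GenKI's VGAE training or its comparison of latent parameters. The function name `virtual_knockout`, the gene labels, and the edge weights are all hypothetical.

```python
def virtual_knockout(adjacency, genes, ko_gene):
    """Return a copy of the adjacency matrix with every edge of ko_gene removed.

    adjacency[i][j] is the (hypothetical) weight of the edge from
    genes[i] to genes[j]; the input matrix is left unchanged.
    """
    k = genes.index(ko_gene)
    knocked_out = [row[:] for row in adjacency]
    for i in range(len(genes)):
        knocked_out[k][i] = 0.0  # remove outgoing edges of the KO gene
        knocked_out[i][k] = 0.0  # remove incoming edges of the KO gene
    return knocked_out


# Toy wild-type network over three genes (weights are made up).
genes = ["G1", "G2", "G3"]
wt_grn = [
    [0.0, 0.8, 0.2],
    [0.3, 0.0, 0.5],
    [0.0, 0.4, 0.0],
]
ko_grn = virtual_knockout(wt_grn, genes, "G2")
```

In this toy case only the G1 to G3 edge survives the knockout of G2, while the wild-type matrix is kept intact so the two networks can be compared afterwards.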
Affiliation(s)
- Yongjian Yang
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
- Guanxun Li
- Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Yan Zhong
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, 3663 North Zhongshan Road, Shanghai 200062, China
- Qian Xu
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Bo-Jia Chen
- Graduate Institute of Microbiology and Public Health, College of Veterinary Medicine, National Chung Hsing University, Taichung 402, Taiwan
- Yu-Te Lin
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
- Robert S Chapkin
- Program in Integrative & Complex Diseases, Department of Nutrition, Texas A&M University, College Station, TX 77843, USA
- James J Cai
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Interdisciplinary Program of Genetics, Texas A&M University, College Station, TX 77843, USA
|
13
|
Kumasaka N, Rostom R, Huang N, Polanski K, Meyer KB, Patel S, Boyd R, Gomez C, Barnett SN, Panousis NI, Schwartzentruber J, Ghoussaini M, Lyons PA, Calero-Nieto FJ, Göttgens B, Barnes JL, Worlock KB, Yoshida M, Nikolić MZ, Stephenson E, Reynolds G, Haniffa M, Marioni JC, Stegle O, Hagai T, Teichmann SA. Mapping interindividual dynamics of innate immune response at single-cell resolution. Nat Genet 2023; 55:1066-1075. [PMID: 37308670 PMCID: PMC10260404 DOI: 10.1038/s41588-023-01421-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2021] [Accepted: 04/27/2023] [Indexed: 06/14/2023]
Abstract
Common genetic variants across individuals modulate the cellular response to pathogens and are implicated in diverse immune pathologies, yet how they dynamically alter the response upon infection is not well understood. Here, we triggered antiviral responses in human fibroblasts from 68 healthy donors, and profiled tens of thousands of cells using single-cell RNA-sequencing. We developed GASPACHO (GAuSsian Processes for Association mapping leveraging Cell HeterOgeneity), a statistical approach designed to identify nonlinear dynamic genetic effects across transcriptional trajectories of cells. This approach identified 1,275 expression quantitative trait loci (local false discovery rate 10%) that manifested during the responses, many of which were colocalized with susceptibility loci identified by genome-wide association studies of infectious and autoimmune diseases, including the OAS1 splicing quantitative trait locus in a COVID-19 susceptibility locus. In summary, our analytical approach provides a unique framework for delineation of the genetic variants that shape a wide spectrum of transcriptional responses at single-cell resolution.
Affiliation(s)
- Natsuhiko Kumasaka
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Medical Support Center of Japan Environment and Children's Study (JECS), National Center for Child Health and Development, Tokyo, Japan
- Raghd Rostom
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Ni Huang
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Kerstin B Meyer
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Sharad Patel
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Rachel Boyd
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Celine Gomez
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Sam N Barnett
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Jeremy Schwartzentruber
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Open Targets, Wellcome Genome Campus, Hinxton, UK
- Maya Ghoussaini
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Open Targets, Wellcome Genome Campus, Hinxton, UK
- Paul A Lyons
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, UK
- Department of Medicine, University of Cambridge, Cambridge, UK
- Berthold Göttgens
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Josephine L Barnes
- UCL Respiratory, Division of Medicine, University College London, London, UK
- Kaylee B Worlock
- UCL Respiratory, Division of Medicine, University College London, London, UK
- Masahiro Yoshida
- UCL Respiratory, Division of Medicine, University College London, London, UK
- Marko Z Nikolić
- UCL Respiratory, Division of Medicine, University College London, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- Emily Stephenson
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Gary Reynolds
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Muzlifah Haniffa
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- Department of Dermatology, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- John C Marioni
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Oliver Stegle
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Tzachi Hagai
- Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Sarah A Teichmann
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Theory of Condensed Matter Group, Cavendish Laboratory/Department of Physics, University of Cambridge, Cambridge, UK
|
14
|
Zhang Y, Sun H, Lian X, Tang J, Zhu F. ANPELA: Significantly Enhanced Quantification Tool for Cytometry-Based Single-Cell Proteomics. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2023; 10:e2207061. [PMID: 36950745 DOI: 10.1002/advs.202207061] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/30/2022] [Revised: 02/13/2023] [Indexed: 05/27/2023]
Abstract
ANPELA is widely used for quantifying traditional bulk proteomic data. Recently, there has been a clear shift from bulk proteomics to single-cell proteomics (SCP), in which powerful cytometry techniques capture cellular heterogeneity that is completely overlooked by traditional bulk profiling. However, in-depth, high-quality quantification of SCP data remains challenging: a large number of quantification workflows exist, and their performance depends heavily on the dataset studied. In other words, selecting well-performing workflow(s) for a given dataset is difficult, and a significantly enhanced and accelerated tool is urgently needed to address this issue. No such tool has yet been developed. Herein, ANPELA is therefore updated to version 2.0 (https://idrblab.org/anpela/), which is unique in providing the most comprehensive set of quantification alternatives (>1000 workflows) among all existing tools, enabling systematic performance evaluation from multiple perspectives based on machine learning, and identifying the optimal workflow(s) using overall performance ranking together with parallel computation. Extensive validation on different benchmark datasets and representative application scenarios suggests the great application potential of ANPELA in current SCP research for gaining more accurate and reliable biological insights.
Affiliation(s)
- Ying Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Huaicheng Sun
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Xichen Lian
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Jing Tang
- Department of Bioinformatics, Chongqing Medical University, Chongqing, 400016, China
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
|
15
|
Hicks EM, Seah C, Cote A, Marchese S, Brennand KJ, Nestler EJ, Girgenti MJ, Huckins LM. Integrating genetics and transcriptomics to study major depressive disorder: a conceptual framework, bioinformatic approaches, and recent findings. Transl Psychiatry 2023; 13:129. [PMID: 37076454 PMCID: PMC10115809 DOI: 10.1038/s41398-023-02412-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/09/2022] [Revised: 03/17/2023] [Accepted: 03/24/2023] [Indexed: 04/21/2023] [Open Access]
Abstract
Major depressive disorder (MDD) is a complex and heterogeneous psychiatric syndrome with genetic and environmental influences. In addition to neuroanatomical and circuit-level disturbances, dysregulation of the brain transcriptome is a key phenotypic signature of MDD. Postmortem brain gene expression data are uniquely valuable resources for identifying this signature and key genomic drivers in human depression; however, the scarcity of brain tissue limits our capacity to observe the dynamic transcriptional landscape of MDD. It is therefore crucial to explore and integrate depression and stress transcriptomic data from numerous, complementary perspectives to construct a richer understanding of the pathophysiology of depression. In this review, we discuss multiple approaches for exploring the brain transcriptome reflecting dynamic stages of MDD: predisposition, onset, and illness. We next highlight bioinformatic approaches for hypothesis-free, genome-wide analyses of genomic and transcriptomic data and their integration. Last, we summarize the findings of recent genetic and transcriptomic studies within this conceptual framework.
Affiliation(s)
- Emily M Hicks
- Pamela Sklar Division of Psychiatric Genomics, Departments of Psychiatry and of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Carina Seah
- Pamela Sklar Division of Psychiatric Genomics, Departments of Psychiatry and of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Alanna Cote
- Pamela Sklar Division of Psychiatric Genomics, Departments of Psychiatry and of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Shelby Marchese
- Pamela Sklar Division of Psychiatric Genomics, Departments of Psychiatry and of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Kristen J Brennand
- Pamela Sklar Division of Psychiatric Genomics, Departments of Psychiatry and of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Department of Genetics, Yale University School of Medicine, New Haven, CT, 06511, USA
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, 06511, USA
- Eric J Nestler
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Matthew J Girgenti
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, 06511, USA
- Laura M Huckins
- Pamela Sklar Division of Psychiatric Genomics, Departments of Psychiatry and of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, 06511, USA
|
16
|
Cifuentes-Bernal AM, Pham VVH, Li X, Liu L, Li J, Le TD. Dynamic cancer drivers: a causal approach for cancer driver discovery based on bio-pathological trajectories. Brief Funct Genomics 2022; 21:455-465. [PMID: 36124841 PMCID: PMC10467634 DOI: 10.1093/bfgp/elac030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 08/08/2022] [Accepted: 08/23/2022] [Indexed: 12/14/2022] [Open Access]
Abstract
The traditional way of discovering genes that drive cancer (namely, cancer drivers) neglects the dynamic information of cancer development, even though it is well known that cancer progresses dynamically. To enhance cancer driver discovery, we extend the cancer driver concept to the dynamic cancer driver: a gene driving one or more bio-pathological transitions during cancer progression. Our method builds on the fact that cancer should not be considered a single process but a compendium of altered biological processes causing the disease to develop over time. Reciprocally, different drivers of cancer can potentially be discovered by analysing different bio-pathological pathways. We propose a novel approach for causal inference of genes driving one or more core processes during cancer development (i.e. dynamic cancer drivers). We use the concept of pseudotime to infer the latent progression of samples along a biological transition during cancer and to identify a critical event at which such a process deviates significantly from normal to carcinogenic. We infer driver genes by assessing the causal effect they have on the process after such a critical event. We have applied our method to single-cell and bulk sequencing datasets of breast cancer. The evaluation results show that our method outperforms well-recognized cancer driver inference methods. These results suggest that including information on the underlying dynamics of cancer improves the inference process (in comparison with using static data) and allows us to discover different sets of driver genes from different processes in cancer. R scripts and datasets can be found at https://github.com/AndresMCB/DynamicCancerDriver.
Affiliation(s)
- Andres M Cifuentes-Bernal
- UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Vu V H Pham
- UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Xiaomei Li
- UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Lin Liu
- UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Jiuyong Li
- UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Thuc Duy Le
- UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
|
17
|
Saul D, Kosinsky RL, Atkinson EJ, Doolittle ML, Zhang X, LeBrasseur NK, Pignolo RJ, Robbins PD, Niedernhofer LJ, Ikeno Y, Jurk D, Passos JF, Hickson LJ, Xue A, Monroe DG, Tchkonia T, Kirkland JL, Farr JN, Khosla S. A new gene set identifies senescent cells and predicts senescence-associated pathways across tissues. Nat Commun 2022; 13:4827. [PMID: 35974106 PMCID: PMC9381717 DOI: 10.1038/s41467-022-32552-1] [Citation(s) in RCA: 395] [Impact Index Per Article: 131.7] [Received: 11/23/2021] [Accepted: 08/05/2022] [Indexed: 02/01/2023] [Open Access]
Abstract
Although cellular senescence drives multiple age-related co-morbidities through the senescence-associated secretory phenotype, in vivo senescent cell identification remains challenging. Here, we generate a gene set (SenMayo) and validate its enrichment in bone biopsies from two aged human cohorts. We further demonstrate reductions in SenMayo in bone following genetic clearance of senescent cells in mice and in adipose tissue from humans following pharmacological senescent cell clearance. We next use SenMayo to identify senescent hematopoietic or mesenchymal cells at the single cell level from human and murine bone marrow/bone scRNA-seq data. Thus, SenMayo identifies senescent cells across tissues and species with high fidelity. Using this senescence panel, we are able to characterize senescent cells at the single cell level and identify key intercellular signaling pathways. SenMayo also represents a potentially clinically applicable panel for monitoring senescent cell burden with aging and other conditions as well as in studies of senolytic drugs.
Affiliation(s)
- Dominik Saul
- Division of Endocrinology, Mayo Clinic, Rochester, MN, 55905, USA
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Trauma, Orthopedics and Reconstructive Surgery, Georg-August-University of Goettingen, Goettingen, Germany
- Robyn Laura Kosinsky
- Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, 55905, USA
- Madison L Doolittle
- Division of Endocrinology, Mayo Clinic, Rochester, MN, 55905, USA
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Xu Zhang
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Nathan K LeBrasseur
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Robert J Pignolo
- Division of Endocrinology, Mayo Clinic, Rochester, MN, 55905, USA
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Paul D Robbins
- Institute on the Biology of Aging and Metabolism, Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Laura J Niedernhofer
- Institute on the Biology of Aging and Metabolism, Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Yuji Ikeno
- Department of Pathology, University of Texas Health, San Antonio, TX, USA
- Diana Jurk
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- João F Passos
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- LaTonya J Hickson
- Division of Nephrology and Hypertension, Mayo Clinic, Jacksonville, FL, USA
- Ailing Xue
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- David G Monroe
- Division of Endocrinology, Mayo Clinic, Rochester, MN, 55905, USA
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Tamara Tchkonia
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- James L Kirkland
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Joshua N Farr
- Division of Endocrinology, Mayo Clinic, Rochester, MN, 55905, USA
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Sundeep Khosla
- Division of Endocrinology, Mayo Clinic, Rochester, MN, 55905, USA
- Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
|
18
|
Lombardo SD, Wangsaputra IF, Menche J, Stevens A. Network Approaches for Charting the Transcriptomic and Epigenetic Landscape of the Developmental Origins of Health and Disease. Genes (Basel) 2022; 13:764. [PMID: 35627149 PMCID: PMC9141211 DOI: 10.3390/genes13050764] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/04/2022] [Revised: 04/04/2022] [Accepted: 04/13/2022] [Indexed: 02/04/2023] [Open Access]
Abstract
The early developmental phase is of critical importance for human health and disease later in life. To decipher the molecular mechanisms at play, current biomedical research is increasingly relying on large quantities of diverse omics data. The integration and interpretation of the different datasets pose a critical challenge towards the holistic understanding of the complex biological processes that are involved in early development. In this review, we outline the major transcriptomic and epigenetic processes and the respective datasets that are most relevant for studying the periconceptional period. We cover both basic data processing and analysis steps, as well as more advanced data integration methods. A particular focus is given to network-based methods. Finally, we review the medical applications of such integrative analyses.
Affiliation(s)
- Salvo Danilo Lombardo
- Max Perutz Labs, Department of Structural and Computational Biology, University of Vienna, 1030 Vienna, Austria
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1030 Vienna, Austria
- Ivan Fernando Wangsaputra
- Maternal and Fetal Health Research Group, Division of Developmental Biology and Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9WL, UK
- Jörg Menche
- Max Perutz Labs, Department of Structural and Computational Biology, University of Vienna, 1030 Vienna, Austria
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1030 Vienna, Austria
- Faculty of Mathematics, University of Vienna, 1030 Vienna, Austria
- Adam Stevens
- Maternal and Fetal Health Research Group, Division of Developmental Biology and Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9WL, UK
|
19
|
Pseudotime Analysis Reveals Exponential Trends in DNA Methylation Aging with Mortality Associated Timescales. Cells 2022; 11:767. [PMID: 35269389 PMCID: PMC8909670 DOI: 10.3390/cells11050767] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/28/2021] [Revised: 02/08/2022] [Accepted: 02/14/2022] [Indexed: 01/27/2023] [Open Access]
Abstract
The epigenetic trajectory of DNA methylation profiles has a nonlinear relationship with time, reflecting rapid changes in DNA methylation early in life that progressively slow with age. In this study, we use pseudotime analysis to determine the functional form of these trajectories. Unlike epigenetic clocks that constrain the functional form of methylation changes with time, pseudotime analysis orders samples along a path, based on similarities in a latent dimension, to provide an unbiased trajectory. We show that pseudotime analysis can be applied to DNA methylation in human blood and brain tissue and find that it is highly correlated with the epigenetic states described by the Epigenetic Pacemaker. Moreover, we show that the pseudotime trajectory can be modeled with respect to time, using a sum of two exponentials, with coefficients that are close to the timescales of human age-associated mortality. Thus, for the first time, we can identify age-associated molecular changes that appear to track the exponential dynamics of mortality risk.
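As a rough illustration of what "a sum of two exponentials" means for a trajectory that changes quickly early in life and slowly later, the sketch below evaluates such a curve at a few ages. The functional form with a constant offset, the function name `two_exponential`, and all parameter values are assumptions made for illustration; they are not taken from the paper.

```python
import math

def two_exponential(t, a_fast, tau_fast, a_slow, tau_slow, offset=0.0):
    """Sum of two decaying exponentials plus a constant offset (assumed form)."""
    return (offset
            + a_fast * math.exp(-t / tau_fast)
            + a_slow * math.exp(-t / tau_slow))

# Hypothetical parameters, chosen only for illustration (ages in years).
params = dict(a_fast=-0.6, tau_fast=4.0, a_slow=-0.4, tau_slow=40.0, offset=1.0)

ages = [0, 1, 5, 10, 20, 40, 80]
values = [two_exponential(age, **params) for age in ages]

# Average rate of change between consecutive ages: large early, small late.
points = list(zip(ages, values))
rates = [(v1 - v0) / (t1 - t0)
         for (t0, v0), (t1, v1) in zip(points, points[1:])]

for age, value in points:
    print(f"age {age:>2}: state {value:.3f}")
```

With these illustrative values the fast component (timescale 4 years) dominates the early change and the slow component (timescale 40 years) dominates later, so the average rate of change shrinks steadily with age.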
|
20
|
Lin X, Chi D, Meng Q, Gong Q, Tong Z. Single-Cell Sequencing Unveils the Heterogeneity of Nonimmune Cells in Chronic Apical Periodontitis. Front Cell Dev Biol 2022; 9:820274. [PMID: 35237614 PMCID: PMC8883837 DOI: 10.3389/fcell.2021.820274] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/22/2021] [Accepted: 12/24/2021] [Indexed: 12/12/2022]
Abstract
Chronic apical periodontitis (CAP) is a unique dynamic interaction between microbial invasions and host defense mechanisms, resulting in infiltration of immune cells, bone absorption, and periapical granuloma formation. To help understand periapical tissue pathophysiology, we constructed a single-cell atlas of 26,737 high-quality cells from inflammatory periapical tissue and uncovered the complex cellular landscape. Eight cell types, including nonimmune cells and immune cells, were identified in the periapical tissue of CAP. Considering the key roles of nonimmune cells in CAP, we focused on osteo-like cells, basal/stromal cells, endothelial cells, and epithelial cells, and discovered their diversity and heterogeneity. The temporal profiling of genomic alterations from common CAP to typical periapical granuloma provided predictions for transcription factors and biological processes. Our study presented potential clues that the shift of inflammatory cytokines, chemokines, proteases, and growth factors initiated polymorphic cell differentiation, lymphangiogenesis, and angiogenesis during CAP.
Affiliation(s)
- Xinwei Lin
  - Department of Operative Dentistry and Endodontics, Hospital of Stomatology, Sun Yat-sen University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Danlu Chi
  - Department of Operative Dentistry and Endodontics, Hospital of Stomatology, Sun Yat-sen University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Qingzhen Meng
  - Department of Operative Dentistry and Endodontics, Hospital of Stomatology, Sun Yat-sen University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Qimei Gong
  - Department of Operative Dentistry and Endodontics, Hospital of Stomatology, Sun Yat-sen University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
  - *Correspondence: Qimei Gong; Zhongchun Tong
- Zhongchun Tong
  - Department of Operative Dentistry and Endodontics, Hospital of Stomatology, Sun Yat-sen University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
  - *Correspondence: Qimei Gong; Zhongchun Tong
|
21
|
Redhu N, Thakur Z. Network biology and applications. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00024-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
22
|
Cardona-Alberich A, Tourbez M, Pearce SF, Sibley CR. Elucidating the cellular dynamics of the brain with single-cell RNA sequencing. RNA Biol 2021; 18:1063-1084. [PMID: 33499699 PMCID: PMC8216183 DOI: 10.1080/15476286.2020.1870362] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/09/2020] [Revised: 12/17/2020] [Accepted: 12/24/2020] [Indexed: 12/18/2022]
Abstract
Single-cell RNA-sequencing (scRNA-seq) has emerged in recent years as a breakthrough technology to understand RNA metabolism at cellular resolution. In addition to allowing new cell types and states to be identified, scRNA-seq can permit cell-type-specific differential gene expression changes, pre-mRNA processing events, gene regulatory networks and single-cell developmental trajectories to be uncovered. More recently, a new wave of multi-omic adaptations and complementary spatial transcriptomics workflows has been developed that facilitates the collection of even more holistic information from individual cells. These developments have unprecedented potential to provide penetrating new insights into the basic neural cell dynamics and molecular mechanisms relevant to the nervous system in both health and disease. In this review we discuss this maturation of single-cell RNA-sequencing over the past decade, and review the different adaptations of the technology that can now be applied both at different scales and for different purposes. We conclude by highlighting how these methods have already led to many exciting discoveries across neuroscience that have furthered our cellular understanding of neurological disease.
Affiliation(s)
- Aida Cardona-Alberich
  - Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, Edinburgh University, Edinburgh, UK
- Manon Tourbez
  - Simons Initiative for the Developing Brain, University of Edinburgh, Edinburgh, UK
- Sarah F. Pearce
  - Simons Initiative for the Developing Brain, University of Edinburgh, Edinburgh, UK
- Christopher R. Sibley
  - Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, Edinburgh University, Edinburgh, UK
  - Simons Initiative for the Developing Brain, University of Edinburgh, Edinburgh, UK
  - Centre for Discovery Brain Sciences, University of Edinburgh, Edinburgh, UK
  - Euan MacDonald Centre for MND Research, University of Edinburgh, Edinburgh, UK
|
23
|
Fang S, Kirk PDW, Bantscheff M, Lilley KS, Crook OM. A Bayesian semi-parametric model for thermal proteome profiling. Commun Biol 2021; 4:810. [PMID: 34188175 PMCID: PMC8241860 DOI: 10.1038/s42003-021-02306-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/07/2020] [Accepted: 06/07/2021] [Indexed: 02/06/2023]
Abstract
The thermal stability of proteins can be altered when they interact with small molecules or other biomolecules, or are subject to post-translational modifications. Thus, monitoring the thermal stability of proteins under various cellular perturbations can provide insights into protein function, as well as potentially determine drug targets and off-targets. Thermal proteome profiling is a highly multiplexed mass-spectrometry method for monitoring the melting behaviour of thousands of proteins in a single experiment. In essence, thermal proteome profiling assumes that proteins denature upon heating and hence become insoluble. Thus, by tracking the relative solubility of proteins at sequentially increasing temperatures, one can report on the thermal stability of a protein. Standard thermodynamics predicts a sigmoidal relationship between temperature and relative solubility and this is the basis of current robust statistical procedures. However, current methods do not model deviations from this behaviour and they do not quantify uncertainty in the melting profiles. To overcome these challenges, we propose the application of Bayesian functional data analysis tools which allow complex temperature-solubility behaviours. Our methods have improved sensitivity over the state-of-the-art, identify new drug-protein associations and have less restrictive assumptions than current approaches. Our methods allow for comprehensive analysis of proteins that deviate from the predicted sigmoid behaviour and we uncover potentially biphasic phenomena with a series of published datasets.
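The sigmoidal temperature-solubility relationship mentioned above can be sketched with a generic logistic curve. This is only an illustration: the parameterisation (melting temperature `t_m`, a `slope` width, and a lower `plateau`), the function name, and the numbers are assumptions, and this is neither the parametric form used by existing thermal proteome profiling tools nor the Bayesian model proposed in the paper.

```python
import math

def melting_curve(temp_c, t_m, slope, plateau=0.0):
    """Generic sigmoid: relative solubility falls from 1 towards `plateau`
    as temperature rises, crossing the midpoint at temp_c == t_m."""
    return plateau + (1.0 - plateau) / (1.0 + math.exp((temp_c - t_m) / slope))

# Hypothetical protein with a melting temperature of 52 degrees Celsius.
temperatures = [37, 41, 45, 49, 52, 55, 59, 63, 67]
solubility = [melting_curve(t, t_m=52.0, slope=2.5) for t in temperatures]

for t, s in zip(temperatures, solubility):
    print(f"{t} C: relative solubility {s:.3f}")
```

A protein whose measured profile has two distinct drops (a biphasic profile) cannot be matched by a single curve of this shape, which is the kind of deviation the abstract says current sigmoid-based methods do not model.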
Affiliation(s)
- Siqi Fang
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
  - Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
- Paul D W Kirk
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
  - Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge, UK
- Kathryn S Lilley
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
  - Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
- Oliver M Crook
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
  - Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
|
24
|
Wei J, Zhou T, Zhang X, Tian T. DTFLOW: Inference and Visualization of Single-cell Pseudotime Trajectory Using Diffusion Propagation. Genomics, Proteomics & Bioinformatics 2021; 19:306-318. [PMID: 33662626 PMCID: PMC8602766 DOI: 10.1016/j.gpb.2020.08.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/13/2019] [Revised: 05/26/2020] [Accepted: 10/29/2020] [Indexed: 12/13/2022]
Abstract
One of the major challenges in single-cell data analysis is the determination of cellular developmental trajectories using single-cell data. Although substantial studies have been conducted in recent years, more effective methods are still strongly needed to infer the developmental processes accurately. This work devises a new method, named DTFLOW, for determining the pseudo-temporal trajectories with multiple branches. DTFLOW consists of two major steps: a new method called Bhattacharyya kernel feature decomposition (BKFD) to reduce the data dimensions, and a novel approach named Reverse Searching on k-nearest neighbor graph (RSKG) to identify the multi-branching processes of cellular differentiation. In BKFD, we first establish a stationary distribution for each cell to represent the transition of cellular developmental states based on the random walk with restart algorithm, and then propose a new distance metric for calculating pseudotime of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined by using four single-cell datasets. We compare the efficiency of DTFLOW with the published state-of-the-art methods. Simulation results suggest that DTFLOW has superior accuracy and strong robustness properties for constructing pseudotime trajectories. The Python source code of DTFLOW can be freely accessed at https://github.com/statway/DTFLOW.
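The two ingredients named in the abstract, a random walk with restart that gives each cell a stationary distribution over the graph and a Bhattacharyya-type similarity between those distributions, can be sketched on a toy graph. This is a minimal pure-Python illustration under assumptions: the four-node path graph, the restart probability of 0.5, and the helper names are hypothetical, and the code is not the DTFLOW implementation (see the repository linked in the abstract for that).

```python
import math

def row_normalize(adj):
    """Turn an adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total for v in row] if total else row[:])
    return out

def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-12, max_iter=10000):
    """Stationary distribution of a walk that returns to `seed` with
    probability `restart` at every step (power iteration)."""
    trans = row_normalize(adj)
    n = len(adj)
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    p = start[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(p[i] * trans[i][j] for i in range(n))
               + restart * start[j]
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

def bhattacharyya_coefficient(p, q):
    """Overlap between two discrete distributions; 1.0 means identical."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

# Toy "cells" arranged along a line: 0 - 1 - 2 - 3.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
dists = [random_walk_with_restart(path_graph, seed) for seed in range(4)]
for other in range(1, 4):
    overlap = bhattacharyya_coefficient(dists[0], dists[other])
    print(f"overlap(cell 0, cell {other}) = {overlap:.3f}")
```

On this toy graph the overlap with cell 0 decreases as cells lie further along the path, which is the intuition behind turning such a kernel into a pseudotime-like ordering.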
Affiliation(s)
- Jiangyong Wei
  - College of Science, Huazhong Agricultural University, Wuhan 430070, China; School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073, China
- Tianshou Zhou
  - School of Mathematics and Statistics, Sun Yat-sen University, Guangzhou 510275, China
- Xinan Zhang
  - School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
- Tianhai Tian
  - School of Mathematics, Monash University, Melbourne, VIC 3800, Australia
|
25
|
Trajectory modeling of endothelial-to-mesenchymal transition reveals galectin-3 as a mediator in pulmonary fibrosis. Cell Death Dis 2021; 12:327. [PMID: 33771973 PMCID: PMC7998015 DOI: 10.1038/s41419-021-03603-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 01/03/2021] [Revised: 03/08/2021] [Accepted: 03/09/2021] [Indexed: 12/19/2022]
Abstract
The endothelial-to-mesenchymal transition (EndMT) is an important source of fibrotic cells in idiopathic pulmonary fibrosis (IPF). However, how endothelial cells (ECs) are activated and how EndMT impacts IPF remain largely elusive. Here, we use unsupervised pseudotemporal analysis to recognize the heterogeneity of ECs and reconstruct the EndMT trajectory of bleomycin (BLM)-treated Tie2creER/+;Rosa26tdTomato/+ IPF mice. Genes like C3ar1 and Lgals3 (protein name galectin-3) are highly correlated with the transitional pseudotime, and their expression is gradually upregulated during the fate switch of ECs from quiescence to activation in fibrosis. Inhibition of galectin-3 via siRNA or protein antagonists in mice could alleviate the pathogenesis of IPF and the transition of ECs. With the stimulation of human pulmonary microvascular endothelial cells (HPMECs) by recombinant proteins and/or siRNAs for galectin-3 in vitro, β-catenin/GSK3β signaling and its upstream regulator AKT are perturbed, which indicates that they mediate the EndMT process. These results suggest that EndMT is essential to the IPF process and provide potential therapeutic targets for vascular remodeling.
|
26
|
Kopf A, Claassen M. Latent representation learning in biology and translational medicine. Patterns (New York, N.Y.) 2021; 2:100198. [PMID: 33748792 PMCID: PMC7961186 DOI: 10.1016/j.patter.2021.100198] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 02/08/2023]
Abstract
Current data generation capabilities in the life sciences place scientists in an apparently contradictory situation. While it is possible to simultaneously measure an ever-increasing number of systems parameters, the resulting data are becoming increasingly difficult to interpret. Latent variable modeling allows for such interpretation by learning non-measurable hidden variables from observations. This review gives an overview of the different formal approaches to latent variable modeling, as well as applications at different scales of biological systems, from molecular structures and intra- and intercellular regulatory networks up to physiological networks. The focus is on demonstrating how these approaches have enabled interpretable representations and ultimately insights in each of these domains. We anticipate that a wider dissemination of latent variable modeling in the life sciences will enable a more effective and productive interpretation of studies based on heterogeneous and high-dimensional data modalities.
Affiliation(s)
- Andreas Kopf
  - Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
- Manfred Claassen
  - Division of Clinical Bioinformatics, Department of Internal Medicine I, University Hospital Tübingen, 72076 Tübingen, Germany
  - Computer Science Department, Eberhard Karls University of Tübingen, 72076 Tübingen, Germany
  - Cluster of Excellence Machine Learning (EXC 2064), Eberhard Karls University of Tübingen, 72076 Tübingen, Germany
|
27
|
Liu W, Sun X, Peng L, Zhou L, Lin H, Jiang Y. RWRNET: A Gene Regulatory Network Inference Algorithm Using Random Walk With Restart. Front Genet 2020; 11:591461. [PMID: 33101398 PMCID: PMC7545090 DOI: 10.3389/fgene.2020.591461] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/04/2020] [Accepted: 09/02/2020] [Indexed: 11/30/2022]
Abstract
Inferring gene regulatory networks from expression data is essential in identifying complex regulatory relationships among genes and revealing the mechanism of certain diseases. Various computational methods have been developed for inferring gene regulatory networks. However, these methods focus on the local topology of the network rather than on the global topology. From a network optimisation standpoint, emphasising the global topology of the network also reduces redundant regulatory relationships. In this study, we propose a novel network inference algorithm using Random Walk with Restart (RWRNET) that combines local and global topology relationships. The method first captures the local topology through three elements of random walk and then combines the local topology with the global topology by Random Walk with Restart. The Markov Blanket discovery algorithm is then used to deal with isolated genes. The proposed method is compared with several state-of-the-art methods on the basis of six benchmark datasets. Experimental results demonstrated the effectiveness of the proposed method.
Affiliation(s)
- Wei Liu
  - School of Computer Science, Xiangtan University, Xiangtan, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, China
- Xingen Sun
  - School of Computer Science, Xiangtan University, Xiangtan, China
- Li Peng
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
- Lili Zhou
  - School of Computer Science, Xiangtan University, Xiangtan, China
- Hui Lin
  - School of Computer Science, Xiangtan University, Xiangtan, China
- Yi Jiang
  - School of Computer Science, Xiangtan University, Xiangtan, China
|
28
|
Schmidt M, Loeffler-Wirth H, Binder H. Developmental scRNAseq Trajectories in Gene- and Cell-State Space-The Flatworm Example. Genes (Basel) 2020; 11:E1214. [PMID: 33081343 PMCID: PMC7603055 DOI: 10.3390/genes11101214] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/03/2020] [Revised: 10/13/2020] [Accepted: 10/14/2020] [Indexed: 12/19/2022]
Abstract
Single-cell RNA sequencing has become a standard technique to characterize tissue development. In this approach, cross-sectional snapshots of the diversity of cell transcriptomes are transformed into (pseudo-)longitudinal trajectories of cell differentiation using computational methods, which are based on similarity measures distinguishing cell phenotypes. Cell development is driven by alterations of transcriptional programs, e.g., by differentiation from stem cells into various tissues or by adapting to micro-environmental requirements. We here complement developmental trajectories in cell-state space by trajectories in gene-state space to more clearly address this latter aspect. Such trajectories can be generated using self-organizing maps machine learning. The method transforms multidimensional gene expression patterns into two-dimensional data landscapes, which resemble the metaphoric Waddington epigenetic landscape. Trajectories in this landscape visualize transcriptional programs passed by cells along their developmental paths from stem cells to differentiated tissues. In addition, we generated developmental "vector fields" using RNA-velocities to forecast changes of RNA abundance in the expression landscapes. We applied the method to tissue development of planarians as an illustrative example. Gene-state space trajectories complement our data portrayal approach by (pseudo-)temporal information about changing transcriptional programs of the cells. Future applications can be seen in the fields of tissue and cell differentiation, ageing and tumor progression, and also in other data types such as genome, methylome, and clinical and epidemiological phenotype data.
Affiliation(s)
- Maria Schmidt
  - IZBI, Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstr. 16–18, 04107 Leipzig, Germany; (H.L.-W.); (H.B.)
|
29
|
Jang SE, Qiu L, Chan LL, Tan EK, Zeng L. Current Status of Stem Cell-Derived Therapies for Parkinson's Disease: From Cell Assessment and Imaging Modalities to Clinical Trials. Front Neurosci 2020; 14:558532. [PMID: 33177975 PMCID: PMC7596695 DOI: 10.3389/fnins.2020.558532] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 05/03/2020] [Accepted: 09/17/2020] [Indexed: 12/23/2022]
Abstract
Curative therapies or treatments reversing the progression of Parkinson’s disease (PD) have attracted considerable interest in the last few decades. PD is characterized by the gradual loss of dopaminergic (DA) neurons and decreased striatal dopamine levels. Current challenges include optimizing neuroprotective strategies, developing personalized drug therapy, and minimizing side effects from the long-term prescription of pharmacological drugs used to relieve short-term motor symptoms. Transplantation of DA cells into PD patients’ brains to replace degenerated DA has the potential to change the treatment paradigm. Herein, we provide updates on current progress in stem cell-derived DA neuron transplantation as a therapeutic alternative for PD. We briefly highlight cell sources for transplantation and focus on cell assessment methods such as identification of genetic markers, single-cell sequencing, and imaging modalities used to assess cell survival and function. More importantly, we summarize clinical reports of patients who have undergone cell-derived transplantation in PD to better perceive lessons that can be drawn from past and present clinical outcomes. Modifying factors include (1) source of the stem cells, (2) quality of the stem cells, (3) age of the patient, (4) stage of disease progression at the time of cell therapy, (5) surgical technique/practices, and (6) the use of immunosuppression. We await the outcomes of joint efforts in clinical trials around the world such as NYSTEM and CiRA to further guide us in the selection of the most suitable parameters for cell-based neurotransplantation in PD.
Affiliation(s)
- Se Eun Jang
  - Neural Stem Cell Research Lab, Research Department, National Neuroscience Institute, Singapore, Singapore
- Lifeng Qiu
  - Neural Stem Cell Research Lab, Research Department, National Neuroscience Institute, Singapore, Singapore
- Ling Ling Chan
  - Department of Diagnostic Radiology, Singapore General Hospital, Singapore, Singapore; Neuroscience & Behavioral Disorders Program, Duke University and National University of Singapore (DUKE-NUS), Graduate Medical School, Singapore, Singapore
- Eng-King Tan
  - Neuroscience & Behavioral Disorders Program, Duke University and National University of Singapore (DUKE-NUS), Graduate Medical School, Singapore, Singapore; Department of Neurology, National Neuroscience Institute, Singapore General Hospital Campus, Singapore, Singapore
- Li Zeng
  - Neural Stem Cell Research Lab, Research Department, National Neuroscience Institute, Singapore, Singapore; Neuroscience & Behavioral Disorders Program, Duke University and National University of Singapore (DUKE-NUS), Graduate Medical School, Singapore, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Novena Campus, Singapore, Singapore
|
30
|
Abstract
BACKGROUND Oscillatory genes, with periodic expression at the mRNA and/or protein level, have been shown to play a pivotal role in many biological contexts. However, with the exception of the circadian clock and cell cycle, only a few such genes are known. Detecting oscillatory genes from snapshot single-cell experiments is a challenging task due to the lack of time information. Oscope is a recently proposed method to identify co-oscillatory gene pairs using single-cell RNA-seq data. Although promising, the current implementation of Oscope does not provide a principled statistical criterion for selecting oscillatory genes. RESULTS We improve the optimisation scheme underlying Oscope and provide a well-calibrated non-parametric hypothesis test to select oscillatory genes at a given FDR threshold. We evaluate performance on synthetic data and three real datasets and show that our approach is more sensitive than the original Oscope formulation, discovering larger sets of known oscillators while avoiding the need for less interpretable thresholds. We also describe how our proposed pseudo-time estimation method is more accurate in recovering the true cell order for each gene cluster while requiring substantially less computation time than the extended nearest insertion approach. CONCLUSIONS OscoNet is a robust and versatile approach to detect oscillatory gene networks from snapshot single-cell data addressing many of the limitations of the original Oscope method.
|
31
|
Savulescu AF, Jacobs C, Negishi Y, Davignon L, Mhlanga MM. Pinpointing Cell Identity in Time and Space. Front Mol Biosci 2020; 7:209. [PMID: 32923457 PMCID: PMC7456825 DOI: 10.3389/fmolb.2020.00209] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/19/2020] [Accepted: 07/30/2020] [Indexed: 01/15/2023]
Abstract
Mammalian cells display a broad spectrum of phenotypes, morphologies, and functional niches within biological systems. Our understanding of mechanisms at the individual cellular level, and how cells function in concert to form tissues, organs and systems, has been greatly facilitated by centuries of extensive work to classify and characterize cell types. Classic histological approaches are now complemented with advanced single-cell sequencing and spatial transcriptomics for cell identity studies. Emerging data suggests that additional levels of information should be considered, including the subcellular spatial distribution of molecules such as RNA and protein, when classifying cells. In this Perspective piece we describe the importance of integrating cell transcriptional state with tissue and subcellular spatial and temporal information for thorough characterization of cell type and state. We refer to recent studies making use of single cell RNA-seq and/or image-based cell characterization, which highlight a need for such in-depth characterization of cell populations. We also describe the advances required in experimental, imaging and analytical methods to address these questions. This Perspective concludes by framing this argument in the context of projects such as the Human Cell Atlas, and related fields of cancer research and developmental biology.
Affiliation(s)
- Anca F. Savulescu
  - Division of Chemical, Systems & Synthetic Biology, Faculty of Health Sciences, Institute of Infectious Disease & Molecular Medicine, University of Cape Town, Cape Town, South Africa
- Caron Jacobs
  - Division of Chemical, Systems & Synthetic Biology, Faculty of Health Sciences, Institute of Infectious Disease & Molecular Medicine, University of Cape Town, Cape Town, South Africa
  - SAMRC/NHLS/UCT Molecular Mycobacteriology Research Unit, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
  - Wellcome Centre for Infectious Diseases Research in Africa, University of Cape Town, Cape Town, South Africa
- Yutaka Negishi
  - Division of Chemical, Systems & Synthetic Biology, Faculty of Health Sciences, Institute of Infectious Disease & Molecular Medicine, University of Cape Town, Cape Town, South Africa
- Laurianne Davignon
  - Division of Chemical, Systems & Synthetic Biology, Faculty of Health Sciences, Institute of Infectious Disease & Molecular Medicine, University of Cape Town, Cape Town, South Africa
- Musa M. Mhlanga
  - Division of Chemical, Systems & Synthetic Biology, Faculty of Health Sciences, Institute of Infectious Disease & Molecular Medicine, University of Cape Town, Cape Town, South Africa
  - Wellcome Centre for Infectious Diseases Research in Africa, University of Cape Town, Cape Town, South Africa
  - Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, Lisbon, Portugal
|
32
|
Verma A, Engelhardt BE. A robust nonlinear low-dimensional manifold for single cell RNA-seq data. BMC Bioinformatics 2020; 21:324. [PMID: 32693778 PMCID: PMC7374962 DOI: 10.1186/s12859-020-03625-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/25/2019] [Accepted: 06/22/2020] [Indexed: 12/22/2022]
Abstract
BACKGROUND Modern developments in single-cell sequencing technologies enable broad insights into cellular state. Single-cell RNA sequencing (scRNA-seq) can be used to explore cell types, states, and developmental trajectories to broaden our understanding of cellular heterogeneity in tissues and organs. Analysis of these sparse, high-dimensional experimental results requires dimension reduction. Several methods have been developed to estimate low-dimensional embeddings for filtered and normalized single-cell data. However, methods have yet to be developed for unfiltered and unnormalized count data that estimate uncertainty in the low-dimensional space. We present a nonlinear latent variable model with robust, heavy-tailed error and adaptive kernel learning to estimate low-dimensional nonlinear structure in scRNA-seq data. RESULTS Gene expression in a single cell is modeled as a noisy draw from a Gaussian process in high dimensions from low-dimensional latent positions. This model is called the Gaussian process latent variable model (GPLVM). We model residual errors with a heavy-tailed Student's t-distribution to estimate a manifold that is robust to technical and biological noise found in normalized scRNA-seq data. We compare our approach to common dimension reduction tools across a diverse set of scRNA-seq data sets to highlight our model's ability to enable important downstream tasks such as clustering, inferring cell developmental trajectories, and visualizing high throughput experiments on available experimental data. CONCLUSION We show that our adaptive robust statistical approach to estimate a nonlinear manifold is well suited for raw, unfiltered gene counts from high-throughput sequencing technologies for visualization, exploration, and uncertainty estimation of cell states.
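To illustrate why a heavy-tailed Student's t error model is described as robust, the sketch below compares the log-density that a Gaussian and a Student's t distribution assign to the same residuals. The degrees of freedom (3), the unit scale, and the function names are assumptions chosen for illustration; this is only the error-model idea in isolation, not the GPLVM described in the abstract.

```python
import math

def gaussian_logpdf(residual, sigma=1.0):
    """Log-density of a zero-mean Gaussian with standard deviation sigma."""
    z = residual / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z

def student_t_logpdf(residual, nu=3.0, sigma=1.0):
    """Log-density of a zero-mean Student's t with nu degrees of freedom
    and scale sigma."""
    z = residual / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1.0) / 2.0 * math.log(1.0 + z * z / nu))

# A small residual and a large, outlier-like one (hypothetical values).
for residual in (0.5, 6.0):
    g = gaussian_logpdf(residual)
    t = student_t_logpdf(residual)
    print(f"residual {residual}: Gaussian {g:.2f}, Student's t {t:.2f}")
```

For the large residual the Gaussian log-density is far more negative than the Student's t log-density, so a single outlying count pulls a Gaussian fit much harder than a heavy-tailed one.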
Collapse
Affiliation(s)
- Archit Verma
  - Chemical and Biological Engineering, Princeton University, 50-70 Olden Street, Princeton, NJ 08540, USA
- Barbara E. Engelhardt
  - Computer Science, Center for Statistics and Machine Learning, 35 Olden Street, Princeton, NJ 08540, USA
33
Gene regulatory network inference from sparsely sampled noisy data. Nat Commun 2020; 11:3493. [PMID: 32661225 PMCID: PMC7359369 DOI: 10.1038/s41467-020-17217-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 10/07/2019] [Accepted: 06/11/2020] [Indexed: 12/16/2022]
Abstract
The complexity of biological systems is encoded in gene regulatory networks. Unravelling this intricate web is a fundamental step in understanding the mechanisms of life and eventually developing efficient therapies to treat and cure diseases. The major obstacle in inferring gene regulatory networks is the lack of data. While time series data are nowadays widely available, they are typically noisy, with low sampling frequency and overall small number of samples. This paper develops a method called BINGO to specifically deal with these issues. Benchmarked with both real and simulated time-series data covering many different gene regulatory networks, BINGO clearly and consistently outperforms state-of-the-art methods. The novelty of BINGO lies in a nonparametric approach featuring statistical sampling of continuous gene expression profiles. BINGO's superior performance and ease of use, even by non-specialists, make gene regulatory network inference available to any researcher, helping to decipher the complex mechanisms of life.
34
Lin C, Bar-Joseph Z. Continuous-state HMMs for modeling time-series single-cell RNA-Seq data. Bioinformatics 2020; 35:4707-4715. [PMID: 31038684 DOI: 10.1093/bioinformatics/btz296] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 06/18/2018] [Revised: 02/11/2019] [Accepted: 04/18/2019] [Indexed: 12/11/2022]
Abstract
MOTIVATION Methods for reconstructing developmental trajectories from time-series single-cell RNA-Seq (scRNA-Seq) data can be largely divided into two categories. The first, often referred to as pseudotime ordering methods, is deterministic and relies on dimensionality reduction followed by an ordering step. The second learns a probabilistic branching model to represent the developmental process. While both types have been successful, each suffers from shortcomings that can impact its accuracy. RESULTS We developed a new method based on continuous-state HMMs (CSHMMs) for representing and modeling time-series scRNA-Seq data. We define the CSHMM model and provide efficient learning and inference algorithms which allow the method to determine both the structure of the branching process and the assignment of cells to these branches. Analyzing several developmental single-cell datasets, we show that the CSHMM method accurately infers branching topology and correctly and continuously assigns cells to paths, improving upon prior methods proposed for this task. Analysis of genes based on the continuous cell assignment identifies known and novel markers for different cell types. AVAILABILITY AND IMPLEMENTATION Software and Supporting website: www.andrew.cmu.edu/user/chiehl1/CSHMM/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
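To make the first category concrete, here is a minimal sketch of a deterministic pseudotime ordering: reduce the data to one dimension, then sort the cells along it. It is not the CSHMM method. The function name `pseudotime_order` and the choice of the first principal component as the reduction are assumptions made for illustration only.

```python
import numpy as np


def pseudotime_order(expr):
    """Order cells along the first principal component.

    expr: array of shape (n_cells, n_genes).
    Returns cell indices sorted by their PC1 score. The direction of
    the axis is arbitrary, so the order may come out reversed.
    """
    # Center each gene so that PC1 captures variation, not the mean.
    centered = expr - expr.mean(axis=0)
    # Thin SVD: columns of u scaled by s are the principal component scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, 0] * s[0]
    # Ordering step: sort cells by their one-dimensional score.
    return np.argsort(scores)


if __name__ == "__main__":
    # Four toy cells whose three genes all vary linearly with a hidden time t.
    t = np.array([0.3, 0.1, 0.9, 0.5])
    expr = np.outer(t, np.array([1.0, 2.0, -1.0]))
    print(pseudotime_order(expr))
```

Because the sign of a principal component is not determined, a real analysis would need a known start cell or marker gene to fix the direction of the ordering.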
Affiliation(s)
- Chieh Lin
  - Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Ziv Bar-Joseph
  - Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
35
Saint-Antoine MM, Singh A. Network inference in systems biology: recent developments, challenges, and applications. Curr Opin Biotechnol 2020; 63:89-98. [PMID: 31927423 PMCID: PMC7308210 DOI: 10.1016/j.copbio.2019.12.002] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 10/31/2019] [Accepted: 12/03/2019] [Indexed: 12/12/2022]
Abstract
One of the most interesting, difficult, and potentially useful topics in computational biology is the inference of gene regulatory networks (GRNs) from expression data. Although researchers have been working on this topic for more than a decade and much progress has been made, it remains an unsolved problem and even the most sophisticated inference algorithms are far from perfect. In this paper, we review the latest developments in network inference, including state-of-the-art algorithms like PIDC, Phixer, and more. We also discuss unsolved computational challenges, including the optimal combination of algorithms, integration of multiple data sources, and pseudo-temporal ordering of static expression data. Lastly, we discuss some exciting applications of network inference in cancer research, and provide a list of useful software tools for researchers hoping to conduct their own network inference analyses.
Affiliation(s)
- Michael M Saint-Antoine
  - Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware 19716, USA
- Abhyudai Singh
  - Electrical and Computer Engineering, University of Delaware, Newark, Delaware 19716, USA
36
37
Putative regulators for the continuum of erythroid differentiation revealed by single-cell transcriptome of human BM and UCB cells. Proc Natl Acad Sci U S A 2020; 117:12868-12876. [PMID: 32457162 DOI: 10.1073/pnas.1915085117] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Indexed: 02/06/2023]
Abstract
Fine-resolution differentiation trajectories of adult human hematopoietic stem cells (HSCs) involved in the generation of red cells are critical for understanding dynamic developmental changes that accompany human erythropoiesis. Using single-cell RNA sequencing (scRNA-seq) of primary human terminal erythroid cells (CD34-CD235a+) isolated directly from adult bone marrow (BM) and umbilical cord blood (UCB), we documented the transcriptome of terminally differentiated human erythroblasts at unprecedented resolution. The insights enabled us to distinguish polychromatic erythroblasts (PolyEs) at the early and late stages of development as well as the different development stages of orthochromatic erythroblasts (OrthoEs). We further identified a set of putative regulators of terminal erythroid differentiation and functionally validated three of the identified genes, AKAP8L, TERF2IP, and RNF10, by monitoring cell differentiation and apoptosis. We documented that knockdown of AKAP8L suppressed the commitment of HSCs to erythroid lineage and cell proliferation and delayed differentiation of colony-forming unit-erythroid (CFU-E) to the proerythroblast stage (ProE). In contrast, the knockdown of TERF2IP and RNF10 delayed differentiation of PolyE to OrthoE stage. Taken together, the convergence and divergence of the transcriptional continuums at single-cell resolution underscore the transcriptional regulatory networks that underlie human fetal and adult terminal erythroid differentiation.
38
Strauss ME, Kirk PDW, Reid JE, Wernisch L. GPseudoClust: deconvolution of shared pseudo-profiles at single-cell resolution. Bioinformatics 2020; 36:1484-1491. [PMID: 31608923 PMCID: PMC7703763 DOI: 10.1093/bioinformatics/btz778] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/08/2019] [Revised: 09/20/2019] [Accepted: 10/09/2019] [Indexed: 01/21/2023]
Abstract
MOTIVATION Many methods have been developed to cluster genes on the basis of their changes in mRNA expression over time, using bulk RNA-seq or microarray data. However, single-cell data may present a particular challenge for these algorithms, since the temporal ordering of cells is not directly observed. One way to address this is to first use pseudotime methods to order the cells, and then apply clustering techniques for time course data. However, pseudotime estimates are subject to high levels of uncertainty, and failing to account for this uncertainty is liable to lead to erroneous and/or over-confident gene clusters. RESULTS The proposed method, GPseudoClust, is a novel approach that jointly infers pseudotemporal ordering and gene clusters, and quantifies the uncertainty in both. GPseudoClust combines a recent method for pseudotime inference with non-parametric Bayesian clustering methods, efficient Markov Chain Monte Carlo sampling and novel subsampling strategies which aid computation. We consider a broad array of simulated and experimental datasets to demonstrate the effectiveness of GPseudoClust in a range of settings. AVAILABILITY AND IMPLEMENTATION An implementation is available on GitHub: https://github.com/magStra/nonparametricSummaryPSM and https://github.com/magStra/GPseudoClust. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Magdalena E Strauss
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SR, UK
- Paul D W Kirk
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SR, UK
  - Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge CB2 0SP, UK
- John E Reid
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SR, UK
- Lorenz Wernisch
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SR, UK
39
Verrou K, Tsamardinos I, Papoutsoglou G. Learning Pathway Dynamics from Single-Cell Proteomic Data: A Comparative Study. Cytometry A 2020; 97:241-252. [PMID: 32100455 PMCID: PMC7687117 DOI: 10.1002/cyto.a.23976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2019] [Revised: 11/07/2019] [Accepted: 01/13/2020] [Indexed: 12/01/2022]
Abstract
Single-cell platforms provide statistically large samples of snapshot observations capable of resolving intercellular heterogeneity. Currently, there is a growing literature on algorithms that exploit this attribute in order to infer the trajectory of biological mechanisms, such as cell proliferation and differentiation. Despite these efforts, the trajectory inference methodology has not yet been used for addressing the challenging problem of learning the dynamics of protein signaling systems. In this work, we assess this prospect by testing the performance of this class of algorithms on four proteomic temporal datasets. To evaluate the learning quality, we design new general-purpose evaluation metrics that are able to quantify performance on (i) the biological meaning of the output, (ii) the consistency of the inferred trajectory, (iii) the algorithm robustness, (iv) the correlation of the learning output with the initial dataset, and (v) the roughness of the cell parameter levels through the inferred trajectory. We show that experimental time alone is insufficient to provide knowledge about the order of proteins during signal transduction. Accordingly, we show that the inferred trajectories provide richer information about the underlying dynamics. We learn that established methods tested on high-dimensional data with small sample size, slow dynamics, and complex structures (e.g. bifurcations) cannot always work in the signaling setting. Among the methods we evaluate, Scorpius and a newly introduced approach that combines Diffusion Maps and Principal Curves were found to perform adequately in recovering the progression of signal transduction, although their performance on some metrics varies from one dataset to another. The novel metrics we devise highlight that it is difficult to conclude which method is universally applicable for the task. Arguably, there are still many challenges and open problems to resolve. © 2020 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.
Affiliation(s)
- Ioannis Tsamardinos
  - Computer Science Department, University of Crete, Heraklion, Greece
  - Gnosis Data Analysis PC, Heraklion, Greece
40
Lin C, Ding J, Bar-Joseph Z. Inferring TF activation order in time series scRNA-Seq studies. PLoS Comput Biol 2020; 16:e1007644. [PMID: 32069291 PMCID: PMC7048296 DOI: 10.1371/journal.pcbi.1007644] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/24/2019] [Revised: 02/28/2020] [Accepted: 01/09/2020] [Indexed: 12/11/2022]
Abstract
Methods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments and provide information on how and when the process is regulated. We developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method, which integrates probabilistic modeling of scRNA-Seq data with the ability to assign TFs to specific activation points in the model. TFs are assumed to influence the emission probabilities for cells assigned to later time points, allowing us to identify not just the TFs controlling each path but also their order of activation. We tested CSHMM-TF on several mouse and human datasets. As we show, the method was able to identify known and novel TFs for all processes, the assigned time of activation agrees with both expression information and prior knowledge, and combinatorial predictions are supported by known interactions. We also show that CSHMM-TF improves upon prior methods that do not utilize TF-gene interaction.
An important attribute of time series single cell RNA-Seq (scRNA-Seq) data is the ability to infer continuous trajectories of genes based on orderings of the cells. While several methods have been developed for ordering cells and inferring such trajectories, to date it was not possible to use these to infer the temporal activity of several key TFs. These TFs are only post-transcriptionally regulated and so their expression does not provide complete information on their activity. To address this we developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method that assigns continuous activation time to TFs based on both their expression and the expression of their targets. Applying our method to several time series scRNA-Seq datasets we show that it correctly identifies the key regulators for the processes being studied. We analyze the temporal assignments for these TFs and show that they provide new insights about combinatorial regulation and the ordering of TF activation. We used several complementary sources to validate some of these predictions and discuss a number of other novel suggestions based on the method. As we show, the method is able to scale to large and noisy datasets and so is appropriate for several studies utilizing time series scRNA-Seq data.
Affiliation(s)
- Chieh Lin
  - Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Jun Ding
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Ziv Bar-Joseph
  - Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
41
Lähnemann D, Köster J, Szczurek E, McCarthy DJ, Hicks SC, Robinson MD, Vallejos CA, Campbell KR, Beerenwinkel N, Mahfouz A, Pinello L, Skums P, Stamatakis A, Attolini CSO, Aparicio S, Baaijens J, Balvert M, Barbanson BD, Cappuccio A, Corleone G, Dutilh BE, Florescu M, Guryev V, Holmer R, Jahn K, Lobo TJ, Keizer EM, Khatri I, Kielbasa SM, Korbel JO, Kozlov AM, Kuo TH, Lelieveldt BP, Mandoiu II, Marioni JC, Marschall T, Mölder F, Niknejad A, Rączkowska A, Reinders M, Ridder JD, Saliba AE, Somarakis A, Stegle O, Theis FJ, Yang H, Zelikovsky A, McHardy AC, Raphael BJ, Shah SP, Schönhuth A. Eleven grand challenges in single-cell data science. Genome Biol 2020; 21:31. [PMID: 32033589 PMCID: PMC7007675 DOI: 10.1186/s13059-020-1926-6] [Citation(s) in RCA: 679] [Impact Index Per Article: 135.8] [Received: 08/02/2019] [Accepted: 01/02/2020] [Indexed: 02/08/2023]
Abstract
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
Affiliation(s)
- David Lähnemann
  - Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
  - Department of Paediatric Oncology, Haematology and Immunology, Medical Faculty, Heinrich Heine University, University Hospital, Düsseldorf, Germany
  - Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Johannes Köster
  - Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
  - Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
- Ewa Szczurek
  - Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
- Davis J. McCarthy
  - Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia
  - Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia
- Stephanie C. Hicks
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Mark D. Robinson
  - Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
- Catalina A. Vallejos
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, UK
  - The Alan Turing Institute, British Library, London, UK
- Kieran R. Campbell
  - Department of Statistics, University of British Columbia, Vancouver, Canada
  - Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
  - Data Science Institute, University of British Columbia, Vancouver, Canada
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Ahmed Mahfouz
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
  - Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
- Luca Pinello
  - Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, USA
  - Department of Pathology, Harvard Medical School, Boston, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Pavel Skums
  - Department of Computer Science, Georgia State University, Atlanta, USA
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Samuel Aparicio
  - Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
- Jasmijn Baaijens
  - Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Marleen Balvert
  - Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
  - Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
- Buys de Barbanson
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
  - Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
- Antonio Cappuccio
  - Institute for Advanced Study, University of Amsterdam, Amsterdam, The Netherlands
- Giacomo Corleone
  - Department of Surgery and Cancer, The Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK
- Bas E. Dutilh
  - Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
  - Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Nijmegen, The Netherlands
- Maria Florescu
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
  - Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
- Victor Guryev
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Rens Holmer
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Katharina Jahn
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Thamar Jessurun Lobo
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Emma M. Keizer
  - Biometris, Wageningen University & Research, Wageningen, The Netherlands
- Indu Khatri
  - Department of Immunohematology and Blood Transfusion, Leiden University Medical Center, Leiden, The Netherlands
- Szymon M. Kielbasa
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Jan O. Korbel
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Alexey M. Kozlov
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Tzu-Hao Kuo
  - Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Boudewijn P.F. Lelieveldt
  - PRB lab, Delft University of Technology, Delft, The Netherlands
  - Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Ion I. Mandoiu
  - Computer Science & Engineering Department, University of Connecticut, Storrs, USA
- John C. Marioni
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Tobias Marschall
  - Center for Bioinformatics, Saarland University, Saarbrücken, Germany
  - Max Planck Institute for Informatics, Saarbrücken, Germany
- Felix Mölder
  - Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
  - Institute of Pathology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Amir Niknejad
  - Computation molecular design, Zuse Institute Berlin, Berlin, Germany
  - Mathematics Department, Mount Saint Vincent, New York, USA
- Alicja Rączkowska
  - Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
- Marcel Reinders
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
  - Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
- Jeroen de Ridder
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Antoine-Emmanuel Saliba
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz-Center for Infection Research, Würzburg, Germany
- Antonios Somarakis
  - Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Oliver Stegle
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center–DKFZ, Heidelberg, Germany
- Fabian J. Theis
  - Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
- Huan Yang
  - Division of Drug Discovery and Safety, Leiden Academic Center for Drug Research–LACDR–Leiden University, Leiden, The Netherlands
- Alex Zelikovsky
  - Department of Computer Science, Georgia State University, Atlanta, USA
  - The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Alice C. McHardy
  - Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Sohrab P. Shah
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA
- Alexander Schönhuth
  - Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
  - Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
42
Strauß ME, Reid JE, Wernisch L. GPseudoRank: a permutation sampler for single cell orderings. Bioinformatics 2019; 35:611-618. [PMID: 30052778 PMCID: PMC6230469 DOI: 10.1093/bioinformatics/bty664] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 02/15/2018] [Revised: 06/13/2018] [Accepted: 07/24/2018] [Indexed: 11/30/2022]
Abstract
Motivation A number of pseudotime methods have provided point estimates of the ordering of cells for scRNA-seq data. A still limited number of methods also model the uncertainty of the pseudotime estimate. However, there is still a need for a method to sample from complicated and multi-modal distributions of orders, and to estimate changes in the amount of the uncertainty of the order during the course of a biological development, as this can support the selection of suitable cells for the clustering of genes or for network inference. Results In applications to scRNA-seq data we demonstrate the potential of GPseudoRank to sample from complex and multi-modal posterior distributions and to identify phases of lower and higher pseudotime uncertainty during a biological process. GPseudoRank also correctly identifies cells precocious in their antiviral response and links uncertainty in the ordering to metastable states. A variant of the method extends the advantages of Bayesian modelling and MCMC to large droplet-based scRNA-seq datasets. Availability and implementation Our method is available on github: https://github.com/magStra/GPseudoRank. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Magdalena E Strauß
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- John E Reid
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
  - Alan Turing Institute, London, UK
- Lorenz Wernisch
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
43
Chan TE, Stumpf MPH, Babtie AC. Gene Regulatory Network Inference from Single-Cell Data Using Multivariate Information Measures. Cell Syst 2019; 5:251-267.e3. [PMID: 28957658 PMCID: PMC5624513 DOI: 10.1016/j.cels.2017.08.014] [Citation(s) in RCA: 320] [Impact Index Per Article: 53.3] [Received: 10/10/2016] [Revised: 04/26/2017] [Accepted: 08/24/2017] [Indexed: 12/03/2022]
Abstract
While single-cell gene expression experiments present new challenges for data processing, the cell-to-cell variability observed also reveals statistical relationships that can be used by information theory. Here, we use multivariate information theory to explore the statistical dependencies between triplets of genes in single-cell gene expression datasets. We develop PIDC, a fast, efficient algorithm that uses partial information decomposition (PID) to identify regulatory relationships between genes. We thoroughly evaluate the performance of our algorithm and demonstrate that the higher-order information captured by PIDC allows it to outperform pairwise mutual information-based algorithms when recovering true relationships present in simulated data. We also infer gene regulatory networks from three experimental single-cell datasets and illustrate how network context, choices made during analysis, and sources of variability affect network inference. PIDC tutorials and open-source software for estimating PID are available. PIDC should facilitate the identification of putative functional relationships and mechanistic hypotheses from single-cell transcriptomic data.
- PIDC infers gene regulatory networks from single-cell transcriptomic data
- Multivariate information measures and context in PIDC improve network inference
- Heterogeneity in single-cell data carries information about gene-gene interactions
- Fast, efficient, open-source software is made freely available
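For orientation, the pairwise mutual information that the abstract uses as its point of comparison can be sketched as below. This is not PIDC and does not compute a partial information decomposition. The function name `mutual_information`, the equal-width binning, and the default of four bins are assumptions made for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=4):
    """Plug-in estimate of mutual information (in bits) between two genes.

    x, y: 1-D arrays of expression values across the same cells.
    Values are discretized into equal-width bins before estimation.
    """
    # Joint histogram of the two discretized expression profiles.
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    # Marginal distributions, kept 2-D so their product has the joint's shape.
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    # Sum only over non-empty cells to avoid log(0).
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


if __name__ == "__main__":
    gene_a = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0])
    # A gene compared with itself carries its full entropy.
    print(mutual_information(gene_a, gene_a))
```

A pairwise score like this treats each gene pair in isolation, which is the limitation that the triplet-based measures described in the abstract are meant to address.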
Affiliation(s)
- Thalia E Chan
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
- Michael P H Stumpf
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
  - MRC London Institute of Medical Sciences, Hammersmith Campus, Imperial College London, London W12 0NN, UK
- Ann C Babtie
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
Collapse
|
44
|
Affiliation(s)
- Lan Huong Nguyen
  - Institute for Mathematical and Computational Engineering, Stanford University, Stanford, California, United States of America
- Susan Holmes
  - Department of Statistics, Stanford University, Stanford, California, United States of America
45
Pierson E, Koh PW, Hashimoto T, Koller D, Leskovec J, Eriksson N, Liang P. Inferring Multidimensional Rates of Aging from Cross-Sectional Data. Proceedings of Machine Learning Research 2019; 89:97-107. [PMID: 31538144 PMCID: PMC6752884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Modeling how individuals evolve over time is a fundamental problem in the natural and social sciences. However, existing datasets are often cross-sectional with each individual observed only once, making it impossible to apply traditional time-series methods. Motivated by the study of human aging, we present an interpretable latent-variable model that learns temporal dynamics from cross-sectional data. Our model represents each individual's features over time as a nonlinear function of a low-dimensional, linearly-evolving latent state. We prove that when this nonlinear function is constrained to be order-isomorphic, the model family is identifiable solely from cross-sectional data provided the distribution of time-independent variation is known. On the UK Biobank human health dataset, our model reconstructs the observed data while learning interpretable rates of aging associated with diseases, mortality, and aging risk factors.
46
Yau C, Campbell K. Bayesian statistical learning for big data biology. Biophys Rev 2019; 11:95-102. [PMID: 30729409 PMCID: PMC6381359 DOI: 10.1007/s12551-019-00499-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/07/2019] [Accepted: 01/08/2019] [Indexed: 11/10/2022] Open
Abstract
Bayesian statistical learning provides a coherent probabilistic framework for modelling uncertainty in systems. This review describes the theoretical foundations underlying Bayesian statistics and outlines the computational frameworks for implementing Bayesian inference in practice. We then describe the use of Bayesian learning in single-cell biology for the analysis of high-dimensional, large data sets.
Affiliation(s)
- Christopher Yau
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK
  - The Alan Turing Institute, London, UK
- Kieran Campbell
  - Department of Statistics, University of British Columbia, Vancouver, Canada
  - Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
47
Lang C, Campbell KR, Ryan BJ, Carling P, Attar M, Vowles J, Perestenko OV, Bowden R, Baig F, Kasten M, Hu MT, Cowley SA, Webber C, Wade-Martins R. Single-Cell Sequencing of iPSC-Dopamine Neurons Reconstructs Disease Progression and Identifies HDAC4 as a Regulator of Parkinson Cell Phenotypes. Cell Stem Cell 2019; 24:93-106.e6. [PMID: 30503143 PMCID: PMC6327112 DOI: 10.1016/j.stem.2018.10.023] [Citation(s) in RCA: 106] [Impact Index Per Article: 17.7] [Received: 03/05/2018] [Revised: 07/13/2018] [Accepted: 10/23/2018] [Indexed: 11/29/2022]
Abstract
Induced pluripotent stem cell (iPSC)-derived dopamine neurons provide an opportunity to model Parkinson's disease (PD), but neuronal cultures are confounded by asynchronous and heterogeneous appearance of disease phenotypes in vitro. Using high-resolution, single-cell transcriptomic analyses of iPSC-derived dopamine neurons carrying the GBA-N370S PD risk variant, we identified a progressive axis of gene expression variation leading to endoplasmic reticulum stress. Pseudotime analysis of genes differentially expressed (DE) along this axis identified the transcriptional repressor histone deacetylase 4 (HDAC4) as an upstream regulator of disease progression. HDAC4 was mislocalized to the nucleus in PD iPSC-derived dopamine neurons and repressed genes early in the disease axis, leading to late deficits in protein homeostasis. Treatment of iPSC-derived dopamine neurons with HDAC4-modulating compounds upregulated genes early in the DE axis and corrected PD-related cellular phenotypes. Our study demonstrates how single-cell transcriptomics can exploit cellular heterogeneity to reveal disease mechanisms and identify therapeutic targets.
Affiliation(s)
- Charmaine Lang
  - Oxford Parkinson's Disease Centre, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, UK
- Kieran R Campbell
  - Oxford Parkinson's Disease Centre, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, UK; The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Brent J Ryan
  - Oxford Parkinson's Disease Centre, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, UK
- Phillippa Carling
  - Oxford Parkinson's Disease Centre, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, UK
- Moustafa Attar
  - The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Jane Vowles
  - Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK
- Olga V Perestenko
  - Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK
- Rory Bowden
  - The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Fahd Baig
  - Oxford Parkinson's Disease Centre, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Meike Kasten
  - Department of Psychiatry and Psychotherapy and Institute of Neurogenetics, University of Lübeck, Lübeck, Germany
- Michele T Hu
  - Oxford Parkinson's Disease Centre, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Sally A Cowley
  - Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK
- Caleb Webber
  - Oxford Parkinson's Disease Centre, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, UK
- Richard Wade-Martins
  - Oxford Parkinson's Disease Centre, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, UK
48
Tischler J, Gruhn WH, Reid J, Allgeyer E, Buettner F, Marr C, Theis F, Simons BD, Wernisch L, Surani MA. Metabolic regulation of pluripotency and germ cell fate through α-ketoglutarate. EMBO J 2019; 38:e99518. [PMID: 30257965 PMCID: PMC6315289 DOI: 10.15252/embj.201899518] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 03/27/2018] [Revised: 08/24/2018] [Accepted: 08/27/2018] [Indexed: 12/16/2022] Open
Abstract
An intricate link is becoming apparent between metabolism and cellular identities. Here, we explore the basis for such a link in an in vitro model for early mouse embryonic development: from naïve pluripotency to the specification of primordial germ cells (PGCs). Using single-cell RNA-seq with statistical modelling and modulation of energy metabolism, we demonstrate a functional role for oxidative mitochondrial metabolism in naïve pluripotency. We link mitochondrial tricarboxylic acid cycle activity to IDH2-mediated production of alpha-ketoglutarate and through it, the activity of key epigenetic regulators. Accordingly, this metabolite has a role in the maintenance of naïve pluripotency as well as in PGC differentiation, likely through preserving a particular histone methylation status underlying the transient state of developmental competence for the PGC fate. We reveal a link between energy metabolism and epigenetic control of cell state transitions during a developmental trajectory towards germ cell specification, and establish a paradigm for stabilizing fleeting cellular states through metabolic modulation.
Affiliation(s)
- Julia Tischler
  - Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- Wolfram H Gruhn
  - Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- John Reid
  - MRC Biostatistics Unit, Cambridge Institute of Public Health, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - The Alan Turing Institute, British Library, London, UK
- Edward Allgeyer
  - Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- Florian Buettner
  - Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Carsten Marr
  - Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Fabian Theis
  - Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
  - Department of Mathematics, Chair of Mathematical Modeling of Biological Systems, Technische Universität München, Garching, Germany
- Ben D Simons
  - Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- Lorenz Wernisch
  - MRC Biostatistics Unit, Cambridge Institute of Public Health, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- M Azim Surani
  - Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
49
Karbalayghareh A, Braga-Neto U, Dougherty ER. Classification of Single-Cell Gene Expression Trajectories from Incomplete and Noisy Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:193-207. [PMID: 29053466 DOI: 10.1109/tcbb.2017.2763946] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/07/2023]
Abstract
This paper studies classification of gene-expression trajectories coming from two classes, healthy and mutated (cancerous) using Boolean networks with perturbation (BNps) to model the dynamics of each class at the state level. Each class has its own BNp, which is partially known based on gene pathways. We employ a Gaussian model at the observation level to show the expression values of the genes given the hidden binary states at each time point. We use expectation maximization (EM) to learn the BNps and the unknown model parameters, derive closed-form updates for the parameters, and propose a learning algorithm. After learning, a plug-in Bayes classifier is used to classify unlabeled trajectories, which can have missing data. Measuring gene expressions at different times yields trajectories only when measurements come from a single cell. In multiple-cell scenarios, the expression values are averages over many cells with possibly different states. Via the central-limit theorem, we propose another model for expression data in multiple-cell scenarios. Simulations demonstrate that single-cell trajectory data can outperform multiple-cell average expression data relative to classification error, especially in high-noise situations. We also consider data generated via a mammalian cell-cycle network, both the wild-type and with a common mutation affecting p27.
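As a rough illustration of the generative model this abstract describes, the sketch below simulates a hidden Boolean network with perturbation and draws Gaussian expression values from each hidden binary state. It is a minimal sketch under stated assumptions, not the authors' implementation: the three-gene rules in `RULES`, the function names, and the parameters `p`, `mu_off`, `mu_on`, and `sigma` are all hypothetical choices made for the example, and the EM learning and classification steps are not shown.

```python
import random

def step(state, rules, p, rng):
    """One BNp transition: apply each gene's Boolean rule, then flip each
    resulting bit independently with probability p (the perturbation)."""
    nxt = [int(bool(rule(state))) for rule in rules]
    return [bit ^ int(rng.random() < p) for bit in nxt]

def observe(state, mu_off, mu_on, sigma, rng):
    """Gaussian observation: each gene's expression is drawn around a mean
    that depends on its hidden binary state (assumed shared across genes)."""
    return [rng.gauss(mu_on if bit else mu_off, sigma) for bit in state]

def simulate(initial, rules, p, mu_off, mu_on, sigma, n_steps, seed=0):
    """Return the hidden state trajectory and the noisy observed trajectory."""
    rng = random.Random(seed)
    state = list(initial)
    states, observations = [], []
    for _ in range(n_steps):
        states.append(state)
        observations.append(observe(state, mu_off, mu_on, sigma, rng))
        state = step(state, rules, p, rng)
    return states, observations

# Hypothetical 3-gene network, invented for illustration only.
RULES = [
    lambda s: s[2],               # gene 0 copies gene 2
    lambda s: s[0] and not s[2],  # gene 1 is on if gene 0 is on and gene 2 is off
    lambda s: s[0] or s[1],       # gene 2 is on if gene 0 or gene 1 is on
]

if __name__ == "__main__":
    hidden, observed = simulate([1, 0, 0], RULES, p=0.05,
                                mu_off=1.0, mu_on=5.0, sigma=0.5, n_steps=5)
    for x, y in zip(hidden, observed):
        print(x, [round(v, 2) for v in y])
```

Setting `p=0` recovers an ordinary deterministic Boolean network, and raising `sigma` makes the hidden states harder to read off the observations, which is the high-noise regime the abstract refers to.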
50
Dondelinger F, Mukherjee S. Statistical Network Inference for Time-Varying Molecular Data with Dynamic Bayesian Networks. Methods Mol Biol 2019; 1883:25-48. [PMID: 30547395 DOI: 10.1007/978-1-4939-8882-2_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 03/24/2023]
Abstract
In this chapter, we review the problem of network inference from time-course data, focusing on a class of graphical models known as dynamic Bayesian networks (DBNs). We discuss the relationship of DBNs to models based on ordinary differential equations, and consider extensions to nonlinear time dynamics. We provide an introduction to time-varying DBN models, which allow for changes to the network structure and parameters over time. We also discuss causal perspectives on network inference, including issues around model semantics that can arise due to missing variables. We present a case study of applying time-varying DBNs to gene expression measurements over the life cycle of Drosophila melanogaster. We finish with a discussion of future perspectives, including possible applications of time-varying network inference to single-cell gene expression data.
Affiliation(s)
- Sach Mukherjee
  - German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany