151
Watson ER, Taherian Fard A, Mar JC. Computational Methods for Single-Cell Imaging and Omics Data Integration. Front Mol Biosci 2022; 8:768106. [PMID: 35111809 PMCID: PMC8801747 DOI: 10.3389/fmolb.2021.768106] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/14/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022]
Abstract
Integrating single cell omics and single cell imaging allows for a more effective characterisation of the underlying mechanisms that drive a phenotype at the tissue level, creating a comprehensive profile at the cellular level. Although the use of imaging data is well established in biomedical research, its primary application has been to observe phenotypes at the tissue or organ level, often using medical imaging techniques such as MRI, CT, and PET. These imaging technologies complement omics-based data in biomedical research because they are helpful for identifying associations between genotype and phenotype, along with functional changes occurring at the tissue level. Single cell imaging can act as an intermediary between these levels. Meanwhile new technologies continue to arrive that can be used to interrogate the genome of single cells and its related omics datasets. As these two areas, single cell imaging and single cell omics, each advance independently with the development of novel techniques, the opportunity to integrate these data types becomes more and more attractive. This review outlines some of the technologies and methods currently available for generating, processing, and analysing single-cell omics- and imaging data, and how they could be integrated to further our understanding of complex biological phenomena like ageing. We include an emphasis on machine learning algorithms because of their ability to identify complex patterns in large multidimensional data.
Affiliation(s)
- Atefeh Taherian Fard
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
- Jessica Cara Mar
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
152
Luo Q, Yu Y, Lan X. SIGNET: single-cell RNA-seq-based gene regulatory network prediction using multiple-layer perceptron bagging. Brief Bioinform 2022; 23:bbab547. [PMID: 34962260 PMCID: PMC8769917 DOI: 10.1093/bib/bbab547] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/01/2021] [Revised: 11/13/2021] [Accepted: 11/25/2021] [Indexed: 11/17/2022]
Abstract
High-throughput single-cell RNA-seq data have provided unprecedented opportunities for deciphering the regulatory interactions among genes. However, such interactions are complex and often nonlinear or nonmonotonic, which makes their inference using linear models challenging. We present SIGNET, a deep learning-based framework for capturing complex regulatory relationships between genes under the assumption that the expression levels of transcription factors participating in gene regulation are strong predictors of the expression of their target genes. Evaluations based on a variety of real and simulated scRNA-seq datasets showed that SIGNET is more sensitive to ChIP-seq validated regulatory interactions in different types of cells, particularly rare cells. Therefore, this process is more effective for various downstream analyses, such as cell clustering and gene regulatory network inference. We demonstrated that SIGNET is a useful tool for identifying important regulatory modules driving various biological processes.
Affiliation(s)
- Qinhuan Luo
  - School of Medicine, Tsinghua University, Beijing, China
- Yongzhen Yu
  - School of Medicine, Tsinghua University, Beijing, China
- Xun Lan
  - School of Medicine, and the Tsinghua-Peking Center for Life Science, MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China
153
Ben Guebila M, Lopes-Ramos CM, Weighill D, Sonawane A, Burkholz R, Shamsaei B, Platig J, Glass K, Kuijjer M, Quackenbush J. GRAND: a database of gene regulatory network models across human conditions. Nucleic Acids Res 2022; 50:D610-D621. [PMID: 34508353 PMCID: PMC8728257 DOI: 10.1093/nar/gkab778] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/01/2021] [Revised: 08/17/2021] [Accepted: 09/08/2021] [Indexed: 12/14/2022]
Abstract
Gene regulation plays a fundamental role in shaping tissue identity, function, and response to perturbation. Regulatory processes are controlled by complex networks of interacting elements, including transcription factors, miRNAs and their target genes. The structure of these networks helps to determine phenotypes and can ultimately influence the development of disease or response to therapy. We developed GRAND (https://grand.networkmedicine.org) as a database for computationally inferred, context-specific gene regulatory network models that can be compared between biological states, or used to predict which drugs produce changes in regulatory network structure. The database includes 12 468 genome-scale networks covering 36 human tissues, 28 cancers, 1378 unperturbed cell lines, as well as 173 013 TF and gene targeting scores for 2858 small molecule-induced cell line perturbations paired with phenotypic information. GRAND allows the networks to be queried using phenotypic information and visualized using a variety of interactive tools. In addition, it includes a web application that matches disease states to potentially therapeutic small molecule drugs using regulatory network properties.
Affiliation(s)
- Marouen Ben Guebila
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Deborah Weighill
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Abhijeet Rajendra Sonawane
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Rebekka Burkholz
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Behrouz Shamsaei
  - Division of Biostatistics and Bioinformatics, Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- John Platig
  - Channing Division of Network Medicine, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, USA
- Kimberly Glass
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
  - Channing Division of Network Medicine, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, USA
- Marieke L Kuijjer
  - Center for Molecular Medicine Norway, Faculty of Medicine, University of Oslo, Oslo, Norway
  - Leiden University Medical Center, Leiden, The Netherlands
- John Quackenbush
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
  - Channing Division of Network Medicine, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, USA
154
Emerging Machine Learning Techniques for Modelling Cellular Complex Systems in Alzheimer's Disease. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1338:199-208. [PMID: 34973026 DOI: 10.1007/978-3-030-78775-2_24] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
Abstract
We live in the big data era in the biomedical field, where machine learning contributes substantially to the interpretation of complex biological processes and diseases, since it has the potential to create predictive models from multidimensional data sets. Part of the application of machine learning in biomedical science is to study and model complex cellular systems such as biological networks. In this context, the study of complex diseases, such as Alzheimer's disease (AD), benefits from established methodologies of network science and machine learning, as they offer algorithmic tools and techniques that can address the limitations and challenges of modeling and studying cellular AD-related networks. In this paper we analyze the opportunities and challenges at the intersection of machine learning and network biology and whether this can affect the biological interpretation and clarification of diseases. Specifically, we focus on GRN techniques which, through omics data and the use of machine learning techniques, can construct a network that captures all the information at the molecular level for the disease under study. We record the emerging machine learning techniques that focus on ensemble tree-based techniques in the area of classification and regression. Their potential for unraveling the complexity of model cellular systems in complex diseases, such as AD, offers the opportunity for novel machine learning methodologies to decipher the mechanisms of the various AD processes.
155
Liu J, Wang H, Sun W, Liu Y. Prioritizing Autism Risk Genes using Personalized Graphical Models Estimated from Single Cell RNA-seq Data. J Am Stat Assoc 2022; 117:38-51. [PMID: 35529781 PMCID: PMC9070996 DOI: 10.1080/01621459.2021.1933495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
Hundreds of autism risk genes have been reported recently, mainly based on genetic studies where these risk genes have more de novo mutations in autism subjects than healthy controls. However, as a complex disease, autism is likely associated with more risk genes and many of them may not be identifiable through de novo mutations. We hypothesize that more autism risk genes can be identified through their connections with known autism risk genes in personalized gene-gene interaction graphs. We estimate such personalized graphs using single cell RNA sequencing (scRNA-seq) while appropriately modeling the cell dependence and possible zero-inflation in the scRNA-seq data. The sample size, which is the number of cells per individual, ranges from 891 to 1,241 in our case study using scRNA-seq data in autism subjects and controls. We consider 1,500 genes in our analysis. Since the number of genes is larger than or comparable to the sample size, we perform penalized estimation. We score each gene's relevance by applying a simple graph kernel smoothing method to each personalized graph. The molecular functions of the top-scored genes are related to autism. For example, a candidate gene RYR2 that encodes protein ryanodine receptor 2 is involved in neurotransmission, a process that is impaired in ASD patients. While our method provides a systematic and unbiased approach to prioritize autism risk genes, the relevance of these genes needs to be further validated in functional studies.
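The idea of scoring genes by their connections to known risk genes can be illustrated with a generic label-propagation sketch. This is not the authors' graph kernel smoothing method, whose details the abstract does not give; the function name `smooth_scores`, the parameters `alpha` and `n_iter`, and the gene names in the toy graph are all hypothetical.

```python
def smooth_scores(adjacency, seed_genes, alpha=0.5, n_iter=20):
    """Propagate seed labels over an undirected graph (illustrative only).

    adjacency:  dict mapping each gene to the set of its neighbours.
    seed_genes: known risk genes; they start at 1.0, all others at 0.0.
    alpha:      weight of the neighbourhood mean versus the initial label.
    """
    initial = {g: (1.0 if g in seed_genes else 0.0) for g in adjacency}
    scores = dict(initial)
    for _ in range(n_iter):
        updated = {}
        for gene, neighbours in adjacency.items():
            if neighbours:
                neighbour_mean = sum(scores[n] for n in neighbours) / len(neighbours)
            else:
                neighbour_mean = 0.0
            # Blend the gene's own prior label with what its neighbours carry.
            updated[gene] = (1 - alpha) * initial[gene] + alpha * neighbour_mean
        scores = updated
    return scores


# Toy graph with hypothetical gene names (SEED* are the "known" risk genes).
toy_graph = {
    "SEED1": {"CAND_A", "CAND_B"},
    "SEED2": {"CAND_A"},
    "CAND_A": {"SEED1", "SEED2"},
    "CAND_B": {"SEED1", "CAND_C"},
    "CAND_C": {"CAND_B"},
}
scores = smooth_scores(toy_graph, seed_genes={"SEED1", "SEED2"})
candidates = [g for g in toy_graph if not g.startswith("SEED")]
ranked = sorted(candidates, key=scores.get, reverse=True)
print(ranked)
```

In this toy graph a candidate linked to both seed genes ranks above one linked to a single seed, which in turn ranks above a gene that is only indirectly connected.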
Affiliation(s)
- Jianyu Liu
  - Department of Statistics and Operations Research, University of North Carolina, Chapel Hill
- Haodong Wang
  - Department of Statistics and Operations Research, University of North Carolina, Chapel Hill
- Wei Sun
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Yufeng Liu
  - Department of Statistics and Operations Research, University of North Carolina, Chapel Hill
  - Department of Genetics, Department of Biostatistics, Carolina Center for Genome Science, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill
156
Multi-Omics Profiling of the Tumor Microenvironment. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:283-326. [DOI: 10.1007/978-3-030-91836-1_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
157
Wu G, Li Y. Distinct characteristics of correlation analysis at the single-cell and the population level. Stat Appl Genet Mol Biol 2022; 21:sagmb-2022-0015. [PMID: 35918809 DOI: 10.1515/sagmb-2022-0015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Accepted: 06/13/2022] [Indexed: 11/15/2022]
Abstract
Correlation analysis is widely used in biological studies to infer molecular relationships within biological networks. Recently, single-cell analysis has drawn tremendous interest for its ability to obtain high-resolution molecular phenotypes. It turns out that there is little overlap between the co-expressed genes identified in single-cell level investigations and those identified in population level investigations. However, the nature of the relationship between correlations at the single-cell and population levels remains unclear. In this manuscript, we aimed to unveil the origin of the differences between the correlation coefficients at the single-cell level and those at the population level, and to bridge the gap between them. By developing formulations to link correlations at the single-cell and the population level, we illustrated that aggregated correlations could be stronger than, weaker than or equal to the corresponding individual correlations, depending on the variations and the correlations within the population. When the correlation within the population is weaker than the individual correlation, the aggregated correlation is stronger than the corresponding individual correlation. In addition, our data indicated that the aggregated correlation is more likely to be stronger than the corresponding individual correlation, and that it is rare to find gene pairs exclusively strongly correlated at the single-cell level. Through a bottom-up approach to model interactions between molecules in a signaling cascade or a multi-regulator-controlled gene expression, we surprisingly found that the existence of an interaction between two components could not be excluded simply based on their low correlation coefficients, suggesting a reconsideration of connectivity within biological networks that was derived solely from correlation analysis. We also investigated the impact of technical random measurement errors on the correlation coefficients for the single-cell level and the population level. The results indicate that the aggregated correlation is relatively robust and less affected. Because of the heterogeneity among single cells, correlation coefficients calculated from single-cell level data might differ from those calculated at the population level. Depending on the specific question being asked, proper sampling and normalization procedures should be applied before any conclusions are drawn.
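One way an aggregated correlation can exceed the single-cell one is sketched in the small simulation below. It is a toy model under stated assumptions, not the formulation derived in the paper: each sample has a shared signal that drives two genes, every cell adds independent noise, and the names `shared`, `n_samples` and `cells_per_sample` as well as the noise levels are chosen purely for illustration.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)          # fixed seed so the run is repeatable
n_samples, cells_per_sample = 200, 50

cell_x, cell_y = [], []         # pooled single-cell measurements
bulk_x, bulk_y = [], []         # per-sample averages ("aggregated" level)

for _ in range(n_samples):
    shared = rng.gauss(0, 1)    # sample-level signal common to both genes
    xs = [shared + rng.gauss(0, 2) for _ in range(cells_per_sample)]
    ys = [shared + rng.gauss(0, 2) for _ in range(cells_per_sample)]
    cell_x.extend(xs)
    cell_y.extend(ys)
    bulk_x.append(mean(xs))
    bulk_y.append(mean(ys))

r_single = pearson(cell_x, cell_y)
r_aggregated = pearson(bulk_x, bulk_y)
print(f"single-cell r = {r_single:.2f}, aggregated r = {r_aggregated:.2f}")
```

Because the independent per-cell noise averages out within each sample while the shared signal does not, the aggregated correlation in this toy setting is much higher than the pooled single-cell correlation. Other noise structures can reverse the ordering, consistent with the abstract's statement that the aggregated correlation may be stronger, weaker or equal.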
Affiliation(s)
- Guoyu Wu
  - School of Clinical Pharmacy, Guangdong Pharmaceutical University, Guangzhou, China
  - Key Specialty of Clinical Pharmacy, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
  - NMPA Key Laboratory for Technology Research and Evaluation of Pharmacovigilance, Guangdong Pharmaceutical University, Guangzhou, China
- Yuchao Li
  - Max Planck Institute for Molecular Genetics, Berlin, Germany
  - MegaLab, MegaRobo Technologies Co., Ltd, Beijing, China
158
Yan M, Hu J, Yuan H, Xu L, Liao G, Jiang Z, Zhu J, Pang B, Ping Y, Zhang Y, Xiao Y, Li X. Dynamic regulatory networks of T cell trajectory dissect transcriptional control of T cell state transition. MOLECULAR THERAPY. NUCLEIC ACIDS 2021; 26:1115-1129. [PMID: 34786214 PMCID: PMC8577129 DOI: 10.1016/j.omtn.2021.10.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/13/2021] [Revised: 09/13/2021] [Accepted: 10/06/2021] [Indexed: 12/26/2022]
Abstract
T cells exhibit heterogeneous functional states, which correlate with responsiveness to immune checkpoint blockade and prognosis of tumor patients. However, the molecular regulatory mechanisms underlying the dynamic process of T cell state transition remain largely unknown. Based on single-cell transcriptome data of T cells in non-small cell lung cancer, we combined cell states and pseudo-times to propose a pipeline to construct dynamic regulatory networks for dissecting the process of T cell dysfunction. Candidate regulators at different stages were revealed in the process of tumor-infiltrating T cell dysfunction. Through comparing dynamic networks across the T cell state transition, we revealed frequent regulatory interaction rewiring and further refined critical regulators mediating each state transition. Several known regulators were identified, including TCF7, EOMES, ID2, and TOX. Notably, one of the critical regulators, TSC22D3, was frequently identified in the state transitions from the intermediate state to the pre-dysfunction and dysfunction state, exerting diverse roles in each state transition by regulatory interaction rewiring. Moreover, higher expression of TSC22D3 was associated with the clinical outcome of tumor patients. Our study embedded transcription factors (TFs) within the temporal dynamic networks, providing a comprehensive view of dynamic regulatory mechanisms controlling the process of T cell state transition.
Affiliation(s)
- Min Yan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Jing Hu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Huating Yuan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Liwen Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Gaoming Liao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Zedong Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Jiali Zhu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Bo Pang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yanyan Ping
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yunpeng Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yun Xiao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
  - Key Laboratory of High Throughput Omics Big Data for Cold Region’s Major Diseases in Heilongjiang Province, Harbin, Heilongjiang 150081, China
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
  - Key Laboratory of High Throughput Omics Big Data for Cold Region’s Major Diseases in Heilongjiang Province, Harbin, Heilongjiang 150081, China
159
Zhang Y, Chen Q, Gong M, Zeng Y, Gao D. Gene regulatory networks analysis of muscle-invasive bladder cancer subtypes using differential graphical model. BMC Genomics 2021; 22:863. [PMID: 34852762 PMCID: PMC8638098 DOI: 10.1186/s12864-021-08113-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
BACKGROUND: Recently, erdafitinib (Balversa), the first targeted therapy drug for a genetic alteration, was approved for metastatic urothelial carcinoma. This has greatly encouraged cancer genomics research. A large number of gene regulatory networks have now been constructed for different states, and these can reveal how genes differ between states. However, they have not been applied to the subtypes of muscle-invasive bladder cancer (MIBC). RESULTS: In this paper, we propose a method that constructs gene regulatory networks under different molecular subtypes of MIBC and analyses the regulatory differences between those subtypes. Through differential expression analysis and differential network analysis of the top 100 differential genes in the network, we find that SERPINI1, NOTUM, FGFR1 and other genes differ significantly in expression and regulatory relationships between MIBC subtypes. CONCLUSIONS: Furthermore, pathway enrichment analysis and differential network analysis demonstrate that Neuroactive ligand-receptor interaction and Cytokine-cytokine receptor interaction are significantly enriched pathways, and that the genes they contain differ significantly among the subtypes of bladder cancer.
Affiliation(s)
- Yongqing Zhang
  - School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Qingyuan Chen
  - School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
- Meiqin Gong
  - West China Second University Hospital, Sichuan University, Chengdu, 610041, China
- Yuanqi Zeng
  - School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
- Dongrui Gao
  - School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
  - School of Life Science and Technology, Center for Information in Medicine, University of Electronic Science and Technology of China, Chengdu, 611731, China
160
Regondi C, Fratelli M, Damia G, Guffanti F, Ganzinelli M, Matteucci M, Masseroli M. Predictive modeling of gene expression regulation. BMC Bioinformatics 2021; 22:571. [PMID: 34837938 PMCID: PMC8626902 DOI: 10.1186/s12859-021-04481-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/12/2021] [Accepted: 11/15/2021] [Indexed: 11/24/2022]
Abstract
Background: In-depth analysis of regulation networks of genes aberrantly expressed in cancer is essential for better understanding tumors and identifying key genes that could be therapeutically targeted. Results: We developed a quantitative analysis approach to investigate the main biological relationships among different regulatory elements and target genes; we applied it to Ovarian Serous Cystadenocarcinoma and 177 target genes belonging to three main pathways (DNA REPAIR, STEM CELLS and GLUCOSE METABOLISM) relevant for this tumor. Combining data from ENCODE and TCGA datasets, we built a predictive linear model for the regulation of each target gene, assessing the relationships between its expression, promoter methylation, expression of genes in the same or in the other pathways and of putative transcription factors. We proved the reliability and significance of our approach in a similar tumor type (basal-like Breast cancer) and using a different existing algorithm (ARACNe), and we obtained experimental confirmations on potentially interesting results. Conclusions: The analysis of the proposed models allowed us to disclose the relations between a gene and its related biological processes, the interconnections between the different gene sets, and the relevant regulatory elements at the single-gene level. This led to the identification of already known regulators and/or gene correlations and unveiled a set of still unknown biological relationships that are potentially interesting for pharmacological and clinical use.
Affiliation(s)
- Chiara Regondi
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133, Milan, Italy
- Maddalena Fratelli
  - Pharmacogenomics Unit, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
- Giovanna Damia
  - Laboratory of Molecular Pharmacology, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
- Federica Guffanti
  - Laboratory of Molecular Pharmacology, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
- Monica Ganzinelli
  - Laboratory of Molecular Pharmacology, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
  - Department of Medical Oncology, Fondazione IRCCS Istituto Nazionale dei Tumori, 20133, Milan, Italy
- Matteo Matteucci
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133, Milan, Italy
- Marco Masseroli
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133, Milan, Italy
161
He L, Lu A, Qin L, Zhang Q, Ling H, Tan D, He Y. Application of single-cell RNA sequencing technology in liver diseases: a narrative review. ANNALS OF TRANSLATIONAL MEDICINE 2021; 9:1598. [PMID: 34790804 PMCID: PMC8576673 DOI: 10.21037/atm-21-4824] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/17/2021] [Accepted: 10/14/2021] [Indexed: 11/26/2022]
Abstract
Objective: This review aimed to summarize the application of single-cell transcriptome sequencing technology in liver diseases. Background: The increasing application of single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) in life science and biomedical research has greatly improved our understanding of cellular heterogeneity in immunology, oncology, and developmental biology. scRNA-seq has proven to be a powerful tool for identifying and classifying cell subsets, characterizing rare or small cell subsets and tracking cell differentiation along dynamic cell stages. Globally, liver disease has high rates of morbidity and mortality, and its exact pathological mechanism remains unclear; current treatment options are limited to clearance of the underlying cause or liver transplantation, neither of which can fully cure liver diseases. scRNA-seq provides many novel insights into healthy and diseased livers. Methods: In this review, we searched for related articles in the PubMed database and summarized the advances of scRNA-seq in revealing the molecular mechanisms of liver development, regeneration, and disease. We also discussed the challenges and future application potential of scRNA-seq, which is expected to enhance the ability to explore the field of liver research and accelerate the clinical application of liver precision medicine. Conclusions: With the continuous improvement of scRNA-seq technology, scRNA-seq is expected to unlock new avenues for liver biology exploration, liver disease diagnosis, and personalized treatment, which will pave the way for breakthrough innovation in personalized medicine.
Affiliation(s)
- Lian He
  - The Key Laboratory of Basic Pharmacology of Ministry of Education and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Zunyi Medical University, Zunyi, China
- Anjing Lu
  - The Key Laboratory of Basic Pharmacology of Ministry of Education and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Zunyi Medical University, Zunyi, China
  - Shanghai Nature-Standard Technology Service Co., Ltd., Shanghai, China
- Lin Qin
  - The Key Laboratory of Basic Pharmacology of Ministry of Education and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Zunyi Medical University, Zunyi, China
- Qianru Zhang
  - The Key Laboratory of Basic Pharmacology of Ministry of Education and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Zunyi Medical University, Zunyi, China
- Hua Ling
  - School of Pharmacy, Georgia Campus-Philadelphia College of Osteopathic Medicine, Suwanee, GA, USA
- Daopeng Tan
  - The Key Laboratory of Basic Pharmacology of Ministry of Education and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Zunyi Medical University, Zunyi, China
- Yuqi He
  - The Key Laboratory of Basic Pharmacology of Ministry of Education and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Zunyi Medical University, Zunyi, China
162
Kashima M, Shida Y, Yamashiro T, Hirata H, Kurosaka H. Intracellular and Intercellular Gene Regulatory Network Inference From Time-Course Individual RNA-Seq. FRONTIERS IN BIOINFORMATICS 2021; 1:777299. [PMID: 36303726 PMCID: PMC9580923 DOI: 10.3389/fbinf.2021.777299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Accepted: 10/26/2021] [Indexed: 11/13/2022]
Abstract
Gene regulatory network (GRN) inference is an effective approach to understand the molecular mechanisms underlying biological events. Generally, GRN inference mainly targets intracellular regulatory relationships such as transcription factors and their associated targets. In multicellular organisms, there are both intracellular and intercellular regulatory mechanisms. Thus, we hypothesize that GRNs inferred from time-course individual (whole embryo) RNA-Seq during development can reveal intercellular regulatory relationships (signaling pathways) underlying the development. Here, we conducted time-course bulk RNA-Seq of individual mouse embryos during early development, followed by pseudo-time analysis and GRN inference. The results demonstrated that GRN inference from RNA-Seq with pseudo-time can be applied for individual bulk RNA-Seq similar to scRNA-Seq. Validation using an experimental-source-based database showed that our approach could significantly infer GRN for all transcription factors in the database. Furthermore, the inferred ligand-related and receptor-related downstream genes were significantly overlapped. Thus, the inferred GRN based on whole organism could include intercellular regulatory relationships, which cannot be inferred from scRNA-Seq based only on gene expression data. Overall, inferring GRN from time-course bulk RNA-Seq is an effective approach to understand the regulatory relationships underlying biological events in multicellular organisms.
Affiliation(s)
- Makoto Kashima
  - College of Science and Engineering, Aoyama Gakuin University, Sagamihara, Japan
- Yuki Shida
  - Department of Orthodontics and Dentofacial Orthopedics, Osaka University, Suita, Japan
- Takashi Yamashiro
  - Department of Orthodontics and Dentofacial Orthopedics, Osaka University, Suita, Japan
- Hiromi Hirata
  - College of Science and Engineering, Aoyama Gakuin University, Sagamihara, Japan
- Hiroshi Kurosaka
  - Department of Orthodontics and Dentofacial Orthopedics, Osaka University, Suita, Japan
163
|
Chen J, Cheong C, Lan L, Zhou X, Liu J, Lyu A, Cheung WK, Zhang L. DeepDRIM: a deep neural network to reconstruct cell-type-specific gene regulatory network using single-cell RNA-seq data. Brief Bioinform 2021; 22:bbab325. [PMID: 34424948 PMCID: PMC8499812 DOI: 10.1093/bib/bbab325] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 05/28/2021] [Revised: 07/12/2021] [Accepted: 07/26/2021] [Indexed: 01/11/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) makes it possible to capture gene activity at single-cell resolution, thus allowing the reconstruction of cell-type-specific gene regulatory networks (GRNs). The available algorithms for reconstructing GRNs are commonly designed for bulk RNA-seq data, and few of them can analyze scRNA-seq data while dealing with dropout events and cellular heterogeneity. In this paper, we represent the joint gene expression distribution of a gene pair as an image and propose a novel supervised deep neural network called DeepDRIM, which uses the image of the target TF-gene pair and those of its potential neighbors to reconstruct GRNs from scRNA-seq data. Because it considers the neighborhood context of each TF-gene pair, DeepDRIM can effectively eliminate the false positives caused by transitive gene-gene interactions. We compared DeepDRIM with nine GRN reconstruction algorithms designed for either bulk or single-cell RNA-seq data. It achieves clearly better performance on the scRNA-seq data collected from eight cell lines. The simulated data show that DeepDRIM is robust to the dropout rate, the cell number and the size of the training data. We further applied DeepDRIM to the scRNA-seq gene expression of B cells from the bronchoalveolar lavage fluid of patients with mild and severe coronavirus disease 2019. We focused on cell-type-specific GRN alterations and observed that the targets of TFs that were differentially expressed between the two statuses were enriched in lysosome, apoptosis, response to decreased oxygen level and microtubule, which have been shown to be associated with coronavirus infection.
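The idea of representing a gene pair's joint expression distribution as an image can be illustrated with a two-dimensional histogram. The sketch below is not DeepDRIM's actual preprocessing: the function name `gene_pair_image`, the `log1p` transform, the bin count and the simulated counts are all assumptions chosen for illustration.

```python
import numpy as np


def gene_pair_image(expr_tf, expr_target, bins=32):
    """Return a bins x bins matrix approximating the joint distribution
    of two genes across cells (hypothetical helper, not the DeepDRIM API)."""
    # Log-transform the counts so a few highly expressing cells do not
    # dominate the bin edges (an assumption, not taken from the abstract).
    x = np.log1p(np.asarray(expr_tf, dtype=float))
    y = np.log1p(np.asarray(expr_target, dtype=float))
    hist, _, _ = np.histogram2d(x, y, bins=bins)
    total = hist.sum()
    # Normalize so the image sums to 1, i.e. an empirical joint distribution.
    return hist / total if total > 0 else hist


# Simulated counts for one TF and one target gene across 500 cells.
rng = np.random.default_rng(0)
tf = rng.poisson(5.0, size=500)
target = rng.poisson(1.0 + 0.5 * tf)
image = gene_pair_image(tf, target, bins=16)
print(image.shape)
```

An image of this kind could then be passed to a convolutional classifier; the network architecture and the neighbor images described in the abstract are outside the scope of this sketch.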
Affiliation(s)
- Jiaxing Chen
  - Department of Computer Science, Hong Kong Baptist University, Waterloo Road, Kowloon Tong, Hong Kong
- ChinWang Cheong
  - Department of Computer Science, Hong Kong Baptist University, Waterloo Road, Kowloon Tong, Hong Kong
- Liang Lan
  - Department of Computer Science, Hong Kong Baptist University, Waterloo Road, Kowloon Tong, Hong Kong
- Xin Zhou
  - Department of Biomedical Engineering, Vanderbilt University, Vanderbilt Place Nashville, 37235, TN, USA
- Jiming Liu
  - Department of Computer Science, Hong Kong Baptist University, Waterloo Road, Kowloon Tong, Hong Kong
- Aiping Lyu
  - School of Chinese Medicine, Hong Kong Baptist University, Waterloo Road, Kowloon Tong, Hong Kong
- William K Cheung
  - Department of Computer Science, Hong Kong Baptist University, Waterloo Road, Kowloon Tong, Hong Kong
- Lu Zhang
  - Department of Computer Science, Hong Kong Baptist University, Waterloo Road, Kowloon Tong, Hong Kong
|
164
|
Integration of functional genomics data to uncover cell type-specific pathways affected in Parkinson's disease. Biochem Soc Trans 2021; 49:2091-2100. [PMID: 34581766 PMCID: PMC8589426 DOI: 10.1042/bst20210128] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/24/2021] [Revised: 08/25/2021] [Accepted: 08/31/2021] [Indexed: 12/22/2022]
Abstract
Parkinson's disease (PD) is the second most prevalent late-onset neurodegenerative disorder worldwide after Alzheimer's disease, and available drugs deliver only temporary symptomatic relief. Loss of dopaminergic neurons (DaNs) in the substantia nigra and intracellular alpha-synuclein inclusions are the main hallmarks of the disease, but the events that cause this degeneration remain uncertain. Although cell types other than DaNs, such as astrocytes, microglia and oligodendrocytes, have recently been associated with the pathogenesis of PD, we still lack an in-depth characterisation of PD-affected brain regions at cell-type resolution that could help our understanding of the disease mechanisms. Nevertheless, publicly available large-scale brain-specific genomic, transcriptomic and epigenomic datasets can be further exploited to extract different layers of cell type-specific biological information for the reconstruction of cell type-specific transcriptional regulatory networks. By intersecting disease risk variants with the networks, it may be possible to study the functional role of these risk variants and their combined effects at the cell-type and pathway levels, which in turn can facilitate the identification of key regulators involved in disease progression, which are often potential therapeutic targets.
|
165
|
Weng G, Kim J, Won KJ. VeTra: a tool for trajectory inference based on RNA velocity. Bioinformatics 2021; 37:3509-3513. [PMID: 33974009 PMCID: PMC8545348 DOI: 10.1093/bioinformatics/btab364] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/16/2021] [Revised: 04/11/2021] [Accepted: 05/10/2021] [Indexed: 11/20/2022] Open
Abstract
MOTIVATION Trajectory inference (TI) for single cell RNA sequencing (scRNAseq) data is a powerful approach to interpret dynamic cellular processes such as cell cycle and development. However, accurate trajectory inference remains challenging. The recent development of RNA velocity provides an approach to visualize cell state transitions without relying on prior knowledge. RESULTS To perform TI and group cells based on RNA velocity, we developed VeTra. By applying cosine similarity and merging weakly connected components, VeTra identifies cell groups from the direction of cell transition. In addition, VeTra suggests key regulators from the inferred trajectory. VeTra is a useful tool for TI and subsequent analysis. AVAILABILITY AND IMPLEMENTATION VeTra is available at https://github.com/wgzgithub/VeTra. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
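The combination of cosine similarity and weakly connected components mentioned in this abstract can be sketched roughly as follows. This is not the VeTra implementation: the function `velocity_groups`, the neighbor count `k`, the threshold `min_cos` and the toy data are assumptions, and the sketch stops at finding components without the merging step the abstract describes.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def velocity_groups(positions, velocities, k=5, min_cos=0.7):
    """Group cells by linking each cell to nearby cells that lie in the
    direction of its velocity (hypothetical helper, not the VeTra API)."""
    positions = np.asarray(positions, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    n = len(positions)
    rows, cols = [], []
    for i in range(n):
        disp = positions - positions[i]       # displacement to every cell
        dist = np.linalg.norm(disp, axis=1)
        dist[i] = np.inf                      # never pick the cell itself
        v_norm = np.linalg.norm(velocities[i])
        if v_norm == 0:
            continue
        for j in np.argsort(dist)[:k]:        # k nearest neighbors
            if not np.isfinite(dist[j]) or dist[j] == 0:
                continue
            # Cosine similarity between the velocity and the displacement.
            cos = disp[j] @ velocities[i] / (dist[j] * v_norm)
            if cos >= min_cos:
                rows.append(i)
                cols.append(j)
    graph = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
    # Weak connectivity ignores edge direction when forming components.
    return connected_components(graph, directed=True, connection="weak")


# Two well-separated streams of cells, both moving along the x axis.
positions = [(0, 0), (1, 0), (2, 0), (3, 0),
             (0, 100), (1, 100), (2, 100), (3, 100)]
velocities = [(1, 0)] * 8
n_groups, labels = velocity_groups(positions, velocities, k=2)
print(n_groups, labels)
```

In this toy case each stream forms its own weakly connected component, so two groups are reported.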
Affiliation(s)
- Guangzheng Weng
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, 2200 Copenhagen N, Denmark
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
  - Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- Junil Kim
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
  - Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
  - Department of Bioinformatics, School of Systems Biomedical Science, Soongsil University, 06978 Seoul, South Korea
- Kyoung Jae Won
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
  - Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
|
166
|
Jeong H, Shin S, Yeom HG. Accurate Single-Cell Clustering through Ensemble Similarity Learning. Genes (Basel) 2021; 12:genes12111670. [PMID: 34828276 PMCID: PMC8623803 DOI: 10.3390/genes12111670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2021] [Revised: 10/10/2021] [Accepted: 10/20/2021] [Indexed: 11/16/2022] Open
Abstract
Single-cell sequencing provides novel means to interpret the transcriptomic profiles of individual cells. In-depth analysis of single-cell sequencing data requires effective computational methods that accurately predict single-cell clusters, because single-cell sequencing techniques provide only the transcriptomic profile of each cell. Although an accurate estimate of cell-to-cell similarity is an essential first step toward reliable single-cell clustering results, it is challenging to obtain because it depends heavily on the genes selected for similarity evaluation, and the optimal set of genes for accurate similarity estimation is typically unknown. Moreover, due to technical limitations, single-cell sequencing data include a large number of artificial zeros, and this technical noise makes it difficult to develop effective single-cell clustering algorithms. Here, we describe a novel single-cell clustering algorithm that can accurately predict single-cell clusters in large-scale single-cell sequencing data by effectively reducing the zero-inflated noise and accurately estimating the cell-to-cell similarities. First, we construct an ensemble similarity network based on different similarity estimates and reduce the artificial noise using a random walk with restart framework. Then, starting from a large number of small but highly consistent clusters, we iteratively merge the pair of clusters with the maximum similarity until the predicted number of clusters is reached. Extensive performance evaluation shows that the proposed single-cell clustering algorithm yields accurate single-cell clustering results and can help decipher the key messages underlying complex biological mechanisms.
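The denoising step named in this abstract, a random walk with restart (RWR) over an ensemble similarity network, can be sketched in a few lines. This is a generic RWR, not the authors' code: averaging the similarity matrices, the restart probability of 0.5 and the toy matrices are assumptions made for illustration.

```python
import numpy as np


def random_walk_with_restart(similarity, restart=0.5, tol=1e-8, max_iter=1000):
    """Diffuse a cell-to-cell similarity matrix with a random walk with
    restart; column j holds the stationary visit probabilities of a walk
    that restarts at cell j (generic sketch, not the published method)."""
    S = np.asarray(similarity, dtype=float)
    col_sums = S.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0             # avoid dividing by zero
    W = S / col_sums                          # column-stochastic transitions
    E = np.eye(S.shape[0])                    # one restart vector per cell
    P = E.copy()
    for _ in range(max_iter):
        P_next = (1 - restart) * W @ P + restart * E
        if np.abs(P_next - P).max() < tol:
            return P_next
        P = P_next
    return P


# Two toy similarity estimates for four cells forming two pairs.
sim_a = np.array([[1.0, 0.9, 0.1, 0.0],
                  [0.9, 1.0, 0.0, 0.1],
                  [0.1, 0.0, 1.0, 0.8],
                  [0.0, 0.1, 0.8, 1.0]])
sim_b = np.array([[1.0, 0.7, 0.0, 0.2],
                  [0.7, 1.0, 0.1, 0.0],
                  [0.0, 0.1, 1.0, 0.9],
                  [0.2, 0.0, 0.9, 1.0]])
ensemble = (sim_a + sim_b) / 2                # simple average (an assumption)
diffused = random_walk_with_restart(ensemble, restart=0.5)
print(np.round(diffused, 3))
```

The diffused matrix keeps high values within each pair of similar cells and low values across pairs, which is the property a subsequent merging step would rely on.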
Affiliation(s)
- Hyundoo Jeong
  - Department of Mechatronics Engineering, Incheon National University, Incheon 22012, Korea
- Sungtae Shin
  - Department of Mechanical Engineering, Dong-A University, Busan 49315, Korea
- Hong-Gi Yeom
  - Department of Electronics Engineering, Chosun University, Gwangju 61452, Korea
|
167
|
Nakajima N, Hayashi T, Fujiki K, Shirahige K, Akiyama T, Akutsu T, Nakato R. Codependency and mutual exclusivity for gene community detection from sparse single-cell transcriptome data. Nucleic Acids Res 2021; 49:e104. [PMID: 34291282 PMCID: PMC8501962 DOI: 10.1093/nar/gkab601] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/07/2021] [Revised: 05/25/2021] [Accepted: 07/04/2021] [Indexed: 12/04/2022] Open
Abstract
Single-cell RNA-seq (scRNA-seq) can be used to characterize cellular heterogeneity in thousands of cells. The reconstruction of a gene network based on coexpression patterns is a fundamental task in scRNA-seq analyses, and the mutual exclusivity of gene expression can be critical for understanding such heterogeneity. Here, we propose an approach for detecting communities from a genetic network constructed on the basis of coexpression properties. The community-based comparison of multiple coexpression networks enables the identification of functionally related gene clusters that cannot be fully captured through differential gene expression-based analysis. We also developed a novel metric referred to as the exclusively expressed index (EEI) that identifies mutually exclusive gene pairs from sparse scRNA-seq data. EEI quantifies and ranks the exclusive expression levels of all gene pairs from binary expression patterns while maintaining robustness against a low sequencing depth. We applied our methods to glioblastoma scRNA-seq data and found that gene communities were partially conserved after serum stimulation despite a considerable number of differentially expressed genes. We also demonstrate that the identification of mutually exclusive gene sets with EEI can improve the sensitivity of capturing cellular heterogeneity. Our methods complement existing approaches and provide new biological insights, even for a large, sparse dataset, in the single-cell analysis field.
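To make the notion of scoring mutually exclusive gene pairs from binary expression patterns concrete, the sketch below binarizes a count matrix and computes one minus the Jaccard index for every gene pair. This is explicitly not the published EEI, whose formula is not given in the abstract: the function `exclusivity_scores`, the threshold and the toy counts are assumptions.

```python
import numpy as np


def exclusivity_scores(counts, threshold=0):
    """Score gene pairs by how rarely they are expressed in the same cell.

    counts is a genes x cells matrix. The score is 1 - Jaccard index on
    binarized expression (a stand-in for illustration, not the EEI)."""
    B = (np.asarray(counts) > threshold).astype(int)   # binary patterns
    both = B @ B.T                            # cells expressing both genes
    n_expr = B.sum(axis=1)
    either = n_expr[:, None] + n_expr[None, :] - both  # cells expressing either
    with np.errstate(divide="ignore", invalid="ignore"):
        score = np.where(either > 0, 1.0 - both / either, 0.0)
    np.fill_diagonal(score, 0.0)              # a gene is not exclusive with itself
    return score


# Three genes across five cells: A and C co-occur, B is exclusive with both.
counts = np.array([[3, 2, 0, 0, 0],    # gene A
                   [0, 0, 4, 1, 0],    # gene B
                   [1, 2, 0, 0, 0]])   # gene C
scores = exclusivity_scores(counts)
print(scores)
```

A naive score like this gives rarely expressed genes a high value simply because they seldom co-occur by chance, so a practical metric would need to account for sparsity and sequencing depth, which the abstract states the EEI is designed to be robust to.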
Affiliation(s)
- Natsu Nakajima
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Tomoatsu Hayashi
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Katsunori Fujiki
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Katsuhiko Shirahige
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Tetsu Akiyama
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
- Ryuichiro Nakato
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
|
168
|
Ma X, Somasundaram A, Qi Z, Hartman D, Singh H, Osmanbeyoglu H. SPaRTAN, a computational framework for linking cell-surface receptors to transcriptional regulators. Nucleic Acids Res 2021; 49:9633-9647. [PMID: 34500467 PMCID: PMC8464045 DOI: 10.1093/nar/gkab745] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/17/2021] [Revised: 08/09/2021] [Accepted: 09/06/2021] [Indexed: 12/22/2022] Open
Abstract
The identity and functions of specialized cell types are dependent on the complex interplay between signaling and transcriptional networks. Recently single-cell technologies have been developed that enable simultaneous quantitative analysis of cell-surface receptor expression with transcriptional states. To date, these datasets have not been used to systematically develop cell-context-specific maps of the interface between signaling and transcriptional regulators orchestrating cellular identity and function. We present SPaRTAN (Single-cell Proteomic and RNA based Transcription factor Activity Network), a computational method to link cell-surface receptors to transcription factors (TFs) by exploiting cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) datasets with cis-regulatory information. SPaRTAN is applied to immune cell types in the blood to predict the coupling of signaling receptors with cell context-specific TFs. Selected predictions are validated by prior knowledge and flow cytometry analyses. SPaRTAN is then used to predict the signaling coupled TF states of tumor infiltrating CD8+ T cells in malignant peritoneal and pleural mesotheliomas. SPaRTAN enhances the utility of CITE-seq datasets to uncover TF and cell-surface receptor relationships in diverse cellular states.
Affiliation(s)
- Xiaojun Ma
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
  - UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Ashwin Somasundaram
  - Department of Medicine, Division of Hematology/Oncology, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Zengbiao Qi
  - UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Douglas J Hartman
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, PA 15213, USA
  - UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Harinder Singh
  - Center for Systems Immunology and Department of Immunology, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Hatice Ulku Osmanbeyoglu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
  - Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15261, USA
  - UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
|
169
|
Yang B. Gene Regulatory Network Identification based on Forest Graph-embedded Deep Feedforward Network. 2021 6TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTERNET OF THINGS 2021. [DOI: 10.1145/3493287.3493297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
|
170
|
Yang B, Bao W, Zhang W, Wang H, Song C, Chen Y, Jiang X. Reverse engineering gene regulatory network based on complex-valued ordinary differential equation model. BMC Bioinformatics 2021; 22:448. [PMID: 34544363 PMCID: PMC8451084 DOI: 10.1186/s12859-021-04367-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/08/2021] [Accepted: 09/09/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A growing body of research in molecular biology reveals that complex life phenomena involve various types of interactions at the genomic level. Establishing the interactions between genes or proteins and understanding the intrinsic mechanisms of biological systems have become an urgent need and a research hotspot. RESULTS To forecast gene expression data and identify more accurate gene regulatory networks, a complex-valued version of the ordinary differential equation (CVODE) model is proposed in this paper. To optimize the CVODE model, a complex-valued hybrid evolutionary method based on grammar-guided genetic programming and a complex-valued firefly algorithm is presented. CONCLUSIONS When tested on three real gene expression datasets from E. coli and human cells, the experimental results suggest that the CVODE model could improve the prediction accuracy of gene expression data by 20-50% and could infer more true-positive and fewer false-positive regulatory relationships than the ordinary differential equation model.
Affiliation(s)
- Bin Yang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Wenzheng Bao
  - School of Information and Electrical Engineering, Xuzhou University of Technology, Xuzhou, 221018, China
- Wei Zhang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Haifeng Wang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Chuandong Song
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Yuehui Chen
  - School of Information Science and Engineering, University of Jinan, Jinan, 250022, China
- Xiuying Jiang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
|
171
|
Raharinirina NA, Peppert F, von Kleist M, Schütte C, Sunkara V. Inferring gene regulatory networks from single-cell RNA-seq temporal snapshot data requires higher-order moments. PATTERNS 2021; 2:100332. [PMID: 34553172 PMCID: PMC8441581 DOI: 10.1016/j.patter.2021.100332] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/03/2021] [Revised: 02/23/2021] [Accepted: 07/22/2021] [Indexed: 11/30/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has become ubiquitous in biology. Recently, there has been a push for using scRNA-seq snapshot data to infer the underlying gene regulatory networks (GRNs) steering cellular function. To date, this aspiration remains unrealized due to technical and computational challenges. In this work we focus on the latter, which is under-represented in the literature. We took a systemic approach by subdividing the GRN inference into three fundamental components: data pre-processing, feature extraction, and inference. We observed that the regulatory signature is captured in the statistical moments of scRNA-seq data and requires computationally intensive minimization solvers to extract it. Furthermore, current data pre-processing might not conserve these statistical moments. Although our moment-based approach is a didactic tool for understanding the different compartments of GRN inference, this line of thinking (finding computationally feasible multi-dimensional statistics of data) is imperative for designing GRN inference methods.
- Single-cell RNA-seq temporal snapshot data for detecting regulation
- Challenges in data pre-processing, feature extraction, and network inference for GRNs
- Encoding of regulatory information in higher-order raw moments
- Non-linear least-squares inference for temporal scRNA-seq snapshot data
Single-cell RNA sequencing (scRNA-seq) has become ubiquitous in biology. Recently, there has been a push for using scRNA-seq snapshot data to infer the underlying gene regulatory networks (GRNs) steering cellular function. A recent benchmark of 12 GRN methods demonstrated that the algorithms struggled to predict the ground-truth GRNs and speculated that the low performance was due to the insufficient resolution in the scRNA-seq data. Rather than proposing another method, this paper focuses on how to decompose a GRN problem into three subproblems (pre-processing, feature extraction, and inference), so that the gene regulatory information is preserved in each step. Subsequently, we discuss how to best approach each of the three subproblems.
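The feature-extraction step described above relies on raw statistical moments of each snapshot. A minimal sketch of computing per-gene raw moments and second-order cross moments is shown below; the function names and the tiny matrix are assumptions, and the non-linear least-squares inference that the paper builds on such features is not reproduced here.

```python
import numpy as np


def raw_moments(snapshot, max_order=3):
    """Per-gene raw moments E[x^k] for k = 1..max_order.

    snapshot is a cells x genes matrix measured at one time point
    (hypothetical helper for illustration)."""
    X = np.asarray(snapshot, dtype=float)
    return {k: (X ** k).mean(axis=0) for k in range(1, max_order + 1)}


def cross_moments(snapshot):
    """Second-order raw cross moments E[x_i * x_j] for all gene pairs."""
    X = np.asarray(snapshot, dtype=float)
    return X.T @ X / X.shape[0]


# Two cells, two genes: small enough to check by hand.
snapshot = np.array([[1.0, 2.0],
                     [3.0, 4.0]])
moments = raw_moments(snapshot, max_order=3)
pairwise = cross_moments(snapshot)
print(moments[1], moments[2], moments[3])
print(pairwise)
```

Computing these quantities on raw counts and again after a normalization step is one simple way to see whether pre-processing preserves them, which is the concern the abstract raises.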
Affiliation(s)
- Felix Peppert
  - Explainable A.I. for Biology, Zuse Institute Berlin, 14195 Berlin, Germany
- Max von Kleist
  - MF1 Bioinformatics, Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany
- Christof Schütte
  - Mathematics of Complex Systems, Zuse Institute Berlin, 14195 Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Vikram Sunkara
  - Mathematics of Complex Systems, Zuse Institute Berlin, 14195 Berlin, Germany
  - Explainable A.I. for Biology, Zuse Institute Berlin, 14195 Berlin, Germany
|
172
|
Li H. Single-cell RNA sequencing in Drosophila: Technologies and applications. WILEY INTERDISCIPLINARY REVIEWS. DEVELOPMENTAL BIOLOGY 2021; 10:e396. [PMID: 32940008 PMCID: PMC7960577 DOI: 10.1002/wdev.396] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 06/22/2020] [Revised: 08/09/2020] [Accepted: 08/20/2020] [Indexed: 12/12/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool for investigating cell states and functions at the single-cell level. It has greatly revolutionized transcriptomic studies in many life science research fields, such as neurobiology, immunology, and developmental biology. With the fast development of both experimental platforms and bioinformatics approaches over the past decade, scRNA-seq is becoming economically feasible and experimentally practical for many biomedical laboratories. Drosophila has served as an excellent model organism for dissecting cellular and molecular mechanisms that underlie tissue development, adult cell function, disease, and aging. The recent application of scRNA-seq methods to Drosophila tissues has led to a number of exciting discoveries. In this review, I will provide a summary of recent scRNA-seq studies in Drosophila, focusing on technical approaches and biological applications. I will also discuss current challenges and future opportunities of making new discoveries using scRNA-seq in Drosophila. This article is categorized under: Technologies > Analysis of the Transcriptome.
Affiliation(s)
- Hongjie Li
  - Department of Biology, Stanford University, Stanford, California, USA
|
173
|
Saint-André V. Computational biology approaches for mapping transcriptional regulatory networks. Comput Struct Biotechnol J 2021; 19:4884-4895. [PMID: 34522292 PMCID: PMC8426465 DOI: 10.1016/j.csbj.2021.08.028] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/24/2021] [Revised: 08/16/2021] [Accepted: 08/16/2021] [Indexed: 12/13/2022] Open
Abstract
Transcriptional Regulatory Networks (TRNs) are mainly responsible for the cell-type- or cell-state-specific expression of gene sets from the same DNA sequence. However, so far there are no precise maps of TRNs available for each cell-type or cell-state, and no ideal tool to map those networks clearly and in full from biological samples. In this review, major approaches and tools to map TRNs from high-throughput data are presented, depending on the type of methods or data used to infer them, and their advantages and limitations are discussed. After summarizing the main principles defining the topology and structure–function relationships in TRNs, an overview of the extensive work done to map TRNs from bulk transcriptomic data will be presented by type of methodological approach. Most recent modellings of TRNs using other types of molecular data or integrating different data types, including single-cell RNA-sequencing and chromatin information, will then be discussed, before briefly concluding with improvements expected to come in the field.
Affiliation(s)
- Violaine Saint-André
  - Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, Paris, France
|
174
|
Davis-Marcisak EF, Deshpande A, Stein-O'Brien GL, Ho WJ, Laheru D, Jaffee EM, Fertig EJ, Kagohara LT. From bench to bedside: Single-cell analysis for cancer immunotherapy. Cancer Cell 2021; 39:1062-1080. [PMID: 34329587 PMCID: PMC8406623 DOI: 10.1016/j.ccell.2021.07.004] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 03/02/2021] [Revised: 06/16/2021] [Accepted: 07/02/2021] [Indexed: 01/04/2023]
Abstract
Single-cell technologies are emerging as powerful tools for cancer research. These technologies characterize the molecular state of each cell within a tumor, enabling new exploration of tumor heterogeneity, microenvironment cell-type composition, and cell state transitions that affect therapeutic response, particularly in the context of immunotherapy. Analyzing clinical samples has great promise for precision medicine but is technically challenging. Successfully identifying predictors of response requires well-coordinated, multi-disciplinary teams to ensure adequate sample processing for high-quality data generation and computational analysis for data interpretation. Here, we review current approaches to sample processing and computational analysis regarding their application to translational cancer immunotherapy research.
Affiliation(s)
- Emily F Davis-Marcisak
  - McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, 550 N Broadway, Suite 1101E, Baltimore, MD 21205, USA
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Atul Deshpande
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Genevieve L Stein-O'Brien
  - McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, 550 N Broadway, Suite 1101E, Baltimore, MD 21205, USA
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Won J Ho
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Daniel Laheru
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Elizabeth M Jaffee
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Elana J Fertig
  - McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, 550 N Broadway, Suite 1101E, Baltimore, MD 21205, USA
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Applied Mathematics and Statistics, Johns Hopkins University Whiting School of Engineering, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Luciane T Kagohara
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA
  - Convergence Institute, Johns Hopkins University, Baltimore, MD, USA
  - Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
|
175
|
Mishra S, Srivastava D, Kumar V. Improving gene network inference with graph wavelets and making insights about ageing-associated regulatory changes in lungs. Brief Bioinform 2021; 22:bbaa360. [PMID: 33381809 PMCID: PMC7799288 DOI: 10.1093/bib/bbaa360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/04/2020] [Revised: 10/12/2020] [Accepted: 11/10/2020] [Indexed: 01/20/2023] Open
Abstract
Using a gene-regulatory-network-based approach for single-cell expression profiles can reveal unprecedented details about the effects of external and internal factors. However, noise and batch effects in sparse single-cell expression profiles can hamper correct estimation of dependencies among genes and regulatory changes. Here, we devise a conceptually different method using graph-wavelet filters for improving gene network (GWNet)-based analysis of the transcriptome. Our approach improved the performance of several gene network-inference methods. Most importantly, GWNet improved consistency in the prediction of gene regulatory networks using single-cell transcriptomes even in the presence of batch effects. The consistency of the predicted gene networks enabled reliable estimates of changes in the influence of genes not highlighted by differential-expression analysis. Applying GWNet to the single-cell transcriptome profile of lung cells revealed biologically relevant changes in the influence of pathways and master regulators due to ageing. Surprisingly, the regulatory influence of ageing on type II pneumocytes showed noticeable similarity with patterns due to the effect of novel coronavirus infection in the human lung.
|
176
|
scLink: Inferring Sparse Gene Co-expression Networks from Single-cell Expression Data. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 19:475-492. [PMID: 34252628 PMCID: PMC8896229 DOI: 10.1016/j.gpb.2020.11.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/28/2020] [Revised: 10/23/2020] [Accepted: 12/26/2020] [Indexed: 11/23/2022]
Abstract
A system-level understanding of the regulation and coordination mechanisms of gene expression is essential for studying the complexity of biological processes in health and disease. With the rapid development of single-cell RNA sequencing technologies, it is now possible to investigate gene interactions in a cell type-specific manner. Here we propose the scLink method, which uses statistical network modeling to understand the co-expression relationships among genes and construct sparse gene co-expression networks from single-cell gene expression data. We use both simulation and real data studies to demonstrate the advantages of scLink and its ability to improve single-cell gene network analysis. The scLink R package is available at https://github.com/Vivianstats/scLink.
|
177
|
Shu H, Zhou J, Lian Q, Li H, Zhao D, Zeng J, Ma J. Modeling gene regulatory networks using neural network architectures. NATURE COMPUTATIONAL SCIENCE 2021; 1:491-501. [PMID: 38217125 DOI: 10.1038/s43588-021-00099-8] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Received: 03/25/2021] [Accepted: 06/15/2021] [Indexed: 01/15/2024]
Abstract
Gene regulatory networks (GRNs) encode the complex molecular interactions that govern cell identity. Here we propose DeepSEM, a deep generative model that can jointly infer GRNs and biologically meaningful representation of single-cell RNA sequencing (scRNA-seq) data. In particular, we developed a neural network version of the structural equation model (SEM) to explicitly model the regulatory relationships among genes. Benchmark results show that DeepSEM achieves comparable or better performance on a variety of single-cell computational tasks, such as GRN inference, scRNA-seq data visualization, clustering and simulation, compared with the state-of-the-art methods. In addition, the gene regulations predicted by DeepSEM on cell-type marker genes in the mouse cortex can be validated by epigenetic data, which further demonstrates the accuracy and efficiency of our method. DeepSEM can provide a useful and powerful tool to analyze scRNA-seq data and infer a GRN.
Affiliation(s)
- Hantao Shu
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jingtian Zhou
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Qiuyu Lian
  - UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China
  - Department of Automation, Shanghai Jiao Tong University, Shanghai, China
- Han Li
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Dan Zhao
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jianyang Zeng
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jianzhu Ma
  - Institute for Artificial Intelligence, Peking University, Beijing, China
|
178
|
Tripathi RK, Wilkins O. Single cell gene regulatory networks in plants: Opportunities for enhancing climate change stress resilience. PLANT, CELL & ENVIRONMENT 2021; 44:2006-2017. [PMID: 33522607 PMCID: PMC8359182 DOI: 10.1111/pce.14012] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/01/2020] [Revised: 01/21/2021] [Accepted: 01/22/2021] [Indexed: 05/05/2023]
Abstract
Global warming poses major challenges for plant survival and agricultural productivity. Thus, efforts to enhance stress resilience in plants are key strategies for protecting food security. Gene regulatory networks (GRNs) are a critical mechanism conferring stress resilience. Until recently, predicting GRNs of the individual cells that make up plants and other multicellular organisms was impeded by aggregate population scale measurements of transcriptome and other genome-scale features. With the advancement of high-throughput single cell RNA-seq and other single cell assays, learning GRNs for individual cells is now possible, in principle. In this article, we report on recent advances in experimental and analytical methodologies for single cell sequencing assays especially as they have been applied to the study of plants. We highlight recent advances and ongoing challenges for scGRN prediction, and finally, we highlight the opportunity to use scGRN discovery for studying and ultimately enhancing abiotic stress resilience in plants.
Affiliation(s)
- Rajiv K. Tripathi
  - Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Olivia Wilkins
  - Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
|
179
|
Bartlett TE, Kosmidis I, Silva R. Two-way sparsity for time-varying networks with applications in genomics. Ann Appl Stat 2021. [DOI: 10.1214/20-aoas1416] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
|
180
|
Katebi A, Ramirez D, Lu M. Computational systems-biology approaches for modeling gene networks driving epithelial-mesenchymal transitions. COMPUTATIONAL AND SYSTEMS ONCOLOGY 2021; 1:e1021. [PMID: 34164628 PMCID: PMC8219219 DOI: 10.1002/cso2.1021] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/06/2023]
Abstract
Epithelial-mesenchymal transition (EMT) is an important biological process through which epithelial cells undergo phenotypic transitions to mesenchymal cells by losing cell-cell adhesion and gaining migratory properties that cells use in embryogenesis, wound healing, and cancer metastasis. An important research topic is to identify the underlying gene regulatory networks (GRNs) governing the decision making of EMT and develop predictive models based on the GRNs. The advent of recent genomic technology, such as single-cell RNA sequencing, has opened new opportunities to improve our understanding about the dynamical controls of EMT. In this article, we review three major types of computational and mathematical approaches and methods for inferring and modeling GRNs driving EMT. We emphasize (1) the bottom-up approaches, where GRNs are constructed through literature search; (2) the top-down approaches, where GRNs are derived from genome-wide sequencing data; (3) the combined top-down and bottom-up approaches, where EMT GRNs are constructed and simulated by integrating bioinformatics and mathematical modeling. We discuss the methodologies and applications of each approach and the available resources for these studies.
Affiliation(s)
- Ataur Katebi
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, Massachusetts, USA
- Daniel Ramirez
  - Center for Theoretical Biological Physics, Northeastern University, Boston, Massachusetts, USA
  - College of Health Solutions, Arizona State University, Tempe, Arizona, USA
- Mingyang Lu
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, Massachusetts, USA
|
181
|
Stein-O'Brien GL, Ainsile MC, Fertig EJ. Forecasting cellular states: from descriptive to predictive biology via single-cell multiomics. CURRENT OPINION IN SYSTEMS BIOLOGY 2021; 26:24-32. [PMID: 34660940 PMCID: PMC8516130 DOI: 10.1016/j.coisb.2021.03.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/17/2022]
Abstract
As the single cell field races to characterize each cell type, state, and behavior, the complexity of the computational analysis approaches the complexity of the biological systems. Single cell and imaging technologies now enable unprecedented measurements of state transitions in biological systems, providing high-throughput data that capture tens-of-thousands of measurements on hundreds-of-thousands of samples. Thus, the definition of cell type and state is evolving to encompass the broad range of biological questions now attainable. To answer these questions requires the development of computational tools for integrated multi-omics analysis. Merged with mathematical models, these algorithms will be able to forecast future states of biological systems, going from statistical inferences of phenotypes to time course predictions of the biological systems with dynamic maps analogous to weather systems. Thus, systems biology for forecasting biological system dynamics from multi-omic data represents the future of cell biology empowering a new generation of technology-driven predictive medicine.
Affiliation(s)
- Genevieve L Stein-O'Brien
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
  - Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD
  - McKusick-Nathans Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD
  - Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD
  - Convergence Institute, Johns Hopkins University, Baltimore, MD
- Michaela C Ainsile
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
- Elana J Fertig
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
  - Convergence Institute, Johns Hopkins University, Baltimore, MD
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD
  - Department of Applied Mathematics & Statistics, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD
|
182
|
Liu J, Fan Z, Zhao W, Zhou X. Machine Intelligence in Single-Cell Data Analysis: Advances and New Challenges. Front Genet 2021; 12:655536. [PMID: 34135939 PMCID: PMC8203333 DOI: 10.3389/fgene.2021.655536] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 01/19/2021] [Accepted: 04/26/2021] [Indexed: 12/18/2022]
Abstract
The rapid development of single-cell technologies allows for dissecting cellular heterogeneity at different omics layers with an unprecedented resolution. In-depth analysis of cellular heterogeneity will boost our understanding of complex biological systems or processes, including cancer, the immune system and chronic diseases, thereby providing valuable insights for clinical and translational research. In this review, we will focus on the application of machine learning methods in single-cell multi-omics data analysis. We will start with the pre-processing of single-cell RNA sequencing (scRNA-seq) data, including data imputation, cross-platform batch effect removal, and cell cycle and cell-type identification. Next, we will introduce advanced data analysis tools and methods used for copy number variation estimation, single-cell pseudo-time trajectory analysis, phylogenetic tree inference, cell-cell interaction, regulatory network inference, and integrated analysis of scRNA-seq and spatial transcriptome data. Finally, we will present the latest analytical challenges, such as multi-omics integration and integrated analysis of scRNA-seq data.
Affiliation(s)
- Jiajia Liu
  - College of Electronic and Information Engineering, Tongji University, Shanghai, China
  - School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
- Zhiwei Fan
  - School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
  - West China School of Public Health, West China Fourth Hospital, Sichuan University, Chengdu, China
- Weiling Zhao
  - School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
- Xiaobo Zhou
  - School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
|
183
|
Nguyen H, Tran D, Tran B, Pehlivan B, Nguyen T. A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief Bioinform 2021; 22:bbaa190. [PMID: 34020546 PMCID: PMC8138892 DOI: 10.1093/bib/bbaa190] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 10/07/2019] [Revised: 06/19/2020] [Accepted: 07/24/2020] [Indexed: 12/13/2022]
Abstract
A gene regulatory network is a complicated set of interactions between genetic materials, which dictates how cells develop in living organisms and react to their surrounding environment. Robust comprehension of these interactions would help explain how cells function as well as predict their reactions to external factors. This knowledge can benefit both developmental biology and clinical research such as drug development or epidemiology research. Recently, the rapid advance of single-cell sequencing technologies, which pushed the limit of transcriptomic profiling to the individual cell level, has opened up an entirely new area for regulatory network research. To exploit this new abundant source of data and take advantage of data at single-cell resolution, a number of computational methods have been proposed to uncover the interactions hidden by the averaging process in standard bulk sequencing. In this article, we review 15 such network inference methods developed for single-cell data. We discuss their underlying assumptions, inference techniques, usability, and pros and cons. In an extensive analysis using simulation, we also assess the methods' performance, sensitivity to dropout and time complexity. The main objective of this survey is to assist not only life scientists in selecting suitable methods for their data and analysis purposes but also computational scientists in developing new methods by highlighting outstanding challenges in the field that remain to be addressed in future development.
Affiliation(s)
- Hung Nguyen
  - Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
- Duc Tran
  - Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
- Bang Tran
  - Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
- Bahadir Pehlivan
  - Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
- Tin Nguyen
  - Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
|
184
|
Gan Y, Xin Y, Hu X, Zou G. Inferring gene regulatory network from single-cell transcriptomic data by integrating multiple prior networks. Comput Biol Chem 2021; 93:107512. [PMID: 34044202 DOI: 10.1016/j.compbiolchem.2021.107512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/02/2021] [Accepted: 05/12/2021] [Indexed: 11/29/2022]
Abstract
A gene regulatory network models the interactions between transcription factors and target genes. Reconstructing gene regulatory networks is critically important for understanding gene function in a particular cellular context, providing key insights into complex biological systems. We develop a new computational method, named iMPRN, which integrates multiple prior networks to infer a regulatory network. Based on the network component analysis model, iMPRN adopts linear regression, graph embedding, and elastic networks to optimize each prior network in line with the specific biological context. For each rewired prior network, iMPRN evaluates the confidence of the regulatory edges based on B scores and finally integrates these optimized networks. We validate the effectiveness of iMPRN by comparing it with four widely used gene regulatory network reconstruction algorithms on a simulation data set. The results show that iMPRN can infer the gene regulatory network more accurately. Further, on a real scRNA-seq dataset, iMPRN is applied to reconstruct gene regulatory networks for malignant and nonmalignant head and neck tumor cells, respectively, demonstrating distinctive differences in their corresponding regulatory networks.
Affiliation(s)
- Yanglan Gan
  - School of Computer Science and Technology, Donghua University, Shanghai, China
- Yongchang Xin
  - School of Computer Science and Technology, Donghua University, Shanghai, China
- Xin Hu
  - School of Computer Science and Technology, Donghua University, Shanghai, China
- Guobing Zou
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
|
185
|
Mitra R, MacLean AL. RVAgene: Generative modeling of gene expression time series data. Bioinformatics 2021; 37:3252-3262. [PMID: 33974008 PMCID: PMC8504625 DOI: 10.1093/bioinformatics/btab260] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/17/2020] [Revised: 04/19/2021] [Accepted: 04/22/2021] [Indexed: 12/04/2022]
Abstract
Motivation: Methods to model dynamic changes in gene expression at a genome-wide level are not currently sufficient for large (temporally rich or single-cell) datasets. Variational autoencoders offer means to characterize large datasets and have been used effectively to characterize features of single-cell datasets. Here, we extend these methods for use with gene expression time series data. Results: We present RVAgene: a recurrent variational autoencoder to model gene expression dynamics. RVAgene learns to accurately and efficiently reconstruct temporal gene profiles. It also learns a low dimensional representation of the data via a recurrent encoder network that can be used for biological feature discovery, and from which we can generate new gene expression data by sampling the latent space. We test RVAgene on simulated and real biological datasets, including embryonic stem cell differentiation and kidney injury response dynamics. In all cases, RVAgene accurately reconstructed complex gene expression temporal profiles. Via cross validation, we show that a low-error latent space representation can be learnt using only a fraction of the data. Through clustering and gene ontology term enrichment analysis on the latent space, we demonstrate the potential of RVAgene for unsupervised discovery. In particular, RVAgene identifies new programs of shared gene regulation of Lox family genes in response to kidney injury. Availability and implementation: All datasets analyzed in this manuscript are publicly available and have been published previously. RVAgene is available in Python, at GitHub: https://github.com/maclean-lab/RVAgene; Zenodo archive: http://doi.org/10.5281/zenodo.4271097. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Raktim Mitra
  - Quantitative and Computational Biology, University of Southern California, Los Angeles, CA-90007, USA
- Adam L MacLean
  - Quantitative and Computational Biology, University of Southern California, Los Angeles, CA-90007, USA
|
186
|
Li L, Xiong F, Wang Y, Zhang S, Gong Z, Li X, He Y, Shi L, Wang F, Liao Q, Xiang B, Zhou M, Li X, Li Y, Li G, Zeng Z, Xiong W, Guo C. What are the applications of single-cell RNA sequencing in cancer research: a systematic review. JOURNAL OF EXPERIMENTAL & CLINICAL CANCER RESEARCH : CR 2021; 40:163. [PMID: 33975628 PMCID: PMC8111731 DOI: 10.1186/s13046-021-01955-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 01/26/2021] [Accepted: 04/20/2021] [Indexed: 12/18/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) is a tool for studying gene expression at the single-cell level that has been widely used due to its unprecedented high resolution. In the present review, we outline the preparation process and sequencing platforms for the scRNA-seq analysis of solid tumor specimens and discuss the main steps and methods used during data analysis, including quality control, batch-effect correction, normalization, cell cycle phase assignment, clustering, cell trajectory and pseudo-time reconstruction, differential expression analysis and gene set enrichment analysis, as well as gene regulatory network inference. Traditional bulk RNA sequencing does not address the heterogeneity within and between tumors, and since the development of the first scRNA-seq technique, this approach has been widely used in cancer research to better understand cancer cell biology and pathogenetic mechanisms. ScRNA-seq has been of great significance for the development of targeted therapy and immunotherapy. In the second part of this review, we focus on the application of scRNA-seq in solid tumors, and summarize the findings and achievements in tumor research afforded by its use. ScRNA-seq holds promise for improving our understanding of the molecular characteristics of cancer, and potentially contributing to improved diagnosis, prognosis, and therapeutics.
Affiliation(s)
- Lvyuan Li
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Fang Xiong
  - Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
- Yumin Wang
  - Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China; Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
- Shanshan Zhang
  - Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
- Zhaojian Gong
  - Department of Oral and Maxillofacial Surgery, The Second Xiangya Hospital, Central South University, Changsha, China
- Xiayu Li
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, China
- Yi He
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China
- Lei Shi
  - Department of Oral and Maxillofacial Surgery, The Second Xiangya Hospital, Central South University, Changsha, China
- Fuyan Wang
  - Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Qianjin Liao
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China
- Bo Xiang
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Ming Zhou
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Xiaoling Li
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Yong Li
  - Department of Medicine, Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA
- Guiyuan Li
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Zhaoyang Zeng
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Wei Xiong
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
- Can Guo
  - NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
|
187
|
Li X, Zhang W, Zhang J, Li G. ModularBoost: an efficient network inference algorithm based on module decomposition. BMC Bioinformatics 2021; 22:153. [PMID: 33761871 PMCID: PMC7992795 DOI: 10.1186/s12859-021-04074-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2020] [Accepted: 03/11/2021] [Indexed: 11/15/2022]
Abstract
Background: Given expression data, gene regulatory network (GRN) inference approaches try to determine regulatory relations. However, current inference methods ignore the inherent topological characteristics of GRNs to some extent, leading to structures that lack clear biological explanation. To increase the biophysical meaning of inferred networks, this study performed data-driven module detection before network inference. Gene modules were identified by decomposition-based methods. Results: ICA-decomposition-based module detection methods have been used to detect functional modules directly from transcriptomic data. Experiments on time-series expression, curated and scRNA-seq datasets suggested advantages of the proposed ModularBoost method over established methods, especially in efficiency and accuracy. For scRNA-seq datasets, the ModularBoost method outperformed other candidate inference algorithms. Conclusions: As a complicated task, GRN inference can be decomposed into several tasks of reduced complexity. Using identified gene modules as topological constraints, the initial inference problem can be solved by inferring intra-modular and inter-modular interactions separately. Experimental outcomes suggest that the proposed ModularBoost method can improve the accuracy and efficiency of inference algorithms by introducing topological constraints.
Affiliation(s)
- Xinyu Li
  - State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China
- Wei Zhang
  - State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China
- Jianming Zhang
  - State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China
- Guang Li
  - State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China
|
188
|
Kang Y, Thieffry D, Cantini L. Evaluating the Reproducibility of Single-Cell Gene Regulatory Network Inference Algorithms. Front Genet 2021; 12:617282. [PMID: 33828580 PMCID: PMC8019823 DOI: 10.3389/fgene.2021.617282] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/14/2020] [Accepted: 02/24/2021] [Indexed: 12/13/2022]
Abstract
Networks are powerful tools to represent and investigate biological systems. The development of algorithms inferring regulatory interactions from functional genomics data has been an active area of research. With the advent of single-cell RNA-seq data (scRNA-seq), numerous methods specifically designed to take advantage of single-cell datasets have been proposed. However, published benchmarks on single-cell network inference are mostly based on simulated data. When applied to real data, these benchmarks take into account only a small set of genes and only compare the inferred networks with an imposed ground-truth. Here, we benchmark six single-cell network inference methods based on their reproducibility, i.e., their ability to infer similar networks when applied to two independent datasets for the same biological condition. We tested each of these methods on real data from three biological conditions: human retina, T-cells in colorectal cancer, and human hematopoiesis. When networks with up to 100,000 links are taken into account, GENIE3 proves to be the most reproducible algorithm and, together with GRNBoost2, shows higher intersection with ground-truth biological interactions. These results are independent of the single-cell sequencing platform, the cell type annotation system and the number of cells constituting the dataset. Finally, GRNBoost2 and CLR show more reproducible performance once a more stringent thresholding is applied to the networks (1,000–100 links). In order to ensure the reproducibility and ease extension of this benchmark study, we implemented all the analyses in scNET, a Jupyter notebook available at https://github.com/ComputationalSystemsBiology/scNET.
Affiliation(s)
- Yoonjee Kang
- Computational Systems Biology Team, Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR 8197, INSERM U1024, Ecole Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
| | - Denis Thieffry
- Computational Systems Biology Team, Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR 8197, INSERM U1024, Ecole Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
| | - Laura Cantini
- Computational Systems Biology Team, Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR 8197, INSERM U1024, Ecole Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
| |
|
189
|
A single-cell analysis of the Arabidopsis vegetative shoot apex. Dev Cell 2021; 56:1056-1074.e8. [PMID: 33725481 DOI: 10.1016/j.devcel.2021.02.021] [Citation(s) in RCA: 164] [Impact Index Per Article: 41.0] [Received: 08/07/2020] [Revised: 12/06/2020] [Accepted: 02/19/2021] [Indexed: 01/13/2023]
Abstract
The shoot apical meristem allows for reiterative formation of new aerial structures throughout the life cycle of a plant. We use single-cell RNA sequencing to define the cellular taxonomy of the Arabidopsis vegetative shoot apex at the transcriptome level. We find that the shoot apex is composed of highly heterogeneous cells, which can be partitioned into 7 broad populations with 23 transcriptionally distinct cell clusters. We delineate cell-cycle continuums and developmental trajectories of epidermal cells, vascular tissue, and leaf mesophyll cells and infer transcription factors and gene expression signatures associated with cell fate decisions. Integrative analysis of shoot and root apical cell populations further reveals common and distinct features of epidermal and vascular tissues. Our results, thus, offer a valuable resource for investigating the basic principles underlying cell division and differentiation in plants at single-cell resolution.
|
190
|
c-CSN: Single-cell RNA Sequencing Data Analysis by Conditional Cell-specific Network. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 19:319-329. [PMID: 33684532 PMCID: PMC8602759 DOI: 10.1016/j.gpb.2020.05.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 10/15/2019] [Revised: 04/13/2020] [Accepted: 07/08/2020] [Indexed: 12/28/2022]
Abstract
The rapid advancement of single-cell technologies has shed new light on the complex mechanisms of cellular heterogeneity. However, compared to bulk RNA sequencing (RNA-seq), single-cell RNA-seq (scRNA-seq) suffers from higher noise and lower coverage, which brings new computational difficulties. Based on statistical independence, the cell-specific network (CSN) is able to quantify the overall associations between genes for each cell, yet it suffers from a problem of overestimation related to indirect effects. To overcome this problem, we propose the c-CSN method, which can construct the conditional cell-specific network (CCSN) for each cell. The c-CSN method can measure the direct associations between genes by eliminating the indirect associations. c-CSN can be used for cell clustering and dimension reduction on a network basis of single cells. Intuitively, each CCSN can be viewed as the transformation from less “reliable” gene expression to more “reliable” gene–gene associations in a cell. Based on CCSN, we further design network flow entropy (NFE) to estimate the differentiation potency of a single cell. A number of scRNA-seq datasets were used to demonstrate the advantages of our approach. 1) One direct association network is generated for one cell. 2) Most existing scRNA-seq methods designed for gene expression matrices are also applicable to c-CSN-transformed degree matrices. 3) CCSN-based NFE helps resolve the direction of differentiation trajectories by quantifying the potency of each cell. c-CSN is publicly available at https://github.com/LinLi-0909/c-CSN.
|
191
|
Sun X, Zhang J, Nie Q. Inferring latent temporal progression and regulatory networks from cross-sectional transcriptomic data of cancer samples. PLoS Comput Biol 2021; 17:e1008379. [PMID: 33667222 PMCID: PMC7968745 DOI: 10.1371/journal.pcbi.1008379] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/03/2020] [Revised: 03/17/2021] [Accepted: 02/15/2021] [Indexed: 12/19/2022] Open
Abstract
Unraveling molecular regulatory networks underlying disease progression is critically important for understanding disease mechanisms and identifying drug targets. The existing methods for inferring gene regulatory networks (GRNs) rely mainly on time-course gene expression data. However, most available omics data from cross-sectional studies of cancer patients often lack sufficient temporal information, leading to a key challenge for GRN inference. Through quantifying the latent progression using random walks-based manifold distance, we propose a latent-temporal progression-based Bayesian method, PROB, for inferring GRNs from the cross-sectional transcriptomic data of tumor samples. The robustness of PROB to the measurement variabilities in the data is mathematically proved and numerically verified. Performance evaluation on real data indicates that PROB outperforms other methods in both pseudotime inference and GRN inference. Applications to bladder cancer and breast cancer demonstrate that our method is effective to identify key regulators of cancer progression or drug targets. The identified ACSS1 is experimentally validated to promote epithelial-to-mesenchymal transition of bladder cancer cells, and the predicted FOXM1-targets interactions are verified and are predictive of relapse in breast cancer. Our study suggests new effective ways to clinical transcriptomic data modeling for characterizing cancer progression and facilitates the translation of regulatory network-based approaches into precision medicine.
Affiliation(s)
- Xiaoqiang Sun
- Key Laboratory of Tropical Disease Control, Chinese Ministry of Education; Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- School of Mathematics, Sun Yat-sen University, Guangzhou, China
| | - Ji Zhang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong, China
| | - Qing Nie
- Department of Mathematics and Department of Developmental & Cell Biology, NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, California, United States of America
| |
|
192
|
Zhang Y, Chang X, Liu X. Inference of gene regulatory networks using pseudo-time series data. Bioinformatics 2021; 37:2423-2431. [PMID: 33576787 DOI: 10.1093/bioinformatics/btab099] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/04/2020] [Revised: 01/18/2021] [Accepted: 02/10/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Inferring gene regulatory networks (GRNs) from high-throughput data is an important and challenging problem in systems biology. Although numerous GRN methods have been developed, most have focused on the verification of the specific data set. However, it is difficult to establish directed topological networks that are both suitable for time-series and non-time-series datasets due to the complexity and diversity of biological networks. RESULTS: Here, we proposed a novel method, GNIPLR (Gene networks inference based on projection and lagged regression) to infer GRNs from time-series or non-time-series gene expression data. GNIPLR projected gene data twice using the LASSO projection (LSP) algorithm and the linear projection (LP) approximation to produce a linear and monotonous pseudo-time series, and then determined the direction of regulation in combination with lagged regression analyses. The proposed algorithm was validated using simulated and real biological data. Moreover, we also applied the GNIPLR algorithm to the liver hepatocellular carcinoma (LIHC) and bladder urothelial carcinoma (BLCA) cancer expression datasets. These analyses revealed significantly higher accuracy and AUC values than other popular methods. AVAILABILITY: The GNIPLR tool is freely available at https://github.com/zyllluck/GNIPLR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuelei Zhang
- Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310012, China; Institute of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, 233030, China; School of Mathematics and Statistics, Shandong University, Weihai, Shandong, 264209, China
| | - Xiao Chang
- Institute of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, 233030, China
| | - Xiaoping Liu
- Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310012, China; School of Mathematics and Statistics, Shandong University, Weihai, Shandong, 264209, China
| |
|
193
|
Zhao M, He W, Tang J, Zou Q, Guo F. A comprehensive overview and critical evaluation of gene regulatory network inference technologies. Brief Bioinform 2021; 22:6128842. [PMID: 33539514 DOI: 10.1093/bib/bbab009] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 09/28/2020] [Revised: 12/11/2020] [Accepted: 01/06/2021] [Indexed: 12/12/2022] Open
Abstract
The gene regulatory network (GRN) is an important mechanism for maintaining life processes, controlling biochemical reactions and regulating compound levels, and it plays an important role in various organisms and systems. Reconstructing GRNs can help us to understand the molecular mechanisms of organisms and to reveal the essential rules of a large number of biological processes and reactions in organisms. To deal with the uncertainty of processing, various network reconstruction algorithms use specific assumptions that affect prediction accuracy. To study why a certain method is more suitable for a specific research problem or experimental data, we survey methods under three classifications: model-based, information-based and machine learning-based. There are obviously different types of computational tools that can be generated to distinguish GRNs. Furthermore, we discuss several classical, representative and latest methods in each category to analyze their core ideas, general steps, characteristics, etc. We compare the performance of state-of-the-art GRN reconstruction technologies on simulated networks and real networks under different scaling conditions. Through standardized performance metrics and common benchmarks, we quantitatively evaluate the stability of various methods and the sensitivity of the same algorithm when applied to networks of different scales. The aim of this study is to explore the most appropriate method for a specific GRN, which helps biologists and medical scientists in discovering potential drug targets and identifying cancer biomarkers.
Affiliation(s)
- Mengyuan Zhao
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Wenying He
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jijun Tang
- University of South Carolina, Tianjin, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| | - Fei Guo
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| |
|
194
|
Grønning AGB, Oubounyt M, Kanev K, Lund J, Kacprowski T, Zehn D, Röttger R, Baumbach J. Enabling single-cell trajectory network enrichment. NATURE COMPUTATIONAL SCIENCE 2021; 1:153-163. [PMID: 38217228 DOI: 10.1038/s43588-021-00025-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/31/2020] [Accepted: 01/15/2021] [Indexed: 01/15/2024]
Abstract
Single-cell sequencing (scRNA-seq) technologies allow the investigation of cellular differentiation processes with unprecedented resolution. Although powerful software packages for scRNA-seq data analysis exist, systems biology-based tools for trajectory analysis are rare and typically difficult to handle. This hampers biological exploration and prevents researchers from gaining deeper insights into the molecular control of developmental processes. Here, to address this, we have developed Scellnetor; a network-constraint time-series clustering algorithm. It allows extraction of temporal differential gene expression network patterns (modules) that explain the difference in regulation of two developmental trajectories. Using well-characterized experimental model systems, we demonstrate the capacity of Scellnetor as a hypothesis generator to identify putative mechanisms driving haematopoiesis or mechanistically interpretable subnetworks driving dysfunctional CD8 T-cell development in chronic infections. Altogether, Scellnetor allows for single-cell trajectory network enrichment, which effectively lifts scRNA-seq data analysis to a systems biology level.
Affiliation(s)
- Alexander G B Grønning
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.
| | - Mhaned Oubounyt
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Kristiyan Kanev
- Division of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Jesper Lund
- Department of Biostatistics and Epidemiology, University of Southern Denmark, Odense, Denmark
| | - Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Brunswick, Germany
| | - Dietmar Zehn
- Division of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Richard Röttger
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Jan Baumbach
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
- Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany.
| |
|
195
|
Wang YXR, Li L, Li JJ, Huang H. Network Modeling in Biology: Statistical Methods for Gene and Brain Networks. Stat Sci 2021; 36:89-108. [PMID: 34305304 PMCID: PMC8296984 DOI: 10.1214/20-sts792] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
The rise of network data in many different domains has offered researchers new insight into the problem of modeling complex systems and propelled the development of numerous innovative statistical methodologies and computational tools. In this paper, we primarily focus on two types of biological networks, gene networks and brain networks, where statistical network modeling has found both fruitful and challenging applications. Unlike other network examples such as social networks where network edges can be directly observed, both gene and brain networks require careful estimation of edges using covariates as a first step. We provide a discussion on existing statistical and computational methods for edge estimation and subsequent statistical inference problems in these two types of biological networks.
Affiliation(s)
- Y X Rachel Wang
- School of Mathematics and Statistics, University of Sydney, Australia
| | - Lexin Li
- Department of Biostatistics and Epidemiology, School of Public Health, University of California, Berkeley
| | | | - Haiyan Huang
- Department of Statistics, University of California, Berkeley
| |
|
196
|
AlMusawi S, Ahmed M, Nateri AS. Understanding cell-cell communication and signaling in the colorectal cancer microenvironment. Clin Transl Med 2021; 11:e308. [PMID: 33635003 PMCID: PMC7868082 DOI: 10.1002/ctm2.308] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 10/10/2020] [Revised: 12/31/2020] [Accepted: 01/19/2021] [Indexed: 12/12/2022] Open
Abstract
Carcinomas are complex heterocellular systems containing epithelial cancer cells, stromal fibroblasts, and multiple immune cell-types. Cell-cell communication between these tumor microenvironments (TME) and cells drives cancer progression and influences response to existing therapies. In order to provide better treatments for patients, we must understand how various cell-types collaborate within the TME to drive cancer and consider the multiple signals present between and within different cancer types. To investigate how tissues function, we need a model to measure both how signals are transferred between cells and how that information is processed within cells. The interplay of collaboration between different cell-types requires cell-cell communication. This article aims to review the current in vitro and in vivo mono-cellular and multi-cellular cultures models of colorectal cancer (CRC), and to explore how they can be used for single-cell multi-omics approaches for isolating multiple types of molecules from a single-cell required for cell-cell communication to distinguish cancer cells from normal cells. Integrating the existing single-cell signaling measurements and models, and through understanding the cell identity and how different cell types communicate, will help predict drug sensitivities in tumor cells and between- and within-patients responses.
Affiliation(s)
- Shaikha AlMusawi
- Cancer Genetics & Stem Cell Group, BioDiscovery Institute, Division of Cancer & Stem Cells, School of Medicine, University of Nottingham, Nottingham, UK
| | - Mehreen Ahmed
- Cancer Genetics & Stem Cell Group, BioDiscovery Institute, Division of Cancer & Stem Cells, School of Medicine, University of Nottingham, Nottingham, UK
- Department of Laboratory Medicine, Division of Translational Cancer Research, Lund University, Lund, Sweden
| | - Abdolrahman S. Nateri
- Cancer Genetics & Stem Cell Group, BioDiscovery Institute, Division of Cancer & Stem Cells, School of Medicine, University of Nottingham, Nottingham, UK
| |
|
197
|
Nayak R, Hasija Y. A hitchhiker's guide to single-cell transcriptomics and data analysis pipelines. Genomics 2021; 113:606-619. [PMID: 33485955 DOI: 10.1016/j.ygeno.2021.01.007] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/09/2020] [Revised: 12/30/2020] [Accepted: 01/18/2021] [Indexed: 12/20/2022]
Abstract
Single-cell transcriptomics (SCT) is a tour de force in the era of big omics data that has led to the accumulation of massive cellular transcription data at an astounding resolution of single cells. It provides valuable insights into cells previously unachieved by bulk cell analysis and is proving crucial in uncovering cellular heterogeneity, identifying rare cell populations, distinct cell-lineage trajectories, and mechanisms involved in complex cellular processes. SCT data is highly complex and necessitates advanced statistical and computational methods for analysis. This review provides a comprehensive overview of the steps in a typical SCT workflow, starting from experimental protocol to data analysis, deliberating various pipelines used. We discuss recent trends, challenges, machine learning methods for data analysis, and future prospects. We conclude by listing the multitude of scRNA-seq data applications and how it shall revolutionize our understanding of cellular biology and diseases.
Affiliation(s)
- Richa Nayak
- Department of Biotechnology, Delhi Technological University, Delhi 110042, India
| | - Yasha Hasija
- Department of Biotechnology, Delhi Technological University, Delhi 110042, India.
| |
|
198
|
Kim J, Jakobsen ST, Natarajan KN, Won KJ. TENET: gene network reconstruction using transfer entropy reveals key regulatory factors from single cell transcriptomic data. Nucleic Acids Res 2021; 49:e1. [PMID: 33170214 PMCID: PMC7797076 DOI: 10.1093/nar/gkaa1014] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/03/2020] [Revised: 10/05/2020] [Accepted: 10/14/2020] [Indexed: 12/22/2022] Open
Abstract
Accurate prediction of gene regulatory rules is important towards understanding of cellular processes. Existing computational algorithms devised for bulk transcriptomics typically require a large number of time points to infer gene regulatory networks (GRNs), are applicable for a small number of genes and fail to detect potential causal relationships effectively. Here, we propose a novel approach 'TENET' to reconstruct GRNs from single cell RNA sequencing (scRNAseq) datasets. Employing transfer entropy (TE) to measure the amount of causal relationships between genes, TENET predicts large-scale gene regulatory cascades/relationships from scRNAseq data. TENET showed better performance than other GRN reconstructors, in identifying key regulators from public datasets. Specifically from scRNAseq, TENET identified key transcriptional factors in embryonic stem cells (ESCs) and during direct cardiomyocytes reprogramming, where other predictors failed. We further demonstrate that known target genes have significantly higher TE values, and TENET predicted higher TE genes were more influenced by the perturbation of their regulator. Using TENET, we identified and validated that Nme2 is a culture condition specific stem cell factor. These results indicate that TENET is uniquely capable of identifying key regulators from scRNAseq data.
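For readers unfamiliar with transfer entropy (TE), the quantity this abstract relies on, a minimal sketch follows. It is not TENET's implementation: it assumes already-discretized sequences, a history length of one step and a simple plug-in (counting) estimator, and the function name and toy data are hypothetical. With these assumptions, TE from a source X to a target Y is

TE(X→Y) = Σ p(y_{t+1}, y_t, x_t) · log2 [ p(y_{t+1} | y_t, x_t) / p(y_{t+1} | y_t) ].

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of transfer entropy source -> target, history length 1."""
    triples = Counter()   # counts of (target[t+1], target[t], source[t])
    pairs_ts = Counter()  # counts of (target[t], source[t])
    pairs_tt = Counter()  # counts of (target[t+1], target[t])
    singles = Counter()   # counts of target[t]
    n = len(target) - 1   # number of usable transitions
    for t in range(n):
        triples[(target[t + 1], target[t], source[t])] += 1
        pairs_ts[(target[t], source[t])] += 1
        pairs_tt[(target[t + 1], target[t])] += 1
        singles[target[t]] += 1
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_with_source = count / pairs_ts[(y_now, x_now)]
        p_without_source = pairs_tt[(y_next, y_now)] / singles[y_now]
        te += p_joint * math.log2(p_with_source / p_without_source)
    return te


# Toy data (assumption): y copies x with a one-step delay, so x "drives" y.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)  # expected to be close to 1 bit
te_yx = transfer_entropy(y, x)  # expected to be close to 0 bits
print(f"TE(x->y) = {te_xy:.3f} bits, TE(y->x) = {te_yx:.3f} bits")
```

The asymmetry between the two directions is what makes TE usable as a directed score between gene pairs. Applying it to expression data would additionally require an ordering of cells and a density estimate for continuous values, neither of which is covered by this sketch.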
Affiliation(s)
- Junil Kim
- Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
| | - Simon T. Jakobsen
- Functional Genomics and Metabolism Unit, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
| | - Kedar N Natarajan
- Functional Genomics and Metabolism Unit, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
- Danish Institute of Advanced Study (D-IAS), University of Southern Denmark, Denmark
| | - Kyoung-Jae Won
- Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
| |
|
199
|
Cheung FKM, Qin J. The Methods and Tools for Molecular Network Construction. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11464-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022] Open
|
200
|
|