201
Osorio D, Zhong Y, Li G, Huang JZ, Cai JJ. scTenifoldNet: A Machine Learning Workflow for Constructing and Comparing Transcriptome-wide Gene Regulatory Networks from Single-Cell Data. Patterns (N Y) 2020; 1:100139. [PMID: 33336197 PMCID: PMC7733883 DOI: 10.1016/j.patter.2020.100139] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 06/15/2020] [Revised: 09/29/2020] [Accepted: 10/12/2020] [Indexed: 02/02/2023]
Abstract
We present scTenifoldNet, a machine learning workflow built upon principal-component regression, low-rank tensor approximation, and manifold alignment, for constructing and comparing single-cell gene regulatory networks (scGRNs) using data from single-cell RNA sequencing. scTenifoldNet reveals regulatory changes in gene expression between samples by comparing the constructed scGRNs. With real data, scTenifoldNet identifies specific gene expression programs associated with different biological processes, providing critical insights into the underlying mechanism of regulatory networks governing cellular transcriptional activities.
Affiliation(s)
- Daniel Osorio
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Yan Zhong
  - Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Guanxun Li
  - Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Jianhua Z. Huang
  - Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- James J. Cai
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
  - Interdisciplinary Program of Genetics, Texas A&M University, College Station, TX 77843, USA
202
Cha J, Lee I. Single-cell network biology for resolving cellular heterogeneity in human diseases. Exp Mol Med 2020; 52:1798-1808. [PMID: 33244151 PMCID: PMC8080824 DOI: 10.1038/s12276-020-00528-0] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 07/30/2020] [Revised: 08/26/2020] [Accepted: 08/31/2020] [Indexed: 01/10/2023] Open
Abstract
Understanding cellular heterogeneity is the holy grail of biology and medicine. Cells harboring identical genomes show a wide variety of behaviors in multicellular organisms. Genetic circuits underlying cell-type identities will facilitate the understanding of the regulatory programs for differentiation and maintenance of distinct cellular states. Such a cell-type-specific gene network can be inferred from coregulatory patterns across individual cells. Conventional methods of transcriptome profiling using tissue samples provide only average signals of diverse cell types. Therefore, reconstructing gene regulatory networks for a particular cell type is not feasible with tissue-based transcriptome data. Recently, single-cell omics technology has emerged and enabled the capture of the transcriptomic landscape of every individual cell. Although single-cell gene expression studies have already opened up new avenues, network biology using single-cell transcriptome data will further accelerate our understanding of cellular heterogeneity. In this review, we provide an overview of single-cell network biology and summarize recent progress in method development for network inference from single-cell RNA sequencing (scRNA-seq) data. Then, we describe how cell-type-specific gene networks can be utilized to study regulatory programs specific to disease-associated cell types and cellular states. Moreover, with scRNA data, modeling personal or patient-specific gene networks is feasible. Therefore, we also introduce potential applications of single-cell network biology for precision medicine. We envision a rapid paradigm shift toward single-cell network analysis for systems biology in the near future.
Gene regulatory networks reconstructed from single-cell RNA sequencing datasets are allowing researchers to better understand the molecular circuits and cell states that contribute to complex human disease. Junha Cha and Insuk Lee from Yonsei University in Seoul, South Korea, review the concept of ‘single-cell network biology’, which involves using computational algorithms on genetic expression data from thousands of cells to infer functional interactions in various biological contexts. This systems biology approach to analyzing the profiles of messenger RNA in single cells is helping researchers discover new signaling pathways that could serve as disease biomarkers or therapeutic targets. In the future, patient-specific models of personal gene networks could explain why certain genetic variants affect disease risk. This research could also eventually lead to new types of individualized medical treatments.
203
Dai H, Jin QQ, Li L, Chen LN. Reconstructing gene regulatory networks in single-cell transcriptomic data analysis. Zool Res 2020; 41:599-604. [PMID: 33124218 PMCID: PMC7671911 DOI: 10.24272/j.issn.2095-8137.2020.215] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/03/2020] [Accepted: 10/20/2020] [Indexed: 11/07/2022] Open
Abstract
Gene regulatory networks play pivotal roles in our understanding of biological processes/mechanisms at the molecular level. Many studies have developed sample-specific or cell-type-specific gene regulatory networks from single-cell transcriptomic data based on a large amount of cell samples. Here, we review the state-of-the-art computational algorithms and describe various applications of gene regulatory networks in biological studies.
Affiliation(s)
- Hao Dai
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
  - Institute of Brain-Intelligence Technology, Zhangjiang Laboratory, Shanghai 201210, China
- Qi-Qi Jin
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Lin Li
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Luo-Nan Chen
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, Zhejiang 310024, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
204
Levy M, Frishberg A, Gat-Viks I. Inferring cellular heterogeneity of associations from single cell genomics. Bioinformatics 2020; 36:3466-3473. [PMID: 32129824 DOI: 10.1093/bioinformatics/btaa151] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/30/2019] [Revised: 02/01/2020] [Accepted: 02/27/2020] [Indexed: 11/14/2022] Open
Abstract
Motivation
Cell-to-cell variation has uncovered associations between cellular phenotypes. However, it remains challenging to address the cellular diversity of such associations.
Results
Here, we do not rely on the conventional assumption that the same association holds throughout the entire cell population. Instead, we assume that associations may exist in a certain subset of the cells. We developed CEllular Niche Association (CENA) to reliably predict pairwise associations together with the cell subsets in which the associations are detected. CENA does not rely on predefined subsets but only requires that the cells of each predicted subset share a certain characteristic state. CENA may therefore reveal dynamic modulation of dependencies along cellular trajectories of temporally evolving states. Using simulated data, we show the advantage of CENA over existing methods and its scalability to a large number of cells. Application of CENA to real biological data demonstrates dynamic changes in associations that would be otherwise masked.
Availability and implementation
CENA is available as an R package on GitHub: https://github.com/mayalevy/CENA and is accompanied by complete documentation and instructions.
Contact
iritgv@gmail.com
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Maya Levy
  - School of Molecular Cell Biology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Amit Frishberg
  - School of Molecular Cell Biology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Irit Gat-Viks
  - School of Molecular Cell Biology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
205
Ma B, Fang M, Jiao X. Inference of gene regulatory networks based on nonlinear ordinary differential equations. Bioinformatics 2020; 36:4885-4893. [DOI: 10.1093/bioinformatics/btaa032] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/10/2019] [Revised: 12/30/2019] [Accepted: 01/15/2020] [Indexed: 01/05/2023] Open
Abstract
Motivation
Gene regulatory networks (GRNs) capture the regulatory interactions between genes, resulting from the fundamental biological processes of transcription and translation. In some cases, the topology of a GRN is not known and has to be inferred from gene expression data. Most existing GRN reconstruction algorithms are applied to either time-series data or steady-state data. Although time-series data include more information about the system dynamics, steady-state data imply stability of the underlying regulatory networks.
Results
In this article, we propose a method for inferring GRNs from time-series and steady-state data jointly. We make use of a non-linear ordinary differential equations framework to model dynamic gene regulation and an importance measurement strategy to infer all putative regulatory links efficiently. The proposed method is evaluated extensively on the artificial DREAM4 dataset and two real gene expression datasets of yeast and Escherichia coli. Based on public benchmark datasets, the proposed method outperforms other popular inference algorithms in terms of overall score. By comparing the performance on the datasets with different scales, the results show that our method still keeps good robustness and accuracy at a low computational complexity.
Availability and implementation
The proposed method is written in the Python language, and is available at: https://github.com/lab319/GRNs_nonlinear_ODEs
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Baoshan Ma
  - College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Mingkun Fang
  - College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Xiangtian Jiao
  - College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
206
Dibaeinia P, Sinha S. SERGIO: A Single-Cell Expression Simulator Guided by Gene Regulatory Networks. Cell Syst 2020; 11:252-271.e11. [PMID: 32871105 PMCID: PMC7530147 DOI: 10.1016/j.cels.2020.08.003] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 10/31/2019] [Revised: 03/18/2020] [Accepted: 08/04/2020] [Indexed: 12/14/2022]
Abstract
A common approach to benchmarking of single-cell transcriptomics tools is to generate synthetic datasets that statistically resemble experimental data. However, most existing single-cell simulators do not incorporate transcription factor-gene regulatory interactions that underlie expression dynamics. Here, we present SERGIO, a simulator of single-cell gene expression data that models the stochastic nature of transcription as well as regulation of genes by multiple transcription factors according to a user-provided gene regulatory network. SERGIO can simulate any number of cell types in steady state or cells differentiating to multiple fates. We show that datasets generated by SERGIO are statistically comparable to experimental data generated by Illumina HiSeq2000, Drop-seq, Illumina 10X chromium, and Smart-seq. We use SERGIO to benchmark several single-cell analysis tools, including GRN inference methods, and identify Tcf7, Gata3, and Bcl11b as key drivers of T cell differentiation by performing in silico knockout experiments. SERGIO is freely available for download here: https://github.com/PayamDiba/SERGIO.
Affiliation(s)
- Payam Dibaeinia
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Saurabh Sinha
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
  - Carl R. Woese Institute of Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
  - Cancer Center at Illinois, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
207
Møller AF, Natarajan KN. Predicting gene regulatory networks from cell atlases. Life Sci Alliance 2020; 3:3/11/e202000658. [PMID: 32958603 PMCID: PMC7536823 DOI: 10.26508/lsa.202000658] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/23/2020] [Revised: 08/24/2020] [Accepted: 08/31/2020] [Indexed: 12/17/2022] Open
Abstract
Integrated single-cell gene regulatory network from three mouse cell atlases captures global and cell type–specific regulatory modules and crosstalk, important for cellular identity. Recent single-cell RNA-sequencing atlases have surveyed and identified major cell types across different mouse tissues. Here, we computationally reconstruct gene regulatory networks from three major mouse cell atlases to capture functional regulators critical for cell identity, while accounting for a variety of technical differences, including sampled tissues, sequencing depth, and author assigned cell type labels. Extracting the regulatory crosstalk from mouse atlases, we identify and distinguish global regulons active in multiple cell types from specialised cell type–specific regulons. We demonstrate that regulon activities accurately distinguish individual cell types, despite differences between individual atlases. We generate an integrated network that further uncovers regulon modules with coordinated activities critical for cell types, and validate modules using available experimental data. Inferring regulatory networks during myeloid differentiation from wild-type and Irf8 KO cells, we uncover functional contribution of Irf8 regulon activity and composition towards monocyte lineage. Our analysis provides an avenue to further extract and integrate the regulatory crosstalk from single-cell expression data.
Affiliation(s)
- Andreas Fønss Møller
  - Department of Biochemistry and Molecular Biology, Functional Genomics and Metabolism Unit, University of Southern Denmark, Odense, Denmark
- Kedar Nath Natarajan
  - Department of Biochemistry and Molecular Biology, Functional Genomics and Metabolism Unit, University of Southern Denmark, Odense, Denmark
  - Danish Institute of Advanced Study, University of Southern Denmark, Odense, Denmark
208
Single-cell multiomics: technologies and data analysis methods. Exp Mol Med 2020; 52:1428-1442. [PMID: 32929225 PMCID: PMC8080692 DOI: 10.1038/s12276-020-0420-2] [Citation(s) in RCA: 289] [Impact Index Per Article: 57.8] [Received: 12/02/2019] [Revised: 02/14/2020] [Accepted: 02/28/2020] [Indexed: 01/31/2023] Open
Abstract
Advances in single-cell isolation and barcoding technologies offer unprecedented opportunities to profile DNA, mRNA, and proteins at a single-cell resolution. Recently, bulk multiomics analyses, such as multidimensional genomic and proteogenomic analyses, have proven beneficial for obtaining a comprehensive understanding of cellular events. This benefit has facilitated the development of single-cell multiomics analysis, which enables cell type-specific gene regulation to be examined. The cardinal features of single-cell multiomics analysis include (1) technologies for single-cell isolation, barcoding, and sequencing to measure multiple types of molecules from individual cells and (2) the integrative analysis of molecules to characterize cell types and their functions regarding pathophysiological processes based on molecular signatures. Here, we summarize the technologies for single-cell multiomics analyses (mRNA-genome, mRNA-DNA methylation, mRNA-chromatin accessibility, and mRNA-protein) as well as the methods for the integrative analysis of single-cell multiomics data. The expansion of single-cell profiling technologies will provide unprecedented insights into the molecular mechanisms inherent in disease.
Novel technologies known collectively as ‘single-cell multiomics’ enable systematic, high-resolution profiling of DNA, RNA and proteins in individual cells. This provides valuable data about gene regulation and molecular populations, and cellular processes during disease development and progression. Daehee Hwang and co-workers at Seoul National University, Seoul, South Korea, reviewed existing single-cell multiomics technologies and highlighted ways to integrate the data generated. Analytical features of multiomics allow scientists to isolate, sequence and label (or ‘barcode’) multiple molecules in single cells. Different sequencing techniques can be used for different purposes, such as exploring gene mutation coverage or measuring RNA transcripts. Combining these sequencing data will help identify links between significant features during disease.
209
Ando Y, Kwon ATJ, Shin JW. An era of single-cell genomics consortia. Exp Mol Med 2020; 52:1409-1418. [PMID: 32929222 PMCID: PMC8080593 DOI: 10.1038/s12276-020-0409-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/13/2019] [Revised: 01/24/2020] [Accepted: 02/10/2020] [Indexed: 12/24/2022] Open
Abstract
The human body consists of 37 trillion single cells represented by over 50 organs that are stitched together to make us who we are, yet we still have very little understanding about the basic units of our body: what cell types and states make up our organs both compositionally and spatially. Previous efforts to profile a wide range of human cell types have been attempted by the FANTOM and GTEx consortia. Now, with the advancement in genomic technologies, profiling the human body at single-cell resolution is possible and will generate an unprecedented wealth of data that will accelerate basic and clinical research with tangible applications to future medicine. To date, several major organs have been profiled, but the challenges lie in ways to integrate single-cell genomics data in a meaningful way. In recent years, several consortia have begun to introduce harmonization and equity in data collection and analysis. Herein, we introduce existing and nascent single-cell genomics consortia, and present benefits to necessitate single-cell genomic consortia in a regional environment to achieve the universal human cell reference dataset.
Affiliation(s)
- Yoshinari Ando
  - RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045, Japan
- Andrew Tae-Jun Kwon
  - RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045, Japan
- Jay W Shin
  - RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045, Japan
210
Li Y, Ma A, Mathé EA, Li L, Liu B, Ma Q. Elucidation of Biological Networks across Complex Diseases Using Single-Cell Omics. Trends Genet 2020; 36:951-966. [PMID: 32868128 DOI: 10.1016/j.tig.2020.08.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/17/2020] [Revised: 07/29/2020] [Accepted: 08/04/2020] [Indexed: 12/14/2022]
Abstract
Single-cell multimodal omics (scMulti-omics) technologies have made it possible to trace cellular lineages during differentiation and to identify new cell types in heterogeneous cell populations. The derived information is especially promising for computing cell-type-specific biological networks encoded in complex diseases and improving our understanding of the underlying gene regulatory mechanisms. The integration of these networks could, therefore, give rise to a heterogeneous regulatory landscape (HRL) in support of disease diagnosis and drug therapeutics. In this review, we provide an overview of this field and pay particular attention to how diverse biological networks can be inferred in a specific cell type based on integrative methods. Then, we discuss how HRL can advance our understanding of regulatory mechanisms underlying complex diseases and aid in the prediction of prognosis and therapeutic responses. Finally, we outline challenges and future trends that will be central to bringing the field of HRL in complex diseases forward.
Affiliation(s)
- Yang Li
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Anjun Ma
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Ewy A Mathé
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health (NIH), Rockville, MD, 20892, USA
- Lang Li
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Bingqiang Liu
  - School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
- Qin Ma
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
211
Shi H, Yan KK, Ding L, Qian C, Chi H, Yu J. Network Approaches for Dissecting the Immune System. iScience 2020; 23:101354. [PMID: 32717640 PMCID: PMC7390880 DOI: 10.1016/j.isci.2020.101354] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 05/09/2020] [Revised: 06/21/2020] [Accepted: 07/08/2020] [Indexed: 02/06/2023] Open
Abstract
The immune system is a complex biological network composed of hierarchically organized genes, proteins, and cellular components that combat external pathogens and monitor the onset of internal disease. To meet and ultimately defeat these challenges, the immune system orchestrates an exquisitely complex interplay of numerous cells, often with highly specialized functions, in a tissue-specific manner. One of the major methodologies of systems immunology is to measure quantitatively the components and interaction levels in the immunologic networks to construct a computational network and predict the response of the components to perturbations. The recent advances in high-throughput sequencing techniques have provided us with a powerful approach to dissecting the complexity of the immune system. Here we summarize the latest progress in integrating omics data and network approaches to construct networks and to infer the underlying signaling and transcriptional landscape, as well as cell-cell communication, in the immune system, with a focus on hematopoiesis, adaptive immunity, and tumor immunology. Understanding the network regulation of immune cells has provided new insights into immune homeostasis and disease, with important therapeutic implications for inflammation, cancer, and other immune-mediated disorders.
Affiliation(s)
- Hao Shi
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Koon-Kiu Yan
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Liang Ding
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Chenxi Qian
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Hongbo Chi
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Jiyang Yu
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
212
Sekula M, Gaskins J, Datta S. A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data. BMC Bioinformatics 2020; 21:361. [PMID: 32811424 PMCID: PMC7437941 DOI: 10.1186/s12859-020-03707-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/10/2020] [Accepted: 08/04/2020] [Indexed: 12/12/2022] Open
Abstract
Background
Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network methods appropriate for new types of data.
Results
We present a novel sparse Bayesian factor model to explore the network structure associated with genes in scRNA-seq data. Latent factors impact the gene expression values for each cell and provide flexibility to account for common features of scRNA-seq: high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts. From our model, we construct a GCN by analyzing the positive and negative associations of the factors that are shared between each pair of genes.
Conclusions
Simulation studies demonstrate that our methodology has high power in identifying gene-gene associations while maintaining a nominal false discovery rate. In real data analyses, our model identifies more known and predicted protein-protein interactions than other competing network models.
Affiliation(s)
- Michael Sekula
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, USA
- Jeremy Gaskins
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, USA
- Susmita Datta
  - Department of Biostatistics, University of Florida, Gainesville, FL, USA
213
Del Sol A, Jung S. The Importance of Computational Modeling in Stem Cell Research. Trends Biotechnol 2020; 39:126-136. [PMID: 32800604 DOI: 10.1016/j.tibtech.2020.07.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 04/17/2020] [Revised: 07/13/2020] [Accepted: 07/15/2020] [Indexed: 12/30/2022]
Abstract
The generation of large amounts of omics data is increasingly enabling not only the processing and analysis of large data sets but also the development of computational models in the field of stem cell research. Although computational models have been proposed in recent decades, we believe that the stem cell community is not fully aware of the potentiality of computational modeling in guiding their experimental research. In this regard, we discuss how single-cell technologies provide the right framework for computational modeling at different scales of biological organization in order to address challenges in the stem cell field and to guide experimentalists in the design of new strategies for stem cell therapies and treatment of congenital disorders.
Affiliation(s)
- Antonio Del Sol
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, Esch-sur-Alzette, L-4367 Belvaux, Luxembourg
  - CIC bioGUNE-BRTA (Basque Research and Technology Alliance), Bizkaia Technology Park, 801 Building, 48160 Derio, Spain
  - IKERBASQUE, Basque Foundation for Science, Bilbao 48013, Spain
- Sascha Jung
  - CIC bioGUNE-BRTA (Basque Research and Technology Alliance), Bizkaia Technology Park, 801 Building, 48160 Derio, Spain
214
Ramachandran P, Matchett KP, Dobie R, Wilson-Kanamori JR, Henderson NC. Single-cell technologies in hepatology: new insights into liver biology and disease pathogenesis. Nat Rev Gastroenterol Hepatol 2020; 17:457-472. [PMID: 32483353 DOI: 10.1038/s41575-020-0304-x] [Citation(s) in RCA: 161] [Impact Index Per Article: 32.2] [Accepted: 04/08/2020] [Indexed: 12/19/2022]
Abstract
Liver disease is a major global health-care problem, affecting an estimated 844 million people worldwide. Despite this substantial burden, therapeutic options for liver disease remain limited, in part owing to a paucity of detailed analyses defining the cellular and molecular mechanisms that drive these conditions in humans. Single-cell transcriptomic technologies are transforming our understanding of cellular diversity and function in health and disease. In this Review, we discuss how these technologies have been applied in hepatology, advancing our understanding of cellular heterogeneity and providing novel insights into fundamental liver biology such as the metabolic zonation of hepatocytes, endothelial cells and hepatic stellate cells, and the cellular mechanisms underpinning liver regeneration. Application of these methodologies is also uncovering critical pathophysiological changes driving disease states such as hepatic fibrosis, where distinct populations of macrophages, endothelial cells and mesenchymal cells reside within a spatially distinct fibrotic niche and interact to promote scar formation. In addition, single-cell approaches are starting to dissect key cellular and molecular functions in liver cancer. In the near future, new techniques such as spatial transcriptomics and multiomic approaches will further deepen our understanding of disease pathogenesis, enabling the identification of novel therapeutic targets for patients across the spectrum of liver diseases.
Affiliation(s)
- Prakash Ramachandran
  - Centre for Inflammation Research, The Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
- Kylie P Matchett
  - Centre for Inflammation Research, The Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
- Ross Dobie
  - Centre for Inflammation Research, The Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
- John R Wilson-Kanamori
  - Centre for Inflammation Research, The Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
- Neil C Henderson
  - Centre for Inflammation Research, The Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
215
Zheng X, Huang Y, Zou X. scPADGRN: A preconditioned ADMM approach for reconstructing dynamic gene regulatory network using single-cell RNA sequencing data. PLoS Comput Biol 2020; 16:e1007471. [PMID: 32716923 PMCID: PMC7410337 DOI: 10.1371/journal.pcbi.1007471] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/04/2019] [Revised: 08/06/2020] [Accepted: 05/28/2020] [Indexed: 12/23/2022] Open
Abstract
Disease development and cell differentiation both involve dynamic changes; therefore, the reconstruction of dynamic gene regulatory networks (DGRNs) is an important but difficult problem in systems biology. With recent technical advances in single-cell RNA sequencing (scRNA-seq), large volumes of scRNA-seq data are being obtained for various processes. However, most current methods of inferring DGRNs from bulk samples may not be suitable for scRNA-seq data. In this work, we present scPADGRN, a novel DGRN inference method using “time-series” scRNA-seq data. scPADGRN combines the preconditioned alternating direction method of multipliers with cell clustering for DGRN reconstruction. It exhibits advantages in accuracy, robustness and fast convergence. Moreover, a quantitative index called Differentiation Genes’ Interaction Enrichment (DGIE) is presented to quantify the interaction enrichment of genes related to differentiation. From the DGIE scores of relevant subnetworks, we infer that the functions of embryonic stem (ES) cells are most active initially and may gradually fade over time. The communication strength of known contributing genes that facilitate cell differentiation increases from ES cells to terminally differentiated cells. We also identify several genes responsible for the changes in the DGIE scores occurring during cell differentiation based on three real single-cell datasets. Our results demonstrate that single-cell analyses based on network inference coupled with quantitative computations can reveal key transcriptional regulators involved in cell differentiation and disease development.
Single-cell RNA sequencing (scRNA-seq) data are gaining popularity for providing access to cell-level measurements. Currently, time-series scRNA-seq data allow researchers to study dynamic changes during biological processes. This work proposes a novel method, scPADGRN, for application to time-series scRNA-seq data to construct dynamic gene regulatory networks, which are informative for investigating dynamic changes during disease development and cell differentiation. The proposed method shows satisfactory performance on both simulated data and three real datasets concerning cell differentiation. To quantify network dynamics, we present a quantitative index, DGIE, to measure the degree of activity of a certain set of genes in a regulatory network. Quantitative computations based on dynamic networks identify key regulators in cell differentiation and reveal the activity states of the identified regulators. Specifically, Bhlhe40, Msx2, Foxa2 and Dnmt3l might be important regulatory genes involved in differentiation from mouse ES cells to primitive endoderm (PrE) cells. For differentiation from mouse embryonic fibroblast cells to myocytes, Scx, Fos and Tcf12 are suggested to be key regulators. Sox5, Meis2, Hoxb3, Tcf7l1 and Plagl1 critically contribute during differentiation from human ES cells to definitive endoderm cells. These results may guide further theoretical and experimental efforts to understand cell differentiation processes and explore cell heterogeneity.
Affiliation(s)
- Xiao Zheng
  - School of Mathematics and Statistics, Wuhan University, Wuhan, Hubei, China
- Yuan Huang
  - Department of Biostatistics, Yale University, New Haven, Connecticut, United States of America
- Xiufen Zou
  - School of Mathematics and Statistics, Wuhan University, Wuhan, Hubei, China
216
Hie B, Peters J, Nyquist SK, Shalek AK, Berger B, Bryson BD. Computational Methods for Single-Cell RNA Sequencing. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-012220-100601] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Indexed: 11/09/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has provided a high-dimensional catalog of millions of cells across species and diseases. These data have spurred the development of hundreds of computational tools to derive novel biological insights. Here, we outline the components of scRNA-seq analytical pipelines and the computational methods that underlie these steps. We describe available methods, highlight well-executed benchmarking studies, and identify opportunities for additional benchmarking studies and computational methods. As the biochemical approaches for single-cell omics advance, we propose coupled development of robust analytical pipelines suited for the challenges that new data present and principled selection of analytical methods that are suited for the biological questions to be addressed.
Affiliation(s)
- Brian Hie
  - Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Joshua Peters
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
- Sarah K. Nyquist
  - Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
  - Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Alex K. Shalek
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
  - Department of Chemistry, Institute for Medical Engineering & Science (IMES), and Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Bryan D. Bryson
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
217
Beisang DJ, Smith K, Yang L, Benyumov A, Gilbertsen A, Herrera J, Lock E, Racila E, Forster C, Sandri BJ, Henke CA, Bitterman PB. Single-cell RNA sequencing reveals that lung mesenchymal progenitor cells in IPF exhibit pathological features early in their differentiation trajectory. Sci Rep 2020; 10:11162. [PMID: 32636398 PMCID: PMC7341888 DOI: 10.1038/s41598-020-66630-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 12/03/2019] [Accepted: 05/06/2020] [Indexed: 12/12/2022] Open
Abstract
In Idiopathic Pulmonary Fibrosis (IPF), there is unrelenting scarring of the lung mediated by pathological mesenchymal progenitor cells (MPCs) that manifest autonomous fibrogenicity in xenograft models. To determine where along their differentiation trajectory IPF MPCs acquire fibrogenic properties, we analyzed the transcriptome of 335 MPCs isolated from the lungs of 3 control and 3 IPF patients at the single-cell level. Using transcriptional entropy as a metric for differentiated state, we found that the least differentiated IPF MPCs displayed the largest differences in their transcriptional profile compared to control MPCs. To validate entropy as a surrogate for differentiated state functionally, we identified increased CD44 as a characteristic of the most entropic IPF MPCs. Using FACS to stratify IPF MPCs based on CD44 expression, we determined that CD44hi IPF MPCs manifested an increased capacity for anchorage-independent colony formation compared to CD44lo IPF MPCs. To validate our analysis morphologically, we used two differentially expressed genes distinguishing IPF MPCs from control (CD44, cell surface; and MARCKS, intracellular). In IPF lung tissue, pathological MPCs resided in the highly cellular perimeter region of the fibroblastic focus. Our data support the concept that IPF fibroblasts acquire a cell-autonomous pathological phenotype early in their differentiation trajectory.
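As a rough illustration of using entropy as a proxy for differentiation state, the sketch below computes the Shannon entropy of one cell's normalized expression profile. The abstract does not state the study's exact entropy definition, so the formula, the function name `transcriptional_entropy`, and the toy counts are assumptions made for illustration only.

```python
import math


def transcriptional_entropy(counts):
    """Shannon entropy (bits) of one cell's expression profile.

    `counts` is a list of non-negative expression values, one per gene.
    Assumption: plain Shannon entropy of the normalized profile; the
    study's own metric may be defined differently.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total  # fraction of the cell's signal from this gene
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical toy cells: a broad profile versus a fully specialized one.
broad_cell = [5, 5, 5, 5]
specialized_cell = [8, 0, 0, 0]
broad_entropy = transcriptional_entropy(broad_cell)
specialized_entropy = transcriptional_entropy(specialized_cell)
```

Under this definition, a cell that spreads its expression evenly over many genes scores higher than a cell dominated by a few genes, which matches the abstract's reading of high entropy as a less differentiated state.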
Affiliation(s)
- Daniel J Beisang
  - University of Minnesota, Department of Pediatrics, Division of Pediatric Pulmonology, Minneapolis, USA
- Karen Smith
  - University of Minnesota, Department of Medicine, Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Minneapolis, USA
- Libang Yang
  - University of Minnesota, Department of Medicine, Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Minneapolis, USA
- Alexey Benyumov
  - University of Minnesota, Department of Medicine, Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Minneapolis, USA
- Adam Gilbertsen
  - University of Minnesota, Department of Medicine, Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Minneapolis, USA
- Jeremy Herrera
  - University of Manchester, School of Biological Sciences, Division of Cell Matrix Biology & Regenerative Medicine, Manchester, United Kingdom
- Eric Lock
  - University of Minnesota, School of Public Health, Division of Biostatistics, Minneapolis, USA
- Emilian Racila
  - University of Minnesota, Department of Laboratory Medicine and Pathology, Minneapolis, USA
- Colleen Forster
  - University of Minnesota, Clinical and Translational Science Institute, Minneapolis, USA
- Brian J Sandri
  - University of Minnesota, Department of Pediatrics, Division of Neonatology, Minneapolis, USA
- Craig A Henke
  - University of Minnesota, Department of Medicine, Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Minneapolis, USA
- Peter B Bitterman
  - University of Minnesota, Department of Medicine, Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Minneapolis, USA
218
Chowdhury HA, Bhattacharyya DK, Kalita JK. (Differential) Co-Expression Analysis of Gene Expression: A Survey of Best Practices. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1154-1173. [PMID: 30668502 DOI: 10.1109/tcbb.2019.2893170] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/09/2023]
Abstract
Analysis of gene expression data is widely used in transcriptomic studies to understand the functions of molecules inside a cell and the interactions among them. Differential co-expression analysis studies diseases and phenotypic variations by finding modules of genes whose co-expression patterns vary across conditions. We review the best practices in gene expression data analysis in terms of analysis of (differential) co-expression, co-expression network, differential networking, and differential connectivity considering both microarray and RNA-seq data along with comparisons. We highlight hurdles in RNA-seq data analysis using methods developed for microarrays. We include discussion of necessary tools for gene expression analysis throughout the paper. In addition, we shed light on scRNA-seq data analysis by including preprocessing and scRNA-seq in co-expression analysis along with useful tools specific to scRNA-seq. To get insights, biological interpretation and functional profiling are included. Finally, we provide guidelines for the analyst, along with research issues and challenges which should be addressed.
219
Hu X, Hu Y, Wu F, Leung RWT, Qin J. Integration of single-cell multi-omics for gene regulatory network inference. Comput Struct Biotechnol J 2020; 18:1925-1938. [PMID: 32774787 PMCID: PMC7385034 DOI: 10.1016/j.csbj.2020.06.033] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 03/01/2020] [Revised: 06/17/2020] [Accepted: 06/20/2020] [Indexed: 12/20/2022] Open
Abstract
The advancement of single-cell sequencing technology in recent years has provided an opportunity to reconstruct gene regulatory networks (GRNs) with the data from thousands of single cells in one sample. This uncovers regulatory interactions in cells and speeds up the discovery of regulatory mechanisms in diseases and biological processes. Therefore, more methods have been proposed to reconstruct GRNs using single-cell sequencing data. In this review, we introduce technologies for sequencing the single-cell genome, transcriptome, and epigenome. At the same time, we present an overview of current GRN reconstruction strategies utilizing different single-cell sequencing data. Bioinformatics tools are grouped by their input data type and mathematical principles for the reader's convenience, and the fundamental mathematics inherent in each group is discussed. Furthermore, the adaptabilities and limitations of these different methods are summarized and compared, in the hope of helping researchers identify the most suitable tools for their needs.
Affiliation(s)
- Xinlin Hu
  - Shenzhen Key Laboratory of Advanced Machine Learning and Applications, College of Mathematics and Statistics, Shenzhen University, Shenzhen 518060, China
- Yaohua Hu
  - Shenzhen Key Laboratory of Advanced Machine Learning and Applications, College of Mathematics and Statistics, Shenzhen University, Shenzhen 518060, China
- Fanjie Wu
  - School of Pharmaceutical Sciences (Shenzhen), Sun Yat-sen University, Shenzhen 518107, China
- Ricky Wai Tak Leung
  - School of Pharmaceutical Sciences (Shenzhen), Sun Yat-sen University, Shenzhen 518107, China
- Jing Qin
  - School of Pharmaceutical Sciences (Shenzhen), Sun Yat-sen University, Shenzhen 518107, China
220
Falco MM, Peña-Chilet M, Loucera C, Hidalgo MR, Dopazo J. Mechanistic models of signaling pathways deconvolute the glioblastoma single-cell functional landscape. NAR Cancer 2020; 2:zcaa011. [PMID: 34316686 PMCID: PMC8210212 DOI: 10.1093/narcan/zcaa011] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/10/2020] [Revised: 06/08/2020] [Accepted: 06/11/2020] [Indexed: 02/07/2023] Open
Abstract
Single-cell RNA sequencing is revealing an unexpectedly large degree of heterogeneity in gene expression levels across cell populations. However, little is known about the functional consequences of this heterogeneity and the contribution of individual cell fate decisions to the collective behavior of the tissues these cells are part of. Here, we use mechanistic modeling of signaling circuits, which reveals a complex functional landscape at single-cell level. Different clusters of neoplastic glioblastoma cells have been defined according to their differences in signaling circuit activity profiles triggering specific cancer hallmarks, which suggest different functional strategies with distinct degrees of aggressiveness. Moreover, mechanistic modeling of the effects of targeted drug inhibitions at single-cell level revealed how, in some cells, the substitution of VEGFA, the target of bevacizumab, by other expressed proteins, like PDGFD, KITLG and FGF2, keeps the VEGF pathway active and insensitive to the VEGFA inhibition by the drug. Here, we describe for the first time mechanisms that individual cells use to avoid the effect of a targeted therapy, providing an explanation for the innate resistance to the treatment displayed by some cells. Our results suggest that mechanistic modeling could become an important asset for the definition of personalized therapeutic interventions.
Affiliation(s)
- Matías M Falco
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, 41013 Sevilla, Spain
- María Peña-Chilet
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, 41013 Sevilla, Spain
- Carlos Loucera
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, 41013 Sevilla, Spain
- Marta R Hidalgo
  - Unidad de Bioinformática y Bioestadística, Centro de Investigación Príncipe Felipe (CIPF), 46012 Valencia, Spain
- Joaquín Dopazo
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, 41013 Sevilla, Spain
221
Aubin-Frankowski PC, Vert JP. Gene regulation inference from single-cell RNA-seq data with linear differential equations and velocity inference. Bioinformatics 2020; 36:4774-4780. [DOI: 10.1093/bioinformatics/btaa576] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 04/19/2019] [Revised: 05/04/2020] [Accepted: 06/11/2020] [Indexed: 11/14/2022] Open
Abstract
Motivation
Single-cell RNA sequencing (scRNA-seq) offers new possibilities to infer gene regulatory networks (GRNs) for biological processes involving a notion of time, such as cell differentiation or cell cycles. It also raises many challenges due to the destructive measurements inherent to the technology.
Results
In this work, we propose a new method named GRISLI for de novo GRN inference from scRNA-seq data. GRISLI infers a velocity vector field in the space of scRNA-seq data from profiles of individual cells, and models the dynamics of cell trajectories with a linear ordinary differential equation to reconstruct the underlying GRN with a sparse regression procedure. We show on real data that GRISLI outperforms a recently proposed state-of-the-art method for GRN reconstruction from scRNA-seq data.
Availability and implementation
The MATLAB code of GRISLI is available at: https://github.com/PCAubin/GRISLI.
Supplementary information
Supplementary data are available at Bioinformatics online.
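The idea of fitting a linear ordinary differential equation with a sparse regression can be sketched on toy data as follows. This is not the GRISLI implementation (which is distributed as MATLAB code): it assumes the per-cell velocities are already known rather than estimated from neighbouring cells, and it swaps in a plain lasso solved by coordinate descent as the sparse regression step. The names `soft_threshold`, `lasso_row`, `infer_network`, the penalty `lam`, and the two-gene system `true_A` are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_row(states, target_velocity, lam=0.01, sweeps=200):
    """Coordinate-descent lasso for one target gene: velocity ~ sum_j a_j * x_j."""
    n = len(states)
    n_genes = len(states[0])
    coef = [0.0] * n_genes
    for _ in range(sweeps):
        for j in range(n_genes):
            rho = 0.0    # gene j against the residual that leaves gene j out
            scale = 0.0  # sum of squares of gene j
            for x, v in zip(states, target_velocity):
                partial = v - sum(coef[k] * x[k] for k in range(n_genes) if k != j)
                rho += x[j] * partial
                scale += x[j] * x[j]
            coef[j] = soft_threshold(rho / n, lam) / (scale / n)
    return coef


def infer_network(states, velocities, lam=0.01):
    """Estimate A in dx/dt = A x, one sparse regression per target gene."""
    n_genes = len(states[0])
    return [lasso_row(states, [v[i] for v in velocities], lam)
            for i in range(n_genes)]


# Toy system (assumed): gene 0 decays, gene 0 activates gene 1, gene 1 decays.
true_A = [[-1.0, 0.0], [2.0, -1.0]]
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
# Exact velocities v = A x; a real pipeline would have to estimate these.
velocities = [[sum(a * x for a, x in zip(row, s)) for row in true_A]
              for s in states]
A_hat = infer_network(states, velocities)
```

In this toy setting the L1 penalty sets the absent edge (gene 1 acting on gene 0) exactly to zero while slightly shrinking the true coefficients, which is the property that makes a sparse regression attractive for network reconstruction.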
Affiliation(s)
- Jean-Philippe Vert
  - MINES ParisTech, PSL Research University, CBIO – Centre for Computational Biology, F-75006 Paris, France
  - Google Research, Brain team, 75009 Paris, France
222
Chanda P, Costa E, Hu J, Sukumar S, Van Hemert J, Walia R. Information Theory in Computational Biology: Where We Stand Today. ENTROPY (BASEL, SWITZERLAND) 2020; 22:E627. [PMID: 33286399 PMCID: PMC7517167 DOI: 10.3390/e22060627] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/30/2020] [Revised: 05/31/2020] [Accepted: 06/03/2020] [Indexed: 12/30/2022]
Abstract
"A Mathematical Theory of Communication" was published in 1948 by Claude Shannon to address the problems in the field of data compression and communication over (noisy) communication channels. Since then, the concepts and ideas developed in Shannon's work have formed the basis of information theory, a cornerstone of statistical learning and inference, and have been playing a key role in disciplines such as physics and thermodynamics, probability and statistics, computational sciences and biological sciences. In this article we review the basic information-theory-based concepts and describe their key applications in multiple major areas of research in computational biology: gene expression and transcriptomics, alignment-free sequence comparison, sequencing and error correction, genome-wide disease-gene association mapping, metabolic networks and metabolomics, and protein sequence, structure and interaction analysis.
Affiliation(s)
- Pritam Chanda
  - Corteva Agriscience™, Indianapolis, IN 46268, USA
  - Computer and Information Science, Indiana University-Purdue University, Indianapolis, IN 46202, USA
- Eduardo Costa
  - Corteva Agriscience™, Mogi Mirim, Sao Paulo 13801-540, Brazil
- Jie Hu
  - Corteva Agriscience™, Indianapolis, IN 46268, USA
- Rasna Walia
  - Corteva Agriscience™, Johnston, IA 50131, USA
223
Holding AN, Cook HV, Markowetz F. Data generation and network reconstruction strategies for single cell transcriptomic profiles of CRISPR-mediated gene perturbations. BIOCHIMICA ET BIOPHYSICA ACTA. GENE REGULATORY MECHANISMS 2020; 1863:194441. [PMID: 31756390 DOI: 10.1016/j.bbagrm.2019.194441] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/07/2019] [Revised: 10/01/2019] [Accepted: 10/01/2019] [Indexed: 02/05/2023]
Abstract
Recent advances in single-cell RNA-sequencing (scRNA-seq) in combination with CRISPR/Cas9 technologies have enabled the development of methods for large-scale perturbation studies with transcriptional readouts. These methods are highly scalable and have the potential to provide a wealth of information on the biological networks that underlie cellular response. Here we discuss how to overcome several key challenges to generate and analyse data for the confident reconstruction of models of the underlying cellular network. Some challenges are generic, and apply to analysing any single-cell transcriptomic data, while others are specific to combined single-cell CRISPR/Cas9 data, in particular barcode swapping, knockdown efficiency, multiplicity of infection and potential confounding factors. We also provide a curated collection of published data sets to aid the development of analysis strategies. Finally, we discuss several network reconstruction approaches, including co-expression networks and Bayesian networks, as well as their limitations, and highlight the potential of Nested Effects Models for network reconstruction from scRNA-seq data. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Andrew N Holding
  - Department of Biology, University of York, York, UK
  - York Biomedical Research Institute, University of York, York, UK
  - CRUK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
  - The Alan Turing Institute, 96 Euston Road, Kings Cross, London, UK
- Helen V Cook
  - Department of Biology, University of York, York, UK
224
Shang L, Smith JA, Zhou X. Leveraging gene co-expression patterns to infer trait-relevant tissues in genome-wide association studies. PLoS Genet 2020; 16:e1008734. [PMID: 32310941 PMCID: PMC7192514 DOI: 10.1371/journal.pgen.1008734] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/17/2019] [Revised: 04/30/2020] [Accepted: 03/24/2020] [Indexed: 12/11/2022] Open
Abstract
Genome-wide association studies (GWASs) have identified many SNPs associated with various common diseases. Understanding the biological functions of these identified SNP associations requires identifying disease/trait relevant tissues or cell types. Here, we develop a network method, CoCoNet, to facilitate the identification of trait-relevant tissues or cell types. Different from existing approaches, CoCoNet incorporates tissue-specific gene co-expression networks constructed from either bulk or single cell RNA sequencing (RNAseq) studies with GWAS data for trait-tissue inference. In particular, CoCoNet relies on a covariance regression network model to express gene-level effect measurements for the given GWAS trait as a function of the tissue-specific co-expression adjacency matrix. With a composite likelihood-based inference algorithm, CoCoNet is scalable to tens of thousands of genes. We validate the performance of CoCoNet through extensive simulations. We apply CoCoNet for an in-depth analysis of four neurological disorders and four autoimmune diseases, where we integrate the corresponding GWASs with bulk RNAseq data from 38 tissues and single cell RNAseq data from 10 cell types. In the real data applications, we show how CoCoNet can help identify specific glial cell types relevant for neurological disorders and identify disease-targeted colon tissues as relevant for autoimmune diseases.
Affiliation(s)
- Lulu Shang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States of America
- Jennifer A. Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI, United States of America
  - Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, United States of America
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States of America
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, United States of America
225
Qiu X, Rahimzamani A, Wang L, Ren B, Mao Q, Durham T, McFaline-Figueroa JL, Saunders L, Trapnell C, Kannan S. Inferring Causal Gene Regulatory Networks from Coupled Single-Cell Expression Dynamics Using Scribe. Cell Syst 2020; 10:265-274.e11. [PMID: 32135093 PMCID: PMC7223477 DOI: 10.1016/j.cels.2020.02.003] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Received: 09/12/2018] [Revised: 06/08/2019] [Accepted: 02/05/2020] [Indexed: 01/13/2023]
Abstract
Here, we present Scribe (https://github.com/aristoteleo/Scribe-py), a toolkit for detecting and visualizing causal regulatory interactions between genes and explore the potential for single-cell experiments to power network reconstruction. Scribe employs restricted directed information to determine causality by estimating the strength of information transferred from a potential regulator to its downstream target. We apply Scribe and other leading approaches for causal network reconstruction to several types of single-cell measurements and show that there is a dramatic drop in performance for "pseudotime"-ordered single-cell data compared with true time-series data. We demonstrate that performing causal inference requires temporal coupling between measurements. We show that methods such as "RNA velocity" restore some degree of coupling through an analysis of chromaffin cell fate commitment. These analyses highlight a shortcoming in experimental and computational methods for analyzing gene regulation at single-cell resolution and suggest ways of overcoming it.
Affiliation(s)
- Xiaojie Qiu
  - Molecular & Cellular Biology Program, University of Washington, Seattle, WA, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Arman Rahimzamani
  - Department of Electrical Engineering, University of Washington, Seattle, WA, USA
- Li Wang
  - Department of Mathematics, University of Texas at Arlington, Arlington, TX, USA
- Bingcheng Ren
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Qi Mao
  - HERE company, Chicago, IL 60606, USA
- Timothy Durham
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Lauren Saunders
  - Molecular & Cellular Biology Program, University of Washington, Seattle, WA, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Cole Trapnell
  - Molecular & Cellular Biology Program, University of Washington, Seattle, WA, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman-Baty Institute for Precision Medicine, Seattle, WA, USA
- Sreeram Kannan
  - Department of Electrical Engineering, University of Washington, Seattle, WA, USA
226
Verrou K, Tsamardinos I, Papoutsoglou G. Learning Pathway Dynamics from Single-Cell Proteomic Data: A Comparative Study. Cytometry A 2020; 97:241-252. [PMID: 32100455 PMCID: PMC7687117 DOI: 10.1002/cyto.a.23976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2019] [Revised: 11/07/2019] [Accepted: 01/13/2020] [Indexed: 12/01/2022]
Abstract
Single-cell platforms provide statistically large samples of snapshot observations capable of resolving intercellular heterogeneity. Currently, there is a growing literature on algorithms that exploit this attribute in order to infer the trajectory of biological mechanisms, such as cell proliferation and differentiation. Despite the efforts, the trajectory inference methodology has not yet been used for addressing the challenging problem of learning the dynamics of protein signaling systems. In this work, we assess this prospect by testing the performance of this class of algorithms on four proteomic temporal datasets. To evaluate the learning quality, we design new general-purpose evaluation metrics that are able to quantify performance on (i) the biological meaning of the output, (ii) the consistency of the inferred trajectory, (iii) the algorithm robustness, (iv) the correlation of the learning output with the initial dataset, and (v) the roughness of the cell parameter levels through the inferred trajectory. We show that experimental time alone is insufficient to provide knowledge about the order of proteins during signal transduction. Accordingly, we show that the inferred trajectories provide richer information about the underlying dynamics. We learn that established methods tested on high-dimensional data with small sample size, slow dynamics, and complex structures (e.g. bifurcations) cannot always work in the signaling setting. Among the methods we evaluate, Scorpius and a newly introduced approach that combines Diffusion Maps and Principal Curves were found to perform adequately in recovering the progression of signal transduction, although their performance on some metrics varies from one dataset to another. The novel metrics we devise highlight that it is difficult to conclude which single method is universally applicable for the task. Arguably, there are still many challenges and open problems to resolve. © 2020 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.
Affiliation(s)
- Ioannis Tsamardinos
  - Computer Science Department, University of Crete, Heraklion, Greece
  - Gnosis Data Analysis PC, Heraklion, Greece

227
Efremova M, Vento-Tormo R, Park JE, Teichmann SA, James KR. Immunology in the Era of Single-Cell Technologies. Annu Rev Immunol 2020; 38:727-757. [PMID: 32075461 DOI: 10.1146/annurev-immunol-090419-020340] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Indexed: 11/09/2022]
Abstract
Immune cells are characterized by diversity, specificity, plasticity, and adaptability-properties that enable them to contribute to homeostasis and respond specifically and dynamically to the many threats encountered by the body. Single-cell technologies, including the assessment of transcriptomics, genomics, and proteomics at the level of individual cells, are ideally suited to studying these properties of immune cells. In this review we discuss the benefits of adopting single-cell approaches in studying underappreciated qualities of immune cells and highlight examples where these technologies have been critical to advancing our understanding of the immune system in health and disease.
Affiliation(s)
- Mirjana Efremova
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, United Kingdom
- Roser Vento-Tormo
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, United Kingdom
- Jong-Eun Park
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, United Kingdom
- Sarah A Teichmann
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, United Kingdom
  - Theory of Condensed Matter, Department of Physics, University of Cambridge, Cambridgeshire CB3 0HE, United Kingdom
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire CB10 1SA, United Kingdom
- Kylie R James
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, United Kingdom

228
Wang H, Lian Y, Li C, Ma Y, Yan Z, Dong C. SIN-KNO: A method of gene regulatory network inference using single-cell transcription and gene knockout data. J Bioinform Comput Biol 2020; 17:1950035. [PMID: 32019417 DOI: 10.1142/s0219720019500355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
As a tool for interpreting and analyzing genetic data, a gene regulatory network (GRN) can reveal regulatory relationships between genes, proteins, and small molecules, and can help explain physiological activities and functions within cells, how these components interact in pathways, and how they bring about changes in the organism. Traditional GRN research analyzes regulatory relationships using gene expression averaged across cells. Such methods have difficulty identifying cell-to-cell heterogeneity in gene expression. Existing methods for inferring GRNs from single-cell transcriptional data lack expression information for genes at steady state, and the high dimensionality of single-cell data leads to high time and space complexity of the algorithms. To address these problems in GRN inference, namely the lack of cellular heterogeneity information, the complexity of single-cell data, and the lack of steady-state information, we propose a method for GRN inference using single-cell transcription and gene knockout data, called SINgle-cell transcription data-KNOckout data (SIN-KNO), which combines the dynamic and steady-state information on regulatory relationships contained in gene expression. Capturing cell heterogeneity information helps in understanding gene expression differences between cells, so that gene expression changes can be observed more accurately. Gene knockout data show the steady-state expression levels of all other genes when one gene is knocked out. Classifying the genes before analyzing the single-cell data can rule out a large number of non-existent regulatory relationships, greatly reducing the number of relationships that must be inferred. To demonstrate its efficiency, the proposed method was compared with several typical methods in this area, including GENIE3, JUMP3, and SINCERITIES. The results of the evaluation indicate that the proposed method can analyze the diverse information contained in the two types of data, establish a more accurate gene regulatory network, and improve computational efficiency. The method offers a new way of dealing with the large datasets and high computational complexity of single-cell data in GRN inference.
Affiliation(s)
- Huiqing Wang
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Yuanyuan Lian
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Chun Li
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Yue Ma
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Zhiliang Yan
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Chunlin Dong
  - Dryland Agriculture Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, Shanxi, China

229
Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods 2020; 17:147-154. [PMID: 31907445 PMCID: PMC7098173 DOI: 10.1038/s41592-019-0690-6] [Citation(s) in RCA: 385] [Impact Index Per Article: 77.0] [Received: 06/04/2019] [Accepted: 11/22/2019] [Indexed: 01/10/2023]
Abstract
We present a systematic evaluation of state-of-the-art algorithms for inferring gene regulatory networks from single-cell transcriptional data. As the ground truth for assessing accuracy, we use synthetic networks with predictable trajectories, literature-curated Boolean models and diverse transcriptional regulatory networks. We develop a strategy to simulate single-cell transcriptional data from synthetic and Boolean networks that avoids pitfalls of previously used methods. Furthermore, we collect networks from multiple experimental single-cell RNA-seq datasets. We develop an evaluation framework called BEELINE. We find that the area under the precision-recall curve and early precision of the algorithms are moderate. The methods are better in recovering interactions in synthetic networks than Boolean models. The algorithms with the best early precision values for Boolean models also perform well on experimental datasets. Techniques that do not require pseudotime-ordered cells are generally more accurate. Based on these results, we present recommendations to end users. BEELINE will aid the development of gene regulatory network inference algorithms.
Affiliation(s)
- Aditya Pratapa
  - Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Amogh P Jalihal
  - Genetics, Bioinformatics, and Computational Biology Ph.D. Program, Virginia Tech, Blacksburg, VA, USA
- Jeffrey N Law
  - Genetics, Bioinformatics, and Computational Biology Ph.D. Program, Virginia Tech, Blacksburg, VA, USA
- Aditya Bharadwaj
  - Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- T M Murali
  - Department of Computer Science, Virginia Tech, Blacksburg, VA, USA

230
Chen X, Li M, Zheng R, Wu FX, Wang J. D3GRN: a data driven dynamic network construction method to infer gene regulatory networks. BMC Genomics 2019; 20:929. [PMID: 31881937 PMCID: PMC6933629 DOI: 10.1186/s12864-019-6298-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: To infer gene regulatory networks (GRNs) from gene-expression data is still a fundamental and challenging problem in systems biology. Several existing algorithms formulate GRNs inference as a regression problem and obtain the network with an ensemble strategy. Recent studies on data driven dynamic network construction provide us a new perspective to solve the regression problem.
RESULTS: In this study, we propose a data driven dynamic network construction method to infer gene regulatory network (D3GRN), which transforms the regulatory relationship of each target gene into functional decomposition problem and solves each sub problem by using the Algorithm for Revealing Network Interactions (ARNI). To remedy the limitation of ARNI in constructing networks solely from the unit level, a bootstrapping and area based scoring method is taken to infer the final network. On DREAM4 and DREAM5 benchmark datasets, D3GRN performs competitively with the state-of-the-art algorithms in terms of AUPR.
CONCLUSIONS: We have proposed a novel data driven dynamic network construction method by combining ARNI with bootstrapping and area based scoring strategy. The proposed method performs well on the benchmark datasets, contributing as a competitive method to infer gene regulatory networks in a new perspective.
Affiliation(s)
- Xiang Chen
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Ruiqing Zheng
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Fang-Xiang Wu
  - Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, China

231
Mercatelli D, Scalambra L, Triboli L, Ray F, Giorgi FM. Gene regulatory network inference resources: A practical overview. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2019; 1863:194430. [PMID: 31678629 DOI: 10.1016/j.bbagrm.2019.194430] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 05/31/2019] [Revised: 09/06/2019] [Accepted: 09/09/2019] [Indexed: 02/08/2023]
Abstract
Transcriptional regulation is a fundamental molecular mechanism involved in almost every aspect of life, from homeostasis to development, from metabolism to behavior, from reaction to stimuli to disease progression. In recent years, the concept of Gene Regulatory Networks (GRNs) has grown popular as an effective applied biology approach for describing the complex and highly dynamic set of transcriptional interactions, due to its easy-to-interpret features. Since cataloguing, predicting and understanding every GRN connection in all species and cellular contexts remains a great challenge for biology, researchers have developed numerous tools and methods to infer regulatory processes. In this review, we catalogue these methods in six major areas, based on the dominant underlying information leveraged to infer GRNs: Coexpression, Sequence Motifs, Chromatin Immunoprecipitation (ChIP), Orthology, Literature and Protein-Protein Interaction (PPI) specifically focused on transcriptional complexes. The methods described here cover a wide range of user-friendliness: from web tools that require no prior computational expertise to command line programs and algorithms for large scale GRN inferences. Each method for GRN inference described herein effectively illustrates a type of transcriptional relationship, with many methods being complementary to others. While a truly holistic approach for inferring and displaying GRNs remains one of the greatest challenges in the field of systems biology, we believe that the integration of multiple methods described herein provides an effective means with which experimental and computational biologists alike may obtain the most complete pictures of transcriptional relationships. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Daniele Mercatelli
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Laura Scalambra
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Luca Triboli
  - Centre for Integrative Biology (CIBIO), University of Trento, Italy
- Forest Ray
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, United States
- Federico M Giorgi
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy

232
Tolios A, De Las Rivas J, Hovig E, Trouillas P, Scorilas A, Mohr T. Computational approaches in cancer multidrug resistance research: Identification of potential biomarkers, drug targets and drug-target interactions. Drug Resist Updat 2019; 48:100662. [PMID: 31927437 DOI: 10.1016/j.drup.2019.100662] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 09/07/2019] [Revised: 10/15/2019] [Accepted: 10/17/2019] [Indexed: 02/07/2023]
Abstract
Like physics in the 19th century, biology, and molecular biology in particular, has been fertilized and enhanced like few other scientific fields by the incorporation of mathematical methods. In the last decades, a whole new scientific field, bioinformatics, has developed with an output of over 30,000 papers a year (Pubmed search using the keyword "bioinformatics"). Huge databases of mass throughput data have been established, with ArrayExpress alone containing more than 2.7 million assays (October 2019). Computational methods have become indispensable tools in molecular biology, particularly in one of the most challenging areas of cancer research, multidrug resistance (MDR). However, confronted with a plethora of different algorithms, approaches, and methods, the average researcher faces key questions: Which methods exist? Which methods can be used to tackle the aims of a given study? Or, more generally, how do I use computational biology/bioinformatics to bolster my research? The current review is aimed at providing guidance to existing methods with relevance to MDR research. In particular, we provide an overview of: a) the identification of potential biomarkers using expression data; b) the prediction of treatment response by machine learning methods; c) the employment of network approaches to identify gene/protein regulatory networks and potential key players; d) the identification of drug-target interactions; e) the use of bipartite networks to identify multidrug targets; f) the identification of cellular subpopulations with the MDR phenotype; and, finally, g) the use of molecular modeling methods to guide and enhance drug discovery. This review shall serve as a guide through some of the basic concepts useful in MDR research. It shall give the reader some ideas about the possibilities in MDR research by using computational tools, and, finally, it shall provide a short overview of relevant literature.
Affiliation(s)
- A Tolios
  - Department of Blood Group Serology and Transfusion Medicine, Medical University of Vienna, Vienna, Austria
  - Department of Laboratory Medicine, Medical University of Vienna, Vienna, Austria
  - Institute of Clinical Chemistry and Laboratory Medicine, Heinrich Heine University, Duesseldorf, Germany
- J De Las Rivas
  - Bioinformatics and Functional Genomics Group, Cancer Research Center (CiC-IMBCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Científicas (CSIC) and University of Salamanca (USAL), Campus Miguel de Unamuno s/n, Salamanca, Spain
- E Hovig
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital and Center for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
- P Trouillas
  - UMR 1248 INSERM, Univ. Limoges, 2 rue du Dr Marland, 87052, Limoges, France
  - RCPTM, University Palacký of Olomouc, tr. 17. listopadu 12, 771 46, Olomouc, Czech Republic
- A Scorilas
  - Department of Biochemistry & Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Athens, Greece
- T Mohr
  - Institute of Cancer Research, Department of Medicine I, Medical University of Vienna, Vienna, Austria
  - ScienceConsult - DI Thomas Mohr KG, Guntramsdorf, Austria

233
Chew G, Petretto E. Transcriptional Networks of Microglia in Alzheimer's Disease and Insights into Pathogenesis. Genes (Basel) 2019; 10:E798. [PMID: 31614849 PMCID: PMC6826883 DOI: 10.3390/genes10100798] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 08/30/2019] [Revised: 09/30/2019] [Accepted: 10/11/2019] [Indexed: 02/07/2023] Open
Abstract
Microglia, the main immune cells of the central nervous system, are increasingly implicated in Alzheimer's disease (AD). Manifold transcriptomic studies in the brain have not only highlighted microglia's role in AD pathogenesis, but also mapped crucial pathological processes and identified new therapeutic targets. An important component of many of these transcriptomic studies is the investigation of gene expression networks in AD brain, which has provided important new insights into how coordinated gene regulatory programs in microglia (and other cell types) underlie AD pathogenesis. Given the rapid technological advancements in transcriptional profiling, spanning from microarrays to single-cell RNA sequencing (scRNA-seq), tools used for mapping gene expression networks have evolved to keep pace with the unique features of each transcriptomic platform. In this article, we review the trajectory of transcriptomic network analyses in AD from brain to microglia, highlighting the corresponding methodological developments. Lastly, we discuss examples of how transcriptional network analysis provides new insights into AD mechanisms and pathogenesis.
Affiliation(s)
- Gabriel Chew
  - Programme in Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, 8 College Road, 69857 Singapore, Singapore
- Enrico Petretto
  - Programme in Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, 8 College Road, 69857 Singapore, Singapore

234
Denyer T, Ma X, Klesen S, Scacchi E, Nieselt K, Timmermans MCP. Spatiotemporal Developmental Trajectories in the Arabidopsis Root Revealed Using High-Throughput Single-Cell RNA Sequencing. Dev Cell 2019; 48:840-852.e5. [PMID: 30913408 DOI: 10.1016/j.devcel.2019.02.022] [Citation(s) in RCA: 312] [Impact Index Per Article: 52.0] [Received: 11/30/2018] [Revised: 01/30/2019] [Accepted: 02/22/2019] [Indexed: 12/11/2022]
Abstract
High-throughput single-cell RNA sequencing (scRNA-seq) is becoming a cornerstone of developmental research, providing unprecedented power in understanding dynamic processes. Here, we present a high-resolution scRNA-seq expression atlas of the Arabidopsis root composed of thousands of independently profiled cells. This atlas provides detailed spatiotemporal information, identifying defining expression features for all major cell types, including the scarce cells of the quiescent center. These reveal key developmental regulators and downstream genes that translate cell fate into distinctive cell shapes and functions. Developmental trajectories derived from pseudotime analysis depict a finely resolved cascade of cell progressions from the niche through differentiation that are supported by mirroring expression waves of highly interconnected transcription factors. This study demonstrates the power of applying scRNA-seq to plants and provides an unparalleled spatiotemporal perspective of root cell differentiation.
Affiliation(s)
- Tom Denyer
  - Center for Plant Molecular Biology, University of Tübingen, Auf der Morgenstelle 32, Tübingen 72076, Germany
- Xiaoli Ma
  - Center for Plant Molecular Biology, University of Tübingen, Auf der Morgenstelle 32, Tübingen 72076, Germany
- Simon Klesen
  - Center for Plant Molecular Biology, University of Tübingen, Auf der Morgenstelle 32, Tübingen 72076, Germany
- Emanuele Scacchi
  - Center for Plant Molecular Biology, University of Tübingen, Auf der Morgenstelle 32, Tübingen 72076, Germany
- Kay Nieselt
  - Center for Bioinformatics, University of Tübingen, Sand 14, Tübingen 72076, Germany
- Marja C P Timmermans
  - Center for Plant Molecular Biology, University of Tübingen, Auf der Morgenstelle 32, Tübingen 72076, Germany

235
Blencowe M, Arneson D, Ding J, Chen YW, Saleem Z, Yang X. Network modeling of single-cell omics data: challenges, opportunities, and progresses. Emerg Top Life Sci 2019; 3:379-398. [PMID: 32270049 PMCID: PMC7141415 DOI: 10.1042/etls20180176] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 04/16/2019] [Revised: 06/07/2019] [Accepted: 06/24/2019] [Indexed: 01/07/2023]
Abstract
Single-cell multi-omics technologies are rapidly evolving, prompting both methodological advances and biological discoveries at an unprecedented speed. Gene regulatory network modeling has been used as a powerful approach to elucidate the complex molecular interactions underlying biological processes and systems, yet its application in single-cell omics data modeling has been met with unique challenges and opportunities. In this review, we discuss these challenges and opportunities, and offer an overview of the recent development of network modeling approaches designed to capture dynamic networks, within-cell networks, and cell-cell interaction or communication networks. Finally, we outline the remaining gaps in single-cell gene network modeling and the outlooks of the field moving forward.
Affiliation(s)
- Montgomery Blencowe
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Douglas Arneson
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Jessica Ding
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Yen-Wei Chen
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
  - Molecular Toxicology Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Zara Saleem
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Xia Yang
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
  - Molecular Toxicology Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
  - Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A

236
Zhang Q, Caudle WM, Pi J, Bhattacharya S, Andersen ME, Kaminski NE, Conolly RB. Embracing Systems Toxicology at Single-Cell Resolution. Current Opinion in Toxicology 2019; 16:49-57. [PMID: 31768481 PMCID: PMC6876623 DOI: 10.1016/j.cotox.2019.04.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 02/07/2023]
Abstract
As systems biology expands its multi-omic spectrum to increasing resolutions, distinguishing cells based on single-cell profiles becomes feasible. Unlike traditional bulk assays that average cellular responses and blur the distinct identities of responsive cells, single-cell technologies enable sensitive detection of small cellular changes and precise identification of those cells perturbed by toxicants. Among the suite of omic technologies that continue to expand and become affordable, single-cell RNA sequencing (scRNA-seq) is at the cutting edge and leading the way to transform systems toxicology. Single-cell systems toxicology can provide a wealth of information to elucidate cell-specific alterations and response trajectories, detect points-of-departure, and map and develop dynamical models of toxicity pathways.
Affiliation(s)
- Qiang Zhang
  - Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
- W. Michael Caudle
  - Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
- Jingbo Pi
  - Program of Environmental Toxicology, School of Public Health, China Medical University, Shenyang, China
- Sudin Bhattacharya
  - Department of Biomedical Engineering, Department of Pharmacology and Toxicology, Center for Research on Ingredient Safety, Institute for Quantitative Health Science and Engineering, and Institute for Integrative Toxicology, Michigan State University, East Lansing, Michigan, USA
- Norbert E. Kaminski
  - Departments of Pharmacology and Toxicology and Institute for Integrative Toxicology, Michigan State University, East Lansing, Michigan, USA
- Rory B. Conolly
  - Integrated Systems Toxicology Division, National Health and Environmental Effects Research Laboratory, United States Environmental Protection Agency, Durham, North Carolina, USA

237
Zeng T, Dai H. Single-Cell RNA Sequencing-Based Computational Analysis to Describe Disease Heterogeneity. Front Genet 2019; 10:629. [PMID: 31354786 PMCID: PMC6640157 DOI: 10.3389/fgene.2019.00629] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/17/2019] [Accepted: 06/17/2019] [Indexed: 12/25/2022] Open
Abstract
The trillions of cells in the human body can be viewed as elementary but essential biological units that achieve different body states, but the low resolution of previous cell isolation and measurement approaches limits our understanding of the cell-specific molecular profiles. The recent establishment and rapid growth of single-cell sequencing technology has facilitated the identification of molecular profiles of heterogeneous cells, especially on the transcription level of single cells [single-cell RNA sequencing (scRNA-seq)]. As a novel method, the robustness of scRNA-seq under changing conditions will determine its practical potential in major research programs and clinical applications. In this review, we first briefly presented the scRNA-seq-related methods from the point of view of experiments and computation. Then, we compared several state-of-the-art scRNA-seq analysis frameworks mainly by analyzing their performance robustness on independent scRNA-seq datasets for the same complex disease. Finally, we elaborated on our hypothesis on consensus scRNA-seq analysis and summarized the potential indicative and predictive roles of individual cells in understanding disease heterogeneity by single-cell technologies.
Affiliation(s)
- Tao Zeng
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China

238
Luecken MD, Theis FJ. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol Syst Biol 2019; 15:e8746. [PMID: 31217225 PMCID: PMC6582955 DOI: 10.15252/msb.20188746] [Citation(s) in RCA: 1139] [Impact Index Per Article: 189.8] [Received: 11/16/2018] [Revised: 03/15/2019] [Accepted: 04/03/2019] [Indexed: 12/21/2022] Open
Abstract
Single-cell RNA-seq has enabled gene expression to be studied at an unprecedented resolution. The promise of this technology is attracting a growing user base for single-cell analysis methods. As more analysis tools are becoming available, it is becoming increasingly difficult to navigate this landscape and produce an up-to-date workflow to analyse one's data. Here, we detail the steps of a typical single-cell RNA-seq analysis, including pre-processing (quality control, normalization, data correction, feature selection, and dimensionality reduction) and cell- and gene-level downstream analysis. We formulate current best-practice recommendations for these steps based on independent comparison studies. We have integrated these best-practice recommendations into a workflow, which we apply to a public dataset to further illustrate how these steps work in practice. Our documented case study can be found at https://www.github.com/theislab/single-cell-tutorial. This review will serve as a workflow tutorial for new entrants into the field, and help established users update their analysis pipelines.
Affiliation(s)
- Malte D Luecken
  - Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
  - Department of Mathematics, Technische Universität München, Garching bei München, Germany

239
Tritschler S, Büttner M, Fischer DS, Lange M, Bergen V, Lickert H, Theis FJ. Concepts and limitations for learning developmental trajectories from single cell genomics. Development 2019; 146. [DOI: 10.1242/dev.170506] [Citation(s) in RCA: 132] [Impact Index Per Article: 22.0] [Indexed: 08/30/2023]
Abstract
Single cell genomics has become a popular approach to uncover the cellular heterogeneity of progenitor and terminally differentiated cell types with great precision. This approach can also delineate lineage hierarchies and identify molecular programmes of cell-fate acquisition and segregation. Nowadays, tens of thousands of cells are routinely sequenced in single cell-based methods and even more are expected to be analysed in the future. However, interpretation of the resulting data is challenging and requires computational models at multiple levels of abstraction. In contrast to other applications of single cell sequencing, where clustering approaches dominate, developmental systems are generally modelled using continuous structures, trajectories and trees. These trajectory models carry the promise of elucidating mechanisms of development, disease and stimulation response at very high molecular resolution. However, their reliable analysis and biological interpretation requires an understanding of their underlying assumptions and limitations. Here, we review the basic concepts of such computational approaches and discuss the characteristics of developmental processes that can be learnt from trajectory models.
Affiliation(s)
- Sophie Tritschler
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85353 Freising, Germany
| | - Maren Büttner
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Department of Mathematics, Technische Universität München, 85748 Garching, Germany
| | - David S. Fischer
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85353 Freising, Germany
| | - Marius Lange
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Department of Mathematics, Technische Universität München, 85748 Garching, Germany
| | - Volker Bergen
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Department of Mathematics, Technische Universität München, 85748 Garching, Germany
| | - Heiko Lickert
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- German Center for Diabetes Research, 85764 Neuherberg, Germany
- Institute of Stem Cell Research, Helmholtz Zentrum München, 85764 Neuherberg, Germany
| | - Fabian J. Theis
- Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Department of Mathematics, Technische Universität München, 85748 Garching, Germany
| |
|
240
|
Iacono G, Massoni-Badosa R, Heyn H. Single-cell transcriptomics unveils gene regulatory network plasticity. Genome Biol 2019; 20:110. [PMID: 31159854 PMCID: PMC6547541 DOI: 10.1186/s13059-019-1713-4] [Citation(s) in RCA: 131] [Impact Index Per Article: 21.8] [Received: 11/14/2018] [Accepted: 05/08/2019] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: Single-cell RNA sequencing (scRNA-seq) plays a pivotal role in our understanding of cellular heterogeneity. Current analytical workflows are driven by categorizing principles that consider cells as individual entities and classify them into complex taxonomies.
RESULTS: We devise a conceptually different computational framework based on a holistic view, where single-cell datasets are used to infer global, large-scale regulatory networks. We develop correlation metrics that are specifically tailored to single-cell data, and then generate, validate, and interpret single-cell-derived regulatory networks from organs and perturbed systems, such as diabetes and Alzheimer's disease. Using tools from graph theory, we compute an unbiased quantification of a gene's biological relevance and accurately pinpoint key players in organ function and drivers of diseases.
CONCLUSIONS: Our approach detects multiple latent regulatory changes that are invisible to single-cell workflows based on clustering or differential expression analysis, significantly broadening the biological insights that can be obtained with this leading technology.
Affiliation(s)
- Giovanni Iacono
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028, Barcelona, Spain.
| | - Ramon Massoni-Badosa
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028, Barcelona, Spain
| | - Holger Heyn
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
| |
|
241
|
Bonnaffoux A, Herbach U, Richard A, Guillemin A, Gonin-Giraud S, Gros PA, Gandrillon O. WASABI: a dynamic iterative framework for gene regulatory network inference. BMC Bioinformatics 2019; 20:220. [PMID: 31046682 PMCID: PMC6498543 DOI: 10.1186/s12859-019-2798-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/15/2019] [Accepted: 04/09/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Inference of gene regulatory networks from gene expression data has been a long-standing and notoriously difficult task in systems biology. Recently, single-cell transcriptomic data have been massively used for gene regulatory network inference, with both successes and limitations.
RESULTS: In the present work we propose an iterative algorithm called WASABI, dedicated to inferring a causal dynamical network from time-stamped single-cell data, which tackles some of the limitations associated with current approaches. We first introduce the concept of waves, which posits that the information provided by an external stimulus will affect genes one-by-one through a cascade, like waves spreading through a network. This concept allows us to infer the network one gene at a time, after genes have been ordered regarding their time of regulation. We then demonstrate the ability of WASABI to correctly infer small networks, which have been simulated in silico using a mechanistic model consisting of coupled piecewise-deterministic Markov processes for the proper description of gene expression at the single-cell level. We finally apply WASABI on in vitro generated data on an avian model of erythroid differentiation. The structure of the resulting gene regulatory network sheds a new light on the molecular mechanisms controlling this process. In particular, we find no evidence for hub genes and a much more distributed network structure than expected. Interestingly, we find that a majority of genes are under the direct control of the differentiation-inducing stimulus.
CONCLUSIONS: Together, these results demonstrate WASABI versatility and ability to tackle some general gene regulatory networks inference issues. It is our hope that WASABI will prove useful in helping biologists to fully exploit the power of time-stamped single-cell data.
Affiliation(s)
- Arnaud Bonnaffoux
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
- Cosmotech, Lyon, France
| | - Ulysse Herbach
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
- Univ Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5208, Institut Camille Jordan, Villeurbanne, France
| | - Angélique Richard
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
| | - Anissa Guillemin
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
| | - Sandrine Gonin-Giraud
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
| | | | - Olivier Gandrillon
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
| |
|
242
|
Lee LYH, Loscalzo J. Network Medicine in Pathobiology. THE AMERICAN JOURNAL OF PATHOLOGY 2019; 189:1311-1326. [PMID: 31014954 DOI: 10.1016/j.ajpath.2019.03.009] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 01/31/2019] [Accepted: 03/05/2019] [Indexed: 12/11/2022]
Abstract
The past decade has witnessed exponential growth in the generation of high-throughput human data across almost all known dimensions of biological systems. The discipline of network medicine has rapidly evolved in parallel, providing an unbiased, comprehensive biological framework through which to interrogate and integrate systematically these large-scale, multi-omic data to enhance our understanding of disease mechanisms and to design drugs that reflect a deep knowledge of molecular pathobiology. In this review, we discuss the key principles of network medicine and the human disease network and explore the latest applications of network medicine in this multi-omic era. We also highlight the current conceptual and technological challenges, which serve as exciting opportunities by which to improve and expand the network-based applications beyond the artificial boundaries of the current state of human pathobiology.
Affiliation(s)
| | - Joseph Loscalzo
- Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.
| |
|
243
|
Computational methods for Gene Regulatory Networks reconstruction and analysis: A review. Artif Intell Med 2019; 95:133-145. [DOI: 10.1016/j.artmed.2018.10.006] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Received: 06/21/2018] [Revised: 10/23/2018] [Accepted: 10/23/2018] [Indexed: 01/14/2023]
|
244
|
Zhang W, Li W, Zhang J, Wang N. Data Integration of Hybrid Microarray and Single Cell Expression Data to Enhance Gene Network Inference. Curr Bioinform 2019. [DOI: 10.2174/1574893614666190104142228] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Abstract
Background:
Gene Regulatory Network (GRN) inference algorithms aim to explore causal interactions between genes and transcription factors. High-throughput transcriptomics data, including DNA microarray and single cell expression data, contain complementary information for network inference.
Objective:
To enhance GRN inference, data integration across various types of expression data becomes an economic and efficient solution.
Method:
In this paper, a novel E-alpha integration rule-based ensemble inference algorithm is proposed to merge complementary information from microarray and single cell expression data. This paper implements a Gradient Boosting Tree (GBT) inference algorithm to compute importance scores for candidate gene-gene pairs. The proposed E-alpha rule quantitatively evaluates the credibility levels of each information source and determines the final ranked list.
Results:
Two groups of in silico gene networks are applied to illustrate the effectiveness of the proposed E-alpha integration. Experimental outcomes with size50 and size100 in silico gene networks suggest that the proposed E-alpha rule significantly improves performance metrics compared with a single information source.
Conclusion:
In GRN inference, the integration of hybrid expression data using the E-alpha rule provides a more feasible and efficient way to enhance performance metrics than solely increasing sample sizes.
Affiliation(s)
- Wei Zhang
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| | - Wenchao Li
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| | - Jianming Zhang
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| | - Ning Wang
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| |
|
245
|
Dondelinger F, Mukherjee S. Statistical Network Inference for Time-Varying Molecular Data with Dynamic Bayesian Networks. Methods Mol Biol 2019; 1883:25-48. [PMID: 30547395 DOI: 10.1007/978-1-4939-8882-2_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 03/24/2023]
Abstract
In this chapter, we review the problem of network inference from time-course data, focusing on a class of graphical models known as dynamic Bayesian networks (DBNs). We discuss the relationship of DBNs to models based on ordinary differential equations, and consider extensions to nonlinear time dynamics. We provide an introduction to time-varying DBN models, which allow for changes to the network structure and parameters over time. We also discuss causal perspectives on network inference, including issues around model semantics that can arise due to missing variables. We present a case study of applying time-varying DBNs to gene expression measurements over the life cycle of Drosophila melanogaster. We finish with a discussion of future perspectives, including possible applications of time-varying network inference to single-cell gene expression data.
Affiliation(s)
| | - Sach Mukherjee
- German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
| |
|
246
|
Todorov H, Cannoodt R, Saelens W, Saeys Y. Network Inference from Single-Cell Transcriptomic Data. Methods Mol Biol 2019; 1883:235-249. [PMID: 30547403 DOI: 10.1007/978-1-4939-8882-2_10] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 12/12/2022]
Abstract
Recent technological breakthroughs in single-cell RNA sequencing are revolutionizing modern experimental design in biology. The increasing size of the single-cell expression data from which networks can be inferred allows identifying more complex, non-linear dependencies between genes. Moreover, the inter-cellular variability that is observed in single-cell expression data can be used to infer not only one global network representing all the cells, but also numerous regulatory networks that are more specific to certain conditions. By experimentally perturbing certain genes, the deconvolution of the true contribution of these genes can also be greatly facilitated. In this chapter, we will therefore tackle the advantages of single-cell transcriptomic data and show how new methods exploit this novel data type to enhance the inference of gene regulatory networks.
Affiliation(s)
- Helena Todorov
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
- Centre International de Recherche en Infectiologie, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, École Normale Supérieure de Lyon, Univ Lyon, Lyon, France
| | - Robrecht Cannoodt
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Wouter Saelens
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Yvan Saeys
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| |
|
247
|
Todorov H, Saeys Y. Computational approaches for high-throughput single-cell data analysis. FEBS J 2018; 286:1451-1467. [DOI: 10.1111/febs.14613] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/22/2018] [Revised: 06/04/2018] [Accepted: 07/25/2018] [Indexed: 12/31/2022]
Affiliation(s)
- Helena Todorov
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
- Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, École Normale Supérieure de Lyon, Univ Lyon, France
| | - Yvan Saeys
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
| |
|
248
|
Hwang B, Lee JH, Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med 2018; 50:1-14. [PMID: 30089861 PMCID: PMC6082860 DOI: 10.1038/s12276-018-0071-8] [Citation(s) in RCA: 1027] [Impact Index Per Article: 146.7] [Received: 11/13/2017] [Accepted: 12/13/2017] [Indexed: 12/15/2022] Open
Abstract
Rapid progress in the development of next-generation sequencing (NGS) technologies in recent years has provided many valuable insights into complex biological systems, ranging from cancer genomics to diverse microbial communities. NGS-based technologies for genomics, transcriptomics, and epigenomics are now increasingly focused on the characterization of individual cells. These single-cell analyses will allow researchers to uncover new and potentially unexpected biological discoveries relative to traditional profiling methods that assess bulk populations. Single-cell RNA sequencing (scRNA-seq), for example, can reveal complex and rare cell populations, uncover regulatory relationships between genes, and track the trajectories of distinct cell lineages in development. In this review, we will focus on technical challenges in single-cell isolation and library preparation and on computational analysis pipelines available for analyzing scRNA-seq data. Further technical improvements at the level of molecular and cell biology and in available bioinformatics tools will greatly facilitate both the basic science and medical applications of these sequencing technologies.
Showing which genes are expressed, or switched on, in individual cells may help to reveal the first signs of disease. Each cell in an organism contains the same genetic information, but cell type and behavior depend on which genes are expressed. Previously, researchers could only sequence cells in batches, averaging the results, but technological improvements now allow sequencing of the genes expressed in an individual cell, known as single-cell RNA sequencing (scRNA-seq). Ji Hyun Lee (Kyung Hee University, Seoul) and Duhee Bang and Byungjin Hwang (Yonsei University, Seoul) have reviewed the available scRNA-seq technologies and the strategies available to analyze the large quantities of data produced. They conclude that scRNA-seq will impact both basic and medical science, from illuminating drug resistance in cancer to revealing the complex pathways of cell differentiation during development.
Affiliation(s)
- Byungjin Hwang
- Department of Chemistry, Yonsei University, Seoul, Korea
| | - Ji Hyun Lee
- Department of Clinical Pharmacology and Therapeutics, College of Medicine, Kyung Hee University, Seoul, Korea
- Kyung Hee Medical Science Research Institute, Kyung Hee University, Seoul, Korea
| | - Duhee Bang
- Department of Chemistry, Yonsei University, Seoul, Korea.
| |
|
249
|
Hon CC, Shin JW, Carninci P, Stubbington MJT. The Human Cell Atlas: Technical approaches and challenges. Brief Funct Genomics 2018; 17:283-294. [PMID: 29092000 PMCID: PMC6063304 DOI: 10.1093/bfgp/elx029] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 12/23/2022] Open
Abstract
The Human Cell Atlas is a large, international consortium that aims to identify and describe every cell type in the human body. The comprehensive cellular maps that arise from this ambitious effort have the potential to transform many aspects of fundamental biology and clinical practice. Here, we discuss the technical approaches that could be used today to generate such a resource and also the technical challenges that will be encountered.
Affiliation(s)
- Chung-Chau Hon
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
| | - Jay W Shin
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
| | - Piero Carninci
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
| | | |
|
250
|
Fiers MWEJ, Minnoye L, Aibar S, Bravo González-Blas C, Kalender Atak Z, Aerts S. Mapping gene regulatory networks from single-cell omics data. Brief Funct Genomics 2018; 17:246-254. [PMID: 29342231 PMCID: PMC6063279 DOI: 10.1093/bfgp/elx046] [Citation(s) in RCA: 143] [Impact Index Per Article: 20.4] [Indexed: 12/24/2022] Open
Abstract
Single-cell techniques are advancing rapidly and are yielding unprecedented insight into cellular heterogeneity. Mapping the gene regulatory networks (GRNs) underlying cell states provides attractive opportunities to mechanistically understand this heterogeneity. In this review, we discuss recently emerging methods to map GRNs from single-cell transcriptomics data, tackling the challenge of increased noise levels and data sparsity compared with bulk data, alongside increasing data volumes. Next, we discuss how new techniques for single-cell epigenomics, such as single-cell ATAC-seq and single-cell DNA methylation profiling, can be used to decipher gene regulatory programmes. We finally look forward to the application of single-cell multi-omics and perturbation techniques that will likely play important roles for GRN inference in the future.
Affiliation(s)
- Mark W E J Fiers
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
| | - Liesbeth Minnoye
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Sara Aibar
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Carmen Bravo González-Blas
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Zeynep Kalender Atak
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Stein Aerts
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| |
|