1
Gu S, Chen C, Wang J, Wang Y, Zhao L, Xiong Z, Zhang H, Deng T, Pan Q, Zheng Y, Li Y. Camellia Japonica Radix modulates gut microbiota and 9(S)-HpODE-mediated ferroptosis to alleviate oxidative stress against MASLD. Phytomedicine: International Journal of Phytotherapy and Phytopharmacology 2025; 143:156806. [PMID: 40334428 DOI: 10.1016/j.phymed.2025.156806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2025] [Revised: 04/04/2025] [Accepted: 04/25/2025] [Indexed: 05/09/2025]
Abstract
BACKGROUND Camellia japonica radix (CJR), derived from the root of Camellia japonica L., has potential as an herbal tea substitute for the prevention and intervention of metabolic dysfunction-associated steatotic liver disease (MASLD). It can provide systemic therapeutic benefits, has a favorable safety profile, is convenient to consume, and is suitable for long-term use. Despite this potential, research on CJR remains limited.
PURPOSE This study aims to elucidate the therapeutic mechanisms of CJR in MASLD, thereby providing evidence to support its clinical application.
METHODS The therapeutic effects of CJR were evaluated using a water-supplementation model in MASLD mice. Integrated microbiome, transcriptome, proteome, and metabolome analyses were employed to comprehensively explore the mechanisms involved. A drug-target pull-down assay was performed to identify specific protein targets of small molecule metabolites in vitro. Fecal microbiota transplantation in antibiotic-treated (ABX) mice was conducted to confirm the critical role of gut microbiota and its metabolites. Furthermore, customized medicated feed supplemented with linoleic acid was used to explore the intervention effect of its metabolite, 9(S)-HpODE, and to evaluate its dietary intervention potential.
RESULTS This study demonstrates the efficacy of CJR extract in alleviating hepatic inflammation and steatosis in MASLD model mice, with its pharmacological mechanism associated with gut microbiota, linoleic acid metabolism, and GPX4-mediated ferroptosis. Notably, 9(S)-HpODE was identified as a key metabolite of linoleic acid that can target both KEAP1 and SLC7A11, bidirectionally regulating GPX4-mediated ferroptosis, while acting as a signaling molecule at low doses to induce redox adaptation via oxidative preconditioning, thus ameliorating oxidative stress in MASLD.
CONCLUSION Our findings indicate that both CJR and linoleic acid exhibit significant potential as dietary interventions for the management of MASLD, offering promising avenues for future research and clinical application.
Affiliation(s)
- Simin Gu
- Department of Gastroenterology, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Chong Chen
- Department of Gastroenterology, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Junmin Wang
- Department of Gastroenterology, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yanping Wang
- Department of Gastroenterology, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Lina Zhao
- Department of Hepatobiliary Diseases, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, State Key Laboratory of Traditional Chinese Medicine Syndrome, Guangzhou, China
- Zhekun Xiong
- Department of Spleen, Stomach and Hepatobiliary, Zhongshan Hospital of Traditional Chinese Medicine, Zhongshan, China
- Hui Zhang
- Department of Spleen, Stomach and Hepatobiliary, Zhongshan Hospital of Traditional Chinese Medicine, Zhongshan, China
- Taoying Deng
- Department of Spleen, Stomach and Hepatobiliary, Zhongshan Hospital of Traditional Chinese Medicine, Zhongshan, China
- Qihui Pan
- Department of Gastroenterology, Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yiyuan Zheng
- Department of Gastroenterology, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
- Yong Li
- Department of Gastroenterology, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
2
Qiao L, Khalilimeybodi A, Linden-Santangeli NJ, Rangamani P. The Evolution of Systems Biology and Systems Medicine: From Mechanistic Models to Uncertainty Quantification. Annu Rev Biomed Eng 2025; 27:425-447. [PMID: 39971380 DOI: 10.1146/annurev-bioeng-102723-065309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Understanding interaction mechanisms within cells, tissues, and organisms is crucial for driving developments across biology and medicine. Mathematical modeling is an essential tool for simulating such biological systems. Building on experiments, mechanistic models are widely used to describe small-scale intracellular networks. The development of sequencing techniques and computational tools has recently enabled multiscale models. Combining such larger scale network modeling with mechanistic modeling provides us with an opportunity to reveal previously unknown disease mechanisms and pharmacological interventions. Here, we review systems biology models from mechanistic models to multiscale models that integrate multiple layers of cellular networks and discuss how they can be used to shed light on disease states and even wellness-related states. Additionally, we introduce several methods that increase the certainty and accuracy of model predictions. Thus, combining mechanistic models with emerging mathematical and computational techniques can provide us with increasingly powerful tools to understand disease states and inspire drug discoveries.
Affiliation(s)
- Lingxia Qiao
- Department of Pharmacology, University of California San Diego, La Jolla, California, USA;
- Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, California, USA
- Ali Khalilimeybodi
- Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, California, USA
- Padmini Rangamani
- Department of Pharmacology, University of California San Diego, La Jolla, California, USA;
- Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, California, USA
3
Wu Y, Wu PH, Chambliss A, Wirtz D, Sun SX. Unifying fragmented perspectives with additive deep learning for high-dimensional models from partial faceted datasets. npj Biological Physics and Mechanics 2025; 2:5. [PMID: 40012561 PMCID: PMC11850287 DOI: 10.1038/s44341-025-00009-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Accepted: 01/08/2025] [Indexed: 02/28/2025]
Abstract
Biological systems are complex networks where measurable functions emerge from interactions among thousands of components. Many studies aim to link biological function with molecular elements, yet quantifying their contributions simultaneously remains challenging, especially at the single-cell level. We propose a machine-learning approach that integrates faceted data subsets to reconstruct a complete view of the system using conditional distributions. We develop both polynomial regression and neural network models, validated with two examples: a mechanical spring network under external forces and an 8-dimensional biological network involving the senescence marker P53, using single-cell data. Our results demonstrate successful system reconstruction from partial datasets, with predictive accuracy improving as more variables are measured. This approach offers a systematic method to integrate fragmented experimental data, enabling unbiased and holistic modeling of complex biological functions.
Affiliation(s)
- Yufei Wu
- Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD USA
- Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, MD USA
- Pei-Hsun Wu
- Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, MD USA
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD USA
- Allison Chambliss
- Department of Pathology & Laboratory Medicine, University of California Los Angeles, Los Angeles, CA USA
- Denis Wirtz
- Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, MD USA
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD USA
- Sean X. Sun
- Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD USA
- Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, MD USA
- Center for Cell Dynamics, Johns Hopkins School of Medicine, Baltimore, MD USA
4
Li X, Dong X, Zhang W, Shi Z, Liu Z, Sa Y, Li L, Ni N, Mei Y. Multi-omics in exploring the pathophysiology of diabetic retinopathy. Front Cell Dev Biol 2024; 12:1500474. [PMID: 39723239 PMCID: PMC11668801 DOI: 10.3389/fcell.2024.1500474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Accepted: 11/25/2024] [Indexed: 12/28/2024] Open
Abstract
Diabetic retinopathy (DR) is a leading global cause of vision impairment, with its prevalence increasing alongside the rising rates of diabetes mellitus (DM). Despite the retina's complex structure, the underlying pathology of DR remains incompletely understood. Single-cell RNA sequencing (scRNA-seq) and recent advancements in multi-omics analyses have revolutionized molecular profiling, enabling high-throughput analysis and comprehensive characterization of complex biological systems. This review highlights the significant contributions of scRNA-seq, in conjunction with other multi-omics technologies, to DR research. Integrated scRNA-seq and transcriptomic analyses have revealed novel insights into DR pathogenesis, including alternative transcription start site events, fluctuations in cell populations, altered gene expression profiles, and critical signaling pathways within retinal cells. Furthermore, by integrating scRNA-seq with genetic association studies and multi-omics analyses, researchers have identified novel biomarkers, susceptibility genes, and potential therapeutic targets for DR, emphasizing the importance of specific retinal cell types in disease progression. The integration of scRNA-seq with metabolomics has also been instrumental in identifying specific metabolites and dysregulated pathways associated with DR. It is highly conceivable that the continued synergy between scRNA-seq and other multi-omics approaches will accelerate the discovery of underlying mechanisms and the development of novel therapeutic interventions for DR.
Affiliation(s)
- Xinlu Li
- Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, China
- Department of Ophthalmology, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, China
- Department of Ophthalmology, The First People’s Hospital of Yunnan Province, Kunming, China
- Medical School, Kunming University of Science and Technology, Kunming, China
- XiaoJing Dong
- Department of Ophthalmology, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, China
- Department of Ophthalmology, The First People’s Hospital of Yunnan Province, Kunming, China
- Medical School, Kunming University of Science and Technology, Kunming, China
- Wen Zhang
- Medical School, Kunming University of Science and Technology, Kunming, China
- Zhizhou Shi
- Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, China
- Zhongjian Liu
- Institute of Basic and Clinical Medicine, The First People’s Hospital of Yunnan Province, Kunming, China
- Yalian Sa
- Institute of Basic and Clinical Medicine, The First People’s Hospital of Yunnan Province, Kunming, China
- Li Li
- Institute of Basic and Clinical Medicine, The First People’s Hospital of Yunnan Province, Kunming, China
- Ninghua Ni
- Department of Ophthalmology, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, China
- Department of Ophthalmology, The First People’s Hospital of Yunnan Province, Kunming, China
- Medical School, Kunming University of Science and Technology, Kunming, China
- Yan Mei
- Department of Ophthalmology, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, China
- Department of Ophthalmology, The First People’s Hospital of Yunnan Province, Kunming, China
- Medical School, Kunming University of Science and Technology, Kunming, China
5
Qi G, Si Z, Xuan L, Han Z, Hu Y, Fang L, Dai F, Zhang T. Unravelling the genetic basis and regulation networks related to fibre quality improvement using chromosome segment substitution lines in cotton. Plant Biotechnology Journal 2024; 22:3135-3150. [PMID: 39046162 PMCID: PMC11500987 DOI: 10.1111/pbi.14436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 07/02/2024] [Accepted: 07/06/2024] [Indexed: 07/25/2024]
Abstract
The elucidation of the genetic architecture and molecular regulatory networks underlying complex traits remains a significant challenge in life science, largely due to the substantial background effects that arise from epistasis and gene-environment interactions. The chromosome segment substitution line (CSSL) is an ideal material for genetic and molecular dissection of complex traits due to its near-isogenic properties; yet a comprehensive analysis, from the basic identification of substitution segments to advanced regulatory networks, is still lacking. Here, we developed two cotton CSSL populations on the Gossypium hirsutum background, representing wide adaptation and high lint yield, with introgression from G. barbadense, representing superior fibre quality. We sequenced 99 CSSLs that demonstrated significant differences from G. hirsutum in fibre, and characterized 836 dynamic fibre transcriptomes in three crucial developmental stages. We developed a workflow for precise resolution of chromosomal substitution segments; the genome sequencing revealed substitutions collectively representing 87.25% of the G. barbadense genome. Together, the genomic and transcriptomic survey identified 18 novel fibre-quality-related quantitative trait loci with high genetic contributions and the comprehensive landscape of fibre development regulation. Furthermore, analysis determined unique cis-expression patterns in CSSLs to be the driving force for fibre quality alteration; building upon this, the co-expression regulatory network revealed biological relationships among the noted pathways and accurately described the molecular interactions of GhHOX3, GhRDL1 and GhEXPA1 during fibre elongation, along with reliable predictions for their interactions with GhTBA8A5. Our study will promote more strategic use of CSSLs in crop molecular biology and breeding programmes.
Affiliation(s)
- Guoan Qi
- Hainan Institute of Zhejiang University, Yazhou Bay Science and Technology City, Sanya, Hainan, China
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
- Zhanfeng Si
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
- Lisha Xuan
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
- Zegang Han
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
- Yan Hu
- Hainan Institute of Zhejiang University, Yazhou Bay Science and Technology City, Sanya, Hainan, China
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
- Lei Fang
- Hainan Institute of Zhejiang University, Yazhou Bay Science and Technology City, Sanya, Hainan, China
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
- Fan Dai
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
- Tianzhen Zhang
- Hainan Institute of Zhejiang University, Yazhou Bay Science and Technology City, Sanya, Hainan, China
- The Advanced Seed Institute, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, Zhejiang, China
6
Cingiz MÖ. k-Strong Inference Algorithm: A Hybrid Information Theory Based Gene Network Inference Algorithm. Mol Biotechnol 2024; 66:3213-3225. [PMID: 37950851 DOI: 10.1007/s12033-023-00929-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 10/05/2023] [Indexed: 11/13/2023]
Abstract
Gene networks allow researchers to understand the underlying mechanisms between diseases and genes while reducing the need for wet lab experiments. Numerous gene network inference (GNI) algorithms have been presented in the literature to infer accurate gene networks. We proposed a hybrid GNI algorithm, k-Strong Inference Algorithm (ksia), to infer more reliable and robust gene networks from omics datasets. To increase reliability, ksia integrates Pearson correlation coefficient (PCC) and Spearman rank correlation coefficient (SCC) scores to determine mutual information scores between molecules, increasing the diversity of relation predictions. To infer a more robust gene network, ksia applies three different elimination steps to remove redundant and spurious relations between genes. The performance of ksia was evaluated on the microbe microarrays database in an overlap analysis with other GNI algorithms, namely ARACNE, C3NET, CLR, and MRNET. Ksia inferred fewer relations because of its strict elimination steps. However, ksia generally performed better on Escherichia coli (E. coli) and Saccharomyces cerevisiae (yeast) gene expression datasets in terms of F-measure and precision. The integration of association estimator scores and three elimination stages slightly increases the performance of ksia-based gene networks. The ksia R package and its user manual are available at https://github.com/ozgurcingiz/ksia.
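The abstract says that PCC and SCC scores are integrated into a mutual information score but does not give the rule. The Python sketch below shows one plausible way such a combination could work, purely as an illustration: taking the larger of |PCC| and |SCC| and plugging it into the Gaussian mutual information formula are both assumptions, and `ranks` and `hybrid_mi` are hypothetical names, not part of the ksia R package.

```python
import numpy as np

def ranks(v):
    # Ordinal ranks; ties are broken by position, which is adequate for continuous data.
    return np.argsort(np.argsort(v))

def hybrid_mi(x, y):
    """Gaussian-style MI estimate from the stronger of Pearson and Spearman (assumed rule)."""
    pcc = np.corrcoef(x, y)[0, 1]                # linear association
    scc = np.corrcoef(ranks(x), ranks(y))[0, 1]  # monotonic association (Pearson on ranks)
    r = max(abs(pcc), abs(scc))
    r = min(r, 1.0 - 1e-12)                      # keep the logarithm finite when r is 1
    return -0.5 * np.log(1.0 - r * r)

# A monotonic but non-linear relation: Spearman is 1 while Pearson is lower.
x = np.linspace(0.1, 5.0, 50)
y = np.exp(x)
pearson_only = -0.5 * np.log(1.0 - np.corrcoef(x, y)[0, 1] ** 2)
combined = hybrid_mi(x, y)
```

In this toy case the combined score exceeds the Pearson-only score, which illustrates why adding a rank-based estimator could diversify the relations that are detected.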
Affiliation(s)
- Mustafa Özgür Cingiz
- Computer Engineering Department, Faculty of Engineering and Natural Sciences, Bursa Technical University, Mimar Sinan Campus, Yildirim, 16310, Bursa, Turkey.
7
Peng H, Xu J, Liu K, Liu F, Zhang A, Zhang X. EIEPCF: accurate inference of functional gene regulatory networks by eliminating indirect effects from confounding factors. Brief Funct Genomics 2024; 23:373-383. [PMID: 37642217 DOI: 10.1093/bfgp/elad040] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 07/07/2023] [Accepted: 08/14/2023] [Indexed: 08/31/2023] Open
Abstract
Reconstructing functional gene regulatory networks (GRNs) is a primary prerequisite for understanding pathogenic mechanisms and curing diseases in animals, and it also provides an important foundation for cultivating vegetable and fruit varieties that are resistant to diseases and corrosion in plants. Many computational methods have been developed to infer GRNs, but most of the regulatory relationships between genes obtained by these methods are biased. Eliminating indirect effects in GRNs remains a significant challenge for researchers. In this work, we propose a novel approach for inferring functional GRNs, named EIEPCF (eliminating indirect effects produced by confounding factors), which eliminates indirect effects caused by confounding factors. This method eliminates the influence of confounding factors on regulatory factors and target genes by measuring the similarity between their residuals. Validation of the EIEPCF method on simulation studies, the gold-standard networks provided by the DREAM3 Challenge and the real gene networks of Escherichia coli demonstrates that it achieves significantly higher accuracy than other popular computational methods for inferring GRNs. As a case study, we used the EIEPCF method to reconstruct a cold-resistance-specific GRN from cold-resistance gene expression data in Arabidopsis thaliana. The source code and data are available at https://github.com/zhanglab-wbgcas/EIEPCF.
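The idea of comparing residuals can be illustrated with a minimal Python sketch. It assumes a linear fit on the confounders and Pearson correlation as the similarity measure, which is ordinary partial correlation and not necessarily what EIEPCF implements; `residual_similarity` and the simulated genes are hypothetical.

```python
import numpy as np

def residual_similarity(regulator, target, confounders):
    """Correlate what remains of regulator and target after a linear fit on the confounders."""
    # Design matrix: an intercept column plus one column per confounder.
    Z = np.column_stack([np.ones(len(regulator)), confounders])
    # Least-squares fit of each variable on the confounders, then take residuals.
    res_reg = regulator - Z @ np.linalg.lstsq(Z, regulator, rcond=None)[0]
    res_tgt = target - Z @ np.linalg.lstsq(Z, target, rcond=None)[0]
    return float(np.corrcoef(res_reg, res_tgt)[0, 1])

rng = np.random.default_rng(0)
z = rng.normal(size=500)             # hypothetical confounding gene
x = z + 0.1 * rng.normal(size=500)   # candidate regulator, driven by z
y = z + 0.1 * rng.normal(size=500)   # target, driven by z only (no direct x -> y edge)

raw = float(np.corrcoef(x, y)[0, 1])                     # high: indirect effect through z
adjusted = residual_similarity(x, y, z.reshape(-1, 1))   # near zero once z is removed
```

Here the raw correlation between `x` and `y` is high only because both depend on `z`; after the confounder is regressed out, the residuals are nearly uncorrelated, so the spurious edge would be dropped.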
Affiliation(s)
- Huixiang Peng
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- University of Chinese Academy of Sciences, Beijing 100049 China
- Jing Xu
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- University of Chinese Academy of Sciences, Beijing 100049 China
- Kangchen Liu
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- University of Chinese Academy of Sciences, Beijing 100049 China
- Fang Liu
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- Aidi Zhang
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- Xiujun Zhang
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan 430074, China
8
Chee FT, Harun S, Mohd Daud K, Sulaiman S, Nor Muhammad NA. Exploring gene regulation and biological processes in insects: Insights from omics data using gene regulatory network models. Progress in Biophysics and Molecular Biology 2024; 189:1-12. [PMID: 38604435 DOI: 10.1016/j.pbiomolbio.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/18/2023] [Accepted: 04/03/2024] [Indexed: 04/13/2024]
Abstract
A gene regulatory network (GRN) comprises complicated yet intertwined gene-regulator relationships. Understanding GRN dynamics will unravel the complexity behind observed gene expression. Insect gene regulation is often complicated due to insects' complex life cycles and diverse ecological adaptations. This review provides an update on current mathematical modelling methods for GRNs as applied to insect science. Several popular GRN architecture models are discussed, together with examples of applications in insect science. In the last part of this review, the models are compared from different aspects, including network scalability, computational complexity, robustness to noise and biological relevance.
Affiliation(s)
- Fong Ting Chee
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
- Sarahani Harun
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
- Kauthar Mohd Daud
- Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600, UKM Bangi, Selangor, Malaysia
- Suhaila Sulaiman
- FGV R&D Sdn Bhd, FGV Innovation Center, PT23417 Lengkuk Teknologi, Bandar Baru Enstek, 71760 Nilai, Negeri Sembilan, Malaysia
- Nor Azlan Nor Muhammad
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia.
9
Wei PJ, Guo Z, Gao Z, Ding Z, Cao RF, Su Y, Zheng CH. Inference of gene regulatory networks based on directed graph convolutional networks. Brief Bioinform 2024; 25:bbae309. [PMID: 38935070 PMCID: PMC11209731 DOI: 10.1093/bib/bbae309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 05/17/2024] [Indexed: 06/28/2024] Open
Abstract
Inferring gene regulatory networks (GRNs) is one of the important challenges in systems biology, and many outstanding computational methods have been proposed; however, challenges remain, especially with real datasets. In this study, we propose a directed graph convolutional neural network-based method for GRN inference (DGCGRN). To better understand and process the directed graph structure data of GRNs, a directed graph convolutional neural network is constructed that retains the structural information of the directed graph while also making full use of neighbor node features. A local augmentation strategy is adopted in the graph neural network to address the poor prediction accuracy caused by the large number of low-degree nodes in GRNs. In addition, for real data such as E. coli, sequence features are obtained by extracting hidden features using Bi-GRU and calculating the statistical physicochemical characteristics of gene sequences. At the training stage, a dynamic update strategy is used to convert the obtained edge prediction scores into edge weights to guide the subsequent training process of the model. The results on synthetic benchmark datasets and real datasets show that the prediction performance of DGCGRN is significantly better than that of existing models. Furthermore, case studies on bladder uroepithelial carcinoma and lung cancer cells also illustrate the performance of the proposed model.
Affiliation(s)
- Pi-Jing Wei
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Ziqiang Guo
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Zhen Gao
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Zheng Ding
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Rui-Fen Cao
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Yansen Su
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Chun-Hou Zheng
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
10
Feng J, Song H, Province M, Li G, Payne PRO, Chen Y, Li F. PathFinder: a novel graph transformer model to infer multi-cell intra- and inter-cellular signaling pathways and communications. Front Cell Neurosci 2024; 18:1369242. [PMID: 38846640 PMCID: PMC11155453 DOI: 10.3389/fncel.2024.1369242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 04/30/2024] [Indexed: 06/09/2024] Open
Abstract
Recently, large-scale scRNA-seq datasets have been generated to understand the complex signaling mechanisms within the microenvironment of Alzheimer's Disease (AD), which are critical for identifying novel therapeutic targets and for precision medicine. However, the background signaling networks are highly complex and interactive. It remains challenging to infer the core intra- and inter-multi-cell signaling communication networks using scRNA-seq data. In this study, we introduced a novel graph transformer model, PathFinder, to infer intra- and inter-cellular signaling pathways and communications among multiple cell types. Compared with existing models, the novel and unique design of PathFinder is based on the divide-and-conquer strategy. This model divides complex signaling networks into signaling paths, which are then scored and ranked using a novel graph transformer architecture to infer intra- and inter-cell signaling communications. We evaluated the performance of PathFinder using two scRNA-seq data cohorts. The first is an APOE4 genotype-specific AD cohort, and the second is a human cirrhosis cohort. The evaluation confirms the promising potential of using PathFinder as a general signaling network inference model.
Affiliation(s)
- Jiarui Feng
- Institute for Informatics (I2), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, United States
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO, United States
- Haoran Song
- Institute for Informatics (I2), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, United States
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO, United States
- Michael Province
- Division of Statistical Genomics, Department of Genetics, Washington University in St. Louis, St. Louis, MO, United States
- Guangfu Li
- Department of Surgery, University of Missouri-Columbia, Columbia, MO, United States
- Department of Molecular Microbiology and Immunology, University of Missouri-Columbia, Columbia, MO, United States
- NextGen Precision Health Institute, University of Missouri-Columbia, Columbia, MO, United States
- Philip R. O. Payne
- Institute for Informatics (I2), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, United States
- Yixin Chen
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO, United States
- Fuhai Li
- Institute for Informatics (I2), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, United States
- Department of Pediatrics, Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, United States
11
Stock M, Popp N, Fiorentino J, Scialdone A. Topological benchmarking of algorithms to infer gene regulatory networks from single-cell RNA-seq data. Bioinformatics 2024; 40:btae267. [PMID: 38627250 PMCID: PMC11096270 DOI: 10.1093/bioinformatics/btae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 02/28/2024] [Accepted: 04/16/2024] [Indexed: 05/18/2024] Open
Abstract
MOTIVATION In recent years, many algorithms for inferring gene regulatory networks from single-cell transcriptomic data have been published. Several studies have evaluated their accuracy in estimating the presence of an interaction between pairs of genes. However, these benchmarking analyses do not quantify the algorithms' ability to capture structural properties of networks, which are fundamental, e.g., for studying the robustness of a gene network to external perturbations. Here, we devise a three-step benchmarking pipeline called STREAMLINE that quantifies the ability of algorithms to capture topological properties of networks and identify hubs. RESULTS To this aim, we use data simulated from different types of networks as well as experimental data from three different organisms. We apply our benchmarking pipeline to four inference algorithms and provide guidance on which algorithm should be used depending on the global network property of interest. AVAILABILITY AND IMPLEMENTATION STREAMLINE is available at https://github.com/ScialdoneLab/STREAMLINE. The data generated in this study are available at https://doi.org/10.5281/zenodo.10710444.
Affiliation(s)
- Marco Stock
- Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
- Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich 85354, Germany
- Niclas Popp
- Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
- Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Jonathan Fiorentino
- Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
- Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Antonio Scialdone
- Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
- Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
12
|
Ranjan R, Srijan S, Balekuttira S, Agarwal T, Ramey M, Dobbins M, Kuhn R, Wang X, Hudson K, Li Y, Varala K. Organ-delimited gene regulatory networks provide high accuracy in candidate transcription factor selection across diverse processes. Proc Natl Acad Sci U S A 2024; 121:e2322751121. [PMID: 38652750 PMCID: PMC11066984 DOI: 10.1073/pnas.2322751121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 03/14/2024] [Indexed: 04/25/2024] Open
Abstract
Organ-specific gene expression datasets that include hundreds to thousands of experiments allow the reconstruction of organ-level gene regulatory networks (GRNs). However, creating such datasets is greatly hampered by the requirements of extensive and tedious manual curation. Here, we trained a supervised classification model that can accurately classify the organ-of-origin for a plant transcriptome. This K-Nearest Neighbor-based multiclass classifier was used to create organ-specific gene expression datasets for the leaf, root, shoot, flower, and seed in Arabidopsis thaliana. A GRN inference approach was used to determine (i) the influential transcription factors (TFs) in each organ and (ii) the most influential TFs for specific biological processes in that organ. These genome-wide, organ-delimited GRNs (OD-GRNs) recalled many known regulators of organ development and processes operating in those organs. Importantly, many previously unknown TF regulators were uncovered as potential regulators of these processes. As a proof-of-concept, we focused on experimentally validating the predicted TF regulators of lipid biosynthesis in seeds, an important food and biofuel trait. Of the top 20 predicted TFs, eight are known regulators of seed oil content, e.g., WRI1, LEC1, FUS3. Importantly, we validated our prediction of MybS2, TGA4, SPL12, AGL18, and DiV2 as regulators of seed lipid biosynthesis. We elucidated the molecular mechanism of MybS2 and show that it induces purple acid phosphatase family genes and lipid synthesis genes to enhance seed lipid content. This general approach has the potential to be extended to any species with sufficiently large gene expression datasets to find unique regulators of any trait-of-interest.
Affiliation(s)
- Rajeev Ranjan
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Center for Plant Biology, Purdue University, West Lafayette, IN 47907
- Sonali Srijan
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Somaiah Balekuttira
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Tina Agarwal
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Center for Plant Biology, Purdue University, West Lafayette, IN 47907
- Melissa Ramey
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Madison Dobbins
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Rachel Kuhn
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Xiaojin Wang
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Center for Plant Biology, Purdue University, West Lafayette, IN 47907
- Karen Hudson
- United States Department of Agriculture-Agricultural Research Service Crop Production and Pest Control Research Unit, West Lafayette, IN 47907
- Ying Li
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Center for Plant Biology, Purdue University, West Lafayette, IN 47907
- Kranthi Varala
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN 47907
- Center for Plant Biology, Purdue University, West Lafayette, IN 47907
13
|
Prokop B, Gelens L. From biological data to oscillator models using SINDy. iScience 2024; 27:109316. [PMID: 38523784 PMCID: PMC10959654 DOI: 10.1016/j.isci.2024.109316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 01/18/2024] [Accepted: 02/18/2024] [Indexed: 03/26/2024] Open
Abstract
Periodic changes in the concentration or activity of different molecules regulate vital cellular processes such as cell division and circadian rhythms. Developing mathematical models is essential to better understand the mechanisms underlying these oscillations. Recent data-driven methods like SINDy have fundamentally changed model identification, yet their application to experimental biological data remains limited. This study investigates SINDy's constraints by directly applying it to biological oscillatory data. We identify insufficient resolution, noise, dimensionality, and limited prior knowledge as primary limitations. Using various generic oscillator models of different complexity and/or dimensionality, we systematically analyze these factors. We then propose a comprehensive guide for inferring models from biological data, addressing these challenges step by step. Our approach is validated using glycolytic oscillation data from yeast.
Affiliation(s)
- Bartosz Prokop
- Laboratory of Dynamics in Biological Systems, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, 3000 Leuven, Belgium
- Lendert Gelens
- Laboratory of Dynamics in Biological Systems, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, 3000 Leuven, Belgium
14
|
Feng J, Province M, Li G, Payne PR, Chen Y, Li F. PathFinder: a novel graph transformer model to infer multi-cell intra- and inter-cellular signaling pathways and communications. bioRxiv 2024:2024.01.13.575534. [PMID: 38293243 PMCID: PMC10827077 DOI: 10.1101/2024.01.13.575534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Recently, large-scale scRNA-seq datasets have been generated to understand the complex and poorly understood signaling mechanisms within the microenvironment of Alzheimer's Disease (AD), which are critical for identifying novel therapeutic targets and precision medicine. Although a set of targets has been identified, it remains challenging to infer the core intra- and inter-multi-cell signaling communication networks from scRNA-seq data, given the complex and highly interactive background signaling network. Herein, we introduce a novel graph transformer model, PathFinder, to infer multi-cell intra- and inter-cellular signaling pathways and signaling communications among multiple cell types. Compared with existing models, the unique design of PathFinder is based on a divide-and-conquer strategy, which divides the complex signaling networks into signaling paths and then scores and ranks them using a novel graph transformer architecture to infer the intra- and inter-cell signaling communications. We evaluated PathFinder using scRNA-seq data from APOE4-genotype-specific AD mouse models and identified novel APOE4-altered intra- and inter-cell interaction networks among neurons, astrocytes, and microglia. PathFinder is a general signaling network inference model and can be applied to signaling network inference driven by other omics data.
Affiliation(s)
- Jiarui Feng
- Institute for Informatics (I2), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Department of Computer Science and Engineering, Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Michael Province
- Division of Statistical Genomics, Department of Genetics, Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Guangfu Li
- Department of Surgery, University of Missouri-Columbia, Columbia, MO, 65212, USA
- Department of Molecular Microbiology and Immunology, University of Missouri-Columbia, Columbia, MO, 65212, USA
- NextGen Precision Health Institute, University of Missouri-Columbia, Columbia, MO, 65212, USA
- Philip R.O. Payne
- Institute for Informatics (I2), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Yixin Chen
- Department of Computer Science and Engineering, Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Fuhai Li
- Institute for Informatics (I2), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Department of Pediatrics, Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
15
|
Aguirre J, Guantes R. Virus-host protein co-expression networks reveal temporal organization and strategies of viral infection. iScience 2023; 26:108475. [PMID: 38077135 PMCID: PMC10698274 DOI: 10.1016/j.isci.2023.108475] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/10/2023] [Revised: 09/27/2023] [Accepted: 11/14/2023] [Indexed: 04/14/2025] Open
Abstract
Viral replication is a complex dynamical process involving the global remodeling of the host cellular machinery across several stages. In this study, we provide a unified view of the virus-host interaction at the proteome level by reconstructing protein co-expression networks from quantitative temporal data of four large DNA viruses. We take advantage of a formal framework, the theory of competing networks, to describe the viral infection as a dynamical system taking place on a network of networks, where perturbations induced by viral proteins spread to hijack the host proteome for the virus's benefit. Our methodology demonstrates how the viral replication cycle can be effectively examined as a complex interaction between protein networks, providing useful insights into the temporal organization and strategies of virus and host, the key protein nodes targeted by the virus, and dynamical bottlenecks during the course of the infection.
Affiliation(s)
- Jacobo Aguirre
- Centro de Astrobiología (CAB), CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
- Raúl Guantes
- Department of Condensed Matter Physics and Material Science Institute ‘Nicolás Cabrera’, Science Faculty, Universidad Autónoma de Madrid, 28049 Cantoblanco, Madrid, Spain
- Condensed Matter Physics Center (IFIMAC), Science Faculty, Universidad Autónoma de Madrid, 28049 Cantoblanco, Madrid, Spain
16
|
Lecca P, Lecca M. Graph embedding and geometric deep learning relevance to network biology and structural chemistry. Front Artif Intell 2023; 6:1256352. [PMID: 38035201 PMCID: PMC10687447 DOI: 10.3389/frai.2023.1256352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 10/16/2023] [Indexed: 12/02/2023] Open
Abstract
Graphs have been used as a model of complex relationships among data in biological science since the advent of systems biology in the early 2000s. In particular, graph data analysis and graph data mining play an important role in biological interaction networks, where recent artificial intelligence techniques, usually employed in other types of networks (e.g., social, citation, and trademark networks), aim to implement various data mining tasks including classification, clustering, recommendation, anomaly detection, and link prediction. The commitment and efforts of artificial intelligence research in network biology are motivated by the fact that machine learning techniques are often prohibitively computationally demanding, poorly parallelizable, and ultimately inapplicable, since a biological network of realistic size is a large system characterised by a high density of interactions and often by non-linear dynamics and a non-Euclidean latent geometry. Currently, graph embedding is emerging as the new learning paradigm that shifts the tasks of building complex models for classification, clustering, and link prediction to learning an informative representation of the graph data in a vector space, so that many graph mining and learning tasks can be more easily performed by employing efficient non-iterative traditional models (e.g., a linear support vector machine for the classification task). The great potential of graph embedding is the main reason for the flourishing of studies in this area and, in particular, of artificial intelligence learning techniques. In this mini review, we give a comprehensive summary of the main graph embedding algorithms in light of the recent burgeoning interest in geometric deep learning.
Affiliation(s)
- Paola Lecca
- Faculty of Engineering, Free University of Bozen-Bolzano, Bolzano, Italy
- Michela Lecca
- Fondazione Bruno Kessler, Digital Industry Center, Technologies of Vision, Trento, Italy
17
|
Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. Cell Syst 2023; 14:822-843.e22. [PMID: 37751736 PMCID: PMC10725240 DOI: 10.1016/j.cels.2023.08.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/29/2023] [Revised: 08/16/2023] [Accepted: 08/25/2023] [Indexed: 09/28/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- John J Vastola
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
18
|
Hasman M, Mayr M, Theofilatos K. Uncovering Protein Networks in Cardiovascular Proteomics. Mol Cell Proteomics 2023; 22:100607. [PMID: 37356494 PMCID: PMC10460687 DOI: 10.1016/j.mcpro.2023.100607] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/14/2022] [Revised: 05/01/2023] [Accepted: 06/20/2023] [Indexed: 06/27/2023] Open
Abstract
Biological networks have been widely used in many different diseases to identify potential biomarkers and design drug targets. In the present review, we describe the main computational techniques for reconstructing and analyzing different types of protein networks and summarize the previous applications of such techniques in cardiovascular diseases. Existing tools are critically compared, with a discussion of when each method is preferred, such as the use of co-expression networks for functional annotation of protein clusters and the use of directed networks for inferring regulatory associations. We also present examples of reconstructing protein networks of different types (regulatory, co-expression, and protein-protein interaction networks). We demonstrate the necessity of reconstructing networks separately for each cardiovascular tissue type and disease entity and provide illustrative examples of the importance of taking relevant post-translational modifications into consideration. Finally, we demonstrate and discuss how the findings of protein networks could be interpreted using single-cell RNA-sequencing data.
Affiliation(s)
- Maria Hasman
- King's British Heart Foundation Centre, King's College London, London, United Kingdom
- Manuel Mayr
- King's British Heart Foundation Centre, King's College London, London, United Kingdom
19
|
Marku M, Pancaldi V. From time-series transcriptomics to gene regulatory networks: A review on inference methods. PLoS Comput Biol 2023; 19:e1011254. [PMID: 37561790 PMCID: PMC10414591 DOI: 10.1371/journal.pcbi.1011254] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 08/12/2023] Open
Abstract
Inference of gene regulatory networks has been an active area of research for around 20 years, leading to the development of sophisticated inference algorithms based on a variety of assumptions and approaches. With the ever increasing demand for more accurate and powerful models, the inference problem remains of broad scientific interest. The abstract representation of biological systems through gene regulatory networks represents a powerful method to study such systems, encoding different amounts and types of information. In this review, we summarize the different types of inference algorithms specifically based on time-series transcriptomics, giving an overview of the main applications of gene regulatory networks in computational biology. This review is intended to give an updated reference of regulatory networks inference tools to biologists and researchers new to the topic and guide them in selecting the appropriate inference method that best fits their questions, aims, and experimental data.
Affiliation(s)
- Malvina Marku
- CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Vera Pancaldi
- CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Barcelona Supercomputing Center, Barcelona, Spain
20
|
Park SH, Ha S, Kim JK. A general model-based causal inference method overcomes the curse of synchrony and indirect effect. Nat Commun 2023; 14:4287. [PMID: 37488136 PMCID: PMC10366229 DOI: 10.1038/s41467-023-39983-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/01/2022] [Accepted: 06/22/2023] [Indexed: 07/26/2023] Open
Abstract
To identify causation, model-free inference methods, such as Granger Causality, have been widely used due to their flexibility. However, they have difficulty distinguishing synchrony and indirect effects from direct causation, leading to false predictions. To overcome this, model-based inference methods that test the reproducibility of data with a specific mechanistic model to infer causality were developed. However, they can only be applied to systems described by a specific model, greatly limiting their applicability. Here, we address this limitation by deriving an easily testable condition for a general monotonic ODE model to reproduce time-series data. We built a user-friendly computational package, General ODE-Based Inference (GOBI), which is applicable to nearly any monotonic system with positive and negative regulations described by ODE. GOBI successfully inferred positive and negative regulations in various networks at both the molecular and population levels, unlike existing model-free methods. Thus, this accurate and broadly applicable inference method is a powerful tool for understanding complex dynamical systems.
Affiliation(s)
- Se Ho Park
- Department of Mathematics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Biomedical Mathematics Group, Institute for Basic Science, Daejeon, 34126, Republic of Korea
- Seokmin Ha
- Biomedical Mathematics Group, Institute for Basic Science, Daejeon, 34126, Republic of Korea
- Department of Mathematical Sciences, KAIST, Daejeon, 34141, Republic of Korea
- Jae Kyoung Kim
- Biomedical Mathematics Group, Institute for Basic Science, Daejeon, 34126, Republic of Korea
- Department of Mathematical Sciences, KAIST, Daejeon, 34141, Republic of Korea
21
|
Schiffthaler B, van Zalen E, Serrano AR, Street NR, Delhomme N. Seiðr: Efficient calculation of robust ensemble gene networks. Heliyon 2023; 9:e16811. [PMID: 37313140 PMCID: PMC10258422 DOI: 10.1016/j.heliyon.2023.e16811] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/13/2022] [Revised: 05/22/2023] [Accepted: 05/29/2023] [Indexed: 06/15/2023] Open
Abstract
Gene regulatory and gene co-expression networks are powerful research tools for identifying biological signal within high-dimensional gene expression data. In recent years, research has focused on addressing shortcomings of these techniques with regard to the low signal-to-noise ratio, non-linear interactions and dataset-dependent biases of published methods. Furthermore, it has been shown that aggregating networks from multiple methods provides improved results. Despite this, few usable and scalable software tools have been implemented to perform such best-practice analyses. Here, we present Seidr (stylized Seiðr), a software toolkit designed to assist scientists in gene regulatory and gene co-expression network inference. Seidr creates community networks to reduce algorithmic bias and utilizes noise-corrected network backboning to prune noisy edges in the networks. Using benchmarks in real-world conditions across three eukaryotic model organisms, Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana, we show that individual algorithms are biased toward functional evidence for certain gene-gene interactions. We further demonstrate that the community network is less biased, providing robust performance across different standards and comparisons for the model organisms. Finally, we apply Seidr to a network of drought stress in Norway spruce (Picea abies (L.) H. Karst) as an example application in a non-model species. We demonstrate the use of a network inferred using Seidr for identifying key components and communities and for suggesting gene function for non-annotated genes.
Affiliation(s)
- Bastian Schiffthaler
- Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
- Elena van Zalen
- Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
- Alonso R. Serrano
- Department of Plant Physiology, Umea Plant Science Center, Swedish University of Agricultural Sciences, Umea, Sweden
- Nathaniel R. Street
- Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
- Nicolas Delhomme
- Department of Plant Physiology, Umea Plant Science Center, Swedish University of Agricultural Sciences, Umea, Sweden
22
|
Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. bioRxiv 2023:2023.05.17.541250. [PMID: 37292934 PMCID: PMC10245677 DOI: 10.1101/2023.05.17.541250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125
- John J. Vastola
- Department of Neurobiology, Harvard Medical School, Boston, MA, 02115
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125
23
|
Saint-Antoine M, Singh A. Benchmarking Gene Regulatory Network Inference Methods on Simulated and Experimental Data. bioRxiv 2023:2023.05.12.540581. [PMID: 37215029 PMCID: PMC10197678 DOI: 10.1101/2023.05.12.540581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Although the challenge of gene regulatory network inference has been studied for more than a decade, it is still unclear how well network inference methods work when applied to real data. Attempts to benchmark these methods on experimental data have yielded mixed results, in which sometimes even the best methods fail to outperform random guessing, and in other cases they perform reasonably well. So, one of the most valuable contributions one can currently make to the field of network inference is to benchmark methods on experimental data for which the true underlying network is already known, and report the results so that we can get a clearer picture of their efficacy. In this paper, we report results from the first, to our knowledge, benchmarking of network inference methods on single cell E. coli transcriptomic data. We report a moderate level of accuracy for the methods, better than random chance but still far from perfect. We also find that some methods that were quite strong and accurate on microarray and bulk RNA-seq data did not perform as well on the single cell data. Additionally, we benchmark a simple network inference method (Pearson correlation), on data generated through computer simulations in order to draw conclusions about general best practices in network inference studies. We predict that network inference would be more accurate using proteomic data rather than transcriptomic data, which could become relevant if high-throughput proteomic experimental methods are developed in the future. We also show through simulations that using a simplified model of gene expression that skips the mRNA step tends to substantially overestimate the accuracy of network inference methods, and advise against using this model for future in silico benchmarking studies.
Affiliation(s)
- Michael Saint-Antoine
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE USA 19716
- Abhyudai Singh
- Department of Electrical and Computer Engineering, Biomedical Engineering, Mathematical Sciences, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE USA 19716
24
|
Shen B, Coruzzi G, Shasha D. EnsInfer: a simple ensemble approach to network inference outperforms any single method. BMC Bioinformatics 2023; 24:114. [PMID: 36964499 PMCID: PMC10037858 DOI: 10.1186/s12859-023-05231-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/12/2022] [Accepted: 03/15/2023] [Indexed: 03/26/2023] Open
Abstract
This study evaluates both a variety of existing base causal inference methods and a variety of ensemble methods. We show that: (i) base network inference methods vary in their performance across different datasets, so a method that works poorly on one dataset may work well on another; (ii) a non-homogeneous ensemble method in the form of a Naive Bayes classifier leads overall to as good or better results than using the best single base method or any other ensemble method; (iii) for the best results, the ensemble method should integrate all methods that satisfy a statistical test of normality on training data. The resulting ensemble model EnsInfer easily integrates all kinds of RNA-seq data as well as new and existing inference methods. The paper categorizes and reviews state-of-the-art underlying methods, describes the EnsInfer ensemble approach in detail, and presents experimental results. The source code and data used will be made available to the community upon publication.
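The ensemble idea in this abstract can be sketched as follows: each candidate edge is described by the confidence scores that several base inference methods assign to it, and a Naive Bayes classifier trained on edges of known status predicts whether a new edge exists. The Gaussian likelihood, the function names, and the toy scores below are assumptions made for illustration; this is not the EnsInfer implementation.

```python
import math


def fit_gaussian_nb(features, labels):
    """Fit class priors and per-feature Gaussian parameters for each class."""
    model = {}
    for cls in set(labels):
        rows = [f for f, y in zip(features, labels) if y == cls]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[cls] = (n / len(labels), means, variances)
    return model


def log_posterior(model, row, cls):
    """Unnormalised log posterior of `cls` given one feature row."""
    prior, means, variances = model[cls]
    logp = math.log(prior)
    for v, m, s2 in zip(row, means, variances):
        logp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
    return logp


def predict_edge(model, row):
    """Return the class (1 = edge, 0 = no edge) with the highest posterior."""
    return max(model, key=lambda cls: log_posterior(model, row, cls))


# Toy training set (made up): one row per candidate edge, one column per
# hypothetical base method; label 1 marks a known true edge.
train_scores = [
    [0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.7, 0.9],
    [0.2, 0.1, 0.3], [0.1, 0.3, 0.2], [0.3, 0.2, 0.1],
]
train_labels = [1, 1, 1, 0, 0, 0]
model = fit_gaussian_nb(train_scores, train_labels)
print(predict_edge(model, [0.85, 0.75, 0.80]))
print(predict_edge(model, [0.15, 0.20, 0.25]))
```

Naive Bayes treats the base-method scores as conditionally independent given the edge label, which keeps the ensemble cheap to train even when many base methods are combined.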
Affiliation(s)
- Bingran Shen
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 251 Mercer St, New York, 10012 USA
- Gloria Coruzzi
- Department of Biology, Center for Genomics and Systems Biology, New York University, 12 Waverly Pl, New York, 10003 USA
- Dennis Shasha
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 251 Mercer St, New York, 10012 USA
25
Olivença DV, Davis JD, Voit EO. Inference of dynamic interaction networks: A comparison between Lotka-Volterra and multivariate autoregressive models. Frontiers in Bioinformatics 2022; 2:1021838. [PMID: 36619477 PMCID: PMC9815445 DOI: 10.3389/fbinf.2022.1021838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 12/09/2022] [Indexed: 12/24/2022]
Abstract
Networks are ubiquitous throughout biology, spanning the entire range from molecules to food webs and global environmental systems. Yet, despite substantial efforts by the scientific community, the inference of these networks from data still presents a problem that is unsolved in general. One frequent strategy of addressing the structure of networks is the assumption that the interactions among molecular or organismal populations are static and correlative. While often successful, these static methods are no panacea. They usually ignore the asymmetry of relationships between two species and inferences become more challenging if the network nodes represent dynamically changing quantities. Overcoming these challenges, two very different network inference approaches have been proposed in the literature: Lotka-Volterra (LV) models and Multivariate Autoregressive (MAR) models. These models are computational frameworks with different mathematical structures which, nevertheless, have both been proposed for the same purpose of inferring the interactions within coexisting population networks from observed time-series data. Here, we assess these dynamic network inference methods for the first time in a side-by-side comparison, using both synthetically generated and ecological datasets. Multivariate Autoregressive and Lotka-Volterra models are mathematically equivalent at the steady state, but the results of our comparison suggest that Lotka-Volterra models are generally superior in capturing the dynamics of networks with non-linear dynamics, whereas Multivariate Autoregressive models are better suited for analyses of networks of populations with process noise and close-to linear behavior. To the best of our knowledge, this is the first study comparing LV and MAR approaches. Both frameworks are valuable tools that address slightly different aspects of dynamic networks.
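To make the two model families concrete, the sketch below writes out the textbook forms: a generalized Lotka-Volterra system, dx_i/dt = x_i (a_i + sum_j b_ij x_j), integrated with a simple Euler step, and a first-order multivariate autoregressive update, y_{t+1} = c + B y_t + e_t. The parameter values, function names, and the choice of Euler integration are assumptions for illustration and are not taken from the paper; whether y holds raw or transformed abundances is left open here.

```python
def lv_rhs(x, growth, interactions):
    """Right-hand side of dx_i/dt = x_i * (a_i + sum_j b_ij * x_j)."""
    return [
        xi * (ai + sum(bij * xj for bij, xj in zip(row, x)))
        for xi, ai, row in zip(x, growth, interactions)
    ]


def lv_euler_step(x, growth, interactions, dt):
    """Advance the Lotka-Volterra state by one explicit Euler step."""
    return [xi + dt * dxi for xi, dxi in zip(x, lv_rhs(x, growth, interactions))]


def mar_step(y, intercept, coeffs, noise=None):
    """One MAR(1) update: y_{t+1} = c + B y_t + e_t (noise defaults to zero)."""
    noise = noise or [0.0] * len(y)
    return [
        ci + sum(bij * yj for bij, yj in zip(row, y)) + ei
        for ci, row, ei in zip(intercept, coeffs, noise)
    ]


# Hypothetical two-species competitive system.
growth = [1.0, 1.0]
interactions = [[-1.0, -0.5], [-0.5, -1.0]]

state = [0.1, 0.5]
for _ in range(5000):  # integrate to t = 50 with dt = 0.01
    state = lv_euler_step(state, growth, interactions, dt=0.01)
final_state = state
print(final_state)

# One noise-free MAR(1) update with made-up coefficients.
next_y = mar_step([1.0, 2.0], [0.5, 0.0], [[0.9, 0.0], [0.1, 0.8]])
print(next_y)
```

In the Lotka-Volterra form the interaction terms multiply the current abundance, so the model is non-linear, whereas the MAR update is linear in y; this is the structural difference the comparison in the abstract turns on.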
Affiliation(s)
- Daniel V. Olivença
- The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, United States
26
Galindez G, Sadegh S, Baumbach J, Kacprowski T, List M. Network-based approaches for modeling disease regulation and progression. Comput Struct Biotechnol J 2022; 21:780-795. [PMID: 36698974 PMCID: PMC9841310 DOI: 10.1016/j.csbj.2022.12.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/15/2022] [Revised: 12/14/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022]
Abstract
Molecular interaction networks lay the foundation for studying how biological functions are controlled by the complex interplay of genes and proteins. Investigating perturbed processes using biological networks has been instrumental in uncovering mechanisms that underlie complex disease phenotypes. Rapid advances in omics technologies have prompted the generation of high-throughput datasets, enabling large-scale, network-based analyses. Consequently, various modeling techniques, including network enrichment, differential network extraction, and network inference, have proven to be useful for gaining new mechanistic insights. We provide an overview of recent network-based methods and their core ideas to facilitate the discovery of disease modules or candidate mechanisms. Knowledge generated from these computational efforts will benefit biomedical research, especially drug development and precision medicine. We further discuss current challenges and provide perspectives in the field, highlighting the need for more integrative and dynamic network approaches to model disease development and progression.
Affiliation(s)
- Gihanna Galindez
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Sepideh Sadegh
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
27
Gerstner N, Krontira AC, Cruceanu C, Roeh S, Pütz B, Sauer S, Rex-Haffner M, Schmidt MV, Binder EB, Knauer-Arloth J. DiffBrainNet: Differential analyses add new insights into the response to glucocorticoids at the level of genes, networks and brain regions. Neurobiol Stress 2022; 21:100496. [PMID: 36532379 PMCID: PMC9755029 DOI: 10.1016/j.ynstr.2022.100496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 09/25/2022] [Accepted: 10/13/2022] [Indexed: 10/31/2022]
Abstract
Genome-wide gene expression analyses are invaluable tools for studying biological and disease processes, allowing a hypothesis-free comparison of expression profiles. Traditionally, transcriptomic analysis has focused on gene-level effects found by differential expression. In recent years, network analysis has emerged as an important additional level of investigation, providing information on molecular connectivity, especially for diseases associated with a large number of linked effects of smaller magnitude, like neuropsychiatric disorders. Here, we describe how combined differential expression and prior-knowledge-based differential network analysis can be used to explore complex datasets. As an example, we analyze the transcriptional responses following administration of the glucocorticoid/stress receptor agonist dexamethasone in 8 mouse brain regions important for stress processing. By applying a combination of differential network- and expression-analyses, we find that these explain distinct but complementary biological mechanisms of the glucocorticoid responses. Additionally, network analysis identifies new differentially connected partners of risk genes and can be used to generate hypotheses on molecular pathways affected. With DiffBrainNet (http://diffbrainnet.psych.mpg.de), we provide an analysis framework and a publicly available resource for the study of the transcriptional landscape of the mouse brain which can identify molecular pathways important for basic functioning and response to glucocorticoids in a brain-region specific manner.
Affiliation(s)
- Nathalie Gerstner
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- International Max Planck Research School for Translational Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764, Neuherberg, Germany
- Anthi C. Krontira
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- International Max Planck Research School for Translational Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Cristiana Cruceanu
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
- Simone Roeh
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Benno Pütz
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Susann Sauer
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Monika Rex-Haffner
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Mathias V. Schmidt
- Research Group Neurobiology of Stress Resilience, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Elisabeth B. Binder
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Janine Knauer-Arloth
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804, Munich, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764, Neuherberg, Germany
28
Pomiès L, Brouard C, Duruflé H, Maigné É, Carré C, Gody L, Trösser F, Katsirelos G, Mangin B, Langlade NB, de Givry S. Gene regulatory network inference methodology for genomic and transcriptomic data acquired in genetically related heterozygote individuals. Bioinformatics 2022; 38:4127-4134. [PMID: 35792837 DOI: 10.1093/bioinformatics/btac445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Revised: 06/17/2022] [Accepted: 07/05/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION Inferring gene regulatory networks in non-independent genetically related panels is a methodological challenge. This hampers evolutionary and biological studies using heterozygote individuals such as in wild sunflower populations or cultivated hybrids. RESULTS First, we simulated 100 datasets of gene expressions and polymorphisms, displaying the same gene expression distributions, heterozygosities and heritabilities as in our dataset including 173 genes and 353 genotypes measured in sunflower hybrids. Secondly, we performed a meta-analysis based on six inference methods [least absolute shrinkage and selection operator (Lasso), Random Forests, Bayesian Networks, Markov Random Fields, Ordinary Least Square and fast inference of networks from directed regulation (Findr)] and selected the minimal density networks for better accuracy with 64 edges connecting 79 genes and 0.35 area under precision and recall (AUPR) score on average. We identified that triangles and mutual edges are prone to errors in the inferred networks. Applied on classical datasets without heterozygotes, our strategy produced a 0.65 AUPR score for one dataset of the DREAM5 Systems Genetics Challenge. Finally, we applied our method to an experimental dataset from sunflower hybrids. We successfully inferred a network composed of 105 genes connected by 106 putative regulations with a major connected component. AVAILABILITY AND IMPLEMENTATION Our inference methodology dedicated to genomic and transcriptomic data is available at https://forgemia.inra.fr/sunrise/inference_methods. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
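The AUPR scores quoted in this abstract summarize how well a ranked list of predicted edges recovers the true edges. The sketch below uses average precision, one common estimate of the area under the precision-recall curve; the paper's exact computation is not stated in the abstract, and the edge names and scores are made up for illustration.

```python
def average_precision(scored_edges, true_edges):
    """Average precision of a ranked edge list against a set of true edges.

    `scored_edges` is a list of (edge, score) pairs; higher scores rank first.
    True edges that never appear in the list count as missed.
    """
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)
    hits = 0
    total = 0.0
    for rank, (edge, _score) in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / rank  # precision at this recall level
    return total / len(true_edges) if true_edges else 0.0


# Toy example (hypothetical edges and confidence scores).
predictions = [("A->B", 0.9), ("A->C", 0.8), ("B->C", 0.4), ("C->A", 0.1)]
truth = {"A->B", "B->C"}
ap = average_precision(predictions, truth)
print(f"average precision = {ap:.4f}")
```

A ranking that places every true edge ahead of every false one scores 1.0, while a random ranking scores roughly the fraction of candidate edges that are true, which is why sparse networks yield low baseline values.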
Affiliation(s)
- Lise Pomiès
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Céline Brouard
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Harold Duruflé
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Élise Maigné
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Clément Carré
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Louise Gody
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Fulya Trösser
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- George Katsirelos
- MIA-Paris, AgroParisTech, Université Paris-Saclay, INRAE, Paris 75231, France
- Brigitte Mangin
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Nicolas B Langlade
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Simon de Givry
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
29
Pušnik Ž, Mraz M, Zimic N, Moškon M. Review and assessment of Boolean approaches for inference of gene regulatory networks. Heliyon 2022; 8:e10222. [PMID: 36033302 PMCID: PMC9403406 DOI: 10.1016/j.heliyon.2022.e10222] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/09/2022] [Revised: 04/22/2022] [Accepted: 08/03/2022] [Indexed: 10/25/2022]
Abstract
Boolean descriptions of gene regulatory networks can provide an insight into interactions between genes. Boolean networks hold predictive power, are easy to understand, and can be used to simulate the observed networks in different scenarios. We review fundamental and state-of-the-art methods for inference of Boolean networks. We introduce a methodology for a straightforward evaluation of Boolean inference approaches based on the generation of evaluation datasets, application of selected inference methods, and evaluation of performance measures to guide the selection of the best method for a given inference problem. We demonstrate this procedure on inference methods REVEAL (REVerse Engineering ALgorithm), Best-Fit Extension, MIBNI (Mutual Information-based Boolean Network Inference), GABNI (Genetic Algorithm-based Boolean Network Inference) and ATEN (AND/OR Tree ENsemble algorithm), which infers Boolean descriptions of gene regulatory networks from discretised time series data. Boolean inference approaches tend to perform better in terms of dynamic accuracy, and slightly worse in terms of structural correctness. We believe that the proposed methodology and provided guidelines will help researchers to develop Boolean inference approaches with a good predictive capability while maintaining structural correctness and biological relevance.
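For readers unfamiliar with Boolean network models, the sketch below simulates a small network under synchronous updating, the kind of discretised time series the inference methods in this abstract take as input. The three-gene rule set and the function names are hypothetical and are not drawn from the reviewed methods.

```python
def step(state, rules):
    """Apply every rule to the current state at once (synchronous update)."""
    return {gene: rule(state) for gene, rule in rules.items()}


def trajectory(state, rules, n_steps):
    """Return the list of states visited, starting with the initial state."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical three-gene network.
rules = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A AND B are both required for C
}
start = {"A": True, "B": False, "C": False}
path = trajectory(start, rules, 4)
for t, s in enumerate(path):
    print(t, "".join("1" if s[g] else "0" for g in "ABC"))
```

An inference method would receive only the printed state sequence and try to recover the rules; comparing a recovered model's simulated trajectory with the observed one assesses dynamic accuracy, while comparing its wiring with the true regulators assesses structural correctness.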
Affiliation(s)
- Žiga Pušnik
- University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, Ljubljana, SI-1000, Slovenia
- Miha Mraz
- University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, Ljubljana, SI-1000, Slovenia
- Nikolaj Zimic
- University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, Ljubljana, SI-1000, Slovenia
- Miha Moškon
- University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, Ljubljana, SI-1000, Slovenia
30
Ye Z. Identification of T cell-related biomarkers for breast cancer based on weighted gene co-expression network analysis. J Chemother 2022:1-9. [PMID: 35822502 DOI: 10.1080/1120009x.2022.2097431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
Breast cancer is the most frequent malignancy worldwide, with immunotherapy and targeted therapy being key strategies for improving the prognosis. We downloaded an mRNA expression dataset of breast cancer from The Cancer Genome Atlas (TCGA) database, and divided the preprocessed genes into 12 modules based on gene expression profile by weighted gene co-expression network analysis (WGCNA). The StromalScore, ImmuneScore and ESTIMATEScore of the samples were assessed. The Kaplan-Meier curve showed that ImmuneScore was notably correlated with the prognosis of breast cancer patients. By analyzing the connectivity between module eigengenes and clinical traits, the gene module closely related to ImmuneScore was obtained. Further, through intramodular gene connectivity and protein-protein interaction network topology analysis of module genes, hub genes (HLA-E, HLA-DPB1 and HLA-DRB1) in the immune-related module were screened out. Finally, bioinformatics analysis showed that HLA-DPB1 and HLA-DRB1 were notably overexpressed and HLA-E was underexpressed in breast cancer tissues. TIMER database analysis showed that the levels of the three hub genes were significantly correlated with infiltration levels of CD8+ T cells and CD4+ T cells. Meanwhile, Pearson correlation analysis revealed a positive correlation between the expression levels of the three hub genes and those of immune checkpoint genes (LAG3, PD-1, PD-L1). Additionally, prognosis could be effectively evaluated by HLA-DPB1 and HLA-DRB1 levels, and differentially activated signalling pathways between high- and low-expression groups of HLA-E and HLA-DPB1 were obtained by gene set enrichment analysis. To conclude, this study identified three T cell-related biomarkers for breast cancer based on the TCGA-BRCA dataset, and the screened genes could provide references for breast cancer immunotherapy.
Affiliation(s)
- Zhenkai Ye
- Department of Radiotherapy, Minzu Hospital of Guangxi Zhuang Autonomous Region, Affiliated Minzu Hospital of Guangxi Medical University, Nanning, Guangxi, China
31
Foo M, Dony L, He F. Data-driven dynamical modelling of a pathogen-infected plant gene regulatory network: A comparative analysis. Biosystems 2022; 219:104732. [PMID: 35781035 DOI: 10.1016/j.biosystems.2022.104732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 05/30/2022] [Accepted: 06/22/2022] [Indexed: 11/02/2022]
Abstract
Recent advances in synthetic biology have enabled the design of genetic feedback control circuits that could be implemented to build resilient plants against pathogen attacks. To facilitate the proper design of these genetic feedback control circuits, an accurate model that is able to capture the vital dynamical behaviour of the pathogen-infected plant is required. In this study, using a data-driven modelling approach, we develop and compare four dynamical models (i.e. linear, Michaelis-Menten with Hill coefficient (Hill Function), standard S-System and extended S-System) of a pathogen-infected plant gene regulatory network (GRN). These models are then assessed across several criteria, i.e. ease of identifying the type of gene regulation, predictive capability, Akaike Information Criterion (AIC) and robustness to parameter uncertainty, to determine how well each balances biological complexity and accuracy when modelling the pathogen-infected plant GRN. Using our defined ranking score, we obtain the following insights into the modelling of GRNs. Our analyses show that, despite being commonly used and biologically relevant, the Hill Function model ranks the lowest while the extended S-System model ranks highest in the overall comparison. Interestingly, the performance of the linear model is more consistent throughout the comparison, making it the preferred model for this pathogen-infected plant GRN when a data-driven modelling approach is considered.
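Two of the ingredients named in this abstract have compact textbook forms, sketched below: the Hill function used to model saturating gene regulation, and the Akaike Information Criterion for a least-squares fit, AIC = n ln(RSS/n) + 2k (additive constants dropped, Gaussian residuals assumed). The function names and numbers are illustrative assumptions; the paper's exact parametrisation is not given in the abstract.

```python
import math


def hill_activation(x, k_half, n):
    """Activating Hill function: x^n / (K^n + x^n)."""
    return x ** n / (k_half ** n + x ** n)


def hill_repression(x, k_half, n):
    """Repressing Hill function: K^n / (K^n + x^n)."""
    return k_half ** n / (k_half ** n + x ** n)


def aic_least_squares(rss, n_points, n_params):
    """AIC for a least-squares fit, up to an additive constant."""
    return n_points * math.log(rss / n_points) + 2 * n_params


# At x = K the activating response is half-maximal.
print(hill_activation(2.0, 2.0, 3))

# Two hypothetical fits with equal residual error: the model with more
# parameters receives the higher (worse) AIC.
aic_small = aic_least_squares(rss=2.0, n_points=10, n_params=3)
aic_large = aic_least_squares(rss=2.0, n_points=10, n_params=6)
print(aic_small, aic_large)
```

Because AIC penalises every extra parameter, a richer model is preferred only when it reduces the residual error enough to offset the penalty, which is the complexity versus accuracy trade-off the comparison addresses.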
Affiliation(s)
- Mathias Foo
- School of Engineering, University of Warwick, CV4 7AL, Coventry, UK.
- Leander Dony
- Institute of Computational Biology, Helmholtz Munich, 85764, Neuherberg, Germany; Department of Translational Psychiatry, Max Planck Institute of Psychiatry, International Max Planck Research School for Translational Psychiatry (IMPRS-TP), 80804, Munich, Germany; TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany.
- Fei He
- Centre for Computational Science and Mathematical Modelling, Coventry University, CV1 2JH, Coventry, UK.
32
Stefan T, Wu XN, Zhang Y, Fernie A, Schulze WX. Regulatory Modules of Metabolites and Protein Phosphorylation in Arabidopsis Genotypes With Altered Sucrose Allocation. Frontiers in Plant Science 2022; 13:891405. [PMID: 35665154 PMCID: PMC9161306 DOI: 10.3389/fpls.2022.891405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Accepted: 04/11/2022] [Indexed: 06/15/2023]
Abstract
Multi-omics data sets are increasingly being used for the interpretation of cellular processes in response to environmental cues. Especially, the posttranslational modification of proteins by phosphorylation is an important regulatory process affecting protein activity and/or localization, which, in turn, can have effects on metabolic processes and metabolite levels. Despite this importance, relationships between protein phosphorylation status and metabolite abundance remain largely underexplored. Here, we used a phosphoproteomics-metabolomics data set collected at the end of day and night in shoots and roots of Arabidopsis to propose regulatory relationships between protein phosphorylation and accumulation or allocation of metabolites. For this purpose, we introduced a novel, robust co-expression measure suited to the structure of our data sets, and we used this measure to construct metabolite-phosphopeptide networks. These networks were compared between wild type and plants with perturbations in key processes of sugar metabolism, namely, sucrose export (sweet11/12 mutant) and starch synthesis (pgm mutant). The phosphopeptide-metabolite network turned out to be highly sensitive to perturbations in sugar metabolism. Specifically, KING1, the regulatory subunit of SnRK1, was identified as a primary candidate connecting protein phosphorylation status with metabolism. We additionally identified strong changes in the fatty acid network of the sweet11/12 mutant, potentially resulting from a combination of fatty acid signaling and metabolic overflow reactions in response to high internal sucrose concentrations. Our results further suggest novel protein-metabolite relationships as candidates for future targeted research.
Affiliation(s)
- Thorsten Stefan
- Department of Plant Systems Biology, University of Hohenheim, Stuttgart, Germany
- Xu Na Wu
- College for Life Science, Yunnan University, Kunming, China
- Youjun Zhang
- Department of Central Metabolism, Max-Planck-Institute of Molecular Plant Physiology, Potsdam, Germany
- Center of Plant System Biology and Biotechnology, Plovdiv, Bulgaria
- Alisdair Fernie
- Department of Central Metabolism, Max-Planck-Institute of Molecular Plant Physiology, Potsdam, Germany
- Center of Plant System Biology and Biotechnology, Plovdiv, Bulgaria
- Waltraud X. Schulze
- Department of Plant Systems Biology, University of Hohenheim, Stuttgart, Germany
33
Inference of Molecular Regulatory Systems Using Statistical Path-Consistency Algorithm. Entropy 2022; 24:e24050693. [PMID: 35626576 PMCID: PMC9142129 DOI: 10.3390/e24050693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 05/12/2022] [Accepted: 05/12/2022] [Indexed: 11/16/2022]
Abstract
One of the key challenges in systems biology and molecular sciences is how to infer regulatory relationships between genes and proteins using high-throughput omics datasets. Although a wide range of methods have been designed to reverse engineer the regulatory networks, recent studies show that the inferred network may depend on the variable order in the dataset. In this work, we develop a new algorithm, called the statistical path-consistency algorithm (SPCA), to solve the problem of the dependence on variable order. This method generates a number of different variable orders using random samples, and then infers a network by using the path-consistency algorithm based on each variable order. We propose measures to determine the edge weights using the corresponding edge weights in the inferred networks, and choose the edges with the largest weights as the putative regulations between genes or proteins. The developed method is rigorously assessed on six benchmark networks from the DREAM challenges, the mitogen-activated protein (MAP) kinase pathway, and a cancer-specific gene regulatory network. The inferred networks are compared with those obtained by using two up-to-date inference methods. The accuracy of the inferred networks shows that the developed method is effective for discovering molecular regulatory systems.
34
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Rupa Bhowmick
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
- Chandrakala Meena
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Ram Rup Sarkar
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
35
Deep neural network prediction of genome-wide transcriptome signatures - beyond the Black-box. NPJ Syst Biol Appl 2022; 8:9. [PMID: 35197482 PMCID: PMC8866467 DOI: 10.1038/s41540-022-00218-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/24/2021] [Accepted: 01/24/2022] [Indexed: 11/28/2022]
Abstract
Prediction algorithms for protein or gene structures, including transcription factor binding from sequence information, have been transformative in understanding gene regulation. Here we ask whether human transcriptomic profiles can be predicted solely from the expression of transcription factors (TFs). We find that the expression of 1600 TFs can explain >95% of the variance in 25,000 genes. Using the light-up technique to inspect the trained neural network (NN), we find an over-representation of known TF-gene regulations. Furthermore, the learned prediction network has a hierarchical organization. A smaller set of around 125 core TFs could explain close to 80% of the variance. Interestingly, reducing the number of TFs below 500 induces a rapid decline in prediction performance. Next, we evaluated the prediction model using transcriptional data from 22 human diseases. The TFs were sufficient to predict the dysregulation of the target genes (rho = 0.61, P < 10^-216). By inspecting the model, key causative TFs could be extracted for subsequent validation using disease-associated genetic variants. We demonstrate a methodology for constructing an interpretable neural network predictor, where analyses of the predictors identified key TFs that were inducing transcriptional changes during disease.
36
Salih SJ, Ghobadi MZ. Evaluating the cytotoxicity and pathogenicity of multi-walled carbon nanotube through weighted gene co-expression network analysis: a nanotoxicogenomics study. BMC Genom Data 2022; 23:12. [PMID: 35176998 PMCID: PMC8851761 DOI: 10.1186/s12863-022-01031-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/22/2021] [Accepted: 02/08/2022] [Indexed: 11/20/2022]
Abstract
BACKGROUND Multi-walled carbon nanotubes (MWCNTs) are among the most important carbonaceous nanoparticles and are widely used in applications such as electronics, vehicles, and therapeutics. However, their possible toxicity and adverse effects make them a major health threat for humans and animals. RESULTS In this study, we employed weighted gene co-expression network analysis (WGCNA) to identify the co-expressed gene groups and dysregulated pathways due to MWCNT exposure. For this purpose, three weighted gene co-expression networks were constructed from mouse microarray gene expression profiles at 1, 6, and 12 months post-exposure to MWCNT. The module-trait analysis identified the significant modules related to different doses (1, 10, 40, and 80 µg) of MWCNT. Afterward, the genes common to the co-regulated and differentially expressed gene sets were determined. Further pathway analysis highlighted the enrichment of genes including Actb, Ube2b, Psme3, Ezh2, Alas2, S100a10, Ypel5, Rhoa, Rac1, Ube2l6, Prdx2, Ctsb, Bnip3l, Gp6, Myh9, Ube2k, Mbnl1, Kbtbd8, Riok3, Itgb1, Rap1a, and Atp5h in immune-, inflammation-, and protein metabolism-related pathways. CONCLUSIONS This study reveals the genotoxic and cytotoxic effects of various doses of MWCNT, which also affect metabolism. The identified genes can serve as potential biomarkers and therapeutic candidates. However, further studies should be performed to validate them in human cells. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1186/s12863-022-01031-3.
Affiliation(s)
- Shameran Jamal Salih
  - Department of Chemistry, Faculty of Science and Health, Koya University, KOY45, Koya, Kurdistan Region, Iraq
|
37
|
Redhu N, Thakur Z. Network biology and applications. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00024-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
|
38
|
Metabolic and Transcriptional Changes across Osteogenic Differentiation of Mesenchymal Stromal Cells. Bioengineering (Basel) 2021; 8:bioengineering8120208. [PMID: 34940360 PMCID: PMC8698318 DOI: 10.3390/bioengineering8120208] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/11/2021] [Revised: 12/03/2021] [Accepted: 12/08/2021] [Indexed: 12/23/2022] Open
Abstract
Mesenchymal stromal cells (MSCs) are multipotent post-natal stem cells with applications in tissue engineering and regenerative medicine. MSCs can differentiate into osteoblasts, chondrocytes, or adipocytes, with functional differences in cells during osteogenesis accompanied by metabolic changes. The temporal dynamics of these metabolic shifts have not yet been fully characterized and are suspected to be important for therapeutic applications such as osteogenesis optimization. Here, our goal was to characterize the metabolic shifts that occur during osteogenesis. We profiled five key extracellular metabolites longitudinally (glucose, lactate, glutamine, glutamate, and ammonia) from MSCs from four donors to classify osteogenic differentiation into three metabolic stages, defined by changes in the uptake and secretion rates of the metabolites in cell culture media. We used a combination of untargeted metabolomic analysis, targeted analysis of 13C-glucose labelled intracellular data, and RNA-sequencing data to reconstruct a gene regulatory network and further characterize cellular metabolism. The metabolic stages identified in this proof-of-concept study provide a framework for more detailed investigations aimed at identifying biomarkers of osteogenic differentiation and small molecule interventions to optimize MSC differentiation for clinical applications.
|
39
|
St Mary C, Powell THQ, Kominoski JS, Weinert E. Rescaling Biology: Increasing Integration Across Biological Scales and Subdisciplines to Enhance Understanding and Prediction. Integr Comp Biol 2021; 61:2031-2037. [PMID: 34472603 DOI: 10.1093/icb/icab191] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022] Open
Abstract
The organization of the living world covers a vast range of spatiotemporal scales, from molecules to the biosphere and from seconds to centuries. Biologists working within specialized subdisciplines tend to focus on different ranges of scales. Therefore, developing frameworks that enable testing questions and predictions of scaling requires sufficient understanding of complex processes across biological subdisciplines and spatiotemporal scales. Frameworks that enable scaling across subdisciplines would ideally allow us to test hypotheses about the degree to which explicit integration across spatiotemporal scales is needed for predicting the outcome of biological processes. For instance, how does genomic variation within populations allow us to explain community structure? How do the dynamics of cellular metabolism translate to our understanding of whole-ecosystem metabolism? Do patterns and processes operate seamlessly across biological scales, or are there fundamental laws of biological scaling that limit our ability to make predictions from one scale to another? Similarly, can sub-organismal structures and processes be sufficiently understood in isolation from potential feedbacks from the population, community, or ecosystem levels? And can we infer the sub-organismal processes from data on the population, community, or ecosystem scale? Concerted efforts to develop more cross-disciplinary frameworks will open doors to a more fully integrated field of biology. In this paper we discuss how we might integrate across scales, specifically by (1) identifying scales and boundaries, (2) determining analogous units and processes across scales, (3) developing frameworks to unite multiple scales, and (4) extending frameworks to new empirical systems.
|
40
|
Tyler J, Forger D, Kim JK. Inferring causality in biological oscillators. Bioinformatics 2021; 38:196-203. [PMID: 34463706 PMCID: PMC8696107 DOI: 10.1093/bioinformatics/btab623] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/28/2021] [Revised: 08/25/2021] [Accepted: 08/27/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Fundamental to biological study is identifying regulatory interactions. The recent surge in time-series data collection in biology provides a unique opportunity to infer regulations computationally. However, when components oscillate, model-free inference methods, while easily implemented, struggle to distinguish periodic synchrony from causality. Alternatively, model-based methods test the reproducibility of time series given a specific model but require inefficient simulations and have limited applicability. RESULTS We develop an inference method based on a general model of molecular, neuronal and ecological oscillatory systems that merges the advantages of both model-based and model-free methods, namely accuracy, broad applicability and usability. Our method successfully infers the positive and negative regulations within various oscillatory networks, e.g. the repressilator and a network of cofactors at the pS2 promoter, outperforming popular inference methods. AVAILABILITY AND IMPLEMENTATION We provide a computational package, ION (Inferring Oscillatory Networks), that users can easily apply to noisy, oscillatory time series to uncover the mechanisms by which diverse systems generate oscillations. Accompanying MATLAB code under a BSD-style license and examples are available at https://github.com/Mathbiomed/ION. Additionally, the code is available under a CC-BY 4.0 License at https://doi.org/10.6084/m9.figshare.16431408.v1. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jonathan Tyler
  - Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Pediatrics, University of Michigan, Ann Arbor, MI 48109, USA
- Daniel Forger
  - Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
|
41
|
Pazhamala LT, Kudapa H, Weckwerth W, Millar AH, Varshney RK. Systems biology for crop improvement. THE PLANT GENOME 2021; 14:e20098. [PMID: 33949787 DOI: 10.1002/tpg2.20098] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 12/05/2020] [Accepted: 03/09/2021] [Indexed: 05/19/2023]
Abstract
In recent years, the generation of large-scale genome, transcriptome, proteome, metabolome, epigenome, and other data has become routine in several plant species. Most of these datasets in different crop species, however, were studied independently, and as a result, full insight could not be gained into the molecular basis of complex traits and biological networks. A systems biology approach involving integration of multiple omics data, modeling, and prediction of cellular functions is required to understand the flow of biological information that underlies complex traits. In this context, systems biology with multiomics data integration is crucial and allows a holistic understanding of the dynamic system, in which different levels of biological organization interact with the external environment to produce a phenotype. Here, we present recent progress in omics studies and in integrative and systems biology approaches, with a special focus on applications to crop improvement. We also discuss the challenges and opportunities in multiomics data integration, modeling, and understanding the biology of complex traits underpinning yield and stress tolerance in major cereals and legumes.
Affiliation(s)
- Lekha T Pazhamala
  - Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502 324, India
- Himabindu Kudapa
  - Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502 324, India
- Wolfram Weckwerth
  - Department of Ecogenomics and Systems Biology, University of Vienna, Vienna, Austria
  - Vienna Metabolomics Center, University of Vienna, Vienna, Austria
- A Harvey Millar
  - ARC Centre of Excellence in Plant Energy Biology and School of Molecular Sciences, The University of Western Australia, Perth, WA, Australia
- Rajeev K Varshney
  - Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502 324, India
  - State Agricultural Biotechnology Centre, Crop Research Innovation Centre, Food Futures Institute, Murdoch University, Murdoch, WA, Australia
|
42
|
Integrated Inference of Asymmetric Protein Interaction Networks Using Dynamic Model and Individual Patient Proteomics Data. Symmetry (Basel) 2021. [DOI: 10.3390/sym13061097] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/23/2023] Open
Abstract
Recent advances in experimental biology have produced large amounts of molecular activity data. In particular, individual patient data provide non-time-series information on molecular activities in disease conditions. The challenge is to design effective algorithms that infer regulatory networks from individual patient datasets and, in doing so, address the issue of network symmetry. This work aims to develop an efficient pipeline to reverse-engineer regulatory networks from individual patient proteomic data. The first step uses the SCOUT algorithm to infer the pseudo-time trajectory of individual patients. Then the path-consistent method with part mutual information is used to construct a static network that contains the potential protein interactions. Because this static network is undirected and therefore symmetric, a dynamic model of ordinary differential equations is then used to remove false interactions and derive asymmetric networks. In this work, a dataset from triple-negative breast cancer patients is used to develop a protein-protein interaction network with 15 proteins.
|
43
|
Identifying "more equal than others" edges in diverse biochemical networks. Proc Natl Acad Sci U S A 2021; 118:2103698118. [PMID: 33766850 DOI: 10.1073/pnas.2103698118] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022] Open
|
44
|
Computational analysis of fused co-expression networks for the identification of candidate cancer gene biomarkers. NPJ Syst Biol Appl 2021; 7:17. [PMID: 33712625 PMCID: PMC7955132 DOI: 10.1038/s41540-021-00175-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/15/2020] [Accepted: 02/08/2021] [Indexed: 11/08/2022] Open
Abstract
The complexity of cancer has always been a major obstacle to understanding the source of this disease. However, by appreciating its complexity, we can shed some light on crucial gene associations across and within specific cancer types. In this study, we develop a general framework to infer relevant gene biomarkers and their gene-to-gene associations using multiple gene co-expression networks for each cancer type. Specifically, we infer computationally and biologically interesting communities of genes from kidney renal clear cell carcinoma, liver hepatocellular carcinoma, and prostate adenocarcinoma data sets of The Cancer Genome Atlas (TCGA) database. The gene communities are extracted through a data-driven pipeline and then evaluated through both functional analyses and literature findings. Furthermore, we provide a computational validation of their relevance for each cancer type by comparing the performance of normal/cancer classification for our identified gene sets and other gene signatures, including the typically used differentially expressed genes. The hallmark of this study is its approach based on gene co-expression networks from different similarity measures: by combining multiple gene networks and then fusing normal and cancer networks for each cancer type, we can gain better insight into the overall structure of the cancer-type-specific network.
|
45
|
Raimondo S, De Domenico M. Measuring topological descriptors of complex networks under uncertainty. Phys Rev E 2021; 103:022311. [PMID: 33735966 DOI: 10.1103/physreve.103.022311] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/04/2020] [Accepted: 01/13/2021] [Indexed: 11/07/2022]
Abstract
Revealing the structural features of a complex system from the observed collective dynamics is a fundamental problem in network science. To compute the various topological descriptors commonly used to characterize the structure of a complex system (e.g., the degree, the clustering coefficient, etc.), it is usually necessary to completely reconstruct the network of relations between the subsystems. Several methods are available to detect the existence of interactions between the nodes of a network. By observing some physical quantities through time, the structural relationships are inferred using various discriminating statistics (e.g., correlations, mutual information, etc.). In this setting, the uncertainty about the existence of the edges is reflected in the uncertainty about the topological descriptors. In this study, we propose a methodological framework to evaluate this uncertainty, replacing the topological descriptors, even at the level of a single node, with appropriate probability distributions, eluding the reconstruction phase. Our theoretical framework agrees with the numerical experiments performed on a large set of synthetic and real-world networks. Our results provide a grounded framework for the analysis and the interpretation of widely used topological descriptors, such as degree centrality, clustering, and clusters, in scenarios in which the existence of network connectivity is statistically inferred or when the probabilities of existence π_{ij} of the edges are known. To this purpose, we also provide a simple and mathematically grounded process to transform the discriminating statistics into the probabilities π_{ij}.
Affiliation(s)
- Sebastian Raimondo
  - CoMuNe Lab, Center for Information and Communication Technology, Fondazione Bruno Kessler, Via Sommarive 18, 38123 Povo (TN), Italy and Department of Mathematics, University of Trento, Via Sommarive 9, 38123 Povo (TN), Italy
- Manlio De Domenico
  - CoMuNe Lab, Center for Information and Communication Technology, Fondazione Bruno Kessler, Via Sommarive 18, 38123 Povo (TN), Italy
|
46
|
Zhao M, He W, Tang J, Zou Q, Guo F. A comprehensive overview and critical evaluation of gene regulatory network inference technologies. Brief Bioinform 2021; 22:6128842. [PMID: 33539514 DOI: 10.1093/bib/bbab009] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 09/28/2020] [Revised: 12/11/2020] [Accepted: 01/06/2021] [Indexed: 12/12/2022] Open
Abstract
Gene regulatory networks (GRNs) are an important mechanism for maintaining life processes, controlling biochemical reactions, and regulating compound levels, and they play an important role in various organisms and systems. Reconstructing GRNs can help us understand the molecular mechanisms of organisms and reveal the essential rules behind a large number of biological processes and reactions. To deal with uncertainty, network reconstruction algorithms rely on specific assumptions that affect their prediction accuracy. To study why a certain method is more suitable for a specific research problem or type of experimental data, we organize our review around three classes of methods: model-based, information-based, and machine learning-based. Clearly, different types of computational tools can be developed to distinguish GRNs. Furthermore, we discuss several classical, representative, and recent methods in each category and analyze their core ideas, general steps, and characteristics. We compare the performance of state-of-the-art GRN reconstruction technologies on simulated and real networks of different scales. Using standardized performance metrics and common benchmarks, we quantitatively evaluate the stability of the various methods and the sensitivity of each algorithm when applied to networks of different scales. The aim of this study is to identify the most appropriate method for a specific GRN, helping biologists and medical scientists discover potential drug targets and identify cancer biomarkers.
Affiliation(s)
- Mengyuan Zhao
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Wenying He
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Jijun Tang
  - University of South Carolina, Tianjin, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Fei Guo
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
|
47
|
Dhillon BK, Smith M, Baghela A, Lee AHY, Hancock REW. Systems Biology Approaches to Understanding the Human Immune System. Front Immunol 2020; 11:1683. [PMID: 32849587 PMCID: PMC7406790 DOI: 10.3389/fimmu.2020.01683] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 03/11/2020] [Accepted: 06/24/2020] [Indexed: 12/18/2022] Open
Abstract
Systems biology is an approach to interrogate complex biological systems through large-scale quantification of numerous biomolecules. The immune system involves >1,500 genes/proteins in many interconnected pathways and processes, and a systems-level approach is critical in broadening our understanding of the immune response to vaccination. Changes in molecular pathways can be detected using high-throughput omics datasets (e.g., transcriptomics, proteomics, and metabolomics) by using methods such as pathway enrichment, network analysis, machine learning, etc. Importantly, integration of multiple omic datasets is becoming key to revealing novel biological insights. In this perspective article, we highlight the use of protein-protein interaction (PPI) networks as a multi-omics integration approach to unravel information flow and mechanisms during complex biological events, with a focus on the immune system. This involves a combination of tools, including: InnateDB, a database of curated interactions between genes and protein products involved in the innate immunity; NetworkAnalyst, a visualization and analysis platform for InnateDB interactions; and MetaBridge, a tool to integrate metabolite data into PPI networks. The application of these systems techniques is demonstrated for a variety of biological questions, including: the developmental trajectory of neonates during the first week of life, mechanisms in host-pathogen interaction, disease prognosis, biomarker discovery, and drug discovery and repurposing. Overall, systems biology analyses of omics data have been applied to a variety of immunology-related questions, and here we demonstrate the numerous ways in which PPI network analysis can be a powerful tool in contributing to our understanding of the immune system and the study of vaccines.
Affiliation(s)
- Bhavjinder K. Dhillon
  - Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, BC, Canada
- Maren Smith
  - Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, BC, Canada
- Arjun Baghela
  - Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, BC, Canada
- Amy H. Y. Lee
  - Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, BC, Canada
  - Molecular Biology & Biochemistry Department, Simon Fraser University, Burnaby, BC, Canada
- Robert E. W. Hancock
  - Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, BC, Canada
|