1
Morabito A, De Simone G, Pastorelli R, Brunelli L, Ferrario M. Algorithms and tools for data-driven omics integration to achieve multilayer biological insights: a narrative review. J Transl Med 2025; 23:425. [PMID: 40211300 PMCID: PMC11987215 DOI: 10.1186/s12967-025-06446-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/10/2025] [Accepted: 03/30/2025] [Indexed: 04/13/2025]
Abstract
Systems biology is a holistic approach to biological sciences that combines experimental and computational strategies, aimed at integrating information from different scales of biological processes to unravel pathophysiological mechanisms and behaviours. In this scenario, high-throughput technologies have been playing a major role in providing huge amounts of omics data, whose integration would offer unprecedented possibilities for gaining insights into diseases and identifying potential biomarkers. In the present review, we focus on strategies that have been applied in the literature to integrate genomics, transcriptomics, proteomics, and metabolomics in the year range 2018-2024. Integration approaches were divided into three main categories: statistical-based approaches, multivariate methods, and machine learning/artificial intelligence techniques. Among them, statistical approaches (mainly based on correlation) were the ones with a slightly higher prevalence, followed by multivariate approaches and machine learning techniques. Integrating multiple biological layers has shown great potential in uncovering molecular mechanisms, identifying putative biomarkers, and aiding classification, most of the time resulting in better performance than single-omics analyses. However, significant challenges remain. The high-throughput nature of omics platforms introduces issues such as variable data quality, missing values, collinearity, and dimensionality. These challenges grow when multiple omics datasets are combined, as the complexity and heterogeneity of the data increase with integration. We report different strategies that have been found in the literature to cope with these challenges, but some open issues remain and should be addressed to realize the full potential of omics integration.
Affiliation(s)
- Aurelia Morabito
  - Laboratory of Metabolites and Proteins in Translational Research, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156, Milan, Italy
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133, Milan, Italy
- Giulia De Simone
  - Laboratory of Metabolites and Proteins in Translational Research, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156, Milan, Italy
  - Department of Biotechnologies and Biosciences, Università degli Studi Milano Bicocca, 20126, Milan, Italy
- Roberta Pastorelli
  - Laboratory of Metabolites and Proteins in Translational Research, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156, Milan, Italy
- Laura Brunelli
  - Laboratory of Metabolites and Proteins in Translational Research, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156, Milan, Italy
- Manuela Ferrario
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133, Milan, Italy
2
Miyake K, Costa Cruz PH, Nagatomo I, Kato Y, Motooka D, Satoh S, Adachi Y, Takeda Y, Kawahara Y, Kumanogoh A. A cancer-associated METTL14 mutation induces aberrant m6A modification, affecting tumor growth. Cell Rep 2023; 42:112688. [PMID: 37355987 DOI: 10.1016/j.celrep.2023.112688] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/26/2022] [Revised: 04/25/2023] [Accepted: 06/08/2023] [Indexed: 06/27/2023]
Abstract
The methyltransferase-like 3 (METTL3)-/METTL14-containing complex predominantly catalyzes N6-methyladenosine (m6A) modification, which affects mRNA stability. Although the METTL14 R298P mutation is found in multiple cancer types, its biological effects are not completely understood. Here, we show that the heterozygous R298P mutation promotes cancer cell proliferation, whereas the homozygous mutation reduces proliferation. Methylated RNA immunoprecipitation sequencing analysis indicates that the R298P mutation reduces m6A modification at canonical motifs. Furthermore, this mutation induces m6A modification at aberrant motifs, which is evident only in cell lines harboring the homozygous mutation. The aberrant recognition of m6A modification sites alters the methylation efficiency at surrounding canonical motifs. One example is c-MET mRNA, which is highly methylated at canonical motifs close to the aberrantly methylated sites. Consequently, c-MET mRNA is severely destabilized, reducing c-Myc expression and suppressing cell proliferation. These data suggest that the METTL14 R298P mutation affects target recognition for m6A modification, perturbing gene expression patterns and cell growth.
Affiliation(s)
- Kotaro Miyake
  - Department of Respiratory Medicine and Clinical Immunology, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
- Pedro Henrique Costa Cruz
  - Department of RNA Biology and Neuroscience, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
- Izumi Nagatomo
  - Department of Respiratory Medicine and Clinical Immunology, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
- Yuki Kato
  - Department of RNA Biology and Neuroscience, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan; Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Osaka 565-0871, Japan
- Daisuke Motooka
  - Research Institute for Microbial Diseases, Osaka University, Suita, Osaka 565-0871, Japan
- Shingo Satoh
  - Department of Respiratory Medicine and Clinical Immunology, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
- Yuichi Adachi
  - Department of Respiratory Medicine and Clinical Immunology, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
- Yoshito Takeda
  - Department of Respiratory Medicine and Clinical Immunology, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan
- Yukio Kawahara
  - Department of RNA Biology and Neuroscience, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan; Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Osaka 565-0871, Japan
- Atsushi Kumanogoh
  - Department of Respiratory Medicine and Clinical Immunology, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan; Department of Immunopathology, World Premier Institute Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Osaka 565-0871, Japan; Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Osaka 565-0871, Japan
3
Chierici M, Bussola N, Marcolini A, Francescatto M, Zandonà A, Trastulla L, Agostinelli C, Jurman G, Furlanello C. Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling. Front Oncol 2020; 10:1065. [PMID: 32714870 PMCID: PMC7340129 DOI: 10.3389/fonc.2020.01065] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 03/31/2020] [Accepted: 05/28/2020] [Indexed: 12/20/2022]
Abstract
Recent technological advances and international efforts, such as The Cancer Genome Atlas (TCGA), have made available several pan-cancer datasets encompassing multiple omics layers with detailed clinical information in large collections of samples. The need has thus arisen for the development of computational methods aimed at improving cancer subtyping and biomarker identification from multi-modal data. Here we apply the Integrative Network Fusion (INF) pipeline, which combines multiple omics layers exploiting Similarity Network Fusion (SNF) within a machine learning predictive framework. INF includes a feature ranking scheme (rSNF) on SNF-integrated features, used by a classifier over juxtaposed multi-omics features (juXT). In particular, we show instances of INF implementing Random Forest (RF) and linear Support Vector Machine (LSVM) as the classifier, and two baseline RF and LSVM models are also trained on juXT. A compact RF model, called rSNFi, trained on the intersection of top-ranked biomarkers from the two approaches juXT and rSNF is finally derived. All the classifiers are run in a 10x5-fold cross-validation schema to warrant reproducibility, following the guidelines for an unbiased Data Analysis Plan by the US FDA-led initiatives MAQC/SEQC. INF is demonstrated on four classification tasks on three multi-modal TCGA oncogenomics datasets. Gene expression, protein expression and copy number variants are used to predict estrogen receptor status (BRCA-ER, N = 381) and breast invasive carcinoma subtypes (BRCA-subtypes, N = 305), while gene expression, miRNA expression and methylation data are used as predictor layers for acute myeloid leukemia and renal clear cell carcinoma survival (AML-OS, N = 157; KIRC-OS, N = 181). In test, INF achieved similar Matthews Correlation Coefficient (MCC) values and 97% to 83% smaller feature sizes (FS), compared with juXT for BRCA-ER (MCC: 0.83 vs. 0.80; FS: 56 vs. 1801) and BRCA-subtypes (0.84 vs. 0.80; 302 vs. 1801), improving KIRC-OS performance (0.38 vs. 0.31; 111 vs. 2319). INF predictions are generally more accurate in test than one-dimensional omics models, with smaller signatures too, where transcriptomics consistently plays the leading role. Overall, the INF framework effectively integrates multiple data levels in oncogenomics classification tasks, improving over the performance of single layers alone and naive juxtaposition, and provides compact signature sizes.
Affiliation(s)
- Nicole Bussola
  - Fondazione Bruno Kessler, Trento, Italy
  - University of Trento, Trento, Italy
- Margherita Francescatto
  - Fondazione Bruno Kessler, Trento, Italy
  - Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste, Italy