1
Isavand P, Aghamiri SS, Amin R. Applications of Multimodal Artificial Intelligence in Non-Hodgkin Lymphoma B Cells. Biomedicines 2024; 12:1753. [PMID: 39200217 PMCID: PMC11351272 DOI: 10.3390/biomedicines12081753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 07/22/2024] [Accepted: 08/01/2024] [Indexed: 09/02/2024] Open
Abstract
Given advancements in large-scale data and AI, integrating multimodal artificial intelligence into cancer research can enhance our understanding of tumor behavior by simultaneously processing diverse biomedical data types. In this review, we explore the potential of multimodal AI in comprehending B-cell non-Hodgkin lymphomas (B-NHLs). B-NHLs represent a particular challenge in oncology due to tumor heterogeneity and the intricate ecosystem in which tumors develop. These complexities complicate diagnosis, prognosis, and prediction of therapy response, emphasizing the need for sophisticated approaches to enhance personalized treatment strategies for better patient outcomes. Therefore, multimodal AI can be leveraged to synthesize critical information from available biomedical data, such as clinical records, imaging, pathology, and omics data, to picture the whole tumor. In this review, we first define various types of modalities, multimodal AI frameworks, and several applications in precision medicine. Then, we provide several examples of its usage in B-NHLs for analyzing the complexity of the ecosystem, identifying immune biomarkers, optimizing therapy strategy, and supporting clinical applications. Lastly, we address the limitations and future directions of multimodal AI, highlighting the need to overcome these challenges for better clinical practice and application in healthcare.
Affiliation(s)
- Pouria Isavand
- Department of Radiology, School of Medicine, Zanjan University of Medical Sciences, Zanjan 4513956184, Iran
- Rada Amin
- Department of Biochemistry, University of Nebraska, Lincoln, NE 68503, USA
2
Roman-Naranjo P, Parra-Perez AM, Lopez-Escamez JA. A systematic review on machine learning approaches in the diagnosis and prognosis of rare genetic diseases. J Biomed Inform 2023:104429. [PMID: 37352901 DOI: 10.1016/j.jbi.2023.104429] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/28/2023] [Revised: 06/05/2023] [Accepted: 06/17/2023] [Indexed: 06/25/2023]
Abstract
BACKGROUND The diagnosis of rare genetic diseases is often challenging due to the complexity of the genetic underpinnings of these conditions and the limited availability of diagnostic tools. Machine learning (ML) algorithms have the potential to improve the accuracy and speed of diagnosis by analyzing large amounts of genomic data and identifying complex multiallelic patterns that may be associated with specific diseases. In this systematic review, we aimed to identify the methodological trends and the ML application areas in rare genetic diseases.
METHODS We performed a systematic review of the literature following the PRISMA guidelines to search studies that used ML approaches to enhance the diagnosis of rare genetic diseases. Studies that used DNA-based sequencing data and a variety of ML algorithms were included, summarized, and analyzed using bibliometric methods, visualization tools, and a feature co-occurrence analysis.
FINDINGS Our search identified 22 studies that met the inclusion criteria. We found that exome sequencing was the most frequently used sequencing technology (59%), and rare neoplastic diseases were the most prevalent disease scenario (59%). In rare neoplasms, the most frequent applications of ML models were the differential diagnosis or stratification of patients (38.5%) and the identification of somatic mutations (30.8%). In other rare diseases, the most frequent goals were the prioritization of rare variants or genes (55.5%) and the identification of biallelic or digenic inheritance (33.3%). The most employed method was the random forest algorithm (54.5%). In addition, the features of the datasets needed for training these algorithms were distinctive depending on the goal pursued, including the mutational load in each gene for the differential diagnosis of patients, or the combination of genotype features and sequence-derived features (such as GC-content) for the identification of somatic mutations.
CONCLUSIONS ML algorithms based on sequencing data are mainly used for the diagnosis of rare neoplastic diseases, with random forest being the most common approach. We identified key features in the datasets used for training these ML models according to the objective pursued. These features can support the development of future ML models in the diagnosis of rare genetic diseases.
Affiliation(s)
- P Roman-Naranjo
- Division of Otolaryngology, Department of Surgery, Instituto de Investigación Biosanitaria, ibs.GRANADA, Universidad de Granada, Granada, Spain; Otology and Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer, University of Granada, Junta de Andalucía, PTS, Granada, Spain; Sensorineural Pathology Programme, Centro de Investigación Biomédica en Red en Enfermedades Raras, CIBERER, Madrid, Spain
- A M Parra-Perez
- Division of Otolaryngology, Department of Surgery, Instituto de Investigación Biosanitaria, ibs.GRANADA, Universidad de Granada, Granada, Spain; Otology and Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer, University of Granada, Junta de Andalucía, PTS, Granada, Spain; Sensorineural Pathology Programme, Centro de Investigación Biomédica en Red en Enfermedades Raras, CIBERER, Madrid, Spain
- J A Lopez-Escamez
- Division of Otolaryngology, Department of Surgery, Instituto de Investigación Biosanitaria, ibs.GRANADA, Universidad de Granada, Granada, Spain; Otology and Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer, University of Granada, Junta de Andalucía, PTS, Granada, Spain; Sensorineural Pathology Programme, Centro de Investigación Biomédica en Red en Enfermedades Raras, CIBERER, Madrid, Spain; Meniere's Disease Neuroscience Research Program, Faculty of Medicine & Health, School of Medical Sciences, The Kolling Institute, University of Sydney, Sydney, New South Wales, Australia
3
De-Kayne R, Selz OM, Marques DA, Frei D, Seehausen O, Feulner PGD. Genomic architecture of adaptive radiation and hybridization in Alpine whitefish. Nat Commun 2022; 13:4479. [PMID: 35918341 PMCID: PMC9345977 DOI: 10.1038/s41467-022-32181-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/04/2021] [Accepted: 07/20/2022] [Indexed: 11/18/2022] Open
Abstract
Adaptive radiations represent some of the most remarkable explosions of diversification across the tree of life. However, the constraints to rapid diversification and how they are sometimes overcome, particularly the relative roles of genetic architecture and hybridization, remain unclear. Here, we address these questions in the Alpine whitefish radiation, using a whole-genome dataset that includes multiple individuals of each of the 22 species belonging to six ecologically distinct ecomorph classes across several lake-systems. We reveal that repeated ecological and morphological diversification along a common environmental axis is associated with both genome-wide allele frequency shifts and a specific, larger effect, locus, associated with the gene edar. Additionally, we highlight the possible role of introgression between species from different lake-systems in facilitating the evolution and persistence of species with unique trait combinations and ecology. These results highlight the importance of both genome architecture and secondary contact with hybridization in fuelling adaptive radiation.
Affiliation(s)
- Rishi De-Kayne
- Department of Fish Ecology and Evolution, Centre of Ecology, Evolution and Biogeochemistry, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- Division of Aquatic Ecology and Evolution, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Oliver M Selz
- Department of Fish Ecology and Evolution, Centre of Ecology, Evolution and Biogeochemistry, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- David A Marques
- Department of Fish Ecology and Evolution, Centre of Ecology, Evolution and Biogeochemistry, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- Division of Aquatic Ecology and Evolution, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Natural History Museum Basel, Basel, Switzerland
- David Frei
- Department of Fish Ecology and Evolution, Centre of Ecology, Evolution and Biogeochemistry, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- Division of Aquatic Ecology and Evolution, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Ole Seehausen
- Department of Fish Ecology and Evolution, Centre of Ecology, Evolution and Biogeochemistry, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- Division of Aquatic Ecology and Evolution, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Philine G D Feulner
- Department of Fish Ecology and Evolution, Centre of Ecology, Evolution and Biogeochemistry, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- Division of Aquatic Ecology and Evolution, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
4
Ozerov M, Noreikiene K, Kahar S, Huss M, Huusko A, Kõiv T, Sepp M, López M, Gårdmark A, Gross R, Vasemägi A. Whole-genome sequencing illuminates multifaceted targets of selection to humic substances in Eurasian perch. Mol Ecol 2022; 31:2367-2383. [PMID: 35202502 PMCID: PMC9314028 DOI: 10.1111/mec.16409] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/06/2021] [Revised: 02/10/2022] [Accepted: 02/17/2022] [Indexed: 11/30/2022]
Abstract
Extreme environments are inhospitable to the majority of species, but some organisms are able to survive in such hostile conditions due to evolutionary adaptations. For example, modern bony fishes have colonized various aquatic environments, including perpetually dark, hypoxic, hypersaline and toxic habitats. Eurasian perch (Perca fluviatilis) is among the few fish species of northern latitudes that is able to live in very acidic humic lakes. Such lakes represent almost "nocturnal" environments; they contain high levels of dissolved organic matter, which in addition to creating a challenging visual environment, also affects a large number of other habitat parameters and biotic interactions. To reveal the genomic targets of humic-associated selection, we performed whole-genome sequencing of perch originating from 16 humic and 16 clear-water lakes in northern Europe. We identified over 800,000 single nucleotide polymorphisms, of which >10,000 were identified as potential candidates under selection (associated with >3000 genes) using multiple outlier approaches. Our findings suggest that adaptation to the humic environment may involve hundreds of regions scattered across the genome. Putative signals of adaptation were detected in genes and gene families with diverse functions, including organism development and ion transportation. The observed excess of variants under selection in regulatory regions highlights the importance of adaptive evolution via regulatory elements, rather than via protein sequence modification. Our study demonstrates the power of whole-genome analysis to illuminate the multifaceted nature of humic adaptation and provides the foundation for further investigation of causal mutations underlying phenotypic traits of ecological and evolutionary importance.
Affiliation(s)
- Mikhail Ozerov
- Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, Drottningholm, Sweden
- Department of Biology, University of Turku, Turku, Finland
- Biodiversity Unit, University of Turku, Turku, Finland
- Kristina Noreikiene
- Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Siim Kahar
- Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Magnus Huss
- Department of Aquatic Resources, Swedish University of Agricultural Sciences, Öregrund, Sweden
- Ari Huusko
- Natural Resources Institute Finland (Luke), Paltamo, Finland
- Toomas Kõiv
- Chair of Hydrobiology and Fishery, Institute of Agricultural and Environmental Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Margot Sepp
- Chair of Hydrobiology and Fishery, Institute of Agricultural and Environmental Sciences, Estonian University of Life Sciences, Tartu, Estonia
- María‐Eugenia López
- Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, Drottningholm, Sweden
- Anna Gårdmark
- Department of Aquatic Resources, Swedish University of Agricultural Sciences, Öregrund, Sweden
- Riho Gross
- Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Anti Vasemägi
- Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, Drottningholm, Sweden
- Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
5
Narum S, News JK, Fountain-Jones N, Hooper Junior R, Ortiz-Barrientos D, O'Boyle B, Sibbett B. Editorial 2022. Mol Ecol Resour 2021; 22:1-8. [PMID: 34919782 DOI: 10.1111/1755-0998.13572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
6
Lou RN, Therkildsen NO. Batch effects in population genomic studies with low-coverage whole genome sequencing data: Causes, detection and mitigation. Mol Ecol Resour 2021; 22:1678-1692. [PMID: 34825778 DOI: 10.1111/1755-0998.13559] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/22/2021] [Revised: 11/05/2021] [Accepted: 11/11/2021] [Indexed: 01/04/2023]
Abstract
Over the past few decades, there has been an explosion in the amount of publicly available sequencing data. This opens new opportunities for combining data sets to achieve unprecedented sample sizes, spatial coverage or temporal replication in population genomic studies. However, a common concern is that nonbiological differences between data sets may generate patterns of variation in the data that can confound real biological patterns, a problem known as batch effects. In this paper, we compare two batches of low-coverage whole genome sequencing (lcWGS) data generated from the same populations of Atlantic cod (Gadus morhua). First, we show that with a "batch-effect-naive" bioinformatic pipeline, batch effects systematically biased our genetic diversity estimates, population structure inference and selection scans. We then demonstrate that these batch effects resulted from multiple technical differences between our data sets, including the sequencing chemistry (four-channel vs. two-channel), sequencing run, read type (single-end vs. paired-end), read length (125 vs. 150 bp), DNA degradation level (degraded vs. well preserved) and sequencing depth (0.8× vs. 0.3× on average). Lastly, we illustrate that a set of simple bioinformatic strategies (such as different read trimming and single nucleotide polymorphism filtering) can be used to detect batch effects in our data and substantially mitigate their impact. We conclude that combining data sets remains a powerful approach as long as batch effects are explicitly accounted for. We focus on lcWGS data in this paper, which may be particularly vulnerable to certain causes of batch effects, but many of our conclusions also apply to other sequencing strategies.
7
Abstract
The rapidly emerging field of macrogenetics focuses on analysing publicly accessible genetic datasets from thousands of species to explore large-scale patterns and predictors of intraspecific genetic variation. Facilitated by advances in evolutionary biology, technology, data infrastructure, statistics and open science, macrogenetics addresses core evolutionary hypotheses (such as disentangling environmental and life-history effects on genetic variation) with a global focus. Yet, there are important, often overlooked, limitations to this approach and best practices need to be considered and adopted if macrogenetics is to continue its exciting trajectory and reach its full potential in fields such as biodiversity monitoring and conservation. Here, we review the history of this rapidly growing field, highlight knowledge gaps and future directions, and provide guidelines for further research.