1
Shaposhnikov M, Thakar J, Berk BC. Value of Bioinformatics Models for Predicting Translational Control of Angiogenesis. Circ Res 2025; 136:1147-1165. [PMID: 40339045 DOI: 10.1161/circresaha.125.325438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2025]
Abstract
Angiogenesis, the formation of new blood vessels, is a fundamental biological process with implications for both physiological functions and pathological conditions. While the transcriptional regulation of angiogenesis, mediated by factors such as HIF-1α (hypoxia-inducible factor 1-alpha) and VEGF (vascular endothelial growth factor), is well-characterized, the translational regulation of this process remains underexplored. Bioinformatics has emerged as an indispensable tool for advancing our understanding of translational regulation, offering predictive models that leverage large data sets to guide research and optimize experimental approaches. However, a significant gap persists between bioinformatics experts and other researchers, limiting the accessibility and utility of these tools in the broader scientific community. To address this divide, user-friendly bioinformatics platforms are being developed to democratize access to predictive analytics and empower researchers across disciplines. Translational control, compared with transcriptional control, offers a more energy-efficient mechanism that facilitates rapid cellular responses to environmental changes. Furthermore, transcriptional regulators themselves are often subject to translational control, emphasizing the interconnected nature of these regulatory layers. Investigating translational regulation requires advanced, accessible bioinformatics tools to analyze RNA structures, interacting micro-RNAs, long noncoding RNAs, and RBPs (RNA-binding proteins). Predictive platforms such as RNA structure, human internal ribosome entry site Atlas, and RBPSuite enable the study of RNA motifs and RNA-protein interactions, shedding light on these critical regulatory mechanisms. This review highlights the transformative role of bioinformatics using widely accessible user-friendly tools with a Web-browser interface to elucidate translational regulation in angiogenesis. 
The bioinformatics tools discussed extend beyond angiogenesis, with applications in diverse fields, including clinical care. By integrating predictive models and experimental insights, researchers can streamline hypothesis generation, reduce experimental costs, and find novel translational regulators. By bridging the bioinformatics knowledge gap, this review aims to empower researchers worldwide to adopt bioinformatics tools in their work, fostering innovation and accelerating scientific discovery.
Affiliation(s)
- Michal Shaposhnikov
  - Department of Cellular and Molecular Pharmacology and Physiology (M.S., B.C.B.), University of Rochester School of Medicine and Dentistry, NY
  - Department of Medicine, Aab Cardiovascular Research Institute (M.S., B.C.B.), University of Rochester School of Medicine and Dentistry, NY
- Juilee Thakar
  - Department of Microbiology and Immunology (J.T.), University of Rochester School of Medicine and Dentistry, NY
  - Department of Biomedical Genetics, Biostatistics and Computational Biology (J.T.), University of Rochester School of Medicine and Dentistry, NY
- Bradford C Berk
  - Department of Cellular and Molecular Pharmacology and Physiology (M.S., B.C.B.), University of Rochester School of Medicine and Dentistry, NY
  - Department of Medicine, Aab Cardiovascular Research Institute (M.S., B.C.B.), University of Rochester School of Medicine and Dentistry, NY
2
Yan R, Islam MT, Xing L. Interpretable discovery of patterns in tabular data via spatially semantic topographic maps. Nat Biomed Eng 2025; 9:471-482. [PMID: 39407015 DOI: 10.1038/s41551-024-01268-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 09/23/2024] [Indexed: 04/18/2025]
Abstract
Tabular data (rows of samples and columns of sample features) are ubiquitously used across disciplines. Yet the tabular representation makes it difficult to discover underlying associations in the data and thus hinders their analysis and the discovery of useful patterns. Here we report a broadly applicable strategy for unravelling intertwined relationships in tabular data by reconfiguring each data sample into a spatially semantic 2D topographic map, which we refer to as TabMap. A TabMap preserves the original feature values as pixel intensities, with the relationships among the features spatially encoded in the map (the strength of two inter-related features correlates with their distance on the map). TabMap makes it possible to apply 2D convolutional neural networks to extract association patterns in the data to aid data analysis, and offers interpretability by ranking features according to importance. We show the superior predictive performance of TabMap by applying it to 12 datasets across a wide range of biomedical applications, including disease diagnosis, human activity recognition, microbial identification and the analysis of quantitative structure-activity relationships.
Affiliation(s)
- Rui Yan
  - Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
- Md Tauhidul Islam
  - Department of Radiation Oncology, Stanford University, Stanford, CA, USA
- Lei Xing
  - Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
  - Department of Radiation Oncology, Stanford University, Stanford, CA, USA
  - Department of Electrical Engineering, Stanford University, Stanford, CA, USA
3
Luo X, Zhang X, Su D, Li H, Zou M, Xiong Y, Yang L. Deep Clustering-Based Metabolic Stratification of Non-Small Cell Lung Cancer Patients Through Integration of Somatic Mutation Profile and Network Propagation Algorithm. Interdiscip Sci 2025:10.1007/s12539-025-00699-2. [PMID: 40100545 DOI: 10.1007/s12539-025-00699-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Revised: 02/21/2025] [Accepted: 02/22/2025] [Indexed: 03/20/2025]
Abstract
As a common malignancy of the lower respiratory tract, non-small cell lung cancer (NSCLC) represents a major oncological challenge globally, characterized by high incidence and mortality rates. Recent research highlights the critical involvement of somatic mutations in the onset and development of NSCLC. Stratification of NSCLC patients based on somatic mutation data could facilitate the identification of patients likely to respond to personalized therapeutic strategies. However, stratification of NSCLC patients using somatic mutation data is challenging due to the sparseness of these data. In this study, based on sparse somatic mutation data from 4581 NSCLC patients from the Memorial Sloan Kettering Cancer Center (MSKCC) database, we systematically evaluate the metabolic pathway activity in NSCLC patients through the application of a network propagation algorithm and computational biology algorithms. Based on the metabolic pathways associated with prognosis, as identified through univariate Cox regression analysis, NSCLC patients are stratified using a deep clustering algorithm to explore the optimal classification strategy, thereby establishing biologically meaningful metabolic subtypes of NSCLC patients. The NSCLC metabolic subtypes obtained from the network propagation and deep clustering algorithms are systematically evaluated and validated for survival benefits of immunotherapy. Our research marks progress towards developing a universal approach for classifying NSCLC patients based solely on somatic mutation profiles, employing a deep clustering algorithm. The implementation of our research will help to deepen the analysis of NSCLC patients' metabolic subtypes from the perspective of the tumor microenvironment, providing a strong basis for the formulation of more precise personalized treatment plans.
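The abstract does not say which network propagation variant is used. As an illustration only, the sketch below uses random walk with restart, one common choice, to spread a sparse mutation signal over a gene interaction graph. The function name `propagate`, the `restart` parameter, and the toy gene names are assumptions and do not come from the paper.

```python
def propagate(adjacency, seed_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph (illustrative sketch).

    adjacency:   dict mapping each node to a list of its neighbours
                 (symmetric, every node has at least one neighbour).
    seed_scores: dict mapping seed nodes (e.g. mutated genes) to weights.
    Returns a dict of smoothed scores that sum to 1.
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed node needs a positive score")
    # Normalised starting distribution: all mass sits on the seed nodes.
    p0 = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m shares its score equally among its own neighbours.
            inflow = sum(p[m] / len(adjacency[m]) for m in adjacency[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical four-gene chain with a single mutated gene, g1.
toy_graph = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"], "g4": ["g3"]}
smoothed = propagate(toy_graph, {"g1": 1.0})
```

In this toy chain the smoothed score decreases with distance from the mutated gene, which is how propagation turns a sparse binary mutation profile into a dense one that downstream clustering can use.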
Affiliation(s)
- Xu Luo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Xinpeng Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Dongqing Su
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Honghao Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Min Zou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yuqiang Xiong
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Lei Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
4
van Dorp CH, Gray JI, Paik DH, Farber DL, Yates AJ. A variational deep-learning approach to modeling memory T cell dynamics. bioRxiv 2025:2024.07.08.602409. [PMID: 40060443 PMCID: PMC11888226 DOI: 10.1101/2024.07.08.602409] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/15/2025]
Abstract
Mechanistic models of dynamic, interacting cell populations have yielded many insights into the growth and resolution of immune responses. Historically these models have described the behavior of pre-defined cell types based on small numbers of phenotypic markers. The ubiquity of deep phenotyping therefore presents a new challenge; how do we confront tractable and interpretable mathematical models with high-dimensional data? To tackle this problem, we studied the development and persistence of lung-resident memory CD4 and CD8 T cells (TRM) in mice infected with influenza virus. We developed an approach in which dynamical model parameters and the population structure are inferred simultaneously. This method uses deep learning and stochastic variational inference and is trained on the single-cell flow-cytometry data directly, rather than on the kinetics of pre-identified clusters. We show that during the resolution phase of the immune response, memory CD4 and CD8 T cells within the lung are phenotypically diverse, with subsets exhibiting highly distinct and time-dependent dynamics. TRM heterogeneity is maintained long-term by ongoing differentiation of relatively persistent Bcl-2hi CD4 and CD8 TRM subsets which resolve into distinct functional populations. Our approach yields new insights into the dynamics of tissue-localized immune memory, and is a novel basis for interpreting time series of high-dimensional data, broadly applicable to diverse biological systems.
Affiliation(s)
- Christiaan H van Dorp
  - Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York City, USA
- Joshua I Gray
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York City, USA
- Daniel H Paik
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York City, USA
- Donna L Farber
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York City, USA
- Andrew J Yates
  - Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York City, USA
5
Beato M, Jaward MH, Nassis GP, Figueiredo P, Clemente FM, Krustrup P. An Educational Review on Machine Learning: A SWOT Analysis for Implementing Machine Learning Techniques in Football. Int J Sports Physiol Perform 2025; 20:183-191. [PMID: 39662428 DOI: 10.1123/ijspp.2024-0247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 09/25/2024] [Accepted: 10/07/2024] [Indexed: 12/13/2024]
Abstract
PURPOSE The abundance of data in football presents both opportunities and challenges for decision making. Consequently, this review has 2 primary objectives: first, to provide practitioners with a concise overview of the characteristics of machine-learning (ML) analysis, and, second, to conduct a strengths, weaknesses, opportunities, and threats (SWOT) analysis regarding the implementation of ML techniques in professional football clubs. This review explains the difference between artificial intelligence and ML and the difference between ML and statistical analysis. Moreover, we summarize and explain the characteristics of ML learning approaches, such as supervised learning, unsupervised learning, and reinforcement learning. Finally, we present an example of a SWOT analysis that suggests some actions to be considered in applying ML techniques by medical and sport science staff working in football. Specifically, 4 dimensions are presented: the use of strengths to create opportunities and make the most of them, the use of strengths to avoid threats, working on weaknesses to take advantage of opportunities, and upgrading weaknesses to avoid threats. CONCLUSION ML analysis can be an invaluable tool for football clubs and sport-science and medical departments due to its ability to analyze vast amounts of data and extract meaningful insights. Moreover, ML can enhance performance by assessing the risk of injury, physiological parameters, and physical fitness, as well as optimizing training, recommending strategies based on opponent analysis, and identifying talent and assessing player suitability.
Affiliation(s)
- Marco Beato
  - School of Allied Health Sciences, University of Suffolk, Ipswich, United Kingdom
- Mohamed Hisham Jaward
  - School of Technology, Business and Arts, University of Suffolk, Ipswich, United Kingdom
- George P Nassis
  - Physical Education Department, College of Education, United Arab Emirates University, Al Ain, United Arab Emirates
  - Department of Sports Science and Clinical Biomechanics, Sport and Health Sciences Cluster (SHSC), University of Southern Denmark, Odense, Denmark
- Pedro Figueiredo
  - Physical Education Department, College of Education, United Arab Emirates University, Al Ain, United Arab Emirates
  - Research Center in Sports Sciences, Health Sciences and Human Development, CIDESD, Vila Real, Portugal
- Filipe Manuel Clemente
  - Escola Superior Desporto e Lazer, Instituto Politécnico de Viana do Castelo, Rua Escola Industrial e Comercial de Nun'Álvares, Viana do Castelo, Portugal
  - Gdansk University of Physical Education and Sport, Gdańsk, Poland
  - Sport Physical Activity and Health Research Innovation and Technology Center (SPRINT), Viana do Castelo, Portugal
- Peter Krustrup
  - Department of Sports Science and Clinical Biomechanics, Sport and Health Sciences Cluster (SHSC), University of Southern Denmark, Odense, Denmark
  - Danish Institute for Advanced Study (DIAS), University of Southern Denmark, Odense, Denmark
6
Ikotun AM, Habyarimana F, Ezugwu AE. Cluster validity indices for automatic clustering: A comprehensive review. Heliyon 2025; 11:e41953. [PMID: 39897868 PMCID: PMC11787482 DOI: 10.1016/j.heliyon.2025.e41953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2024] [Revised: 01/08/2025] [Accepted: 01/13/2025] [Indexed: 02/04/2025] Open
Abstract
The cluster validity index (CVI) is an integral part of clustering algorithms. It evaluates inter-cluster separation and intra-cluster cohesion of candidate clusters to determine the quality of potential solutions. Several CVIs have been suggested for both classical clustering algorithms and automatic metaheuristic-based clustering algorithms. Different CVIs exhibit different characteristics based on the mathematical models they employ in determining the values for the various cluster attributes. Metaheuristic-based automatic clustering algorithms use a CVI as the fitness function in their optimization procedure to evaluate the quality of candidate cluster solutions. A systematic review of the CVIs used as fitness functions in metaheuristic-based automatic clustering algorithms is presented in this study. Identifying, reporting, and analysing the various CVIs is important for determining which of them give the best performance in a metaheuristic-based automatic clustering algorithm. This review also includes an experimental study on the performance of some common CVIs on synthetic datasets with varied characteristics as well as real-life datasets using the SOSK-means automatic clustering algorithm. This review aims to assist researchers in identifying and selecting the most suitable CVIs for their specific application areas.
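To make the idea of scoring cohesion against separation concrete, the sketch below computes the Davies-Bouldin index, one widely used CVI for which lower values indicate better clustering. It is chosen here only as an example. The abstract does not list the indices that were evaluated, and the function name and toy data are illustrative assumptions.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labelled point set (lower is better).

    points: list of equal-length coordinate tuples.
    labels: list of cluster labels, one per point.
    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Centroid of each cluster, computed coordinate by coordinate.
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    # Intra-cluster cohesion: mean distance of members to their centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst-case ratio of combined scatter to centroid separation.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = davies_bouldin(toy_points, [0, 0, 1, 1])  # compact, well separated
bad = davies_bouldin(toy_points, [0, 1, 0, 1])   # stretched, overlapping
```

A metaheuristic-based automatic clustering algorithm of the kind reviewed would call such a function as its fitness function, preferring candidate partitions like `good` (low index) over `bad` (high index).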
Affiliation(s)
- Abiodun M. Ikotun
  - School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201, KwaZulu-Natal, South Africa
- Faustin Habyarimana
  - School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201, KwaZulu-Natal, South Africa
- Absalom E. Ezugwu
  - Unit for Data Science and Computing, North-West University, 11 Hoffman Street, Potchefstroom, 2520, North-West, South Africa
7
Shaheen A, Mrabah N, Ksantini R, Alqaddoumi A. Rethinking deep clustering paradigms: Self-supervision is all you need. Neural Netw 2025; 181:106773. [PMID: 39383676 DOI: 10.1016/j.neunet.2024.106773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 08/14/2024] [Accepted: 09/29/2024] [Indexed: 10/11/2024]
Abstract
The recent advances in deep clustering have been made possible by significant progress in self-supervised and pseudo-supervised learning. However, the trade-off between self-supervision and pseudo-supervision can give rise to three primary issues. Joint training causes Feature Randomness and Feature Drift, whereas independent training causes Feature Randomness and Feature Twist. In essence, using pseudo-labels generates random and unreliable features. The combination of pseudo-supervision and self-supervision drifts the reliable clustering-oriented features. Moreover, moving from self-supervision to pseudo-supervision can twist the curved latent manifolds. This paper addresses the limitations of existing deep clustering paradigms concerning Feature Randomness, Feature Drift, and Feature Twist. We propose a new paradigm with a new strategy that replaces pseudo-supervision with a second round of self-supervision training. The new strategy makes the transition between instance-level self-supervision and neighborhood-level self-supervision smoother and less abrupt. It also prevents the drifting effect that is caused by the strong competition between instance-level self-supervision and clustering-level pseudo-supervision. In addition, the absence of pseudo-supervision removes the risk of generating random features. With this novel approach, our paper introduces a Rethinking of the Deep Clustering Paradigms, denoted by R-DC. Our model is specifically designed to address three primary challenges encountered in Deep Clustering: Feature Randomness, Feature Drift, and Feature Twist. Experimental results on six datasets have shown that the two-level self-supervision training yields substantial improvements, as evidenced by the results of the clustering and ablation study. Furthermore, experimental comparisons with nine state-of-the-art clustering models have clearly shown that our strategy leads to a significant enhancement in performance.
Affiliation(s)
- Amal Shaheen
  - Computer Science, College of IT, UOB, Kingdom of Bahrain
- Nairouz Mrabah
  - Computer Science, Université du Québec à Montréal, Montréal, QC, Canada
- Riadh Ksantini
  - Computer Science, College of IT, UOB, Kingdom of Bahrain
8
Guo Y, Li T, Gong B, Hu Y, Wang S, Yang L, Zheng C. From Images to Genes: Radiogenomics Based on Artificial Intelligence to Achieve Non-Invasive Precision Medicine in Cancer Patients. Adv Sci (Weinh) 2025; 12:e2408069. [PMID: 39535476 PMCID: PMC11727298 DOI: 10.1002/advs.202408069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 10/19/2024] [Indexed: 11/16/2024]
Abstract
With the increasing demand for precision medicine in cancer patients, radiogenomics emerges as a promising frontier. Radiogenomics was originally defined as a methodology for associating gene expression information from high-throughput technologies with imaging phenotypes. However, with advancements in medical imaging, high-throughput omics technologies, and artificial intelligence, both the concept and application of radiogenomics have significantly broadened. This review covers the history of radiogenomics, related omics technologies, the five basic workflows and their applications across tumors, the role of AI in radiogenomics, the opportunities and challenges arising from tumor heterogeneity, and the applications of radiogenomics in the tumor immune microenvironment. The application of radiogenomics in positron emission tomography and the role of radiogenomics in multi-omics studies are also discussed. Finally, the challenges faced by clinical transformation, along with future trends in this field, are discussed.
Affiliation(s)
- Yusheng Guo
  - Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
  - Hubei Key Laboratory of Molecular Imaging, Wuhan 430022, China
- Tianxiang Li
  - Department of Ultrasound, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100730, China
- Bingxin Gong
  - Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
  - Hubei Key Laboratory of Molecular Imaging, Wuhan 430022, China
- Yan Hu
  - Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Sichen Wang
  - School of Life Science and Technology, Computational Biology Research Center, Harbin Institute of Technology, Harbin 150001, China
- Lian Yang
  - Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
  - Hubei Key Laboratory of Molecular Imaging, Wuhan 430022, China
- Chuansheng Zheng
  - Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
  - Hubei Key Laboratory of Molecular Imaging, Wuhan 430022, China
9
Gao J, Wu M, Liao J, Meng F, Chen C. Clustering one million molecular structures on GPU within seconds. J Comput Chem 2024; 45:2710-2718. [PMID: 39143827 DOI: 10.1002/jcc.27470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 06/13/2024] [Accepted: 07/14/2024] [Indexed: 08/16/2024]
Abstract
Structure clustering is a common but time-consuming task in the life sciences. To date, most published tools do not support clustering analysis on graphics processing units (GPUs) with the root mean square deviation (RMSD) metric. In this work, we wrote a dedicated program for this task. It supports multiple threads on multiple GPUs. To demonstrate its performance, we apply the program to a 33-residue fragment of a Pin1 WW domain mutant. The dataset contains 1,400,000 snapshots, which are extracted from an enhanced sampling simulation and are widely distributed in conformational space. Testing shows that our program is quite efficient. In particular, with two NVIDIA RTX4090 GPUs and the single-precision data type, the clustering calculation on 1 million snapshots is completed in a few seconds (including the time to upload data from memory to the GPU and excluding the time to read from hard disk). This is hundreds of times faster than on a central processing unit. Our program could be a powerful tool for fast extraction of representative states of a molecule from among its thousands to millions of candidate structures.
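For readers unfamiliar with RMSD-based structure clustering, a minimal CPU sketch is given below. It assumes the snapshots are already superimposed (no alignment step is performed) and uses a simple greedy leader scheme with a distance cutoff. The abstract does not describe the clustering algorithm or GPU kernels of the actual program, so the function names, the cutoff value, and the toy coordinates are purely illustrative.

```python
import math


def rmsd(a, b):
    """Root mean square deviation between two pre-aligned structures.

    a, b: equal-length lists of (x, y, z) atom coordinates.
    """
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def leader_cluster(snapshots, cutoff):
    """Greedy leader clustering (illustrative, not the paper's method).

    Each snapshot joins the first existing cluster whose leader lies
    within `cutoff` RMSD; otherwise it becomes a new leader.
    Returns (leader indices, cluster index of every snapshot).
    """
    leaders = []
    assignment = []
    for index, snapshot in enumerate(snapshots):
        for cluster, leader in enumerate(leaders):
            if rmsd(snapshot, snapshots[leader]) <= cutoff:
                assignment.append(cluster)
                break
        else:
            # No leader is close enough: start a new cluster.
            leaders.append(index)
            assignment.append(len(leaders) - 1)
    return leaders, assignment


# Three hypothetical two-atom snapshots: two similar, one distant.
toy_snapshots = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
]
toy_leaders, toy_assignment = leader_cluster(toy_snapshots, cutoff=0.5)
```

The pairwise RMSD evaluations dominate the cost and are independent of one another, which is the property that makes this kind of workload well suited to GPUs.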
Affiliation(s)
- Junyong Gao
  - Biomolecular Physics and Modeling Group, School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Mincong Wu
  - Biomolecular Physics and Modeling Group, School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jun Liao
  - Biomolecular Physics and Modeling Group, School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Fanjun Meng
  - Biomolecular Physics and Modeling Group, School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Changjun Chen
  - Biomolecular Physics and Modeling Group, School of Physics, Huazhong University of Science and Technology, Wuhan, China
10
Li T, Li M, Wu Y, Li Y. Visualization Methods for DNA Sequences: A Review and Prospects. Biomolecules 2024; 14:1447. [PMID: 39595624 PMCID: PMC11592258 DOI: 10.3390/biom14111447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2024] [Revised: 11/08/2024] [Accepted: 11/12/2024] [Indexed: 11/28/2024] Open
Abstract
The efficient analysis and interpretation of biological sequence data remain major challenges in bioinformatics. Graphical representation, as an emerging and effective visualization technique, offers a more intuitive method for analyzing DNA sequences. However, many visualization approaches are dispersed across research databases, requiring urgent organization, integration, and analysis. Additionally, no single visualization method excels in all aspects. To advance these methods, knowledge graphs and advanced machine learning techniques have become key areas of exploration. This paper reviews the current 2D and 3D DNA sequence visualization methods and proposes a new research direction focused on constructing knowledge graphs for biological sequence visualization, explaining the relevant theories, techniques, and models involved. Additionally, we summarize machine learning techniques applicable to sequence visualization, such as graph embedding methods and the use of convolutional neural networks (CNNs) for processing graphical representations. These machine learning techniques and knowledge graphs aim to provide valuable insights into computational biology, bioinformatics, genomic computing, and evolutionary analysis. The study serves as an important reference for improving intelligent search systems, enriching knowledge bases, and enhancing query systems related to biological sequence visualization, offering a comprehensive framework for future research.
Affiliation(s)
- Tan Li
  - School of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China (T.L., Y.L.)
- Mengshan Li
  - School of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China (T.L., Y.L.)
- Yan Wu
  - School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, China
- Yelin Li
  - School of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China (T.L., Y.L.)
11
Getz WM, Salter R, Sethi V, Cain S, Spiegel O, Toledo S. The statistical building blocks of animal movement simulations. Mov Ecol 2024; 12:67. [PMID: 39350248 PMCID: PMC11440923 DOI: 10.1186/s40462-024-00507-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 09/16/2024] [Indexed: 10/04/2024]
Abstract
Animal movement plays a key role in many ecological processes and has a direct influence on an individual's fitness at several scales of analysis (i.e., next-step, subdiel, day-by-day, seasonal). This highlights the need to dissect movement behavior at different spatio-temporal scales and develop hierarchical movement tools for generating realistic tracks to supplement existing single-temporal-scale simulators. In reality, animal movement paths are a concatenation of fundamental movement elements (FuMEs: e.g., a step or wing flap), but these are not generally extractable from a relocation time-series track (e.g., sequential GPS fixes) from which step-length (SL, aka velocity) and turning-angle (TA) time series can be extracted. For short, fixed-length segments of track, we generate their SL and TA statistics (e.g., means, standard deviations, correlations) to obtain segment-specific vectors that can be clustered into different types. We use the centroids of these clusters to obtain a set of statistical movement elements (StaMEs; e.g., directed fast movement versus random slow movement elements) that we use as a basis for analyzing and simulating movement tracks. Our novel concept is that sequences of StaMEs provide a basis for constructing and fitting step-selection kernels at the scale of fixed-length canonical activity modes (CAMs): short fixed-length sequences of interpretable activity such as dithering, ambling, directed walking, or running. Beyond this, variable-length pure or characteristic mixtures of CAMs can be interpreted as behavioral activity modes (BAMs), such as gathering resources (a sequence of dithering and walking StaMEs) or beelining (a sequence of fast directed-walk StaMEs interspersed with vigilance and navigation stops). Here we formulate a multi-modal, step-selection kernel simulation framework, and construct a 2-mode movement simulator (Numerus ANIMOVER_1), using Numerus RAMP technology.
These RAMPs run as stand-alone applications: they require no coding but only the input of selected parameter values. They can also be used in R programming environments as virtual R packages. We illustrate our methods for extracting StaMEs from both ANIMOVER_1 simulated data and empirical data from two barn owls (Tyto alba) in the Harod Valley, Israel. Overall, our new bottom-up approach to path segmentation allows us to both dissect real movement tracks and generate realistic synthetic ones, thereby providing a general tool for testing hypotheses in movement ecology and simulating animal movement in diverse contexts such as evaluating an individual's response to landscape changes, release of an individual into a novel environment, or identifying when individuals are sick or unusually stressed.
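The first stage of the pipeline described above (relocation fixes to SL and TA series to per-segment statistic vectors) can be sketched as follows. This is a simplified illustration that assumes planar coordinates sampled at regular intervals, and it computes only means and standard deviations. The function names and the choice of statistics are assumptions, not the authors' implementation.

```python
import math
from statistics import mean, pstdev


def steps_and_turns(track):
    """Step lengths and turning angles from a list of (x, y) fixes."""
    step_lengths = [math.dist(a, b) for a, b in zip(track, track[1:])]
    headings = [
        math.atan2(b[1] - a[1], b[0] - a[0]) for a, b in zip(track, track[1:])
    ]
    turning_angles = []
    for h0, h1 in zip(headings, headings[1:]):
        diff = h1 - h0
        # Wrap the heading change into the interval [-pi, pi].
        turning_angles.append(math.atan2(math.sin(diff), math.cos(diff)))
    return step_lengths, turning_angles


def segment_stats(step_lengths, turning_angles, seg_len):
    """One statistics vector per non-overlapping segment of seg_len steps.

    Each vector is (mean SL, sd SL, mean |TA|, sd |TA|); seg_len must be >= 2.
    Turns between consecutive segments are not assigned to either segment.
    """
    vectors = []
    for start in range(0, len(step_lengths) - seg_len + 1, seg_len):
        seg_sl = step_lengths[start:start + seg_len]
        # turning_angles[k] is the turn between step k and step k + 1.
        seg_ta = [abs(t) for t in turning_angles[start:start + seg_len - 1]]
        vectors.append((mean(seg_sl), pstdev(seg_sl), mean(seg_ta), pstdev(seg_ta)))
    return vectors


# Hypothetical track: two unit steps east, then two unit steps north.
toy_track = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
toy_sl, toy_ta = steps_and_turns(toy_track)
toy_vectors = segment_stats(toy_sl, toy_ta, seg_len=2)
```

Clustering vectors like `toy_vectors` (for example with k-means) and taking the cluster centroids would then give the StaMEs described in the abstract.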
Collapse
Affiliation(s)
- Wayne M Getz
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA, 94720, USA.
- School of Mathematics, Statistics & Computer Science, University of KwaZulu-Natal, Durban, South Africa.
- Numerus Inc., 850 Iron Point Road, Folsom, CA, 95630, USA.
| | - Richard Salter
- Numerus Inc., 850 Iron Point Road, Folsom, CA, 95630, USA.
- Department of Computer Science, Oberlin College, Oberlin, OH, 44074, USA.
| | - Varun Sethi
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA, 94720, USA
| | - Shlomo Cain
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, 69978, Tel Aviv, Israel
| | - Orr Spiegel
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, 69978, Tel Aviv, Israel
| | - Sivan Toledo
- Blavatnik School of Computer Science, Tel Aviv University, 69978, Tel Aviv, Israel
| |
|
12
|
Wu J, Wang L, Cui Y, Liu C, Ding W, Ren S, Dong R, Zhang J. Development of a Quality Evaluation Method for Allii Macrostemonis Bulbus Based on Solid-Phase Extraction-High-Performance Liquid Chromatography-Evaporative Light Scattering Detection Chromatographic Fingerprinting, Chemometrics, and Quantitative Analysis of Multi-Components via a Single-Marker Method. Molecules 2024; 29:4600. [PMID: 39407530 PMCID: PMC11478197 DOI: 10.3390/molecules29194600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 08/31/2024] [Accepted: 09/25/2024] [Indexed: 10/20/2024] Open
Abstract
As a traditional Chinese medicine (TCM), Allii Macrostemonis Bulbus (AMB) is a key herb for the treatment of thoracic paralytic cardiac pain, but its quality evaluation method has not yet been fully clarified. In this study, chromatographic fingerprints of AMB were developed using solid-phase extraction-high-performance liquid chromatography-evaporative light scattering detection (SPE-HPLC-ELSD) to evaluate the quality of AMB from various origins and processing methods. This was achieved by employing chemical pattern recognition techniques and verifying the feasibility and applicability of the quality evaluation of AMB through the quantitative analysis of multi-components via a single-marker (QAMS) method. Through the analysis of the fingerprints of 18 batches of AMB, 30 common peaks were screened, and 6 components (adenosine, syringin, macrostemonoside T, macrostemonoside A, macrostemonoside U, and macrostemonoside V) were identified. Moreover, three differential markers (macrostemonoside A, macrostemonoside T, and macrostemonoside U) were screened out using chemometrics techniques, including principal component analysis (PCA), hierarchical cluster analysis (HCA), and orthogonal partial least squares discriminant analysis (OPLS-DA). Subsequently, a QAMS method was established for macrostemonoside T and macrostemonoside U using macrostemonoside A as an internal reference. The results demonstrate the method's accuracy, reproducibility, and stability, rendering it suitable for the quality evaluation of AMB. This study provides a theoretical basis for drug quality control and the discovery of quality markers for AMB.
Affiliation(s)
- Jianfa Wu
- College of Chinese Medicinal Materials, Jilin Agricultural University, Changchun 130118, China; (J.W.); (Y.C.); (C.L.); (W.D.); (S.R.)
| | - Lulu Wang
- School of Medicine, Changchun Sci-Tech University, Changchun 130600, China;
| | - Ying Cui
- College of Chinese Medicinal Materials, Jilin Agricultural University, Changchun 130118, China; (J.W.); (Y.C.); (C.L.); (W.D.); (S.R.)
| | - Chang Liu
- College of Chinese Medicinal Materials, Jilin Agricultural University, Changchun 130118, China; (J.W.); (Y.C.); (C.L.); (W.D.); (S.R.)
| | - Weixing Ding
- College of Chinese Medicinal Materials, Jilin Agricultural University, Changchun 130118, China; (J.W.); (Y.C.); (C.L.); (W.D.); (S.R.)
| | - Shen Ren
- College of Chinese Medicinal Materials, Jilin Agricultural University, Changchun 130118, China; (J.W.); (Y.C.); (C.L.); (W.D.); (S.R.)
- Jilin Provincial International Joint Research Center for the Development and Utilization of Authentic Medicinal Materials, Changchun 130600, China
| | - Rui Dong
- College of Chinese Medicinal Materials, Jilin Agricultural University, Changchun 130118, China; (J.W.); (Y.C.); (C.L.); (W.D.); (S.R.)
- Jilin Provincial International Joint Research Center for the Development and Utilization of Authentic Medicinal Materials, Changchun 130600, China
| | - Jing Zhang
- College of Chinese Medicinal Materials, Jilin Agricultural University, Changchun 130118, China; (J.W.); (Y.C.); (C.L.); (W.D.); (S.R.)
- Jilin Provincial International Joint Research Center for the Development and Utilization of Authentic Medicinal Materials, Changchun 130600, China
| |
|
13
|
Kolk MZH, Frodi DM, Langford J, Andersen TO, Jacobsen PK, Risum N, Tan HL, Svendsen JH, Knops RE, Diederichsen SZ, Tjong FVY. Deep behavioural representation learning reveals risk profiles for malignant ventricular arrhythmias. NPJ Digit Med 2024; 7:250. [PMID: 39284923 PMCID: PMC11405885 DOI: 10.1038/s41746-024-01247-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 08/30/2024] [Indexed: 09/22/2024] Open
Abstract
We aimed to identify and characterise behavioural profiles in patients at high risk of sudden cardiac death (SCD), by using deep representation learning of day-to-day behavioural recordings. We present a pipeline that employed unsupervised clustering on low-dimensional representations of behavioural time-series data learned by a convolutional residual variational neural network (ResNet-VAE). Data from the prospective, observational SafeHeart study conducted at two large tertiary university centers in the Netherlands and Denmark were used. Patients received an implantable cardioverter-defibrillator (ICD) between May 2021 and September 2022 and wore wearable devices using accelerometer technology during 180 consecutive days. A total of 272 patients (mean age of 63.1 ± 10.2 years, 81% male) were eligible with a total sampling of 37,478 days of behavioural data (138 ± 47 days per patient). Deep representation learning identified five distinct behavioural profiles: Cluster A (n = 46) had very low physical activity levels and a disturbed sleep pattern. Cluster B (n = 70) had high activity levels, mainly at light-to-moderate intensity. Cluster C (n = 63) exhibited a high-intensity activity profile. Cluster D (n = 51) showed above-average sleep efficiency. Cluster E (n = 42) had frequent waking episodes and poor sleep. Annual risks of malignant ventricular arrhythmias ranged from 30.4% in Cluster A to 9.8% and 9.5% for Clusters D-E, respectively. Compared to low-risk profiles (D-E), Cluster A demonstrated a three-to-four fold increased risk of malignant ventricular arrhythmias adjusted for clinical covariates (adjusted HR 3.63, 95% CI 1.54-8.53, p < 0.001). These behavioural profiles may guide more personalised approaches to ventricular arrhythmia and SCD prevention.
Affiliation(s)
- Maarten Z H Kolk
- Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands
- Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam UMC location AMC Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
| | - Diana My Frodi
- Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
| | - Joss Langford
- Activinsights Ltd., Unit 11, Harvard Industrial Estate, Kimbolton, Huntingdon, PE28 0NJ, United Kingdom
- College of Life and Environmental Sciences, University of Exeter, Stocker Rd, Exeter, EX4 4PY, United Kingdom
| | - Tariq O Andersen
- Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100, Copenhagen, Denmark
| | - Peter Karl Jacobsen
- Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
| | - Niels Risum
- Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
| | - Hanno L Tan
- Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands
- Netherlands Heart Institute, Moreelsepark 1, 3511 EP, Utrecht, The Netherlands
| | - Jesper Hastrup Svendsen
- Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark
| | - Reinoud E Knops
- Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands
- Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam UMC location AMC Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
| | - Søren Zöga Diederichsen
- Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
| | - Fleur V Y Tjong
- Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands.
- Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam UMC location AMC Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands.
| |
|
14
|
Goggin SM, Zunder ER. ESCHR: a hyperparameter-randomized ensemble approach for robust clustering across diverse datasets. Genome Biol 2024; 25:242. [PMID: 39285487 PMCID: PMC11406744 DOI: 10.1186/s13059-024-03386-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/30/2024] [Indexed: 09/19/2024] Open
Abstract
Clustering is widely used for single-cell analysis, but current methods are limited in accuracy, robustness, ease of use, and interpretability. To address these limitations, we developed an ensemble clustering method that outperforms other methods at hard clustering without the need for hyperparameter tuning. It also performs soft clustering to characterize continuum-like regions and quantify clustering uncertainty, demonstrated here by mapping the connectivity and intermediate transitions between MNIST handwritten digits and between hypothalamic tanycyte subpopulations. This hyperparameter-randomized ensemble approach improves the accuracy, robustness, ease of use, and interpretability of single-cell clustering, and may prove useful in other fields as well.
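The general idea of combining many clusterings into a consensus, and of reading disagreement between them as uncertainty, can be sketched with a co-association matrix. This is a generic ensemble-clustering device, not the ESCHR algorithm itself, whose details are not given in the abstract; the function name and the toy partitions are assumptions.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of points shares a cluster.

    `partitions` is a list of label lists, one per base clustering run
    (for example, runs with randomly drawn hyperparameters).
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]

# Three hypothetical base clusterings of four points.
runs = [[0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0]]
consensus = co_association(runs)
```

Entries near 1 mark pairs that are grouped together regardless of the run, while intermediate values (here points 1 and 2, grouped in one run out of three) flag the continuum-like or uncertain regions that soft clustering is meant to expose.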
Affiliation(s)
- Sarah M Goggin
- Neuroscience Graduate Program, School of Medicine, University of Virginia, Charlottesville, VA, 22902, USA
| | - Eli R Zunder
- Neuroscience Graduate Program, School of Medicine, University of Virginia, Charlottesville, VA, 22902, USA.
- Department of Biomedical Engineering, School of Engineering, University of Virginia, Charlottesville, VA, 22902, USA.
| |
|
15
|
Gorla A, Witonsky J, Elhawary JR, Chen ZJ, Mefford J, Perez-Garcia J, Huntsman S, Hu D, Eng C, Woodruff PG, Sankararaman S, Ziv E, Flint J, Zaitlen N, Burchard E, Rahmani E. Epigenetic patient stratification via contrastive machine learning refines hallmark biomarkers in minoritized children with asthma. RESEARCH SQUARE 2024:rs.3.rs-5066762. [PMID: 39315258 PMCID: PMC11419268 DOI: 10.21203/rs.3.rs-5066762/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
Identifying and refining clinically significant patient stratification is a critical step toward realizing the promise of precision medicine in asthma. Several peripheral blood hallmarks, including total peripheral blood eosinophil count (BEC) and immunoglobulin E (IgE) levels, are routinely used in asthma clinical practice for endotype classification and predicting response to state-of-the-art targeted biologic drugs. However, these biomarkers appear ineffective in predicting treatment outcomes in some patients, and they differ in distribution between racially and ethnically diverse populations, potentially compromising medical care and hindering health equity due to biases in drug eligibility. Here, we propose constructing an unbiased patient stratification score based on DNA methylation (DNAm) and utilizing it to refine the efficacy of hallmark biomarkers for predicting drug response. We developed Phenotype Aware Component Analysis (PACA), a novel contrastive machine-learning method for learning combinations of DNAm sites reflecting biomedically meaningful patient stratifications. Leveraging whole-blood DNAm from Latino (discovery; n=1,016) and African American (replication; n=756) pediatric asthma case-control cohorts, we applied PACA to refine the prediction of bronchodilator response (BDR) to the short-acting β2-agonist albuterol, the most used drug to treat acute bronchospasm worldwide. While BEC and IgE correlate with BDR in the general patient population, our PACA-derived DNAm score renders these biomarkers predictive of drug response only in patients with high DNAm scores. BEC correlates with BDR in patients with upper-quartile DNAm scores (OR 1.12; 95% CI [1.04, 1.22]; P=7.9e-4) but not in patients with lower-quartile scores (OR 1.05; 95% CI [0.95, 1.17]; P=0.21); and IgE correlates with BDR in above-median (OR for response 1.42; 95% CI [1.24, 1.63]; P=3.9e-7) but not in below-median patients (OR 1.05; 95% CI [0.92, 1.2]; P=0.57).
These results hold within the commonly recognized type 2 (T2)-high asthma endotype but not in T2-low patients, suggesting that our DNAm score primarily represents an unknown variation of T2 asthma. Among T2-high patients with high DNAm scores, elevated BEC or IgE also corresponds to baseline clinical presentation that is known to benefit more from biologic treatment, including higher exacerbation scores, higher allergen sensitization, lower BMI, more recent oral corticosteroids prescription, and lower lung function. Our findings suggest that BEC and IgE, the traditional asthma biomarkers of T2-high asthma, are poor biomarkers for millions worldwide. Revisiting existing drug eligibility criteria relying on these biomarkers in asthma medical care may enhance precision and equity in treatment.
Affiliation(s)
- Aditya Gorla
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
| | - Jonathan Witonsky
- Division of Allergy, Immunology, and Bone Marrow Transplant, Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Jennifer R Elhawary
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Zeyuan Johnson Chen
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
| | - Joel Mefford
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
| | - Javier Perez-Garcia
- Genomics and Health Group, Department of Biochemistry, Microbiology, Cell Biology, and Genetics, University of La Laguna, La Laguna, Spain
| | - Scott Huntsman
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Donglei Hu
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Celeste Eng
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Prescott G Woodruff
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Sriram Sankararaman
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA
| | - Elad Ziv
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Jonathan Flint
- Department of Psychiatry and Behavioral Sciences, Brain Research Institute, University of California Los Angeles, Los Angeles, CA, USA
| | - Noah Zaitlen
- Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA
- Department of Neurology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Esteban Burchard
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Elior Rahmani
- Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| |
|
16
|
Luximon DC, Neylon J, Ritter T, Agazaryan N, Hegde JV, Steinberg ML, Low DA, Lamb JM. Results of an Artificial Intelligence-Based Image Review System to Detect Patient Misalignment Errors in a Multi-institutional Database of Cone Beam Computed Tomography-Guided Radiation Therapy. Int J Radiat Oncol Biol Phys 2024; 120:243-252. [PMID: 38485098 DOI: 10.1016/j.ijrobp.2024.02.065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 02/15/2024] [Accepted: 02/28/2024] [Indexed: 04/17/2024]
Abstract
PURPOSE Present knowledge of patient setup and alignment errors in image guided radiation therapy (IGRT) relies on voluntary reporting, which is thought to underestimate error frequencies. A manual retrospective patient-setup misalignment error search is infeasible owing to the bulk of cases to be reviewed. We applied a deep learning-based misalignment error detection algorithm (EDA) to perform a fully automated retrospective error search of clinical IGRT databases and determine an absolute gross patient misalignment error rate. METHODS AND MATERIALS The EDA was developed to analyze the registration between planning scans and pretreatment cone beam computed tomography scans, outputting a misalignment score ranging from 0 (most unlikely) to 1 (most likely). The algorithm was trained using simulated translational errors on a data set obtained from 680 patients treated at 2 radiation therapy clinics between 2017 and 2022. A receiver operating characteristic analysis was performed to obtain target thresholds. DICOM Query and Retrieval software was integrated with the EDA to interact with the clinical database and fully automate data retrieval and analysis during a retrospective error search from 2016 to 2017 and from 2021 to 2022 for the 2 institutions, respectively. Registrations were flagged for human review using both a hard-thresholding method and a prediction trending analysis over each individual patient's treatment course. Flagged registrations were manually reviewed and categorized as errors (>1 cm misalignment at the target) or nonerrors. RESULTS A total of 17,612 registrations were analyzed by the EDA, resulting in 7.7% flagged events. Three previously reported errors were successfully flagged by the EDA, and 4 previously unreported vertebral body misalignment errors were discovered during case reviews. False positive cases often displayed substantial image artifacts, patient rotation, and soft tissue anatomy changes. 
CONCLUSIONS Our results validated the clinical utility of the EDA for bulk image reviews and highlighted the reliability and safety of IGRT, with an absolute gross patient misalignment error rate of 0.04% ± 0.02% per delivered fraction.
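As a plausibility check on the reported rate, the counts in the abstract can be combined directly. This is a sketch under two assumptions that the abstract does not confirm: that the 7 confirmed errors (3 previously reported plus 4 newly discovered) are the full numerator, and that the uncertainty is a simple Poisson counting error with the 17,612 analysed registrations as the denominator.

```python
import math

registrations = 17_612   # registrations analysed by the EDA (from the abstract)
errors = 3 + 4           # previously reported + newly discovered errors

rate = errors / registrations
# Assumption: Poisson counting uncertainty, sqrt(k) / N.
se = math.sqrt(errors) / registrations

print(f"{100 * rate:.2f}% +/- {100 * se:.2f}%")  # prints "0.04% +/- 0.02%"
```

Under these assumptions the arithmetic agrees with the stated 0.04% ± 0.02%, although the authors may have derived their interval differently.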
Affiliation(s)
- Dishane C Luximon
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California.
| | - Jack Neylon
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California
| | - Timothy Ritter
- Department of Medical Physics, Virginia Commonwealth University, Richmond, Virginia
| | - Nzhde Agazaryan
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California
| | - John V Hegde
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California
| | - Michael L Steinberg
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California
| | - Daniel A Low
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California
| | - James M Lamb
- Department of Radiation Oncology, David Geffen School of Medicine, University of California, Los Angeles, California
| |
|
17
|
Akgüller Ö, Balcı MA, Cioca G. Clustering Molecules at a Large Scale: Integrating Spectral Geometry with Deep Learning. Molecules 2024; 29:3902. [PMID: 39202980 PMCID: PMC11357287 DOI: 10.3390/molecules29163902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2024] [Revised: 08/14/2024] [Accepted: 08/14/2024] [Indexed: 09/03/2024] Open
Abstract
This study conducts an in-depth analysis of clustering small molecules using spectral geometry and deep learning techniques. We applied a spectral geometric approach to convert molecular structures into triangulated meshes and used the Laplace-Beltrami operator to derive significant geometric features. By examining the eigenvectors of these operators, we captured the intrinsic geometric properties of the molecules, aiding their classification and clustering. The research utilized four deep learning methods: Deep Belief Network, Convolutional Autoencoder, Variational Autoencoder, and Adversarial Autoencoder, each paired with k-means clustering at different cluster sizes. Clustering quality was evaluated using the Calinski-Harabasz and Davies-Bouldin indices, Silhouette Score, and standard deviation. Nonparametric tests were used to assess the impact of topological descriptors on clustering outcomes. Our results show that the DBN + k-means combination is the most effective, particularly at lower cluster counts, demonstrating significant sensitivity to structural variations. This study highlights the potential of integrating spectral geometry with deep learning for precise and efficient molecular clustering.
Affiliation(s)
- Ömer Akgüller
- Faculty of Science, Department of Mathematics, Mugla Sitki Kocman University, Muğla 48000, Turkey;
| | - Mehmet Ali Balcı
- Faculty of Science, Department of Mathematics, Mugla Sitki Kocman University, Muğla 48000, Turkey;
| | - Gabriela Cioca
- Faculty of Medicine, Preclinical Department, Lucian Blaga University of Sibiu, 550024 Sibiu, Romania;
| |
|
18
|
Wang C, Gao X, Li Y, Li C, Ma Z, Sun D, Liang X, Zhang X. A molecular subtyping associated with the cGAS-STING pathway provides novel perspectives on the treatment of ulcerative colitis. Sci Rep 2024; 14:12683. [PMID: 38831059 PMCID: PMC11148070 DOI: 10.1038/s41598-024-63695-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 05/31/2024] [Indexed: 06/05/2024] Open
Abstract
Ulcerative colitis (UC) is characterized by an abnormal immune response, and its pathogenesis is not clearly understood. The cGAS-STING pathway is an innate immune signaling pathway that plays a significant role in various pathophysiological processes. However, the role of the cGAS-STING pathway in UC remains largely unclear. In this study, we obtained transcriptome sequencing data from multiple publicly available databases. cGAS-STING related genes were obtained through literature search, and differentially expressed genes (DEGs) were analyzed using the R package limma. Hub genes were identified through protein-protein interaction (PPI) network analysis and module construction. The ConsensusClusterPlus package was utilized to identify molecular subtypes based on hub genes. The therapeutic response, immune microenvironment, and biological pathways of subtypes were further investigated. A total of 18 DEGs were found in UC patients. We further identified IFI16, MB21D1 (CGAS), TMEM173 (STING) and TBK1 as the hub genes. These genes are highly expressed in UC. IFI16 exhibited the highest diagnostic value and predictive value for response to anti-TNF therapy. The expression level of IFI16 was higher in non-responders to anti-TNF therapy. Furthermore, a cluster analysis based on genes related to the cGAS-STING pathway revealed that patients with higher gene expression exhibited elevated immune burden and inflammation levels. This study is a pioneering analysis of cGAS-STING pathway-related genes in UC. These findings provide new insights for the diagnosis of UC and the prediction of therapeutic response.
Affiliation(s)
- Chen Wang
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
| | - Xin Gao
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
| | - Yanchen Li
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
| | - Chenyang Li
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
| | - Zhimin Ma
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
- Department of Respirology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
| | - Donglei Sun
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
| | - Xiaonan Liang
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
| | - Xiaolan Zhang
- Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China.
| |
|
19
|
Zamanian H, Shalbaf A. Estimation of non-alcoholic steatohepatitis (NASH) disease using clinical information based on the optimal combination of intelligent algorithms for feature selection and classification. Comput Methods Biomech Biomed Engin 2024; 27:964-979. [PMID: 37254745 DOI: 10.1080/10255842.2023.2217978] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/24/2022] [Accepted: 05/12/2023] [Indexed: 06/01/2023]
Abstract
The early diagnosis of NASH disease can decrease the risk of proceeding elements and treatment costs for patients. This study aims to present an optimal combination of intelligent algorithms using advanced machine learning methods, including different feature selections and classifications based on clinical data and blood factors. In this work, data were collected from 176 patients to investigate NASH disease, and 19 features were extracted. We then sought to find the best combination of features based on different feature selection algorithms such as Feature Forward Selection (FFS), Minimum Redundancy Maximum Relevance (MRMR), and Mutual Information (MI). Finally, we used nine classifier frameworks with different mathematical mechanisms, including random forest (RF), logistic regression (LR), Linear Discriminant Analysis (LDA), AdaBoost, K nearest neighbors (KNN), multilayer perceptron model (MLP), support vector machine (SVM), and decision tree (DT) to estimate NASH disease. Our investigation revealed that the combination of dominant features, namely body mass index (BMI), glutamic pyruvic transaminase (GPT), total cholesterol (TC), high-density lipoprotein (HDL), Ezetimibe, lipoprotein level Lp(a), Loge(Lp(a)), total triglyceride (TG), Creatinine (Cre), HbA1c, Fibrate, and Sex, selected by the MRMR algorithm and classified by the RF method can provide the most appropriate performance based on less computation effort and maximum performance with accuracy, AUC, precision, and recall indices, which are 81.51 ± 9.35, 82.53 ± 11.24, 85.28 ± 9.68, and 89.49 ± 7.92, respectively. This study investigated the configuration of feature selection and classifier that is most suitable for classifying NASH disease based on clinical data and blood factors. The proposed intelligent algorithm based on MRMR and RF classifier can automatically diagnose NASH disease with appropriate performance and present an initial report without any further invasive methods.
It also clarifies the diagnostic process and, as a result, the continuation of their prevention and treatment cycle.
Affiliation(s)
- Hamed Zamanian
- Department of Biomedical Engineering and Medical Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ahmad Shalbaf
- Department of Biomedical Engineering and Medical Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| |
|
20
|
Trottet C, Allam A, Horvath AN, Finckh A, Hügle T, Adler S, Kyburz D, Micheroli R, Krauthammer M, Ospelt C. Explainable deep learning for disease activity prediction in chronic inflammatory joint diseases. PLOS DIGITAL HEALTH 2024; 3:e0000422. [PMID: 38935600 PMCID: PMC11210792 DOI: 10.1371/journal.pdig.0000422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 05/27/2024] [Indexed: 06/29/2024]
Abstract
Analysing complex diseases such as chronic inflammatory joint diseases (CIJDs), where many factors influence the disease evolution over time, is a challenging task. CIJDs are rheumatic diseases that cause the immune system to attack healthy organs, mainly the joints. Different environmental, genetic and demographic factors affect disease development and progression. The Swiss Clinical Quality Management in Rheumatic Diseases (SCQM) Foundation maintains a national database of CIJDs documenting the disease management over time for 19'267 patients. We propose the Disease Activity Score Network (DAS-Net), an explainable multi-task learning model trained on patients' data with different arthritis subtypes, transforming longitudinal patient journeys into comparable representations and predicting multiple disease activity scores. First, we built a modular model composed of feed-forward neural networks, long short-term memory networks and attention layers to process the heterogeneous patient histories and predict future disease activity. Second, we investigated the utility of the model's computed patient representations (latent embeddings) to identify patients with similar disease progression. Third, we enhanced the explainability of our model by analysing the impact of different patient characteristics on disease progression and contrasted our model outcomes with medical expert knowledge. To this end, we explored multiple feature attribution methods including SHAP, attention attribution and feature weighting using case-based similarity. Our model outperforms temporal and non-temporal neural network, tree-based, and naive static baselines in predicting future disease activity scores. To identify similar patients, a k-nearest neighbours regression algorithm applied to the model's computed latent representations outperforms baseline strategies that use raw input features representation.
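The similar-patient step mentioned at the end of this abstract, k-nearest-neighbours regression over latent representations, can be sketched in a few lines. The embeddings, scores, Euclidean distance, and function name below are illustrative assumptions; the abstract does not specify DAS-Net's embedding dimensions or distance metric.

```python
import math

def knn_predict(embeddings, scores, query, k=3):
    """Average the scores of the k embeddings closest to `query` (Euclidean)."""
    ranked = sorted(range(len(embeddings)),
                    key=lambda i: math.dist(embeddings[i], query))
    nearest = ranked[:k]
    return sum(scores[i] for i in nearest) / len(nearest)

# Hypothetical 2-D patient embeddings with known disease activity scores.
embeddings = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
scores = [2.0, 3.0, 4.0, 10.0]

prediction = knn_predict(embeddings, scores, query=(0.1, 0.1), k=3)
```

The three patients nearest the query supply the prediction, and the distant fourth patient is ignored. The same routine applied to raw input features instead of learned embeddings would correspond to the kind of baseline the abstract compares against.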
Affiliation(s)
- Cécile Trottet
  - Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
- Ahmed Allam
  - Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
- Aron N. Horvath
  - Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
- Axel Finckh
  - Division of Rheumatology, Department of Medicine, Faculty of Medicine, Geneva University Hospitals, Geneva, Switzerland
- Thomas Hügle
  - Department of Rheumatology, Lausanne University Hospital, Lausanne, Switzerland
- Sabine Adler
  - Department of Rheumatology and Immunology, Kantonsspital Aarau, Aarau, Switzerland
  - Department of Rheumatology and Immunology, Inselspital - University Hospital Bern, Bern, Switzerland
- Diego Kyburz
  - Department of Rheumatology, University Hospital Basel, Basel, Switzerland
- Raphael Micheroli
  - Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
- Michael Krauthammer
  - Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
  - Biomedical Informatics DFL, University Hospital Zurich, University of Zurich, Zurich, Switzerland
- Caroline Ospelt
  - Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
|
21
|
Xiong Y, Chen C, He C, Yang X, Cheng W. Identification of shared gene signatures and biological mechanisms between preeclampsia and polycystic ovary syndrome. Heliyon 2024; 10:e29225. [PMID: 38638956 PMCID: PMC11024567 DOI: 10.1016/j.heliyon.2024.e29225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 03/24/2024] [Accepted: 04/03/2024] [Indexed: 04/20/2024] Open
Abstract
Preeclampsia (PE) is one of the most common complications of pregnancy and polycystic ovary syndrome (PCOS) is a prevalent metabolic and endocrinopathy disorder in women of reproductive age. Identifying the shared genetic signatures and molecular mechanisms between PCOS and PE was the objective of this study. The intersections of WGCNA module genes, PPI module genes, and PPI hub genes revealed that 8 immunity-related genes might be shared causative genes of PE and PCOS. Further, qRT-PCR results showed that TSIX/miR-223-3p/DDX58 might play a crucial role in immune dysregulation in PE and PCOS and Spearman rank correlation analysis results illustrated the potential of DDX58 as a novel diagnostic and therapeutic target for PE and PCOS. Our study demonstrated a common disease pathway model TSIX/miR-223-3p/DDX58, illustrating that immune dysregulation may be a possible mechanism of PE and PCOS, and revealed that DDX58 might be a novel predictive target for PE and PCOS.
Affiliation(s)
- Yaoxi Xiong
  - International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
  - Shanghai Key Laboratory of Embryo Original Disease, 200030, Shanghai, China
- Chao Chen
  - International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
  - Shanghai Key Laboratory of Embryo Original Disease, 200030, Shanghai, China
- Chengrong He
  - International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
  - Shanghai Key Laboratory of Embryo Original Disease, 200030, Shanghai, China
- Xingyu Yang
  - International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
  - Shanghai Key Laboratory of Embryo Original Disease, 200030, Shanghai, China
- Weiwei Cheng
  - International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
|
22
|
Barakat A, Munro G, Heegaard AM. Finding new analgesics: Computational pharmacology faces drug discovery challenges. Biochem Pharmacol 2024; 222:116091. [PMID: 38412924 DOI: 10.1016/j.bcp.2024.116091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 01/10/2024] [Accepted: 02/23/2024] [Indexed: 02/29/2024]
Abstract
Despite the worldwide prevalence and huge burden of pain, pain is an undertreated phenomenon. Currently used analgesics have several limitations regarding their efficacy and safety. The discovery of analgesics possessing a novel mechanism of action has faced multiple challenges, including a limited understanding of biological processes underpinning pain and analgesia and poor animal-to-human translation. Computational pharmacology is currently employed to face these challenges. In this review, we discuss the theory, methods, and applications of computational pharmacology in pain research. Computational pharmacology encompasses a wide variety of theoretical concepts and practical methodological approaches, with the overall aim of gaining biological insight through data acquisition and analysis. Data are acquired from patients or animal models with pain or analgesic treatment, at different levels of biological organization (molecular, cellular, physiological, and behavioral). Distinct methodological algorithms can then be used to analyze and integrate data. This helps to facilitate the identification of biological molecules and processes associated with pain phenotype, build quantitative models of pain signaling, and extract translatable features between humans and animals. However, computational pharmacology has several limitations, and its predictions can provide false positive and negative findings. Therefore, computational predictions are required to be validated experimentally before drawing solid conclusions. In this review, we discuss several case study examples of combining and integrating computational tools with experimental pain research tools to meet drug discovery challenges.
Affiliation(s)
- Ahmed Barakat
  - Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  - Department of Pharmacology and Toxicology, Faculty of Pharmacy, Assiut University, Assiut, Egypt
- Anne-Marie Heegaard
  - Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
|
23
|
Bao LX, Luo ZM, Zhu XL, Xu YY. Automated identification of protein expression intensity and classification of protein cellular locations in mouse brain regions from immunofluorescence images. Med Biol Eng Comput 2024; 62:1105-1119. [PMID: 38150111 DOI: 10.1007/s11517-023-02985-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 11/28/2023] [Indexed: 12/28/2023]
Abstract
Knowledge of protein expression in mammalian brains at regional and cellular levels can facilitate understanding of protein functions and associated diseases. As the mouse brain is a typical mammalian brain considering cell type and structure, several studies have been conducted to analyze protein expression in mouse brains. However, labeling protein expression using biotechnology is costly and time-consuming. Therefore, automated models that can accurately recognize protein expression are needed. Here, we constructed machine learning models to automatically annotate the protein expression intensity and cellular location in different mouse brain regions from immunofluorescence images. The brain regions and sub-regions were segmented through learning image features using an autoencoder and then performing K-means clustering and registration to align with the anatomical references. The protein expression intensities for those segmented structures were computed on the basis of the statistics of the image pixels, and patch-based weakly supervised methods and multi-instance learning were used to classify the cellular locations. Results demonstrated that the models achieved high accuracy in the expression intensity estimation, and the F1 score of the cellular location prediction was 74.5%. This work established an automated pipeline for analyzing mouse brain images and provided a foundation for further study of protein expression and functions.
Affiliation(s)
- Lin-Xia Bao
  - School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Provincial Key Laboratory of Medical Imaging Processing, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510623, China
- Zhuo-Ming Luo
  - School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Provincial Key Laboratory of Medical Imaging Processing, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510623, China
- Xi-Liang Zhu
  - School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Provincial Key Laboratory of Medical Imaging Processing, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510623, China
- Ying-Ying Xu
  - School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Provincial Key Laboratory of Medical Imaging Processing, Southern Medical University, Guangzhou, 510515, China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510623, China
|
24
|
Wang T, Li Z, Zhao S, Liu Y, Guo W, Alarcòn Rodrìguez R, Wu Y, Wei R. Characterizing hedgehog pathway features in senescence associated osteoarthritis through Integrative multi-omics and machine learning analysis. Front Genet 2024; 15:1255455. [PMID: 38444758 PMCID: PMC10912584 DOI: 10.3389/fgene.2024.1255455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 02/06/2024] [Indexed: 03/07/2024] Open
Abstract
Purpose: Osteoarthritis (OA) is a disease of senescence and inflammation. Hedgehog's role in OA mechanisms is unclear. This study combines Bulk RNA-seq and scRNA-seq to identify Hedgehog-associated genes in OA, investigating their impact on the pathogenesis of OA. Materials and methods: Download and merge eight bulk-RNA seq datasets from GEO, also obtain a scRNA-seq dataset for validation and analysis. Analyze Hedgehog pathway activity in OA using bulk-RNA seq datasets. Use ten machine learning algorithms to identify important Hedgehog-associated genes, validate predictive models. Perform GSEA to investigate functional implications of identified Hedgehog-associated genes. Assess immune infiltration in OA using Cibersort and MCP-counter algorithms. Utilize ConsensusClusterPlus package to identify Hedgehog-related subgroups. Conduct WGCNA to identify key modules enriched based on Hedgehog-related subgroups. Characterization of genes by methylation and GWAS analysis. Evaluate Hedgehog pathway activity, expression of hub genes, pseudotime, and cell communication, in OA chondrocytes using scRNA-seq dataset. Validate Hedgehog-associated gene expression levels through Real-time PCR analysis. Results: The activity of the Hedgehog pathway is significantly enhanced in OA. Additionally, nine important Hedgehog-associated genes have been identified, and the predictive models built using these genes demonstrate strong predictive capabilities. GSEA analysis indicates a significant positive correlation between all seven important Hedgehog-associated genes and lysosomes. Consensus clustering reveals the presence of two hedgehog-related subgroups. In Cluster 1, Hedgehog pathway activity is significantly upregulated and associated with inflammatory pathways. WGCNA identifies that genes in the blue module are most significantly correlated with Cluster 1 and Cluster 2, as well as being involved in extracellular matrix and collagen-related pathways. 
Single-cell analysis confirms the significant upregulation of the Hedgehog pathway in OA, along with expression changes observed in 5 genes during putative temporal progression. Cell communication analysis suggests an association between low-scoring chondrocytes and macrophages. Conclusion: The Hedgehog pathway is significantly activated in OA and is associated with the extracellular matrix and collagen proteins. It plays a role in regulating immune cells and immune responses.
Affiliation(s)
- Tao Wang
  - Department of Orthopedic Joint, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
- Zhengrui Li
  - Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Shijian Zhao
  - Department of Cardiology, The Affiliated Cardiovascular Hospital of Kunming Medical University (Fuwai Yunnan Cardiovascular Hospital), Kunming, Yunnan, China
- Ying Liu
  - Department of Rehabilitation Medicine, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
- Wenliang Guo
  - Department of Rehabilitation Medicine, The Eighth Affiliated Hospital of Guangxi Medical University, Guigang, Guangxi, China
- Yinteng Wu
  - Department of Orthopedic and Trauma Surgery, the First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
- Ruqiong Wei
  - Department of Rehabilitation Medicine, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
|
25
|
Dube F, Delhomme N, Martin F, Hinas A, Åbrink M, Svärd S, Tydén E. Gene co-expression network analysis reveal core responsive genes in Parascaris univalens tissues following ivermectin exposure. PLoS One 2024; 19:e0298039. [PMID: 38359071 PMCID: PMC10868809 DOI: 10.1371/journal.pone.0298039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 01/17/2024] [Indexed: 02/17/2024] Open
Abstract
Anthelmintic resistance in the equine parasite Parascaris univalens compromises ivermectin (IVM) effectiveness and necessitates an in-depth understanding of its resistance mechanisms. Most research, primarily focused on holistic gene expression analyses, may overlook vital tissue-specific responses and often limit the scope of novel genes. This study leveraged gene co-expression network analysis to elucidate tissue-specific transcriptional responses and to identify core genes implicated in the IVM response in P. univalens. Adult worms (n = 28) were exposed to 10⁻¹¹ M and 10⁻⁹ M IVM in vitro for 24 hours. RNA-sequencing examined transcriptional changes in the anterior end and intestine. Differential expression analysis revealed pronounced tissue differences, with the intestine exhibiting substantially more IVM-induced transcriptional activity. Gene co-expression network analysis identified seven modules significantly associated with the response to IVM. Within these, 219 core genes were detected, largely expressed in the intestinal tissue and spanning diverse biological processes with unspecific patterns. After 10⁻¹¹ M IVM, intestinal tissue core genes showed transcriptional suppression, cell cycle inhibition, and ribosomal alterations. Interestingly, genes PgR028_g047 (sorb-1), PgB01_g200 (gmap-1) and PgR046_g017 (col-37 & col-102) switched from downregulation at 10⁻¹¹ M to upregulation at 10⁻⁹ M IVM. The 10⁻⁹ M concentration induced expression of cuticle and membrane integrity core genes in the intestinal tissue. No clear core gene patterns were visible in the anterior end after 10⁻¹¹ M IVM. However, after 10⁻⁹ M IVM, the anterior end mostly displayed downregulation, indicating disrupted transcriptional regulation. One interesting finding was the non-modular calcium-signaling gene, PgR047_g066 (gegf-1), which uniquely connected 71 genes across four modules.
These genes were enriched for transmembrane signaling activity, suggesting that PgR047_g066 (gegf-1) could have a key signaling role. By unveiling tissue-specific expression patterns and highlighting biological processes through unbiased core gene detection, this study reveals intricate IVM responses in P. univalens. These findings suggest alternative drug uptake of IVM and can guide functional validations to further IVM resistance mechanism understanding.
Affiliation(s)
- Faruk Dube
  - Department of Animal Biosciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Nicolas Delhomme
  - Umeå Plant Science Centre (UPSC), Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, Umeå, Sweden
- Frida Martin
  - Department of Animal Biosciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Andrea Hinas
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Magnus Åbrink
  - Department of Animal Biosciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Staffan Svärd
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Eva Tydén
  - Department of Animal Biosciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
|
26
|
Shafique A, Gonzalez R, Pantanowitz L, Tan PH, Machado A, Cree IA, Tizhoosh HR. A Preliminary Investigation into Search and Matching for Tumor Discrimination in World Health Organization Breast Taxonomy Using Deep Networks. Mod Pathol 2024; 37:100381. [PMID: 37939901 PMCID: PMC10891482 DOI: 10.1016/j.modpat.2023.100381] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/23/2023] [Revised: 10/26/2023] [Accepted: 10/31/2023] [Indexed: 11/10/2023]
Abstract
Breast cancer is one of the most common cancers affecting women worldwide. It includes a group of malignant neoplasms with a variety of biological, clinical, and histopathologic characteristics. There are more than 35 different histologic forms of breast lesions that can be classified and diagnosed histologically according to cell morphology, growth, and architecture patterns. Recently, deep learning, in the field of artificial intelligence, has drawn a lot of attention for the computerized representation of medical images. Searchable digital atlases can provide pathologists with patch-matching tools, allowing them to search among evidently diagnosed and treated archival cases, a technology that may be regarded as computational second opinion. In this study, we indexed and analyzed the World Health Organization breast taxonomy (Classification of Tumors fifth ed.) spanning 35 tumor types. We visualized all tumor types using deep features extracted from a state-of-the-art deep-learning model, pretrained on millions of diagnostic histopathology images from the Cancer Genome Atlas repository. Furthermore, we tested the concept of a digital "atlas" as a reference for search and matching with rare test cases. The patch similarity search within the World Health Organization breast taxonomy data reached >88% accuracy when validating through "majority vote" and >91% accuracy when validating using top n tumor types. These results show for the first time that complex relationships among common and rare breast lesions can be investigated using an indexed digital archive.
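The "majority vote" validation of patch similarity search described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `majority_vote_search`, the Euclidean distance on feature vectors, the value of `k`, and the toy atlas are hypothetical and do not come from the paper.

```python
from collections import Counter

import numpy as np

def majority_vote_search(atlas_features, atlas_labels, query, k=3):
    """Return the most frequent label among the k atlas patches whose
    feature vectors are closest to the query (Euclidean distance).
    Hypothetical helper, not the authors' search engine."""
    atlas = np.asarray(atlas_features, dtype=float)
    dists = np.linalg.norm(atlas - np.asarray(query, dtype=float), axis=1)
    # Indices of the k most similar indexed patches.
    nearest = np.argsort(dists)[:k]
    votes = Counter(atlas_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy atlas: five indexed patches with made-up 2-D features and tumor labels.
toy_features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11]]
toy_labels = ["type_a", "type_a", "type_a", "type_b", "type_b"]
near_a = majority_vote_search(toy_features, toy_labels, [0.2, 0.2], k=3)
near_b = majority_vote_search(toy_features, toy_labels, [10.0, 10.4], k=3)
```

A "top n tumor types" check, as also mentioned in the abstract, would instead ask whether the true label appears anywhere among the n most frequent labels returned by `votes.most_common(n)`.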
Affiliation(s)
- Abubakr Shafique
  - Rhazes Lab, Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota
  - Kimia Lab, University of Waterloo, Waterloo, Ontario, Canada
- Ricardo Gonzalez
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
- Liron Pantanowitz
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Puay Hoon Tan
  - Women's Imaging Centre, Luma Medical Centre, Singapore
- Alberto Machado
  - WHO Classification of Tumours Group, International Agency for Research on Cancer, Lyon, France
- Ian A Cree
  - WHO Classification of Tumours Group, International Agency for Research on Cancer, Lyon, France
- Hamid R Tizhoosh
  - Rhazes Lab, Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota
  - Kimia Lab, University of Waterloo, Waterloo, Ontario, Canada
|
27
|
Khine AH, Wettayaprasit W, Duangsuwan J. A new word embedding model integrated with medical knowledge for deep learning-based sentiment classification. Artif Intell Med 2024; 148:102758. [PMID: 38325934 DOI: 10.1016/j.artmed.2023.102758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 11/19/2023] [Accepted: 12/29/2023] [Indexed: 02/09/2024]
Abstract
The development of intelligent systems that use social media data for decision-making processes in numerous domains such as politics, business, marketing, and finance, has been made possible by the popularity of social media platforms. However, the utilization of textual data from social media in the healthcare management industry is still somewhat limited when it is compared to other industries. Investigating how current machine learning and natural language processing technologies can be used in the healthcare industry to gauge public sentiment is an important study. Earlier works on healthcare sentiment analysis have utilized traditional word embedding models trained on the general and medical corpus. However, integration of medical knowledge to pre-trained word embedding models has not been considered yet. Word embedding models trained on the general corpus led to the problem of lacking medical knowledge and the models trained on the small size of the medical corpus have limitations in capturing semantic and syntactic properties. This research proposes a new word embedding model named Word Embedding Integrated with Medical Knowledge Vector (WE-iMKVec). The proposed model integrates sentiment lexicons and medical knowledgebases into the pre-trained word embedding to enrich the properties of word embedding. A new medical-aware sentiment polarity score is proposed for the utilization in learning neural-network sentiment and these vectors incorporate with the original pre-trained word vectors. The resulting vectors are enriched with lexicon vectors and the medical knowledge vectors: Adverse Drug Reaction (ADR) vector and Unified Medical Language System (UMLS) vector are used to build the proposed WE-iMKVec model. WE-iMKVec is validated on the five different social media healthcare review datasets and the empirical results showed its superiority over traditional word embedding models in medical sentiment analysis. 
The highest improvement can be found in the patients.info medical condition dataset where the proposed model outperforms three conventional word2vec models (Google-News, PubMed-PMC, and Drug Reviews) by 12.7 %, 31.4 %, and 25.4 % respectively in terms of F1 score.
Affiliation(s)
- Aye Hninn Khine
  - Artificial Intelligence Research Lab, Division of Computational Science, Faculty of Science, Prince of Songkla University, Hat Yai, Thailand
- Wiphada Wettayaprasit
  - Artificial Intelligence Research Lab, Division of Computational Science, Faculty of Science, Prince of Songkla University, Hat Yai, Thailand
- Jarunee Duangsuwan
  - Artificial Intelligence Research Lab, Division of Computational Science, Faculty of Science, Prince of Songkla University, Hat Yai, Thailand
|
28
|
Xiang J, Sun Y, Wu X, Guo Y, Xue J, Niu Y, Cui X. Abnormal Spatial and Temporal Overlap of Time-Varying Brain Functional Networks in Patients with Schizophrenia. Brain Sci 2023; 14:40. [PMID: 38248255 PMCID: PMC10813230 DOI: 10.3390/brainsci14010040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 12/25/2023] [Accepted: 12/27/2023] [Indexed: 01/23/2024] Open
Abstract
Schizophrenia (SZ) is a complex psychiatric disorder with unclear etiology and pathological features. Neuroscientists are increasingly proposing that schizophrenia is an abnormality in the dynamic organization of brain networks. Previous studies have found that the dynamic brain networks of people with SZ are abnormal in both space and time. However, little is known about the interactions and overlaps between hubs of the brain underlying spatiotemporal dynamics. In this study, we aimed to investigate different patterns of spatial and temporal overlap of hubs between SZ patients and healthy individuals. Specifically, we obtained resting-state functional magnetic resonance imaging data from the public dataset for 43 SZ patients and 49 healthy individuals. We derived a representation of time-varying functional connectivity using the Jackknife Correlation (JC) method. We employed the Betweenness Centrality (BC) method to identify the hubs of the brain's functional connectivity network. We then applied measures of temporal overlap, spatial overlap, and hierarchical clustering to investigate differences in the organization of brain hubs between SZ patients and healthy controls. Our findings suggest significant differences between SZ patients and healthy controls at the whole-brain and subnetwork levels. Furthermore, spatial overlap and hierarchical clustering analysis showed that quasi-periodic patterns were disrupted in SZ patients. Analyses of temporal overlap revealed abnormal pairwise engagement preferences in the hubs of SZ patients. These results provide new insights into the dynamic characteristics of the network organization of the SZ brain.
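The Jackknife Correlation (JC) step named in the abstract can be illustrated with a short sketch. It assumes the common leave-one-out formulation in which the estimate for time point t is the sign-inverted Pearson correlation of the two signals with sample t removed; whether the paper uses exactly this convention is an assumption, and the function name and toy signals are hypothetical.

```python
import numpy as np

def jackknife_correlation(x, y):
    """Time-resolved connectivity between two regional time series.

    For each time point t, compute the Pearson correlation of x and y
    with sample t left out, then invert the sign (assumed convention).
    Returns an array with one value per time point.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    jc = np.empty(n)
    for t in range(n):
        keep = np.arange(n) != t  # boolean mask that drops sample t
        jc[t] = -np.corrcoef(x[keep], y[keep])[0, 1]
    return jc

# Toy signals: y is an exact linear function of x, so every
# leave-one-out correlation equals 1 and every JC value equals -1.
toy_x = np.arange(6.0)
toy_y = 2.0 * toy_x + 1.0
toy_jc = jackknife_correlation(toy_x, toy_y)
```

Repeating this for every pair of regions gives one connectivity matrix per time point. Hub detection with betweenness centrality, as described in the abstract, would then run on a graph built from each matrix; that step is not shown here.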
Affiliation(s)
- Jie Xiang
  - College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
- Yumeng Sun
  - College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
- Xubin Wu
  - College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
- Yuxiang Guo
  - School of Software, Taiyuan University of Technology, Taiyuan 030024, China
- Jiayue Xue
  - College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
- Yan Niu
  - College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
- Xiaohong Cui
  - College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
|
29
|
Chen QS, Bergman O, Ziegler L, Baldassarre D, Veglia F, Tremoli E, Strawbridge RJ, Gallo A, Pirro M, Smit AJ, Kurl S, Savonen K, Lind L, Eriksson P, Gigante B. A machine learning based approach to identify carotid subclinical atherosclerosis endotypes. Cardiovasc Res 2023; 119:2594-2606. [PMID: 37475157 PMCID: PMC10730242 DOI: 10.1093/cvr/cvad106] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/07/2022] [Revised: 03/12/2023] [Accepted: 05/05/2023] [Indexed: 07/22/2023] Open
Abstract
AIMS To define endotypes of carotid subclinical atherosclerosis. METHODS AND RESULTS We integrated demographic, clinical, and molecular data (n = 124) with ultrasonographic carotid measurements from study participants in the IMPROVE cohort (n = 3340). We applied a neural network algorithm and hierarchical clustering to identify carotid atherosclerosis endotypes. A measure of carotid subclinical atherosclerosis, the c-IMTmean-max, was used to extract atherosclerosis-related features and SHapley Additive exPlanations (SHAP) to reveal endotypes. The association of endotypes with carotid ultrasonographic measurements at baseline, after 30 months, and with the 3-year atherosclerotic cardiovascular disease (ASCVD) risk was estimated by linear (β, SE) and Cox [hazard ratio (HR), 95% confidence interval (CI)] regression models. Crude estimates were adjusted by common cardiovascular risk factors, and baseline ultrasonographic measures. Improvement in ASCVD risk prediction was evaluated by C-statistic and by net reclassification improvement with reference to SCORE2, c-IMTmean-max, and presence of carotid plaques. An ensemble stacking model was used to predict endotypes in an independent validation cohort, the PIVUS (n = 1061). We identified four endotypes able to differentiate carotid atherosclerosis risk profiles from mild (endotype 1) to severe (endotype 4). SHAP identified endotype-shared variables (age, biological sex, and systolic blood pressure) and endotype-specific biomarkers. In the IMPROVE, as compared to endotype 1, endotype 4 associated with the thickest c-IMT at baseline (β, SE) 0.36 (0.014), the highest number of plaques 1.65 (0.075), the fastest c-IMT progression 0.06 (0.013), and the highest ASCVD risk (HR, 95% CI) (1.95, 1.18-3.23). Baseline and progression measures of carotid subclinical atherosclerosis and ASCVD risk were associated with the predicted endotypes in the PIVUS. 
Endotypes consistently improved measures of ASCVD risk discrimination and reclassification in both study populations. CONCLUSIONS We report four replicable subclinical carotid atherosclerosis-endotypes associated with progression of atherosclerosis and ASCVD risk in two independent populations. Our approach based on endotypes can be applied for precision medicine in ASCVD prevention.
Affiliation(s)
- Qiao Sen Chen
- Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
| | - Otto Bergman
- Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
| | - Louise Ziegler
- Division of Medicine and Department of Clinical Sciences, Danderyd Hospital, Karolinska Institutet, Entrevägen 2, 182 88 Stockholm, Sweden
| | - Damiano Baldassarre
- Department of Medical Biotechnology and Translational Medicine, Università di Milano, Via Vanvitelli 32, 20133 Milan, Italy
- Centro Cardiologico Monzino, IRCCS, Via Carlo Parea 4, 20138 Milan, Italy
| | - Fabrizio Veglia
- Maria Cecilia Hospital, GVM Care & Research, Via Corriera 1, 48033 Cotignola (RA), Italy
| | - Elena Tremoli
- Maria Cecilia Hospital, GVM Care & Research, Via Corriera 1, 48033 Cotignola (RA), Italy
| | - Rona J Strawbridge
- Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
- Institute of Health and Wellbeing, University of Glasgow, Clarice Pears Building, 90 Byres Road, Glasgow G12 8TB, UK
- Health Data Research, Clarice Pears Building, 90 Byres Road, Glasgow G12 8TB, UK
| | - Antonio Gallo
- Lipidology and Cardiovascular Prevention Unit, Department of Nutrition, Sorbonne Université, INSERM UMR1166, APHP, Hôpital Pitié-Salpêtrière, 47 Boulevard de l'Hôpital, 75013 Paris, France
| | - Matteo Pirro
- Internal Medicine, Angiology and Arteriosclerosis Diseases, Department of Medicine, University of Perugia, Piazzale Menghini 1, 06129 Perugia, Italy
| | - Andries J Smit
- Department of Medicine, University Medical Center Groningen, Groningen & Isala Clinics Zwolle, Dokter Spanjaardweg 29B, 8025 BT Groningen, the Netherlands
| | - Sudhir Kurl
- Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio Campus, Yliopistonranta 1 C, Canthia Building, B Wing, FI-70211 Kuopio, Finland
| | - Kai Savonen
- Kuopio Research Institute of Exercise Medicine, Haapaniementie 16, FI-70100 Kuopio, Finland
- Department of Clinical Physiology and Nuclear Medicine, Science Service Center, Kuopio University Hospital, Yliopistonranta 1F, FI-70211 Kuopio, Finland
| | - Lars Lind
- Department of Medical Sciences, Uppsala University, Uppsala Science Park, Dag Hammarskjöldsv 10B, 752 37 Uppsala, Sweden
| | - Per Eriksson
- Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
| | - Bruna Gigante
- Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
- Department of Cardiology, Danderyd University Hospital, Entrevägen 2, 182 88 Stockholm, Sweden
|
30
|
Su D, Xiong Y, Wang S, Wei H, Ke J, Li H, Wang T, Zuo Y, Yang L. Structural deep clustering network for stratification of breast cancer patients through integration of somatic mutation profiles. Comput Methods Programs Biomed 2023; 242:107808. [PMID: 37716222 DOI: 10.1016/j.cmpb.2023.107808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/15/2023] [Accepted: 09/10/2023] [Indexed: 09/18/2023]
Abstract
BACKGROUND AND OBJECTIVE Breast cancer is among the most malignant tumors occurring in women and is one of the leading causes of death from gynecologic malignancy worldwide. The high degree of heterogeneity that characterizes breast cancer makes it challenging to devise effective therapeutic strategies. Accumulating evidence highlights the crucial role of stratifying breast cancer patients into clinically significant subtypes to achieve better prognoses and treatments. The structural deep clustering network is a graph convolutional network-based clustering algorithm that integrates structural information and has achieved state-of-the-art performance in various applications. METHODS In this study, we employed the structural deep clustering network to integrate somatic mutation profiles for stratifying 2526 breast cancer patients from the Memorial Sloan Kettering Cancer Center into two clinically differentiable subtypes. RESULTS Breast cancer patients in cluster 1 exhibited a better prognosis than those in cluster 2, and the difference between them was statistically significant. The immunogenomic landscape further demonstrated that cluster 1 was associated with remarkable infiltration of tumor-infiltrating lymphocytes. The clustering subtype could be used to evaluate the therapeutic benefit of immunotherapy and chemotherapy in breast cancer patients. Furthermore, our approach effectively classified patients from eight different cancer types, demonstrating its generalizability. CONCLUSIONS Our study represents a step towards a generic methodology for classifying cancer patients using only somatic mutation data and structural deep clustering network approaches. Employing the structural deep clustering network to identify breast cancer subtypes is promising and can inform the development of more accurate and personalized therapies.
Affiliation(s)
- Dongqing Su
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Yuqiang Xiong
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Haodong Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Jiawei Ke
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Honghao Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Tao Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Yongchun Zuo
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China; Digital College, Inner Mongolia Intelligent Union Big Data Academy, Inner Mongolia Wesure Date Technology Co., Ltd. Hohhot, 010010, China; Inner Mongolia International Mongolian Hospital, Hohhot 010065, China
| | - Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
|
31
|
Li L, Li H, Yang C, Tang Y, Wang Y, Yang H, Zhang W, Jiang F, Ji S. Multiscale levels CO2 decouple reinforcement in China. Environ Sci Pollut Res Int 2023; 30:121569-121583. [PMID: 37953427 DOI: 10.1007/s11356-023-30931-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 11/02/2023] [Indexed: 11/14/2023]
Abstract
Decoupling economic growth from CO2 emissions is imperative for China. Meanwhile, establishing a consistent and comprehensive decoupling inventory that includes national (N), regional and provincial (RP), and city and county (CC) levels is essential for further policy formulation. This research aims to investigate the decoupling status using the "N-RP-CC" approach while considering changes in decoupling trends at the different levels. A combination of the Tapio decoupling model and cluster analysis is employed to study the decoupling's spatiotemporal characteristics and trends. The study first calculates the decoupling value for "national, 7; regions, 30; provinces, 1501 CCs" in China, 2006-2017. The results show that there continues to be an improvement in the decoupling trend at the national level. Conversely, the regional scale exhibits a more vulnerable decoupling trend compared to the national level, with weak and extended negative decoupling observed in northeastern and northern China. Moreover, provincial heterogeneities are increasingly evident, with poor decoupling statuses appearing in Jilin, Heilongjiang, Liaoning, and Xinjiang, as well as many central provinces. Additionally, although more than half of CCs exhibit weak decoupling during most years, seven different states of decoupling were also identified during the time frame. These findings further indicate that spatiotemporal heterogeneities extend beyond RP scales within CCs. Taking the Yangtze River as a boundary line reveals a severe situation in northern areas along with rapid development trends observed in southern regions. Finally, we clustered 1414 CCs based on their industrial proportions for 2017 which further highlights increasingly prominent heterogeneities that should be carefully considered. Based on these findings, policy recommendations such as spatial organization and optimization and technique investment are proposed to achieve CO2 emission decoupling under the N-RP-CC levels.
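The Tapio decoupling model named in this abstract classifies each period by the elasticity of CO2 emissions with respect to GDP. The sketch below is one possible illustration, not the authors' code: the function name `tapio_state` and the example figures are hypothetical, and the 0.8 and 1.2 thresholds are the conventional cut-offs of the Tapio framework, assumed here because the abstract does not state which thresholds were used.

```python
def tapio_state(co2_start, co2_end, gdp_start, gdp_end):
    """Return (elasticity, state) for one period under the Tapio scheme.

    Thresholds 0.8 and 1.2 are the conventional cut-offs (assumed here).
    """
    d_co2 = (co2_end - co2_start) / co2_start   # relative change in emissions
    d_gdp = (gdp_end - gdp_start) / gdp_start   # relative change in GDP
    if d_gdp == 0:
        raise ValueError("elasticity is undefined when GDP does not change")
    e = d_co2 / d_gdp

    if d_gdp > 0:                       # growing economy
        if d_co2 < 0:
            state = "strong decoupling"
        elif e < 0.8:
            state = "weak decoupling"
        elif e <= 1.2:
            state = "expansive coupling"
        else:
            state = "expansive negative decoupling"
    else:                               # shrinking economy
        if d_co2 > 0:
            state = "strong negative decoupling"
        elif e < 0.8:
            state = "weak negative decoupling"
        elif e <= 1.2:
            state = "recessive coupling"
        else:
            state = "recessive decoupling"
    return e, state


# Hypothetical example: emissions +3 %, GDP +10 % over the period.
elasticity, state = tapio_state(100.0, 103.0, 100.0, 110.0)
```

Applying such a function to each national, regional, provincial, and city or county series would give a per-unit decoupling inventory of the kind the abstract describes, with eight possible states in total.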
Affiliation(s)
- Lei Li
- School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
- Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
| | - Huiying Li
- Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
- Institute of International Rivers and Eco-Security, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
| | - Chuanhua Yang
- School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
- Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
| | - Yue Tang
- School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
- Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
| | - Yujian Wang
- School of Chemical Science and Technology, Yunnan Minzu University, 2929 Yuehua Street, Kunming, 650500, China
| | - HongJuan Yang
- Faculty of Management and Economics, Kunming University of Science and Technology, No. 727 Jingming South Road, Kunming, 650500, China
| | - Weishi Zhang
- School of Geographic and Environmental Sciences, Tianjin Normal University, No.393, Extension of Bin Shui West Road, Xi Qing District, Tianjin, 300387, China
| | - Fengzhi Jiang
- School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
- Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
- Workstation of Academician Chen Jing of Yunnan Province, University City East Outer Ring South Road, Kunming, 650500, China
| | - Siping Ji
- School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China.
- Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China.
- School of Chemistry Science and Engineering, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, Yunnan Province, China.
|
32
|
Bailleux C, Chardin D, Guigonis JM, Ferrero JM, Chateau Y, Humbert O, Pourcher T, Gal J. Survival analysis of patient groups defined by unsupervised machine learning clustering methods based on patient metabolomic data. Comput Struct Biotechnol J 2023; 21:5136-5143. [PMID: 37920813 PMCID: PMC10618114 DOI: 10.1016/j.csbj.2023.10.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 10/16/2023] [Accepted: 10/16/2023] [Indexed: 11/04/2023] Open
Abstract
Purpose Meta-analyses have failed to accurately identify patients with non-metastatic breast cancer (BC) who are likely to benefit from chemotherapy, and metabolomics could provide new answers. In our previously published work, patients were clustered using five different unsupervised machine learning (ML) methods, resulting in the identification of three clusters with distinct clinical and simulated survival data. The objective of this study was to evaluate survival outcomes, with extended follow-up, using the same five unsupervised ML methods. Experimental design Forty-nine patients with non-metastatic BC, diagnosed between 2013 and 2016, were included retrospectively. Median follow-up was extended to 85.8 months. A total of 449 metabolites were extracted from tumor resection samples by combined liquid chromatography-mass spectrometry (LC-MS). Survival analyses were reported grouping clusters 1 and 2 together versus cluster 3. Bootstrap optimization was applied. Results PCA k-means, K-sparse and spectral clustering were the most effective methods to predict 2-year progression-free survival with bootstrap optimization (PFSb); as a bootstrap example, with the PCA k-means method, PFSb was 94% for clusters 1 and 2 versus 82% for cluster 3 (p = 0.01). The PCA k-means method performed best, with higher reproducibility (mean HR = 2, 95% CI [1.4-2.7]; probability of p ≤ 0.05, 85%). Cancer-specific survival (CSS) and overall survival (OS) analyses highlighted a discrepancy between the five unsupervised ML methods. Conclusion Our study is a proof of principle that unsupervised ML methods can be applied to metabolomic data to predict PFS outcomes, with the best performance for PCA k-means. A larger population study is needed to draw conclusions from CSS and OS analyses.
Affiliation(s)
- Caroline Bailleux
- University Côte d′Azur, Centre Antoine Lacassagne, Medical Oncology Department, Nice F-06189, France
- University Côte d′Azur, Commissariat à l′Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
| | - David Chardin
- University Côte d′Azur, Commissariat à l′Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
- University Côte d′Azur, Centre Antoine Lacassagne, Nuclear medicine Department, Nice F-06189, France
| | - Jean-Marie Guigonis
- University Côte d′Azur, Commissariat à l′Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
| | - Jean-Marc Ferrero
- University Côte d′Azur, Centre Antoine Lacassagne, Medical Oncology Department, Nice F-06189, France
| | - Yann Chateau
- University Côte d′Azur, Centre Antoine Lacassagne, Epidemiology and Biostatistics Department, Nice F-06189, France
| | - Olivier Humbert
- University Côte d′Azur, Commissariat à l′Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
- University Côte d′Azur, Centre Antoine Lacassagne, Nuclear medicine Department, Nice F-06189, France
| | - Thierry Pourcher
- University Côte d′Azur, Commissariat à l′Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
| | - Jocelyn Gal
- University Côte d′Azur, Centre Antoine Lacassagne, Epidemiology and Biostatistics Department, Nice F-06189, France
|
33
|
Willie E, Yang P, Patrick E. The impact of similarity metrics on cell-type clustering in highly multiplexed in situ imaging cytometry data. Bioinform Adv 2023; 3:vbad141. [PMID: 37928340 PMCID: PMC10625459 DOI: 10.1093/bioadv/vbad141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/23/2023] [Accepted: 10/07/2023] [Indexed: 11/07/2023]
Abstract
Motivation The advent of highly multiplexed in situ imaging cytometry assays has revolutionized the study of cellular systems, offering unparalleled detail in observing cellular activities and characteristics. These assays provide comprehensive insights by concurrently profiling the spatial distribution and molecular features of numerous cells. In navigating this complex data landscape, unsupervised machine learning techniques, particularly clustering algorithms, have become essential tools. They enable the identification and categorization of cell types and subsets based on their molecular characteristics. Despite their widespread adoption, most clustering algorithms in use were initially developed for cell suspension technologies, leading to a potential mismatch in application. There is a critical gap in the systematic evaluation of these methods, particularly in determining the properties that make them optimal for in situ imaging assays. Addressing this gap is vital for ensuring accurate, reliable analyses and fostering advancements in cellular biology research. Results In our extensive investigation, we evaluated a range of similarity metrics, which are crucial in determining the relationships between cells during the clustering process. Our findings reveal substantial variations in clustering performance, contingent on the similarity metric employed. These variations underscore the importance of selecting appropriate metrics to ensure accurate cell type and subset identification. In response to these challenges, we introduce FuseSOM, a novel ensemble clustering algorithm that integrates hierarchical multiview learning of similarity metrics with self-organizing maps. Through a rigorous stratified subsampling analysis framework, we demonstrate that FuseSOM outperforms existing best-practice clustering methods specifically tailored for in situ imaging cytometry data. 
Our work not only provides critical insights into the performance of clustering algorithms in this novel context but also offers a robust solution, paving the way for more accurate and reliable in situ imaging cytometry data analysis. Availability and implementation The FuseSOM R package is available on Bioconductor under the GPL-3 license. All code for the analyses performed can be found on GitHub.
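Why the choice of similarity metric can change clustering results, as this abstract reports, can be seen in a small sketch. It is not FuseSOM and does not use the FuseSOM API; the three distance functions are standard definitions, while the marker vectors and names such as `query_cell` are hypothetical toy values.

```python
import math


def euclidean(a, b):
    """Straight-line distance; sensitive to overall intensity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; ignores overall scale."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def pearson_distance(a, b):
    """1 - Pearson correlation; ignores both scale and offset."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centred_a = [x - mean_a for x in a]
    centred_b = [y - mean_b for y in b]
    return cosine_distance(centred_a, centred_b)


def nearest(query, candidates, dist):
    """Name of the candidate closest to the query under dist."""
    return min(candidates, key=lambda name: dist(query, candidates[name]))


# Hypothetical marker intensities for three cells (three markers each).
query_cell = [1.0, 2.0, 3.0]
other_cells = {
    "same_profile_brighter": [10.0, 20.0, 30.0],  # same shape, 10x signal
    "similar_intensity": [3.0, 2.0, 1.0],         # reversed marker pattern
}

by_euclidean = nearest(query_cell, other_cells, euclidean)
by_cosine = nearest(query_cell, other_cells, cosine_distance)
by_pearson = nearest(query_cell, other_cells, pearson_distance)
```

In this toy case Euclidean distance pairs the query with the cell of similar total intensity, whereas the cosine and Pearson distances pair it with the cell that has the same marker profile at higher brightness, so a clustering algorithm built on one metric would group the cells differently from one built on another.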
Affiliation(s)
- Elijah Willie
- Sydney Precision Data Science Centre, The University of Sydney, Camperdown, NSW 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Pengyi Yang
- Sydney Precision Data Science Centre, The University of Sydney, Camperdown, NSW 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Camperdown, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong, China
- Computational Systems Biology Group, Children’s Medical Research Institute, The University of Sydney, Westmead, NSW 2145, Australia
| | - Ellis Patrick
- Sydney Precision Data Science Centre, The University of Sydney, Camperdown, NSW 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Camperdown, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong, China
- Centre for Cancer Research, The Westmead Institute for Medical Research, The University of Sydney, Westmead, NSW 2145, Australia
|
34
|
Hui X, Wang Y, Li W, Yuan Y, Tao X, Lv R. Nd-Mn Molecular Cluster with Searched Targets for Oral Cancer Imaging. Mol Imaging Biol 2023; 25:875-886. [PMID: 37256508 DOI: 10.1007/s11307-023-01828-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 04/10/2023] [Accepted: 05/11/2023] [Indexed: 06/01/2023]
Abstract
In this research, we designed a novel NIR II luminescence imaging probe with targeting effect to accurately track oral squamous cell carcinoma (OSCC) cells. Massive gene expression data were processed by weighted gene co-expression network analysis to establish a network of relationships between genes. After clustering, correlation of clinical information, and gene functional enrichment analysis, MMP1 was predicted to be a biomarker/therapeutic target for OSCC cells. To obtain rare-earth probes with better luminescence in the NIR II region, we adjusted the doping ratio of the rare-earth element (Nd, Gd, Er, and Yb) fraction of the Nd-Mn molecular cluster to optimize its luminescence properties. The results of in vitro targeting experiments showed that Nd-Mn-MMP1Ab can target Cal-27 cells, demonstrating at the cellular level that the MMP1 gene is a biomarker for oral cancer, which also proves that the cancer targets predicted by the bioinformatics approach are correct.
Affiliation(s)
- Xin Hui
- Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Yanxing Wang
- Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Wenjing Li
- Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Ying Yuan
- Department of Medical Interdisciplinary Research, Xi'an Ninth Hospital Affiliated to Medical College of Xi'an Jiaotong University, Xi'an, 710054, Shaanxi, China
| | - Xiaofeng Tao
- Department of Medical Interdisciplinary Research, Xi'an Ninth Hospital Affiliated to Medical College of Xi'an Jiaotong University, Xi'an, 710054, Shaanxi, China.
| | - Ruichan Lv
- Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China.
|
35
|
Meng R, Yin S, Sun J, Hu H, Zhao Q. scAAGA: Single cell data analysis framework using asymmetric autoencoder with gene attention. Comput Biol Med 2023; 165:107414. [PMID: 37660567 DOI: 10.1016/j.compbiomed.2023.107414] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 07/02/2023] [Revised: 08/02/2023] [Accepted: 08/28/2023] [Indexed: 09/05/2023]
Abstract
In recent years, single-cell RNA sequencing (scRNA-seq) has emerged as a powerful technique for investigating cellular heterogeneity and structure. However, analyzing scRNA-seq data remains challenging, especially in the context of COVID-19 research. Single-cell clustering is a key step in analyzing scRNA-seq data, and deep learning methods have shown great potential in this area. In this work, we propose a novel scRNA-seq analysis framework called scAAGA. Specifically, we utilize an asymmetric autoencoder with a gene attention module to learn important gene features adaptively from scRNA-seq data, with the aim of improving clustering performance. We apply scAAGA to COVID-19 peripheral blood mononuclear cell (PBMC) scRNA-seq data and compare its performance with state-of-the-art methods. Our results consistently demonstrate that scAAGA outperforms existing methods in terms of adjusted Rand index (ARI), normalized mutual information (NMI), and adjusted mutual information (AMI) scores, achieving improvements ranging from 2.8% to 27.8% in NMI scores. Additionally, we discuss a data augmentation technique to expand the datasets and improve the accuracy of scAAGA. Overall, scAAGA presents a robust tool for scRNA-seq data analysis, enhancing the accuracy and reliability of clustering results in COVID-19 research.
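The adjusted Rand index (ARI) used in this abstract to score clustering against reference labels can be computed from a contingency table of label pairs. The following is a minimal sketch of the standard ARI formula, not code from scAAGA; the function name `adjusted_rand_index` and the toy labels are assumptions for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label lists must have the same length")
    if n < 2:
        raise ValueError("need at least two items")

    # Pairs of items co-clustered in both partitions, in the reference
    # partition only, and in the predicted partition only.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / comb(n, 2)   # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0   # degenerate case, e.g. both partitions are one cluster
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical labels: cluster names differ but the grouping is identical.
reference = ["T cell", "T cell", "B cell", "B cell"]
predicted = [1, 1, 0, 0]
ari_same_grouping = adjusted_rand_index(reference, predicted)
```

Because the index depends only on which items are grouped together, it is unaffected by how the clusters are named, and it is about 0 on average for random assignments, which is why it is commonly reported alongside NMI and AMI.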
Affiliation(s)
- Rui Meng
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
| | - Shuaidong Yin
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
| | - Jianqiang Sun
- School of Information Science and Engineering, Linyi University, Linyi, 276000, China
| | - Huan Hu
- Institute of Applied Genomics, Fuzhou University, Fuzhou, 350108, China.
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China.
|
36
|
Zeibich R, Kwan P, O’Brien TJ, Perucca P, Ge Z, Anderson A. Applications for Deep Learning in Epilepsy Genetic Research. Int J Mol Sci 2023; 24:14645. [PMID: 37834093 PMCID: PMC10572791 DOI: 10.3390/ijms241914645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 09/11/2023] [Accepted: 09/21/2023] [Indexed: 10/15/2023] Open
Abstract
Epilepsy is a group of brain disorders characterised by an enduring predisposition to generate unprovoked seizures. Fuelled by advances in sequencing technologies and computational approaches, more than 900 genes have now been implicated in epilepsy. The development and optimisation of tools and methods for analysing the vast quantity of genomic data is a rapidly evolving area of research. Deep learning (DL) is a subset of machine learning (ML) that brings opportunity for novel investigative strategies that can be harnessed to gain new insights into the genomic risk of people with epilepsy. DL is being harnessed to address limitations in accuracy of long-read sequencing technologies, which improve on short-read methods. Tools that predict the functional consequence of genetic variation can break new ground in addressing critical knowledge gaps, while methods that integrate independent but complementary data enhance the predictive power of genetic data. We provide an overview of these DL tools and discuss how they may be applied to the analysis of genetic data for epilepsy research.
Affiliation(s)
- Robert Zeibich
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; (R.Z.); (P.K.); (T.J.O.); (P.P.)
| | - Patrick Kwan
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; (R.Z.); (P.K.); (T.J.O.); (P.P.)
- Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia
- Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
- Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
| | - Terence J. O’Brien
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; (R.Z.); (P.K.); (T.J.O.); (P.P.)
- Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia
- Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
- Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
| | - Piero Perucca
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; (R.Z.); (P.K.); (T.J.O.); (P.P.)
- Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia
- Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
- Epilepsy Research Centre, Department of Medicine, Austin Health, The University of Melbourne, Melbourne, VIC 3084, Australia
- Bladin-Berkovic Comprehensive Epilepsy Program, Department of Neurology, Austin Health, The University of Melbourne, Melbourne, VIC 3084, Australia
| | - Zongyuan Ge
- Faculty of Engineering, Monash University, Melbourne, VIC 3800, Australia;
- Monash-Airdoc Research, Monash University, Melbourne, VIC 3800, Australia
| | - Alison Anderson
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; (R.Z.); (P.K.); (T.J.O.); (P.P.)
- Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
|
37
|
Karim MR, Islam T, Shajalal M, Beyan O, Lange C, Cochez M, Rebholz-Schuhmann D, Decker S. Explainable AI for Bioinformatics: Methods, Tools and Applications. Brief Bioinform 2023; 24:bbad236. [PMID: 37478371 DOI: 10.1093/bib/bbad236] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 02/10/2023] [Revised: 05/10/2023] [Accepted: 05/26/2023] [Indexed: 07/23/2023] Open
Abstract
Artificial intelligence (AI) systems utilizing deep neural networks and machine learning (ML) algorithms are widely used for solving critical problems in bioinformatics, biomedical informatics and precision medicine. However, complex ML models that are often perceived as opaque and black-box methods make it difficult to understand the reasoning behind their decisions. This lack of transparency can be a challenge for both end-users and decision-makers, as well as AI developers. In sensitive areas such as healthcare, explainability and accountability are not only desirable properties but also legally required for AI systems that can have a significant impact on human lives. Fairness is another growing concern, as algorithmic decisions should not show bias or discrimination towards certain groups or individuals based on sensitive attributes. Explainable AI (XAI) aims to overcome the opaqueness of black-box models and to provide transparency in how AI systems make decisions. Interpretable ML models can explain how they make predictions and identify factors that influence their outcomes. However, the majority of the state-of-the-art interpretable ML methods are domain-agnostic and have evolved from fields such as computer vision, automated reasoning or statistics, making direct application to bioinformatics problems challenging without customization and domain adaptation. In this paper, we discuss the importance of explainability and algorithmic transparency in the context of bioinformatics. We provide an overview of model-specific and model-agnostic interpretable ML methods and tools and outline their potential limitations. We discuss how existing interpretable ML methods can be customized and fit to bioinformatics research problems. Further, through case studies in bioimaging, cancer genomics and text mining, we demonstrate how XAI methods can improve transparency and decision fairness. 
Our review aims at providing valuable insights and serving as a starting point for researchers wanting to enhance explainability and decision transparency while solving bioinformatics problems. GitHub: https://github.com/rezacsedu/XAI-for-bioinformatics.
Affiliation(s)
- Md Rezaul Karim
- Computer Science 5 - Information Systems and Databases, RWTH Aachen University, Germany
- Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Germany
| | - Tanhim Islam
- Computer Science 9 - Process and Data Science, RWTH Aachen University, Germany
| | | | - Oya Beyan
- Computer Science 5 - Information Systems and Databases, RWTH Aachen University, Germany
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Institute for Medical Informatics, Germany
| | - Christoph Lange
- Computer Science 5 - Information Systems and Databases, RWTH Aachen University, Germany
- Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Germany
| | - Michael Cochez
- Department of Computer Science, Vrije Universiteit Amsterdam, the Netherlands
- Elsevier Discovery Lab, Amsterdam, the Netherlands
| | - Dietrich Rebholz-Schuhmann
- ZBMED - Information Center for Life Sciences, Cologne, Germany
- Faculty of Medicine, University of Cologne, Germany
| | - Stefan Decker
- Computer Science 5 - Information Systems and Databases, RWTH Aachen University, Germany
- Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Germany
| |
Collapse
|
38
|
Gao CX, Dwyer D, Zhu Y, Smith CL, Du L, Filia KM, Bayer J, Menssink JM, Wang T, Bergmeir C, Wood S, Cotton SM. An overview of clustering methods with guidelines for application in mental health research. Psychiatry Res 2023; 327:115265. [PMID: 37348404 DOI: 10.1016/j.psychres.2023.115265] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 12/15/2022] [Revised: 05/20/2023] [Accepted: 05/21/2023] [Indexed: 06/24/2023]
Abstract
Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements. In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles, are subsequently introduced. How to choose algorithms to address common issues, as well as methods for pre-clustering data processing, clustering evaluation and validation, are then discussed. Importantly, we also provide general guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms, we provide information on R functions and libraries.
Affiliation(s)
- Caroline X Gao
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia; Department of Epidemiology and Preventative Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Dominic Dwyer
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Ye Zhu
  - School of Information Technology, Deakin University, Geelong, VIC, Australia
- Catherine L Smith
  - Department of Epidemiology and Preventative Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Lan Du
  - Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Kate M Filia
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Johanna Bayer
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Jana M Menssink
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Teresa Wang
  - Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Christoph Bergmeir
  - Faculty of Information Technology, Monash University, Clayton, VIC, Australia; Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
- Stephen Wood
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Sue M Cotton
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
|
39
|
Vora LK, Gholap AD, Jetha K, Thakur RRS, Solanki HK, Chavda VP. Artificial Intelligence in Pharmaceutical Technology and Drug Delivery Design. Pharmaceutics 2023; 15:1916. [PMID: 37514102 PMCID: PMC10385763 DOI: 10.3390/pharmaceutics15071916] [Citation(s) in RCA: 193] [Impact Index Per Article: 96.5] [Received: 05/06/2023] [Revised: 06/28/2023] [Accepted: 07/04/2023] [Indexed: 07/30/2023] Open
Abstract
Artificial intelligence (AI) has emerged as a powerful tool that harnesses anthropomorphic knowledge and provides expedited solutions to complex challenges. Remarkable advancements in AI technology and machine learning present a transformative opportunity in the drug discovery, formulation, and testing of pharmaceutical dosage forms. By utilizing AI algorithms that analyze extensive biological data, including genomics and proteomics, researchers can identify disease-associated targets and predict their interactions with potential drug candidates. This enables a more efficient and targeted approach to drug discovery, thereby increasing the likelihood of successful drug approvals. Furthermore, AI can contribute to reducing development costs by optimizing research and development processes. Machine learning algorithms assist in experimental design and can predict the pharmacokinetics and toxicity of drug candidates. This capability enables the prioritization and optimization of lead compounds, reducing the need for extensive and costly animal testing. Personalized medicine approaches can be facilitated through AI algorithms that analyze real-world patient data, leading to more effective treatment outcomes and improved patient adherence. This comprehensive review explores the wide-ranging applications of AI in drug discovery, drug delivery dosage form designs, process optimization, testing, and pharmacokinetics/pharmacodynamics (PK/PD) studies. This review provides an overview of various AI-based approaches utilized in pharmaceutical technology, highlighting their benefits and drawbacks. Nevertheless, the continued investment in and exploration of AI in the pharmaceutical industry offer exciting prospects for enhancing drug development processes and patient care.
Affiliation(s)
- Lalitkumar K Vora
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK
- Amol D Gholap
  - Department of Pharmaceutics, St. John Institute of Pharmacy and Research, Palghar 401404, Maharashtra, India
- Keshava Jetha
  - Department of Pharmaceutics and Pharmaceutical Technology, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
  - Ph.D. Section, Gujarat Technological University, Ahmedabad 382424, Gujarat, India
- Hetvi K Solanki
  - Pharmacy Section, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
- Vivek P Chavda
  - Department of Pharmaceutics and Pharmaceutical Technology, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
|
40
|
Kalweit M, Burden AM, Boedecker J, Hügle T, Burkard T. Patient groups in Rheumatoid arthritis identified by deep learning respond differently to biologic or targeted synthetic DMARDs. PLoS Comput Biol 2023; 19:e1011073. [PMID: 37267387 DOI: 10.1371/journal.pcbi.1011073] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/22/2021] [Accepted: 04/04/2023] [Indexed: 06/04/2023] Open
Abstract
Cycling of biologic or targeted synthetic disease-modifying antirheumatic drugs (b/tsDMARDs) in rheumatoid arthritis (RA) patients due to non-response is a problem preventing and delaying disease control. We aimed to assess and validate treatment response of b/tsDMARDs among clusters of RA patients identified by deep learning. We clustered RA patients at first-time b/tsDMARD initiation (cohort entry) in the Swiss Clinical Quality Management in Rheumatic Diseases registry (SCQM) [1999-2018]. We performed comparative effectiveness analyses of b/tsDMARDs (ref. adalimumab) using Cox proportional hazard regression. Within 15 months, we assessed b/tsDMARD stop due to non-response, and separately a ≥20% reduction in DAS28-esr as a response proxy. We validated results through stratified analyses according to the most distinctive patient characteristics of clusters. Clusters comprised between 362 and 1481 patients (3516 unique patients). Stratified (validation) analyses confirmed comparative effectiveness results among clusters: Patients with ≥2 conventional synthetic DMARDs and prednisone at b/tsDMARD initiation, male patients, as well as patients with a lower disease burden responded better to tocilizumab than to adalimumab (hazard ratio [HR] 5.46, 95% confidence interval [CI] [1.76-16.94], and HR 8.44 [3.43-20.74], and HR 3.64 [2.04-6.49], respectively). Furthermore, seronegative women without use of prednisone at b/tsDMARD initiation as well as seropositive women with a higher disease burden and longer disease duration had a higher risk of non-response with golimumab (HR 2.36 [1.03-5.40] and HR 5.27 [2.10-13.21], respectively) than with adalimumab. Our results suggest that RA patient clusters identified by deep learning may have different responses to first-line b/tsDMARD. Thus, they may point to an optimal first-line b/tsDMARD for certain RA patients, which is a step towards personalizing treatment. However, further research in other cohorts is needed to verify our results.
Affiliation(s)
- Maria Kalweit
  - Department of Computer Science, University of Freiburg, Freiburg, Germany
- Andrea M Burden
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Zurich, Switzerland
- Joschka Boedecker
  - Department of Computer Science, University of Freiburg, Freiburg, Germany
- Thomas Hügle
  - Department of Rheumatology, Lausanne University Hospital, and University of Lausanne, Lausanne, Switzerland
- Theresa Burkard
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Zurich, Switzerland
|
41
|
Zhang H, Kong W, Xie Y, Zhao X, Luo D, Chen S, Pan Z. Telomere-related genes as potential biomarkers to predict endometriosis and immune response: Development of a machine learning-based risk model. Front Med (Lausanne) 2023; 10:1132676. [PMID: 36968845 PMCID: PMC10034389 DOI: 10.3389/fmed.2023.1132676] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/12/2023] [Accepted: 02/20/2023] [Indexed: 03/11/2023] Open
Abstract
Introduction: Endometriosis (EM) is an aggressive, pleomorphic, and common gynecological disease. Its clinical presentation includes abnormal menstruation, dysmenorrhea, and infertility, which seriously affect the patient's quality of life. However, the pathogenesis underlying EM and associated regulatory genes are unknown. Methods: Telomere-related genes (TRGs) were obtained from TelNet. RNA-sequencing (RNA-seq) data of EM patients were obtained from three datasets (GSE5108, GSE23339, and GSE25628) in the GEO database, and a random forest approach was used to identify telomere signature genes and build nomogram prediction models. Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, and Gene Set Enrichment Analysis were used to identify the pathways involved in the action of the signature genes. Finally, the Connectivity Map database was used to screen drugs for potential use in EM treatment. Results: Fifteen genes in total were screened as EM–telomere differentially expressed genes. Further screening by machine learning yielded six genes as characteristic predictors of EM. Immune infiltration analysis of the telomeric genes showed that levels of immune cells, including macrophages and natural killer cells, were significantly higher in cluster A. Further enrichment analysis showed that the differential genes were mainly enriched in biological pathways such as cell cycle and extracellular matrix. Finally, the Connectivity Map database was used to screen 11 potential drugs for EM treatment. Discussion: TRGs play a crucial role in EM development, are associated with immune infiltration, and act on multiple pathways, including the cell cycle. Telomere signature genes can be valuable predictive markers for EM.
|
42
|
Hernández-Hernández S, Ballester PJ. On the Best Way to Cluster NCI-60 Molecules. Biomolecules 2023; 13:498. [PMID: 36979433 PMCID: PMC10046274 DOI: 10.3390/biom13030498] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/16/2023] [Revised: 03/02/2023] [Accepted: 03/06/2023] [Indexed: 03/30/2023] Open
Abstract
Machine learning-based models have been widely used in the early drug-design pipeline. To validate these models, cross-validation strategies have been employed, including those using clustering of molecules in terms of their chemical structures. However, the poor clustering of compounds will compromise such validation, especially on test molecules dissimilar to those in the training set. This study aims at finding the best way to cluster the molecules screened by the National Cancer Institute (NCI)-60 project by comparing hierarchical, Taylor-Butina, and uniform manifold approximation and projection (UMAP) clustering methods. The best-performing algorithm can then be used to generate clusters for model validation strategies. This study also aims at measuring the impact of removing outlier molecules prior to the clustering step. Clustering results are evaluated using three well-known clustering quality metrics. In addition, we compute an average similarity matrix to assess the quality of each cluster. The results show variation in clustering quality from method to method. The clusters obtained by the hierarchical and Taylor-Butina methods are more computationally expensive to use in cross-validation strategies, and both cluster the molecules poorly. In contrast, the UMAP method provides the best quality, and therefore we recommend it to analyze this highly valuable dataset.
Affiliation(s)
- Saiveth Hernández-Hernández
  - Cancer Research Center of Marseille (INSERM U1068, Institut Paoli-Calmettes, Aix-Marseille Université UM105, CNRS UMR7258), 13009 Marseille, France
- Pedro J Ballester
  - Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
|
43
|
Nguyen R, Sokhansanj BA, Polikar R, Rosen GL. Complet+: a computationally scalable method to improve completeness of large-scale protein sequence clustering. PeerJ 2023; 11:e14779. [PMID: 36785708 PMCID: PMC9921987 DOI: 10.7717/peerj.14779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 01/03/2023] [Indexed: 02/10/2023] Open
Abstract
A major challenge for clustering algorithms is to balance the trade-off between homogeneity, i.e., the degree to which an individual cluster includes only related sequences, and completeness, i.e., the degree to which related sequences are kept together rather than broken up into multiple clusters. Most algorithms are conservative in grouping sequences with other sequences. Remote homologs may fail to be clustered together and instead form unnecessarily distinct clusters. The resulting clusters have high homogeneity but completeness that is too low. We propose Complet+, a computationally scalable post-processing method to increase the completeness of clusters without an undue cost in homogeneity. Complet+ proves to effectively merge closely related clusters of proteins that have verified structural relationships in the SCOPe classification scheme, improving the completeness of clustering results at little cost to homogeneity. Applying Complet+ to clusters obtained using MMseqs2's clusterupdate achieves V-measure increases of 0.09 and 0.05 at the SCOPe superfamily and family levels, respectively. Complet+ also creates more biologically representative clusters, as shown by a substantial increase in Adjusted Mutual Information (AMI) and Adjusted Rand Index (ARI) metrics when comparing predicted clusters to biological classifications. Complet+ similarly improves clustering metrics when applied to other methods, such as CD-HIT and linclust. Finally, we show that Complet+ runtime scales linearly with respect to the number of clusters being post-processed on a COG dataset of over 3 million sequences. Code and supplementary information are available on GitHub: https://github.com/EESI/Complet-Plus.
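The homogeneity/completeness trade-off in this abstract can be illustrated with a small sketch. It uses the standard entropy-based definitions of homogeneity, completeness, and V-measure (their harmonic mean); it is not the Complet+ code, and the function names and toy labels are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def _entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given) for two aligned label sequences."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(classes, clusters):
    """Return (homogeneity, completeness, V-measure) for a clustering.

    classes:  reference labels (e.g. a hypothetical family annotation)
    clusters: predicted cluster ids, aligned with `classes`
    """
    h_classes = _entropy(classes)
    h_clusters = _entropy(clusters)
    homogeneity = 1.0 if h_classes == 0 else 1.0 - _conditional_entropy(classes, clusters) / h_classes
    completeness = 1.0 if h_clusters == 0 else 1.0 - _conditional_entropy(clusters, classes) / h_clusters
    total = homogeneity + completeness
    v = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v


# Toy data: family "a" is split across two clusters (over-conservative clustering).
families = ["a", "a", "b", "b"]
print(v_measure(families, [0, 1, 2, 2]))  # pure clusters, but family "a" is broken up
print(v_measure(families, [0, 0, 2, 2]))  # after merging the two "a" clusters
```

In the first call every cluster is pure, so homogeneity is maximal while completeness is penalized; merging the two fragments of family "a" restores completeness without reducing homogeneity, which is the kind of gain a post-processing merge step aims for.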
Affiliation(s)
- Rachel Nguyen
  - Drexel University, Philadelphia, United States of America
- Robi Polikar
  - Rowan University, Glassboro, NJ, United States of America
- Gail L. Rosen
  - Drexel University, Philadelphia, United States of America
|
44
|
Alharbi F, Vakanski A. Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review. Bioengineering (Basel) 2023; 10:173. [PMID: 36829667 PMCID: PMC9952758 DOI: 10.3390/bioengineering10020173] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Received: 12/22/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023] Open
Abstract
Cancer is a term that denotes a group of diseases caused by the abnormal growth of cells that can spread in different parts of the body. According to the World Health Organization (WHO), cancer is the second major cause of death after cardiovascular diseases. Gene expression can play a fundamental role in the early detection of cancer, as it is indicative of the biochemical processes in tissue and cells, as well as the genetic characteristics of an organism. Deoxyribonucleic acid (DNA) microarrays and ribonucleic acid (RNA)-sequencing methods for gene expression data allow quantifying the expression levels of genes and produce valuable data for computational analysis. This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods. Both conventional and deep learning-based approaches are reviewed, with an emphasis on the application of deep learning models due to their comparative advantages for identifying gene patterns that are distinctive for various types of cancers. Relevant works that employ the most commonly used deep neural network architectures are covered, including multi-layer perceptrons, as well as convolutional, recurrent, graph, and transformer networks. This survey also presents an overview of the data collection methods for gene expression analysis and lists important datasets that are commonly used for supervised machine learning for this task. Furthermore, we review pertinent techniques for feature engineering and data preprocessing that are typically used to handle the high dimensionality of gene expression data, caused by a large number of genes present in data samples. The paper concludes with a discussion of future research directions for machine learning-based gene expression analysis for cancer classification.
|
45
|
Gorla A, Sankararaman S, Burchard E, Flint J, Zaitlen N, Rahmani E. Phenotypic subtyping via contrastive learning. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.05.522921. [PMID: 36711575 PMCID: PMC9881932 DOI: 10.1101/2023.01.05.522921] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/09/2023]
Abstract
Defining and accounting for subphenotypic structure has the potential to increase statistical power and provide a deeper understanding of the heterogeneity in the molecular basis of complex disease. Existing phenotype subtyping methods primarily rely on clinically observed heterogeneity or metadata clustering. However, they generally tend to capture the dominant sources of variation in the data, which often originate from variation that is not descriptive of the mechanistic heterogeneity of the phenotype of interest; in fact, such dominant sources of variation, such as population structure or technical variation, are, in general, expected to be independent of subphenotypic structure. We instead aim to find a subspace with signal that is unique to a group of samples for which we believe that subphenotypic variation exists (e.g., cases of a disease). To that end, we introduce Phenotype Aware Components Analysis (PACA), a contrastive learning approach leveraging canonical correlation analysis to robustly capture weak sources of subphenotypic variation. In the context of disease, PACA learns a gradient of variation unique to cases in a given dataset, while leveraging control samples to account for variation and imbalances of biological and technical confounders between cases and controls. We evaluated PACA using an extensive simulation study, as well as on various subtyping tasks using genotypes, transcriptomics, and DNA methylation data. Our results provide multiple lines of strong evidence that PACA allows us to robustly capture weak unknown variation of interest while being calibrated and well-powered, far surpassing the performance of alternative methods. This renders PACA a state-of-the-art tool for defining de novo subtypes that are more likely to reflect molecular heterogeneity, especially in challenging cases where the phenotypic heterogeneity may be masked by a myriad of strong unrelated effects in the data.
Affiliation(s)
- Aditya Gorla
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, USA
- Sriram Sankararaman
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Esteban Burchard
  - Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Jonathan Flint
  - Department of Psychiatry and Behavioral Sciences, Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA
- Noah Zaitlen
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Elior Rahmani
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
|
46
|
Wu H, Zeng R, Qiu X, Chen K, Zhuo Z, Guo K, Xiang Y, Yang Q, Jiang R, Leung FW, Lian Q, Sha W, Chen H. Investigating regulatory patterns of NLRP3 Inflammasome features and association with immune microenvironment in Crohn's disease. Front Immunol 2023; 13:1096587. [PMID: 36685554 PMCID: PMC9849378 DOI: 10.3389/fimmu.2022.1096587] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/12/2022] [Accepted: 12/02/2022] [Indexed: 01/06/2023] Open
Abstract
INTRODUCTION Crohn's disease is characterized by dysregulated inflammatory and immune reactions. The role of the NOD-like receptor family, pyrin domain-containing 3 (NLRP3) inflammasome in Crohn's disease remains largely unknown. METHODS The microarray-based transcriptomic data and corresponding clinical information of GSE100833 and GSE16879 were obtained from the Gene Expression Omnibus (GEO) database. NLRP3 inflammasome-related genes were identified and a LASSO regression model was constructed. The immune landscape was evaluated with ssGSEA. Crohn's disease samples were classified based on NLRP3 inflammasome-related genes with ConsensusClusterPlus. Functional enrichment analysis, gene set variation analysis (GSVA) and drug-gene interaction network analysis were also performed. RESULTS The expression of NLRP3 inflammasome-related genes was increased in diseased tissues, and higher expression of these genes was correlated with generally enhanced immune cell infiltration, immune-related pathways and human leukocyte antigen (HLA) gene expression. The gene-based signature performed well in the diagnosis of Crohn's disease. Moreover, consensus clustering identified two Crohn's disease clusters based on NLRP3 inflammasome-related genes, and cluster 2 had higher expression of the genes. Cluster 2 demonstrated upregulated activity of the immune environment in Crohn's disease. Furthermore, four key hub genes were identified and potential drugs were explored for the treatment of Crohn's disease. CONCLUSIONS Our findings indicate that the NLRP3 inflammasome and its related genes could regulate immune cells and responses, and may be involved in the pathogenesis of Crohn's disease from a transcriptomic perspective. These findings provide in silico insights into the diagnosis and treatment of Crohn's disease and might assist in the clinical decision-making process.
Affiliation(s)
- Huihuan Wu
  - Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - School of Medicine, South China University of Technology, Guangzhou, China
- Ruijie Zeng
  - Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - School of Medicine, Shantou University Medical College, Shantou, China
- Xinqi Qiu
  - Zhuguang Community Healthcare Center, Guangzhou, China
- Kequan Chen
  - Department of Gastroenterology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Zewei Zhuo
  - Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Kehang Guo
  - Department of Critical Care Medicine, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yawen Xiang
  - Edinburgh Medical School, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Qi Yang
  - Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Rui Jiang
  - School of Medicine, South China University of Technology, Guangzhou, China
- Felix W. Leung
  - David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, United States
- Qizhou Lian
  - Department of Medicine, Queen Mary Hospital, Hong Kong, Hong Kong SAR, China
- Weihong Sha
  - Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - School of Medicine, South China University of Technology, Guangzhou, China
- Hao Chen
  - Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - School of Medicine, South China University of Technology, Guangzhou, China
|
47
|
Sun J, Huang Q. Two stages biclustering with three populations. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
48
|
Sun J, Liu Q, Wang Y, Wang L, Song X, Zhao X. Five-year prognosis model of esophageal cancer based on genetic algorithm improved deep neural network. Ing Rech Biomed 2023. [DOI: 10.1016/j.irbm.2022.100748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
|
49
|
Johnson AC, Silva JAF, Kim SC, Larsen CP. Progress in kidney transplantation: The role for systems immunology. Front Med (Lausanne) 2022; 9:1070385. [PMID: 36590970 PMCID: PMC9800623 DOI: 10.3389/fmed.2022.1070385] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/14/2022] [Accepted: 11/16/2022] [Indexed: 12/23/2022] Open
Abstract
The development of systems biology represents an immense breakthrough in our ability to perform translational research and deliver personalized and precision medicine. A multidisciplinary approach in combination with the use of novel techniques allows for the extraction and analysis of vast quantities of data even from the volume- and source-limited samples that can be obtained from human subjects. Continued advances in microfluidics, scalability and affordability of sequencing technologies, and development of data analysis tools have made the application of a multi-omics, or systems, approach more accessible for use outside of specialized centers. The study of alloimmune and protective immune responses after solid organ transplant offers innumerable opportunities for a multi-omics approach; however, transplant immunology labs are only just beginning to adopt the systems methodology. In this review, we focus on advances in biological techniques and how they are improving our understanding of the immune system and its interactions, highlighting potential applications in transplant immunology. First, we describe the techniques that are available, with emphasis on major advances that allow for increased scalability. Then, we review initial applications in the field of transplantation with a focus on topics that are nearing clinical integration. Finally, we examine major barriers to adopting these methods and discuss potential future developments.
|
50
|
Sherif FF, Ahmed KS. Unsupervised clustering of SARS-CoV-2 using deep convolutional autoencoder. JOURNAL OF ENGINEERING AND APPLIED SCIENCE 2022. [PMCID: PMC9383682 DOI: 10.1186/s44147-022-00125-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Abstract
SARS-CoV-2’s population structure might have a substantial impact on public health management and diagnostics if it can be identified. It is critical to rapidly monitor and characterize the lineages circulating globally for a more accurate diagnosis, improved care, and faster treatment. For a clearer picture of the SARS-CoV-2 population structure, clustering the sequencing data is essential. Here, deep clustering techniques were used to automatically group 29,017 different strains of SARS-CoV-2 into clusters. We aim to identify the main clusters of the SARS-CoV-2 population structure based on a convolutional autoencoder (CAE) trained with numerical feature vectors mapped from coronavirus Spike peptide sequences. Our clustering findings revealed that there are six large SARS-CoV-2 population clusters (C1, C2, C3, C4, C5, C6). These clusters contained 43 unique lineages in which the 29,017 publicly accessible strains were dispersed. In all six resulting clusters, the genetic distances within the same cluster (intra-cluster distances) are smaller than the distances between clusters (inter-cluster distances) (P-value 0.0019, Wilcoxon rank-sum test). This indicates substantial evidence of a connection between the cluster’s lineages. Furthermore, the K-means and hierarchical clustering methods were compared against the proposed deep learning clustering method. The intra-cluster genetic distances of the proposed method were smaller than those of K-means alone and hierarchical clustering methods. We used t-distributed stochastic neighbor embedding (t-SNE) to show the outcomes of the deep learning clustering. The strains were separated correctly between clusters in the t-SNE plot. Our results showed that cluster C5 includes only the Gamma lineage (P.1), suggesting that strains of P.1 in C5 are more diversified than those in the other clusters.
Our study indicates that the genetic similarity between strains in the same cluster enables a better understanding of the major features of the unknown population lineages when compared to some of the more prevalent viral isolates. This information helps researchers figure out how the virus changed over time and spread to people all over the world.
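The intra-cluster versus inter-cluster comparison described in this abstract can be sketched as follows. The abstract does not state which genetic distance was used, so the sketch assumes Hamming distance over equal-length toy sequences; the function names, sequences, and cluster labels are hypothetical and serve only as an illustration.

```python
from itertools import combinations
from statistics import mean


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def split_pairwise_distances(sequences, labels):
    """Split all pairwise distances into intra-cluster and inter-cluster lists."""
    intra, inter = [], []
    for i, j in combinations(range(len(sequences)), 2):
        d = hamming(sequences[i], sequences[j])
        (intra if labels[i] == labels[j] else inter).append(d)
    return intra, inter


# Toy peptide-like strings and hypothetical cluster assignments.
seqs = ["AAAA", "AAAT", "GGGG", "GGGC"]
labels = [0, 0, 1, 1]
intra, inter = split_pairwise_distances(seqs, labels)
print("mean intra-cluster distance:", mean(intra))
print("mean inter-cluster distance:", mean(inter))
```

A well-separated clustering should show smaller intra-cluster than inter-cluster distances; the two lists could then be passed to a rank-sum test, such as the one named in the abstract, to assess whether the difference is significant.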
|