Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Karim MR, Beyan O, Zappa A, Costa IG, Rebholz-Schuhmann D, Cochez M, Decker S. Deep learning-based clustering approaches for bioinformatics. Brief Bioinform 2021;22:393-415. [PMID: 32008043 PMCID: PMC7820885 DOI: 10.1093/bib/bbz170] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 06/15/2019] [Revised: 11/28/2019] [Accepted: 12/11/2019] [Indexed: 12/14/2022] Open

Cited by Other Article(s)

Shaposhnikov M, Thakar J, Berk BC. Value of Bioinformatics Models for Predicting Translational Control of Angiogenesis. Circ Res 2025;136:1147-1165. [PMID: 40339045 DOI: 10.1161/circresaha.125.325438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2025]

Abstract

Angiogenesis, the formation of new blood vessels, is a fundamental biological process with implications for both physiological functions and pathological conditions. While the transcriptional regulation of angiogenesis, mediated by factors such as HIF-1α (hypoxia-inducible factor 1-alpha) and VEGF (vascular endothelial growth factor), is well-characterized, the translational regulation of this process remains underexplored. Bioinformatics has emerged as an indispensable tool for advancing our understanding of translational regulation, offering predictive models that leverage large data sets to guide research and optimize experimental approaches. However, a significant gap persists between bioinformatics experts and other researchers, limiting the accessibility and utility of these tools in the broader scientific community. To address this divide, user-friendly bioinformatics platforms are being developed to democratize access to predictive analytics and empower researchers across disciplines. Translational control, compared with transcriptional control, offers a more energy-efficient mechanism that facilitates rapid cellular responses to environmental changes. Furthermore, transcriptional regulators themselves are often subject to translational control, emphasizing the interconnected nature of these regulatory layers. Investigating translational regulation requires advanced, accessible bioinformatics tools to analyze RNA structures, interacting micro-RNAs, long noncoding RNAs, and RBPs (RNA-binding proteins). Predictive platforms such as RNA structure, human internal ribosome entry site Atlas, and RBPSuite enable the study of RNA motifs and RNA-protein interactions, shedding light on these critical regulatory mechanisms. This review highlights the transformative role of bioinformatics using widely accessible user-friendly tools with a Web-browser interface to elucidate translational regulation in angiogenesis. 
The bioinformatics tools discussed extend beyond angiogenesis, with applications in diverse fields, including clinical care. By integrating predictive models and experimental insights, researchers can streamline hypothesis generation, reduce experimental costs, and find novel translational regulators. By bridging the bioinformatics knowledge gap, this review aims to empower researchers worldwide to adopt bioinformatics tools in their work, fostering innovation and accelerating scientific discovery.

Yan R, Islam MT, Xing L. Interpretable discovery of patterns in tabular data via spatially semantic topographic maps. Nat Biomed Eng 2025;9:471-482. [PMID: 39407015 DOI: 10.1038/s41551-024-01268-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 09/23/2024] [Indexed: 04/18/2025]

Luo X, Zhang X, Su D, Li H, Zou M, Xiong Y, Yang L. Deep Clustering-Based Metabolic Stratification of Non-Small Cell Lung Cancer Patients Through Integration of Somatic Mutation Profile and Network Propagation Algorithm. Interdiscip Sci 2025:10.1007/s12539-025-00699-2. [PMID: 40100545 DOI: 10.1007/s12539-025-00699-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Revised: 02/21/2025] [Accepted: 02/22/2025] [Indexed: 03/20/2025]

Abstract

As a common malignancy of the lower respiratory tract, non-small cell lung cancer (NSCLC) represents a major oncological challenge globally, characterized by high incidence and mortality rates. Recent research highlights the critical involvement of somatic mutations in the onset and development of NSCLC. Stratification of NSCLC patients based on somatic mutation data could facilitate the identification of patients likely to respond to personalized therapeutic strategies. However, stratification of NSCLC patients using somatic mutation data is challenging due to the sparseness of these data. In this study, based on sparse somatic mutation data from 4581 NSCLC patients from the Memorial Sloan Kettering Cancer Center (MSKCC) database, we systematically evaluate the metabolic pathway activity in NSCLC patients through the application of a network propagation algorithm and computational biology algorithms. Based on the metabolic pathways associated with prognosis, as identified through univariate Cox regression analysis, NSCLC patients are stratified using a deep clustering algorithm to explore the optimal classification strategy, thereby establishing biologically meaningful metabolic subtypes of NSCLC patients. The NSCLC metabolic subtypes obtained from the network propagation algorithm and the deep clustering algorithm are systematically evaluated and validated for survival benefits of immunotherapy. Our research marks progress towards developing a universal approach for classifying NSCLC patients based solely on somatic mutation profiles, employing a deep clustering algorithm. The implementation of our research will help to deepen the analysis of NSCLC patients' metabolic subtypes from the perspective of the tumor microenvironment, providing a strong basis for the formulation of more precise personalized treatment plans.

van Dorp CH, Gray JI, Paik DH, Farber DL, Yates AJ. A variational deep-learning approach to modeling memory T cell dynamics. bioRxiv 2025:2024.07.08.602409. [PMID: 40060443 PMCID: PMC11888226 DOI: 10.1101/2024.07.08.602409] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/15/2025]

Beato M, Jaward MH, Nassis GP, Figueiredo P, Clemente FM, Krustrup P. An Educational Review on Machine Learning: A SWOT Analysis for Implementing Machine Learning Techniques in Football. Int J Sports Physiol Perform 2025;20:183-191. [PMID: 39662428 DOI: 10.1123/ijspp.2024-0247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 09/25/2024] [Accepted: 10/07/2024] [Indexed: 12/13/2024]

Ikotun AM, Habyarimana F, Ezugwu AE. Cluster validity indices for automatic clustering: A comprehensive review. Heliyon 2025;11:e41953. [PMID: 39897868 PMCID: PMC11787482 DOI: 10.1016/j.heliyon.2025.e41953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2024] [Revised: 01/08/2025] [Accepted: 01/13/2025] [Indexed: 02/04/2025] Open

Shaheen A, Mrabah N, Ksantini R, Alqaddoumi A. Rethinking deep clustering paradigms: Self-supervision is all you need. Neural Netw 2025;181:106773. [PMID: 39383676 DOI: 10.1016/j.neunet.2024.106773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 08/14/2024] [Accepted: 09/29/2024] [Indexed: 10/11/2024]

Abstract

The recent advances in deep clustering have been made possible by significant progress in self-supervised and pseudo-supervised learning. However, the trade-off between self-supervision and pseudo-supervision can give rise to three primary issues. The joint training causes Feature Randomness and Feature Drift, whereas the independent training causes Feature Randomness and Feature Twist. In essence, using pseudo-labels generates random and unreliable features. The combination of pseudo-supervision and self-supervision drifts the reliable clustering-oriented features. Moreover, moving from self-supervision to pseudo-supervision can twist the curved latent manifolds. This paper addresses the limitations of existing deep clustering paradigms concerning Feature Randomness, Feature Drift, and Feature Twist. We propose a new paradigm with a new strategy that replaces pseudo-supervision with a second round of self-supervision training. The new strategy makes the transition between instance-level self-supervision and neighborhood-level self-supervision smoother and less abrupt. Moreover, it prevents the drifting effect that is caused by the strong competition between instance-level self-supervision and clustering-level pseudo-supervision. Moreover, the absence of the pseudo-supervision prevents the risk of generating random features. With this novel approach, our paper introduces a Rethinking of the Deep Clustering Paradigms, denoted by R-DC. Our model is specifically designed to address three primary challenges encountered in Deep Clustering: Feature Randomness, Feature Drift, and Feature Twist. Experimental results conducted on six datasets have shown that the two-level self-supervision training yields substantial improvements, as evidenced by the results of the clustering and ablation study. Furthermore, experimental comparisons with nine state-of-the-art clustering models have clearly shown that our strategy leads to a significant enhancement in performance.

Guo Y, Li T, Gong B, Hu Y, Wang S, Yang L, Zheng C. From Images to Genes: Radiogenomics Based on Artificial Intelligence to Achieve Non-Invasive Precision Medicine in Cancer Patients. Adv Sci (Weinh) 2025;12:e2408069. [PMID: 39535476 PMCID: PMC11727298 DOI: 10.1002/advs.202408069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 10/19/2024] [Indexed: 11/16/2024]

Gao J, Wu M, Liao J, Meng F, Chen C. Clustering one million molecular structures on GPU within seconds. J Comput Chem 2024;45:2710-2718. [PMID: 39143827 DOI: 10.1002/jcc.27470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 06/13/2024] [Accepted: 07/14/2024] [Indexed: 08/16/2024]

Li T, Li M, Wu Y, Li Y. Visualization Methods for DNA Sequences: A Review and Prospects. Biomolecules 2024;14:1447. [PMID: 39595624 PMCID: PMC11592258 DOI: 10.3390/biom14111447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2024] [Revised: 11/08/2024] [Accepted: 11/12/2024] [Indexed: 11/28/2024] Open

Getz WM, Salter R, Sethi V, Cain S, Spiegel O, Toledo S. The statistical building blocks of animal movement simulations. Mov Ecol 2024;12:67. [PMID: 39350248 PMCID: PMC11440923 DOI: 10.1186/s40462-024-00507-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 09/16/2024] [Indexed: 10/04/2024]

Abstract

Animal movement plays a key role in many ecological processes and has a direct influence on an individual's fitness at several scales of analysis (i.e., next-step, subdiel, day-by-day, seasonal). This highlights the need to dissect movement behavior at different spatio-temporal scales and develop hierarchical movement tools for generating realistic tracks to supplement existing single-temporal-scale simulators. In reality, animal movement paths are a concatenation of fundamental movement elements (FuMEs: e.g., a step or wing flap), but these are not generally extractable from a relocation time-series track (e.g., sequential GPS fixes) from which step-length (SL, aka velocity) and turning-angle (TA) time series can be extracted. For short, fixed-length segments of track, we generate their SL and TA statistics (e.g., means, standard deviations, correlations) to obtain segment-specific vectors that can be clustered into different types. We use the centroids of these clusters to obtain a set of statistical movement elements (StaMEs; e.g., directed fast movement versus random slow movement elements) that we use as a basis for analyzing and simulating movement tracks. Our novel concept is that sequences of StaMEs provide a basis for constructing and fitting step-selection kernels at the scale of fixed-length canonical activity modes (CAMs): short fixed-length sequences of interpretable activity such as dithering, ambling, directed walking, or running. Beyond this, variable-length pure or characteristic mixtures of CAMs can be interpreted as behavioral activity modes (BAMs), such as gathering resources (a sequence of dithering and walking StaMEs) or beelining (a sequence of fast directed-walk StaMEs interspersed with vigilance and navigation stops). Here we formulate a multi-modal, step-selection kernel simulation framework, and construct a 2-mode movement simulator (Numerus ANIMOVER_1), using Numerus RAMP technology. These RAMPs run as stand-alone applications: they require no coding but only the input of selected parameter values. They can also be used in R programming environments as virtual R packages. We illustrate our methods for extracting StaMEs from both ANIMOVER_1 simulated data and empirical data from two barn owls (Tyto alba) in the Harod Valley, Israel. Overall, our new bottom-up approach to path segmentation allows us to both dissect real movement tracks and generate realistic synthetic ones, thereby providing a general tool for testing hypotheses in movement ecology and simulating animal movement in diverse contexts such as evaluating an individual's response to landscape changes, release of an individual into a novel environment, or identifying when individuals are sick or unusually stressed.
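The segment-statistics step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of four summary statistics, and the use of planar (x, y) coordinates are assumptions made for illustration, and the subsequent clustering of the vectors is left out.

```python
import math
from statistics import mean, pstdev


def steps_and_turns(track):
    """Return (step_lengths, turning_angles) for a list of (x, y) fixes."""
    step_lengths, headings = [], []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        step_lengths.append(math.hypot(dx, dy))
        headings.append(math.atan2(dy, dx))
    turning_angles = []
    for h0, h1 in zip(headings, headings[1:]):
        d = h1 - h0
        # Wrap the heading change into [-pi, pi].
        turning_angles.append(math.atan2(math.sin(d), math.cos(d)))
    return step_lengths, turning_angles


def segment_vectors(track, steps_per_segment):
    """Summarize consecutive fixed-length segments as small statistic vectors.

    Each vector is (mean SL, SD of SL, mean |TA|, SD of TA); this particular
    set of statistics is an illustrative assumption.
    """
    sl, ta = steps_and_turns(track)
    sl = sl[1:]  # pair each turning angle with the step that follows it
    vectors = []
    for start in range(0, len(ta) - steps_per_segment + 1, steps_per_segment):
        s = sl[start:start + steps_per_segment]
        a = ta[start:start + steps_per_segment]
        vectors.append((mean(s), pstdev(s), mean([abs(x) for x in a]), pstdev(a)))
    return vectors
```

The resulting vectors could then be passed to any clustering routine, with the cluster centroids playing the role the abstract assigns to StaMEs.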

Wu J, Wang L, Cui Y, Liu C, Ding W, Ren S, Dong R, Zhang J. Development of a Quality Evaluation Method for Allii Macrostemonis Bulbus Based on Solid-Phase Extraction-High-Performance Liquid Chromatography-Evaporative Light Scattering Detection Chromatographic Fingerprinting, Chemometrics, and Quantitative Analysis of Multi-Components via a Single-Marker Method. Molecules 2024;29:4600. [PMID: 39407530 PMCID: PMC11478197 DOI: 10.3390/molecules29194600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 08/31/2024] [Accepted: 09/25/2024] [Indexed: 10/20/2024] Open

Kolk MZH, Frodi DM, Langford J, Andersen TO, Jacobsen PK, Risum N, Tan HL, Svendsen JH, Knops RE, Diederichsen SZ, Tjong FVY. Deep behavioural representation learning reveals risk profiles for malignant ventricular arrhythmias. NPJ Digit Med 2024;7:250. [PMID: 39284923 PMCID: PMC11405885 DOI: 10.1038/s41746-024-01247-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 08/30/2024] [Indexed: 09/22/2024] Open

Affiliation(s)

Maarten Z H Kolk: Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam UMC location AMC Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
Diana My Frodi: Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
Joss Langford: Activinsights Ltd., Unit 11, Harvard Industrial Estate, Kimbolton, Huntingdon, PE28 0NJ, United Kingdom; College of Life and Environmental Sciences, University of Exeter, Stocker Rd, Exeter, EX4 4PY, United Kingdom
Tariq O Andersen: Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100, Copenhagen, Denmark
Peter Karl Jacobsen: Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
Niels Risum: Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
Hanno L Tan: Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands; Netherlands Heart Institute, Moreelsepark 1, 3511 EP, Utrecht, The Netherlands
Jesper Hastrup Svendsen: Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark; Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark
Reinoud E Knops: Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam UMC location AMC Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
Søren Zöga Diederichsen: Department of Cardiology, Copenhagen University Hospital Rigshospitalet, Inge Lehmanns Vej 7, 2100, Copenhagen, Denmark
Fleur V Y Tjong: Department of Clinical and Experimental Cardiology, Amsterdam UMC Location University of Amsterdam, Heart Center, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam UMC location AMC Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands

Goggin SM, Zunder ER. ESCHR: a hyperparameter-randomized ensemble approach for robust clustering across diverse datasets. Genome Biol 2024;25:242. [PMID: 39285487 PMCID: PMC11406744 DOI: 10.1186/s13059-024-03386-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/30/2024] [Indexed: 09/19/2024] Open

Gorla A, Witonsky J, Elhawary JR, Chen ZJ, Mefford J, Perez-Garcia J, Huntsman S, Hu D, Eng C, Woodruff PG, Sankararaman S, Ziv E, Flint J, Zaitlen N, Burchard E, Rahmani E. Epigenetic patient stratification via contrastive machine learning refines hallmark biomarkers in minoritized children with asthma. Res Sq 2024:rs.3.rs-5066762. [PMID: 39315258 PMCID: PMC11419268 DOI: 10.21203/rs.3.rs-5066762/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]

Abstract

Identifying and refining clinically significant patient stratification is a critical step toward realizing the promise of precision medicine in asthma. Several peripheral blood hallmarks, including total peripheral blood eosinophil count (BEC) and immunoglobulin E (IgE) levels, are routinely used in asthma clinical practice for endotype classification and predicting response to state-of-the-art targeted biologic drugs. However, these biomarkers appear ineffective in predicting treatment outcomes in some patients, and they differ in distribution between racially and ethnically diverse populations, potentially compromising medical care and hindering health equity due to biases in drug eligibility. Here, we propose constructing an unbiased patient stratification score based on DNA methylation (DNAm) and utilizing it to refine the efficacy of hallmark biomarkers for predicting drug response. We developed Phenotype Aware Component Analysis (PACA), a novel contrastive machine-learning method for learning combinations of DNAm sites reflecting biomedically meaningful patient stratifications. Leveraging whole-blood DNAm from Latino (discovery; n=1,016) and African American (replication; n=756) pediatric asthma case-control cohorts, we applied PACA to refine the prediction of bronchodilator response (BDR) to the short-acting β2-agonist albuterol, the most used drug to treat acute bronchospasm worldwide. While BEC and IgE correlate with BDR in the general patient population, our PACA-derived DNAm score renders these biomarkers predictive of drug response only in patients with high DNAm scores. BEC correlates with BDR in patients with upper-quartile DNAm scores (OR 1.12; 95% CI [1.04, 1.22]; P=7.9e-4) but not in patients with lower-quartile scores (OR 1.05; 95% CI [0.95, 1.17]; P=0.21); and IgE correlates with BDR in above-median (OR for response 1.42; 95% CI [1.24, 1.63]; P=3.9e-7) but not in below-median patients (OR 1.05; 95% CI [0.92, 1.2]; P=0.57).
These results hold within the commonly recognized type 2 (T2)-high asthma endotype but not in T2-low patients, suggesting that our DNAm score primarily represents an unknown variation of T2 asthma. Among T2-high patients with high DNAm scores, elevated BEC or IgE also corresponds to baseline clinical presentation that is known to benefit more from biologic treatment, including higher exacerbation scores, higher allergen sensitization, lower BMI, more recent oral corticosteroids prescription, and lower lung function. Our findings suggest that BEC and IgE, the traditional asthma biomarkers of T2-high asthma, are poor biomarkers for millions worldwide. Revisiting existing drug eligibility criteria relying on these biomarkers in asthma medical care may enhance precision and equity in treatment.

Affiliation(s)

Aditya Gorla: Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
Jonathan Witonsky: Division of Allergy, Immunology, and Bone Marrow Transplant, Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
Jennifer R Elhawary: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
Zeyuan Johnson Chen: Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
Joel Mefford: Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
Javier Perez-Garcia: Genomics and Health Group, Department of Biochemistry, Microbiology, Cell Biology, and Genetics, University of La Laguna, La Laguna, Spain
Scott Huntsman: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
Donglei Hu: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
Celeste Eng: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
Prescott G Woodruff: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
Sriram Sankararaman: Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA
Elad Ziv: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
Jonathan Flint: Department of Psychiatry and Behavioral Sciences, Brain Research Institute, University of California Los Angeles, Los Angeles, CA, USA
Noah Zaitlen: Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA; Department of Neurology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
Esteban Burchard: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA; Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
Elior Rahmani: Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA

Luximon DC, Neylon J, Ritter T, Agazaryan N, Hegde JV, Steinberg ML, Low DA, Lamb JM. Results of an Artificial Intelligence-Based Image Review System to Detect Patient Misalignment Errors in a Multi-institutional Database of Cone Beam Computed Tomography-Guided Radiation Therapy. Int J Radiat Oncol Biol Phys 2024;120:243-252. [PMID: 38485098 DOI: 10.1016/j.ijrobp.2024.02.065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 02/15/2024] [Accepted: 02/28/2024] [Indexed: 04/17/2024]

Abstract

PURPOSE

Present knowledge of patient setup and alignment errors in image guided radiation therapy (IGRT) relies on voluntary reporting, which is thought to underestimate error frequencies. A manual retrospective patient-setup misalignment error search is infeasible owing to the bulk of cases to be reviewed. We applied a deep learning-based misalignment error detection algorithm (EDA) to perform a fully automated retrospective error search of clinical IGRT databases and determine an absolute gross patient misalignment error rate.

METHODS AND MATERIALS

The EDA was developed to analyze the registration between planning scans and pretreatment cone beam computed tomography scans, outputting a misalignment score ranging from 0 (most unlikely) to 1 (most likely). The algorithm was trained using simulated translational errors on a data set obtained from 680 patients treated at 2 radiation therapy clinics between 2017 and 2022. A receiver operating characteristic analysis was performed to obtain target thresholds. DICOM Query and Retrieval software was integrated with the EDA to interact with the clinical database and fully automate data retrieval and analysis during a retrospective error search from 2016 to 2017 and from 2021 to 2022 for the 2 institutions, respectively. Registrations were flagged for human review using both a hard-thresholding method and a prediction trending analysis over each individual patient's treatment course. Flagged registrations were manually reviewed and categorized as errors (>1 cm misalignment at the target) or nonerrors.

RESULTS

A total of 17,612 registrations were analyzed by the EDA, resulting in 7.7% flagged events. Three previously reported errors were successfully flagged by the EDA, and 4 previously unreported vertebral body misalignment errors were discovered during case reviews. False positive cases often displayed substantial image artifacts, patient rotation, and soft tissue anatomy changes.

CONCLUSIONS

Our results validated the clinical utility of the EDA for bulk image reviews and highlighted the reliability and safety of IGRT, with an absolute gross patient misalignment error rate of 0.04% ± 0.02% per delivered fraction.
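The reported rate can be checked against the counts given in RESULTS with a short Python sketch. Two assumptions are made here that the abstract does not state: that each of the 17,612 analyzed registrations corresponds to one delivered fraction, and that the ± term is a binomial standard error. Under those assumptions the figures come out with the same magnitude as the published 0.04% ± 0.02%.

```python
import math

# Counts taken from the abstract's RESULTS section.
errors = 3 + 4            # previously reported + newly discovered errors
registrations = 17_612    # registrations analyzed by the EDA

# Assumption: one registration per delivered fraction.
rate = errors / registrations

# Assumption: the uncertainty is a binomial standard error, sqrt(p(1-p)/n).
std_err = math.sqrt(rate * (1 - rate) / registrations)

rate_pct = 100 * rate
std_err_pct = 100 * std_err
print(f"{rate_pct:.2f}% +/- {std_err_pct:.2f}%")
```

If the authors used a different interval (for example, a 95% confidence interval), the uncertainty term would differ, so this is a consistency check rather than a reconstruction of their method.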

Akgüller Ö, Balcı MA, Cioca G. Clustering Molecules at a Large Scale: Integrating Spectral Geometry with Deep Learning. Molecules 2024;29:3902. [PMID: 39202980 PMCID: PMC11357287 DOI: 10.3390/molecules29163902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2024] [Revised: 08/14/2024] [Accepted: 08/14/2024] [Indexed: 09/03/2024] Open

Wang C, Gao X, Li Y, Li C, Ma Z, Sun D, Liang X, Zhang X. A molecular subtyping associated with the cGAS-STING pathway provides novel perspectives on the treatment of ulcerative colitis. Sci Rep 2024;14:12683. [PMID: 38831059 PMCID: PMC11148070 DOI: 10.1038/s41598-024-63695-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 05/31/2024] [Indexed: 06/05/2024] Open

Affiliation(s)

Chen Wang: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
Xin Gao: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
Yanchen Li: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
Chenyang Li: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
Zhimin Ma: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Department of Respirology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
Donglei Sun: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
Xiaonan Liang: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China
Xiaolan Zhang: Department of Gastroenterology, Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Diseases, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China


Zamanian H, Shalbaf A. Estimation of non-alcoholic steatohepatitis (NASH) disease using clinical information based on the optimal combination of intelligent algorithms for feature selection and classification. Comput Methods Biomech Biomed Engin 2024;27:964-979. [PMID: 37254745 DOI: 10.1080/10255842.2023.2217978] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2022] [Accepted: 05/12/2023] [Indexed: 06/01/2023]

Abstract

Early diagnosis of NASH can reduce the risk of disease progression and lower treatment costs for patients. This study aims to present an optimal combination of intelligent algorithms using advanced machine learning methods, including different feature selection and classification methods, based on clinical data and blood factors. Data were collected from 176 patients to investigate NASH, and 19 features were extracted. We then sought the best combination of features using different feature selection algorithms such as Feature Forward Selection (FFS), Minimum Redundancy Maximum Relevance (MRMR), and Mutual Information (MI). Finally, we used nine classifier frameworks with different mathematical mechanisms, including random forest (RF), logistic regression (LR), linear discriminant analysis (LDA), AdaBoost, K-nearest neighbors (KNN), multilayer perceptron (MLP), support vector machine (SVM), and decision tree (DT), to estimate NASH. Our investigation revealed that the combination of dominant features, namely body mass index (BMI), glutamic pyruvic transaminase (GPT), total cholesterol (TC), high-density lipoprotein (HDL), ezetimibe, lipoprotein(a) level (Lp(a)), log_e(Lp(a)), total triglyceride (TG), creatinine (Cre), HbA1c, fibrate, and sex, selected by the MRMR algorithm and classified by the RF method, provides the best trade-off between computational effort and performance, with accuracy, AUC, precision, and recall of 81.51 ± 9.35, 82.53 ± 11.24, 85.28 ± 9.68, and 89.49 ± 7.92, respectively. This study investigated which configuration of feature selection and classifier is most suitable for classifying NASH based on clinical data and blood factors. The proposed algorithm based on MRMR and an RF classifier can automatically diagnose NASH with appropriate performance and present an initial report without any further invasive methods. It also clarifies the diagnostic process and, as a result, supports the continuation of patients' prevention and treatment cycle.
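
The MRMR selection step named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes discretised features, uses plain mutual information for both relevance and redundancy, and the feature names and toy data (`bmi_high`, `bmi_copy`, `hdl_low`) are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, target, k):
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance to the target minus mean redundancy with features already chosen."""
    selected = []
    remaining = list(features)

    def score(name):
        relevance = mutual_information(features[name], target)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical, discretised toy data (not taken from the study).
toy_features = {
    "bmi_high": [1, 1, 0, 0, 1, 0, 1, 0],
    "bmi_copy": [1, 1, 0, 0, 1, 0, 1, 0],  # exact duplicate: fully redundant
    "hdl_low": [1, 0, 1, 0, 1, 0, 0, 1],
}
toy_target = [1, 1, 1, 0, 1, 0, 1, 0]

print(mrmr_select(toy_features, toy_target, 2))
```

In this toy case the duplicated feature is skipped in favour of a less relevant but non-redundant one, which is the behaviour that distinguishes MRMR from ranking by mutual information alone. The selected features would then be passed to a classifier such as a random forest.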


Trottet C, Allam A, Horvath AN, Finckh A, Hügle T, Adler S, Kyburz D, Micheroli R, Krauthammer M, Ospelt C. Explainable deep learning for disease activity prediction in chronic inflammatory joint diseases. PLOS DIGITAL HEALTH 2024;3:e0000422. [PMID: 38935600 PMCID: PMC11210792 DOI: 10.1371/journal.pdig.0000422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/05/2023] [Accepted: 05/27/2024] [Indexed: 06/29/2024]

Abstract

Analysing complex diseases such as chronic inflammatory joint diseases (CIJDs), where many factors influence the disease evolution over time, is a challenging task. CIJDs are rheumatic diseases that cause the immune system to attack healthy organs, mainly the joints. Different environmental, genetic and demographic factors affect disease development and progression. The Swiss Clinical Quality Management in Rheumatic Diseases (SCQM) Foundation maintains a national database of CIJDs documenting the disease management over time for 19'267 patients. We propose the Disease Activity Score Network (DAS-Net), an explainable multi-task learning model trained on patients' data with different arthritis subtypes, transforming longitudinal patient journeys into comparable representations and predicting multiple disease activity scores. First, we built a modular model composed of feed-forward neural networks, long short-term memory networks and attention layers to process the heterogeneous patient histories and predict future disease activity. Second, we investigated the utility of the model's computed patient representations (latent embeddings) to identify patients with similar disease progression. Third, we enhanced the explainability of our model by analysing the impact of different patient characteristics on disease progression and contrasted our model outcomes with medical expert knowledge. To this end, we explored multiple feature attribution methods including SHAP, attention attribution and feature weighting using case-based similarity. Our model outperforms temporal and non-temporal neural network, tree-based, and naive static baselines in predicting future disease activity scores. To identify similar patients, a k-nearest neighbours regression algorithm applied to the model's computed latent representations outperforms baseline strategies that use raw input features representation.
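
The last step described above, k-nearest neighbours regression over latent patient representations, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the embeddings, scores and the function name `knn_regress` are hypothetical, Euclidean distance is assumed, and the DAS-Net model that produces the real embeddings is not reproduced.

```python
from math import dist


def knn_regress(query, embeddings, targets, k=3):
    """Predict a score for `query` as the mean target of its k nearest embeddings
    (Euclidean distance)."""
    nearest = sorted(
        range(len(embeddings)), key=lambda i: dist(query, embeddings[i])
    )[:k]
    return sum(targets[i] for i in nearest) / len(nearest)


# Hypothetical 2-D latent embeddings and disease activity scores.
latent = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 4.9)]
scores = [2.0, 2.5, 3.0, 6.0, 6.5]

print(knn_regress((0.05, 0.05), latent, scores, k=3))
```

Running the same function on raw input features instead of learned embeddings gives the kind of baseline the abstract compares against.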


Xiong Y, Chen C, He C, Yang X, Cheng W. Identification of shared gene signatures and biological mechanisms between preeclampsia and polycystic ovary syndrome. Heliyon 2024;10:e29225. [PMID: 38638956 PMCID: PMC11024567 DOI: 10.1016/j.heliyon.2024.e29225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2023] [Revised: 03/24/2024] [Accepted: 04/03/2024] [Indexed: 04/20/2024] Open

Barakat A, Munro G, Heegaard AM. Finding new analgesics: Computational pharmacology faces drug discovery challenges. Biochem Pharmacol 2024;222:116091. [PMID: 38412924 DOI: 10.1016/j.bcp.2024.116091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2023] [Revised: 01/10/2024] [Accepted: 02/23/2024] [Indexed: 02/29/2024]

Bao LX, Luo ZM, Zhu XL, Xu YY. Automated identification of protein expression intensity and classification of protein cellular locations in mouse brain regions from immunofluorescence images. Med Biol Eng Comput 2024;62:1105-1119. [PMID: 38150111 DOI: 10.1007/s11517-023-02985-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2023] [Accepted: 11/28/2023] [Indexed: 12/28/2023]

Wang T, Li Z, Zhao S, Liu Y, Guo W, Alarcòn Rodrìguez R, Wu Y, Wei R. Characterizing hedgehog pathway features in senescence associated osteoarthritis through Integrative multi-omics and machine learning analysis. Front Genet 2024;15:1255455. [PMID: 38444758 PMCID: PMC10912584 DOI: 10.3389/fgene.2024.1255455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2023] [Accepted: 02/06/2024] [Indexed: 03/07/2024] Open

Abstract

Purpose: Osteoarthritis (OA) is a disease of senescence and inflammation. Hedgehog's role in OA mechanisms is unclear. This study combines bulk RNA-seq and scRNA-seq to identify Hedgehog-associated genes in OA and investigate their impact on the pathogenesis of OA. Materials and methods: Download and merge eight bulk RNA-seq datasets from GEO, and obtain a scRNA-seq dataset for validation and analysis. Analyze Hedgehog pathway activity in OA using the bulk RNA-seq datasets. Use ten machine learning algorithms to identify important Hedgehog-associated genes and validate predictive models. Perform GSEA to investigate functional implications of the identified Hedgehog-associated genes. Assess immune infiltration in OA using the CIBERSORT and MCP-counter algorithms. Use the ConsensusClusterPlus package to identify Hedgehog-related subgroups. Conduct WGCNA to identify key modules enriched based on Hedgehog-related subgroups. Characterize genes by methylation and GWAS analysis. Evaluate Hedgehog pathway activity, expression of hub genes, pseudotime, and cell communication in OA chondrocytes using the scRNA-seq dataset. Validate Hedgehog-associated gene expression levels through real-time PCR analysis. Results: The activity of the Hedgehog pathway is significantly enhanced in OA. Additionally, nine important Hedgehog-associated genes have been identified, and the predictive models built using these genes demonstrate strong predictive capabilities. GSEA analysis indicates a significant positive correlation between all seven important Hedgehog-associated genes and lysosomes. Consensus clustering reveals the presence of two Hedgehog-related subgroups. In Cluster 1, Hedgehog pathway activity is significantly upregulated and associated with inflammatory pathways. WGCNA identifies that genes in the blue module are most significantly correlated with Cluster 1 and Cluster 2, as well as being involved in extracellular matrix and collagen-related pathways. Single-cell analysis confirms the significant upregulation of the Hedgehog pathway in OA, along with expression changes observed in 5 genes during putative temporal progression. Cell communication analysis suggests an association between low-scoring chondrocytes and macrophages. Conclusion: The Hedgehog pathway is significantly activated in OA and is associated with the extracellular matrix and collagen proteins. It plays a role in regulating immune cells and immune responses.


Dube F, Delhomme N, Martin F, Hinas A, Åbrink M, Svärd S, Tydén E. Gene co-expression network analysis reveal core responsive genes in Parascaris univalens tissues following ivermectin exposure. PLoS One 2024;19:e0298039. [PMID: 38359071 PMCID: PMC10868809 DOI: 10.1371/journal.pone.0298039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2023] [Accepted: 01/17/2024] [Indexed: 02/17/2024] Open

Abstract

Anthelmintic resistance in the equine parasite Parascaris univalens compromises ivermectin (IVM) effectiveness and necessitates an in-depth understanding of its resistance mechanisms. Most research, primarily focused on holistic gene expression analyses, may overlook vital tissue-specific responses and often limits the scope for finding novel genes. This study leveraged gene co-expression network analysis to elucidate tissue-specific transcriptional responses and to identify core genes implicated in the IVM response in P. univalens. Adult worms (n = 28) were exposed to 10⁻¹¹ M and 10⁻⁹ M IVM in vitro for 24 hours. RNA-sequencing examined transcriptional changes in the anterior end and intestine. Differential expression analysis revealed pronounced tissue differences, with the intestine exhibiting substantially more IVM-induced transcriptional activity. Gene co-expression network analysis identified seven modules significantly associated with the response to IVM. Within these, 219 core genes were detected, largely expressed in the intestinal tissue and spanning diverse biological processes with unspecific patterns. After 10⁻¹¹ M IVM, intestinal tissue core genes showed transcriptional suppression, cell cycle inhibition, and ribosomal alterations. Interestingly, genes PgR028_g047 (sorb-1), PgB01_g200 (gmap-1) and PgR046_g017 (col-37 & col-102) switched from downregulation at 10⁻¹¹ M to upregulation at 10⁻⁹ M IVM. The 10⁻⁹ M concentration induced expression of cuticle and membrane integrity core genes in the intestinal tissue. No clear core gene patterns were visible in the anterior end after 10⁻¹¹ M IVM. However, after 10⁻⁹ M IVM, the anterior end mostly displayed downregulation, indicating disrupted transcriptional regulation. One interesting finding was the non-modular calcium-signaling gene, PgR047_g066 (gegf-1), which uniquely connected 71 genes across four modules. These genes were enriched for transmembrane signaling activity, suggesting that PgR047_g066 (gegf-1) could have a key signaling role. By unveiling tissue-specific expression patterns and highlighting biological processes through unbiased core gene detection, this study reveals intricate IVM responses in P. univalens. These findings suggest alternative drug uptake of IVM and can guide functional validations to further the understanding of IVM resistance mechanisms.


Shafique A, Gonzalez R, Pantanowitz L, Tan PH, Machado A, Cree IA, Tizhoosh HR. A Preliminary Investigation into Search and Matching for Tumor Discrimination in World Health Organization Breast Taxonomy Using Deep Networks. Mod Pathol 2024;37:100381. [PMID: 37939901 PMCID: PMC10891482 DOI: 10.1016/j.modpat.2023.100381] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Revised: 10/26/2023] [Accepted: 10/31/2023] [Indexed: 11/10/2023]

Khine AH, Wettayaprasit W, Duangsuwan J. A new word embedding model integrated with medical knowledge for deep learning-based sentiment classification. Artif Intell Med 2024;148:102758. [PMID: 38325934 DOI: 10.1016/j.artmed.2023.102758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2023] [Revised: 11/19/2023] [Accepted: 12/29/2023] [Indexed: 02/09/2024]

Abstract

The popularity of social media platforms has made it possible to develop intelligent systems that use social media data for decision-making in numerous domains such as politics, business, marketing, and finance. However, the use of textual data from social media in the healthcare management industry is still limited compared with other industries. Investigating how current machine learning and natural language processing technologies can be used in the healthcare industry to gauge public sentiment is an important study. Earlier works on healthcare sentiment analysis have utilized traditional word embedding models trained on general and medical corpora. However, integration of medical knowledge into pre-trained word embedding models has not been considered yet. Word embedding models trained on a general corpus lack medical knowledge, and models trained on a small medical corpus are limited in capturing semantic and syntactic properties. This research proposes a new word embedding model named Word Embedding Integrated with Medical Knowledge Vector (WE-iMKVec). The proposed model integrates sentiment lexicons and medical knowledge bases into the pre-trained word embedding to enrich the properties of the word embedding. A new medical-aware sentiment polarity score is proposed for use in learning neural-network sentiment vectors, and these vectors are incorporated with the original pre-trained word vectors. The resulting vectors are enriched with lexicon vectors, and medical knowledge vectors, namely the Adverse Drug Reaction (ADR) vector and the Unified Medical Language System (UMLS) vector, are used to build the proposed WE-iMKVec model. WE-iMKVec is validated on five different social media healthcare review datasets, and the empirical results show its superiority over traditional word embedding models in medical sentiment analysis. The highest improvement is found in the patients.info medical condition dataset, where the proposed model outperforms three conventional word2vec models (Google-News, PubMed-PMC, and Drug Reviews) by 12.7%, 31.4%, and 25.4%, respectively, in terms of F1 score.


Xiang J, Sun Y, Wu X, Guo Y, Xue J, Niu Y, Cui X. Abnormal Spatial and Temporal Overlap of Time-Varying Brain Functional Networks in Patients with Schizophrenia. Brain Sci 2023;14:40. [PMID: 38248255 PMCID: PMC10813230 DOI: 10.3390/brainsci14010040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Revised: 12/25/2023] [Accepted: 12/27/2023] [Indexed: 01/23/2024] Open

Chen QS, Bergman O, Ziegler L, Baldassarre D, Veglia F, Tremoli E, Strawbridge RJ, Gallo A, Pirro M, Smit AJ, Kurl S, Savonen K, Lind L, Eriksson P, Gigante B. A machine learning based approach to identify carotid subclinical atherosclerosis endotypes. Cardiovasc Res 2023;119:2594-2606. [PMID: 37475157 PMCID: PMC10730242 DOI: 10.1093/cvr/cvad106] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/07/2022] [Revised: 03/12/2023] [Accepted: 05/05/2023] [Indexed: 07/22/2023] Open

Abstract

AIMS

To define endotypes of carotid subclinical atherosclerosis.

METHODS AND RESULTS

We integrated demographic, clinical, and molecular data (n = 124) with ultrasonographic carotid measurements from study participants in the IMPROVE cohort (n = 3340). We applied a neural network algorithm and hierarchical clustering to identify carotid atherosclerosis endotypes. A measure of carotid subclinical atherosclerosis, the c-IMTmean-max, was used to extract atherosclerosis-related features and SHapley Additive exPlanations (SHAP) to reveal endotypes. The association of endotypes with carotid ultrasonographic measurements at baseline, after 30 months, and with the 3-year atherosclerotic cardiovascular disease (ASCVD) risk was estimated by linear (β, SE) and Cox [hazard ratio (HR), 95% confidence interval (CI)] regression models. Crude estimates were adjusted by common cardiovascular risk factors, and baseline ultrasonographic measures. Improvement in ASCVD risk prediction was evaluated by C-statistic and by net reclassification improvement with reference to SCORE2, c-IMTmean-max, and presence of carotid plaques. An ensemble stacking model was used to predict endotypes in an independent validation cohort, the PIVUS (n = 1061). We identified four endotypes able to differentiate carotid atherosclerosis risk profiles from mild (endotype 1) to severe (endotype 4). SHAP identified endotype-shared variables (age, biological sex, and systolic blood pressure) and endotype-specific biomarkers. In the IMPROVE, as compared to endotype 1, endotype 4 associated with the thickest c-IMT at baseline (β, SE) 0.36 (0.014), the highest number of plaques 1.65 (0.075), the fastest c-IMT progression 0.06 (0.013), and the highest ASCVD risk (HR, 95% CI) (1.95, 1.18-3.23). Baseline and progression measures of carotid subclinical atherosclerosis and ASCVD risk were associated with the predicted endotypes in the PIVUS. Endotypes consistently improved measures of ASCVD risk discrimination and reclassification in both study populations.

CONCLUSIONS

We report four replicable subclinical carotid atherosclerosis-endotypes associated with progression of atherosclerosis and ASCVD risk in two independent populations. Our approach based on endotypes can be applied for precision medicine in ASCVD prevention.


Affiliation(s)

Qiao Sen Chen Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
Otto Bergman Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
Louise Ziegler Division of Medicine and Department of Clinical Sciences, Danderyd Hospital, Karolinska Institutet, Entrevägen 2, 182 88 Stockholm, Sweden
Damiano Baldassarre Department of Medical Biotechnology and Translational Medicine, Università di Milano, Via Vanvitelli 32, 20133 Milan, Italy Centro Cardiologico Monzino, IRCCS, Via Carlo Parea 4, 20138 Milan, Italy
Fabrizio Veglia Maria Cecilia Hospital, GVM Care & Research, Via Corriera 1, 48033 Cotignola (RA), Italy
Elena Tremoli Maria Cecilia Hospital, GVM Care & Research, Via Corriera 1, 48033 Cotignola (RA), Italy
Rona J Strawbridge Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden Institute of Health and Wellbeing, University of Glasgow, Clarice Pears Building, 90 Byres Road, Glasgow G12 8TB, UK Health Data Research, Clarice Pears Building, 90 Byres Road, Glasgow G12 8TB, UK
Antonio Gallo Lipidology and Cardiovascular Prevention Unit, Department of Nutrition, Sorbonne Université, INSERM UMR1166, APHP, Hôpital Pitié-Salpêtrière, 47 Boulevard de l'Hôpital, 75013 Paris, France
Matteo Pirro Internal Medicine, Angiology and Arteriosclerosis Diseases, Department of Medicine, University of Perugia, Piazzale Menghini 1, 06129 Perugia, Italy
Andries J Smit Department of Medicine, University Medical Center Groningen, Groningen & Isala Clinics Zwolle, Dokter Spanjaardweg 29B, 8025 BT Groningen, the Netherlands
Sudhir Kurl Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio Campus, Yliopistonranta 1 C, Canthia Building, B Wing, FI-70211 Kuopio, Finland
Kai Savonen Kuopio Research Institute of Exercise Medicine, Haapaniementie 16, FI-70100 Kuopio, Finland Department of Clinical Physiology and Nuclear Medicine, Science Service Center, Kuopio University Hospital, Yliopistonranta 1F, FI-70211 Kuopio, Finland
Lars Lind Department of Medical Sciences, Uppsala University, Uppsala Science Park, Dag Hammarskjöldsv 10B, 752 37 Uppsala, Sweden
Per Eriksson Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden
Bruna Gigante Division of Cardiovascular Medicine, Department of Medicine Solna, Karolinska Institutet, Solnavägen 30, 171 64 Stockholm, Sweden Department of Cardiology, Danderyd University Hospital, Entrevägen 2, 182 88 Stockholm, Sweden


Su D, Xiong Y, Wang S, Wei H, Ke J, Li H, Wang T, Zuo Y, Yang L. Structural deep clustering network for stratification of breast cancer patients through integration of somatic mutation profiles. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023;242:107808. [PMID: 37716222 DOI: 10.1016/j.cmpb.2023.107808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/14/2023] [Revised: 08/15/2023] [Accepted: 09/10/2023] [Indexed: 09/18/2023]

Abstract

BACKGROUND AND OBJECTIVE

Breast cancer is among the most malignant tumors that occur in women and is one of the leading causes of death from gynecologic malignancy worldwide. The high degree of heterogeneity that characterizes breast cancer makes it challenging to devise effective therapeutic strategies. Accumulating evidence highlights the crucial role of stratifying breast cancer patients into clinically significant subtypes to achieve better prognoses and treatments. The structural deep clustering network is a graph convolutional network-based clustering algorithm that integrates structural information and has achieved state-of-the-art performance in various applications.

METHODS

In this study, we employed a structural deep clustering network to integrate somatic mutation profiles and stratify 2526 breast cancer patients from the Memorial Sloan Kettering Cancer Center into two clinically differentiable subtypes.

RESULTS

Breast cancer patients in cluster 1 exhibited a better prognosis than those in cluster 2, and the difference between them was statistically significant. The immunogenomic landscape further demonstrated that cluster 1 was associated with remarkable infiltration of tumor-infiltrating lymphocytes. The clustering subtype could be used to evaluate the therapeutic benefit of immunotherapy and chemotherapy in breast cancer patients. Furthermore, our approach effectively classified patients from eight different cancer types, demonstrating its generalizability.

CONCLUSIONS

Our study represents a step towards a generic methodology for classifying cancer patients using only somatic mutation data and structural deep clustering network approaches. Employing structural deep clustering network to identify breast cancer subtypes is promising and can inform the development of more accurate and personalized therapies.


Li L, Li H, Yang C, Tang Y, Wang Y, Yang H, Zhang W, Jiang F, Ji S. Multiscale levels CO₂ decouple reinforcement in China. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023;30:121569-121583. [PMID: 37953427 DOI: 10.1007/s11356-023-30931-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Accepted: 11/02/2023] [Indexed: 11/14/2023]

Abstract

Decoupling economic growth from CO₂ emissions is imperative for China. Meanwhile, establishing a consistent and comprehensive decoupling inventory that includes national (N), regional and provincial (RP), and city and county (CC) levels is essential for further policy formulation. This research aims to investigate the decoupling status using the "N-RP-CC" approach while considering changes in decoupling trends at the different levels. A combination of the Tapio decoupling model and cluster analysis is employed to study the decoupling's spatiotemporal characteristics and trends. The study first calculates the decoupling value at the national level and for 7 regions, 30 provinces, and 1501 CCs in China for 2006-2017. The results show that the decoupling trend continues to improve at the national level. Conversely, the regional scale exhibits a more vulnerable decoupling trend compared to the national level, with weak and extended negative decoupling observed in northeastern and northern China. Moreover, provincial heterogeneities are increasingly evident, with poor decoupling statuses appearing in Jilin, Heilongjiang, Liaoning, and Xinjiang, as well as many central provinces. Additionally, although more than half of CCs exhibit weak decoupling during most years, seven different states of decoupling were also identified during the time frame. These findings further indicate that spatiotemporal heterogeneities extend beyond RP scales within CCs. Taking the Yangtze River as a boundary line reveals a severe situation in northern areas along with rapid development trends observed in southern regions. Finally, we clustered 1414 CCs based on their industrial proportions for 2017, which further highlights increasingly prominent heterogeneities that should be carefully considered. Based on these findings, policy recommendations such as spatial organization and optimization and technique investment are proposed to achieve CO₂ emission decoupling under the N-RP-CC levels.
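
For readers unfamiliar with the Tapio decoupling model named in this abstract, the sketch below shows the usual form of the calculation: the decoupling elasticity is the relative change in CO₂ emissions divided by the relative change in GDP, and the pair of signs plus the elasticity value determine the decoupling state. The 0.8 and 1.2 thresholds and the eight state names follow the commonly cited Tapio scheme and are an assumption here, since the abstract does not state the study's exact settings. The function names and example numbers are hypothetical.

```python
def tapio_elasticity(c0, c1, g0, g1):
    """Decoupling elasticity: relative change in CO2 over relative change in GDP."""
    return ((c1 - c0) / c0) / ((g1 - g0) / g0)


def tapio_state(c0, c1, g0, g1):
    """Classify a (CO2, GDP) change into one of the eight commonly used Tapio states."""
    dc = (c1 - c0) / c0  # relative change in emissions
    dg = (g1 - g0) / g0  # relative change in GDP
    if dg == 0:
        raise ValueError("GDP unchanged: elasticity is undefined")
    e = dc / dg
    if dg > 0:  # growing economy
        if dc < 0:
            return "strong decoupling"
        if e < 0.8:
            return "weak decoupling"
        if e <= 1.2:
            return "expansive coupling"
        return "expansive negative decoupling"
    # shrinking economy
    if dc > 0:
        return "strong negative decoupling"
    if e < 0.8:
        return "weak negative decoupling"
    if e <= 1.2:
        return "recessive coupling"
    return "recessive decoupling"


# Hypothetical figures: emissions rise 5% while GDP rises 10%.
print(tapio_elasticity(100, 105, 100, 110), tapio_state(100, 105, 100, 110))
```

Applying such a function to each region, province, or county for each year yields the kind of multiscale decoupling inventory the abstract describes.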


Affiliation(s)

Lei Li School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
Huiying Li Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China Institute of International Rivers and Eco-Security, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
Chuanhua Yang School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
Yue Tang School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China
Yujian Wang School of Chemical Science and Technology, Yunnan Minzu University, 2929 Yuehua Street, Kunming, 650500, China
HongJuan Yang Faculty of Management and Economics, Kunming University of Science and Technology, No. 727 Jingming South Road, Kunming, 650500, China
Weishi Zhang School of Geographic and Environmental Sciences, Tianjin Normal University, No.393, Extension of Bin Shui West Road, Xi Qing District, Tianjin, 300387, China
Fengzhi Jiang School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China Workstation of Academician Chen Jing of Yunnan Province, University City East Outer Ring South Road, Kunming, 650500, China
Siping Ji School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China. Research Center of Lake Restoration Technology Engineering for Universities of Yunnan Province (Yunnan University), School of Chemical Science and Technology, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, China. School of Chemistry Science and Engineering, Yunnan University, University City East Outer Ring South Road, Kunming, 650500, Yunnan Province, China.


Bailleux C, Chardin D, Guigonis JM, Ferrero JM, Chateau Y, Humbert O, Pourcher T, Gal J. Survival analysis of patient groups defined by unsupervised machine learning clustering methods based on patient metabolomic data. Comput Struct Biotechnol J 2023;21:5136-5143. [PMID: 37920813 PMCID: PMC10618114 DOI: 10.1016/j.csbj.2023.10.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2023] [Revised: 10/16/2023] [Accepted: 10/16/2023] [Indexed: 11/04/2023] Open

Affiliation(s)

Caroline Bailleux University Côte d'Azur, Centre Antoine Lacassagne, Medical Oncology Department, Nice F-06189, France University Côte d'Azur, Commissariat à l'Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
David Chardin University Côte d'Azur, Commissariat à l'Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France University Côte d'Azur, Centre Antoine Lacassagne, Nuclear medicine Department, Nice F-06189, France
Jean-Marie Guigonis University Côte d'Azur, Commissariat à l'Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
Jean-Marc Ferrero University Côte d'Azur, Centre Antoine Lacassagne, Medical Oncology Department, Nice F-06189, France
Yann Chateau University Côte d'Azur, Centre Antoine Lacassagne, Epidemiology and Biostatistics Department, Nice F-06189, France
Olivier Humbert University Côte d'Azur, Commissariat à l'Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France University Côte d'Azur, Centre Antoine Lacassagne, Nuclear medicine Department, Nice F-06189, France
Thierry Pourcher University Côte d'Azur, Commissariat à l'Energie Atomique et aux énergies alternatives, Institut Frédéric Joliot, Service Hospitalier Frédéric Joliot, laboratory Transporters in Oncology and Radiotherapy in Oncology (TIRO), School of medicine, Nice F-06100, France
Jocelyn Gal University Côte d'Azur, Centre Antoine Lacassagne, Epidemiology and Biostatistics Department, Nice F-06189, France

Willie E, Yang P, Patrick E. The impact of similarity metrics on cell-type clustering in highly multiplexed in situ imaging cytometry data. BIOINFORMATICS ADVANCES 2023;3:vbad141. [PMID: 37928340 PMCID: PMC10625459 DOI: 10.1093/bioadv/vbad141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Revised: 08/23/2023] [Accepted: 10/07/2023] [Indexed: 11/07/2023]

Abstract

Motivation

The advent of highly multiplexed in situ imaging cytometry assays has revolutionized the study of cellular systems, offering unparalleled detail in observing cellular activities and characteristics. These assays provide comprehensive insights by concurrently profiling the spatial distribution and molecular features of numerous cells. In navigating this complex data landscape, unsupervised machine learning techniques, particularly clustering algorithms, have become essential tools. They enable the identification and categorization of cell types and subsets based on their molecular characteristics. Despite their widespread adoption, most clustering algorithms in use were initially developed for cell suspension technologies, leading to a potential mismatch in application. There is a critical gap in the systematic evaluation of these methods, particularly in determining the properties that make them optimal for in situ imaging assays. Addressing this gap is vital for ensuring accurate, reliable analyses and fostering advancements in cellular biology research.

Results

In our extensive investigation, we evaluated a range of similarity metrics, which are crucial in determining the relationships between cells during the clustering process. Our findings reveal substantial variations in clustering performance, contingent on the similarity metric employed. These variations underscore the importance of selecting appropriate metrics to ensure accurate cell type and subset identification. In response to these challenges, we introduce FuseSOM, a novel ensemble clustering algorithm that integrates hierarchical multiview learning of similarity metrics with self-organizing maps. Through a rigorous stratified subsampling analysis framework, we demonstrate that FuseSOM outperforms existing best-practice clustering methods specifically tailored for in situ imaging cytometry data. Our work not only provides critical insights into the performance of clustering algorithms in this novel context but also offers a robust solution, paving the way for more accurate and reliable in situ imaging cytometry data analysis.

Availability and implementation

The FuseSOM R package is available on Bioconductor under the GPL-3 license. All code for the analyses performed can be found on GitHub.
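
The point made in the Results above, that the choice of similarity metric changes which cells are treated as alike, can be illustrated with a minimal sketch. This is not FuseSOM and does not use its API; the two-marker intensity vectors and the names `query`, `candidates`, and `nearest` are hypothetical and chosen only for illustration.

```python
import math

def euclidean(u, v):
    """Straight-line distance; sensitive to overall intensity."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """1 - cosine similarity; compares marker profile shape, ignoring magnitude."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def nearest(query, candidates, metric):
    """Return the name of the candidate closest to `query` under `metric`."""
    return min(candidates, key=lambda name: metric(query, candidates[name]))

# Hypothetical two-marker intensities for three cells (assumed toy data).
query = (1.0, 1.0)
candidates = {
    "bright_same_profile": (10.0, 10.0),  # same marker ratio, higher intensity
    "close_other_profile": (2.0, 0.5),    # similar intensity, different ratio
}

print(nearest(query, candidates, euclidean))
print(nearest(query, candidates, cosine_distance))
```

Under Euclidean distance the query cell is paired with the cell of similar intensity, whereas under cosine distance it is paired with the cell that has the same marker ratio, so a clustering built on either metric would group these cells differently.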

Hui X, Wang Y, Li W, Yuan Y, Tao X, Lv R. Nd-Mn Molecular Cluster with Searched Targets for Oral Cancer Imaging. Mol Imaging Biol 2023;25:875-886. [PMID: 37256508 DOI: 10.1007/s11307-023-01828-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2023] [Revised: 04/10/2023] [Accepted: 05/11/2023] [Indexed: 06/01/2023]

Meng R, Yin S, Sun J, Hu H, Zhao Q. scAAGA: Single cell data analysis framework using asymmetric autoencoder with gene attention. Comput Biol Med 2023;165:107414. [PMID: 37660567 DOI: 10.1016/j.compbiomed.2023.107414] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2023] [Revised: 08/02/2023] [Accepted: 08/28/2023] [Indexed: 09/05/2023]

Zeibich R, Kwan P, O’Brien TJ, Perucca P, Ge Z, Anderson A. Applications for Deep Learning in Epilepsy Genetic Research. Int J Mol Sci 2023;24:14645. [PMID: 37834093 PMCID: PMC10572791 DOI: 10.3390/ijms241914645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Revised: 09/11/2023] [Accepted: 09/21/2023] [Indexed: 10/15/2023] Open

Affiliation(s)

Robert Zeibich Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia
Patrick Kwan Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
Terence J. O’Brien Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
Piero Perucca Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia Epilepsy Research Centre, Department of Medicine, Austin Health, The University of Melbourne, Melbourne, VIC 3084, Australia Bladin-Berkovic Comprehensive Epilepsy Program, Department of Neurology, Austin Health, The University of Melbourne, Melbourne, VIC 3084, Australia
Zongyuan Ge Faculty of Engineering, Monash University, Melbourne, VIC 3800, Australia; Monash-Airdoc Research, Monash University, Melbourne, VIC 3800, Australia
Alison Anderson Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia

Karim MR, Islam T, Shajalal M, Beyan O, Lange C, Cochez M, Rebholz-Schuhmann D, Decker S. Explainable AI for Bioinformatics: Methods, Tools and Applications. Brief Bioinform 2023;24:bbad236. [PMID: 37478371 DOI: 10.1093/bib/bbad236] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2023] [Revised: 05/10/2023] [Accepted: 05/26/2023] [Indexed: 07/23/2023] Open

Abstract

Artificial intelligence (AI) systems utilizing deep neural networks and machine learning (ML) algorithms are widely used for solving critical problems in bioinformatics, biomedical informatics and precision medicine. However, complex ML models that are often perceived as opaque and black-box methods make it difficult to understand the reasoning behind their decisions. This lack of transparency can be a challenge for both end-users and decision-makers, as well as AI developers. In sensitive areas such as healthcare, explainability and accountability are not only desirable properties but also legally required for AI systems that can have a significant impact on human lives. Fairness is another growing concern, as algorithmic decisions should not show bias or discrimination towards certain groups or individuals based on sensitive attributes. Explainable AI (XAI) aims to overcome the opaqueness of black-box models and to provide transparency in how AI systems make decisions. Interpretable ML models can explain how they make predictions and identify factors that influence their outcomes. However, the majority of the state-of-the-art interpretable ML methods are domain-agnostic and have evolved from fields such as computer vision, automated reasoning or statistics, making direct application to bioinformatics problems challenging without customization and domain adaptation. In this paper, we discuss the importance of explainability and algorithmic transparency in the context of bioinformatics. We provide an overview of model-specific and model-agnostic interpretable ML methods and tools and outline their potential limitations. We discuss how existing interpretable ML methods can be customized and fit to bioinformatics research problems. Further, through case studies in bioimaging, cancer genomics and text mining, we demonstrate how XAI methods can improve transparency and decision fairness. 
Our review aims at providing valuable insights and serving as a starting point for researchers wanting to enhance explainability and decision transparency while solving bioinformatics problems. GitHub: https://github.com/rezacsedu/XAI-for-bioinformatics.

Gao CX, Dwyer D, Zhu Y, Smith CL, Du L, Filia KM, Bayer J, Menssink JM, Wang T, Bergmeir C, Wood S, Cotton SM. An overview of clustering methods with guidelines for application in mental health research. Psychiatry Res 2023;327:115265. [PMID: 37348404 DOI: 10.1016/j.psychres.2023.115265] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/15/2022] [Revised: 05/20/2023] [Accepted: 05/21/2023] [Indexed: 06/24/2023]

Vora LK, Gholap AD, Jetha K, Thakur RRS, Solanki HK, Chavda VP. Artificial Intelligence in Pharmaceutical Technology and Drug Delivery Design. Pharmaceutics 2023;15:1916. [PMID: 37514102 PMCID: PMC10385763 DOI: 10.3390/pharmaceutics15071916] [Citation(s) in RCA: 193] [Impact Index Per Article: 96.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2023] [Revised: 06/28/2023] [Accepted: 07/04/2023] [Indexed: 07/30/2023] Open

Kalweit M, Burden AM, Boedecker J, Hügle T, Burkard T. Patient groups in Rheumatoid arthritis identified by deep learning respond differently to biologic or targeted synthetic DMARDs. PLoS Comput Biol 2023;19:e1011073. [PMID: 37267387 DOI: 10.1371/journal.pcbi.1011073] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Accepted: 04/04/2023] [Indexed: 06/04/2023] Open

Abstract

Cycling of biologic or targeted synthetic disease-modifying antirheumatic drugs (b/tsDMARDs) in rheumatoid arthritis (RA) patients due to non-response is a problem that prevents and delays disease control. We aimed to assess and validate treatment response of b/tsDMARDs among clusters of RA patients identified by deep learning. We clustered RA patients at first-time b/tsDMARD (cohort entry) in the Swiss Clinical Quality Management in Rheumatic Diseases registry (SCQM) [1999-2018]. We performed comparative effectiveness analyses of b/tsDMARDs (ref. adalimumab) using Cox proportional hazard regression. Within 15 months, we assessed b/tsDMARD stop due to non-response, and separately a ≥20% reduction in DAS28-esr as a response proxy. We validated results through stratified analyses according to the most distinctive patient characteristics of clusters. Clusters comprised between 362 and 1481 patients (3516 unique patients). Stratified (validation) analyses confirmed comparative effectiveness results among clusters: Patients with ≥2 conventional synthetic DMARDs and prednisone at b/tsDMARD initiation, male patients, as well as patients with a lower disease burden responded better to tocilizumab than to adalimumab (hazard ratio [HR] 5.46, 95% confidence interval [CI] [1.76-16.94], and HR 8.44 [3.43-20.74], and HR 3.64 [2.04-6.49], respectively). Furthermore, seronegative women without use of prednisone at b/tsDMARD initiation as well as seropositive women with a higher disease burden and longer disease duration had a higher risk of non-response with golimumab (HR 2.36 [1.03-5.40] and HR 5.27 [2.10-13.21], respectively) than with adalimumab. Our results suggest that RA patient clusters identified by deep learning may have different responses to first-line b/tsDMARD. Such clustering may therefore help suggest the optimal first-line b/tsDMARD for certain RA patients, which is a step towards personalizing treatment. However, further research in other cohorts is needed to verify our results.
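
The response proxy named in the abstract above (a ≥20% reduction in DAS28-esr) could be computed along the following lines. This is only a sketch: the abstract does not state the exact formula, so the relative-reduction-from-baseline definition, the function name `is_responder`, and the example scores are all assumptions.

```python
def is_responder(baseline, follow_up, threshold=0.20):
    """Return True if the score fell by at least `threshold` relative to baseline.

    Assumption: "reduction" means (baseline - follow_up) / baseline; the
    abstract does not spell out the formula.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    relative_reduction = (baseline - follow_up) / baseline
    return relative_reduction >= threshold

# Hypothetical DAS28-esr values, not taken from the study.
print(is_responder(5.0, 3.9))  # 22% reduction
print(is_responder(5.0, 4.5))  # 10% reduction
```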

Zhang H, Kong W, Xie Y, Zhao X, Luo D, Chen S, Pan Z. Telomere-related genes as potential biomarkers to predict endometriosis and immune response: Development of a machine learning-based risk model. Front Med (Lausanne) 2023;10:1132676. [PMID: 36968845 PMCID: PMC10034389 DOI: 10.3389/fmed.2023.1132676] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2023] [Accepted: 02/20/2023] [Indexed: 03/11/2023] Open

Hernández-Hernández S, Ballester PJ. On the Best Way to Cluster NCI-60 Molecules. Biomolecules 2023;13:biom13030498. [PMID: 36979433 PMCID: PMC10046274 DOI: 10.3390/biom13030498] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2023] [Revised: 03/02/2023] [Accepted: 03/06/2023] [Indexed: 03/30/2023] Open

Nguyen R, Sokhansanj BA, Polikar R, Rosen GL. Complet+: a computationally scalable method to improve completeness of large-scale protein sequence clustering. PeerJ 2023;11:e14779. [PMID: 36785708 PMCID: PMC9921987 DOI: 10.7717/peerj.14779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2022] [Accepted: 01/03/2023] [Indexed: 02/10/2023] Open

Alharbi F, Vakanski A. Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review. Bioengineering (Basel) 2023;10:bioengineering10020173. [PMID: 36829667 PMCID: PMC9952758 DOI: 10.3390/bioengineering10020173] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023] Open

Gorla A, Sankararaman S, Burchard E, Flint J, Zaitlen N, Rahmani E. Phenotypic subtyping via contrastive learning. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.05.522921. [PMID: 36711575 PMCID: PMC9881932 DOI: 10.1101/2023.01.05.522921] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

Abstract

Defining and accounting for subphenotypic structure has the potential to increase statistical power and provide a deeper understanding of the heterogeneity in the molecular basis of complex disease. Existing phenotype subtyping methods primarily rely on clinically observed heterogeneity or metadata clustering. However, they generally tend to capture the dominant sources of variation in the data, which often originate from variation that is not descriptive of the mechanistic heterogeneity of the phenotype of interest; in fact, such dominant sources of variation, such as population structure or technical variation, are, in general, expected to be independent of subphenotypic structure. We instead aim to find a subspace with signal that is unique to a group of samples for which we believe that subphenotypic variation exists (e.g., cases of a disease). To that end, we introduce Phenotype Aware Components Analysis (PACA), a contrastive learning approach leveraging canonical correlation analysis to robustly capture weak sources of subphenotypic variation. In the context of disease, PACA learns a gradient of variation unique to cases in a given dataset, while leveraging control samples to account for variation and imbalances of biological and technical confounders between cases and controls. We evaluated PACA using an extensive simulation study, as well as on various subtyping tasks using genotypes, transcriptomics, and DNA methylation data. Our results provide multiple lines of strong evidence that PACA allows us to robustly capture weak unknown variation of interest while being calibrated and well-powered, far exceeding the performance of alternative methods. This makes PACA a state-of-the-art tool for defining de novo subtypes that are more likely to reflect molecular heterogeneity, especially in challenging cases where the phenotypic heterogeneity may be masked by a myriad of strong unrelated effects in the data.

Wu H, Zeng R, Qiu X, Chen K, Zhuo Z, Guo K, Xiang Y, Yang Q, Jiang R, Leung FW, Lian Q, Sha W, Chen H. Investigating regulatory patterns of NLRP3 Inflammasome features and association with immune microenvironment in Crohn's disease. Front Immunol 2023;13:1096587. [PMID: 36685554 PMCID: PMC9849378 DOI: 10.3389/fimmu.2022.1096587] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2022] [Accepted: 12/02/2022] [Indexed: 01/06/2023] Open

Abstract

INTRODUCTION

Crohn's disease is characterized by dysregulated inflammatory and immune reactions. The role of the NOD-like receptor family, pyrin domain-containing 3 (NLRP3) inflammasome in Crohn's disease remains largely unknown.

METHODS

The microarray-based transcriptomic data and corresponding clinical information of GSE100833 and GSE16879 were obtained from the Gene Expression Omnibus (GEO) database. NLRP3 inflammasome-related genes were identified and a LASSO regression model was constructed. The immune landscape was evaluated with ssGSEA. Crohn's disease samples were classified based on NLRP3 inflammasome-related genes with ConsensusClusterPlus. Functional enrichment analysis, gene set variation analysis (GSVA) and drug-gene interaction network analysis were also performed.

RESULTS

Expression of NLRP3 inflammasome-related genes was increased in diseased tissues, and higher expression of these genes was correlated with generally enhanced immune cell infiltration, immune-related pathways and human leukocyte antigen (HLA) gene expression. The gene-based signature performed well in the diagnosis of Crohn's disease. Moreover, consensus clustering identified two Crohn's disease clusters based on NLRP3 inflammasome-related genes, with cluster 2 showing higher expression of these genes. Cluster 2 also demonstrated upregulated activity of the immune environment in Crohn's disease. Furthermore, four key hub genes were identified and potential drugs were explored for the treatment of Crohn's disease.

CONCLUSIONS

Our findings indicate that the NLRP3 inflammasome and its related genes could regulate immune cells and responses and may be involved in the pathogenesis of Crohn's disease at the transcriptomic level. These findings provide in silico insights into the diagnosis and treatment of Crohn's disease and might assist in the clinical decision-making process.
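
The consensus clustering step mentioned in the Methods can be pictured through its central object, the consensus matrix: for each pair of samples, the fraction of resampled clustering runs in which both were drawn and placed in the same cluster. The sketch below is not the ConsensusClusterPlus R package and does not reproduce its resampling or clustering; the function name `consensus_matrix` and the four toy runs are hypothetical.

```python
from itertools import combinations

def consensus_matrix(runs, n_items):
    """Build a consensus matrix from several clustering runs.

    `runs` is a list of dicts mapping item index -> cluster label; each dict
    holds only the items sampled in that run. Entry [i][j] is the share of
    runs containing both i and j in which they received the same label.
    """
    together = [[0] * n_items for _ in range(n_items)]
    sampled = [[0] * n_items for _ in range(n_items)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    matrix = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                matrix[i][j] = 1.0
            elif sampled[i][j]:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix

# Four hypothetical subsampled runs over four samples (labels are arbitrary per run).
runs = [
    {0: "a", 1: "a", 2: "b", 3: "b"},
    {0: "x", 1: "x", 2: "y"},
    {1: "p", 2: "q", 3: "q"},
    {0: "m", 1: "n", 3: "n"},
]
m = consensus_matrix(runs, 4)
for row in m:
    print([round(value, 2) for value in row])
```

Pairs with values near 1 were grouped together consistently across runs; a final partition is typically obtained by clustering this matrix, which the sketch leaves out.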

Affiliation(s)

Huihuan Wu Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China School of Medicine, South China University of Technology, Guangzhou, China
Ruijie Zeng Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China School of Medicine, Shantou University Medical College, Shantou, China
Xinqi Qiu Zhuguang Community Healthcare Center, Guangzhou, China
Kequan Chen Department of Gastroenterology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
Zewei Zhuo Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
Kehang Guo Department of Critical Care Medicine, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou, China
Yawen Xiang Edinburgh Medical School, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, United Kingdom
Qi Yang Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
Rui Jiang School of Medicine, South China University of Technology, Guangzhou, China
Felix W. Leung David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, United States
Qizhou Lian Department of Medicine, Queen Mary Hospital, Hong Kong, Hong Kong SAR, China
Weihong Sha Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China School of Medicine, South China University of Technology, Guangzhou, China
Hao Chen Department of Gastroenterology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China School of Medicine, South China University of Technology, Guangzhou, China

Sun J, Huang Q. Two stages biclustering with three populations. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

Sun J, Liu Q, Wang Y, Wang L, Song X, Zhao X. Five-year prognosis model of esophageal cancer based on genetic algorithm improved deep neural network. Ing Rech Biomed 2023. [DOI: 10.1016/j.irbm.2022.100748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]

Johnson AC, Silva JAF, Kim SC, Larsen CP. Progress in kidney transplantation: The role for systems immunology. Front Med (Lausanne) 2022;9:1070385. [PMID: 36590970 PMCID: PMC9800623 DOI: 10.3389/fmed.2022.1070385] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2022] [Accepted: 11/16/2022] [Indexed: 12/23/2022] Open

Sherif FF, Ahmed KS. Unsupervised clustering of SARS-CoV-2 using deep convolutional autoencoder. JOURNAL OF ENGINEERING AND APPLIED SCIENCE 2022. [PMCID: PMC9383682 DOI: 10.1186/s44147-022-00125-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Abstract

Identifying the population structure of SARS-CoV-2 could have a substantial impact on public health management and diagnostics. It is critical to rapidly monitor and characterize its lineages circulating globally for a more accurate diagnosis, improved care, and faster treatment. For a clearer picture of the SARS-CoV-2 population structure, clustering the sequencing data is essential. Here, deep clustering techniques were used to automatically group 29,017 different strains of SARS-CoV-2 into clusters. We aim to identify the main clusters of the SARS-CoV-2 population structure based on a convolutional autoencoder (CAE) trained with numerical feature vectors mapped from coronavirus Spike peptide sequences. Our clustering findings revealed that there are six large SARS-CoV-2 population clusters (C1, C2, C3, C4, C5, C6). These clusters contained 43 unique lineages in which the 29,017 publicly accessible strains were dispersed. In all six resulting clusters, the genetic distances within the same cluster (intra-cluster distances) are smaller than the distances between clusters (inter-cluster distances) (P-value 0.0019, Wilcoxon rank-sum test). This provides substantial evidence of a connection between the lineages within each cluster. Furthermore, the K-means and hierarchical clustering methods were compared with the proposed deep learning clustering method. The intra-cluster genetic distances of the proposed method were smaller than those of K-means alone and hierarchical clustering methods. We used t-distributed stochastic neighbor embedding (t-SNE) to show the outcomes of the deep learning clustering. The strains were correctly separated between clusters in the t-SNE plot. Our results showed that cluster C5 includes only the Gamma lineage (P.1), suggesting that strains of P.1 in C5 are more diversified than those in the other clusters.
Our study indicates that the genetic similarity between strains in the same cluster enables a better understanding of the major features of the unknown population lineages when compared to some of the more prevalent viral isolates. This information helps researchers figure out how the virus changed over time and spread to people all over the world.
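
The intra-cluster versus inter-cluster comparison described in the abstract above can be sketched as follows. The study uses genetic distances between strains and a Wilcoxon rank-sum test; this toy example substitutes Euclidean distance on made-up 2-D points and compares only the means, so the function name `mean_intra_inter`, the points, and the labels are all assumptions.

```python
import math
from itertools import combinations

def mean_intra_inter(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance).

    Every pair of points is assigned to the intra list when both share a
    label and to the inter list otherwise.
    """
    intra, inter = [], []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        (intra if labels[i] == labels[j] else inter).append(d)
    return sum(intra) / len(intra), sum(inter) / len(inter)

# Hypothetical embedded strains: two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [0, 0, 1, 1]

intra_mean, inter_mean = mean_intra_inter(points, labels)
print(intra_mean, inter_mean)
```

A well-separated clustering yields an intra-cluster mean well below the inter-cluster mean; a significance test on the two distance lists, as in the study, would be a separate step.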
