Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

60
(from Reference Citation Analysis)

Article PDFs (26)

Cited by > 0 (38)

Searched Name

Unsupervised clustering

Jiang J, Zheng P, Li L. Identification of Prognostic and Immune Characteristics of Two Lung Adenocarcinoma Subtypes Based on TRPV Channel Family Genes. J Membr Biol 2024;257:115-129. [PMID: 38150051 DOI: 10.1007/s00232-023-00300-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 11/21/2023] [Indexed: 12/28/2023]

Crombé A, Lecomte JC, Seux M, Banaste N, Gorincour G. Using the Textual Content of Radiological Reports to Detect Emerging Diseases: A Proof-of-Concept Study of COVID-19. J Imaging Inform Med 2024;37:620-632. [PMID: 38343242 PMCID: PMC11031522 DOI: 10.1007/s10278-023-00949-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 10/02/2023] [Accepted: 10/04/2023] [Indexed: 04/20/2024]

Abstract

Changes in the content of radiological reports at population level could detect emerging diseases. Herein, we developed a method to quantify similarities in consecutive temporal groupings of radiological reports using natural language processing, and we investigated whether the appearance of dissimilarities between consecutive periods correlated with the beginning of the COVID-19 pandemic in France. CT reports from 67,368 consecutive adults across 62 emergency departments throughout France between October 2019 and March 2020 were collected. Reports were vectorized using term frequency-inverse document frequency (TF-IDF) analysis on one-grams. For each successive 2-week period, we performed unsupervised clustering of the reports based on TF-IDF values and partition-around-medoids. Next, we assessed the similarities between this clustering and a clustering from two weeks before according to the average adjusted Rand index (AARI). Statistical analyses included (1) cross-correlation functions (CCFs) with the number of positive SARS-CoV-2 tests and advanced sanitary index for flu syndromes (ASI-flu, from an open-source dataset), and (2) linear regressions of time series at different lags to understand the variations of AARI over time. Overall, 13,235 chest CT reports were analyzed. AARI was correlated with ASI-flu at lag = +1, +5, and +6 weeks (P = 0.0454, 0.0121, and 0.0042, respectively) and with SARS-CoV-2 positive tests at lag = -1 and 0 weeks (P = 0.0057 and 0.0001, respectively). In the best fit, AARI correlated with the ASI-flu with a lag of 2 weeks (P = 0.0026), SARS-CoV-2-positive tests in the same week (P < 0.0001) and their interaction (P < 0.0001) (adjusted R² = 0.921). Thus, our method enables the automatic monitoring of changes in radiological reports and could help capture disease emergence.
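
The two building blocks named in this abstract, unigram TF-IDF vectorization and the adjusted Rand index, can be sketched in plain Python. This is only an illustration under assumptions: the function names `tfidf_vectors` and `adjusted_rand_index` are hypothetical, the TF-IDF weighting shown is one common variant, and the partition-around-medoids step and the way the authors averaged the index are not reproduced here.

```python
import math
from collections import Counter


def tfidf_vectors(reports):
    """Return one {term: weight} dict per report, using unigram TF-IDF."""
    tokenized = [report.lower().split() for report in reports]
    n_docs = len(tokenized)
    # Document frequency: number of reports that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_a)
    if n < 2:
        return 1.0
    # Pair counts from the contingency table and from its margins.
    contingency = Counter(zip(labels_a, labels_b))
    sum_cells = sum(math.comb(c, 2) for c in contingency.values())
    sum_rows = sum(math.comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(math.comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_rows * sum_cols / math.comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case: both labelings are trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

An index of 1 means two clusterings agree up to a relabeling, and values near 0 mean agreement no better than chance, which is why a drop between consecutive periods can signal a change in report content.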

Wang X, Rao J, Zhang L, Liu X, Zhang Y. Identification of circadian rhythm-related gene classification patterns and immune infiltration analysis in heart failure based on machine learning. Heliyon 2024;10:e27049. [PMID: 38509983 PMCID: PMC10950509 DOI: 10.1016/j.heliyon.2024.e27049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 12/17/2023] [Accepted: 02/22/2024] [Indexed: 03/22/2024] Open

Abstract

Background

Circadian rhythms play a key role in the failing heart, but the exact molecular mechanisms linking changes in the expression of circadian rhythm-related genes to heart failure (HF) remain unclear.

Methods

By intersecting differentially expressed genes (DEGs) between normal and HF samples in the Gene Expression Omnibus (GEO) database with circadian rhythm-related genes (CRGs), differentially expressed circadian rhythm-related genes (DE-CRGs) were obtained. Machine learning algorithms were used to screen for feature genes, and diagnostic models were constructed based on these feature genes. Subsequently, consensus clustering algorithms and non-negative matrix factorization (NMF) algorithms were used for clustering analysis of HF samples. On this basis, immune infiltration analysis was used to score the immune infiltration status between HF and normal samples as well as among different subclusters. Gene Set Variation Analysis (GSVA) evaluated the biological functional differences among subclusters.
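
The idea behind consensus clustering can be sketched as follows: cluster many random subsamples and record how often each pair of samples lands in the same cluster. This is a minimal illustration, not the authors' pipeline. The base clusterer (plain k-means), the 80% subsampling fraction, the number of runs, and the function names `kmeans_labels` and `consensus_matrix` are all assumptions, and the sketch assumes distinct input points.

```python
import random


def kmeans_labels(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def consensus_matrix(points, k, n_runs=50, frac=0.8, seed=0):
    """Fraction of co-sampled runs in which each pair shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
```

Entries close to 1 or 0 indicate pairs that are grouped consistently across resamples; a matrix dominated by such values is usually read as evidence that the chosen number of clusters is stable.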

Results

13 CRGs showed differential expression between HF patients and normal samples. Nine feature genes were obtained through cross-referencing results from four distinct machine learning algorithms. Multivariate LASSO regression and external dataset validation were performed to select five key genes with diagnostic value, including NAMPT, SERPINA3, MAPK10, NPPA, and SLC2A1. Moreover, consensus clustering analysis could divide HF patients into two distinct clusters, which exhibited different biological functions and immune characteristics. Additionally, two subgroups were distinguished using the NMF algorithm based on circadian rhythm associated differentially expressed genes. Studies on immune infiltration showed marked variances in levels of immune infiltration between these subgroups. Subgroup A had higher immune scores and more widespread immune infiltration. Finally, the Weighted Gene Co-expression Network Analysis (WGCNA) method was utilized to discern the modules that had the closest association with the two observed subgroups, and hub genes were pinpointed via protein-protein interaction (PPI) networks. GRIN2A, DLG1, ERBB4, LRRC7, and NRG1 were circadian rhythm-related hub genes closely associated with HF.

Conclusion

This study provides valuable references for further elucidating the pathogenesis of HF and offers beneficial insights for targeting circadian rhythm mechanisms to regulate immune responses and energy metabolism in HF treatment. The five genes we identified as diagnostic features could be potential therapeutic targets for HF.

Bushra AA, Kim D, Kan Y, Yi G. AutoSCAN: automatic detection of DBSCAN parameters and efficient clustering of data in overlapping density regions. PeerJ Comput Sci 2024;10:e1921. [PMID: 38660211 PMCID: PMC11042006 DOI: 10.7717/peerj-cs.1921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Accepted: 02/12/2024] [Indexed: 04/26/2024]

Li D, Li X, Lv J, Li S. Creation of signatures and identification of molecular subtypes based on disulfidptosis-related genes for glioblastoma patients' prognosis and immunological activity. Asian J Surg 2024:S1015-9584(24)00299-9. [PMID: 38462406 DOI: 10.1016/j.asjsur.2024.02.041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 12/23/2023] [Accepted: 02/02/2024] [Indexed: 03/12/2024] Open

Zhuang X, Moshi MA, Quinones O, Trenholm RA, Chang CL, Cordes D, Vanderford BJ, Vo V, Gerrity D, Oh EC. Spatial and Temporal Drug Usage Patterns in Wastewater Correlate with Socioeconomic and Demographic Indicators in Southern Nevada. medRxiv 2024:2024.02.02.24302241. [PMID: 38352613 PMCID: PMC10863018 DOI: 10.1101/2024.02.02.24302241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2024]

Affiliation(s)

Xiaowei Zhuang Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154 Neuroscience Interdisciplinary Ph.D. program, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154 Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, NV
Michael A. Moshi Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154 Neuroscience Interdisciplinary Ph.D. program, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154
Oscar Quinones Applied Research and Development Center, Southern Nevada Water Authority, P.O. Box 99954, Las Vegas NV, 89193, USA
Rebecca A. Trenholm Applied Research and Development Center, Southern Nevada Water Authority, P.O. Box 99954, Las Vegas NV, 89193, USA
Ching-Lan Chang Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154 Neuroscience Interdisciplinary Ph.D. program, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154
Dietmar Cordes Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, NV
Brett J. Vanderford Applied Research and Development Center, Southern Nevada Water Authority, P.O. Box 99954, Las Vegas NV, 89193, USA
Van Vo Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154
Daniel Gerrity Applied Research and Development Center, Southern Nevada Water Authority, P.O. Box 99954, Las Vegas NV, 89193, USA
Edwin C. Oh Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154 Neuroscience Interdisciplinary Ph.D. program, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154 Department of Brain Health, Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154 Department of Internal Medicine, Kirk Kerkorian School of Medicine at UNLV, University of Nevada Las Vegas, Las Vegas, NV 89154

Chang H, Ashlock DA, Graether SP, Keller SM. Anchor Clustering for million-scale immune repertoire sequencing data. BMC Bioinformatics 2024;25:42. [PMID: 38273275 PMCID: PMC10809746 DOI: 10.1186/s12859-024-05659-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 01/16/2024] [Indexed: 01/27/2024] Open

Chen S, Li X, Ao W. Prognostic and immune infiltration features of disulfidptosis-related subtypes in breast cancer. BMC Womens Health 2024;24:6. [PMID: 38166898 PMCID: PMC10763228 DOI: 10.1186/s12905-023-02823-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 12/01/2023] [Indexed: 01/05/2024] Open

Gharbi-Meliani A, Husson F, Vandendriessche H, Bayen E, Yaffe K, Bachoud-Lévi AC, Cleret de Langavant L. Identification of high likelihood of dementia in population-based surveys using unsupervised clustering: a longitudinal analysis. Alzheimers Res Ther 2023;15:209. [PMID: 38031083 PMCID: PMC10688099 DOI: 10.1186/s13195-023-01357-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/23/2023] [Accepted: 11/21/2023] [Indexed: 12/01/2023]

Abstract

BACKGROUND

Dementia is defined as a cognitive decline that affects functional status. Longitudinal ageing surveys often lack a clinical diagnosis of dementia, although they measure cognition and daily function over time. We used unsupervised machine learning and longitudinal data to identify transition to probable dementia.

METHODS

Multiple Factor Analysis was applied to longitudinal function and cognitive data of 15,278 baseline participants (aged 50 years and more) from the Survey of Health, Ageing, and Retirement in Europe (SHARE) (waves 1, 2 and 4-7, between 2004 and 2017). Hierarchical Clustering on Principal Components discriminated three clusters at each wave. We estimated probable or "Likely Dementia" prevalence by sex and age, and assessed whether dementia risk factors increased the risk of being assigned probable dementia status using multistate models. Next, we compared the "Likely Dementia" cluster with self-reported dementia status and replicated our findings in the English Longitudinal Study of Ageing (ELSA) cohort (waves 1-9, between 2002 and 2019, 7840 participants at baseline).
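
The clustering step, cutting a hierarchy into three clusters, can be illustrated with a naive agglomerative procedure. This is a sketch under assumptions: Ward's merge criterion is used here because it is a common choice for clustering on principal components, but the abstract does not name the linkage. The function name `ward_clusters` is hypothetical, and the factor-analysis step that would produce the input coordinates is omitted.

```python
def ward_clusters(points, n_clusters):
    """Merge clusters greedily by Ward's criterion until n_clusters remain.

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    n_dims = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(n_dims)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda pair: merge_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

This version recomputes every pairwise cost at each step, so it is only suitable for small examples; survey-scale data would need an optimized library implementation.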

RESULTS

Our algorithm identified a higher number of probable dementia cases compared with self-reported cases and showed good discriminative power across all waves (AUC ranged from 0.754 [0.722-0.787] to 0.830 [0.800-0.861]). "Likely Dementia" status was more prevalent in older people, displayed a 2:1 female/male ratio, and was associated with nine factors that increased risk of transition to dementia: low education, hearing loss, hypertension, drinking, smoking, depression, social isolation, physical inactivity, diabetes, and obesity. Results were replicated in ELSA cohort with good accuracy.
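
The AUC values reported above measure how well a score separates cases from non-cases. A minimal sketch of the rank-based definition follows; the function name `auc` and the toy inputs in the usage note are illustrative and are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1) scores
    higher than a randomly chosen negative (label 0); ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

For example, `auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` compares four positive-negative pairs, three of which are ordered correctly, giving 0.75.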

CONCLUSIONS

Machine learning clustering can be used to study dementia determinants and outcomes in longitudinal population ageing surveys in which dementia clinical diagnosis is lacking.

Affiliation(s)

Amin Gharbi-Meliani Neuropsychologie Interventionnelle, U955 E01, Institut Mondor de Recherche Biomédicale & Département d'études Cognitives, INSERM, Ecole Normale Supérieure, Université PSL, Université Paris-Est Créteil, Creteil, 94000, France
François Husson Institut Agro, Univ Rennes1, CNRS, IRMAR, Rennes, 35000, France
Henri Vandendriessche Laboratoire de Neurosciences Cognitives et Computationnelles, Département d'études Cognitives, Ecole Normale Supérieure, Université PSL, INSERM, Paris, 75005, France
Eleonore Bayen Département de Rééducation Neurologique, Sorbonne Université, Hôpital Pitié-Salpêtrière-Assistance Publique Hôpitaux de Paris, Paris, 75013, France Global Brain Health Institute, University of California, San Francisco, CA, 94143, USA
Kristine Yaffe Global Brain Health Institute, University of California, San Francisco, CA, 94143, USA Departments of Psychiatry, Neurology and Epidemiology and Biostatistics, University of California, San Francisco, CA, 94143, USA
Anne-Catherine Bachoud-Lévi Neuropsychologie Interventionnelle, U955 E01, Institut Mondor de Recherche Biomédicale & Département d'études Cognitives, INSERM, Ecole Normale Supérieure, Université PSL, Université Paris-Est Créteil, Creteil, 94000, France Service de Neurologie, Centre de référence maladie de Huntington, Hôpital Henri Mondor, Assistance Publique Hôpitaux de Paris, 1 rue Gustave Eiffel, Creteil, 94000, France
Laurent Cleret de Langavant Neuropsychologie Interventionnelle, U955 E01, Institut Mondor de Recherche Biomédicale & Département d'études Cognitives, INSERM, Ecole Normale Supérieure, Université PSL, Université Paris-Est Créteil, Creteil, 94000, France. Global Brain Health Institute, University of California, San Francisco, CA, 94143, USA. Service de Neurologie, Centre de référence maladie de Huntington, Hôpital Henri Mondor, Assistance Publique Hôpitaux de Paris, 1 rue Gustave Eiffel, Creteil, 94000, France.

Liu L, Han L, Dong L, He Z, Gao K, Chen X, Guo JC, Zhao Y. The hypoxia-associated genes in immune infiltration and treatment options of lung adenocarcinoma. PeerJ 2023;11:e15621. [PMID: 37576511 PMCID: PMC10414028 DOI: 10.7717/peerj.15621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 06/01/2023] [Indexed: 08/15/2023] Open

Pan W, Long F, Pan J. ScInfoVAE: interpretable dimensional reduction of single cell transcription data with variational autoencoders and extended mutual information regularization. BioData Min 2023;16:17. [PMID: 37301826 DOI: 10.1186/s13040-023-00333-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 06/05/2023] [Indexed: 06/12/2023] Open

Tang L, Lei X, Hu H, Li Z, Zhu H, Zhan W, Zhang T. Investigation of fatty acid metabolism-related genes in breast cancer: Implications for Immunotherapy and clinical significance. Transl Oncol 2023;34:101700. [PMID: 37247503 DOI: 10.1016/j.tranon.2023.101700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 05/21/2023] [Indexed: 05/31/2023] Open

Affiliation(s)

Liyang Tang School of Pharmacy, Hengyang Medical College, University of South China, 28 Western Changsheng Road, Hengyang, Hunan 421001, China; The First Affiliated Hospital, Department of Pharmacy, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China; The First Affiliated Hospital, Chinese Traditional Medicine(TCM) research platform of major Epidemic Treatment base, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China
Xiaoyong Lei School of Pharmacy, Hengyang Medical College, University of South China, 28 Western Changsheng Road, Hengyang, Hunan 421001, China
Haihong Hu School of Pharmacy, Hengyang Medical College, University of South China, 28 Western Changsheng Road, Hengyang, Hunan 421001, China; The First Affiliated Hospital, Department of Pharmacy, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China; The First Affiliated Hospital, Chinese Traditional Medicine(TCM) research platform of major Epidemic Treatment base, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China
Zhuo Li School of Pharmacy, Hengyang Medical College, University of South China, 28 Western Changsheng Road, Hengyang, Hunan 421001, China; The First Affiliated Hospital, Department of Pharmacy, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China; The First Affiliated Hospital, Chinese Traditional Medicine(TCM) research platform of major Epidemic Treatment base, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China
Hongxia Zhu School of Pharmacy, Hengyang Medical College, University of South China, 28 Western Changsheng Road, Hengyang, Hunan 421001, China; The First Affiliated Hospital, Department of Pharmacy, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China; The First Affiliated Hospital, Chinese Traditional Medicine(TCM) research platform of major Epidemic Treatment base, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China
Wendi Zhan School of Pharmacy, Hengyang Medical College, University of South China, 28 Western Changsheng Road, Hengyang, Hunan 421001, China; The First Affiliated Hospital, Department of Pharmacy, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China; The First Affiliated Hospital, Chinese Traditional Medicine(TCM) research platform of major Epidemic Treatment base, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China
Taolan Zhang School of Pharmacy, Hengyang Medical College, University of South China, 28 Western Changsheng Road, Hengyang, Hunan 421001, China; The First Affiliated Hospital, Department of Pharmacy, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China; The First Affiliated Hospital, Chinese Traditional Medicine(TCM) research platform of major Epidemic Treatment base, Hengyang Medical School, University of South China, 69 Chuanshan Road, Hengyang, Hunan, 421001, China.

Kyodo A, Kanaoka K, Keshi A, Nogi M, Nogi K, Ishihara S, Kamon D, Hashimoto Y, Nakada Y, Ueda T, Seno A, Nishida T, Onoue K, Soeda T, Kawakami R, Watanabe M, Nagai T, Anzai T, Saito Y. Heart failure with preserved ejection fraction phenogroup classification using machine learning. ESC Heart Fail 2023;10:2019-2030. [PMID: 37051638 DOI: 10.1002/ehf2.14368] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/09/2022] [Revised: 01/05/2023] [Accepted: 03/13/2023] [Indexed: 04/14/2023] Open

Abstract

AIMS

Heart failure (HF) with preserved ejection fraction (HFpEF) is a complex syndrome with a poor prognosis. Phenotyping is required to identify subtype-dependent treatment strategies. The phenotypes of Japanese HFpEF patients, who are much less obese than Western patients, are not fully elucidated. This study aimed to reveal model-based phenomapping using unsupervised machine learning (ML) for HFpEF in Japanese patients.

METHODS AND RESULTS

We studied 365 patients with HFpEF (left ventricular ejection fraction >50%) as a derivation cohort from the Nara Registry and Analyses for Heart Failure (NARA-HF), which registered patients hospitalized for acute decompensated HF. We used unsupervised ML with a variational Bayesian-Gaussian mixture model (VBGMM) with common clinical variables. We also performed hierarchical clustering on the derivation cohort. We adopted 230 patients in the Japanese Heart Failure Syndrome with Preserved Ejection Fraction Registry as the validation cohort for VBGMM. The primary endpoint was defined as all-cause death and HF readmission within 5 years. Supervised ML was performed on the composite cohort of derivation and validation. The optimal number of clusters was three because of the probable distribution of VBGMM and the minimum Bayesian information criterion, and we stratified HFpEF into three phenogroups. Phenogroup 1 (n = 125) was older (mean age 78.9 ± 9.1 years) and predominantly male (57.6%), with the worst kidney function (mean estimated glomerular filtration rate 28.5 ± 9.7 mL/min/1.73 m²) and a high incidence of atherosclerotic factors. Phenogroup 2 (n = 200) had older individuals (mean age 78.8 ± 9.7 years), the lowest body mass index (BMI; 22.78 ± 3.94), and the highest incidence of women (57.5%) and atrial fibrillation (56.5%). Phenogroup 3 (n = 40) was the youngest (mean age 63.5 ± 11.2) and predominantly male (63.5 ± 11.2), with the highest BMI (27.46 ± 5.85) and a high incidence of left ventricular hypertrophy. We characterized these three phenogroups as atherosclerosis and chronic kidney disease, atrial fibrillation, and younger and left ventricular hypertrophy groups, respectively. At the primary endpoint, Phenogroup 1 demonstrated the worst prognosis (Phenogroups 1-3: 72.0% vs. 58.5% vs. 45%, P = 0.0036). We also successfully classified a derivation cohort into three similar phenogroups using VBGMM. Hierarchical and supervised clustering successfully showed the reproducibility of the three phenogroups.
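
Choosing the number of clusters by minimum Bayesian information criterion can be sketched as below. This is an illustration under stated assumptions, not the study's procedure: it uses the standard formula BIC = p ln(n) - 2 ln(L) for a full-covariance Gaussian mixture fitted by maximum likelihood, the function names are hypothetical, and the log-likelihood values in the usage note are made up purely to show the mechanics.

```python
import math


def gmm_param_count(k, d):
    """Free parameters of a k-component, d-dimensional full-covariance mixture."""
    weights = k - 1                      # mixing weights sum to one
    means = k * d
    covariances = k * d * (d + 1) // 2   # symmetric d x d matrix per component
    return weights + means + covariances


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def pick_k(log_likelihoods, d, n_samples):
    """Return the k with minimum BIC, plus all scores, from {k: log-likelihood}."""
    scores = {
        k: bic(ll, gmm_param_count(k, d), n_samples)
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get), scores
```

With invented log-likelihoods `{1: -1200.0, 2: -1100.0, 3: -1050.0, 4: -1045.0}`, `d = 2` and `n_samples = 365`, the small likelihood gain from three to four components does not offset the penalty for six extra parameters, so `pick_k` returns 3.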

CONCLUSIONS

ML could successfully stratify Japanese HFpEF patients into three phenogroups (atherosclerosis and chronic kidney disease, atrial fibrillation, and younger and left ventricular hypertrophy groups).

Dashtban A, Mizani MA, Pasea L, Denaxas S, Corbett R, Mamza JB, Gao H, Morris T, Hemingway H, Banerjee A. Identifying subtypes of chronic kidney disease with machine learning: development, internal validation and prognostic validation using linked electronic health records in 350,067 individuals. EBioMedicine 2023;89:104489. [PMID: 36857859 PMCID: PMC9989643 DOI: 10.1016/j.ebiom.2023.104489] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/05/2022] [Revised: 01/31/2023] [Accepted: 02/06/2023] [Indexed: 03/01/2023] Open

Abstract

BACKGROUND

Although chronic kidney disease (CKD) is associated with high multimorbidity, polypharmacy, morbidity and mortality, existing classification systems (mild to severe, usually based on estimated glomerular filtration rate, proteinuria or urine albumin-creatinine ratio) and risk prediction models largely ignore the complexity of CKD, its risk factors and its outcomes. Improved subtype definition could improve prediction of outcomes and inform effective interventions.

METHODS

We analysed individuals ≥18 years with incident and prevalent CKD (n = 350,067 and 195,422 respectively) from a population-based electronic health record resource (2006-2020; Clinical Practice Research Datalink, CPRD). We included factors (n = 264 with 2670 derived variables), e.g. demography, history, examination, blood laboratory values and medications. Using a published framework, we identified subtypes through seven unsupervised machine learning (ML) methods (K-means, Diana, HC, Fanny, PAM, Clara, Model-based) with 66 (of 2670) variables in each dataset. We evaluated subtypes for: (i) internal validity (within dataset, across methods); (ii) prognostic validity (predictive accuracy for 5-year all-cause mortality and admissions); and (iii) medications (new and existing by British National Formulary chapter).
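
One of the seven methods listed, PAM, clusters around medoids, which are actual data points rather than averaged centers. The sketch below uses a simplified alternating k-medoids scheme, not the full PAM build-and-swap procedure. The function name `k_medoids`, the Euclidean distance, and the naive choice of the first k points as starting medoids are assumptions, and the input points are assumed to be distinct.

```python
def k_medoids(points, k, n_iter=20):
    """Alternating k-medoids; returns (medoid indices, label per point)."""

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    medoids = list(range(k))  # naive deterministic start: the first k points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest medoid.
        labels = [
            min(range(k), key=lambda c: dist(p, points[medoids[c]]))
            for p in points
        ]
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(
                min(members,
                    key=lambda i: sum(dist(points[i], points[j]) for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels
```

Because medoids are observed records, each cluster can be summarized by a real individual's profile, which can make subtypes easier to describe than averaged centroids.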

FINDINGS

After identifying five clusters across seven approaches, we labelled CKD subtypes: 1. Early-onset, 2. Late-onset, 3. Cancer, 4. Metabolic, and 5. Cardiometabolic. Internal validity: We trained a high-performing model (using XGBoost) that could predict disease subtypes with 95% accuracy for incident and prevalent CKD (sensitivity: 0.81-0.98, F1 score: 0.84-0.97). Prognostic validity: 5-year all-cause mortality, hospital admissions, and incidence of new chronic diseases differed across CKD subtypes. The 5-year risks of mortality and admissions in the overall incident CKD population were highest in the cardiometabolic subtype: 43.3% (42.3-42.8%) and 29.5% (29.1-30.0%), respectively, and lowest in the early-onset subtype: 5.7% (5.5-5.9%) and 18.7% (18.4-19.1%).
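
The sensitivity and F1 score quoted for the subtype classifier follow from per-class confusion counts. A minimal sketch is given below; the function name `sensitivity_and_f1` and the counts in the usage note are illustrative and are not taken from the study.

```python
def sensitivity_and_f1(tp, fp, fn):
    """Sensitivity (recall) and F1 score for one class from confusion counts.

    tp: true positives, fp: false positives, fn: false negatives.
    """
    if tp == 0:
        return 0.0, 0.0
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, f1
```

For instance, 8 true positives with 2 false positives and 2 false negatives give a sensitivity, precision, and F1 score of 0.8 each. In a five-class setting such as the one described, these values are computed once per subtype, which is consistent with the abstract reporting ranges.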

MEDICATIONS

Across CKD subtypes, the distribution of prescription medication classes at baseline varied, with highest medication burden in cardiometabolic and metabolic subtypes, and higher burden in prevalent than incident CKD.

INTERPRETATION

In the largest CKD study using ML to date, we identified five distinct subtypes in individuals with incident and prevalent CKD. These subtypes have relevance to the study of aetiology, therapeutics and risk prediction.

FUNDING

AstraZeneca UK Ltd, Health Data Research UK.

Ma C, Tu D, Xu Q, Wu Y, Song X, Guo Z, Zhao X. Identification of m⁷G regulator-mediated RNA methylation modification patterns and related immune microenvironment regulation characteristics in heart failure. Clin Epigenetics 2023;15:22. [PMID: 36782329 PMCID: PMC9926673 DOI: 10.1186/s13148-023-01439-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 02/05/2023] [Indexed: 02/15/2023] Open

Abstract

BACKGROUND

N⁷-methylguanosine (m⁷G) modification has been reported to regulate RNA expression in multiple pathophysiological processes. However, little is known about its role and association with immune microenvironment in heart failure (HF).

RESULTS

One hundred twenty-four HF patients and 135 nonfailing donors (NFDs) from six microarray datasets in the gene expression omnibus (GEO) database were included to evaluate the expression profiles of m⁷G regulators. Results revealed that 14 m⁷G regulators were differentially expressed in heart tissues from HF patients and NFDs. Furthermore, a five-gene m⁷G regulator diagnostic signature, NUDT16, NUDT4, CYFIP1, LARP1, and DCP2, which can easily distinguish HF patients and NFDs, was established by cross-combination of three machine learning methods, including best subset regression, regularization techniques, and random forest algorithm. The diagnostic value of five-gene m⁷G regulator signature was further validated in human samples through quantitative reverse-transcription polymerase chain reaction (qRT-PCR). In addition, consensus clustering algorithms were used to categorize HF patients into distinct molecular subtypes. We identified two distinct m⁷G subtypes of HF with unique m⁷G modification pattern, functional enrichment, and immune characteristics. Additionally, two gene subgroups based on m⁷G subtype-related genes were further discovered. Single-sample gene-set enrichment analysis (ssGSEA) was utilized to assess the alterations of immune microenvironment. Finally, utilizing protein-protein interaction network and weighted gene co-expression network analysis (WGCNA), we identified UQCRC1, NDUFB6, and NDUFA13 as m⁷G methylation-associated hub genes with significant clinical relevance to cardiac functions.

CONCLUSIONS

Our study discovered for the first time that m⁷G RNA modification and immune microenvironment are closely correlated in HF development. A five-gene m⁷G regulator diagnostic signature for HF (NUDT16, NUDT4, CYFIP1, LARP1, and DCP2) and three m⁷G methylation-associated hub genes (UQCRC1, NDUFB6, and NDUFA13) were identified, providing new insights into the underlying mechanisms and effective treatments of HF.

Chang MJ, Hao JW, Qiao J, Chen MR, Wang Q, Wang Q, Zhang SX, Yu Q, He PF. A compendium of mucosal molecular characteristics provides novel perspectives on the treatment of ulcerative colitis. J Crohns Colitis 2023:6995436. [PMID: 36682023 DOI: 10.1093/ecco-jcc/jjad011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Indexed: 01/23/2023]

Zhao W, Ma J, Liu Q, Song J, Tysklind M, Liu C, Wang D, Qu Y, Wu Y, Wu F. Comparison and application of SOFM, fuzzy c-means and k-means clustering algorithms for natural soil environment regionalization in China. Environ Res 2023;216:114519. [PMID: 36252833 DOI: 10.1016/j.envres.2022.114519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/10/2022] [Revised: 09/28/2022] [Accepted: 10/04/2022] [Indexed: 06/16/2023]

Moreno G, Ruiz-Botella M, Martín-Loeches I, Gómez Álvarez J, Jiménez Herrera M, Bodí M, Armestar F, Marques Parra A, Estella Á, Trefler S, Jorge García R, Murcia Paya J, Vidal Cortes P, Díaz E, Ferrer R, Albaya-Moreno A, Socias-Crespi L, Bonell Goytisolo J, Sancho Chinesta S, Loza A, Forcelledo Espina L, Pozo Laderas J, deAlba-Aparicio M, Sánchez Montori L, Vallverdú Perapoch I, Hidalgo V, Fraile Gutiérrez V, Casamitjana Ortega A, Martín Serrano F, Nieto M, Blasco Cortes M, Marín-Corral J, Solé-Violán J, Rodríguez A. A differential therapeutic consideration for use of corticosteroids according to established COVID-19 clinical phenotypes in critically ill patients. Med Intensiva 2023;47:23-33. [PMID: 36272908 PMCID: PMC9579897 DOI: 10.1016/j.medine.2021.10.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2021] [Accepted: 10/02/2021] [Indexed: 11/06/2022]

Abstract

OBJECTIVE

To determine whether the use of corticosteroids was associated with Intensive Care Unit (ICU) mortality in the whole population and in pre-specified clinical phenotypes.

DESIGN

A secondary analysis derived from a multicenter, observational study.

SETTING

Critical Care Units.

PATIENTS

Adult critically ill patients with confirmed COVID-19 disease admitted to 63 ICUs in Spain.

INTERVENTIONS

Corticosteroids vs. no corticosteroids.

MAIN VARIABLES OF INTEREST

Three phenotypes were derived by unsupervised clustering analysis of the whole population and classified as A (severe), B (critical), and C (life-threatening). We performed a multivariate analysis after propensity optimal full matching (PS) for the whole population, and weighted Cox regression (HR) and Fine-Gray analysis (sHR), to assess the impact of corticosteroids on ICU mortality in the whole population and in each clinical phenotype.

RESULTS

A total of 2017 patients were analyzed, 1171 (58%) with corticosteroids. After PS, corticosteroids were shown not to be associated with ICU mortality (OR: 1.0; 95% CI: 0.98-1.15). Corticosteroids were administered in 298/537 (55.5%) patients of "A" phenotype and their use was not associated with ICU mortality (HR=0.85 [0.55-1.33]). A total of 338/623 (54.2%) patients in "B" phenotype received corticosteroids. No effect of corticosteroids on ICU mortality was observed when HR was performed (0.72 [0.49-1.05]). Finally, 535/857 (62.4%) patients in "C" phenotype received corticosteroids. In this phenotype HR (0.75 [0.58-0.98]) and sHR (0.79 [0.63-0.98]) suggest a protective effect of corticosteroids on ICU mortality.

CONCLUSION

Our findings warn against the widespread use of moderate-dose corticosteroids in all critically ill patients with COVID-19. Only patients with the highest inflammatory levels could benefit from steroid treatment.

Affiliation(s)

G. Moreno ICU, Hospital Universitario Joan XXIII/URV/IISPV, Tarragona, Spain
M. Ruiz-Botella Tarragona Health Data Research Working Group (THeDaR) – ICU Hospital Universitario Joan XXIII, Tarragona, Spain
I. Martín-Loeches Department of Intensive Care Medicine, Multidisciplinary Intensive Care Research Organization (MICRO), St. James's Hospital, Dublin, Ireland
J. Gómez Álvarez Tarragona Health Data Research Working Group (THeDaR) – ICU Hospital Universitario Joan XXIII, Tarragona, Spain
M. Jiménez Herrera Dean Nursing Faculty, Universitat Rovira i Virgili, Tarragona, Spain
M. Bodí ICU, Hospital Universitario Joan XXIII/URV/IISPV, Tarragona, Spain; CIBERES/CIBERESUCICOVID
F. Armestar ICU, Hospital Universitario German Trias i Pujol, Badalona, Spain
A. Marques Parra ICU, Hospital de la Ribera, Alzira, Spain
Á. Estella ICU, Hospital Universitario de Jerez, Jerez de la Frontera, Spain
S. Trefler ICU, Hospital Universitario Joan XXIII/URV/IISPV, Tarragona, Spain
R. Jorge García ICU, Hospital Nuestra Señora de Gracia, Zaragoza, Spain
J. Murcia Paya ICU, Hospital Santa Lucía, Cartagena, Spain
P. Vidal Cortes ICU, Complejo Hospitalario Universitario de Ourense, Orense, Spain
E. Díaz ICU, Hospital Parc Taulí/UAB/CIBERES, Barcelona, Spain
R. Ferrer ICU, Hospital Universitario Vall d’Hebron, Barcelona, Spain
A. Albaya-Moreno ICU, Hospital Universitario de Guadalajara, Guadalajara, Spain
L. Socias-Crespi ICU, Hospital Universitario Son Llátzer, Palma de Mallorca, Spain
J.M. Bonell Goytisolo ICU, Hospital QuironSalud Palmaplanas, Palma de Mallorca, Spain
S. Sancho Chinesta ICU, Hospital Universitario y Politécnico La Fe, Valencia, Spain
A. Loza ICU, Hospital Universitario Nuestra Señora de Valme, Sevilla, Spain
L. Forcelledo Espina ICU, Hospital Central de Asturias, Grupo de Investigación de Microbiología Traslacional del ISPA, Oviedo, Spain
J.C. Pozo Laderas ICU, Hospital Universitario Reina Sofía, Córdoba, Spain
M. deAlba-Aparicio ICU, Hospital Universitario Reina Sofía, Córdoba, Spain
L. Sánchez Montori ICU, Hospital Clínico Universitario Lozano Blesa, Zaragoza, Spain
I. Vallverdú Perapoch ICU, Hospital Universitario Sant Joan, Reus, Spain
V. Hidalgo ICU, Hospital Complejo Asistencial de Segovia, Segovia, Spain
V. Fraile Gutiérrez ICU, Hospital Universitario Río Hortega, Valladolid, Spain
A.M. Casamitjana Ortega ICU, Complejo Hospitalario Universitario Insular – Materno Infantil, Las Palmas de Gran Canaria, Spain
F. Martín Serrano ICU, Hospital de La Moncloa, Madrid, Spain
M. Nieto ICU, Hospital Clínico San Carlos, Madrid, Spain
M. Blasco Cortes ICU, Hospital Clínico Universitario, Valencia, Spain
J. Marín-Corral ICU, Hospital del Mar/GREPAC – IMIM, Barcelona, Spain; Division of Pulmonary Diseases & Critical Care Medicine, UTH San Antonio, San Antonio, TX, USA
J. Solé-Violán ICU, Hospital Universitario Dr. Negrín, Las Palmas de Gran Canaria, Spain
A. Rodríguez ICU, Hospital Universitario Joan XXIII/URV/IISPV, Tarragona, Spain; CIBERES/CIBERESUCICOVID (corresponding author)
on behalf of the COVID-19 SEMICYUC Working Group

Moreno G, Ruiz-Botella M, Martín-Loeches I, Gómez Álvarez J, Jiménez Herrera M, Bodí M, Armestar F, Marques Parra A, Estella Á, Trefler S, Jorge García R, Murcia Paya J, Vidal Cortes P, Díaz E, Ferrer R, Albaya-Moreno A, Socias-Crespi L, Bonell Goytisolo JM, Sancho Chinesta S, Loza A, Forcelledo Espina L, Pozo Laderas JC, deAlba-Aparicio M, Sánchez Montori L, Vallverdú Perapoch I, Hidalgo V, Fraile Gutiérrez V, Casamitjana Ortega AM, Martín Serrano F, Nieto M, Blasco Cortes M, Marín-Corral J, Solé-Violán J, Rodríguez A; on behalf of the COVID-19 SEMICYUC Working Group. A differential therapeutic consideration for use of corticosteroids according to established COVID-19 clinical phenotypes in critically ill patients. Med Intensiva 2023;47:23-33. [PMID: 34720310 DOI: 10.1016/j.medin.2021.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2021] [Accepted: 10/02/2021] [Indexed: 01/04/2023]

Beccuti M, Calogero RA. Single-Cell RNAseq Clustering. Methods Mol Biol 2022;2584:241-250. [PMID: 36495454 DOI: 10.1007/978-1-0716-2756-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]

Mrukwa G, Polanska J. DiviK: divisive intelligent K-means for hands-free unsupervised clustering in big biological data. BMC Bioinformatics 2022;23:538. [PMID: 36503372 PMCID: PMC9743550 DOI: 10.1186/s12859-022-05093-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2020] [Accepted: 12/01/2022] [Indexed: 12/14/2022] Open

Abstract

BACKGROUND

Investigating molecular heterogeneity provides insights into tumour origin and metabolomics. The increasing amount of data gathered makes manual analyses infeasible; therefore, automated unsupervised learning approaches are utilised for discovering tissue heterogeneity. However, automated analyses require experience in setting the algorithms' hyperparameters and expert knowledge about the analysed biological processes. Moreover, feature engineering is needed to obtain valuable results because of the numerous features measured.

RESULTS

We propose DiviK: a scalable stepwise algorithm with local data-driven feature space adaptation for segmenting high-dimensional datasets. The algorithm is compared to alternative solutions (regular k-means, spatial and spectral approaches) combined with different feature engineering techniques (None, PCA, EXIMS, UMAP, Neural Ions). Three quality indices (Dice Index, Rand Index and EXIMS score), focusing on the overall composition of the clustering, coverage of the tumour region and spatial cluster consistency, are used to assess the quality of unsupervised analyses. Algorithms were validated on mass spectrometry imaging (MSI) datasets: 2D human cancer tissue samples and 3D mouse kidney images. The DiviK algorithm performed the best among the four clustering algorithms compared (overall quality score 1.24, 0.58 and 162 for d(0, 0, 0), d(1, 1, 1) and the sum of ranks, respectively), with spectral clustering being mostly second. Feature engineering techniques impact the overall clustering results less than the algorithms themselves (partial η² effect size: 0.141 versus 0.345, Kendall's concordance index: 0.424 versus 0.138 for d(0, 0, 0)).

CONCLUSIONS

DiviK could be the default choice in the exploration of MSI data. Thanks to its unique, GMM-based local optimisation of the feature space and deglomerative schema, DiviK results do not strongly depend on the feature engineering technique applied and can reveal the hidden structure in a tissue sample. Additionally, DiviK shows high scalability: it can process big omics data with more than 1.5 million instances and a few thousand features at once. Finally, due to its simplicity, DiviK is easily generalisable to an even more flexible framework. Therefore, it is helpful for other -omics data (such as single-cell spatial transcriptomics) or tabular data in general (including medical images after appropriate embedding). A generic implementation is freely available under the Apache 2.0 license at https://github.com/gmrukwa/divik.
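
Two of the quality indices named in this abstract, the Dice Index and the Rand Index, have simple textbook definitions. The following is a minimal sketch of those definitions only, not the evaluation code used by the authors; the function names and the toy inputs are assumptions made for illustration.

```python
from itertools import combinations


def dice_index(predicted, reference):
    """Dice index of two sets of item ids: 2|A∩B| / (|A| + |B|)."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # two empty regions are treated as identical
    return 2 * len(predicted & reference) / (len(predicted) + len(reference))


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree.

    A pair agrees when both partitions put the two items together,
    or both partitions keep them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical toy data: the same partition under renamed labels.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

The Rand Index ignores label names and compares only co-membership, which is why it suits unsupervised results whose cluster numbering is arbitrary.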

Jiang Z, Li X, Guo L. Binning Metagenomic Contigs Using Unsupervised Clustering and Reference Databases. Interdiscip Sci 2022;14:795-803. [PMID: 35639335 DOI: 10.1007/s12539-022-00526-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Revised: 04/23/2022] [Accepted: 04/27/2022] [Indexed: 06/15/2023]

Chakraborty S, Mali K. SUFEMO: A superpixel based fuzzy image segmentation method for COVID-19 radiological image elucidation. Appl Soft Comput 2022;129:109625. [PMID: 36124000 PMCID: PMC9474408 DOI: 10.1016/j.asoc.2022.109625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/13/2021] [Revised: 08/15/2022] [Accepted: 09/05/2022] [Indexed: 11/27/2022]

Abstract

COVID-19 has caused an ongoing worldwide pandemic. The lack of specialized drugs or other medicines makes the situation worse. Early diagnosis of this disease helps to start treatment early and to slow the spread of this highly infectious virus. This article describes a novel unsupervised method for segmenting radiological chest images collected from COVID-19 patients. The proposed approach is intended to help physicians, medical technologists, and other related experts in the quick and early diagnosis of COVID-19 infection. The approach is called SUFEMO (SUperpixel based Fuzzy Electromagnetism-like Optimization). It builds on well-known theories: the Electromagnetism-like optimization (EMO) algorithm, type-2 fuzzy logic, and superpixels. By assimilating the notion of the superpixel, it reduces the processing burden of dealing with a considerably large amount of spatial information. In this work, the EMO approach is modified by utilizing the type-2 fuzzy framework. The EMO approach updates the cluster centers without using the cluster center update equation, and it is independent of the choice of the initial cluster centers. To decrease the computational overhead of handling a lot of spatial data, a novel superpixel-based approach is proposed in which the noise sensitivity of the watershed-based superpixel formation approach is dealt with by computing the nearby minima from the gradient image. Also, to take advantage of the superpixels, the fuzzy objective function is modified. The proposed approach was evaluated both qualitatively and quantitatively using 310 chest CT scan images gathered from various sources. Four standard cluster validity indices are used to quantify the results. The proposed approach performs better than some state-of-the-art approaches in terms of both qualitative and quantitative outcomes. On average, it attains a Davies-Bouldin index value of 1.812008792, a Xie-Beni index value of 1.683281, a Dunn index value of 2.588595748, and a β index value of 3.142069236 for 5 clusters. The proposed approach is also found to be superior with regard to the rate of convergence. Rigorous experiments demonstrate the effectiveness of the proposed approach and establish the real-life applicability of the proposed method for the initial filtering of COVID-19 patients.
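
Two of the cluster validity indices reported above can be illustrated on a toy example. This is a minimal sketch of the common textbook forms of the Davies-Bouldin and Dunn indices, assuming Euclidean distance and crisp (non-fuzzy) cluster membership; it is not the SUFEMO evaluation code, and the function names and data are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (lower is better).

    For each cluster, take the worst ratio of summed within-cluster
    scatter to between-centroid distance, then average over clusters.
    """
    centres = [centroid(c) for c in clusters]
    scatter = [
        sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, centres)
    ]
    k = len(clusters)
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(centres[i], centres[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


def dunn(clusters):
    """Dunn index: smallest between-cluster distance divided by the
    largest cluster diameter (higher is better)."""
    k = len(clusters)
    min_between = min(
        dist(p, q)
        for i in range(k)
        for j in range(i + 1, k)
        for p in clusters[i]
        for q in clusters[j]
    )
    max_diameter = max(dist(p, q) for c in clusters for p in c for q in c)
    return min_between / max_diameter


# Hypothetical toy data: two tight, well-separated clusters.
toy = [[(0, 0), (0, 1)], [(5, 0), (5, 1)]]
print(dunn(toy))  # 5.0
```

Both functions assume at least two clusters, distinct centroids, and at least one cluster with more than one distinct point; a production version would guard those cases.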

Åkerlund CAI, Holst A, Stocchetti N, Steyerberg EW, Menon DK, Ercole A, Nelson DW. Clustering identifies endotypes of traumatic brain injury in an intensive care cohort: a CENTER-TBI study. Crit Care 2022;26:228. [PMID: 35897070 PMCID: PMC9327174 DOI: 10.1186/s13054-022-04079-w] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 02/20/2022] [Accepted: 07/02/2022] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

While the Glasgow coma scale (GCS) is one of the strongest outcome predictors, the current classification of traumatic brain injury (TBI) as 'mild', 'moderate' or 'severe' based on this fails to capture enormous heterogeneity in pathophysiology and treatment response. We hypothesized that data-driven characterization of TBI could identify distinct endotypes and give mechanistic insights.

METHODS

We developed an unsupervised statistical clustering model based on a mixture of probabilistic graphs for presentation (< 24 h) demographic, clinical, physiological, laboratory and imaging data to identify subgroups of TBI patients admitted to the intensive care unit in the CENTER-TBI dataset (N = 1,728). A cluster similarity index was used for robust determination of optimal cluster number. Mutual information was used to quantify feature importance and for cluster interpretation.

RESULTS

Six stable endotypes were identified with distinct GCS and composite systemic metabolic stress profiles, distinguished by GCS, blood lactate, oxygen saturation, serum creatinine, glucose, base excess, pH, arterial partial pressure of carbon dioxide, and body temperature. Notably, a cluster with 'moderate' TBI (by traditional classification) and deranged metabolic profile, had a worse outcome than a cluster with 'severe' GCS and a normal metabolic profile. Addition of cluster labels significantly improved the prognostic precision of the IMPACT (International Mission for Prognosis and Analysis of Clinical trials in TBI) extended model, for prediction of both unfavourable outcome and mortality (both p < 0.001).

CONCLUSIONS

Six stable and clinically distinct TBI endotypes were identified by probabilistic unsupervised clustering. In addition to presenting neurology, a profile of biochemical derangement was found to be an important distinguishing feature that was both biologically plausible and associated with outcome. Our work motivates refining current TBI classifications with factors describing metabolic stress. Such data-driven clusters suggest TBI endotypes that merit investigation to identify bespoke treatment strategies to improve care. Trial registration: The core study was registered with ClinicalTrials.gov, number NCT02210221, on August 06, 2014, and with the Resource Identification Portal (RRID: SCR_015582).
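
The Methods above state that mutual information was used to quantify feature importance. A minimal sketch of the idea, assuming a plug-in estimate over discrete (or pre-binned) feature values, is given below; the study's own estimator is not described in the abstract, so the function name and the toy data are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete sequences.

    I(X;Y) = sum over observed (x, y) of p(x,y) * log2(p(x,y) / (p(x) p(y))).
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in count_xy.items()
    )


# Hypothetical toy data: a binned feature that fully determines the cluster.
clusters = [0, 0, 1, 1]
binned_feature = ["low", "low", "high", "high"]
print(mutual_information(clusters, binned_feature))  # 1.0
```

A feature that is independent of the cluster labels scores 0 bits, so ranking features by this value gives one simple notion of which variables distinguish the clusters.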

Chen C, Luo J, Wang X. Identification of prostate cancer subtypes based on immune signature scores in bulk and single-cell transcriptomes. Med Oncol 2022;39:123. [PMID: 35716212 DOI: 10.1007/s12032-022-01719-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 03/09/2022] [Indexed: 10/18/2022]

Polouliakh N, Hase T, Ghosh S, Kitano H. Toxicity Analysis of Pentachlorophenol Data with a Bioinformatics Tool Set. Methods Mol Biol 2022;2486:105-125. [PMID: 35437721 DOI: 10.1007/978-1-0716-2265-0_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

Shi Y, Zhang L, Peterson CB, Do KA, Jenq RR. Performance determinants of unsupervised clustering methods for microbiome data. Microbiome 2022;10:25. [PMID: 35120564 PMCID: PMC8817542 DOI: 10.1186/s40168-021-01199-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/13/2021] [Accepted: 11/15/2021] [Indexed: 05/04/2023]

Jiang Z, Li X, Guo L. MetaCRS: unsupervised clustering of contigs with the recursive strategy of reducing metagenomic dataset's complexity. BMC Bioinformatics 2022;22:315. [PMID: 35045830 PMCID: PMC8772042 DOI: 10.1186/s12859-021-04227-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/23/2021] [Accepted: 06/01/2021] [Indexed: 01/02/2023] Open

Hong Y, Zhang L, Tian X, Xiang X, Yu Y, Zeng Z, Cao Y, Chen S, Sun A. Identification of immune subtypes of Ph-neg B-ALL with ferroptosis related genes and the potential implementation of Sorafenib. BMC Cancer 2021;21:1331. [PMID: 34906116 PMCID: PMC8670244 DOI: 10.1186/s12885-021-09076-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/30/2021] [Accepted: 11/30/2021] [Indexed: 12/15/2022] Open

Affiliation(s)

Yang Hong Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Ling Zhang Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Xiaopeng Tian Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Xin Xiang Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Yan Yu Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Zhao Zeng Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Yaqing Cao Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Suning Chen Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China
Aining Sun Department of Hematology, The First Affiliated Hospital of Soochow University, Jiangsu Institute of Hematology, National Clinical Research Center for Hematologic Diseases, Suzhou, China; Institute of Blood and Marrow Transplantation, Collaborative Innovation Center of Hematology, Soochow University, Suzhou, China

Kim SK, Jung SM, Park KS, Kim KJ. Integrative analysis of lung molecular signatures reveals key drivers of idiopathic pulmonary fibrosis. BMC Pulm Med 2021;21:404. [PMID: 34876074 PMCID: PMC8650281 DOI: 10.1186/s12890-021-01749-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/02/2021] [Accepted: 11/16/2021] [Indexed: 11/10/2022] Open

Abstract

Background

Idiopathic pulmonary fibrosis (IPF) is a devastating disease with a high clinical burden. The molecular signatures of IPF were analyzed to distinguish molecular subgroups and identify key driver genes and therapeutic targets.

Methods

Thirteen datasets of lung tissue transcriptomics including 585 IPF patients and 362 normal controls were obtained from the databases and subjected to filtration of differentially expressed genes (DEGs). A functional enrichment analysis, agglomerative hierarchical clustering, network-based key driver analysis, and diffusion scoring were performed, and the association of enriched pathways and clinical parameters was evaluated.

Results

A total of 2,967 upregulated DEGs were filtered during the comparison of gene expression profiles of lung tissues between IPF patients and healthy controls. The core molecular network of IPF featured the p53 signaling pathway and cellular senescence. IPF patients were classified into two molecular subgroups (C1, C2) via unsupervised clustering. C1 was more enriched in the p53 signaling pathway and ciliated cells and presented a worse prognostic score, while C2 was more enriched for cellular senescence, profibrosing pathways, and alveolar epithelial cells. The p53 signaling pathway was closely correlated with a decline in forced vital capacity and carbon monoxide diffusion capacity and with the activation of cellular senescence. CDK1/2, CDKN1A, CSNK1A1, HDAC1/2, FN1, VCAM1, and ITGA4 were the key regulators, as evidenced by high diffusion scores in the disease module. Currently available and investigational drugs showed differential diffusion scores in terms of their target molecules.

Conclusions

An integrative molecular analysis of IPF lungs identified two molecular subgroups with distinct pathobiological characteristics and clinical prognostic scores. Inhibition of CDKs or HDACs showed great promise for controlling lung fibrosis. This approach provided molecular insights to support the prediction of clinical outcomes and the selection of therapeutic targets in IPF patients.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12890-021-01749-3.
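
The agglomerative hierarchical clustering named in the Methods of this abstract can be illustrated on toy data. The sketch below assumes Euclidean distance and average linkage, neither of which is stated in the abstract, and uses a hypothetical function name; it shows the bottom-up merging idea rather than the authors' pipeline.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerative(points, n_clusters):
    """Bottom-up clustering with average linkage.

    Starts with every point in its own cluster and repeatedly merges the
    two clusters with the smallest mean pairwise distance until only
    n_clusters remain. Returns clusters as lists of point indices.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    clusters = [[i] for i in range(len(points))]

    def average_link(a, b):
        return sum(dist(points[i], points[j]) for i in a for j in b) / (
            len(a) * len(b)
        )

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (first pair wins on ties).
        a, b = min(
            (
                (x, y)
                for x in range(len(clusters))
                for y in range(x + 1, len(clusters))
            ),
            key=lambda pair: average_link(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical toy data: two well-separated groups of two samples each.
samples = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(agglomerative(samples, 2))  # [[0, 1], [2, 3]]
```

This naive version recomputes every linkage at each step and is only suitable for small inputs; real analyses would normally rely on an established library implementation.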

Testa D, Jourde-Chiche N, Mancini J, Varriale P, Radoszycki L, Chiche L. Unsupervised clustering analysis of data from an online community to identify lupus patient profiles with regards to treatment preferences. Lupus 2021;30:1837-1843. [PMID: 34313509 DOI: 10.1177/09612033211033977] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]

Abstract

OBJECTIVE

Lupus is a chronic, complex autoimmune disease. Non-adherence to treatment can affect patient outcomes. Incorporating patients' preferences into medical decisions may increase acceptance of their medication. The PREFERLUP study used unsupervised clustering analysis to identify profiles of patients with similar treatment preferences in an online community of French lupus patients.

METHODS

An online survey was conducted in adult lupus patients from the Carenity community between August 2018 and April 2019. Multiple Correspondence Analysis (MCA) was used with three unsupervised clustering methods (hierarchical, k-means and partitioning around medoids). Several indicators (measure of connectivity, Dunn index and Silhouette width) were used to select the best clustering algorithm and choose the number of clusters.

RESULTS

The 268 participants were mostly female (96%), with a mean age of 44.3 years; 83% fulfilled the American College of Rheumatology (ACR) self-reported diagnostic criteria for systemic lupus erythematosus. Overall, the preferred route of administration was oral (62%) and the most important feature of an ideal drug was a low risk of side-effects (32%). Hierarchical clustering identified three clusters. Cluster 1 (59%) comprised patients with few comorbidities and a poor ability to identify oncoming flares; 84% of these patients desired oral treatments with limited side-effects. Cluster 2 (13%) comprised younger patients, who had already participated in a clinical trial, were willing to use implants and valued the compatibility of treatments with pregnancy. Cluster 3 (28%) comprised patients with a longer lupus duration, poorer control of the disease and more comorbidities; these patients mainly valued implants and injections and expected a reduction of corticosteroid intake.

CONCLUSIONS

Different profiles of lupus patients were identified according to their drug preferences. These clusters could help physicians tailor their therapeutic proposals to take into account individual patient preferences, which could have a positive impact on treatment acceptance and then adherence. The study highlights the value of data acquired directly from patient communities.
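
The Methods of this abstract list the Silhouette width among the indicators used to compare clusterings. The following minimal sketch computes the mean silhouette width for a labelled set of points, assuming Euclidean distance on numeric coordinates (for example, coordinates produced by a prior dimension reduction); the function name and the toy data are hypothetical and this is not the study's code.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_width(points, labels):
    """Mean silhouette width of a clustering (ranges from -1 to 1).

    For each point, a is the mean distance to the other members of its
    own cluster and b is the smallest mean distance to any other cluster;
    the point's silhouette is (b - a) / max(a, b). Points in singleton
    clusters contribute 0, following the usual convention.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton cluster: silhouette defined as 0
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: the natural grouping scores higher than a mixed one.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
natural = silhouette_width(pts, [0, 0, 1, 1])
mixed = silhouette_width(pts, [0, 1, 0, 1])
print(natural > mixed)  # True
```

Computing this value for each candidate algorithm and number of clusters, then keeping the highest, is one common way such an indicator is used for model selection.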

Holmberg-Thyden S, Grønbæk K, Gang AO, El Fassi D, Hadrup SR. A user's guide to multicolor flow cytometry panels for comprehensive immune profiling. Anal Biochem 2021;627:114210. [PMID: 34033799 DOI: 10.1016/j.ab.2021.114210] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/20/2021] [Accepted: 04/13/2021] [Indexed: 12/12/2022]

Chakraborty S, Mali K. A morphology-based radiological image segmentation approach for efficient screening of COVID-19. Biomed Signal Process Control 2021;69:102800. [PMID: 34031636 PMCID: PMC8133384 DOI: 10.1016/j.bspc.2021.102800] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 01/17/2021] [Revised: 05/09/2021] [Accepted: 05/15/2021] [Indexed: 12/22/2022]

Chen Z, Yang Z, Yuan X, Zhang X, Hao P. scSensitiveGeneDefine: sensitive gene detection in single-cell RNA sequencing data by Shannon entropy. BMC Bioinformatics 2021;22:211. [PMID: 33888056 PMCID: PMC8063398 DOI: 10.1186/s12859-021-04136-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/13/2021] [Accepted: 03/26/2021] [Indexed: 11/26/2022] Open

Corbi A, Burgos D. Connection between sleeping patterns and cognitive deterioration in women with Alzheimer's disease. Sleep Breath 2021. [PMID: 33792886 DOI: 10.1007/s11325-021-02327-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/03/2020] [Revised: 02/01/2021] [Accepted: 02/12/2021] [Indexed: 10/21/2022]

Russo ET, Laio A, Punta M. Density Peak clustering of protein sequences associated to a Pfam clan reveals clear similarities and interesting differences with respect to manual family annotation. BMC Bioinformatics 2021;22:121. [PMID: 33711918 PMCID: PMC7955657 DOI: 10.1186/s12859-021-04013-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/25/2020] [Accepted: 02/09/2021] [Indexed: 11/24/2022] Open

Abstract

Background

The identification of protein families is of outstanding practical importance for in silico protein annotation and is at the basis of several bioinformatic resources. Pfam is possibly the best-known protein family database, built over many years of work by domain experts with extensive use of manual curation. This approach is generally very accurate, but it is quite time-consuming and may suffer from a bias generated by the hand-curation itself, which is often guided by the available experimental evidence.

Results

We introduce a procedure that aims to identify automatically putative protein families. The procedure is based on Density Peak Clustering and uses as input only local pairwise alignments between protein sequences. In the experiment we present here, we ran the algorithm on about 4000 full-length proteins with at least one domain classified by Pfam as belonging to the Pseudouridine synthase and Archaeosine transglycosylase (PUA) clan. We obtained 71 automatically-generated sequence clusters with at least 100 members. While our clusters were largely consistent with the Pfam classification, showing good overlap with either single or multi-domain Pfam family architectures, we also observed some inconsistencies. The latter were inspected using structural and sequence based evidence, which suggested that the automatic classification captured evolutionary signals reflecting non-trivial features of protein family architectures. Based on this analysis we identified a putative novel pre-PUA domain as well as alternative boundaries for a few PUA or PUA-associated families. As a first indication that our approach was unlikely to be clan-specific, we performed the same analysis on the P53 clan, obtaining comparable results.

Conclusions

The clustering procedure described in this work takes advantage of the information contained in a large set of pairwise alignments and successfully identifies a set of putative families and family architectures in an unsupervised manner. Comparison with the Pfam classification highlights significant overlap and points to interesting differences, suggesting that our new algorithm could have potential in applications related to automatic protein classification. Testing this hypothesis, however, will require further experiments on large and diverse sequence datasets.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-04013-x.
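As an illustration of the Density Peak Clustering idea named in the Results above, the following is a minimal sketch and not the authors' implementation. It assumes a precomputed symmetric distance matrix (the paper instead starts from local pairwise alignments), and the function name `density_peaks`, the `cutoff` parameter, and the toy data are hypothetical. For each point it computes a local density `rho` and the distance `delta` to the nearest point of higher density; points with both values high are candidate cluster centers.

```python
def density_peaks(dist, cutoff):
    """Return (rho, delta) per point for a symmetric distance matrix `dist`."""
    n = len(dist)
    # rho[i]: number of other points closer than `cutoff` (local density)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        # distances from i to every point with strictly higher density
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # points with no denser neighbour get their largest distance
        # (a simplification: ties at the maximum density all count as peaks)
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Toy 1-D data (an assumption for illustration): two tight groups
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(a - b) for b in points] for a in points]
rho, delta = density_peaks(dist, cutoff=0.15)

# Candidate centers: highest density and far from any denser point
centers = [i for i in range(len(points)) if rho[i] == max(rho) and delta[i] > 1.0]
print(centers)
```

In this toy case the middle point of each group is selected as a center; remaining points would then be assigned to the cluster of their nearest denser neighbour.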


Pothula KR, Geraets JA, Ferber II, Schröder GF. Clustering polymorphs of tau and IAPP fibrils with the CHEP algorithm. Prog Biophys Mol Biol 2021;160:16-25. [PMID: 33556421 DOI: 10.1016/j.pbiomolbio.2020.11.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/07/2020] [Revised: 11/16/2020] [Accepted: 11/24/2020] [Indexed: 01/03/2023]

Zhang L, Zhang M, Chen X, He Y, Chen R, Zhang J, Huang J, Ouyang C, Shi G. Identification of the tubulointerstitial infiltrating immune cell landscape and immune marker related molecular patterns in lupus nephritis using bioinformatics analysis. Ann Transl Med 2021;8:1596. [PMID: 33437795 PMCID: PMC7791250 DOI: 10.21037/atm-20-7507] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]

Kirk JM, Sprague D, Calabrese JM. Classification of Long Noncoding RNAs by k-mer Content. Methods Mol Biol 2021;2254:41-60. [PMID: 33326069 PMCID: PMC7850294 DOI: 10.1007/978-1-0716-1158-6_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/18/2023]

Li J, Jiang W, Han H, Liu J, Liu B, Wang Y. ScGSLC: An unsupervised graph similarity learning framework for single-cell RNA-seq data clustering. Comput Biol Chem 2020;90:107415. [PMID: 33307360 DOI: 10.1016/j.compbiolchem.2020.107415] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2020] [Revised: 09/30/2020] [Accepted: 10/06/2020] [Indexed: 01/18/2023]

Salmanpour MR, Shamsaei M, Saberi A, Hajianfar G, Soltanian-Zadeh H, Rahmim A. Robust identification of Parkinson's disease subtypes using radiomics and hybrid machine learning. Comput Biol Med 2021;129:104142. [PMID: 33260101 DOI: 10.1016/j.compbiomed.2020.104142] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2020] [Revised: 11/20/2020] [Accepted: 11/21/2020] [Indexed: 12/21/2022]

Abstract

OBJECTIVES

It is important to subdivide Parkinson's disease (PD) into subtypes, enabling potentially earlier disease recognition and tailored treatment strategies. We aimed to identify reproducible PD subtypes robust to variations in the number of patients and features.

METHODS

We applied multiple feature-reduction and cluster-analysis methods to cross-sectional and timeless data, extracted from longitudinal datasets (years 0, 1, 2 & 4; Parkinson's Progressive Marker Initiative; 885 PD/163 healthy-control visits; 35 datasets with combinations of non-imaging, conventional-imaging, and radiomics features from DAT-SPECT images). Hybrid machine-learning systems were constructed invoking 16 feature-reduction algorithms, 8 clustering algorithms, and 16 classifiers (C-index clustering evaluation used on each trajectory). We subsequently performed: i) identification of optimal subtypes, ii) multiple independent tests to assess reproducibility, iii) further confirmation by a statistical approach, iv) test of reproducibility to the size of the samples.

RESULTS

When using no radiomics features, the clusters were not robust to variations in features, whereas utilizing radiomics information enabled consistent generation of clusters through ensemble analysis of trajectories. We arrived at 3 distinct subtypes, confirmed using the training and testing process of k-means, as well as Hotelling's T² test. The 3 identified PD subtypes were 1) mild; 2) intermediate; and 3) severe, especially in terms of dopaminergic deficit (imaging), with some escalating motor and non-motor manifestations.

CONCLUSION

Appropriate hybrid systems and independent statistical tests enable robust identification of 3 distinct PD subtypes. This was assisted by utilizing radiomics features from SPECT images (segmented using MRI). The PD subtypes provided were robust to the number of subjects and features.


Blumenberg L, Ruggles KV. Hypercluster: a flexible tool for parallelized unsupervised clustering optimization. BMC Bioinformatics 2020;21:428. [PMID: 32993491 PMCID: PMC7525959 DOI: 10.1186/s12859-020-03774-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2020] [Accepted: 09/22/2020] [Indexed: 12/24/2022] Open

Chung NC, Choi H, Wang D, Mirza B, Pelletier AR, Sigdel D, Wang W, Ping P. Identifying temporal molecular signatures underlying cardiovascular diseases: A data science platform. J Mol Cell Cardiol 2020;145:54-58. [PMID: 32504647 PMCID: PMC7583079 DOI: 10.1016/j.yjmcc.2020.05.020] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/14/2020] [Revised: 05/18/2020] [Accepted: 05/31/2020] [Indexed: 11/26/2022]

Abstract

OBJECTIVE

During cardiovascular disease progression, molecular systems of myocardium (e.g., a proteome) undergo diverse and distinct changes. Dynamic, temporally-regulated alterations of individual molecules underlie the collective response of the heart to pathological drivers and the ultimate development of pathogenesis. Advances in high-throughput omics technologies have enabled cost-effective, temporal profiling of targeted systems in animal models of human diseases. However, computational analysis of temporal patterns from omics data remains challenging. In particular, bioinformatic pipelines involving unsupervised statistical approaches to support cardiovascular investigations are lacking, which hinders one's ability to extract biomedical insights from these complex datasets.

APPROACH AND RESULTS

We developed a non-parametric data analysis platform to resolve computational challenges unique to temporal omics datasets. Our platform consists of three modules. Module I preprocesses the temporal data using either cubic splines or principal component analysis (PCA), and it simultaneously accomplishes the tasks on missing data imputation and denoising. Module II performs an unsupervised classification by K-means or hierarchical clustering. Module III evaluates and identifies biological entities (e.g., molecular events) that exhibit strong associations to specific temporal patterns. The jackstraw method for cluster membership has been applied to estimate p-values and posterior inclusion probabilities (PIPs), both of which guided feature selection. To demonstrate the utility of the analysis platform, we employed a temporal proteomics dataset that captured the proteome-wide dynamics of oxidative stress induced post-translational modifications (O-PTMs) in mouse hearts undergoing isoproterenol (ISO)-induced hypertrophy.

CONCLUSION

We have created a platform, CV.Signature.TCP, to identify distinct temporal clusters in omics datasets. We presented a cardiovascular use case to demonstrate its utility in unveiling biological insights underlying O-PTM regulations in cardiac remodeling. This platform is implemented in an open source R package (https://github.com/UCLA-BD2K/CV.Signature.TCP).


Affiliation(s)

Neo Christopher Chung NHLBI Integrated Cardiovascular Data Science Training Program at University of California (UCLA), Los Angeles, USA; Departments of Physiology and Medicine (Cardiology) at UCLA School of Medicine, USA; Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics University of Warsaw, Warsaw, Poland.
Howard Choi NHLBI Integrated Cardiovascular Data Science Training Program at University of California (UCLA), Los Angeles, USA; Departments of Physiology and Medicine (Cardiology) at UCLA School of Medicine, USA; Bioinformatics and Medical Informatics at UCLA School of Engineering, Los Angeles, CA 90095, USA; Scalable Analytics Institute (ScAi) at UCLA School of Engineering, Los Angeles, CA 90095, USA
Ding Wang Departments of Physiology and Medicine (Cardiology) at UCLA School of Medicine, USA
Bilal Mirza Departments of Physiology and Medicine (Cardiology) at UCLA School of Medicine, USA
Alexander R Pelletier NHLBI Integrated Cardiovascular Data Science Training Program at University of California (UCLA), Los Angeles, USA; Bioinformatics and Medical Informatics at UCLA School of Engineering, Los Angeles, CA 90095, USA; Scalable Analytics Institute (ScAi) at UCLA School of Engineering, Los Angeles, CA 90095, USA
Dibakar Sigdel NHLBI Integrated Cardiovascular Data Science Training Program at University of California (UCLA), Los Angeles, USA; Departments of Physiology and Medicine (Cardiology) at UCLA School of Medicine, USA
Wei Wang NHLBI Integrated Cardiovascular Data Science Training Program at University of California (UCLA), Los Angeles, USA; Bioinformatics and Medical Informatics at UCLA School of Engineering, Los Angeles, CA 90095, USA; Scalable Analytics Institute (ScAi) at UCLA School of Engineering, Los Angeles, CA 90095, USA
Peipei Ping NHLBI Integrated Cardiovascular Data Science Training Program at University of California (UCLA), Los Angeles, USA; Departments of Physiology and Medicine (Cardiology) at UCLA School of Medicine, USA; Bioinformatics and Medical Informatics at UCLA School of Engineering, Los Angeles, CA 90095, USA; Scalable Analytics Institute (ScAi) at UCLA School of Engineering, Los Angeles, CA 90095, USA.


Sardaar S, Qi B, Dionne-Laporte A, Rouleau GA, Rabbany R, Trakadis YJ. Machine learning analysis of exome trios to contrast the genomic architecture of autism and schizophrenia. BMC Psychiatry 2020;20:92. [PMID: 32111185 DOI: 10.1186/s12888-020-02503-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/18/2019] [Accepted: 02/17/2020] [Indexed: 12/23/2022] Open

Kaku H, Ozturk M, Viswanathan A, Shahed J, Sheth SA, Kumar S, Ince NF. Unsupervised clustering reveals spatially varying single neuronal firing patterns in the subthalamic nucleus of patients with Parkinson's disease. Clin Park Relat Disord 2020;3:100032. [PMID: 34316618 DOI: 10.1016/j.prdoa.2019.100032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2019] [Revised: 10/29/2019] [Accepted: 12/17/2019] [Indexed: 11/30/2022] Open

Abstract

Introduction

The subthalamic nucleus (STN) is an effective target for deep brain stimulation (DBS) to reduce the motor symptoms of Parkinson's disease (PD). It is important to identify firing patterns within the structure for a better understanding of the electro-pathophysiology of the disease. Using recently established metrics, our study aims to autonomously identify the discharge patterns of individual cells and examine their spatial distribution within the STN.

Methods

We recorded single unit activity (SUA) from 12 awake PD patients undergoing a standard clinical DBS surgery. Three extracted features from raw SUA (local variation, bursting index and prominence of peak) were used with k-means clustering to achieve the aforementioned unsupervised grouping of firing patterns.

Results

279 neurons were isolated and four distinct firing patterns were identified across patients: tonic (11%), irregular (55%), periodic (9%) and non-periodic bursts (25%). The mean firing rates for irregular discharges were significantly lower (p < 0.05) than the rest. Tonic firings were significantly ventral (p < 0.05) while periodic (p < 0.05) and non-periodic (p < 0.01) bursts were dorsal. The percentage of periodically bursting neurons in dorsal region and entire STN were significantly correlated with off state UPDRS tremor scores (r = 0.51, p = 0.04) and improvement in bradykinesia and rigidity (r = 0.57, p = 0.02) respectively.

Conclusion

Strengthening the application of unsupervised clustering for firing patterns of individual cells, this study shows a unique spatial affinity of tonic activity towards the ventral and bursting activity towards the dorsal region of STN in PD patients. This spatial preference, together with the correlation of clinical scores, can provide a clue towards understanding Parkinsonian symptom generation.
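The k-means grouping described in the Methods above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the feature values are invented, each row stands for one neuron's (local variation, bursting index, prominence of peak) triple, the function name `kmeans` is hypothetical, and the initial centers are simply the first `k` rows rather than a randomized or optimized initialization.

```python
def kmeans(rows, k, iters=100):
    """Plain Lloyd's k-means on tuples of floats; returns (labels, centers)."""
    centers = list(rows[:k])  # simplification: first k rows as initial centers
    labels = [0] * len(rows)
    for _ in range(iters):
        # assignment step: nearest center by squared Euclidean distance
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(r, centers[c])))
            for r in rows
        ]
        # update step: each center becomes the mean of its members
        new_centers = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


# Hypothetical feature triples: (local variation, bursting index, peak prominence)
features = [
    (0.2, 0.1, 0.0),
    (0.3, 0.1, 0.1),
    (1.2, 0.9, 0.8),
    (1.1, 0.8, 0.9),
]
labels, centers = kmeans(features, k=2)
print(labels)
```

In practice the features would usually be standardized first so that no single metric dominates the distance, and the number of clusters (four in the study) would be chosen with a validity criterion.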


Min HK, Moon SJ, Park KS, Kim KJ. Integrated systems analysis of salivary gland transcriptomics reveals key molecular networks in Sjögren's syndrome. Arthritis Res Ther 2019;21:294. [PMID: 31856901 PMCID: PMC6921432 DOI: 10.1186/s13075-019-2082-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2019] [Accepted: 12/04/2019] [Indexed: 02/08/2023] Open

Green MJ, Girshkin L, Kremerskothen K, Watkeys O, Quidé Y. A Systematic Review of Studies Reporting Data-Driven Cognitive Subtypes across the Psychosis Spectrum. Neuropsychol Rev 2020;30:446-60. [PMID: 31853717 DOI: 10.1007/s11065-019-09422-7] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2018] [Accepted: 12/02/2019] [Indexed: 10/25/2022]

Abstract

The delineation of cognitive subtypes of schizophrenia and bipolar disorder may offer a means of determining shared genetic markers and neuropathology among individuals with these conditions. We systematically reviewed the evidence from published studies reporting the use of data-driven (i.e., unsupervised) clustering methods to delineate cognitive subtypes among adults diagnosed with schizophrenia, schizoaffective disorder, or bipolar disorder. We reviewed 24 studies in total, contributing data to 13 analyses of schizophrenia spectrum patients, 8 analyses of bipolar disorder, and 5 analyses of mixed samples of schizophrenia and bipolar disorder participants. Studies of bipolar disorder most consistently revealed a 3-cluster solution, comprising a subgroup with 'near-normal' (cognitively spared) cognition and two other subgroups demonstrating graded deficits across cognitive domains. In contrast, there was no clear consensus regarding the number of cognitive subtypes among studies of cognitive subtypes in schizophrenia, while four of the five studies of mixed diagnostic groups reported a 4-cluster solution. Common to all cluster solutions was a severe cognitive deficit subtype with cognitive impairments of moderate to large effect size relative to healthy controls. Our review highlights several key factors (e.g., symptom profile, sample size, statistical procedures, and cognitive domains examined) that may influence the results of data-driven clustering methods, and which were largely inconsistent across the studies reviewed. This synthesis of findings suggests caution should be exercised when interpreting the utility of particular cognitive subtypes for biological investigation, and demonstrates much heterogeneity among studies using unsupervised clustering approaches to cognitive subtyping within and across the psychosis spectrum.


Qian J, Comin M. MetaCon: unsupervised clustering of metagenomic contigs with probabilistic k-mers statistics and coverage. BMC Bioinformatics 2019;20:367. [PMID: 31757198 PMCID: PMC6873667 DOI: 10.1186/s12859-019-2904-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2019] [Accepted: 05/15/2019] [Indexed: 11/30/2022] Open

Zhang Y, Poler SM, Li J, Abedi V, Pendergrass SA, Williams MS, Lee MTM. Dissecting genetic factors affecting phenylephrine infusion rates during anesthesia: a genome-wide association study employing EHR data. BMC Med 2019;17:168. [PMID: 31455332 PMCID: PMC6712853 DOI: 10.1186/s12916-019-1405-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/01/2019] [Accepted: 08/07/2019] [Indexed: 02/04/2023] Open

Abstract

BACKGROUND

The alpha-adrenergic agonist phenylephrine is often used to treat hypotension during anesthesia. In clinical situations, low blood pressure may require prompt intervention by intravenous bolus or infusion. Differences in responsiveness to phenylephrine treatment are commonly observed in clinical practice. Candidate gene studies indicate genetic variants may contribute to this variable response.

METHODS

Pharmacological and physiological data were retrospectively extracted from routine clinical anesthetic records. Response to phenylephrine boluses could not be reliably assessed, so infusion rates were used for analysis. Unsupervised k-means clustering was conducted on clean data containing 4130 patients based on phenylephrine infusion rate and blood pressure parameters, to identify potential phenotypic subtypes. Genome-wide association studies (GWAS) were performed against average infusion rates in two cohorts: phase I (n = 1205) and phase II (n = 329). Top genetic variants identified from the meta-analysis were further examined to see if they could differentiate subgroups identified by k-means clustering.

RESULTS

Three subgroups of patients with different responses to phenylephrine were clustered and characterized: resistant (high infusion rate yet low mean systolic blood pressure (SBP)), intermediate (low infusion rate and low SBP), and sensitive (low infusion rate with high SBP). Differences among clusters were tabulated to assess for possible confounding influences. Comorbidity hierarchical clustering showed the resistant group had a higher prevalence of confounding factors than the intermediate and sensitive groups, although overall prevalence was below 6%. Three loci with P < 1 × 10^-6 were associated with phenylephrine infusion rate. Only rs11572377 with P = 6.09 × 10^-7, a 3'UTR variant of EDN2, encoding a secretory vasoconstricting peptide, could significantly differentiate resistant from sensitive groups (P = 0.015 and 0.018 for phase I and phase II) or resistant from pooled sensitive and intermediate groups (P = 0.047 and 0.018).

CONCLUSIONS

Retrospective analysis of electronic anesthetic records data coupled with the genetic data identified genetic variants contributing to variable sensitivity to phenylephrine infusion during anesthesia. Although the identified top gene, EDN2, has robust biological relevance to vasoconstriction by binding to endothelin type A (ET_A) receptors on arterial smooth muscle cells, further functional as well as replication studies are necessary to confirm this association.


Estiri H, Klann JG, Murphy SN. A clustering approach for detecting implausible observation values in electronic health records data. BMC Med Inform Decis Mak 2019;19:142. [PMID: 31337390 DOI: 10.1186/s12911-019-0852-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2019] [Accepted: 06/26/2019] [Indexed: 12/03/2022] Open

Abstract

Background

Identifying implausible clinical observations (e.g., laboratory test and vital sign values) in Electronic Health Record (EHR) data using rule-based procedures is challenging. Anomaly/outlier detection methods can be applied as an alternative algorithmic approach to flagging such implausible values in EHRs.

Methods

The primary objectives of this research were to develop and test an unsupervised clustering-based anomaly/outlier detection approach for detecting implausible observations in EHR data as an alternative algorithmic solution to the existing procedures. Our approach is built upon two underlying hypotheses: (i) when there is a large number of observations, implausible records should be sparse, and therefore (ii) if these data are clustered properly, clusters with sparse populations should represent implausible observations. To test these hypotheses, we applied an unsupervised clustering algorithm to EHR observation data on 50 laboratory tests from Partners HealthCare. We tested different specifications of the clustering approach and computed confusion matrix indices against a set of silver-standard plausibility thresholds. We compared the results from the proposed approach with conventional anomaly detection (CAD) approaches, including standard deviation and Mahalanobis distance.

Results

We found that the clustering approach produced results with exceptional specificity and high sensitivity. Compared with the conventional anomaly detection approaches, our proposed clustering approach resulted in a significantly smaller number of false positive cases.

Conclusion

Our contributions include (i) a clustering approach for identifying implausible EHR observations, (ii) evidence that implausible observations are sparse in EHR laboratory test results, (iii) a parallel implementation of the clustering approach on i2b2 star schema, and (iv) a set of silver-standard plausibility thresholds for 50 laboratory tests that can be used in other studies for validation. The proposed algorithmic solution can augment human decisions to improve data quality. Therefore, a workflow is needed to complement the algorithm’s job and initiate necessary actions that need to be taken in order to improve the quality of data.

Electronic supplementary material

The online version of this article (10.1186/s12911-019-0852-6) contains supplementary material, which is available to authorized users.
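The two hypotheses stated in the Methods above (implausible values are sparse, so sparsely populated clusters should hold them) can be illustrated with a small sketch. The abstract does not name the clustering algorithm, so the gap-based 1-D grouping used here is a stand-in chosen for brevity and is an assumption, as are the function name `sparse_cluster_flags`, the `gap` and `min_share` parameters, and the example values.

```python
def sparse_cluster_flags(values, gap, min_share):
    """Flag values that fall in sparsely populated 1-D clusters.

    Sorted values are split into clusters wherever two neighbours differ by
    more than `gap`; a cluster holding less than `min_share` of all
    observations is treated as implausible.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > gap:
            clusters.append([])  # start a new cluster after a large gap
        clusters[-1].append(cur)
    flags = [False] * len(values)
    for members in clusters:
        if len(members) / len(values) < min_share:
            for i in members:
                flags[i] = True
    return flags


# Hypothetical lab results: nine plausible values and one likely entry error
lab_values = [4.1, 4.3, 4.0, 4.4, 4.2, 4.5, 4.3, 4.1, 4.2, 43.0]
flags = sparse_cluster_flags(lab_values, gap=1.0, min_share=0.2)
print([v for v, f in zip(lab_values, flags) if f])
```

Both thresholds are tuning choices; evaluating flagged values against reference plausibility limits, as the study does with its silver-standard thresholds, is what indicates whether a given setting is acceptable.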
