1
Choudalakis S, Kastis GA, Dikaios N. Intra-clustering analysis reveals tissue-specific mutational patterns. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2025; 263:108681. [PMID: 40050208 DOI: 10.1016/j.cmpb.2025.108681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2024] [Revised: 02/06/2025] [Accepted: 02/18/2025] [Indexed: 03/14/2025]
Abstract
BACKGROUND AND OBJECTIVE The identification of tissue-specific mutational patterns associated with cancer is challenging due to the low frequency of certain mutations and the high variability among tumors within the same cancer type. To address this inter-tumoral heterogeneity, our study aims to uncover infrequent mutational patterns by proposing a novel intra-clustering analysis. METHODS A network graph of 8303 patients and 198 genes was constructed using single-point-mutation data from The Cancer Genome Atlas (TCGA). Patient-gene groups were retrieved with the parallel use of two separate methodologies based on (a) Barber's modularity index and (b) network dynamics. An intra-clustering analysis was employed to explore the patterns within smaller patient subgroups in two phases: (i) to determine the significant presence of a gene in a cancer type using Fisher's exact test and (ii) to determine gene-to-gene patterns using multiple correspondence analysis and DISCOVER. All results were subjected to Benjamini-Hochberg correction at a false discovery rate of 5%. RESULTS This analysis was applied to 24 statistically meaningful groups of 2619 patients spanning 21 cancer types, and it recovered 42 mutational patterns that are not reported in the TCGA consortium publications. Notably, our findings: (i) suggest that AMER1 mutations are a putative separative element between colon and rectal adenocarcinomas, (ii) highlight the significant presence of RAC1 in head and neck squamous cell carcinoma, (iii) suggest that EP300 mutations in head and neck squamous cell carcinoma are independent of the HPV status of the patients, and (iv) show that mutation-based clusters can contain patients with contrasting genetic alterations. CONCLUSIONS The proposed intra-clustering analysis extracted statistically significant relationships within clusters, uncovering putative clinically relevant connections and disentangling mutational heterogeneity.
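The multiple-testing step named in the abstract above can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure. This is not the authors' code: the function name `benjamini_hochberg` and the example p-values are hypothetical, and only the 5% false discovery rate comes from the abstract.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where a hypothesis is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up procedure;
    the function name and inputs are assumptions, not from the paper.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, e.g. from per-gene Fisher's exact tests.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, fdr=0.05))
```

Because the procedure is step-up, a p-value that misses its own rank threshold is still rejected when a larger p-value meets a later threshold.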
Affiliation(s)
- Stamatis Choudalakis
- Mathematics Research Center, Academy of Athens, 4, Soranou Efesiou str., 11527 Athens, Greece; Medical School of Athens, National and Kapodistrian University of Athens, 75, Mikras Asias str., 11527 Athens, Greece.
| | - George A Kastis
- Mathematics Research Center, Academy of Athens, 4, Soranou Efesiou str., 11527 Athens, Greece.
| | - Nikolaos Dikaios
- Mathematics Research Center, Academy of Athens, 4, Soranou Efesiou str., 11527 Athens, Greece.
2
Yoshida K, Toyoizumi T. A biological model of nonlinear dimensionality reduction. SCIENCE ADVANCES 2025; 11:eadp9048. [PMID: 39908371 PMCID: PMC11801247 DOI: 10.1126/sciadv.adp9048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 01/06/2025] [Indexed: 02/07/2025]
Abstract
Obtaining appropriate low-dimensional representations from high-dimensional sensory inputs in an unsupervised manner is essential for straightforward downstream processing. Although nonlinear dimensionality reduction methods such as t-distributed stochastic neighbor embedding (t-SNE) have been developed, their implementation in simple biological circuits remains unclear. Here, we develop a biologically plausible dimensionality reduction algorithm compatible with t-SNE, which uses a simple three-layer feedforward network mimicking the Drosophila olfactory circuit. The proposed learning rule, described as three-factor Hebbian plasticity, is effective for datasets such as entangled rings and MNIST, comparable to t-SNE. We further show that the algorithm could be working in olfactory circuits in Drosophila by analyzing the multiple experimental data in previous studies. We lastly suggest that the algorithm is also beneficial for association learning between inputs and rewards, allowing the generalization of these associations to other inputs not yet associated with rewards.
Affiliation(s)
- Kensuke Yoshida
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Taro Toyoizumi
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
3
Vagenas G, Theodoropoulos C, Moutaouakil S, Benaissa H, Fendane Y, El Rharras A, Oikonomou A, Stoumboudi MT, Dimitriou E, Ghamizi M, Stamou A. Ecohydraulics-based environmental flow assessment in two arid North African rivers. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 954:176373. [PMID: 39299311 DOI: 10.1016/j.scitotenv.2024.176373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Revised: 09/15/2024] [Accepted: 09/16/2024] [Indexed: 09/22/2024]
Abstract
North Africa is among the most water-stressed regions in the world; still, the habitat requirements of its freshwater biota are largely unknown. In this study, we (i) developed habitat suitability curves (HSCs) for freshwater macroinvertebrates in two poorly studied, regulated North African rivers (Ziz and Oum Er-Rbia), and (ii) assessed environmental flows downstream of each river dam by incorporating the HSCs in two-dimensional ecohydraulic models. We demonstrate a low-cost sampling methodology combined with freely distributed ecohydraulic modeling software. The results showed that macroinvertebrates in the arid-desert Ziz River could tolerate a wider range of habitats in terms of flow velocity and water depth than those in the arid-steppe Oum Er-Rbia River, probably due to their adaptation to extreme (arid-desert) environmental conditions. Optimal environmental flows downstream of the Al Hassan Addakhil (Ziz River) and the Al Massira (Oum Er-Rbia River) dams were 1 m³/s and 2 m³/s, respectively. However, environmental flows of 0.5 m³/s and 1 m³/s, respectively, could still maintain sustainable freshwater biota downstream of the dams. The results further highlight the critical status of the Ziz River, which was completely dry, and the alarming status of the Oum Er-Rbia River due to the significant reduction in the water levels of the Al Massira Dam. In a continuously changing climate, we suggest that the proposed environmental flows should be immediately delivered to prevent droughts and ensure healthy freshwater communities downstream of the dams, within a basin-wide freshwater management framework. In this water-scarce region, more research is necessary to increase ecological awareness about these understudied freshwater systems and achieve a balance between human needs and ecosystem requirements.
Affiliation(s)
- G Vagenas
- Institute of Marine Biological Resources and Inland Waters, Hellenic Centre for Marine Research, 46.7km Athens-Sounio Av., 19013 Anavissos, Greece; School of Civil Engineering, Water Resources and Environmental Engineering, National Technical University of Athens, Greece; Natural History Museum of Marrakech, Cadi Ayyad University, Marrakech, Morocco.
| | - C Theodoropoulos
- Institute of Marine Biological Resources and Inland Waters, Hellenic Centre for Marine Research, 46.7km Athens-Sounio Av., 19013 Anavissos, Greece; School of Civil Engineering, Water Resources and Environmental Engineering, National Technical University of Athens, Greece
| | - S Moutaouakil
- Natural History Museum of Marrakech, Cadi Ayyad University, Marrakech, Morocco; Department of Biology, School of Science Semlalia, Cadi Ayyad University, Marrakech, Morocco
| | - H Benaissa
- Natural History Museum of Marrakech, Cadi Ayyad University, Marrakech, Morocco; Department of Biology, School of Science Semlalia, Cadi Ayyad University, Marrakech, Morocco; Institute of Technology of Maritime Fisheries, Al Hoceima, Morocco
| | - Y Fendane
- Natural History Museum of Marrakech, Cadi Ayyad University, Marrakech, Morocco
| | - A El Rharras
- Natural History Museum of Marrakech, Cadi Ayyad University, Marrakech, Morocco; Department of Biology, School of Science Semlalia, Cadi Ayyad University, Marrakech, Morocco
| | - A Oikonomou
- Institute of Marine Biological Resources and Inland Waters, Hellenic Centre for Marine Research, 46.7km Athens-Sounio Av., 19013 Anavissos, Greece
| | - M Th Stoumboudi
- Institute of Marine Biological Resources and Inland Waters, Hellenic Centre for Marine Research, 46.7km Athens-Sounio Av., 19013 Anavissos, Greece
| | - E Dimitriou
- Institute of Marine Biological Resources and Inland Waters, Hellenic Centre for Marine Research, 46.7km Athens-Sounio Av., 19013 Anavissos, Greece
| | - M Ghamizi
- Natural History Museum of Marrakech, Cadi Ayyad University, Marrakech, Morocco; Department of Biology, School of Science Semlalia, Cadi Ayyad University, Marrakech, Morocco
| | - A Stamou
- School of Civil Engineering, Water Resources and Environmental Engineering, National Technical University of Athens, Greece
4
Powers SD, Schmidt KM, Killelea A, Strumpf A, McManus KA. Clustering affordable care act qualified health plans to understand how and where insurance facilitates or impedes access to HIV prevention. AIDS Res Ther 2024; 21:83. [PMID: 39563344 PMCID: PMC11575131 DOI: 10.1186/s12981-024-00674-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Accepted: 11/14/2024] [Indexed: 11/21/2024] Open
Abstract
BACKGROUND With access to and uptake of pre-exposure prophylaxis (PrEP), the United States can prevent new HIV infections. To end the HIV epidemic, health insurance plans must facilitate access to comprehensive preventive care benefits. Since plan benefit designs vary considerably by plan, it is difficult to systematically determine plans that facilitate and restrict preventive services for PrEP. METHODS We applied an unsupervised machine learning method to cluster 17,061 Qualified Health Plans offered to individuals. We examined the clusters to draw conclusions about the types of benefits insurance companies tend to group together in plans. Then we analyzed the geographic distribution of those clusters across the United States to assess geographic inequities in access to HIV preventive care. RESULTS Our method uncovered three cohesive clusters of plans. Plans in Cluster 1: the least restrictive cluster, facilitate access to preventive care using copays over coinsurance on almost all benefits; Cluster 2: the moderately restrictive cluster, plans cover HIV prevention benefits with copays but restrict access to general health benefits with coinsurance; and Cluster 3: the most restrictive cluster, plans cover almost all benefits using coinsurance. Overall, increased prior authorization requirements tend to accompany reductions in out-of-pocket costs. Examining the geographic plan distribution, states with at least one rating area where at least 75% of plans offered are in the most restrictive cluster included: Georgia, Illinois, Missouri, Oklahoma, Texas, Virginia, and Wyoming. CONCLUSIONS Insurance plan design is complex. To address the ambitious call to end the HIV epidemic in this country, plans should also take into account both public health and health equity factors to create plan designs that ensure access to critical preventive services for people who need them most. 
Addressing the growing disparities in PrEP access along racial and ethnic lines should be a national priority, and federal and state insurance regulators as well as insurance plans themselves should be part of the conversation about how to ensure people who would benefit from PrEP can access it. Better state/federal regulation of plan design to ensure access is consistent, equitable, and based on clinical recommendations will reduce the variability across plan designs.
Affiliation(s)
- Samuel D Powers
- Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA
- Department of Psychology, University of Virginia, Charlottesville, VA, USA
| | - Karen M Schmidt
- Department of Psychology, University of Virginia, Charlottesville, VA, USA
| | - Amy Killelea
- Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA
| | - Andrew Strumpf
- Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA
| | - Kathleen A McManus
- Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA.
5
Breimann S, Frishman D. AAclust: k-optimized clustering for selecting redundancy-reduced sets of amino acid scales. BIOINFORMATICS ADVANCES 2024; 4:vbae165. [PMID: 39544628 PMCID: PMC11562964 DOI: 10.1093/bioadv/vbae165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 09/10/2024] [Accepted: 10/23/2024] [Indexed: 11/17/2024]
Abstract
Summary Amino acid scales are crucial for sequence-based protein prediction tasks, yet no gold standard scale set or simple scale selection methods exist. We developed AAclust, a wrapper for clustering models that require a pre-defined number of clusters k, such as k-means. AAclust obtains redundancy-reduced scale sets by clustering and selecting one representative scale per cluster, where k can either be optimized by AAclust or defined by the user. The utility of AAclust scale selections was assessed by applying machine learning models to 24 protein benchmark datasets. We found that top-performing scale sets were different for each benchmark dataset and significantly outperformed scale sets used in previous studies. Noteworthy is the strong dependence of the model performance on the scale set size. AAclust enables a systematic optimization of scale-based feature engineering in machine learning applications. Availability and implementation The AAclust algorithm is part of AAanalysis, a Python-based framework for interpretable sequence-based protein prediction, which is documented and accessible at https://aaanalysis.readthedocs.io/en/latest and https://github.com/breimanntools/aaanalysis.
Affiliation(s)
- Stephan Breimann
- Department of Bioinformatics, School of Life Sciences, Technical University of Munich (TUM), Freising, 85354, Germany
- Division of Metabolic Biochemistry, Biomedical Center (BMC), LMU Munich, Munich, 81377, Germany
- Biochemistry of γ-Secretase, German Center for Neurodegenerative Diseases (DZNE), Munich, 81377, Germany
| | - Dmitrij Frishman
- Department of Bioinformatics, School of Life Sciences, Technical University of Munich (TUM), Freising, 85354, Germany
6
Breimann S, Kamp F, Steiner H, Frishman D. AAontology: An Ontology of Amino Acid Scales for Interpretable Machine Learning. J Mol Biol 2024; 436:168717. [PMID: 39053689 DOI: 10.1016/j.jmb.2024.168717] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/03/2024] [Revised: 07/15/2024] [Accepted: 07/19/2024] [Indexed: 07/27/2024]
Abstract
Amino acid scales are crucial for protein prediction tasks, many of them being curated in the AAindex database. Despite various clustering attempts to organize them and to better understand their relationships, these approaches lack the fine-grained classification necessary for satisfactory interpretability in many protein prediction problems. To address this issue, we developed AAontology, a two-level classification for 586 amino acid scales (mainly from AAindex) together with an in-depth analysis of their relations, using bag-of-word-based classification, clustering, and manual refinement over multiple iterations. AAontology organizes physicochemical scales into 8 categories and 67 subcategories, enhancing the interpretability of scale-based machine learning methods in protein bioinformatics. Thereby it enables researchers to gain a deeper biological insight. We anticipate that AAontology will be a building block to link amino acid properties with protein function and dysfunctions as well as aid informed decision-making in mutation analysis or protein drug design.
Affiliation(s)
- Stephan Breimann
- Department of Bioinformatics, School of Life Sciences, Technical University of Munich, Freising, Germany; Ludwig-Maximilians-University Munich, Biomedical Center, Division of Metabolic Biochemistry, Munich, Germany; German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
| | - Frits Kamp
- Ludwig-Maximilians-University Munich, Biomedical Center, Division of Metabolic Biochemistry, Munich, Germany
| | - Harald Steiner
- Ludwig-Maximilians-University Munich, Biomedical Center, Division of Metabolic Biochemistry, Munich, Germany; German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
| | - Dmitrij Frishman
- Department of Bioinformatics, School of Life Sciences, Technical University of Munich, Freising, Germany.
7
Tiwari P, Tripathi LP. Long Non-Coding RNAs, Nuclear Receptors and Their Cross-Talks in Cancer-Implications and Perspectives. Cancers (Basel) 2024; 16:2920. [PMID: 39199690 PMCID: PMC11352509 DOI: 10.3390/cancers16162920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 07/30/2024] [Accepted: 08/14/2024] [Indexed: 09/01/2024] Open
Abstract
Long non-coding RNAs (lncRNAs) play key roles in various epigenetic and post-transcriptional events in the cell, thereby significantly influencing cellular processes including gene expression, development and diseases such as cancer. Nuclear receptors (NRs) are a family of ligand-regulated transcription factors that typically regulate transcription of genes involved in a broad spectrum of cellular processes, immune responses and in many diseases including cancer. Owing to their many overlapping roles as modulators of gene expression, the paths traversed by lncRNA and NR-mediated signaling often cross each other; these lncRNA-NR cross-talks are being increasingly recognized as important players in many cellular processes and diseases such as cancer. Here, we review the individual roles of lncRNAs and NRs, especially growth factor modulated receptors such as androgen receptors (ARs), in various types of cancers and how the cross-talks between lncRNAs and NRs are involved in cancer progression and metastasis. We discuss the challenges involved in characterizing lncRNA-NR associations and how to overcome them. Furthering our understanding of the mechanisms of lncRNA-NR associations is crucial to realizing their potential as prognostic features, diagnostic biomarkers and therapeutic targets in cancer biology.
Affiliation(s)
- Prabha Tiwari
- Department of Microbiology and Immunology, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
| | - Lokesh P. Tripathi
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Kanagawa, Japan
- AI Center for Health and Biomedical Research (ArCHER), National Institutes of Biomedical Innovation, Health and Nutrition, Kento Innovation Park NK Building, 3-17 Senrioka Shinmachi, Settsu 566-0002, Osaka, Japan
8
Shah M, Guo L, Xu X, Deng L, Lu K, Dong J, Zhao C, Xu J. eLIMS: Ensemble Learning-Based Spatial Segmentation of Mass Spectrometry Imaging to Explore Metabolic Heterogeneity. J Proteome Res 2024; 23:3088-3095. [PMID: 38690713 DOI: 10.1021/acs.jproteome.3c00764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
Spatial segmentation is an essential processing method for image analysis aiming to identify the characteristic suborgans or microregions from mass spectrometry imaging (MSI) data, which is critical for understanding the spatial heterogeneity of biological information and function and the underlying molecular signatures. Due to the intrinsic characteristics of MSI data including spectral nonlinearity, high-dimensionality, and large data size, the common segmentation methods lack the capability for capturing the accurate microregions associated with biological functions. Here we proposed an ensemble learning-based spatial segmentation strategy, named eLIMS, that combines a randomized unified manifold approximation and projection (r-UMAP) dimensionality reduction module for extracting significant features and an ensemble pixel clustering module for aggregating the clustering maps from r-UMAP. Three MSI datasets are used to evaluate the performance of eLIMS, including mouse fetus, human adenocarcinoma, and mouse brain. Experimental results demonstrate that the proposed method has potential in partitioning the heterogeneous tissues into several subregions associated with anatomical structure, i.e., the suborgans of the brain region in mouse fetus data are identified as dorsal pallium, midbrain, and brainstem. Furthermore, it effectively discovers critical microregions related to physiological and pathological variations offering new insight into metabolic heterogeneity.
Affiliation(s)
- Mudassir Shah
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, China
| | - Lei Guo
- Interdisciplinary Institute of Medical Engineering, Fuzhou University, Fuzhou 350108, China
| | - Xiangnan Xu
- School of Business and Economics, Humboldt-Universität zu Berlin, Berlin 10099, Germany
| | - Lingli Deng
- Department of Information Engineering, East China University of Technology, Nanchang 330013, China
| | - Keyi Lu
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, China
| | - Jiyang Dong
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, China
| | - Chao Zhao
- Bionic Sensing and Intelligence Center, Institute of Biomedical and Health Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
| | - Jingjing Xu
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, China
9
Flores VS, Amgarten DE, Iha BKV, Ryon KA, Danko D, Tierney BT, Mason C, da Silva AM, Setubal JC. Discovery and description of novel phage genomes from urban microbiomes sampled by the MetaSUB consortium. Sci Rep 2024; 14:7913. [PMID: 38575625 PMCID: PMC10994904 DOI: 10.1038/s41598-024-58226-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 03/26/2024] [Indexed: 04/06/2024] Open
Abstract
Bacteriophages are recognized as the most abundant members of microbiomes and have therefore a profound impact on microbial communities through the interactions with their bacterial hosts. The International Metagenomics and Metadesign of Subways and Urban Biomes Consortium (MetaSUB) has sampled mass-transit systems in 60 cities over 3 years using metagenomics, throwing light into these hitherto largely unexplored urban environments. MetaSUB focused primarily on the bacterial community. In this work, we explored MetaSUB metagenomic data in order to recover and analyze bacteriophage genomes. We recovered and analyzed 1714 phage genomes with size at least 40 kbp, from the class Caudoviricetes, the vast majority of which (80%) are novel. The recovered genomes were predicted to belong to temperate (69%) and lytic (31%) phages. Thirty-three of these genomes have more than 200 kbp, and one of them reaches 572 kbp, placing it among the largest phage genomes ever found. In general, the phages tended to be site-specific or nearly so, but 194 genomes could be identified in every city from which phage genomes were retrieved. We predicted hosts for 48% of the phages and observed general agreement between phage abundance and the respective bacterial host abundance, which include the most common nosocomial multidrug-resistant pathogens. A small fraction of the phage genomes are carriers of antibiotic resistance genes, and such genomes tended to be particularly abundant in the sites where they were found. We also detected CRISPR-Cas systems in five phage genomes. This study expands the previously reported MetaSUB results and is a contribution to the knowledge about phage diversity, global distribution, and phage genome content.
Affiliation(s)
- Vinicius S Flores
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, 05508-000, Brazil
| | - Deyvid E Amgarten
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, 05508-000, Brazil
- Hospital Israelita Albert Einstein, São Paulo, Brazil
| | - Bruno Koshin Vázquez Iha
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, 05508-000, Brazil
| | - Braden T Tierney
- Weill Cornell Medicine, New York, NY, USA
- Harvard Medical School, Cambridge, MA, USA
| | - Aline Maria da Silva
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, 05508-000, Brazil.
| | - João Carlos Setubal
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, 05508-000, Brazil.
10
van Amstel RBE, Cremer OL, van Vught LA, Bos LDJ. Subphenotypes in critical illness: a priori biological rationale is key. Intensive Care Med 2024; 50:299-301. [PMID: 38015264 DOI: 10.1007/s00134-023-07273-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 11/11/2023] [Indexed: 11/29/2023]
Affiliation(s)
- Rombout B E van Amstel
- Department of Intensive Care Medicine, Amsterdam UMC, Location University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands.
- Laboratory of Experimental Intensive Care and Anesthesiology (L.E.I.C.A.), Amsterdam UMC, Location University of Amsterdam, Amsterdam, The Netherlands.
| | - Olaf L Cremer
- Department of Intensive Care, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Lonneke A van Vught
- Department of Intensive Care Medicine, Amsterdam UMC, Location University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands
- Center for Experimental and Molecular Medicine, Amsterdam UMC, Location University of Amsterdam, Amsterdam, The Netherlands
| | - Lieuwe D J Bos
- Department of Intensive Care Medicine, Amsterdam UMC, Location University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands
- Laboratory of Experimental Intensive Care and Anesthesiology (L.E.I.C.A.), Amsterdam UMC, Location University of Amsterdam, Amsterdam, The Netherlands
11
Ilgen U. Cluster analysis as a clinical and research tool in Behçet's syndrome. Curr Opin Rheumatol 2024; 36:3-8. [PMID: 37729051 DOI: 10.1097/bor.0000000000000980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
PURPOSE OF REVIEW The purpose of this review was to comprehensively summarize recent phenotype research findings in Behçet's syndrome. RECENT FINDINGS Cluster analysis has recently been employed as a phenotype research tool in Behçet's syndrome. Studies reported different clustering patterns caused by biological variation and some degree of artificial heterogeneity. However, some clusters were more consistent than others: (1) oral ulcers, genital ulcers, and skin lesions; (2) oral ulcers, genital ulcers, skin lesions, and arthritis; (3) oral ulcers, genital ulcers, skin lesions, and uveitis; and (4) oral ulcers, genital ulcers, skin lesions, and gastrointestinal involvement. A number of loci suggestive of differential risk for individual disease manifestations were proposed. Peripheral blood gene expression profile and plasma proteome exhibited significant differences in patients with different organ involvements and were able to differentiate between disease phenotypes. However, these observations require further validation and functional studies. SUMMARY Clustering patterns in Behçet's syndrome are highly heterogeneous. Artificial heterogeneity might obscure the true biological variation of disease expression. Preliminary genetic, transcriptomic and proteomic data suggest that different pathogenetic mechanisms may operate in different phenotypes of Behçet's syndrome.
Affiliation(s)
- Ufuk Ilgen
- Rheumatology Clinic, Edirne State Hospital, Edirne, Turkey
12
Capouskova K, Zamora‐López G, Kringelbach ML, Deco G. Integration and segregation manifolds in the brain ensure cognitive flexibility during tasks and rest. Hum Brain Mapp 2023; 44:6349-6363. [PMID: 37846551 PMCID: PMC10681658 DOI: 10.1002/hbm.26511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 09/14/2023] [Accepted: 09/25/2023] [Indexed: 10/18/2023] Open
Abstract
Adapting to a constantly changing environment requires the human brain to flexibly switch among many demanding cognitive tasks, processing both specialized and integrated information associated with the activity in functional networks over time. In this study, we investigated the nature of the temporal alternation between segregated and integrated states in the brain during rest and six cognitive tasks using functional MRI. We employed a deep autoencoder to explore the 2D latent space associated with the segregated and integrated states. Our results show that the integrated state occupies less space in the latent space manifold compared to the segregated states. Moreover, the integrated state is characterized by lower entropy of occupancy than the segregated state, suggesting that integration plays a consolidating role, while segregation may serve as cognitive expertness. Comparing rest and the tasks, we found that rest exhibits higher entropy of occupancy, indicating a more random wandering of the mind compared to the expected focus during task performance. Our study demonstrates that both transient, short-lived integrated and segregated states are present during rest and task performance, flexibly switching between them, with integration serving as information compression and segregation related to information specialization.
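The "entropy of occupancy" comparison in the abstract above can be made concrete with a small sketch. The abstract does not give the exact definition, so this assumes the Shannon entropy of how often each discretized region (a hypothetical grid cell of the 2D latent space) is visited; the function name and the cell labels are illustrative only.

```python
import math
from collections import Counter


def occupancy_entropy(visited_cells):
    """Shannon entropy (in bits) of the occupancy distribution.

    `visited_cells` is a sequence of hashable labels, one per time point,
    naming the latent-space cell occupied at that time. This definition
    is an assumption; the abstract does not specify the formula.
    """
    total = len(visited_cells)
    counts = Counter(visited_cells)
    # Sum of -p * log2(p) over the cells that were actually visited.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical trajectories: one spread over four cells, one confined to a single cell.
    spread = ["a", "b", "c", "d"]
    confined = ["a", "a", "a", "a"]
    print(occupancy_entropy(spread), occupancy_entropy(confined))
```

Under this assumed definition, a trajectory that wanders evenly over many cells scores higher than one confined to a few, which is the direction of the rest-versus-task contrast the abstract reports.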
Affiliation(s)
- Katerina Capouskova
  - Center for Brain and Cognition, Computational Neuroscience Group, DTIC, Universitat Pompeu Fabra, Barcelona, Spain
- Gorka Zamora‐López
  - Center for Brain and Cognition, Computational Neuroscience Group, DTIC, Universitat Pompeu Fabra, Barcelona, Spain
- Morten L. Kringelbach
  - Department of Psychiatry, University of Oxford, Oxford, United Kingdom
  - Center for Music in the Brain, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
  - Centre for Eudaimonia and Human Flourishing, Linacre College, University of Oxford, Oxford, United Kingdom
- Gustavo Deco
  - Center for Brain and Cognition, Computational Neuroscience Group, DTIC, Universitat Pompeu Fabra, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
13
Shimpi AA, Williams ED, Ling L, Tamir T, White FM, Fischbach C. Phosphoproteomic Changes Induced by Cell-Derived Matrix and Their Effect on Tumor Cell Migration and Cytoskeleton Remodeling. ACS Biomater Sci Eng 2023; 9:6835-6848. [PMID: 38015076 DOI: 10.1021/acsbiomaterials.3c01034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Increased fibrotic extracellular matrix (ECM) deposition promotes tumor invasion, which is the first step of the metastatic cascade. Yet, the underlying mechanisms are poorly understood as conventional studies of tumor cell migration are often performed in 2D cultures lacking the compositional and structural complexity of native ECM. Moreover, these studies frequently focus on select candidate pathways potentially overlooking other relevant changes in cell signaling. Here, we combine a cell-derived matrix (CDM) model with phosphotyrosine phosphoproteomic analysis to investigate tumor cell migration on fibrotic ECM relative to standard tissue culture plastic (TCP). Our results suggest that tumor cells cultured on CDMs migrate faster and in a more directional manner than their counterparts on TCP. These changes in migration correlate with decreased cell spreading and increased cell elongation. While the formation of phosphorylated focal adhesion kinase (pFAK)+ adhesion complexes did not vary between TCP and CDMs, time-dependent phosphoproteomic analysis identified that the SRC family kinase LYN may be differentially regulated. Pharmacological inhibition of LYN decreased tumor cell migration and cytoskeletal rearrangement on CDMs and also on TCP, suggesting that LYN regulates tumor cell migration on CDMs in combination with other mechanisms. These data highlight how the combination of physicochemically complex in vitro systems with phosphoproteomics can help identify signaling mechanisms by which the fibrotic ECM regulates tumor cell migration.
Affiliation(s)
- Adrian A Shimpi
  - Nancy E. and Peter C. Meinig School of Biomedical Engineering, Cornell University, Ithaca, New York 14853, United States
- Erik D Williams
  - Department of Information Science, Cornell University, Ithaca, New York 14853, United States
- Lu Ling
  - Nancy E. and Peter C. Meinig School of Biomedical Engineering, Cornell University, Ithaca, New York 14853, United States
- Tigist Tamir
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Forest M White
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Claudia Fischbach
  - Nancy E. and Peter C. Meinig School of Biomedical Engineering, Cornell University, Ithaca, New York 14853, United States
  - Kavli Institute at Cornell for Nanoscale Science, Cornell University, Ithaca, New York 14853, United States
14
Riazi K, Ly M, Barty R, Callum J, Arnold DM, Heddle NM, Down DG, Sidhu D, Li N. An unsupervised learning approach to identify immunoglobulin utilization patterns using electronic health records. Transfusion 2023; 63:2234-2247. [PMID: 37861272 DOI: 10.1111/trf.17585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 08/20/2023] [Accepted: 09/20/2023] [Indexed: 10/21/2023]
Abstract
BACKGROUND Managing Canada's immunoglobulin (Ig) product resource allocation is challenging due to increasing demand, high expenditure, and global shortages. Detection of groups with high utilization rates can help with resource planning for Ig products. This study aims to uncover utilization subgroups among the Ig recipients using electronic health records (EHRs). METHODS The study included all Ig recipients (intravenous or subcutaneous) in Calgary from 2014 to 2020, and their EHR data, including blood inventory, recipient demographics, and laboratory test results, were analyzed. Patient clusters were derived based on patient characteristics and laboratory test data using K-means clustering. Clusters were interpreted using descriptive analyses and visualization techniques. RESULTS Among 4112 recipients, six clusters were identified. Clusters 1 and 2 comprised 408 (9.9%) and 1272 (30.9%) patients, respectively, contributing to 62.2% and 27.1% of total Ig utilization. Cluster 3 included 1253 (30.5%) patients, with 86.4% of infusions administered in an inpatient setting. Cluster 4, comprising 1034 (25.1%) patients, had a median age of 4 years, while clusters 2-6 were adults with median ages of 46-60. Cluster 5 had 62 (1.5%) patients, with 77.3% infusions occurring in emergency departments. Cluster 6 contained 83 (2.0%) patients receiving subcutaneous Ig treatments. CONCLUSION The results identified data-driven segmentations of patients with high Ig utilization rates and patients with high risk for short-term inpatient use. Our report is the first on EHR data-driven clustering of Ig utilization patterns. The findings hold the potential to inform demand forecasting and resource allocation decisions during shortages of Ig products.
Affiliation(s)
- Kiarash Riazi
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Mark Ly
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Rebecca Barty
  - Ontario Regional Blood Coordinating Network, Hamilton, Ontario, Canada
  - Michael G. DeGroote Centre for Transfusion Research, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Jeannie Callum
  - Department of Pathology and Molecular Medicine, Kingston Health Sciences Centre and Queen's University, Kingston, Ontario, Canada
  - Department of Laboratory Medicine and Molecular Diagnostics, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
- Donald M Arnold
  - Michael G. DeGroote Centre for Transfusion Research, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
  - Centre for Innovation, Canadian Blood Services, Ottawa, Ontario, Canada
  - Department of Medicine, Michael G. DeGroote School of Medicine, McMaster University, Hamilton, Ontario, Canada
- Nancy M Heddle
  - Michael G. DeGroote Centre for Transfusion Research, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
  - Centre for Innovation, Canadian Blood Services, Ottawa, Ontario, Canada
- Douglas G Down
  - Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada
- Davinder Sidhu
  - Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Na Li
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada
  - Michael G. DeGroote Centre for Transfusion Research, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
  - Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada
15
Zelig A, Kariti H, Kaplan N. KMD clustering: robust general-purpose clustering of biological data. Commun Biol 2023; 6:1110. [PMID: 37919399 PMCID: PMC10622433 DOI: 10.1038/s42003-023-05480-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 10/18/2023] [Indexed: 11/04/2023] Open
Abstract
The noisy and high-dimensional nature of biological data has spawned advanced clustering algorithms that are tailored for specific biological datatypes. However, the performance of such methods varies greatly between datasets and they require post hoc tuning of cryptic hyperparameters. We present k minimal distance (KMD) clustering, a general-purpose method based on a generalization of single and average linkage hierarchical clustering. We introduce a generalized silhouette-like function to eliminate the cryptic hyperparameter k, and use sampling to enable application to million-object datasets. Rigorous comparisons to general and specialized clustering methods on simulated, mass cytometry and scRNA-seq datasets show consistent high performance of KMD clustering across all datasets.
Affiliation(s)
- Aviv Zelig
  - Data Science & Engineering Program, Faculty of Industrial Engineering & Management, Technion - Israel Institute of Technology, Haifa, Israel
  - Department of Physiology, Biophysics & Systems Biology, Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
- Hagai Kariti
  - Department of Physiology, Biophysics & Systems Biology, Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
- Noam Kaplan
  - Department of Physiology, Biophysics & Systems Biology, Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
16
Ahsanuddin S, Wu AY. Single-cell transcriptomics of the ocular anterior segment: a comprehensive review. Eye (Lond) 2023; 37:3334-3350. [PMID: 37138096 PMCID: PMC10156079 DOI: 10.1038/s41433-023-02539-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 03/07/2023] [Accepted: 04/11/2023] [Indexed: 05/05/2023] Open
Abstract
Elucidating the cellular and genetic composition of ocular tissues is essential for uncovering the pathophysiology of ocular diseases. Since the introduction of single-cell RNA sequencing (scRNA-seq) in 2009, vision researchers have performed extensive single-cell analyses to better understand transcriptome complexity and heterogeneity of ocular structures. This technology has revolutionized our ability to identify rare cell populations and to make cross-species comparisons of gene expression in both steady state and disease conditions. Importantly, single-cell transcriptomic analyses have enabled the identification of cell-type specific gene markers and signalling pathways between ocular cell populations. While most scRNA-seq studies have been conducted on retinal tissues, large-scale transcriptomic atlases pertaining to the ocular anterior segment have also been constructed in the past three years. This timely review provides vision researchers with an overview of scRNA-seq experimental design, technical limitations, and clinical applications in a variety of anterior segment-related ocular pathologies. We review open-access anterior segment-related scRNA-seq datasets and illustrate how scRNA-seq can be an indispensable tool for the development of targeted therapeutics.
Affiliation(s)
- Sofia Ahsanuddin
  - Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Ophthalmology, New York Eye and Ear Infirmary of Mount Sinai, New York City, NY, USA
  - Department of Ophthalmology, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
- Albert Y Wu
  - Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
17
Wang Y, Wei W, Du W, Cai J, Liao Y, Lu H, Kong B, Zhang Z. Deep-Learning-Based Mixture Identification for Nuclear Magnetic Resonance Spectroscopy Applied to Plant Flavors. Molecules 2023; 28:7380. [PMID: 37959799 PMCID: PMC10648966 DOI: 10.3390/molecules28217380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 10/25/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open
Abstract
Nuclear magnetic resonance (NMR) is a crucial technique for analyzing mixtures consisting of small molecules, providing non-destructive, fast, reproducible, and unbiased benefits. However, it is challenging to perform mixture identification because of the offset of chemical shifts and peak overlaps that often exist in mixtures such as plant flavors. Here, we propose a deep-learning-based mixture identification method (DeepMID) that can be used to identify plant flavors (mixtures) in a formulated flavor (mixture consisting of several plant flavors) without the need to know the specific components in the plant flavors. A pseudo-Siamese convolutional neural network (pSCNN) and a spatial pyramid pooling (SPP) layer were used to solve the problems due to their high accuracy and robustness. The DeepMID model is trained, validated, and tested on an augmented data set containing 50,000 pairs of formulated and plant flavors. We demonstrate that DeepMID can achieve excellent prediction results in the augmented test set: ACC = 99.58%, TPR = 99.48%, FPR = 0.32%; and two experimentally obtained data sets: one shows ACC = 97.60%, TPR = 92.81%, FPR = 0.78% and the other shows ACC = 92.31%, TPR = 80.00%, FPR = 0.00%. In conclusion, DeepMID is a reliable method for identifying plant flavors in formulated flavors based on NMR spectroscopy, which can assist researchers in accelerating the design of flavor formulations.
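The ACC, TPR, and FPR figures quoted in this abstract follow the standard confusion-matrix definitions. A minimal sketch of those formulas is given here; the function name `classification_rates` and the counts in the example are hypothetical and are not taken from the DeepMID data sets.

```python
def classification_rates(tp, fp, tn, fn):
    """Return accuracy, true-positive rate and false-positive rate from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "ACC": (tp + tn) / total,  # share of all predictions that are correct
        "TPR": tp / (tp + fn),     # share of actual positives that are detected
        "FPR": fp / (fp + tn),     # share of actual negatives wrongly flagged
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
rates = classification_rates(tp=8, fp=1, tn=9, fn=2)
for name, value in rates.items():
    print(f"{name} = {value:.2%}")
# ACC = 85.00%
# TPR = 80.00%
# FPR = 10.00%
```

Because ACC pools positives and negatives, a data set with many negatives can show high ACC alongside a noticeably lower TPR, which is why the abstract reports the three values together.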
Affiliation(s)
- Yufei Wang
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Weiwei Wei
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Wen Du
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Jiaxiao Cai
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Yuxuan Liao
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Hongmei Lu
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Bo Kong
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Zhimin Zhang
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
18
Pividori M, Lu S, Li B, Su C, Johnson ME, Wei WQ, Feng Q, Namjou B, Kiryluk K, Kullo IJ, Luo Y, Sullivan BD, Voight BF, Skarke C, Ritchie MD, Grant SFA, Greene CS. Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms. Nat Commun 2023; 14:5562. [PMID: 37689782 PMCID: PMC10492839 DOI: 10.1038/s41467-023-41057-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/05/2021] [Accepted: 08/18/2023] [Indexed: 09/11/2023] Open
Abstract
Genes act in concert with each other in specific contexts to perform their functions. Determining how these genes influence complex traits requires a mechanistic understanding of expression regulation across different conditions. It has been shown that this insight is critical for developing new therapies. Transcriptome-wide association studies have helped uncover the role of individual genes in disease-relevant mechanisms. However, modern models of the architecture of complex traits predict that gene-gene interactions play a crucial role in disease origin and progression. Here we introduce PhenoPLIER, a computational approach that maps gene-trait associations and pharmacological perturbation data into a common latent representation for a joint analysis. This representation is based on modules of genes with similar expression patterns across the same conditions. We observe that diseases are significantly associated with gene modules expressed in relevant cell types, and our approach is accurate in predicting known drug-disease pairs and inferring mechanisms of action. Furthermore, using a CRISPR screen to analyze lipid regulation, we find that functionally important players lack associations but are prioritized in trait-associated modules by PhenoPLIER. By incorporating groups of co-expressed genes, PhenoPLIER can contextualize genetic associations and reveal potential targets missed by single-gene strategies.
Affiliation(s)
- Milton Pividori
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Sumei Lu
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Binglan Li
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Chun Su
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Matthew E Johnson
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Wei-Qi Wei
  - Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Qiping Feng
  - Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Bahram Namjou
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, 45229, USA
- Krzysztof Kiryluk
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, 10032, USA
- Yuan Luo
  - Northwestern University, Chicago, IL, 60611, USA
- Blair D Sullivan
  - Kahlert School of Computing, University of Utah, Salt Lake City, UT, 84112, USA
- Benjamin F Voight
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Carsten Skarke
  - Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Marylyn D Ritchie
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Struan F A Grant
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Division of Endocrinology and Diabetes, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Casey S Greene
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
  - Center for Health AI, University of Colorado School of Medicine, Aurora, CO, 80045, USA
19
Zucco AG, Bennedbæk M, Ekenberg C, Gabrielaite M, Leung P, Polizzotto MN, Kan V, Murray DD, Lundgren JD, MacPherson CR. Associations of functional human leucocyte antigen class I groups with HIV viral load in a heterogeneous cohort. AIDS 2023; 37:1643-1650. [PMID: 37534724 PMCID: PMC10399941 DOI: 10.1097/qad.0000000000003557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Revised: 03/13/2023] [Accepted: 03/21/2023] [Indexed: 04/07/2023]
Abstract
OBJECTIVE Human leucocyte antigen (HLA) class I alleles are the main host genetic factors involved in controlling HIV-1 viral load (VL). Nevertheless, HLA diversity has proven a significant challenge in association studies. We assessed how accounting for binding affinities of HLA class I alleles to HIV-1 peptides facilitates association testing of HLA with HIV-1 VL in a heterogeneous cohort. DESIGN Cohort from the Strategic Timing of AntiRetroviral Treatment (START) study. METHODS We imputed HLA class I alleles from host genetic data (2546 HIV+ participants) and sampled immunopeptidomes from 2079 host-paired viral genomes (targeted amplicon sequencing). We predicted HLA class I binding affinities to HIV-1 and unspecific peptides, grouping alleles into functional clusters through consensus clustering. These functional HLA class I clusters were used to test associations with HIV VL. RESULTS We identified four clades totaling 30 HLA alleles accounting for 11.4% variability in VL. We highlight HLA-B*57:01 and B*57:03 as functionally similar yet overrepresented in distinct ethnic groups, showing when combined a protective association with HIV+ VL (log, β -0.25; adj. P-value < 0.05). We further demonstrate only a slight power reduction when using unspecific immunopeptidomes, facilitating the use of the inferred functional HLA groups in other studies. CONCLUSION The outlined computational approach provides a robust and efficient way to incorporate HLA function and peptide diversity, aiding clinical association studies in heterogeneous cohorts. To facilitate access to the proposed methods and results, we provide an interactive application for exploring data.
Affiliation(s)
- Marc Bennedbæk
  - Virus Research and Development Laboratory, Virus and Microbiological Special Diagnostics, Statens Serum Institut
- Migle Gabrielaite
  - Center for Genomic Medicine, Copenhagen University Hospital, Copenhagen, Denmark
- Mark N. Polizzotto
  - Clinical Hub for Interventional Research, College of Health and Medicine, The Australian National University, Canberra, Australia
- Virginia Kan
  - George Washington University, Veterans Affairs Medical Center, Washington, DC, USA
20
Robinault L, Niazi IK, Kumari N, Amjad I, Menard V, Haavik H. Non-Specific Low Back Pain: An Inductive Exploratory Analysis through Factor Analysis and Deep Learning for Better Clustering. Brain Sci 2023; 13:946. [PMID: 37371424 DOI: 10.3390/brainsci13060946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 06/08/2023] [Accepted: 06/12/2023] [Indexed: 06/29/2023] Open
Abstract
Non-specific low back pain (NSLBP) is a significant and pervasive public health issue in contemporary society. Despite the widespread prevalence of NSLBP, our understanding of its underlying causes, as well as our capacity to provide effective treatments, remains limited due to the high diversity in the population that does not respond to generic treatments. Clustering the NSLBP population based on shared characteristics offers a potential solution for developing personalized interventions. However, the complexity of NSLBP and the reliance on subjective categorical data in previous attempts present challenges in achieving reliable and clinically meaningful clusters. This study aims to explore the influence and importance of objective, continuous variables related to NSLBP and how to use these variables effectively to facilitate the clustering of NSLBP patients into meaningful subgroups. Data were acquired from 46 subjects who performed six simple movement tasks (back extension, back flexion, lateral trunk flexion right, lateral trunk flexion left, trunk rotation right, and trunk rotation left) at two different speeds (maximum and preferred). High-density electromyography (HD EMG) data from the lower back region were acquired, jointly with motion capture data, using passive reflective markers on the subject's body and clusters of markers on the subject's spine. An exploratory analysis was conducted using a deep neural network and factor analysis. Based on selected variables, various models were trained to classify individuals as healthy or having NSLBP in order to assess the importance of different variables. The models were trained using different subsets of data, including all variables, only anthropometric data (e.g., age, BMI, height, weight, and sex), only biomechanical data (e.g., shoulder and lower back movement), only neuromuscular data (e.g., HD EMG activity), or only balance-related data. 
The models achieved high accuracy in categorizing individuals as healthy or having NSLBP (full model: 93.30%, anthropometric model: 94.40%, biomechanical model: 84.47%, neuromuscular model: 88.07%, and balance model: 74.73%). Factor analysis revealed that individuals with NSLBP exhibited different movement patterns to healthy individuals, characterized by slower and more rigid movements. Anthropometric variables (age, sex, and BMI) were significantly correlated with NSLBP components. In conclusion, different data types, such as body measurements, movement patterns, and neuromuscular activity, can provide valuable information for identifying individuals with NSLBP. To gain a comprehensive understanding of NSLBP, it is crucial to investigate the main domains influencing its prognosis as a cohesive unit rather than studying them in isolation. Simplifying the conditions for acquiring dynamic data is recommended to reduce data complexity, and using back flexion and trunk rotation as effective options should be further explored.
Affiliation(s)
- Lucien Robinault
  - Centre for Chiropractic Research, New Zealand College of Chiropractic, Auckland 1060, New Zealand
- Imran Khan Niazi
  - Centre for Chiropractic Research, New Zealand College of Chiropractic, Auckland 1060, New Zealand
  - Faculty of Health and Environmental Sciences, Health and Rehabilitation Research Institute, AUT University, Auckland 1010, New Zealand
  - Department of Health Science and Technology, Aalborg University, 9220 Aalborg, Denmark
- Nitika Kumari
  - Centre for Chiropractic Research, New Zealand College of Chiropractic, Auckland 1060, New Zealand
- Imran Amjad
  - Centre for Chiropractic Research, New Zealand College of Chiropractic, Auckland 1060, New Zealand
  - Faculty of Rehabilitation and Allied Health Sciences and Department of Biomedical Engineering, Riphah International University, Islamabad 46000, Pakistan
- Vincent Menard
  - M2S Laboratory, ENS Rennes, University of Rennes 2, 35065 Rennes, France
- Heidi Haavik
  - Centre for Chiropractic Research, New Zealand College of Chiropractic, Auckland 1060, New Zealand
21
Nie X, Qin D, Zhou X, Duo H, Hao Y, Li B, Liang G. Clustering ensemble in scRNA-seq data analysis: Methods, applications and challenges. Comput Biol Med 2023; 159:106939. [PMID: 37075602 DOI: 10.1016/j.compbiomed.2023.106939] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/20/2023] [Revised: 03/31/2023] [Accepted: 04/14/2023] [Indexed: 04/21/2023]
Abstract
With the rapid development of single-cell RNA-sequencing techniques, various computational methods and tools have been proposed to analyze these high-throughput data, accelerating the discovery of potential biological information. As one of the core steps of single-cell transcriptome data analysis, clustering plays a crucial role in identifying cell types and interpreting cellular heterogeneity. However, different clustering methods produce differing results, and such unstable partitions can affect the accuracy of the analysis to a certain extent. To overcome this challenge and obtain more accurate results, clustering ensembles are now frequently applied to cluster analysis of single-cell transcriptome datasets, and the results they generate are generally more reliable than those from most single clustering partitions. In this review, we summarize applications and challenges of the clustering ensemble method in single-cell transcriptome data analysis, and provide constructive thoughts and references for researchers in this field.
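One common way to combine several clusterings is a co-association (evidence accumulation) consensus. The sketch below assumes that approach for illustration and is not drawn from the review itself. The function name `consensus_clusters`, the `threshold` parameter, and the toy label vectors are hypothetical.

```python
from itertools import combinations


def consensus_clusters(partitions, threshold=0.5):
    """Merge items that share a cluster in more than `threshold` of the partitions."""
    n = len(partitions[0])
    parent = list(range(n))  # union-find forest over item indices

    def find(i):
        # Follow parent links to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        # Co-association: fraction of partitions placing items i and j together.
        agree = sum(p[i] == p[j] for p in partitions) / len(partitions)
        if agree > threshold:
            parent[find(i)] = find(j)

    # Relabel the connected components as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical clusterings of six cells. Label names differ between runs,
# and the second run places cell 2 in the other group.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
print(consensus_clusters(partitions))  # [0, 0, 0, 1, 1, 1]
```

The consensus depends only on whether two cells share a cluster, so it is unaffected by label permutations between runs, and the single disagreeing partition is outvoted.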
Affiliation(s)
- Xiner Nie
  - Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, China; College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Dan Qin
  - Department of Biology, College of Science, Northeastern University, Boston, MA, 02115, USA
- Xinyi Zhou
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Hongrui Duo
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Youjin Hao
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Guizhao Liang
  - Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, China
22
Babu M, Snyder M. Multi-Omics Profiling for Health. Mol Cell Proteomics 2023; 22:100561. [PMID: 37119971 PMCID: PMC10220275 DOI: 10.1016/j.mcpro.2023.100561] [Citation(s) in RCA: 107] [Impact Index Per Article: 53.5] [Received: 11/03/2022] [Revised: 04/20/2023] [Accepted: 04/23/2023] [Indexed: 05/01/2023] Open
Abstract
The world has witnessed a steady rise in both non-infectious and infectious chronic diseases, prompting a cross-disciplinary approach to understanding and treating disease. Current medical care focuses on treating people after they become patients rather than preventing illness, leading to high costs in treating chronic and late-stage diseases. Additionally, a "one-size-fits-all" approach to health care does not take into account individual differences in genetics, environment, or lifestyle factors, decreasing the number of people benefiting from interventions. Rapid advances in omics technologies and progress in computational capabilities have led to the development of multi-omics deep phenotyping, which profiles the interaction of multiple levels of biology over time and empowers precision health approaches. This review highlights current and emerging multi-omics modalities for precision health and discusses applications in the following areas: genetic variation, cardio-metabolic diseases, cancer, infectious diseases, organ transplantation, pregnancy, and longevity/aging. We will briefly discuss the potential of multi-omics approaches in disentangling host-microbe and host-environmental interactions. We will touch on emerging areas of electronic health record and clinical imaging integration with multi-omics for precision health. Finally, we will briefly discuss the challenges in the clinical implementation of multi-omics and its future prospects.
Affiliation(s)
- Mohan Babu
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| | - Michael Snyder
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA.
|
23
|
Nematimoez M, Breen A, Breen A. Spatio-temporal clustering of lumbar intervertebral flexion interactions in 127 asymptomatic individuals. J Biomech 2023; 154:111634. [PMID: 37209467 DOI: 10.1016/j.jbiomech.2023.111634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 04/18/2023] [Accepted: 05/09/2023] [Indexed: 05/22/2023]
Abstract
The purpose of this study was to categorize asymptomatic participants based on the clustering of spatial and temporal intervertebral kinematic variables during lumbar flexion. Lumbar segmental interactions (L2-S1) were evaluated in 127 asymptomatic participants during flexion using fluoroscopy. First, four variables were identified consisting of: 1. Range of motion (ROMC), 2. Peaking time of the first derivative for separate segmentation (PTFDs), 3. Peaking magnitude of the first derivative (PMFD), and 4. Peaking time of the first derivative for stepwise (grouped) segmentation (PTFDss). These variables were used to cluster and order the lumbar levels. The number of participants required to constitute a cluster was chosen as 7. Participants formed eight (ROMC), four (PTFDs), eight (PMFD), and four (PTFDss) clusters, which included 85%, 80%, 77%, and 60% of them, respectively, according to the above features. For all clustering variables, angle time series of some lumbar levels showed significant differences between clusters. However, in general, all clusters could be categorized based on the segmental mobility contexts into three main groups as incidental macro clusters: the upper (L2-L4 > L4-S1), middle (L2-L3 < L3-L5 > L5-S1) and lower (L2-L4 < L4-S1) domains. There are spatial and temporal segmental interactions and between-subject variability in asymptomatic participants. In addition, the differences in angle time series among the clusters have provided evidence of feedback control strategies, while the stepwise segmentation facilitates consideration of the lumbar spine as a system and provides supplementary information about segmental interactions. Clinically, these facts could be taken into account when considering any intervention, but especially fusion surgery.
Affiliation(s)
| | - Alexander Breen
- Faculty of Science and Technology, Bournemouth University, Poole BH12 5BB, UK
| | - Alan Breen
- Faculty of Science and Technology, Bournemouth University, Poole BH12 5BB, UK
|
24
|
Ildefonso GV, Oliver Metzig M, Hoffmann A, Harris LA, Lopez CF. A biochemical necroptosis model explains cell-type-specific responses to cell death cues. Biophys J 2023; 122:817-834. [PMID: 36710493 PMCID: PMC10027451 DOI: 10.1016/j.bpj.2023.01.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 12/31/2022] [Accepted: 01/24/2023] [Indexed: 01/30/2023] Open
Abstract
Necroptosis is a form of regulated cell death associated with degenerative disorders, autoimmune and inflammatory diseases, and cancer. To better understand the biochemical mechanisms regulating necroptosis, we constructed a detailed computational model of tumor necrosis factor-induced necroptosis based on known molecular interactions from the literature. Intracellular protein levels, used as model inputs, were quantified using label-free mass spectrometry, and the model was calibrated using Bayesian parameter inference to experimental protein time course data from a well-established necroptosis-executing cell line. The calibrated model reproduced the dynamics of phosphorylated mixed lineage kinase domain-like protein, an established necroptosis reporter. A subsequent dynamical systems analysis identified four distinct modes of necroptosis signal execution, distinguished by rate constant values and the roles of the RIP1 deubiquitinating enzymes A20 and CYLD. In one case, A20 and CYLD both contribute to RIP1 deubiquitination, in another RIP1 deubiquitination is driven exclusively by CYLD, and in two modes either A20 or CYLD acts as the driver with the other enzyme, counterintuitively, inhibiting necroptosis. We also performed sensitivity analyses of initial protein concentrations and rate constants to identify potential targets for modulating necroptosis sensitivity within each mode. We conclude by associating numerous contrasting and, in some cases, counterintuitive experimental results reported in the literature with one or more of the model-predicted modes of necroptosis execution. In all, we demonstrate that a consensus pathway model of tumor necrosis factor-induced necroptosis can provide insights into unresolved controversies regarding the molecular mechanisms driving necroptosis execution in numerous cell types under different experimental conditions.
Affiliation(s)
- Geena V Ildefonso
- Chemical and Physical Biology Program, Vanderbilt University School of Medicine, Nashville, Tennessee
| | - Marie Oliver Metzig
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, California; Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, California
| | - Alexander Hoffmann
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, California; Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, California
| | - Leonard A Harris
- Department of Biomedical Engineering, University of Arkansas, Fayetteville, Arkansas; Interdisciplinary Graduate Program in Cell and Molecular Biology, University of Arkansas, Fayetteville, Arkansas; Cancer Biology Program, Winthrop P. Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, Little Rock, Arkansas.
| | - Carlos F Lopez
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee; Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee.
|
25
|
Figgett WA, Hawson J, Lee G. Machine learning in EP research: New tools for old problems. J Cardiovasc Electrophysiol 2023; 34:1322-1323. [PMID: 36738150 DOI: 10.1111/jce.15851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 02/01/2023] [Indexed: 02/05/2023]
Affiliation(s)
- William A Figgett
- Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
| | - Joshua Hawson
- Department of Cardiology, Royal Melbourne Hospital, Victoria, Australia; Faculty of Medicine, Dentistry and Health Science, The University of Melbourne, Victoria, Australia
| | - Geoffrey Lee
- Department of Cardiology, Royal Melbourne Hospital, Victoria, Australia; Faculty of Medicine, Dentistry and Health Science, The University of Melbourne, Victoria, Australia
|
26
|
Muñoz-Baena L, Poon AFY. Clustering Highly Divergent Homologous Proteins: An Alignment-Free Method. Curr Protoc 2023; 3:e666. [PMID: 36809686 DOI: 10.1002/cpz1.666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
The comparative analysis of amino acid sequences is an important tool in molecular biology that often requires multiple sequence alignments. In comparisons between less closely related genomes, however, it becomes more difficult to accurately align protein-coding sequences, or even to identify homologous regions in different genomes. In this article, we describe an alignment-free method for the classification of homologous protein-coding regions from different genomes. This methodology was originally developed for comparing genomes within virus families, but may be adapted for other organisms. We quantify sequence homology from the overlap (intersection distance) of the k-mer (word) frequency distributions for different protein sequences. Next, we extract groups of homologous sequences from the resulting distance matrix using a combination of dimensionality reduction and hierarchical clustering methods. Finally, we demonstrate how to generate visualizations of the composition of clusters with respect to protein annotations, and by coloring protein-coding regions of genomes by cluster assignments. These provide a useful means to quickly assess the reliability of the clustering results based on the distribution of homologous genes among genomes. © 2023 Wiley Periodicals LLC. Basic Protocol 1: Data collection and processing. Basic Protocol 2: Calculating k-mer distances. Basic Protocol 3: Extracting clusters of homology. Support Protocol: Genome plot based on clustering results.
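The k-mer overlap idea in the abstract above can be illustrated with a minimal sketch. It assumes the intersection distance is one minus the summed element-wise minimum of two normalized k-mer frequency distributions; the protocol's exact definition may differ. The function names and the default k = 3 are hypothetical and are not taken from the article.

```python
from collections import Counter


def kmer_frequencies(sequence: str, k: int = 3) -> dict:
    """Return relative frequencies of the overlapping k-mers in a sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    return {kmer: n / total for kmer, n in counts.items()}


def intersection_distance(seq_a: str, seq_b: str, k: int = 3) -> float:
    """Assumed definition: 1 - sum over k-mers of min(freq_a, freq_b)."""
    freq_a = kmer_frequencies(seq_a, k)
    freq_b = kmer_frequencies(seq_b, k)
    # Only k-mers present in both sequences contribute to the overlap.
    shared = sum(min(p, freq_b[kmer]) for kmer, p in freq_a.items() if kmer in freq_b)
    return 1.0 - shared
```

Under this assumed definition, identical sequences have distance 0 and sequences sharing no k-mers have distance 1. Computing the distance for every pair of sequences would give the kind of distance matrix that the abstract passes on to dimensionality reduction and hierarchical clustering.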
Affiliation(s)
- Laura Muñoz-Baena
- Department of Microbiology and Immunology, Western University, London, Ontario, Canada
| | - Art F Y Poon
- Department of Microbiology and Immunology, Western University, London, Ontario, Canada; Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
|
27
|
Thompson MA, Martin SA, Hislop BD, Younkin R, Andrews TM, Miller K, June RK, Adams ES. Sex-specific effects of calving season on joint health and biomarkers in Montana ranchers. BMC Musculoskelet Disord 2023; 24:80. [PMID: 36717802 PMCID: PMC9887842 DOI: 10.1186/s12891-022-05979-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 11/11/2022] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Agricultural workers have a higher incidence of osteoarthritis (OA), but the etiology behind this phenomenon is unclear. Calving season, which occurs in mid- to late-winter for ranchers, includes physical conditions that may elevate OA risk. Our primary aim was to determine whether OA biomarkers are elevated at the peak of calving season compared to pre-season, and to compare these data with joint health survey information from the subjects. Our secondary aim was to detect biomarker differences between male and female ranchers. METHODS During collection periods before and during calving season, male (n = 28) and female (n = 10) ranchers completed joint health surveys and provided samples of blood, urine, and saliva for biomarker analysis. Statistical analyses examined associations between mean biomarker levels and survey predictors. Ensemble cluster analysis identified groups having unique biomarker profiles. RESULTS The number of calvings performed by each rancher positively correlated with plasma IL-6, serum hyaluronic acid (HA) and urinary CTX-I. Thiobarbituric acid reactive substances (TBARS), a marker of oxidative stress, was significantly higher during calving season than pre-season and was also correlated with ranchers having more months per year of joint pain. We found evidence of sexual dimorphism in the biomarkers among the ranchers, with leptin being elevated and matrix metalloproteinase-3 diminished in female ranchers. The opposite was detected in males. WOMAC score was positively associated with multiple biomarkers: IL-6, IL-2, HA, leptin, C2C, asymmetric dimethylarginine, and CTX-I. These biomarkers represent enzymatic degradation, inflammation, products of joint destruction, and OA severity. CONCLUSIONS The positive association between number of calvings performed by each rancher (workload) and both inflammatory and joint tissue catabolism biomarkers establishes that calving season is a risk factor for OA in Montana ranchers. Consistent with the literature, we found important sex differences in OA biomarkers, with female ranchers showing elevated leptin, whereas males showed elevated MMP-3.
Affiliation(s)
- Matthew A. Thompson
- Department of Chemical & Biological Engineering, Montana State University, Bozeman, MT USA
| | - Stephen A. Martin
- Center for American Indian and Rural Health Equity, Translational Biomarkers Core Laboratory, Montana State University, Bozeman, MT USA
| | - Brady D. Hislop
- Department of Mechanical & Industrial Engineering, Montana State University, PO Box 173800, Bozeman, MT 59717-3800 USA
| | - Roubie Younkin
- MSU Extension Office, Montana State University, Bozeman, MT USA
| | - Tara M. Andrews
- MSU Extension Office, Montana State University, Bozeman, MT USA
| | - Kaleena Miller
- MSU Extension Office, Montana State University, Bozeman, MT USA
| | - Ronald K. June
- Department of Mechanical & Industrial Engineering, Montana State University, PO Box 173800, Bozeman, MT 59717-3800 USA
| | - Erik S. Adams
- Department of Mechanical & Industrial Engineering, Montana State University, PO Box 173800, Bozeman, MT 59717-3800 USA; School of Medicine, Montana WWAMI, University of Washington, Seattle, WA USA
|
28
|
Sathyanarayanan A, Mueller TT, Ali Moni M, Schueler K, Baune BT, Lio P, Mehta D, Baune BT, Dierssen M, Ebert B, Fabbri C, Fusar-Poli P, Gennarelli M, Harmer C, Howes OD, Janzing JGE, Lio P, Maron E, Mehta D, Minelli A, Nonell L, Pisanu C, Potier MC, Rybakowski F, Serretti A, Squassina A, Stacey D, van Westrhenen R, Xicota L. Multi-omics data integration methods and their applications in psychiatric disorders. Eur Neuropsychopharmacol 2023; 69:26-46. [PMID: 36706689 DOI: 10.1016/j.euroneuro.2023.01.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/03/2022] [Revised: 11/22/2022] [Accepted: 01/02/2023] [Indexed: 01/27/2023]
Abstract
To study mental illness and health, in the past researchers have often broken down their complexity into individual subsystems (e.g., genomics, transcriptomics, proteomics, clinical data) and explored the components independently. Technological advancements and decreasing costs of high-throughput sequencing have led to an unprecedented increase in data generation. Furthermore, over the years it has become increasingly clear that these subsystems do not act in isolation but instead interact with each other to drive mental illness and health. Consequently, individual subsystems are now analysed jointly to promote a holistic understanding of the underlying biological complexity of health and disease. Complementing the increasing data availability, current research is geared towards developing novel methods that can efficiently combine the information-rich multi-omics data to discover biologically meaningful biomarkers for diagnosis, treatment, and prognosis. However, clinical translation of the research is still challenging. In this review, we summarise conventional and state-of-the-art statistical and machine learning approaches for biomarker discovery, diagnosis, and outcome and treatment response prediction through integrating multi-omics and clinical data. In addition, we describe the role of biological model systems and in silico multi-omics model designs in clinical translation of psychiatric research from bench to bedside. Finally, we discuss the current challenges and explore the application of multi-omics integration in future psychiatric research. The review provides a structured overview and latest updates in the field of multi-omics in psychiatry.
Affiliation(s)
- Anita Sathyanarayanan
- Queensland University of Technology, Centre for Genomics and Personalised Health, School of Biomedical Sciences, Faculty of Health, Kelvin Grove, Queensland 4059, Australia
| | - Tamara T Mueller
- Institute for Artificial Intelligence and Informatics in Medicine, TU Munich, 80333 Munich, Germany
| | - Mohammad Ali Moni
- Artificial Intelligence and Digital Health Data Science, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD, 4072, Australia
| | - Katja Schueler
- Clinic for Psychosomatics, Hospital zum Heiligen Geist, Frankfurt am Main, Germany; Frankfurt Psychoanalytic Institute, Frankfurt am Main, Germany
| | - Bernhard T Baune
- Department of Psychiatry and Psychotherapy, University of Münster, Germany; Department of Psychiatry, Melbourne Medical School, University of Melbourne, Australia; The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Australia
| | - Pietro Lio
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
| | - Divya Mehta
- Queensland University of Technology, Centre for Genomics and Personalised Health, School of Biomedical Sciences, Faculty of Health, Kelvin Grove, Queensland 4059, Australia.
| | | | - Bernhard T Baune
- Department of Psychiatry and Psychotherapy, University of Münster, Germany; Department of Psychiatry, Melbourne Medical School, University of Melbourne, Australia; The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Australia
| | - Mara Dierssen
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Bjarke Ebert
- Medical Strategy & Communication, H. Lundbeck A/S, Valby, Denmark
| | - Chiara Fabbri
- Department of Biomedical and NeuroMotor Sciences, University of Bologna, Bologna, Italy; Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | - Paolo Fusar-Poli
- Early Psychosis: Intervention and Clinical-detection (EPIC) Lab, Department of Psychosis Studies, King's College London, United Kingdom; Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
| | - Massimo Gennarelli
- Department of Molecular and Translational Medicine, University of Brescia; Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
| | | | - Oliver D Howes
- Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom; Psychiatric Imaging, Medical Research Council Clinical Sciences Centre, Imperial College London, Hammersmith Hospital Campus, London, United Kingdom
| | | | - Pietro Lio
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
| | - Eduard Maron
- Department of Psychiatry, University of Tartu, Tartu, Estonia; Centre for Neuropsychopharmacology, Division of Brain Sciences, Imperial College London, London, United Kingdom; Documental Ltd, Tallin, Estonia; West Tallinn Central Hospital, Tallinn, Estonia
| | - Divya Mehta
- Queensland University of Technology, Centre for Genomics and Personalised Health, School of Biomedical Sciences, Faculty of Health, Kelvin Grove, Queensland 4059, Australia
| | - Alessandra Minelli
- Department of Molecular and Translational Medicine, University of Brescia; Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
| | - Lara Nonell
- MARGenomics, IMIM (Hospital del Mar Research Institute), Barcelona, Spain
| | - Claudia Pisanu
- Department of Biomedical Sciences, Section of Neuroscience and Clinical Pharmacology, University of Cagliari, Cagliari, Italy
| | | | - Filip Rybakowski
- Department of Psychiatry, Poznan University of Medical Sciences, Poznan, Poland
| | - Alessandro Serretti
- Department of Biomedical and NeuroMotor Sciences, University of Bologna, Bologna, Italy
| | - Alessio Squassina
- Department of Biomedical Sciences, Section of Neuroscience and Clinical Pharmacology, University of Cagliari, Cagliari, Italy
| | - David Stacey
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
| | - Roos van Westrhenen
- Parnassia Psychiatric Institute, Amsterdam, the Netherlands; Department of Psychiatry and Neuropsychology, Faculty of Health and Sciences, Maastricht University, Maastricht, the Netherlands; Institute of Psychiatry, Psychology & Neuroscience (IoPPN) King's College London, United Kingdom
| | - Laura Xicota
- Paris Brain Institute ICM, Salpetriere Hospital, Paris, France
|
29
|
Burton RJ, Cuff SM, Morgan MP, Artemiou A, Eberl M. GeoWaVe: geometric median clustering with weighted voting for ensemble clustering of cytometry data. Bioinformatics 2023; 39:6839973. [PMID: 36413065 PMCID: PMC9805571 DOI: 10.1093/bioinformatics/btac751] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/01/2022] [Revised: 11/08/2022] [Accepted: 11/21/2022] [Indexed: 11/23/2022] Open
Abstract
MOTIVATION Clustering is an unsupervised method for identifying structure in unlabelled data. In the context of cytometry, it is typically used to categorize cells into subpopulations of similar phenotypes. However, clustering is greatly dependent on hyperparameters and the data to which it is applied as each algorithm makes different assumptions and generates a different 'view' of the dataset. As such, the choice of clustering algorithm can significantly influence results, and there is often not one preferred method but different insights to be obtained from different methods. To overcome these limitations, consensus approaches are needed that directly address the effect of competing algorithms. To the best of our knowledge, consensus clustering algorithms designed specifically for the analysis of cytometry data are lacking. RESULTS We present a novel ensemble clustering methodology based on geometric median clustering with weighted voting (GeoWaVe). Compared to graph ensemble clustering methods that have gained popularity in single-cell RNA sequencing analysis, GeoWaVe performed favourably on different sets of high-dimensional mass and flow cytometry data. Our findings provide proof of concept for the power of consensus methods to make the analysis, visualization and interpretation of cytometry data more robust and reproducible. The wide availability of ensemble clustering methods is likely to have a profound impact on our understanding of cellular responses, clinical conditions and therapeutic and diagnostic options. AVAILABILITY AND IMPLEMENTATION GeoWaVe is available as part of the CytoCluster package https://github.com/burtonrj/CytoCluster and published on the Python Package Index https://pypi.org/project/cytocluster. Benchmarking data described are available from https://doi.org/10.5281/zenodo.7134723. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Simone M Cuff
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff CF14 4XN, UK
| | - Matt P Morgan
- Adult Critical Care, University Hospital of Wales, Cardiff and Vale University Health Board, Cardiff CF14 4XW, UK
|
30
|
Hislop BD, Devine C, June RK, Heveran CM. Subchondral bone structure and synovial fluid metabolism are altered in injured and contralateral limbs 7 days after non-invasive joint injury in skeletally-mature C57BL/6 mice. Osteoarthritis Cartilage 2022; 30:1593-1605. [PMID: 36184957 PMCID: PMC9671828 DOI: 10.1016/j.joca.2022.09.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/19/2022] [Revised: 08/16/2022] [Accepted: 09/04/2022] [Indexed: 02/02/2023]
Abstract
OBJECTIVE Post-traumatic osteoarthritis (PTOA) commonly develops after ACL injury, but early changes to the joint soon after injury are insufficiently understood. The objectives of this study were (1) to evaluate the response of subchondral bone (SCB) tissue modulus to joint injury and (2) to identify which bone structural, material, and metabolic outcomes are local (i.e., injured joint only) or systemic (i.e., injured and contralateral-to-injured). DESIGN Female C57BL/6N mice (19 weeks at injury) underwent tibial compression overload to simulate ACL injury (n = 8) or a small pre-load (n = 8). Synovial fluid was harvested at euthanasia 7 days later for metabolomic profiling. Bone outcomes included epiphyseal and SCB microarchitecture, SCB nanoindentation modulus, SCB formation rate, and osteoclast number density. RESULTS Injury decreased epiphyseal bone volume fraction ([-5.29, -1.38%], P = 0.0016) and decreased SCB thickness for injured vs sham-injured limbs ([2.2, 31.4 μm], P = 0.017). Epiphyseal bone loss commonly occurred for contralateral-to-injured limbs. There was not sufficient evidence to conclude that SCB modulus changes with injury. Metabolomic analyses revealed dysregulated synovial fluid metabolism with joint injury, but also that many metabolic pathways are shared between injured and contralateral-to-injured limbs. CONCLUSION This study demonstrates rapid changes to bone structure and synovial fluid metabolism after injury with the potential for influencing the progression to PTOA. These changes are often evidenced in the contralateral-to-injured limb, indicating that systemic musculoskeletal responses to joint injury should not be overlooked.
Affiliation(s)
- B D Hislop
- Department of Mechanical & Industrial Engineering, Montana State University, USA
| | - C Devine
- Department of Chemical & Biological Engineering, Montana State University, USA
| | - R K June
- Department of Mechanical & Industrial Engineering, Montana State University, USA; Department of Microbiology & Cell Biology, Montana State University, USA
| | - C M Heveran
- Department of Mechanical & Industrial Engineering, Montana State University, USA.
|
31
|
Hu H, Laskin J. Emerging Computational Methods in Mass Spectrometry Imaging. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2022; 9:e2203339. [PMID: 36253139 PMCID: PMC9731724 DOI: 10.1002/advs.202203339] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 06/07/2022] [Revised: 09/17/2022] [Indexed: 05/10/2023]
Abstract
Mass spectrometry imaging (MSI) is a powerful analytical technique that generates maps of hundreds of molecules in biological samples with high sensitivity and molecular specificity. Advanced MSI platforms capable of high-spatial-resolution and high-throughput acquisition generate vast amounts of data, which necessitates the development of computational tools for MSI data analysis. In addition, computation-driven MSI experiments have recently emerged as enabling technologies for further improving MSI capabilities with little or no hardware modification. This review provides a critical summary of computational methods and resources developed for MSI data analysis and interpretation, along with computational approaches for improving throughput and molecular coverage in MSI experiments. This review focuses on recently developed artificial intelligence methods and provides an outlook for a future paradigm shift in MSI with transformative computational methods.
Affiliation(s)
- Hang Hu
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, IN 47907, USA
| | - Julia Laskin
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, IN 47907, USA
|
32
|
Samson C, Achim AM, Sicard V, Gilker A, Francoeur A, Franck N, Cloutier B, Giguère CE, Jean-Baptiste F, Lecomte T. Further validation of the Cognitive Biases Questionnaire for psychosis. BMC Psychiatry 2022; 22:560. [PMID: 35986316 PMCID: PMC9392283 DOI: 10.1186/s12888-022-04203-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2021] [Accepted: 08/08/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Cognitive biases are recognized as important treatment targets for reducing symptoms associated with severe mental disorders. Although cognitive biases have been linked to symptoms in most studies, few studies have looked at such biases transdiagnostically. The Cognitive Bias Questionnaire for psychosis (CBQp) is a self-reported questionnaire that assesses cognitive biases amongst individuals with a psychotic disorder, as well as individuals with other severe mental disorders. The current study aims to validate a French version of the CBQp and to explore transdiagnostic cognitive biases in individuals with psychotic disorders, individuals with depression, and in healthy controls. METHODS The CBQp was translated into French following a protocol based on international standards. Discriminant validity and internal consistency were determined for total score and each subscale score. Confirmatory factor analyses were performed to test construct validity. Finally, cluster analyses were conducted to investigate cognitive biases across diagnostic groups. RESULTS Our results were similar to those of the original authors, with the one-factor solution (assessment of a general thinking bias) being the strongest, but the two-factor solution (assessing biases within two themes relating to psychosis) and the five-factor solution (assessment of multiple distinct biases) being clinically more interesting. A six-cluster solution emerged, suggesting that individuals with similar diagnoses score differently on all cognitive biases, and that individuals with different diagnoses might have similar cognitive biases. CONCLUSIONS The current findings support the validity of the French translation of the CBQp. Our cluster analyses overall support the transdiagnostic presence of cognitive biases.
Affiliation(s)
- Crystal Samson
- Département de Psychologie, Laboratoire d'étude sur la schizophrénie et les psychoses orienté vers l'intervention et le rétablissement Pavillon Marie-Victorin, Université de Montréal, 90 Vincent D'Indy Ave, Outremont, QC, H2V 2S9, Canada
- Centre de recherche de l'Institut Universitaire en Santé Mentale de Montréal (CR-IUSMM), Québec, Canada
| | - Amélie M Achim
- Université Laval, Québec, Canada
- Centre de recherche CERVO, Québec, Canada
- Centre de recherche en santé durable VITAM, Québec, Canada
| | - Veronik Sicard
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada
| | - Andy Gilker
- Département de Génie biotechnologique, Université de Sherbrooke, Québec, Canada
| | - Audrey Francoeur
- Département de Psychologie, Laboratoire d'étude sur la schizophrénie et les psychoses orienté vers l'intervention et le rétablissement Pavillon Marie-Victorin, Université de Montréal, 90 Vincent D'Indy Ave, Outremont, QC, H2V 2S9, Canada
- Centre de recherche de l'Institut Universitaire en Santé Mentale de Montréal (CR-IUSMM), Québec, Canada
| | - Nicolas Franck
- Faculté de Médecine Lyon-Sud Charles Mérieux, Université Claude Bernard Lyon 1, Lyon, France
- Pôle Centre rive gauche & Centre ressource de réhabilitation psychosociale, Centre hospitalier Le Vinatier, Lyon, France
- Centre National de la Recherche Scientifique (CNRS), Bron, France
| | - Briana Cloutier
- Département de Psychologie, Laboratoire d'étude sur la schizophrénie et les psychoses orienté vers l'intervention et le rétablissement Pavillon Marie-Victorin, Université de Montréal, 90 Vincent D'Indy Ave, Outremont, QC, H2V 2S9, Canada
- Centre de recherche de l'Institut Universitaire en Santé Mentale de Montréal (CR-IUSMM), Québec, Canada
| | - Charles-Edouard Giguère
- Centre de recherche de l'Institut Universitaire en Santé Mentale de Montréal (CR-IUSMM), Québec, Canada
| | - Francelyne Jean-Baptiste
- Centre de recherche de l'Institut Universitaire en Santé Mentale de Montréal (CR-IUSMM), Québec, Canada
| | - Tania Lecomte
- Département de Psychologie, Laboratoire d'étude sur la schizophrénie et les psychoses orienté vers l'intervention et le rétablissement Pavillon Marie-Victorin, Université de Montréal, 90 Vincent D'Indy Ave, Outremont, QC, H2V 2S9, Canada.
- Centre de recherche de l'Institut Universitaire en Santé Mentale de Montréal (CR-IUSMM), Québec, Canada.
|
33
|
Tappu R, Haas J, Lehmann DH, Sedaghat-Hamedani F, Kayvanpour E, Keller A, Katus HA, Frey N, Meder B. Multi-omics assessment of dilated cardiomyopathy using non-negative matrix factorization. PLoS One 2022; 17:e0272093. [PMID: 35980883 PMCID: PMC9387871 DOI: 10.1371/journal.pone.0272093] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/14/2021] [Accepted: 07/11/2022] [Indexed: 11/19/2022] Open
Abstract
Dilated cardiomyopathy (DCM), a myocardial disease, is heterogeneous and often results in heart failure and sudden cardiac death. Unavailability of cardiac tissue has hindered the comprehensive exploration of gene regulatory networks and nodal players in DCM. In this study, we carried out integrated analysis of transcriptome and methylome data using non-negative matrix factorization from a cohort of DCM patients to uncover underlying latent factors and covarying features between whole-transcriptome and epigenome omics datasets from tissue biopsies of living patients. DNA methylation data from Infinium HM450 and mRNA Illumina sequencing of n = 33 DCM and n = 24 control probands were filtered, analyzed and used as input for matrix factorization using R NMF package. Mann-Whitney U test showed 4 out of 5 latent factors are significantly different between DCM and control probands (P<0.05). Characterization of top 10% features driving each latent factor showed a significant enrichment of biological processes known to be involved in DCM pathogenesis, including immune response (P = 3.97E-21), nucleic acid binding (P = 1.42E-18), extracellular matrix (P = 9.23E-14) and myofibrillar structure (P = 8.46E-12). Correlation network analysis revealed interaction of important sarcomeric genes like Nebulin, Tropomyosin alpha-3 and ERC-protein 2 with CpG methylation of ATPase Phospholipid Transporting 11A0, Solute Carrier Family 12 Member 7 and Leucine Rich Repeat Containing 14B, all with significant P values associated with correlation coefficients >0.7. Using matrix factorization, multi-omics data derived from human tissue samples can be integrated and novel interactions can be identified. Hypothesis generating nature of such analysis could help to better understand the pathophysiology of complex traits such as DCM.
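As a rough illustration of the factorization step described in this abstract, the sketch below factorizes a non-negative samples-by-features matrix with multiplicative updates. It is not the authors' pipeline: the study used the R NMF package, whereas this is a small NumPy stand-in, and the toy matrices `expr` and `meth`, their side-by-side concatenation, and the function names `nmf` and `relative_error` are all assumptions made for the example.

```python
import numpy as np

def nmf(X, rank, n_iter=300, seed=0, eps=1e-9):
    """Factorize non-negative X (samples x features) as W @ H.

    Uses multiplicative updates for the Frobenius-norm objective.
    W holds per-sample latent factor scores, H holds per-feature loadings.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.random((n_samples, rank)) + eps
    H = rng.random((rank, n_features)) + eps
    for _ in range(n_iter):
        # Update loadings, then scores; eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(X, W, H):
    """Relative Frobenius reconstruction error of the factorization."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Hypothetical toy data: 12 samples, 30 expression and 20 methylation features.
rng = np.random.default_rng(1)
expr = rng.random((12, 30))   # stand-in for scaled expression values
meth = rng.random((12, 20))   # stand-in for methylation beta values in [0, 1)
X = np.hstack([expr, meth])   # assumed way of combining the two omics layers

W, H = nmf(X, rank=5)
print("relative reconstruction error:", relative_error(X, W, H))
```

In a setting like the one in the abstract, the columns of `W` would be the latent factors compared between patient groups, and the largest entries in each row of `H` would point to the features driving that factor.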
Affiliation(s)
- Rewati Tappu
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Mannheim, Germany
| | - Jan Haas
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Mannheim, Germany
| | - David H. Lehmann
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
| | - Farbod Sedaghat-Hamedani
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Mannheim, Germany
| | - Elham Kayvanpour
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Mannheim, Germany
| | - Andreas Keller
- Department of Clinical Bioinformatics, Medical Faculty, Saarland University, Saarbrücken, Germany
| | - Hugo A. Katus
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Mannheim, Germany
| | - Norbert Frey
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Mannheim, Germany
| | - Benjamin Meder
- Institute for Cardiomyopathies Heidelberg (ICH), Heart Center Heidelberg, University of Heidelberg, Heidelberg, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Mannheim, Germany
- Department of Genetics, Stanford University School of Medicine, Palo Alto, California, United States of America
|
34
|
Bro-Jørgensen W, Hamill JM, Bro R, Solomon GC. Trusting our machines: validating machine learning models for single-molecule transport experiments. Chem Soc Rev 2022; 51:6875-6892. [PMID: 35686581 PMCID: PMC9377421 DOI: 10.1039/d1cs00884f] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/29/2022] [Indexed: 11/21/2022]
Abstract
In this tutorial review, we will describe crucial aspects related to the application of machine learning to help users avoid the most common pitfalls. The examples we present will be based on data from the field of molecular electronics, specifically single-molecule electron transport experiments, but the concepts and problems we explore will be sufficiently general for application in other fields with similar data. In the first part of the tutorial review, we will introduce the field of single-molecule transport, and provide an overview of the most common machine learning algorithms employed. In the second part of the tutorial review, we will show, through examples grounded in single-molecule transport, that the promises of machine learning can only be fulfilled by careful application. We will end the tutorial review with a discussion of where we, as a field, could go from here.
Affiliation(s)
- William Bro-Jørgensen
- Department of Chemistry and Nano-Science Center, University of Copenhagen, Universitetsparken 5, DK-2100, Copenhagen Ø, Denmark.
| | - Joseph M Hamill
- Department of Chemistry and Nano-Science Center, University of Copenhagen, Universitetsparken 5, DK-2100, Copenhagen Ø, Denmark.
| | - Rasmus Bro
- Department of Food Science, University of Copenhagen, Rolighedsvej 26, 1958 Frederiksberg, Denmark.
| | - Gemma C Solomon
- Department of Chemistry and Nano-Science Center, University of Copenhagen, Universitetsparken 5, DK-2100, Copenhagen Ø, Denmark.
|
35
|
Leppi JC, Rinella DJ, Wipfli MS, Whitman MS. Broad Whitefish (Coregonus nasus) isotopic niches: Stable isotopes reveal diverse foraging strategies and habitat use in Arctic Alaska. PLoS One 2022; 17:e0270474. [PMID: 35881611 PMCID: PMC9321764 DOI: 10.1371/journal.pone.0270474] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/21/2021] [Accepted: 06/10/2022] [Indexed: 11/28/2022] Open
Abstract
Understanding the ecological niche of some fishes is complicated by their frequent use of a broad range of food resources and habitats across space and time. Little is known about Broad Whitefish (Coregonus nasus) ecological niches in Arctic landscapes even though they are an important subsistence species for Alaska’s Indigenous communities. We investigated the foraging ecology and habitat use of Broad Whitefish via stable isotope analyses of muscle and liver tissue and otoliths from mature fish migrating in the Colville River within Arctic Alaska. The range of δ13C (-31.8– -21.9‰) and δ15N (6.6–13.1‰) across tissue types and among individuals overlapped with isotope values previously observed in Arctic lakes and rivers, estuaries, and nearshore marine habitat. The large range of δ18O (4.5–10.9‰) and δD (-237.6– -158.9‰) suggests fish utilized a broad spectrum of habitats across elevational and latitudinal gradients. Cluster analysis of muscle δ13Cˈ, δ15N, δ18O, and δD indicated that Broad Whitefish occupied four different foraging niches that relied on marine and land-based (i.e., freshwater and terrestrial) food sources to varying degrees. Most individuals had isotopic signatures representative of coastal freshwater habitat (Group 3; 25%) or coastal lagoon and delta habitat (Group 1; 57%), while individuals that mainly utilized inland freshwater (Group 4; 4%) and nearshore marine habitats (Group 2; 14%) represented smaller proportions. Otolith microchemistry confirmed that individuals with more enriched muscle tissue δ13Cˈ, δD, and δ18O tended to use marine habitats, while individuals that mainly used freshwater habitats had values that were less enriched. The isotopic niches identified here represent important foraging habitats utilized by Broad Whitefish. To preserve access to these diverse habitats it will be important to limit barriers along nearshore areas and reduce impacts like roads and climate change on natural flow regimes. 
Maintaining these diverse connected habitats will facilitate long-term population stability, buffering populations from future environmental and anthropogenic perturbations.
Affiliation(s)
- Jason C. Leppi
- Alaska Cooperative Fish and Wildlife Research Unit, College of Fisheries and Ocean Sciences, University of Alaska Fairbanks, Fairbanks, Alaska, United States of America
- Research Department, The Wilderness Society, Anchorage, Alaska, United States of America
| | - Daniel J. Rinella
- Anchorage Fish and Wildlife Conservation Office, U.S. Fish and Wildlife Service, Anchorage, Alaska, United States of America
| | - Mark S. Wipfli
- U.S. Geological Survey, Alaska Cooperative Fish and Wildlife Research Unit, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, Alaska, United States of America
| | - Matthew S. Whitman
- Arctic District Office, Bureau of Land Management, Fairbanks, Alaska, United States of America
|
36
|
Capouskova K, Kringelbach ML, Deco G. Modes of cognition: Evidence from metastable brain dynamics. Neuroimage 2022; 260:119489. [PMID: 35882268 DOI: 10.1016/j.neuroimage.2022.119489] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/20/2021] [Revised: 07/12/2022] [Accepted: 07/15/2022] [Indexed: 01/31/2023] Open
Abstract
Managing cognitive load depends on adequate resource allocation by the human brain through the engagement of metastable substates, which are large-scale functional networks that change over time. We employed a novel analysis method, deep autoencoder dynamical analysis (DADA), with 100 healthy adults selected from the Human Connectome Project (HCP) data set in rest and six cognitive tasks. The deep autoencoder of DADA described seven recurrent stochastic metastable substates from the functional connectome of BOLD phase coherence matrices. These substates were significantly differentiated in terms of their probability of appearance, time duration, and spatial attributes. We found that during different cognitive tasks, there was a higher probability of having more connected substates dominated by a high degree of connectivity in the thalamus. In addition, compared with those during tasks, resting brain dynamics have a lower level of predictability, indicating a more uniform distribution of metastability between substates, quantified by higher entropy. These novel findings provide empirical evidence for the philosophically motivated cognitive theory, suggesting on-line and off-line as two fundamentally distinct modes of cognition. On-line cognition refers to task-dependent engagement with the sensory input, while off-line cognition is a slower, environmentally detached mode engaged with decision and planning. Overall, the DADA framework provides a bridge between neuroscience and cognitive theory that can be further explored in the future.
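The link between a more uniform distribution over substates and higher entropy can be made concrete with a small calculation. The sketch below computes the Shannon entropy of substate occupancy from a sequence of state labels. The function name `occupancy_entropy` and the toy sequences are assumptions for illustration, and the study's own entropy measure may be defined differently (for example, on transition probabilities).

```python
import math
from collections import Counter

def occupancy_entropy(state_sequence):
    """Shannon entropy (in bits) of how often each substate occurs."""
    counts = Counter(state_sequence)
    total = len(state_sequence)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical label sequences over seven substates (0..6).
uniform_like = list(range(7)) * 10          # every substate equally frequent
dominated = [0] * 64 + [1, 2, 3, 4, 5, 6]   # one substate dominates

print(occupancy_entropy(uniform_like))  # close to log2(7), the maximum for 7 states
print(occupancy_entropy(dominated))     # much lower, i.e. more predictable
```

A sequence that visits all substates equally often reaches the maximum entropy for that number of states, while a sequence dominated by one substate is more predictable and scores lower.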
Affiliation(s)
- Katerina Capouskova
- Center for Brain and Cognition, Computational Neuroscience Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Ramon Trias Fargas 25-27, Barcelona 08005, Spain.
| | - Morten L Kringelbach
- Department of Psychiatry, University of Oxford, Oxford, United Kingdom; Center for Music in the Brain, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
| | - Gustavo Deco
- Center for Brain and Cognition, Computational Neuroscience Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Ramon Trias Fargas 25-27, Barcelona 08005, Spain; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain; Turner Institute for Brain and Mental Health, Monash University, Melbourne, VIC, Australia
|
37
|
Dalmaijer ES, Nord CL, Astle DE. Statistical power for cluster analysis. BMC Bioinformatics 2022; 23:205. [PMID: 35641905 PMCID: PMC9158113 DOI: 10.1186/s12859-022-04675-1] [Citation(s) in RCA: 144] [Impact Index Per Article: 48.0] [Received: 04/21/2021] [Accepted: 03/02/2022] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Cluster algorithms are gaining in popularity in biomedical research due to their compelling ability to identify discrete subgroups in data, and their increasing accessibility in mainstream software. While guidelines exist for algorithm selection and outcome evaluation, there are no firmly established ways of computing a priori statistical power for cluster analysis. Here, we estimated power and classification accuracy for common analysis pipelines through simulation. We systematically varied subgroup size, number, separation (effect size), and covariance structure. We then subjected generated datasets to dimensionality reduction approaches (none, multi-dimensional scaling, or uniform manifold approximation and projection) and cluster algorithms (k-means, agglomerative hierarchical clustering with Ward or average linkage and Euclidean or cosine distance, HDBSCAN). Finally, we directly compared the statistical power of discrete (k-means), "fuzzy" (c-means), and finite mixture modelling approaches (which include latent class analysis and latent profile analysis). RESULTS We found that clustering outcomes were driven by large effect sizes or the accumulation of many smaller effects across features, and were mostly unaffected by differences in covariance structure. Sufficient statistical power was achieved with relatively small samples (N = 20 per subgroup), provided cluster separation is large (Δ = 4). Finally, we demonstrated that fuzzy clustering can provide a more parsimonious and powerful alternative for identifying separable multivariate normal distributions, particularly those with slightly lower centroid separation (Δ = 3). CONCLUSIONS Traditional intuitions about statistical power only partially apply to cluster analysis: increasing the number of participants above a sufficient sample size did not improve power, but effect size was crucial. 
Notably, for the popular dimensionality reduction and clustering algorithms tested here, power was only satisfactory for relatively large effect sizes (clear separation between subgroups). Fuzzy clustering provided higher power in multivariate normal distributions. Overall, we recommend that researchers (1) only apply cluster analysis when large subgroup separation is expected, (2) aim for sample sizes of N = 20 to N = 30 per expected subgroup, (3) use multi-dimensional scaling to improve cluster separation, and (4) use fuzzy clustering or mixture modelling approaches that are more powerful and more parsimonious with partially overlapping multivariate normal distributions.
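The simulation logic summarized above (generate subgroups with a known separation, cluster them, score the result) can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline: it covers only two subgroups, a bare-bones 2-means routine, and classification accuracy rather than a formal power estimate, and all function names and defaults are assumptions.

```python
import random

def simulate(n_per_group, delta, n_features=2, rng=None):
    """Two unit-variance Gaussian subgroups, separated by delta on feature 0."""
    rng = rng or random.Random()
    points, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            p = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            p[0] += label * delta
            points.append(p)
            labels.append(label)
    return points, labels

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def two_means(points, n_iter=50):
    """Minimal Lloyd's algorithm with k = 2 and a farthest-point start."""
    centroids = [points[0], max(points, key=lambda p: sqdist(p, points[0]))]
    assign = []
    for _ in range(n_iter):
        new_assign = [
            0 if sqdist(p, centroids[0]) <= sqdist(p, centroids[1]) else 1
            for p in points
        ]
        if new_assign == assign:
            break  # converged
        assign = new_assign
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def accuracy(assign, labels):
    """Agreement with the true subgroups, ignoring which cluster is called 0 or 1."""
    agree = sum(a == l for a, l in zip(assign, labels)) / len(labels)
    return max(agree, 1 - agree)

def mean_accuracy(n_per_group, delta, n_sims=200, seed=0):
    """Average clustering accuracy over repeated simulated datasets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        points, labels = simulate(n_per_group, delta, rng=rng)
        total += accuracy(two_means(points), labels)
    return total / n_sims

for delta in (0.5, 2.0, 4.0):
    print(delta, round(mean_accuracy(20, delta), 3))
```

Running the loop with N = 20 per subgroup shows accuracy rising with the separation, which is the qualitative pattern the abstract reports; the exact numbers depend on the assumptions above.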
Affiliation(s)
- Edwin S Dalmaijer
- MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK.
| | - Camilla L Nord
- MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK
| | - Duncan E Astle
- MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK
|
38
|
Muñoz-Baena L, Poon AFY. Using networks to analyze and visualize the distribution of overlapping genes in virus genomes. PLoS Pathog 2022; 18:e1010331. [PMID: 35202429 PMCID: PMC8903798 DOI: 10.1371/journal.ppat.1010331] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/02/2021] [Revised: 03/08/2022] [Accepted: 02/02/2022] [Indexed: 11/19/2022] Open
Abstract
Gene overlap occurs when two or more genes are encoded by the same nucleotides. This phenomenon is found in all taxonomic domains, but is particularly common in viruses, where it may increase the information content of compact genomes or influence the creation of new genes. Here we report a global comparative study of overlapping open reading frames (OvRFs) of 12,609 virus reference genomes in the NCBI database. We retrieved metadata associated with all annotated open reading frames (ORFs) in each genome record to calculate the number, length, and frameshift of OvRFs. Our results show that while the number of OvRFs increases with genome length, they tend to be shorter in longer genomes. The majority of overlaps involve +2 frameshifts, predominantly found in dsDNA viruses. Antisense overlaps in which one of the ORFs was encoded in the same frame on the opposite strand (−0) tend to be longer. Next, we develop a new graph-based representation of the distribution of overlaps among the ORFs of genomes in a given virus family. In the absence of an unambiguous partition of ORFs by homology at this taxonomic level, we used an alignment-free k-mer based approach to cluster protein coding sequences by similarity. We connect these clusters with two types of directed edges to indicate (1) that constituent ORFs are adjacent in one or more genomes, and (2) that these ORFs overlap. These adjacency graphs not only provide a natural visualization scheme, but also a novel statistical framework for analyzing the effects of gene- and genome-level attributes on the frequencies of overlaps.
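To make the overlap length and frameshift quantities concrete, here is a minimal sketch for a pair of annotated ORFs. The `ORF` record, the 1-based inclusive coordinates, and the frame convention (offset between translation start positions, modulo 3, for ORFs on the same strand) are assumptions for illustration; the study's handling of antisense overlaps such as −0 is not reproduced here.

```python
from typing import NamedTuple

class ORF(NamedTuple):
    start: int   # 1-based, inclusive, leftmost genome coordinate
    end: int     # 1-based, inclusive, rightmost genome coordinate
    strand: int  # +1 for forward, -1 for reverse

def overlap_length(a: ORF, b: ORF) -> int:
    """Number of nucleotides shared by the two ORFs (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)

def same_strand_frame_offset(a: ORF, b: ORF) -> int:
    """Reading-frame offset (0, 1 or 2) between two ORFs on the same strand."""
    if a.strand != b.strand:
        raise ValueError("frame offset here is only defined for same-strand ORFs")
    # Translation begins at `start` on the forward strand and at `end` on the reverse.
    site_a = a.start if a.strand == 1 else a.end
    site_b = b.start if b.strand == 1 else b.end
    return abs(site_a - site_b) % 3

first = ORF(start=1, end=300, strand=1)
second = ORF(start=200, end=500, strand=1)
print(overlap_length(first, second))            # 101 shared nucleotides
print(same_strand_frame_offset(first, second))  # offset of 1
```

Applied to every adjacent ORF pair in a genome record, these two functions would yield the per-genome counts and lengths that a comparative analysis like the one described could aggregate.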
Affiliation(s)
- Laura Muñoz-Baena
- Department of Microbiology and Immunology, Western University, London, ON, Canada
| | - Art F. Y. Poon
- Department of Microbiology and Immunology, Western University, London, ON, Canada
- Department of Pathology and Laboratory Medicine, Western University, London, ON, Canada
|
39
|
Renigunta Mohammed N, Mohammed M. Multi-viewpoints visual models for efficient modeling and analysis of Twitter based health-care services. INTERNATIONAL JOURNAL OF PERVASIVE COMPUTING AND COMMUNICATIONS 2021. [DOI: 10.1108/ijpcc-06-2021-0140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Purpose
In eHealth text mining, cosine-based visual methods (VM) assess clusters more accurately than Euclidean-based ones and are therefore recommended for cluster assessment of tweet data models. Such VM assess the clusters with respect to a single viewpoint or none, which is less informative. In this study, multi-viewpoints (MVP) were used to obtain a more informative cluster assessment of health-care tweet documents and to demonstrate visual analysis of cluster tendency.
Design/methodology/approach
In this paper, the authors proposed MVP-based VM by using traditional topic models with visual techniques to find cluster tendency, partitioning for cluster validity to propose health-care recommendations based on tweets. The authors demonstrated the effectiveness of proposed methods on different real-time Twitter health-care data sets in the experimental study. The authors also did a comparative analysis of proposed models with existing visual assessment tendency (VAT) and cVAT models by using cluster validity indices and computational complexities; the examples suggest that MVP VM were more informative.
Findings
In this paper, the authors proposed MVP-based VM by using traditional topic models with visual techniques to find cluster tendency, partitioning for cluster validity to propose health-care recommendations based on tweets.
Originality/value
In this paper, the authors proposed multi-viewpoints distance metric in topic model cluster tendency for the first time and visual representation using VAT images using hybrid topic models to find cluster tendency, partitioning for cluster validity to propose health-care recommendations based on tweets.
|
40
|
McCullough MH, Goodhill GJ. Unsupervised quantification of naturalistic animal behaviors for gaining insight into the brain. Curr Opin Neurobiol 2021; 70:89-100. [PMID: 34482006 DOI: 10.1016/j.conb.2021.07.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/12/2021] [Revised: 07/20/2021] [Accepted: 07/21/2021] [Indexed: 01/02/2023]
Abstract
Neural computation has evolved to optimize the behaviors that enable our survival. Although much previous work in neuroscience has focused on constrained task behaviors, recent advances in computer vision are fueling a trend toward the study of naturalistic behaviors. Automated tracking of fine-scale behaviors is generating rich datasets for animal models including rodents, fruit flies, zebrafish, and worms. However, extracting meaning from these large and complex data often requires sophisticated computational techniques. Here we review the latest methods and modeling approaches providing new insights into the brain from behavior. We focus on unsupervised methods for identifying stereotyped behaviors and for resolving details of the structure and dynamics of behavioral sequences.
Affiliation(s)
- Michael H McCullough
- Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, 4072, Australia
| | - Geoffrey J Goodhill
- Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, 4072, Australia; School of Mathematics and Physics, The University of Queensland, Brisbane, Queensland, 4072, Australia.
|
41
|
A Constrained Feature Selection Approach Based on Feature Clustering and Hypothesis Margin Maximization. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021. [DOI: 10.1155/2021/5554873] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
In this paper, we propose a semisupervised feature selection approach that is based on feature clustering and hypothesis margin maximization. The aim is to improve the classification accuracy by choosing the right feature subset and to allow building more interpretable models. Our approach handles the two core aspects of feature selection, i.e., relevance and redundancy, and is divided into three steps. First, the similarity weights between features are represented by a sparse graph where each feature can be reconstructed from the sparse linear combination of the others. Second, features are then hierarchically clustered identifying groups of the most similar ones. Finally, a semisupervised margin-based objective function is optimized to select the most data discriminative feature from within each cluster, hence maximizing relevance while minimizing redundancy among features. Eventually, we empirically validate our proposed approach on multiple well-known UCI benchmark datasets in terms of classification accuracy and representation entropy, where it proved to outperform four other semisupervised and unsupervised methods and competed with two widely used supervised ones.
|
42
|
Hagemann C, Tyzack GE, Taha DM, Devine H, Greensmith L, Newcombe J, Patani R, Serio A, Luisier R. Automated and unbiased discrimination of ALS from control tissue at single cell resolution. Brain Pathol 2021; 31:e12937. [PMID: 33576079 PMCID: PMC8412073 DOI: 10.1111/bpa.12937] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/26/2020] [Revised: 12/21/2020] [Accepted: 01/07/2021] [Indexed: 12/27/2022] Open
Abstract
Histopathological analysis of tissue sections is invaluable in neurodegeneration research. However, cell-to-cell variation in both the presence and severity of a given phenotype is a key limitation of this approach, reducing the signal to noise ratio and leaving unresolved the potential of single-cell scoring for a given disease attribute. Here, we tested different machine learning methods to analyse high-content microscopy measurements of hundreds of motor neurons (MNs) from amyotrophic lateral sclerosis (ALS) post-mortem tissue sections. Furthermore, we automated the identification of phenotypically distinct MN subpopulations in VCP- and SOD1-mutant transgenic mice, revealing common morphological cellular phenotypes. Additionally we established scoring metrics to rank cells and tissue samples for both disease probability and severity. By adapting this paradigm to human post-mortem tissue, we validated our core finding that morphological descriptors robustly discriminate ALS from control healthy tissue at single cell resolution. Determining disease presence, severity and unbiased phenotypes at single cell resolution might prove transformational in our understanding of ALS and neurodegeneration more broadly.
Affiliation(s)
- Cathleen Hagemann
- The Francis Crick Institute, London, UK
- Centre for Craniofacial & Regenerative Biology, King's College London, London, UK
| | - Giulia E. Tyzack
- The Francis Crick Institute, London, UK
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Doaa M. Taha
- The Francis Crick Institute, London, UK
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Helen Devine
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Linda Greensmith
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Jia Newcombe
- NeuroResource, Department of Neuroinflammation, UCL Queen Square Institute of Neurology, London, UK
| | - Rickie Patani
- The Francis Crick Institute, London, UK
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Andrea Serio
- The Francis Crick Institute, London, UK
- Centre for Craniofacial & Regenerative Biology, King's College London, London, UK
|
43
|
Taus P, Pospisilova S, Plevova K. Identification of Clinically Relevant Subgroups of Chronic Lymphocytic Leukemia Through Discovery of Abnormal Molecular Pathways. Front Genet 2021; 12:627964. [PMID: 34262590 PMCID: PMC8273263 DOI: 10.3389/fgene.2021.627964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2020] [Accepted: 05/04/2021] [Indexed: 11/13/2022] Open
Abstract
Chronic lymphocytic leukemia (CLL) is the most common form of adult leukemia in the Western world with a highly variable clinical course. Its striking genetic heterogeneity is not yet fully understood. Although the CLL genetic landscape has been well-described, patient stratification based on mutation profiles remains elusive mainly due to the heterogeneity of data. Here we attempted to decrease the heterogeneity of somatic mutation data by mapping mutated genes in the respective biological processes. From the sequencing data gathered by the International Cancer Genome Consortium for 506 CLL patients, we generated pathway mutation scores, applied ensemble clustering on them, and extracted abnormal molecular pathways with a machine learning approach. We identified four clusters differing in pathway mutational profiles and time to first treatment. Interestingly, common CLL drivers such as ATM or TP53 were associated with particular subtypes, while others like NOTCH1 or SF3B1 were not. This study provides an important step in understanding mutational patterns in CLL.
Affiliation(s)
- Petr Taus
- Central European Institute of Technology, Masaryk University, Brno, Czechia
| | - Sarka Pospisilova
- Central European Institute of Technology, Masaryk University, Brno, Czechia; Department of Internal Medicine - Hematology and Oncology, University Hospital Brno, Brno, Czechia; Faculty of Medicine, Masaryk University, Brno, Czechia
| | - Karla Plevova
- Central European Institute of Technology, Masaryk University, Brno, Czechia; Department of Internal Medicine - Hematology and Oncology, University Hospital Brno, Brno, Czechia; Faculty of Medicine, Masaryk University, Brno, Czechia
|
44
|
Peterson EJR, Abidi AA, Arrieta-Ortiz ML, Aguilar B, Yurkovich JT, Kaur A, Pan M, Srinivas V, Shmulevich I, Baliga NS. Intricate Genetic Programs Controlling Dormancy in Mycobacterium tuberculosis. Cell Rep 2021; 31:107577. [PMID: 32348771 PMCID: PMC7605849 DOI: 10.1016/j.celrep.2020.107577] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 05/14/2019] [Revised: 12/18/2019] [Accepted: 04/06/2020] [Indexed: 11/24/2022] Open
Abstract
Mycobacterium tuberculosis (MTB) displays the remarkable ability to transition in and out of dormancy, a hallmark of the pathogen’s capacity to evade the immune system and exploit susceptible individuals. Uncovering the gene regulatory programs that underlie the phenotypic shifts in MTB during disease latency and reactivation has posed a challenge. We develop an experimental system to precisely control dissolved oxygen levels in MTB cultures in order to capture the transcriptional events that unfold as MTB transitions into and out of hypoxia-induced dormancy. Using a comprehensive genome-wide transcription factor binding map and insights from network topology analysis, we identify regulatory circuits that deterministically drive sequential transitions across six transcriptionally and functionally distinct states encompassing more than three-fifths of the MTB genome. The architecture of the genetic programs explains the transcriptional dynamics underlying synchronous entry of cells into a dormant state that is primed to infect the host upon encountering favorable conditions. Mycobacterium tuberculosis (MTB) persists within the host by counteracting disparate stressors including hypoxia. Peterson et al. report a transcriptional program that coordinates sequential state transitions to drive MTB in and out of hypoxia-induced dormancy. Among varied properties, this program encodes advanced preparedness to infect the host in favorable conditions.
Affiliation(s)
- Abrar A Abidi
- Institute for Systems Biology, Seattle, WA 98109, USA
- Boris Aguilar
- Institute for Systems Biology, Seattle, WA 98109, USA
- Amardeep Kaur
- Institute for Systems Biology, Seattle, WA 98109, USA
- Min Pan
- Institute for Systems Biology, Seattle, WA 98109, USA
- Nitin S Baliga
- Institute for Systems Biology, Seattle, WA 98109, USA; Molecular and Cellular Biology Program, Departments of Microbiology and Biology, University of Washington, Seattle, WA; Lawrence Berkeley National Laboratories, Berkeley, CA.

45
Kavran AJ, Clauset A. Denoising large-scale biological data using network filters. BMC Bioinformatics 2021; 22:157. [PMID: 33765911 PMCID: PMC7992843 DOI: 10.1186/s12859-021-04075-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/25/2020] [Accepted: 03/15/2021] [Indexed: 11/29/2022]
Abstract
BACKGROUND Large-scale biological data sets are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. RESULTS We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 43% compared to using unfiltered data. CONCLUSIONS Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1186/s12859-021-04075-x.
Affiliation(s)
- Andrew J Kavran
- Department of Biochemistry, University of Colorado, Boulder, CO, USA; BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Aaron Clauset
- BioFrontiers Institute, University of Colorado, Boulder, CO, USA; Department of Computer Science, University of Colorado, Boulder, CO, USA; Santa Fe Institute, Santa Fe, NM, USA.

46
Hu H, Yin R, Brown HM, Laskin J. Spatial Segmentation of Mass Spectrometry Imaging Data by Combining Multivariate Clustering and Univariate Thresholding. Anal Chem 2021; 93:3477-3485. [PMID: 33570915 PMCID: PMC7904669 DOI: 10.1021/acs.analchem.0c04798] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 01/04/2023]
Abstract
Spatial segmentation partitions mass spectrometry imaging (MSI) data into distinct regions, providing a concise visualization of the vast amount of data and identifying regions of interest (ROIs) for downstream statistical analysis. Unsupervised approaches are particularly attractive, as they may be used to discover the underlying subpopulations present in the high-dimensional MSI data without prior knowledge of the properties of the sample. Herein, we introduce an unsupervised spatial segmentation approach, which combines multivariate clustering and univariate thresholding to generate comprehensive spatial segmentation maps of the MSI data. This approach combines matrix factorization and manifold learning to enable high-quality image segmentation without an extensive hyperparameter search. In parallel, some ion images inadequately represented in the multivariate analysis were treated using univariate thresholding to generate complementary spatial segments. The final spatial segmentation map was assembled from segment candidates that were generated using both techniques. We demonstrate the performance and robustness of this approach for two MSI data sets of mouse uterine and kidney tissue sections that were acquired with different spatial resolutions. The resulting segmentation maps are easy to interpret and project onto the known anatomical regions of the tissue.
Affiliation(s)
- Hang Hu
- Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
- Ruichuan Yin
- Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
- Hilary M Brown
- Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
- Julia Laskin
- Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States

47
Peris-Díaz MD, Krężel A. A guide to good practice in chemometric methods for vibrational spectroscopy, electrochemistry, and hyphenated mass spectrometry. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2020.116157] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/16/2022]

48
Khan SR, Al Rijjal D, Piro A, Wheeler MB. Integration of AI and traditional medicine in drug discovery. Drug Discov Today 2021; 26:982-992. [PMID: 33476566 DOI: 10.1016/j.drudis.2021.01.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/14/2020] [Revised: 12/01/2020] [Accepted: 01/11/2021] [Indexed: 11/24/2022]
Abstract
AI integration in plant-based traditional medicine could be used to overcome drug discovery challenges.
Affiliation(s)
- Saifur R Khan
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada.
- Dana Al Rijjal
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada
- Anthony Piro
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada
- Michael B Wheeler
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada

49

50
Muñoz-Rojas AR, Kelsey I, Pappalardo JL, Chen M, Miller-Jensen K. Co-stimulation with opposing macrophage polarization cues leads to orthogonal secretion programs in individual cells. Nat Commun 2021; 12:301. [PMID: 33436596 PMCID: PMC7804107 DOI: 10.1038/s41467-020-20540-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 03/26/2020] [Accepted: 12/07/2020] [Indexed: 12/13/2022]
Abstract
Macrophages are innate immune cells that contribute to fighting infections, tissue repair, and maintaining tissue homeostasis. To enable such functional diversity, macrophages resolve potentially conflicting cues in the microenvironment via mechanisms that are unclear. Here, we use single-cell RNA sequencing to explore how individual macrophages respond when co-stimulated with inflammatory stimuli LPS and IFN-γ and the resolving cytokine IL-4. These co-stimulated macrophages display a distinct global transcriptional program. However, variable negative cross-regulation between some LPS + IFN-γ-specific and IL-4-specific genes results in cell-to-cell heterogeneity in transcription. Interestingly, negative cross-regulation leads to mutually exclusive expression of the T-cell-polarizing cytokine genes Il6 and Il12b versus the IL-4-associated factors Arg1 and Chil3 in single co-stimulated macrophages, and single-cell secretion measurements show that these specialized functions are maintained for at least 48 h. This study suggests that increasing functional diversity in the population is one strategy macrophages use to respond to conflicting environmental cues.
Affiliation(s)
- Andrés R Muñoz-Rojas
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Department of Immunology, Harvard Medical School, Boston, MA, USA
- Ilana Kelsey
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Jenna L Pappalardo
- Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA
- Meibin Chen
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Kathryn Miller-Jensen
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT, USA.
- Systems Biology Institute, Yale University, New Haven, CT, USA.