1
DePamphilis GM, Legere C, Vigne MM, Tirrell E, Holler K, Carpenter LL, Kavanaugh BC. Transdiagnostic Attentional Deficits Are Associated with Depressive and Externalizing Symptoms in Children and Adolescents with Neuropsychiatric Disorders. Arch Clin Neuropsychol 2025; 40:783-793. [PMID: 39540608 DOI: 10.1093/arclin/acae103] [Received: 06/07/2024] [Revised: 09/19/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]
Abstract
OBJECTIVE Although inattention, impulsivity, and impairments to vigilance are most associated with attention-deficit/hyperactivity disorder (ADHD), transdiagnostic attentional deficits are prevalent across all psychiatric disorders. To further elucidate this relationship, the present study investigated parent-reported neuropsychiatric symptom correlates of attention deficits using the factor structure of the Conners' Continuous Performance Test (CPT-II), a neuropsychological test of attention. METHOD Two hundred and eighteen children and adolescents (7-21 years old) completed the CPT-II as part of standard clinical protocol during outpatient pediatric neuropsychology visits. The factor structure of the CPT-II was determined with a principal component analysis (PCA) using Promax rotation. Pearson correlation analyses and regression models examined the relationship between the generated factor structure, parent-reported clinical symptoms, and pre-determined clinical diagnoses. RESULTS Results from the PCA suggested that a three-factor model best supported the structure of the CPT-II; the factors were subsequently defined as inattention, impulsivity, and vigilance. Performance-based inattention was significantly correlated with parent-reported hyperactivity, aggression, conduct problems, and depression. Parent-reported depressive symptoms and conduct problems were the strongest correlates of performance-based inattention, not hyperactivity or aggression. Performance-based inattention was significantly associated with an ADHD diagnosis but not a depression or anxiety diagnosis. CONCLUSIONS Findings suggest attentional deficits are not specific to any one disorder. To enhance the identification, classification, and treatment of neuropsychiatric disorders, researchers and clinicians alike must diminish the importance of categorical approaches to child/adolescent psychopathology and continue to consider the dimensionality of transdiagnostic characteristics such as inattention.
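As a rough illustration of the Pearson correlation analyses named in the METHOD, the sketch below computes a correlation coefficient between a performance-based factor score and a parent-reported rating. The function name `pearson_r`, the variable names, and all numbers are hypothetical placeholders, not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical factor scores and parent ratings (illustrative only).
inattention_factor = [0.2, -1.1, 0.9, 1.5, -0.4, 0.0]
depression_rating = [52, 41, 60, 66, 47, 50]
r = pearson_r(inattention_factor, depression_rating)
print(f"r = {r:.3f}")
```

A coefficient near +1 or -1 indicates a strong linear association; the significance tests and regression models mentioned in the abstract are not reproduced here.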
Affiliation(s)
- Gian M DePamphilis
  - Center of Biomedical Research Excellence (COBRE), Center for Neuromodulation, Butler Hospital, 345 Blackstone Boulevard, Providence, RI 02906, USA
- Christopher Legere
  - Emma Pendleton Bradley Hospital, 1011 Veterans Memorial Parkway, East Providence, RI 02915, USA
- Megan M Vigne
  - Center of Biomedical Research Excellence (COBRE), Center for Neuromodulation, Butler Hospital, 345 Blackstone Boulevard, Providence, RI 02906, USA
- Eric Tirrell
  - Center of Biomedical Research Excellence (COBRE), Center for Neuromodulation, Butler Hospital, 345 Blackstone Boulevard, Providence, RI 02906, USA
  - Department of Psychiatry and Human Behavior, Warren Alpert Medical School, 222 Richmond Street, Providence, RI 02903, USA
- Karen Holler
  - Emma Pendleton Bradley Hospital, 1011 Veterans Memorial Parkway, East Providence, RI 02915, USA
  - Department of Psychiatry and Human Behavior, Warren Alpert Medical School, 222 Richmond Street, Providence, RI 02903, USA
- Linda L Carpenter
  - Center of Biomedical Research Excellence (COBRE), Center for Neuromodulation, Butler Hospital, 345 Blackstone Boulevard, Providence, RI 02906, USA
  - Department of Psychiatry and Human Behavior, Warren Alpert Medical School, 222 Richmond Street, Providence, RI 02903, USA
- Brian C Kavanaugh
  - Emma Pendleton Bradley Hospital, 1011 Veterans Memorial Parkway, East Providence, RI 02915, USA
  - Department of Psychiatry and Human Behavior, Warren Alpert Medical School, 222 Richmond Street, Providence, RI 02903, USA
2
de Oliveira BR, Zuffo AM, Dos Santos Silva FC, Steiner F, AlGarawi AM, Okla MK, Nhs M, Alhaj Hamoud Y, Josko I, Sheteiwy MS, Alyafei MS, Sulieman S. Random forest algorithms: a tool to identify the impact of arbuscular mycorrhizal fungi inoculation, seed maturation stage and geographic diversity of Pimpinella anisum L. accessions on the physicochemical composition of seeds. BMC Plant Biol 2025; 25:608. [PMID: 40346478 PMCID: PMC12063322 DOI: 10.1186/s12870-025-06536-4] [Received: 09/29/2024] [Accepted: 04/10/2025] [Indexed: 05/11/2025]
Abstract
BACKGROUND A study using random forest (RF) algorithms and principal component analysis (PCA) was proposed to identify the effects of arbuscular mycorrhizal fungal inoculation, the seed maturation stage and the geographic diversity of Pimpinella anisum L. accessions on the physicochemical composition of seeds. Seeds of six anise varieties from North African and Middle Eastern accessions were inoculated or not inoculated with AMF (an arbuscular mycorrhizal fungus) and then grown under controlled conditions. Seeds were harvested at three different maturity stages: mature seeds (157 d after sowing), premature seeds (147 d after sowing), and immature seeds (137 d after sowing). Forty-nine variables related to physical properties, total nutrients, metabolic compounds, essential oils, and biological activity were measured in P. anisum seeds. RESULTS The RF algorithm allows the differentiation of P. anisum varieties inoculated with AMF from different countries in North Africa and the Middle East. This evidence proves that the geographic origin of P. anisum seeds significantly influences the efficiency of the symbiotic association between anise roots and AMF. In turn, no significant effects of the seed maturation stage on the symbiotic interaction of plants with mycorrhizae were observed. The chemical compounds related to the biological activity of seeds are most influenced by AMF, followed by chemical compounds related to metabolism, total nutrients, and oil components. CONCLUSIONS The performance of classification models using RF is driven primarily by independent variables related to the chemical composition of anise seeds, overshadowing the effects of geographic diversity and the seed maturation stage. Among the chemical constituents of the seed, the variables belonging to the biological activity category best contain information (patterns) on the impacts of AMF inoculation.
Affiliation(s)
- Alan Mario Zuffo
  - Department of Agronomy, State University of Maranhão, Balsas, MA, Brazil
- Fábio Steiner
  - Department of Agronomy, State University of Mato Grosso Do Sul, Cassilândia, MS, Brazil
- Amal Mohamed AlGarawi
  - Botany and Microbiology Department, College of Science, King Saud University, P.O. Box 2455, 11451, Riyadh, Saudi Arabia
- Mohammad K Okla
  - Botany and Microbiology Department, College of Science, King Saud University, P.O. Box 2455, 11451, Riyadh, Saudi Arabia
- Mousa Nhs
  - Botany & Microbiology Department, Faculty of Science, Assiut University, P.O. Box 71516, Assiut, Egypt
- Yousef Alhaj Hamoud
  - The National Key Laboratory of Water Disaster Prevention, and College of Hydrology and Water Resources, Hohai University, Nanjing, 210098, China
  - Research Centre for Horticultural Crops (FGK), Fachhochschule Erfurt, 99090, Erfurt, Germany
- Izabela Josko
  - Institute of Plant Genetics, Breeding and Biotechnology, Faculty of Agrobioengineering, University of Life Sciences, 20-950, Lublin, Poland
- Mohamed S Sheteiwy
  - Department of Integrative Agriculture, College of Agriculture and Veterinary Medicine, United Arab Emirates University, P.O. Box 15551, Al Ain, Abu Dhabi, United Arab Emirates
- Mohamed Salem Alyafei
  - Department of Integrative Agriculture, College of Agriculture and Veterinary Medicine, United Arab Emirates University, P.O. Box 15551, Al Ain, Abu Dhabi, United Arab Emirates
- Saad Sulieman
  - Department of Integrative Agriculture, College of Agriculture and Veterinary Medicine, United Arab Emirates University, P.O. Box 15551, Al Ain, Abu Dhabi, United Arab Emirates
  - Department of Agronomy, Faculty of Agriculture, University of Khartoum, Khartoum North 13314, Shambat, Sudan
3
Deepali, Goel N, Khandnor P. DeepOmicsSurv: a deep learning-based model for survival prediction of oral cancer. Discov Oncol 2025; 16:614. [PMID: 40278990 PMCID: PMC12031713 DOI: 10.1007/s12672-025-02346-0] [Received: 10/09/2024] [Accepted: 04/09/2025] [Indexed: 04/26/2025]
Abstract
OBJECTIVE Oral cancer is an important health challenge worldwide and accurate survival time prediction of this disease can guide treatment decisions. This study aims to propose a deep learning-based model, DeepOmicsSurv, to predict survival in oral cancer patients using clinical and multi-omics data. METHODS DeepOmicsSurv builds on the DeepSurv model, incorporating multi-head attention convolutional layers, dropout, pooling, and batch normalization to boost its strength and precision. Various dimensionality reduction techniques, including Principal Component Analysis (PCA), Kernel PCA, Non-Negative Matrix Factorization (NMF), Singular Value Decomposition (SVD), Partial Least Squares (PLS), Multidimensional Scaling (MDS), and Autoencoders, were employed to manage the high-dimensional omics data. The model's performance was evaluated against DeepSurv, DeepHit, Cox Proportional Hazards (CoxPH), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). Additionally, SHapley Additive Explanations (SHAP) was used to analyze the impact of clinical features on survival predictions. RESULTS DeepOmicsSurv achieved a C-index of 0.966, MSE of 0.0138, RMSE of 0.1174, MAE of 0.0795, and MedAE of 0.0515, outperforming other deep learning models. Among various dimensionality reduction techniques, autoencoder performed the best with DeepOmicsSurv. SHAP analysis showed that Age, AJCC N Stage, alcohol history and patient smoking history are prevalent clinical features for survival time. CONCLUSION In conclusion, DeepOmicsSurv has the potential to predict survival time in oral cancer patients. This model achieved high accuracy with various data types including Clinical, DNAmethylation + clinical, mRNA + clinical, Copy number alteration + clinical, or multi-omics data. Additionally, SHAP analysis reveals clinical factors that influence survival time.
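The C-index reported in the RESULTS is a concordance measure. The sketch below shows one common definition (Harrell's concordance index) in plain Python; the abstract does not state which variant the authors computed, so this is an assumption, and the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    times       -- observed follow-up time per subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (illustrative only): the third subject is censored.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.966 should be read.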
Affiliation(s)
- Deepali
  - University Institute of Engineering and Technology, Panjab University, Chandigarh, 160014, India
  - Department of Computer Science, Guru Nanak College, Budhlada, 151502, India
- Neelam Goel
  - University Institute of Engineering and Technology, Panjab University, Chandigarh, 160014, India
- Padmavati Khandnor
  - Department of Computer Science, Punjab Engineering College (Deemed to be University), Chandigarh, 160012, India
4
John J, Stannard S, Fraser SDS, Berrington A, Alwan NA. Clusters and associations of adverse neonatal events with adult risk of multimorbidity: A secondary analysis of birth cohort data. PLoS One 2025; 20:e0319200. [PMID: 40100914 PMCID: PMC11918344 DOI: 10.1371/journal.pone.0319200] [Received: 05/02/2024] [Accepted: 01/28/2025] [Indexed: 03/20/2025]
Abstract
OBJECTIVE To investigate associations between clustered adverse neonatal events and later-life multimorbidity. DESIGN Secondary analysis of birth cohort data. SETTING Prospective birth cohort study of individuals born in Britain in one week of 1970. POPULATION Respondents provided data at birth (n = 17,196), age 34 (n = 11,261), age 38 (n = 9,665), age 42 (n = 9,840), and age 46 (n = 8,580). METHODS Mixed components analysis determined the factors included within the composite adverse neonatal event score: 'Birthweight'; 'Neonatal cyanosis'; 'Neonatal cerebral signs'; 'Neonatal illnesses'; 'Neonatal breathing difficulties'; and 'Prolonged duration to establishment of respiratory rate at birth'. Log-binomial regression quantified the unadjusted and covariate-adjusted (paternal employment status and social class; maternal smoking status; maternal age; parity; cohort member smoking status and Body Mass Index) associations between the adverse neonatal event score and risk of multimorbidity in adulthood. OUTCOME MEASURES Multimorbidity at each adult data sweep, defined as the presence of two or more Long-Term Conditions (LTCs). RESULTS 13.7% of respondents experienced one or more adverse neonatal event(s) at birth. The percentage reporting multimorbidity increased steadily from 14.6% at age 34 to 25.5% at age 46. A significant association was only observed at the 38 years sweep; those who had experienced two or more adverse neonatal events had a 41.0% increased risk of multimorbidity (95% CI for the risk ratio: 1.05-1.88), compared to those who had not suffered any adverse neonatal events at birth. This association was maintained following adjustment for parental confounders and adult smoking status. CONCLUSIONS Adverse neonatal events at birth may be independently associated with the development of midlife multimorbidity. Programmes and policies aimed at tackling the growing public health burden of multimorbidity may also need to consider interventions to reduce adverse neonatal events at birth.
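The RESULTS quote a percentage increase in risk alongside a confidence interval on the ratio scale. The sketch below shows how the two relate, using an unadjusted risk ratio from a 2x2 table with a log-scale (Katz) interval. This is a simplification of the log-binomial regression used in the study; the function names and the counts are hypothetical and are not the cohort's data.

```python
import math

def risk_ratio_ci(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, z=1.96):
    """Unadjusted risk ratio with a Wald interval computed on the log scale."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

def percent_increase(rr):
    """Express a risk ratio as a percentage increase over the reference risk."""
    return (rr - 1.0) * 100.0

# Hypothetical counts (illustrative only): 30/100 exposed vs 200/1000 unexposed.
rr, lower, upper = risk_ratio_ci(30, 100, 200, 1000)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f}), "
      f"{percent_increase(rr):.1f}% increased risk")
```

Read this way, a 41.0% increase corresponds to a risk ratio of 1.41, and an interval that excludes 1 (such as 1.05 to 1.88) indicates a statistically significant association at the 5% level.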
Affiliation(s)
- Jeeva John
  - School of Primary Care, Population Sciences and Medical Education, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Seb Stannard
  - School of Primary Care, Population Sciences and Medical Education, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Simon D. S. Fraser
  - School of Primary Care, Population Sciences and Medical Education, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Ann Berrington
  - Department of Social Statistics and Demography, University of Southampton, Southampton, United Kingdom
- Nisreen A. Alwan
  - School of Primary Care, Population Sciences and Medical Education, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - University Hospital Southampton National Health Service Foundation Trust, Southampton, United Kingdom
  - National Institute for Health Research Applied Research Collaboration Wessex, Southampton, United Kingdom
5
Gliozzo J, Soto-Gomez M, Guarino V, Bonometti A, Cabri A, Cavalleri E, Reese J, Robinson PN, Mesiti M, Valentini G, Casiraghi E. Intrinsic-dimension analysis for guiding dimensionality reduction and data fusion in multi-omics data processing. Artif Intell Med 2025; 160:103049. [PMID: 39673960 DOI: 10.1016/j.artmed.2024.103049] [Received: 10/24/2023] [Revised: 12/03/2024] [Accepted: 12/04/2024] [Indexed: 12/16/2024]
Abstract
Multi-omics data have revolutionized biomedical research by providing a comprehensive understanding of biological systems and the molecular mechanisms of disease development. However, analyzing multi-omics data is challenging due to high dimensionality and limited sample sizes, necessitating proper data-reduction pipelines to ensure reliable analyses. Additionally, its multimodal nature requires effective data-integration pipelines. While several dimensionality reduction and data fusion algorithms have been proposed, crucial aspects are often overlooked. Specifically, the choice of projection space dimension is typically heuristic and uniformly applied across all omics, neglecting the unique high dimension small sample size challenges faced by individual omics. This paper introduces a novel multi-modal dimensionality reduction pipeline tailored to individual views. By leveraging intrinsic dimensionality estimators, we assess the curse-of-dimensionality impact on each view and propose a two-step reduction strategy for significantly affected views, combining feature selection with feature extraction. Compared to traditional uniform reduction pipelines in a crucial and supervised multi-omics analysis setting, our approach shows significant improvement. Additionally, we explore three effective unsupervised multi-omics data fusion methods rooted in the main data fusion strategies to gain insights into their performance under crucial, yet overlooked, settings.
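To make the idea of an intrinsic dimensionality estimator concrete, the sketch below implements a TwoNN-style maximum-likelihood estimate, which uses the ratio of each point's second to first nearest-neighbor distance. The abstract does not name the estimators the authors used, so the choice of TwoNN here is an assumption; the function name and the synthetic data are hypothetical.

```python
import math
import random

def two_nn_dimension(points):
    """Estimate intrinsic dimension from 2nd/1st nearest-neighbor distance ratios."""
    log_ratio_sum = 0.0
    used = 0
    for i, p in enumerate(points):
        # Track the two smallest distances from p to any other point.
        r1 = r2 = math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < r1:
                r1, r2 = d, r1
            elif d < r2:
                r2 = d
        # Skip duplicate points (r1 == 0) and degenerate inputs.
        if r1 > 0 and math.isfinite(r2):
            log_ratio_sum += math.log(r2 / r1)
            used += 1
    if used == 0 or log_ratio_sum == 0:
        raise ValueError("not enough distinct points to estimate a dimension")
    # Maximum-likelihood estimate: number of points over the summed log ratios.
    return used / log_ratio_sum

# Synthetic data (illustrative only): a line and a plane embedded in 3-D.
rng = random.Random(0)
line = [(t, 2 * t, -t) for t in (rng.random() for _ in range(300))]
plane = [(u, v, u + v)
         for u, v in ((rng.random(), rng.random()) for _ in range(300))]
d_line = two_nn_dimension(line)
d_plane = two_nn_dimension(plane)
print(f"line: {d_line:.2f}, plane: {d_plane:.2f}")
```

Both datasets have three coordinates, yet the estimates should come out close to 1 and 2 respectively. That gap between ambient and intrinsic dimension is the kind of signal a pipeline can use to decide how aggressively to reduce each omics view.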
Affiliation(s)
- Jessica Gliozzo
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy; European Commission, Joint Research Centre (JRC), Ispra, Italy
- Mauricio Soto-Gomez
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy
- Valentina Guarino
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy
- Arturo Bonometti
  - Department of Biomedical Sciences, Humanitas University, Milan, Italy; Department of Pathology, IRCCS Humanitas Clinical and Research Hospital, Milan, Italy
- Alberto Cabri
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy
- Emanuele Cavalleri
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy
- Justin Reese
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Peter N Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Marco Mesiti
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Giorgio Valentini
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy; CINI, Infolife National Laboratory, Roma, Italy
- Elena Casiraghi
  - AnacletoLab, Computer Science Department, Università degli Studi di Milano, Milan, Italy; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; CINI, Infolife National Laboratory, Roma, Italy; Department of Computer Science, Aalto University, Espoo, Finland
6
Pakkir Shah AK, Walter A, Ottosson F, Russo F, Navarro-Diaz M, Boldt J, Kalinski JCJ, Kontou EE, Elofson J, Polyzois A, González-Marín C, Farrell S, Aggerbeck MR, Pruksatrakul T, Chan N, Wang Y, Pöchhacker M, Brungs C, Cámara B, Caraballo-Rodríguez AM, Cumsille A, de Oliveira F, Dührkop K, El Abiead Y, Geibel C, Graves LG, Hansen M, Heuckeroth S, Knoblauch S, Kostenko A, Kuijpers MCM, Mildau K, Papadopoulos Lambidis S, Portal Gomes PW, Schramm T, Steuer-Lodd K, Stincone P, Tayyab S, Vitale GA, Wagner BC, Xing S, Yazzie MT, Zuffa S, de Kruijff M, Beemelmanns C, Link H, Mayer C, van der Hooft JJJ, Damiani T, Pluskal T, Dorrestein P, Stanstrup J, Schmid R, Wang M, Aron A, Ernst M, Petras D. Statistical analysis of feature-based molecular networking results from non-targeted metabolomics data. Nat Protoc 2025; 20:92-162. [PMID: 39304763 DOI: 10.1038/s41596-024-01046-3] [Received: 10/31/2023] [Accepted: 07/02/2024] [Indexed: 09/22/2024]
Abstract
Feature-based molecular networking (FBMN) is a popular analysis approach for liquid chromatography-tandem mass spectrometry-based non-targeted metabolomics data. While processing liquid chromatography-tandem mass spectrometry data through FBMN is fairly streamlined, downstream data handling and statistical interrogation are often a key bottleneck. Users new to statistical analysis, in particular, struggle to effectively handle and analyze complex data matrices. Here we provide a comprehensive guide for the statistical analysis of FBMN results, focusing on the downstream analysis of the FBMN output table. We explain the data structure and principles of data cleanup and normalization, as well as uni- and multivariate statistical analysis of FBMN results. We provide explanations and code in two scripting languages (R and Python) as well as the QIIME2 framework for all protocol steps, from data cleanup to statistical analysis. All code is shared in the form of Jupyter Notebooks (https://github.com/Functional-Metabolomics-Lab/FBMN-STATS). Additionally, the protocol is accompanied by a web application with a graphical user interface (https://fbmn-statsguide.gnps2.org/) to lower the barrier of entry for new users and for educational purposes. Finally, we also show users how to integrate their statistical results into the molecular network using the Cytoscape visualization tool. Throughout the protocol, we use a previously published environmental metabolomics dataset for demonstration purposes. Together, the protocol, code and web application provide a complete guide and toolbox for FBMN data integration, cleanup and advanced statistical analysis, enabling new users to uncover molecular insights from their non-targeted metabolomics data. Our protocol is tailored for the seamless analysis of FBMN results from Global Natural Products Social Molecular Networking and can be easily adapted to other mass spectrometry feature detection, annotation and networking tools.
Affiliation(s)
- Abzer K Pakkir Shah
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Axel Walter
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, Germany
- Filip Ottosson
  - Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
- Francesco Russo
  - Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
- Marcelo Navarro-Diaz
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Judith Boldt
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - German Center for Infection Research, Partner Site Braunschweig-Hannover, Braunschweig, Germany
- Jarmo-Charles J Kalinski
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
- Eftychia Eva Kontou
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - The Novo Nordisk Foundation for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- James Elofson
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Alexandros Polyzois
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
- Carolina González-Marín
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Universidad EAFIT, Medellín, Antioquia, Colombia
- Shane Farrell
  - Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, USA
  - School of Marine Sciences, Darling Marine Center, University of Maine, Walpole, ME, USA
- Marie R Aggerbeck
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Environmental Science, Aarhus University, Roskilde, Denmark
- Thapanee Pruksatrakul
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Thailand Science Park, Pathum Thani, Thailand
- Nathan Chan
  - Department of Computer Science, University of California Riverside, Riverside, CA, USA
- Yunshu Wang
  - Department of Computer Science, University of California Riverside, Riverside, CA, USA
- Magdalena Pöchhacker
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Food Chemistry and Toxicology, University of Vienna, Vienna, Austria
- Corinna Brungs
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Beatriz Cámara
  - Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Centro de Biotecnología DAL, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Andres Cumsille
  - Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Centro de Biotecnología DAL, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Fernanda de Oliveira
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
  - Department of Biotechnology, Engineering School of Lorena, University of São Paulo, Lorena, São Paulo, Brazil
- Kai Dührkop
  - Department of Bioinformatics, University of Jena, Jena, Germany
- Yasin El Abiead
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Christian Geibel
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Lana G Graves
  - Department of Environmental Systems Analysis, University of Tübingen, Tübingen, Germany
  - Leibniz Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
- Martin Hansen
  - Department of Environmental Science, Aarhus University, Roskilde, Denmark
- Steffen Heuckeroth
  - Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany
- Simon Knoblauch
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Anastasiia Kostenko
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Mirte C M Kuijpers
  - Department of Ecology, Behavior and Evolution, University of California San Diego, San Diego, CA, USA
- Kevin Mildau
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Analytical Chemistry, University of Vienna, Vienna, Austria
  - Bioinformatics Group, Wageningen University and Research, Wageningen, the Netherlands
- Paulo Wender Portal Gomes
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Tilman Schramm
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Department of Biochemistry, University of California Riverside, Riverside, CA, USA
- Karoline Steuer-Lodd
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Department of Biochemistry, University of California Riverside, Riverside, CA, USA
- Paolo Stincone
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Sibgha Tayyab
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Giovanni Andrea Vitale
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Berenike C Wagner
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Shipei Xing
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Marquis T Yazzie
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Simone Zuffa
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Martinus de Kruijff
  - Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany
- Christine Beemelmanns
  - Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany
  - Saarland University, Saarbrücken, Germany
- Hannes Link
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Christoph Mayer
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Justin J J van der Hooft
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Bioinformatics Group, Wageningen University and Research, Wageningen, the Netherlands
  - Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
- Tito Damiani
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Tomáš Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Pieter Dorrestein
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Jan Stanstrup
  - Department of Nutrition, Exercise and Sports, University of Copenhagen, Frederiksberg C, Denmark
- Robin Schmid
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Mingxun Wang
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Computer Science, University of California Riverside, Riverside, CA, USA
- Allegra Aron
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Madeleine Ernst
  - Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
- Daniel Petras
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Department of Biochemistry, University of California Riverside, Riverside, CA, USA
7
Collí-Dulá RC, Papatheodorou I. Single-cell RNA sequencing offers opportunities to explore the depth of physiology, adaptation, and biochemistry in non-model organisms exposed to pollution. Comp Biochem Physiol Part D Genomics Proteomics 2024; 52:101339. [PMID: 39393164 DOI: 10.1016/j.cbd.2024.101339] [Received: 06/27/2024] [Revised: 09/28/2024] [Accepted: 10/02/2024] [Indexed: 10/13/2024]
Abstract
Single-cell Sequencing technology (scSeq) has revolutionized our understanding of individual cells, uncovering unprecedented heterogeneity within tissues and cell populations, principally through single-cell RNA Sequencing (scRNA-Seq). This short review highlights the pivotal role of scRNA-Seq in elucidating genotype-phenotype relationships, particularly in biological systems. Based on published articles, our analysis involved manual curation and automated Scopus tools to illustrate recent advances in the application of scRNA-Seq. The results reveal that scRNA-Seq has been extensively utilized in various biological areas, including biochemistry, genetics, molecular biology, immunology, and microbiology, followed by health sciences covering studies related to the nervous system, immune system, human health, development, and diseases, with a particular focus on cancer research. However, the potential of scRNA-Seq extends beyond disease research, offering insights into non-model organisms' responses to environmental contaminants. By enabling the study of cellular reactions at a molecular level, scRNA-Seq provides a comprehensive understanding of intracellular heterogeneity that enhances our comprehension of physiological, biochemical, and pathological environmental impacts on non-model organisms exposed to pollution. This understanding has many practical benefits, as it can aid in regulation and conservation efforts that benefit the environment and the use of economically essential and ecologically relevant organisms.
Affiliation(s)
- Reyna C Collí-Dulá
- Departamento de Recursos del Mar, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, 97310 Mérida, Yucatán, Mexico; Consejo Nacional de Humanidades Ciencia y Tecnología, Ciudad de México, Mexico.
| | - Irene Papatheodorou
- European Bioinformatics Institute (EMBL-EBI) European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Earlham Institute Norwich Research Park, Norwich NR4 7UZ, UK; Medical School, University of East Anglia, Norwich Research Park, Norwich, NR4 7UA, UK.
|
8
|
de Souza Rodrigues J, Basinger NT, Leon RG, Bacha AL, da Silva Santos RT, Eason KM, Shilling D, Grey TL. Growth Analysis of Glyphosate-Resistant and Susceptible Amaranthus palmeri Biotypes. Plant-Environment Interactions (Hoboken, N.J.) 2024; 5:e70023. [PMID: 39703194 PMCID: PMC11655308 DOI: 10.1002/pei3.70023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 12/04/2024] [Accepted: 12/06/2024] [Indexed: 12/21/2024]
Abstract
This study examined the growth parameters of both glyphosate-susceptible and glyphosate-resistant biotypes of Amaranthus palmeri, designated as GA2005 and GA2017, respectively. A two-year microplot field study was conducted to assess their growth characteristics. Scheduled destructive harvests on named harvest days (HD) were conducted to collect measurements for further calculation of net assimilation rate (NAR; g m⁻² day⁻¹), specific leaf area (SLA), leaf weight ratio (LWR), stem-to-leaf ratio (SLR), leaf area index (LAI), leaf area ratio (LAR; cm² g⁻¹), leaf area duration (LAD; days), relative growth rate (RGR; g g⁻¹ day⁻¹) and plant volume (m³). In addition, stem diameter, number of leaves, and chlorophyll content (μmol m⁻²) were determined. The main objective was to identify growth parameters that differentiate biotypes along the plant life cycle. While certain growth parameters showed no variation among biotypes, differences in leaf area index (LAI) over HD and chlorophyll content and leaf area duration (LAD) were observed as the main effects. Glyphosate-resistant biotypes exhibited higher LAD and chlorophyll content, potentially conferring a competitive advantage, especially in heavily used glyphosate environments. The study highlights the complexity of intraspecific genetic differentiation, adaptation, and environmental factors affecting A. palmeri. It may offer insights into biotype distinction and resistance spread while advancing our comprehension of species adaptation and growth strategies for enhanced control.
Affiliation(s)
| | | | - Ramon G. Leon
- Department of Crop and Soil Sciences, North Carolina State University, Raleigh, North Carolina, USA
| | - Allan L. Bacha
- Department of Biology Applied to Agriculture, Sao Paulo State University, Jaboticabal, Sao Paulo, Brazil
| | | | | | - Donn Shilling
- Department of Crop and Soil Sciences, University of Georgia, Athens, Georgia, USA
| | - Timothy L. Grey
- Department of Crop and Soil Sciences, University of Georgia, Tifton, Georgia, USA
|
9
|
Batty JA, Smith L, Hall M. Impact of multiple long term conditions on hospital admission and mortality during winter: importance of linked, population scale healthcare data. BMJ Medicine 2024; 3:e001114. [PMID: 39574423 PMCID: PMC11580320 DOI: 10.1136/bmjmed-2024-001114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Accepted: 10/21/2024] [Indexed: 11/24/2024]
Affiliation(s)
- Jonathan Adam Batty
- Leeds Institute of Cardiovascular and Metabolic Medicine, University of Leeds, Leeds, UK
- Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
| | - Lesley Smith
- Leeds Institute of Clinical Trials Research, University of Leeds, Leeds, UK
| | - Marlous Hall
- Leeds Institute of Cardiovascular and Metabolic Medicine, University of Leeds, Leeds, UK
- Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
|
10
|
Lopez-Moreno H, Phillips M, Diaz-Garcia L, Torres-Meraz M, Jarquin D, Loarca J, Ikeda S, Giongo L, Grygleski E, Iorizzo M, Zalapa J. Multiparametric Cranberry (Vaccinium macrocarpon Ait.) Fruit Textural Trait Development for Harvest and Postharvest Evaluation in Representative Cultivars. J Texture Stud 2024; 55:e12866. [PMID: 39261281 DOI: 10.1111/jtxs.12866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 07/25/2024] [Accepted: 08/12/2024] [Indexed: 09/13/2024]
Abstract
Fruit texture is a priority trait that guarantees the long-term economic sustainability of the cranberry industry through value-added products such as sweetened dried cranberries (SDCs). To develop a standard methodology to measure texture, we conducted a comparative analysis of 22 textural traits using five different methods under both harvest and postharvest conditions in 10 representative cranberry cultivars. A set of textural traits from the 10%-strain compression and puncture methods were identified that differentiate between cultivars primarily based on hardness/stiffness and elasticity properties. The complementary use of both methodologies allowed for a detailed evaluation by capturing the effect of key texture-determining factors such as structure, flesh, and skin. Furthermore, the high effectiveness of this approach in different conditions and its ability to capture high phenotypic variation in cultivars highlights its great potential for applicability in various areas of the value chain and research. Therefore, this study provides an informed reference for unifying future efforts to enhance cranberry fruit texture and quality.
Affiliation(s)
- Hector Lopez-Moreno
- Department of Plant and Agroecosystem Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Matthew Phillips
- Department of Plant and Agroecosystem Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Luis Diaz-Garcia
- Department of Viticulture and Enology, University of California Davis, Davis, California, USA
| | - Maria Torres-Meraz
- Department of Plant and Agroecosystem Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Diego Jarquin
- Agronomy Department, University of Florida, Gainesville, Florida, USA
| | - Jenyne Loarca
- Department of Plant and Agroecosystem Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Shinya Ikeda
- USDA-ARS, Vegetable Crops Research Unit, Department of Food Science, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Lara Giongo
- Fondazione Edmund Mach - Research and Innovation Centre - Berry Genetics and Breeding Unit, San Michele All'adige, Italy
| | | | - Massimo Iorizzo
- Department of Horticultural Science, Plants for Human Health Institute, North Carolina State University, Kannapolis, North Carolina, USA
| | - Juan Zalapa
- Department of Plant and Agroecosystem Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
- USDA-ARS, Vegetable Crops Research Unit, Madison, Wisconsin, USA
|
11
|
Qin Z, Ren H, Zhao P, Wang K, Liu H, Miao C, Du Y, Li J, Wu L, Chen Z. Current computational tools for protein lysine acylation site prediction. Brief Bioinform 2024; 25:bbae469. [PMID: 39316944 PMCID: PMC11421846 DOI: 10.1093/bib/bbae469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 08/20/2024] [Accepted: 09/07/2024] [Indexed: 09/26/2024] Open
Abstract
As a main subtype of post-translational modification (PTM), protein lysine acylations (PLAs) play crucial roles in regulating diverse functions of proteins. With recent advancements in proteomics technology, the identification of PTM is becoming a data-rich field. A large amount of experimentally verified data is urgently required to be translated into valuable biological insights. With computational approaches, PLA can be accurately detected across the whole proteome, even for organisms with small-scale datasets. Herein, a comprehensive summary of 166 in silico PLA prediction methods is presented, including a single type of PLA site and multiple types of PLA sites. This recapitulation covers important aspects that are critical for the development of a robust predictor, including data collection and preparation, sample selection, feature representation, classification algorithm design, model evaluation, and method availability. Notably, we discuss the application of protein language models and transfer learning to solve the small-sample learning issue. We also highlight the prediction methods developed for functionally relevant PLA sites and species/substrate/cell-type-specific PLA sites. In conclusion, this systematic review could potentially facilitate the development of novel PLA predictors and offer useful insights to researchers from various disciplines.
Affiliation(s)
- Zhaohui Qin
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Haoran Ren
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Pei Zhao
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
| | - Kaiyuan Wang
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Huixia Liu
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Chunbo Miao
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Yanxiu Du
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Junzhou Li
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Liuji Wu
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
| | - Zhen Chen
- Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
|
12
|
Bonnici V, Chicco D. Seven quick tips for gene-focused computational pangenomic analysis. BioData Min 2024; 17:28. [PMID: 39227987 PMCID: PMC11370085 DOI: 10.1186/s13040-024-00380-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 08/12/2024] [Indexed: 09/05/2024] Open
Abstract
Pangenomics is a relatively new scientific field which investigates the union of all the genomes of a clade. The word pan means everything in ancient Greek; the term pangenomics originally referred to genomes of bacteria and was later extended to human genomes as well. Modern bioinformatics offers several tools to analyze pangenomics data, paving the way to an emerging field that we can call computational pangenomics. The computational power currently available to the bioinformatics community has made computational pangenomic analyses easy to perform, but this greater accessibility also increases the chances of making mistakes and producing misleading or inflated results, especially among beginners. To handle this problem, we present here a few quick tips for efficient and correct computational pangenomic analyses with a focus on bacterial pangenomics, by describing common mistakes to avoid and established best practices to follow in this field. We believe our recommendations can help readers perform more robust and sound pangenomic analyses and generate more reliable results.
Affiliation(s)
- Vincenzo Bonnici
- Dipartimento di Scienze Matematiche Fisiche e Informatiche, Università di Parma, Parma, Italy.
| | - Davide Chicco
- Dipartimento di Informatica Sistemistica e Comunicazione, Università di Milano-Bicocca, Milan, Italy.
- Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada.
|
13
|
Park Y, Hauschild AC. The effect of data transformation on low-dimensional integration of single-cell RNA-seq. BMC Bioinformatics 2024; 25:171. [PMID: 38689234 PMCID: PMC11059821 DOI: 10.1186/s12859-024-05788-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 04/16/2024] [Indexed: 05/02/2024] Open
Abstract
BACKGROUND Recent developments in single-cell RNA sequencing have opened up a multitude of possibilities to study tissues at the level of cellular populations. However, the heterogeneity in single-cell sequencing data necessitates appropriate procedures to adjust for technological limitations and various sources of noise when integrating datasets from different studies. While many analysis procedures employ various preprocessing steps, they often overlook the importance of selecting and optimizing the employed data transformation methods. RESULTS This work investigates data transformation approaches used in single-cell clustering analysis tools and their effects on batch integration analysis. In particular, we compare 16 transformations and their impact on the low-dimensional representations, aiming to reduce the batch effect and integrate multiple single-cell sequencing data. Our results show that data transformations strongly influence the results of single-cell clustering on low-dimensional data space, such as those generated by UMAP or PCA. Moreover, these changes in low-dimensional space significantly affect trajectory analysis using multiple datasets, as well. However, the performance of the data transformations greatly varies across datasets, and the optimal method was different for each dataset. Additionally, we explored how data transformation impacts the analysis of deep feature encodings using deep neural network-based models, including autoencoder-based models and proto-typical networks. Data transformation also strongly affects the outcome of deep neural network models. CONCLUSIONS Our findings suggest that the batch effect and noise in integrative analysis are highly influenced by data transformation. Low-dimensional features can integrate different batches well when proper data transformation is applied. Furthermore, we found that the batch mixing score on low-dimensional space can guide the selection of the optimal data transformation. 
In conclusion, data preprocessing is one of the most crucial analysis steps and needs to be cautiously considered in the integrative analysis of multiple scRNA-seq datasets.
Affiliation(s)
- Youngjun Park
- Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
- International Max Planck Research Schools for Genome Science, Georg-August-Universität Göttingen, Göttingen, Germany
| | - Anne-Christin Hauschild
- Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany.
- Campus-Institute Data Science (CIDAS), Georg-August-Universität Göttingen, Göttingen, Germany.
|
14
|
Dixit S, Middelkoop TC, Choubey S. Governing principles of transcriptional logic out of equilibrium. Biophys J 2024; 123:1015-1029. [PMID: 38486450 PMCID: PMC11052701 DOI: 10.1016/j.bpj.2024.03.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 03/04/2024] [Accepted: 03/11/2024] [Indexed: 03/24/2024] Open
Abstract
To survive, adapt, and develop, cells respond to external and internal stimuli by tightly regulating transcription. Transcriptional regulation involves the combinatorial binding of a repertoire of transcription factors to DNA, which often results in switch-like binary outputs akin to Boolean logic gates. Recent experimental studies have demonstrated that in eukaryotes, transcription factor binding to DNA often involves energy expenditure, thereby driving the system out of equilibrium. The governing principles of transcriptional logic operations out of equilibrium remain unexplored. Here, we employ a simple two-input, single-locus model of transcription that can accommodate both equilibrium and nonequilibrium mechanisms. Using this model, we find that nonequilibrium regimes can give rise to all the logic operations accessible in equilibrium. Strikingly, energy expenditure alters the regulatory function of the two transcription factors in a mutually exclusive manner. This allows for the emergence of new logic operations that are inaccessible in equilibrium. Overall, our results show that energy expenditure can expand the range of cellular decision-making without the need for more complex promoter architectures.
Affiliation(s)
- Smruti Dixit
- The Institute of Mathematical Sciences, CIT Campus, Chennai, India.
| | - Teije C Middelkoop
- Laboratory of Developmental Mechanobiology, Division BIOCEV, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
| | - Sandeep Choubey
- The Institute of Mathematical Sciences, CIT Campus, Chennai, India; Homi Bhabha National Institute, Training School Complex, Mumbai, India.
|
15
|
Julkaew S, Wongsirichot T, Damkliang K, Sangthawan P. Improving accuracy of vascular access quality classification in hemodialysis patients using deep learning with K highest score feature selection. J Int Med Res 2024; 52:3000605241232519. [PMID: 38573764 PMCID: PMC10996358 DOI: 10.1177/03000605241232519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 01/26/2024] [Indexed: 04/05/2024] Open
Abstract
OBJECTIVE To develop and evaluate a novel feature selection technique, using photoplethysmography (PPG) sensors, for enhancing the performance of deep learning models in classifying vascular access quality in hemodialysis patients. METHODS This cross-sectional study involved creating a novel feature selection method based on SelectKBest principles, specifically designed to optimize deep learning models for PPG sensor data in hemodialysis patients. The method's effectiveness was assessed by comparing the performance of multiple deep learning models using the feature selection approach versus the complete feature set. The model with the highest accuracy was then trained and tested using a 70:30 split, with both the full dataset and the SelectKBest dataset. Performance results were compared using Student's paired t-test. RESULTS Data from 398 hemodialysis patients were included. The 1-dimensional convolutional neural network (CNN1D) displayed the highest accuracy among different models. Implementation of the SelectKBest-based feature selection technique resulted in a statistically significant improvement in the CNN1D model's performance, achieving an accuracy of 92.05% (with feature selection) versus 90.79% (with full feature set). CONCLUSION These findings suggest that the newly developed feature selection approach might aid in accurately predicting vascular access quality in hemodialysis patients. This advancement may contribute to the development of reliable diagnostic tools for identifying vascular complications, such as stenosis, potentially improving patient outcomes and their quality of life.
Affiliation(s)
- Sarayut Julkaew
- College of Digital Science, Prince of Songkla University, Hat Yai, Songkhla, Thailand
| | - Thakerng Wongsirichot
- Division of Computational Science, Faculty of Science, Prince of Songkla University, Hat Yai, Songkhla, Thailand
| | - Kasikrit Damkliang
- Division of Computational Science, Faculty of Science, Prince of Songkla University, Hat Yai, Songkhla, Thailand
| | - Pornpen Sangthawan
- Division of Nephrology, Department of Medicine, Faculty of Medicine, Prince of Songkla University, Hat Yai, Songkhla, Thailand
|
16
|
Razi A, Lo CC, Wang S, Leek JT, Hansen KD. Genotype prediction of 336,463 samples from public expression data. bioRxiv 2024:2023.10.21.562237. [PMID: 38559266 PMCID: PMC10979922 DOI: 10.1101/2023.10.21.562237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Tens of thousands of RNA-sequencing experiments comprising hundreds of thousands of individual samples have now been performed. These data represent a broad range of experimental conditions, sequencing technologies, and hypotheses under study. The Recount project has aggregated and uniformly processed hundreds of thousands of publicly available RNA-seq samples. Most of these samples only include RNA expression measurements; genotype data for these same samples would enable a wide range of analyses including variant prioritization, eQTL analysis, and studies of allele-specific expression. Here, we developed a statistical model based on the existing reference and alternative read counts from the RNA-seq experiments available through Recount3 to predict genotypes at autosomal biallelic loci in coding regions. We demonstrate the accuracy of our model using large-scale studies that measured both gene expression and genotype genome-wide. We show that our predictive model is highly accurate with 99.5% overall accuracy, 99.6% major allele accuracy, and 90.4% minor allele accuracy. Our model is robust to tissue and study effects, provided the coverage is high enough. We applied this model to genotype all the samples in Recount3 and provide the largest ready-to-use expression repository containing genotype information. We illustrate that the predicted genotype from RNA-seq data is sufficient to unravel the underlying population structure of samples in Recount3 using Principal Component Analysis.
Affiliation(s)
- Afrooz Razi
- Department of Genetic Medicine, Johns Hopkins University School of Medicine
| | - Christopher C. Lo
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
| | - Siruo Wang
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
| | - Jeffrey T. Leek
- Biostatistics Program, Division of Public Health Sciences, Fred Hutchinson Cancer Center
| | - Kasper D. Hansen
- Department of Genetic Medicine, Johns Hopkins University School of Medicine
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine
|
17
|
Delgado S, Somovilla P, Ferrer-Orta C, Martínez-González B, Vázquez-Monteagudo S, Muñoz-Flores J, Soria ME, García-Crespo C, de Ávila AI, Durán-Pastor A, Gadea I, López-Galíndez C, Moran F, Lorenzo-Redondo R, Verdaguer N, Perales C, Domingo E. Incipient functional SARS-CoV-2 diversification identified through neural network haplotype maps. Proc Natl Acad Sci U S A 2024; 121:e2317851121. [PMID: 38416684 PMCID: PMC10927536 DOI: 10.1073/pnas.2317851121] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/16/2023] [Accepted: 01/08/2024] [Indexed: 03/01/2024] Open
Abstract
Since its introduction in the human population, SARS-CoV-2 has evolved into multiple clades, but the events in its intrahost diversification are not well understood. Here, we compare three-dimensional (3D) self-organized neural haplotype maps (SOMs) of SARS-CoV-2 from thirty individual nasopharyngeal diagnostic samples obtained within a 19-day interval in Madrid (Spain), at the time of transition between clades 19 and 20. SOMs have been trained with the haplotype repertoire present in the mutant spectra of the nsp12- and spike (S)-coding regions. Each SOM consisted of a dominant neuron (displaying the maximum frequency), surrounded by a low-frequency neuron cloud. The sequence of the master (dominant) neuron was either identical to that of the reference Wuhan-Hu-1 genome or differed from it at one nucleotide position. Six different deviant haplotype sequences were identified among the master neurons. Some of the substitutions in the neural clouds affected critical sites of the nsp12-nsp8-nsp7 polymerase complex and resulted in altered kinetics of RNA synthesis in an in vitro primer extension assay. Thus, the analysis has identified mutations that are relevant to modification of viral RNA synthesis, present in the mutant clouds of SARS-CoV-2 quasispecies. These mutations most likely occurred during intrahost diversification in several COVID-19 patients, during an initial stage of the pandemic, and within a brief time period.
Affiliation(s)
- Soledad Delgado
- Departamento de Sistemas Informáticos, Escuela Técnica Superior de Ingeniería de Sistemas Informáticos, Universidad Politécnica de Madrid, Madrid 28031, Spain
| | - Pilar Somovilla
- Microbes in Health and Welfare Program, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- Departamento de Biología Molecular, Universidad Autónoma de Madrid, Madrid 28049, Spain
| | - Cristina Ferrer-Orta
- Structural and Molecular Biology Department, Institut de Biología Molecular de Barcelona, Consejo Superior de Investigaciones Científicas, Barcelona 08028, Spain
| | - Brenda Martínez-González
- Department of Molecular and Cell Biology, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid, Madrid 28040, Spain
| | - Sergi Vázquez-Monteagudo
- Structural and Molecular Biology Department, Institut de Biología Molecular de Barcelona, Consejo Superior de Investigaciones Científicas, Barcelona 08028, Spain
| | | | - María Eugenia Soria
- Microbes in Health and Welfare Program, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid, Madrid 28040, Spain
| | - Carlos García-Crespo
- Microbes in Health and Welfare Program, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
| | - Ana Isabel de Ávila
- Microbes in Health and Welfare Program, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
| | - Antoni Durán-Pastor
- Department of Molecular and Cell Biology, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
| | - Ignacio Gadea
- Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid, Madrid 28040, Spain
| | - Cecilio López-Galíndez
- Unidad de Virología Molecular, Laboratorio de Referencia e Investigación en retrovirus, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda 28222, Spain
| | - Federico Moran
- Departamento de Bioquímica y Biología Molecular, Universidad Complutense de Madrid, Madrid 28040, Spain
| | - Ramon Lorenzo-Redondo
- Department of Medicine, Division of Infectious Diseases, Northwestern University Feinberg School of Medicine, Center for Pathogen Genomics and Microbial Evolution, Northwestern University Havey Institute for Global Health, Chicago, IL 60611
| | - Nuria Verdaguer
- Structural and Molecular Biology Department, Institut de Biología Molecular de Barcelona, Consejo Superior de Investigaciones Científicas, Barcelona 08028, Spain
| | - Celia Perales
- Department of Molecular and Cell Biology, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid, Madrid 28040, Spain
| | - Esteban Domingo
- Microbes in Health and Welfare Program, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
|
18
|
Gregory W, Sarwar N, Kevrekidis G, Villar S, Dumitrascu B. MarkerMap: nonlinear marker selection for single-cell studies. NPJ Syst Biol Appl 2024; 10:17. [PMID: 38351188 PMCID: PMC10864304 DOI: 10.1038/s41540-024-00339-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 01/17/2024] [Indexed: 02/16/2024] Open
Abstract
Single-cell RNA-seq data allow the quantification of cell type differences across a growing set of biological contexts. However, pinpointing a small subset of genomic features explaining this variability can be ill-defined and computationally intractable. Here we introduce MarkerMap, a generative model for selecting minimal gene sets which are maximally informative of cell type origin and enable whole transcriptome reconstruction. MarkerMap provides a scalable framework for both supervised marker selection, aimed at identifying specific cell type populations, and unsupervised marker selection, aimed at gene expression imputation and reconstruction. We benchmark MarkerMap's competitive performance against previously published approaches on real single cell gene expression data sets. MarkerMap is available as a pip installable package, as a community resource aimed at developing explainable machine learning techniques for enhancing interpretability in single-cell studies.
Affiliation(s)
- Wilson Gregory
  - Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, 21218, USA
- Nabeel Sarwar
  - Center for Data Science, New York University, New York, NY, 10012, USA
- George Kevrekidis
  - Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, 21218, USA
- Soledad Villar
  - Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, 21218, USA
  - Mathematical Institute for Data Science, Johns Hopkins University, Baltimore, MD, 21218, USA
- Bianca Dumitrascu
  - Department of Statistics, Columbia University, New York, NY, 10027, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, 10027, USA
19
Qattous H, Azzeh M, Ibrahim R, Abed Al-Ghafer I, Al Sorkhy M, Alkhateeb A. PaCMAP-embedded convolutional neural network for multi-omics data integration. Heliyon 2024; 10:e23195. [PMID: 38163104 PMCID: PMC10756978 DOI: 10.1016/j.heliyon.2023.e23195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 01/03/2024] Open
Abstract
Aims Multi-omics data integration has emerged as a prominent avenue within the healthcare industry, with substantial potential for enhancing predictive models. This study is motivated by the need to advance prognostic methodologies in cancer diagnosis, an area where precision is pivotal for effective clinical decision-making. In this context, the study introduces a methodology that integrates copy number alteration (CNA), DNA methylation, and gene expression data. Methods The three omics data types were merged into a two-dimensional (2D) map using the PaCMAP dimensionality reduction technique. Using the RGB coloring scheme, a visual representation of the integration was produced from the values of the three omics for each sample. The colored 2D maps were then fed into a convolutional neural network (CNN) to predict the Gleason score. Results The proposed model, which combines multi-omics data integration with the CNN architecture, outperforms the state-of-the-art i-SOM-GSN model, with an accuracy of 98.89 and an AUC of 0.9996. Conclusion This study demonstrates the effectiveness of multi-omics data integration in predicting health outcomes. The proposed methodology, combining PaCMAP for dimensionality reduction, RGB coloring for visualization, and CNN for prediction, offers a comprehensive framework for integrating heterogeneous omics data and improving predictive accuracy. These findings contribute to the advancement of personalized medicine and have the potential to aid in clinical decision-making for prostate cancer patients.
Affiliation(s)
- Hazem Qattous
  - Software Engineering Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Mohammad Azzeh
  - Data Science Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Rahmeh Ibrahim
  - Computer Science Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Ibrahim Abed Al-Ghafer
  - Data Science Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Mohammad Al Sorkhy
  - Heritage College of Osteopathic Medicine, Ohio University, Cleveland, OH 44122, USA
- Abedalrhman Alkhateeb
  - Computer Science Department, Lakehead University, 955 Oliver Rd, Thunder Bay, ON P7B 5E1, Ontario, Canada
20
Fan YV, Čuček L, Si C, Jiang P, Vujanović A, Krajnc D, Lee CT. Uncovering environmental performance patterns of plastic packaging waste in high recovery rate countries: An example of EU-27. ENVIRONMENTAL RESEARCH 2024; 241:117581. [PMID: 37967705 DOI: 10.1016/j.envres.2023.117581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 10/30/2023] [Accepted: 11/01/2023] [Indexed: 11/17/2023]
Abstract
Plastic consumption and its end-of-life management carry a significant environmental footprint and are energy intensive. Waste-to-resources and prevention strategies have been promoted widely in Europe as countermeasures; however, their effectiveness remains uncertain. This study aims to uncover the environmental footprint patterns of the plastics value chain in the European Union Member States (EU-27) through exploratory data analysis with dimension reduction and grouping. Nine variables are assessed, ranging from socioeconomic and demographic to environmental impacts. Three clusters are formed according to the similarity of these nine characteristics, with environmental impacts identified as the primary variable determining the clusters. Most countries belong to Cluster 0, which comprises 17 countries in 2014 and 18 countries in 2019. This cluster has a relatively low global warming potential (GWP), with an average value of 2.64 t CO2eq/cap in 2014 and 4.01 t CO2eq/cap in 2019. Among all the assessed countries, Denmark showed a significant change when assessed against the traits of the EU-27, moving from Cluster 1 (high GWP) in 2014 to Cluster 0 (low GWP) in 2019. The analysis of plastic packaging waste statistics for 2019 (data released in 2022) shows that, despite an increase in the recovery rate within the EU-27, the GWP has not fallen, suggesting a rebound effect. The GWP tends to increase with the amount of plastic waste. In contrast, other environmental impacts, such as eutrophication, abiotic and acidification potential, are found to be mitigated effectively via recovery, offsetting the adverse effects of increased plastic waste generation. The five-year interval data analysis identified distinct clusters within a set of patterns, categorising them based on their similarities. The categorisation and managerial insights serve as a foundation for devising a focused mitigation strategy.
Affiliation(s)
- Yee Van Fan
  - Sustainable Process Integration Laboratory - SPIL, NETME Centre, Faculty of Mechanical Engineering, Brno University of Technology, Technická 2896/2, 616 69 Brno, Czech Republic
- Lidija Čuček
  - Faculty of Chemistry and Chemical Engineering, University of Maribor, Smetanova 17, Maribor, Slovenia
- Chunyan Si
  - Sustainable Process Integration Laboratory - SPIL, NETME Centre, Faculty of Mechanical Engineering, Brno University of Technology, Technická 2896/2, 616 69 Brno, Czech Republic
- Peng Jiang
  - Department of Industrial Engineering and Management, Business School, Sichuan University, Chengdu 610064, China
- Annamaria Vujanović
  - Faculty of Chemistry and Chemical Engineering, University of Maribor, Smetanova 17, Maribor, Slovenia
- Damjan Krajnc
  - Faculty of Chemistry and Chemical Engineering, University of Maribor, Smetanova 17, Maribor, Slovenia
- Chew Tin Lee
  - Faculty of Chemical and Energy Engineering, Universiti Teknologi Malaysia, 81310, Johor Bahru, Johor, Malaysia
21
Erfani M, Baalousha M, Goharian E. Unveiling elemental fingerprints: A comparative study of clustering methods for multi-element nanoparticle data. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 905:167176. [PMID: 37730026 DOI: 10.1016/j.scitotenv.2023.167176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 09/03/2023] [Accepted: 09/16/2023] [Indexed: 09/22/2023]
Abstract
Single particle-inductively coupled plasma-time of flight-mass spectrometry (SP-ICP-TOF-MS) generates large datasets of the multi-elemental composition of nanoparticles. However, extracting useful information from such datasets is challenging. Hierarchical clustering (HC) has been successfully applied to extract elemental fingerprints from multi-element nanoparticle data obtained by SP-ICP-TOF-MS. However, many other clustering approaches that could be applied to SP-ICP-TOF-MS data have not yet been evaluated. This study fills this knowledge gap by comparing the performance of three clustering approaches for analyzing SP-ICP-TOF-MS data: HC, spectral clustering, and t-distributed Stochastic Neighbor Embedding coupled with Density-Based Spatial Clustering of Applications with Noise (tSNE-DBSCAN). The performance of these clustering techniques was evaluated by comparing the size of the extracted clusters and the similarity of the elemental composition of nanoparticles within each cluster. Hierarchical clustering often failed to achieve an optimal clustering solution for SP-ICP-TOF-MS data because HC is sensitive to the presence of outliers. Spectral clustering and tSNE-DBSCAN extracted clusters that were not identified by HC. This is because spectral clustering, a method based on graph theory, reveals the global and local structure in the data. tSNE reduces and maps the data into a lower-dimensional space, enabling clustering algorithms such as DBSCAN to identify subclusters with subtle differences in their elemental composition. However, tSNE-DBSCAN can lead to unsatisfactory clustering solutions because tuning the perplexity hyperparameter of tSNE is a difficult and time-consuming task, and the relative distance between datapoints is not maintained. Although the three clustering approaches successfully extract useful information from SP-ICP-TOF-MS data, spectral clustering outperforms HC and tSNE-DBSCAN by generating clusters of a large number of nanoparticles with similar elemental compositions.
Affiliation(s)
- Mahdi Erfani
  - Department of Civil and Environmental Engineering, University of South Carolina, SC 29208, USA
- Mohammed Baalousha
  - Center for Environmental Nanoscience and Risk, Department of Environmental Health Sciences, Arnold School of Public Health, University of South Carolina, Columbia, SC, 29201, USA
- Erfan Goharian
  - Department of Civil and Environmental Engineering, University of South Carolina, SC 29208, USA
22
Lamperti L, Sanchez T, Si Moussi S, Mouillot D, Albouy C, Flück B, Bruno M, Valentini A, Pellissier L, Manel S. New deep learning-based methods for visualizing ecosystem properties using environmental DNA metabarcoding data. Mol Ecol Resour 2023; 23:1946-1958. [PMID: 37702270 DOI: 10.1111/1755-0998.13861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 07/29/2023] [Accepted: 08/14/2023] [Indexed: 09/14/2023]
Abstract
Environmental DNA (eDNA) metabarcoding provides an efficient approach for documenting biodiversity patterns in marine and terrestrial ecosystems. The complexity of these data prevents current methods from extracting and analyzing all the relevant ecological information they contain, and new methods may provide better dimensionality reduction and clustering. Here we present two new deep learning-based methods that combine different types of neural networks (NNs) to ordinate eDNA samples and visualize ecosystem properties in a two-dimensional space: the first is based on variational autoencoders and the second on deep metric learning. The strength of our new methods lies in the combination of two inputs: the number of sequences found for each molecular operational taxonomic unit (MOTU) detected and their corresponding nucleotide sequence. Using three different datasets, we show that our methods accurately represent several biodiversity indicators in a two-dimensional latent space: MOTU richness per sample, sequence α-diversity per sample, Jaccard's and sequence β-diversity between samples. We show that our nonlinear methods are better at extracting features from eDNA datasets while avoiding the major biases associated with eDNA. Our methods outperform traditional dimension reduction methods such as Principal Component Analysis, t-distributed Stochastic Neighbour Embedding, Nonmetric Multidimensional Scaling and Uniform Manifold Approximation and Projection for dimension reduction. Our results suggest that NNs provide a more efficient way of extracting structure from eDNA metabarcoding data, thereby improving their ecological interpretation and thus biodiversity monitoring.
Affiliation(s)
- Letizia Lamperti
  - CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Montpellier, France
  - Ecosystems and Landscape Evolution, Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland
  - Ecosystems and Landscape Evolution, Land Change Science Research Unit, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Switzerland
- Théophile Sanchez
  - Ecosystems and Landscape Evolution, Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland
  - Ecosystems and Landscape Evolution, Land Change Science Research Unit, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Switzerland
- Sara Si Moussi
  - Laboratoire d'Ecologie Alpine, Univ. Grenoble Alpes, Univ. Savoie MontBlanc, CNRS, Grenoble, France
- David Mouillot
  - MARBEC, Univ Montpellier, CNRS, IFREMER, IRD, Montpellier, France
  - Institut Universitaire de France, Paris, France
- Camille Albouy
  - Ecosystems and Landscape Evolution, Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland
  - Ecosystems and Landscape Evolution, Land Change Science Research Unit, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Switzerland
- Benjamin Flück
  - Ecosystems and Landscape Evolution, Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland
  - Ecosystems and Landscape Evolution, Land Change Science Research Unit, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Switzerland
- Morgane Bruno
  - CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Montpellier, France
- Loïc Pellissier
  - Ecosystems and Landscape Evolution, Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland
  - Ecosystems and Landscape Evolution, Land Change Science Research Unit, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Switzerland
- Stéphanie Manel
  - CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Montpellier, France
  - Institut Universitaire de France, Paris, France
23
Li C, Chan TF, Yang C, Lin Z. stVAE deconvolves cell-type composition in large-scale cellular resolution spatial transcriptomics. Bioinformatics 2023; 39:btad642. [PMID: 37862237 PMCID: PMC10612402 DOI: 10.1093/bioinformatics/btad642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 10/10/2023] [Accepted: 10/19/2023] [Indexed: 10/22/2023] Open
Abstract
MOTIVATION Recent rapid developments in spatial transcriptomic techniques at cellular resolution have attracted increasing attention. However, the unique characteristics of large-scale cellular resolution spatial transcriptomic datasets, such as the limited number of transcripts captured per spot and the vast number of spots, pose significant challenges to current cell-type deconvolution methods. RESULTS In this study, we introduce stVAE, a method based on the variational autoencoder framework to deconvolve the cell-type composition of cellular resolution spatial transcriptomic datasets. To assess the performance of stVAE, we apply it to five datasets across three different biological tissues. In the Stereo-seq and Slide-seqV2 datasets of the mouse brain, stVAE accurately reconstructs the laminar structure of the pyramidal cell layers in the cortex, which are mainly organized by the subtypes of telencephalon projecting excitatory neurons. In the Stereo-seq dataset of the E12.5 mouse embryo, stVAE resolves the complex spatial patterns of osteoblast subtypes, which are supported by their marker genes. In Stereo-seq and Pixel-seq datasets of the mouse olfactory bulb, stVAE accurately delineates the spatial distributions of known cell types. In summary, stVAE can accurately identify spatial patterns of cell types and their relative proportions across spots for cellular resolution spatial transcriptomic data. It is instrumental in understanding the heterogeneity of cell populations and their interactions within tissues. AVAILABILITY AND IMPLEMENTATION stVAE is available on GitHub (https://github.com/lichen2018/stVAE) and Figshare (https://figshare.com/articles/software/stVAE/23254538).
Affiliation(s)
- Chen Li
  - Department of Statistics, Chinese University of Hong Kong, Hong Kong 999077, China
- Ting-Fung Chan
  - School of Life Sciences, The Chinese University of Hong Kong, Hong Kong 999077, China
  - State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong 999077, China
- Can Yang
  - Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong 999077, China
  - Guangdong-Hong Kong-Macao Joint Laboratory for Data-Driven Fluid Mechanics and Engineering Applications, The Hong Kong University of Science and Technology, Hong Kong 999077, China
- Zhixiang Lin
  - Department of Statistics, Chinese University of Hong Kong, Hong Kong 999077, China
24
Wu YS, Taniar D, Adhinugraha K, Tsai LK, Pai TW. Detection of Amyotrophic Lateral Sclerosis (ALS) Comorbidity Trajectories Based on Principal Tree Model Analytics. Biomedicines 2023; 11:2629. [PMID: 37893003 PMCID: PMC10604752 DOI: 10.3390/biomedicines11102629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 09/11/2023] [Accepted: 09/22/2023] [Indexed: 10/29/2023] Open
Abstract
The multifaceted nature and swift progression of Amyotrophic Lateral Sclerosis (ALS) pose considerable challenges to our understanding of its evolution and interplay with comorbid conditions. This study seeks to elucidate the temporal dynamics of ALS progression and its interaction with associated diseases. We employed a principal tree-based model to decipher patterns within clinical data derived from a population-based database in Taiwan. Disease progression was portrayed as branched trajectories, each path representing a series of distinct stages. Each stage embodied the cumulative occurrence of co-existing diseases, depicted as nodes on the tree, with edges symbolizing potential transitions between linked nodes. Our model identified eight distinct ALS patient trajectories, unveiling unique patterns of disease associations at various stages of progression. These patterns may suggest underlying disease mechanisms or risk factors. This research re-conceptualizes ALS progression as a migration through diverse stages rather than as a sequence of isolated events. This new approach illuminates patterns of disease association across different progression phases. The insights obtained from this study have the potential to inform doctors in developing personalized treatment strategies, ultimately enhancing patient prognosis and quality of life.
Affiliation(s)
- Yang-Sheng Wu
  - Department of Computer Science and Information Engineering, National Taipei University of Technology, Taipei 106, Taiwan
- David Taniar
  - Department of Software Systems & Cybersecurity, Monash University, Melbourne, VIC 3800, Australia
- Kiki Adhinugraha
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, VIC 3086, Australia
- Li-Kai Tsai
  - Department of Neurology and Stroke Center, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei 100, Taiwan
- Tun-Wen Pai
  - Department of Computer Science and Information Engineering, National Taipei University of Technology, Taipei 106, Taiwan
25
Bolt H, Suffel A, Matthewman J, Sandmann F, Tomlinson L, Eggo R. Seasonality of acute kidney injury phenotypes in England: an unsupervised machine learning classification study of electronic health records. BMC Nephrol 2023; 24:234. [PMID: 37558976 PMCID: PMC10413486 DOI: 10.1186/s12882-023-03269-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 07/14/2023] [Indexed: 08/11/2023] Open
Abstract
BACKGROUND Acute Kidney Injury (AKI) is a multifactorial condition which presents a substantial burden to healthcare systems. There is limited evidence on whether it is seasonal. We sought to investigate the seasonality of AKI hospitalisations in England and use unsupervised machine learning to explore clustering of underlying comorbidities, to gain insights for future intervention. METHODS We used Hospital Episodes Statistics linked to the Clinical Practice Research Datalink to describe the overall weekly incidence of AKI admissions between 2015 and 2019 by demographic and admission characteristics. We carried out dimension reduction on 850 diagnosis codes using multiple correspondence analysis and applied k-means clustering to classify patients. We phenotyped each group based on its dominant characteristics and described the seasonality of AKI admissions by these different phenotypes. RESULTS Between 2015 and 2019, weekly AKI admissions peaked in winter, with additional summer peaks related to periods of extreme heat. Winter seasonality was more evident in those diagnosed with AKI on admission. From the cluster classification we describe six phenotypes of people admitted to hospital with AKI. Among these, seasonality of AKI admissions was observed among people who we described as having a multimorbid phenotype, established risk factor phenotype, and general AKI phenotype. CONCLUSION We demonstrate winter seasonality of AKI admissions in England, particularly among those with AKI diagnosed on admission, suggestive of community triggers. Differences in seasonality between phenotypes suggest some groups may be more likely to develop AKI as a result of these factors. This may be driven by underlying comorbidity profiles or reflect differences in uptake of seasonal interventions such as vaccines.
Affiliation(s)
- Hikaru Bolt
  - London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Anne Suffel
  - London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Julian Matthewman
  - London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Frank Sandmann
  - London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
  - European Centre for Disease Prevention and Control (ECDC), Stockholm, Sweden
- Laurie Tomlinson
  - London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Rosalind Eggo
  - London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
26
Moloney NM, Barylyuk K, Tromer E, Crook OM, Breckels LM, Lilley KS, Waller RF, MacGregor P. Mapping diversity in African trypanosomes using high resolution spatial proteomics. Nat Commun 2023; 14:4401. [PMID: 37479728 PMCID: PMC10361982 DOI: 10.1038/s41467-023-40125-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/29/2022] [Accepted: 07/06/2023] [Indexed: 07/23/2023] Open
Abstract
African trypanosomes are dixenous eukaryotic parasites that impose a significant human and veterinary disease burden on sub-Saharan Africa. Diversity between species and life-cycle stages is concomitant with distinct host and tissue tropisms within this group. Here, the spatial proteomes of two African trypanosome species, Trypanosoma brucei and Trypanosoma congolense, are mapped across two life-stages. The four resulting datasets provide evidence of expression of approximately 5500 proteins per cell-type. Over 2500 proteins per cell-type are classified to specific subcellular compartments, providing four comprehensive spatial proteomes. Comparative analysis reveals key routes of parasitic adaptation to different biological niches and provides insight into the molecular basis for diversity within and between these pathogen species.
Affiliation(s)
- Nicola M Moloney
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
- Eelco Tromer
  - Cell Biochemistry, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, 9747 AG, Groningen, Netherlands
- Oliver M Crook
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
  - Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
- Lisa M Breckels
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
- Kathryn S Lilley
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
- Ross F Waller
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
- Paula MacGregor
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
  - School of Biological Sciences, University of Bristol, Bristol, BS8 1TQ, UK
27
Rasmussen A, Dawkins BA, Li C, Pezant N, Levin AM, Rybicki BA, Iannuzzi MC, Montgomery CG. Multiple Correspondence Analysis and HLA-Associations of Organ Involvement in a Large Cohort of African-American and European-American Patients with Sarcoidosis. Lung 2023; 201:297-302. [PMID: 37322162 PMCID: PMC10284928 DOI: 10.1007/s00408-023-00626-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/02/2023] [Accepted: 06/02/2023] [Indexed: 06/17/2023]
Abstract
Sarcoidosis is a systemic granulomatous disease with predominant pulmonary involvement and vast heterogeneity of clinical manifestations and disease outcomes. African American (AA) patients suffer greater morbidity and mortality. Using Multiple Correspondence Analysis, we identified seven clusters of organ involvement in European American (EA; n = 385) patients which were similar to those previously described in a Pan-European (GenPhenReSa) and a Spanish cohort (SARCOGEAS). In contrast, AA patients (n = 987) had six less well-defined and overlapping clusters with little similarity to the clusters identified in the EA cohort evaluated at the same U.S. institutions. Association of cluster membership with two-digit HLA-DRB1 alleles demonstrated ancestry-specific patterns of association and replicated known HLA effects. These results further support the notion that genetically influenced immune risk profiles, which differ based on ancestry, play a role in phenotypic heterogeneity. Dissecting such risk profiles will move us closer to personalized medicine for this complex disease.
Affiliation(s)
- Astrid Rasmussen
  - Genes and Human Disease Program, Oklahoma Medical Research Foundation, 825 NE 13th, Research Tower, Suite 2202, Oklahoma City, OK, 73104, USA
- Bryan A Dawkins
  - Genes and Human Disease Program, Oklahoma Medical Research Foundation, 825 NE 13th, Research Tower, Suite 2202, Oklahoma City, OK, 73104, USA
- Chuang Li
  - Genes and Human Disease Program, Oklahoma Medical Research Foundation, 825 NE 13th, Research Tower, Suite 2202, Oklahoma City, OK, 73104, USA
- Nathan Pezant
  - Genes and Human Disease Program, Oklahoma Medical Research Foundation, 825 NE 13th, Research Tower, Suite 2202, Oklahoma City, OK, 73104, USA
- Albert M Levin
  - Department of Public Health Sciences, Henry Ford Health System, Detroit, MI, USA
- Benjamin A Rybicki
  - Department of Public Health Sciences, Henry Ford Health System, Detroit, MI, USA
- Michael C Iannuzzi
  - Department of Medical Education, City University of New York School of Medicine, New York, NY, USA
- Courtney G Montgomery
  - Genes and Human Disease Program, Oklahoma Medical Research Foundation, 825 NE 13th, Research Tower, Suite 2202, Oklahoma City, OK, 73104, USA
28
Banerjee J, Taroni JN, Allaway RJ, Prasad DV, Guinney J, Greene C. Machine learning in rare disease. Nat Methods 2023:10.1038/s41592-023-01886-z. [PMID: 37248386 DOI: 10.1038/s41592-023-01886-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 03/16/2021] [Accepted: 04/22/2023] [Indexed: 05/31/2023]
Abstract
High-throughput profiling methods (such as genomics or imaging) have accelerated basic research and made deep molecular characterization of patient samples routine. These approaches provide a rich portrait of genes, molecular pathways and cell types involved in disease phenotypes. Machine learning (ML) can be a useful tool for extracting disease-relevant patterns from high-dimensional datasets. However, depending upon the complexity of the biological question, machine learning often requires many samples to identify recurrent and biologically meaningful patterns. Rare diseases are inherently limited in clinical cases, leading to few samples to study. In this Perspective, we outline the challenges and emerging solutions for using ML for small sample sets, specifically in rare diseases. Advances in ML methods for rare diseases are likely to be informative for applications beyond rare diseases for which few samples exist with high-dimensional data. We propose that the method community prioritize the development of ML techniques for rare disease research.
Affiliation(s)
- Jaclyn N Taroni
  - Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, PA, USA
- Casey Greene
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA
29
Dagnino PC, Braboszcz C, Kroupi E, Splittgerber M, Brauer H, Dempfle A, Breitling-Ziegler C, Prehn-Kristensen A, Krauel K, Siniatchkin M, Moliadze V, Soria-Frisch A. Stratification of responses to tDCS intervention in a healthy pediatric population based on resting-state EEG profiles. Sci Rep 2023; 13:8438. [PMID: 37231030 DOI: 10.1038/s41598-023-34724-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/03/2023] [Accepted: 05/06/2023] [Indexed: 05/27/2023] Open
Abstract
Transcranial Direct Current Stimulation (tDCS) is a non-invasive neuromodulation technique with a wide variety of clinical and research applications. As increasingly acknowledged, its effectiveness is subject dependent, which may lead to time-consuming and cost-ineffective treatment development phases. We propose the combination of electroencephalography (EEG) and unsupervised learning for the stratification and prediction of individual responses to tDCS. A randomized, sham-controlled, double-blind crossover study was conducted within a clinical trial for the development of pediatric treatments based on tDCS. The tDCS stimulation (sham and active) was applied either in the left dorsolateral prefrontal cortex or in the right inferior frontal gyrus. Following the stimulation session, participants performed 3 cognitive tasks to assess the response to the intervention: the Flanker Task, N-Back Task and Continuous Performance Test (CPT). We used data from 56 healthy children and adolescents to implement an unsupervised clustering approach that stratifies participants based on their resting-state EEG spectral features before the tDCS intervention. We then applied a correlational analysis to characterize the clusters of EEG profiles in terms of participants' differences in the behavioral outcomes (accuracy and response time) of the cognitive tasks when performed after a tDCS-sham or a tDCS-active session. Better behavioral performance following the active tDCS session compared to the sham tDCS session is considered a positive intervention response, whilst the reverse is considered a negative one. Optimal results in terms of validity measures were obtained for 4 clusters. These results show that specific EEG-based digital phenotypes can be associated with particular responses. While one cluster presents neurotypical EEG activity, the remaining clusters present non-typical EEG characteristics, which seem to be associated with a positive response. Findings suggest that unsupervised machine learning can be successfully used to stratify and eventually predict responses of individuals to a tDCS treatment.
Affiliation(s)
- Claire Braboszcz
- Neuroscience BU, Starlab Barcelona SL, Av Tibidabo 47 bis, Barcelona, Spain
- Eleni Kroupi
- Neuroscience BU, Starlab Barcelona SL, Av Tibidabo 47 bis, Barcelona, Spain
- Maike Splittgerber
- Institute of Medical Psychology and Medical Sociology, University Medical Center Schleswig-Holstein, Kiel University, Kiel, Germany
- Hannah Brauer
- Department of Child and Adolescent Psychiatry, Center for Integrative Psychiatry Kiel, University Medical Center Schleswig-Holstein, Kiel, Germany
- Astrid Dempfle
- Institute of Medical Informatics and Statistics, University Hospital Schleswig Holstein, Kiel University, Kiel, Germany
- Carolin Breitling-Ziegler
- Department of Child and Adolescent Psychiatry and Psychotherapy, University of Magdeburg, Magdeburg, Germany
- Alexander Prehn-Kristensen
- Department of Child and Adolescent Psychiatry, Center for Integrative Psychiatry Kiel, University Medical Center Schleswig-Holstein, Kiel, Germany
- Kerstin Krauel
- Department of Child and Adolescent Psychiatry and Psychotherapy, University of Magdeburg, Magdeburg, Germany
- Michael Siniatchkin
- Clinic for Child and Adolescent Psychiatry and Psychotherapy, Protestant Hospital Bethel, University of Bielefeld, Campus Bielefeld Bethel, Bielefeld, Germany
- Vera Moliadze
- Institute of Medical Psychology and Medical Sociology, University Medical Center Schleswig-Holstein, Kiel University, Kiel, Germany
- Aureli Soria-Frisch
- Neuroscience BU, Starlab Barcelona SL, Av Tibidabo 47 bis, Barcelona, Spain.
|
30
|
Sujeeun LY, Goonoo N, Moutou KM, Baichoo S, Bhaw-Luximon A. Predictive modeling as a tool to assess polymer–polymer and polymer–drug interactions for tissue engineering applications. Macromol Res 2023. [DOI: 10.1007/s13233-023-00155-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023]
|
31
|
Will they take this offer? A machine learning price elasticity model for predicting upselling acceptance of premium airline seating. INFORMATION & MANAGEMENT 2023. [DOI: 10.1016/j.im.2023.103759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
|
32
|
Zhou J, You D, Bai J, Chen X, Wu Y, Wang Z, Tang Y, Zhao Y, Feng G. Machine Learning Methods in Real-World Studies of Cardiovascular Disease. CARDIOVASCULAR INNOVATIONS AND APPLICATIONS 2023. [DOI: 10.15212/cvia.2023.0011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023] Open
Abstract
Objective: Cardiovascular disease (CVD) is one of the leading causes of death worldwide, and answers are urgently needed regarding many aspects, particularly risk identification and prognosis prediction. Real-world studies with large numbers of observations provide an important basis for CVD research but are constrained by high dimensionality, and missing or unstructured data. Machine learning (ML) methods, including a variety of supervised and unsupervised algorithms, are useful for data governance, and are effective for high dimensional data analysis and imputation in real-world studies. This article reviews the theory, strengths and limitations, and applications of several commonly used ML methods in the CVD field, to provide a reference for further application. Methods: This article introduces the origin, purpose, theory, advantages and limitations, and applications of multiple commonly used ML algorithms, including hierarchical and k-means clustering, principal component analysis, random forest, support vector machine, and neural networks. An example uses a random forest on the Systolic Blood Pressure Intervention Trial (SPRINT) data to demonstrate the process and main results of ML application in CVD. Conclusion: ML methods are effective tools for producing real-world evidence to support clinical decisions and meet clinical needs. This review explains the principles of multiple ML methods in plain language, to provide a reference for further application. Future research is warranted to develop accurate ensemble learning methods for wide application in the medical field.
|
33
|
Abstract
The concept of a core microbiome has been broadly used to refer to the consistent presence of a set of taxa across multiple samples within a given habitat. The assignment of taxa to core microbiomes can be performed by several methods based on the abundance and occupancy (i.e., detection across samples) of individual taxa. These approaches have led to methodological inconsistencies, with direct implications for ecological interpretation. Here, we reviewed a set of methods most commonly used to infer core microbiomes in divergent systems. We applied these methods using large data sets and analyzed simulations to determine their accuracy in core microbiome assignments. Our results show that core taxa assignments vary significantly across methods and data set types, with occupancy-based methods most accurately defining true core membership. We also found the ability of these methods to accurately capture core assignments to be contingent on the distribution of taxon abundance and occupancy in the data set. Finally, we provide specific recommendations for further studies using core taxa assignments and discuss the need for unifying methodical approaches toward data processing to advance ecological synthesis. IMPORTANCE Different methods are commonly used to assign core microbiome membership, leading to methodological inconsistencies across studies. In this study, we review a set of the most commonly used core microbiome assignment methods and compare their core assignments using both simulated and empirical data. We report inconsistent classifications from commonly applied core microbiome assignment methods. Furthermore, we demonstrate the implication that variable core assignments may have on downstream ecological interpretations. Although we still lack a standardized approach to core taxa assignments, our study provides a direction to properly test core assignment methods and offers advances in model parameterization and method choice across distinct data types.
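As a minimal sketch of an occupancy-based assignment of the kind compared in the abstract above, the snippet below computes each taxon's occupancy (the fraction of samples in which it is detected) and labels a taxon as core when that fraction reaches a cutoff. The toy table, the `detection_limit` parameter, and the 0.8 cutoff are illustrative assumptions, not values taken from the study.

```python
def occupancy(table, detection_limit=0):
    """Fraction of samples in which each taxon exceeds the detection limit."""
    return {
        taxon: sum(1 for count in counts if count > detection_limit) / len(counts)
        for taxon, counts in table.items()
    }


def core_taxa(table, min_occupancy=0.8, detection_limit=0):
    """Taxa whose occupancy is at least min_occupancy (the cutoff is hypothetical)."""
    occ = occupancy(table, detection_limit)
    return sorted(taxon for taxon, frac in occ.items() if frac >= min_occupancy)


# Toy abundance table: taxon -> read counts across five samples (made-up data).
toy_table = {
    "taxon_A": [5, 3, 8, 1, 2],    # detected in every sample
    "taxon_B": [0, 0, 120, 0, 0],  # abundant, but present in one sample only
    "taxon_C": [1, 0, 2, 1, 3],    # detected in four of five samples
}

occ = occupancy(toy_table)
core = core_taxa(toy_table)
print(occ)
print(core)
```

In this toy table, `taxon_B` has the largest total abundance but the lowest occupancy, which illustrates how abundance-based and occupancy-based rules can disagree on core membership.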
|
34
|
Hrebik R, Kukal J. Concept of hidden classes in pattern classification. Artif Intell Rev 2023. [DOI: 10.1007/s10462-023-10430-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Our paper presents a novel approach to pattern classification. A general disadvantage of a traditional classifier is that its behaviour and optimal parameter settings differ too much between training on a given pattern set and the subsequent cross-validation. We describe the term critical sensitivity, which means the lowest reached sensitivity for an individual class. This approach ensures a uniform classification quality for individual class classification. Therefore, it prevents outlier classes with terrible results. We focus on the evaluation of critical sensitivity as a quality criterion. Our proposed classifier eliminates this disadvantage in many cases. Our aim is to show that easily formed hidden classes can significantly contribute to improving the quality of a classifier. Therefore, we decided that the proposed classifier will have a relatively simple structure. The proposed classifier structure consists of three layers. The first is linear, used for dimensionality reduction. The second layer serves for clustering and forms hidden classes. The third one is the output layer for optimal cluster unioning. For verification of the proposed system results, we use standard datasets. Cross-validation performed on standard datasets showed that our critical sensitivity-based classifier provides comparable sensitivity to reference classifiers.
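The critical sensitivity defined in this abstract (the lowest sensitivity reached for any individual class) can be illustrated with a short sketch. The function name and the toy labels below are assumptions made for illustration; this is not the authors' implementation of their classifier.

```python
from collections import Counter


def critical_sensitivity(y_true, y_pred):
    """Lowest per-class sensitivity (recall) over the classes present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return min(hits[c] / totals[c] for c in totals)


# Made-up labels for three classes.
y_true = ["a", "a", "a", "b", "b", "c", "c", "c", "c"]
y_pred = ["a", "a", "b", "b", "b", "c", "a", "c", "c"]

worst = critical_sensitivity(y_true, y_pred)
print(worst)
```

Here the per-class sensitivities are 2/3 for `a`, 1 for `b`, and 3/4 for `c`, so the criterion reports the value for class `a`, whereas an overall accuracy figure would hide which class performs worst.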
|
35
|
Toptygina A, Grebennikov D, Bocharov G. Prediction of Specific Antibody- and Cell-Mediated Responses Using Baseline Immune Status Parameters of Individuals Received Measles-Mumps-Rubella Vaccine. Viruses 2023; 15:v15020524. [PMID: 36851738 PMCID: PMC9960117 DOI: 10.3390/v15020524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 02/06/2023] [Accepted: 02/10/2023] [Indexed: 02/16/2023] Open
Abstract
A successful vaccination implies the induction of effective specific immune responses. We intend to find biomarkers among various immune cell subpopulations, cytokines and antibodies that could be used to predict the levels of specific antibody- and cell-mediated responses after measles-mumps-rubella vaccination. We measured 59 baseline immune status parameters (frequencies of 42 immune cell subsets, levels of 13 cytokines, immunoglobulins) before vaccination and 13 response variables (specific IgA and IgG, antigen-induced IFN-γ production, CD107a expression on CD8+ T lymphocytes, and cellular proliferation levels by CFSE dilution) 6 weeks after vaccination for 19 individuals. Statistically significant Spearman correlations between some baseline parameters and response variables were found for each response variable (p < 0.05). Because of the low number of observations relative to the number of baseline parameters and missing data for some observations, we used three feature selection strategies to select potential predictors of the post-vaccination responses among baseline variables: (a) screening of the variables based on correlation analysis; (b) supervised screening based on the information of changes of baseline variables at day 7; and (c) implicit feature selection using regularization-based sparse regression. We identified optimal multivariate linear regression models for predicting the effectiveness of vaccination against measles-mumps-rubella using the baseline immune status parameters. It turned out that the sufficient number of predictor variables ranges from one to five, depending on the response variable of interest.
Affiliation(s)
- Anna Toptygina
- Gabrichevsky Research Institute for Epidemiology and Microbiology, 125212 Moscow, Russia
- Correspondence: (A.T.); (D.G.); (G.B.)
- Dmitry Grebennikov
- Marchuk Institute of Numerical Mathematics, Russian Academy of Sciences, (INM RAS), 119333 Moscow, Russia
- Moscow Center for Fundamental and Applied Mathematics, INM RAS, 119333 Moscow, Russia
- World-Class Research Center “Digital Biodesign and Personalized Healthcare”, Sechenov First Moscow State Medical University, 119991 Moscow, Russia
- Correspondence: (A.T.); (D.G.); (G.B.)
- Gennady Bocharov
- Marchuk Institute of Numerical Mathematics, Russian Academy of Sciences, (INM RAS), 119333 Moscow, Russia
- Moscow Center for Fundamental and Applied Mathematics, INM RAS, 119333 Moscow, Russia
- Institute of Computer Science and Mathematical Modelling, Sechenov First Moscow State Medical University, 119991 Moscow, Russia
- Correspondence: (A.T.); (D.G.); (G.B.)
|
36
|
Bratchenko IA, Bratchenko LA. Comment on "Feasibility of Raman spectroscopy as a potential in vivo tool to screen for pre-diabetes and diabetes". JOURNAL OF BIOPHOTONICS 2023; 16:e202200272. [PMID: 36306108 DOI: 10.1002/jbio.202200272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 10/26/2022] [Indexed: 06/16/2023]
Abstract
This paper comments on recent findings about the application of Raman spectroscopy to in vivo noninvasive diabetes detection, published in the Journal of Biophotonics by E. Guevara et al. (J. Biophotonics 2022, 15, e202200055). The proposed results may not be entirely correct due to possible overestimation of classification models and the absence of additional information regarding the age of the tested volunteers.
Affiliation(s)
- Ivan A Bratchenko
- Laser and Biotechnical Systems Department, Samara National Research University, Samara, Russia
- Lyudmila A Bratchenko
- Laser and Biotechnical Systems Department, Samara National Research University, Samara, Russia
|
37
|
Hsu LL, Culhane AC. Correspondence analysis for dimension reduction, batch integration, and visualization of single-cell RNA-seq data. Sci Rep 2023; 13:1197. [PMID: 36681709 PMCID: PMC9867729 DOI: 10.1038/s41598-022-26434-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/02/2022] [Accepted: 12/14/2022] [Indexed: 01/22/2023] Open
Abstract
Effective dimension reduction is essential for single cell RNA-seq (scRNAseq) analysis. Principal component analysis (PCA) is widely used, but requires continuous, normally-distributed data; therefore, it is often coupled with log-transformation in scRNAseq applications, which can distort the data and obscure meaningful variation. We describe correspondence analysis (CA), a count-based alternative to PCA. CA is based on decomposition of a chi-squared residual matrix, avoiding distortive log-transformation. To address overdispersion and high sparsity in scRNAseq data, we propose five adaptations of CA, which are fast, scalable, and outperform standard CA and glmPCA, to compute cell embeddings with more performant or comparable clustering accuracy in 8 out of 9 datasets. In particular, we find that CA with Freeman-Tukey residuals performs especially well across diverse datasets. Other advantages of the CA framework include visualization of associations between genes and cell populations in a "CA biplot," and extension to multi-table analysis; we introduce corralm for integrative multi-table dimension reduction of scRNAseq data. We implement CA for scRNAseq data in corral, an R/Bioconductor package which interfaces directly with single cell classes in Bioconductor. Switching from PCA to CA is achieved through a simple pipeline substitution and improves dimension reduction of scRNAseq datasets.
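To make the "chi-squared residual matrix" mentioned above concrete, the sketch below computes standard Pearson residuals for a small count table, together with the total inertia (the chi-squared statistic divided by the grand total). It covers only the residual step on made-up counts: a full correspondence analysis would go on to take a singular value decomposition of this matrix, and the code is not taken from the corral package or its API.

```python
import math


def chi_squared_residuals(counts):
    """Pearson residuals (observed - expected) / sqrt(expected) for a count table.

    Assumes every row and every column has a nonzero total.
    """
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    residuals = []
    for i, row in enumerate(counts):
        out = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # independence model
            out.append((observed - expected) / math.sqrt(expected))
        residuals.append(out)
    return residuals


def total_inertia(counts):
    """Chi-squared statistic divided by the grand total of the table."""
    n = sum(sum(row) for row in counts)
    return sum(r * r for row in chi_squared_residuals(counts) for r in row) / n


# Made-up gene-by-cell counts; rows are genes, columns are cells.
toy_counts = [[10, 0, 2], [1, 8, 0], [0, 3, 9]]
# A table whose rows are proportional, so rows and columns are independent.
independent = [[2, 4], [3, 6]]

print(total_inertia(toy_counts))
print(chi_squared_residuals(independent))
```

For the independent table every residual is zero, so there is no structure left to decompose; the toy gene-by-cell table has positive inertia because each gene is concentrated in different cells. The residuals are computed directly on counts, with no log-transformation.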
Affiliation(s)
- Lauren L Hsu
- Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA
- Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Boston, MA, USA
- Aedín C Culhane
- Limerick Digital Cancer Research Centre, Health Research Institute, School of Medicine, University of Limerick, Limerick, Ireland.
|
38
|
Gurung RL, Burdon KP, McComish BJ. A Guide to Genome-Wide Association Study Design for Diabetic Retinopathy. Methods Mol Biol 2023; 2678:49-89. [PMID: 37326705 DOI: 10.1007/978-1-0716-3255-0_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Diabetic retinopathy (DR) is the most common microvascular complication related to diabetes. There is evidence that genetics play an important role in DR pathogenesis, but the complexity of the disease makes genetic studies a challenge. This chapter is a practical overview of the basic steps for genome-wide association studies with respect to DR and its associated traits. Also described are approaches that can be adopted in future DR studies. This is intended to serve as a guide for beginners and to provide a framework for further in-depth analysis.
Affiliation(s)
- Rajya L Gurung
- Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS, Australia.
- Kathryn P Burdon
- Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS, Australia.
- Bennet J McComish
- Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS, Australia
|
39
|
Abstract
Medical imaging is a great asset for modern medicine, since it allows physicians to spatially interrogate a disease site, resulting in precise intervention for diagnosis and treatment, and to observe particular aspects of patients' conditions that otherwise would not be noticeable. Computational analysis of medical images, moreover, can allow the discovery of disease patterns and correlations among cohorts of patients with the same disease, thus suggesting common causes or providing useful information for better therapies and cures. Machine learning and deep learning applied to medical images, in particular, have produced new, unprecedented results that can pave the way to advanced frontiers of medical discoveries. While computational analysis of medical images has become easier, however, the possibility to make mistakes or generate inflated or misleading results has become easier, too, hindering reproducibility and deployment. In this article, we provide ten quick tips to perform computational analysis of medical images avoiding common mistakes and pitfalls that we noticed in multiple studies in the past. We believe our ten guidelines, if put into practice, can help the computational-medical imaging community to perform better scientific research that eventually can have a positive impact on the lives of patients worldwide.
Affiliation(s)
- Davide Chicco
- Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Rakesh Shiradkar
- Department of Biomedical Engineering, Emory University, Atlanta, Georgia, United States of America
|
40
|
Single-Cell RNAseq Complexity Reduction. Methods Mol Biol 2022; 2584:217-230. [PMID: 36495452 DOI: 10.1007/978-1-0716-2756-3_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
An important step in single-cell RNAseq data analysis is the preparation of the single cell transcription data for cell sub-population partitioning. In this chapter, we describe how to perform complexity reduction for 3' end single-cell RNAseq transcriptomics data.
|
41
|
van der Klis M, Tellings J. Generating semantic maps through multidimensional scaling: linguistic applications and theory. CORPUS LINGUISTICS AND LINGUISTIC THEORY 2022; 18:627-665. [PMCID: PMC9536326 DOI: 10.1515/cllt-2021-0018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/26/2021] [Accepted: 12/06/2021] [Indexed: 12/13/2023]
Abstract
This paper reports on the state-of-the-art in application of multidimensional scaling (MDS) techniques to create semantic maps in linguistic research. MDS refers to a statistical technique that represents objects (lexical items, linguistic contexts, languages, etc.) as points in a space so that close similarity between the objects corresponds to close distances between the corresponding points in the representation. We focus on the use of MDS in combination with parallel corpus data as used in research on cross-linguistic variation. We first introduce the mathematical foundations of MDS and then give an exhaustive overview of past research that employs MDS techniques in combination with parallel corpus data. We propose a set of terminology to succinctly describe the key parameters of a particular MDS application. We then show that this computational methodology is theory-neutral, i.e. it can be employed to answer research questions in a variety of linguistic theoretical frameworks. Finally, we show how this leads to two lines of future developments for MDS research in linguistics.
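The idea that an MDS map should place similar objects close together can be made concrete with a goodness-of-fit measure. The sketch below computes Kruskal's stress-1 for a given configuration of points, treating the dissimilarities themselves as the target distances (the metric case). The dissimilarity matrix and the coordinates are made up for illustration, and the paper may rely on a different fitting criterion.

```python
import math
from itertools import combinations


def stress1(dissimilarities, coords):
    """Kruskal's stress-1 for a metric MDS configuration (0 means a perfect fit)."""
    squared_error = 0.0
    squared_dist = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = math.dist(coords[i], coords[j])  # distance between the mapped points
        squared_error += (dissimilarities[i][j] - d) ** 2
        squared_dist += d ** 2
    return math.sqrt(squared_error / squared_dist)


# Made-up dissimilarities between three items (the sides of a 3-4-5 triangle).
D = [
    [0, 3, 4],
    [3, 0, 5],
    [4, 5, 0],
]

perfect = stress1(D, [(0, 0), (3, 0), (0, 4)])  # 2-D map reproduces D exactly
flat = stress1(D, [(0,), (3,), (-4,)])          # 1-D map distorts one pair
print(perfect, flat)
```

The two-dimensional configuration reproduces every dissimilarity, so its stress is zero, while forcing the three items onto a line stretches one pair and yields a positive stress. Comparing stress across dimensionalities is one common way to choose how many dimensions a semantic map needs.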
Affiliation(s)
- Jos Tellings
- UiL OTS, Utrecht University, Utrecht, The Netherlands
|
42
|
Kong L, Zhang T, Zhou C, Gomez MA, Hu Y, Zhang S. The evaluation of playing styles integrating with contextual variables in professional soccer. Front Psychol 2022; 13:1002566. [PMID: 36211871 PMCID: PMC9539538 DOI: 10.3389/fpsyg.2022.1002566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 08/31/2022] [Indexed: 11/17/2022] Open
Abstract
Purpose: Playing styles play a key role in winning soccer matches, but the technical and physical styles of play in home and away matches, considering team quality, in the Chinese Soccer Super League (CSL) remain unclear. The aim of this study was to explore the technical and physical styles of play between home and away matches integrating with team quality in the CSL. Materials and methods: The study sample consists of 480 performance records from 240 matches during the 2019 competitive season in the CSL. These match events were collected using a semi-automatic computerized video tracking system, Amisco Pro®. A k-means cluster analysis was used to evaluate team quality, and principal component analysis (PCA) was then used to identify the playing styles between home and away matches according to team quality. Differences between home and away matches in terms of playing styles were analyzed using a linear mixed model. Results: Our study found that PC1 presented a positive correlation with physical-related variables such as HIRD, HIRE, HSRD, and HSRE while PC2 was positively associated with the passing-related variables such as Pass, FPass, PassAcc, and FPAcc. Therefore, PC1 typically represents intense-play styles while PC2 represents possession-play styles at home and away matches, respectively. In addition, strong teams preferred to utilize intensity play whereas medium and weak teams utilized possession play whether playing at home or away. Furthermore, the first five teams in the final overall ranking in the CSL presented a compensated technical-physical playing style whereas the last five teams showed inferior performance in terms of intensity and possession play. Conclusion: Intensity or possession play was associated with the final overall ranking in the CSL, and playing styles that combine these two factors could be more likely to win the competition. Our study provides a detailed explanation for the impact of playing styles on match performances whereby coaches can adjust and combine different playing styles for ultimate success.
Affiliation(s)
- Lingfeng Kong
- Department of Physical Education, Hohai University, Nanjing, China
- Tianbo Zhang
- Department of Automation, Tsinghua University, Beijing, China
- Changjing Zhou
- School of Physical Education and Sports Training, Shanghai University of Sport, Shanghai, China
- Miguel-Angel Gomez
- Faculty of Physical Activity and Sport Sciences (INEF), Universidad Politécnica de Madrid, Madrid, Spain
- Yue Hu
- Department of Political Science, Tsinghua University, Beijing, China
- Shaoliang Zhang
- Division of Sports Science and Physical Education, Research Centre for Athletic Performance and Data Science, Tsinghua University, Beijing, China
|
43
|
Zerfaß C, Lehmann R, Ueberschaar N, Sanchez-Arcos C, Totsche KU, Pohnert G. Groundwater metabolome responds to recharge in fractured sedimentary strata. WATER RESEARCH 2022; 223:118998. [PMID: 36030668 DOI: 10.1016/j.watres.2022.118998] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/23/2022] [Revised: 08/01/2022] [Accepted: 08/16/2022] [Indexed: 06/15/2023]
Abstract
Understanding the sources, structure and fate of dissolved organic matter (DOM) in groundwater is paramount for the protection and sustainable use of this vital resource. On its passage through the Critical Zone, DOM is subject to biogeochemical conversions. Therefore, it carries valuable cross-habitat information for monitoring and predicting the stability of groundwater ecosystem services and assessing these ecosystems' response to fluctuations caused by external impacts such as climatic extremes. Challenges arise from insufficient knowledge on groundwater metabolite composition and dynamics due to a lack of consistent analytical approaches for long-term monitoring. Our study establishes groundwater metabolomics to decipher the complex biogeochemical transport and conversion of DOM. We explore fractured sedimentary bedrock along a hillslope recharge area by a 5-year untargeted metabolomics monitoring of oxic perched and anoxic phreatic groundwater. A summer with extremely high temperatures and low precipitation was included in the monitoring. Water was accessed by a monitoring well-transect and regularly collected for liquid chromatography-mass spectrometry (LC-MS) investigation. Dimension reduction of the resulting complex data set by principal component analysis revealed that metabolome dissimilarities between distant wells coincide with transient cross-stratal flow indicated by groundwater levels. Time series of the groundwater metabolome data provide detailed insights into subsurface responses to recharge dynamics. We demonstrate that dissimilarity variability between groundwater bodies with contrasting aquifer properties coincides with recharge dynamics. This includes groundwater high- and lowstands as well as recharge and recession phases. Our monitoring approach allows groundwater ecosystems to be surveyed even under extreme conditions. Notably, the metabolome was highly variable, lacked seasonal patterns, and did not segregate by geographical location of sampling wells, thus ruling out vegetation or (agricultural) land use as a primary driving factor. Patterns that emerge from metabolomics monitoring give insight into subsurface ecosystem functioning and water quality evolution, essential for sustainable groundwater use and climate change-adapted management.
Affiliation(s)
- Christian Zerfaß
- Department of Bioorganic Analytics, Institute of Inorganic and Analytical Chemistry, Friedrich Schiller University, Jena, Germany
- Robert Lehmann
- Department of Hydrogeology, Institute of Geosciences, Friedrich Schiller University, Jena, Germany
- Nico Ueberschaar
- Mass Spectrometry Platform, Faculty for Chemistry and Earth Sciences, Friedrich Schiller University, Jena, Germany
- Carlos Sanchez-Arcos
- Department of Bioorganic Analytics, Institute of Inorganic and Analytical Chemistry, Friedrich Schiller University, Jena, Germany
- Kai Uwe Totsche
- Department of Hydrogeology, Institute of Geosciences, Friedrich Schiller University, Jena, Germany
- Georg Pohnert
- Department of Bioorganic Analytics, Institute of Inorganic and Analytical Chemistry, Friedrich Schiller University, Jena, Germany.
|
44
|
Shou G, Yuan H, Cha YH, Sweeney JA, Ding L. Age-related changes of whole-brain dynamics in spontaneous neuronal coactivations. Sci Rep 2022; 12:12140. [PMID: 35840643 PMCID: PMC9287374 DOI: 10.1038/s41598-022-16125-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/08/2022] [Accepted: 07/05/2022] [Indexed: 01/04/2023] Open
Abstract
Human brains experience whole-brain anatomic and functional changes throughout the lifespan. Age-related whole-brain network changes have been studied with functional magnetic resonance imaging (fMRI) to determine their low-frequency spatial and temporal characteristics. However, little is known about age-related changes in whole-brain fast dynamics at the scale of neuronal events. The present study investigated age-related whole-brain dynamics in resting-state electroencephalography (EEG) signals from 73 healthy participants from 6 to 65 years old via characterizing transient neuronal coactivations at a resolution of tens of milliseconds. These uncovered transient patterns suggest fluctuating brain states at different energy levels of global activations. Our results indicate that with increasing age, shorter lifetimes and more occurrences were observed in the brain states that show the global high activations and more consecutive visits to the global highest-activation brain state. There were also reduced transitional steps during consecutive visits to the global lowest-activation brain state. These age-related effects suggest reduced stability and increased fluctuations when visiting high-energy brain states, with a bias toward staying in low-energy brain states. These age-related whole-brain dynamics changes are further supported by changes observed in classic alpha and beta power, suggesting promising applications of this approach in examining the effect of normal healthy brain aging, brain development, and brain disease.
Collapse
Affiliation(s)
- Guofa Shou: Stephenson School of Biomedical Engineering, University of Oklahoma, Norman, USA
- Han Yuan: Stephenson School of Biomedical Engineering, University of Oklahoma, Norman, USA; Institute for Biomedical Engineering, Science, and Technology, University of Oklahoma, Norman, USA
- Yoon-Hee Cha: Department of Neurology, University of Minnesota, Minneapolis, MN, USA
- John A Sweeney: Department of Psychiatry, University of Cincinnati, Cincinnati, OH, USA
- Lei Ding: Stephenson School of Biomedical Engineering, University of Oklahoma, Norman, USA; Institute for Biomedical Engineering, Science, and Technology, University of Oklahoma, Norman, USA; University of Oklahoma, 173 Felgar St., Gallogly Hall, Room 101, Norman, OK, 73019, USA
|
45
|
Namba S, Nakamura K, Watanabe K. The spatio-temporal features of perceived-as-genuine and deliberate expressions. PLoS One 2022; 17:e0271047. [PMID: 35839208 PMCID: PMC9286247 DOI: 10.1371/journal.pone.0271047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/21/2021] [Accepted: 06/22/2022] [Indexed: 11/24/2022] Open
Abstract
Reading the genuineness of facial expressions is important for increasing the credibility of information conveyed by faces. However, it remains unclear which spatio-temporal characteristics of facial movements serve as critical cues to the perceived genuineness of facial expressions. This study focused on observable spatio-temporal differences between perceived-as-genuine and deliberate expressions of happiness and anger. In this experiment, 89 Japanese participants were asked to judge the perceived genuineness of faces in videos showing happiness or anger expressions. To identify diagnostic facial cues to the perceived genuineness of the facial expressions, we analyzed a total of 128 face videos using an automated facial action detection system; moment-to-moment activations in facial action units were annotated, and nonnegative matrix factorization extracted sparse and meaningful components from all action unit data. The results showed that genuineness judgments decreased when more spatial patterns were observed in facial expressions. As for the temporal features, the perceived-as-deliberate expressions of happiness generally had faster onsets to the peak than the perceived-as-genuine expressions of happiness. Moreover, opening the mouth contributed negatively to the perceived-as-genuine expressions, irrespective of the type of facial expression. These findings provide the first evidence for dynamic facial cues to the perceived genuineness of happiness and anger expressions.
Affiliation(s)
- Shushi Namba: Psychological Process Research Team, Guardian Robot Project, RIKEN, Kyoto, Japan
- Koyo Nakamura: Faculty of Psychology, Department of Cognition, Emotion, and Methods in Psychology, University of Vienna, Vienna, Austria; Japan Society for the Promotion of Science, Tokyo, Japan; Faculty of Science and Engineering, Waseda University, Tokyo, Japan
- Katsumi Watanabe: Faculty of Science and Engineering, Waseda University, Tokyo, Japan
|
46
|
Heino J, García Girón J, Hämäläinen H, Hellsten S, Ilmonen J, Karjalainen J, Mäkinen T, Nyholm K, Ropponen J, Takolander A, Tolonen KT. Assessing the conservation priority of freshwater lake sites based on taxonomic, functional and environmental uniqueness. Divers Distrib 2022. [DOI: 10.1111/ddi.13598] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/11/2023] Open
Affiliation(s)
- Jani Heino: Finnish Environment Institute, Freshwater Centre, Oulu, Finland
- Jorge García Girón: Finnish Environment Institute, Freshwater Centre, Oulu, Finland; Ecology Research Unit, University of León, León, Spain
- Heikki Hämäläinen: Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Seppo Hellsten: Finnish Environment Institute, Freshwater Centre, Oulu, Finland
- Juha Karjalainen: Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Kristiina Nyholm: Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Janne Ropponen: Finnish Environment Institute, Freshwater Centre, Jyväskylä, Finland
- Antti Takolander: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
- Kimmo T. Tolonen: Finnish Environment Institute, Freshwater Centre, Jyväskylä, Finland
|
47
|
Namba S, Sato W, Matsui H. Spatio-Temporal Properties of Amused, Embarrassed, and Pained Smiles. J Nonverbal Behav 2022. [DOI: 10.1007/s10919-022-00404-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
Smiles are universal but nuanced facial expressions that are most frequently used in face-to-face communications, typically indicating amusement but sometimes conveying negative emotions such as embarrassment and pain. Although previous studies have suggested that spatial and temporal properties could differ among these various types of smiles, no study has thoroughly analyzed these properties. This study aimed to clarify the spatiotemporal properties of smiles conveying amusement, embarrassment, and pain using a spontaneous facial behavior database. The results regarding spatial patterns revealed that pained smiles showed less eye constriction and more overall facial tension than amused smiles; no spatial differences were identified between embarrassed and amused smiles. Regarding temporal properties, embarrassed and pained smiles remained in a state of higher facial tension than amused smiles. Moreover, embarrassed smiles showed a more gradual change from tension states to the smile state than amused smiles, and pained smiles had lower probabilities of staying in or transitioning to the smile state compared to amused smiles. By comparing the spatiotemporal properties of these three smile types, this study revealed that the probability of transitioning between discrete states could help distinguish amused, embarrassed, and pained smiles.
|
48
|
Adolphe M, Sawayama M, Maurel D, Delmas A, Oudeyer PY, Sauzéon H. An Open-Source Cognitive Test Battery to Assess Human Attention and Memory. Front Psychol 2022; 13:880375. [PMID: 35756204 PMCID: PMC9231481 DOI: 10.3389/fpsyg.2022.880375] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/21/2022] [Accepted: 04/26/2022] [Indexed: 11/13/2022] Open
Abstract
Cognitive test batteries are widely used in diverse research fields, such as cognitive training, cognitive disorder assessment, and the study of brain mechanisms. Although they need to be flexible enough to suit different usage objectives, most test batteries are not available as open-source software and cannot be tuned in detail by researchers. The present study introduces an open-source cognitive test battery to assess attention and memory, built with the JavaScript library p5.js. Because dynamic attention is ubiquitous in our daily lives, it is crucial to have tools for its assessment and training. For that purpose, our test battery includes seven cognitive tasks (multiple-object tracking, enumeration, go/no-go, load-induced blindness, task-switching, working memory, and memorability) that are common in the cognitive science literature. Using the test battery, we conducted an online experiment to collect benchmark data. Results from sessions conducted on 2 separate days showed high cross-day reliability; specifically, task performance did not change substantially between days. In addition, our test battery captures diverse individual differences and can evaluate them based on cognitive factors extracted through latent factor analysis. Since we share our source code as open-source software, users can flexibly expand and manipulate the experimental conditions. Our test battery is also flexible in terms of the experimental environment, i.e., experiments can be run either online or in a laboratory.
Affiliation(s)
- Maxime Adolphe: Flowers Team, Inria, Bordeaux, France; Research and Development Team, Onepoint, Bordeaux, France; Department of Cognitive Sciences and Ergonomics, Université de Bordeaux, Bordeaux, France
- Denis Maurel: Research and Development Team, Onepoint, Bordeaux, France
- Pierre-Yves Oudeyer: Flowers Team, Inria, Bordeaux, France; Microsoft Research Montreal, Montreal, QC, Canada
- Hélène Sauzéon: Flowers Team, Inria, Bordeaux, France; ACTIVE Team, Université de Bordeaux, INSERM, BPH, U1219, Bordeaux, France
|
49
|
Restrepo-Montoya D, Hulse-Kemp AM, Scheffler JA, Haigler CH, Hinze LL, Love J, Percy RG, Jones DC, Frelichowski J. Leveraging National Germplasm Collections to Determine Significantly Associated Categorical Traits in Crops: Upland and Pima Cotton as a Case Study. Front Plant Sci 2022; 13:837038. [PMID: 35557715 PMCID: PMC9087864 DOI: 10.3389/fpls.2022.837038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Accepted: 03/21/2022] [Indexed: 06/15/2023]
Abstract
Observable qualitative traits are relatively stable across environments and are commonly used to evaluate crop genetic diversity. Recently, molecular markers have largely superseded describing phenotypes in diversity surveys. However, qualitative descriptors are useful in cataloging germplasm collections and for describing new germplasm in patents, publications, and/or the Plant Variety Protection (PVP) system. This research focused on the comparative analysis of standardized cotton traits as represented within the National Cotton Germplasm Collection (NCGC). The cotton traits are named by 'descriptors' that have non-numerical sub-categories (descriptor states) reflecting the details of how each trait manifests or is absent in the plant. We statistically assessed selected accessions from three major groups of Gossypium as defined by the NCGC curator: (1) "Stoneville accessions (SA)," containing mainly Upland cotton (Gossypium hirsutum) cultivars; (2) "Texas accessions (TEX)," containing mainly G. hirsutum landraces; and (3) Gossypium barbadense (Gb), containing cultivars or landraces of Pima cotton (Gossypium barbadense). For 33 cotton descriptors we: (a) revealed distributions of character states for each descriptor within each group; (b) analyzed bivariate associations between paired descriptors; and (c) clustered accessions based on their descriptors. The fewest significant associations between descriptors occurred in the SA dataset, likely reflecting extensive breeding for cultivar development. In contrast, the TEX and Gb datasets showed a higher number of significant associations between descriptors, likely correlating with less impact from breeding efforts. Three significant bivariate associations were identified for all three groups, bract nectaries:boll nectaries, leaf hair:stem hair, and lint color:seed fuzz color. Unsupervised clustering analysis recapitulated the species labels for about 97% of the accessions. 
Unexpected clustering results indicated accessions that may benefit from further investigation. In the future, curators can use the significant associations between standardized descriptors to determine whether new exotic or unusual accessions most closely resemble Upland or Pima cotton. In addition, the study shows how existing descriptors for large germplasm datasets can inform downstream goals in breeding and research, such as identifying rare individuals with specific trait combinations and targeting the breakdown of remaining trait associations through breeding, thus demonstrating the utility of the analytical methods employed in categorizing germplasm diversity within the collection.
Affiliation(s)
- Daniel Restrepo-Montoya: Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC, United States
- Amanda M. Hulse-Kemp: Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC, United States; Genomics and Bioinformatics Research Unit, United States Department of Agriculture - Agricultural Research Service (USDA-ARS), Raleigh, NC, United States
- Jodi A. Scheffler: Crop Genetics Research Unit, United States Department of Agriculture - Agricultural Research Service (USDA-ARS), Stoneville, MS, United States
- Candace H. Haigler: Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC, United States; Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, United States
- Lori L. Hinze: Crop Germplasm Research Unit, United States Department of Agriculture - Agricultural Research Service (USDA-ARS), College Station, TX, United States
- Janna Love: Crop Germplasm Research Unit, United States Department of Agriculture - Agricultural Research Service (USDA-ARS), College Station, TX, United States
- Richard G. Percy: Crop Germplasm Research Unit, United States Department of Agriculture - Agricultural Research Service (USDA-ARS), College Station, TX, United States
- James Frelichowski: Crop Germplasm Research Unit, United States Department of Agriculture - Agricultural Research Service (USDA-ARS), College Station, TX, United States
|
50
|
Leveraging Deep Learning Techniques and Integrated Omics Data for Tailored Treatment of Breast Cancer. J Pers Med 2022; 12:jpm12050674. [PMID: 35629097 PMCID: PMC9147748 DOI: 10.3390/jpm12050674] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/02/2022] [Revised: 03/06/2022] [Accepted: 04/14/2022] [Indexed: 12/12/2022] Open
Abstract
Multiomics data from cancer patients and cell lines, combined with deep learning techniques, have helped address predictive problems in cancer research and treatment. However, there is still room for improvement in the performance of existing models based on this combination. In this work, we propose two models that complement the treatment of breast cancer patients. First, we discuss our deep learning-based model for breast cancer subtype classification. Second, we propose DCNN-DR, a deep convolution neural network-drug response method for predicting the effectiveness of drugs on in vitro and in vivo breast cancer datasets. Finally, we applied DCNN-DR to predict effective drugs for the basal-like breast cancer subtype and validated the results against information available in the literature. The proposed models use late integration methods and show somewhat better predictive performance than existing methods. We use the Pearson correlation coefficient and accuracy as the performance measures for the regression and classification models, respectively.
|