1. Kleczynski M, Bergonzo C, Kearsley AJ. Spatial and Sequential Topological Analysis of Molecular Dynamics Simulations of IgG1 Fc Domains. J Chem Theory Comput 2025; 21:4884-4897. [PMID: 40261915 PMCID: PMC12079798 DOI: 10.1021/acs.jctc.5c00161] [Received: 01/28/2025] [Revised: 04/05/2025] [Accepted: 04/10/2025] [Indexed: 04/24/2025]
Abstract
Monoclonal antibodies are utilized in a wide range of biomedical applications. The NIST monoclonal antibody is a resource for developing analysis methods for monoclonal antibody based biopharmaceutical platforms. Techniques from topological data analysis quantify structural features such as loops and tunnels which are not easily measured by classical data analysis methods. In this paper, we introduce the Gaussian CROCKER column differences (GCCD) matrix, which augments standard topological data analysis summaries with biological sequence information. We use GCCD matrices to successfully differentiate between glycosylated and aglycosylated conformations from molecular dynamics simulations of the NIST monoclonal antibody Fc domain. We are optimistic that other researchers will be able to utilize GCCD matrices to quantify multiscale spatial and sequential features.
Affiliation(s)
- Melinda Kleczynski
  - National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Christina Bergonzo
  - National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
  - Institute for Bioscience and Biotechnology Research, Rockville, Maryland 20850, United States
- Anthony J. Kearsley
  - National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
2. Sykes JA, Nicuşan AL, Werner D, Herald MT, Weston D, Wheldon TK, Windows-Yule CRK. A topological approach to positron emission particle tracking for finding multiple particles in high noise environments. Sci Rep 2025; 15:13599. [PMID: 40253448 PMCID: PMC12009289 DOI: 10.1038/s41598-025-97175-0] [Received: 09/11/2024] [Accepted: 04/02/2025] [Indexed: 04/21/2025]
Abstract
Positron emission particle tracking (PEPT) is an advanced imaging technique that accurately tracks the three-dimensional spatial coordinates of a radioactively-labelled particle with sub-millimetre and sub-millisecond precision. By detecting back-to-back 511 keV gamma rays from positron-electron annihilation coincidence events, PEPT can locate particles within highly dense, opaque systems such as fluidised beds, rotating drums, and mills. Despite the progress made in enhancing the precision and accuracy of PEPT, simultaneous multiple particle tracking remains a significant challenge, particularly in high-noise environments. This paper introduces T-PEPT, a novel algorithm that leverages topological data analysis, a relatively new field of applied mathematics that explores the underlying 'shape' of data through techniques like persistent homology. By creating simplicial complexes and applying persistent homology to PEPT point data, T-PEPT demonstrates highly effective performance in multiple-particle tracking, especially in scenarios with high noise. When benchmarked against existing PEPT algorithms using a widely recognised standard framework, T-PEPT consistently maintains sub-millimetre spatial and sub-millisecond temporal precision in nearly all cases, demonstrating its robustness and accuracy. For T-PEPT data availability, see the GitHub repository: https://github.com/uob-positron-imaging-centre/pept
Affiliation(s)
- Jack A Sykes
  - School of Physics and Astronomy, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
  - School of Chemical Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Andrei L Nicuşan
  - School of Chemical Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Dominik Werner
  - School of Chemical Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Matthew T Herald
  - School of Chemical Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Daniel Weston
  - School of Chemical Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Tzany Kokalova Wheldon
  - School of Physics and Astronomy, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
3. Suay-García B, Climent J, Pérez-Gracia MT, Falcó A. A comprehensive update on the use of molecular topology applications for anti-infective drug discovery. Expert Opin Drug Discov 2025; 20:465-474. [PMID: 40056200 DOI: 10.1080/17460441.2025.2477625] [Received: 07/11/2024] [Revised: 02/17/2025] [Accepted: 03/06/2025] [Indexed: 03/10/2025]
Abstract
INTRODUCTION: The rapid emergence of infectious diseases poses a significant threat to global economies and public health. To combat this, it is crucial to develop effective treatments. One essential tool in drug design is molecular topology, which uses topological indices to build QSAR models. This mathematical framework describes chemical compound structures, facilitating easy characterization. AREAS COVERED: Classical ligand-based molecular topology has a series of limitations that can be overcome by shifting focus to structure-based approaches. Recent developments have emerged, focusing on target protein topology rather than drug molecules. Techniques like TDA, ESPH, LWPH, and molecular GDL are among the new methods being explored. This review is based on literature searches utilizing PubMed, Web of Science, and Google Scholar to identify articles published between 2000 and 2024. EXPERT OPINION: The authors believe that it is time to move away from traditional molecular topology and toward innovative approaches and technologies. Shifting focus from ligand-based to structure-based molecular topology, combined with new databases and algorithms, can aid in fighting drug-resistant microorganisms. This shift opens a broader chemical space for developing new anti-infective drugs, ultimately improving public health outcomes.
Affiliation(s)
- Beatriz Suay-García
  - Departamento de Matemáticas, Física y Ciencias Tecnológicas, Universidad Cardenal Herrera-CEU, CEU Universities, Valencia, Spain
- Joan Climent
  - Departamento de Producción y Sanidad Animal, Salud Pública Veterinaria y Ciencia y Tecnología de los Alimentos, Facultad de Veterinaria, Universidad CEU Cardenal Herrera, CEU Universities, Valencia, Spain
- María Teresa Pérez-Gracia
  - Área de Microbiología, Departamento de Farmacia, Instituto de Ciencias Biomédicas, Facultad de Ciencias de la Salud Universidad Cardenal Herrera-CEU, CEU Universities, Alfara del Patriarca, Valencia, Spain
- Antonio Falcó
  - Departamento de Matemáticas, Física y Ciencias Tecnológicas, Universidad Cardenal Herrera-CEU, CEU Universities, Valencia, Spain
4. Jafree DJ, Perera C, Ball M, Tolomeo D, Pomeranz G, Wilson L, Davis B, Mason WJ, Funk EM, Kolatsi-Joannou M, Polschi R, Malik S, Stewart BJ, Price KL, Mitchell H, Motallebzadeh R, Muto Y, Lees R, Needham S, Moulding D, Chandler JC, Nandanwar S, Walsh CL, Winyard PJD, Scambler PJ, Hägerling R, Clatworthy MR, Humphreys BD, Lythgoe MF, Walker-Samuel S, Woolf AS, Long DA. Microvascular aberrations found in human polycystic kidneys are an early feature in a Pkd1 mutant mouse model. Dis Model Mech 2025; 18:dmm052024. [PMID: 40114603 PMCID: PMC12067086 DOI: 10.1242/dmm.052024] [Received: 07/08/2024] [Accepted: 03/13/2025] [Indexed: 03/22/2025]
Abstract
Therapies targeting blood vessels hold promise for autosomal dominant polycystic kidney disease (ADPKD), the most common inherited disorder causing kidney failure. However, the onset and nature of kidney vascular abnormalities in ADPKD are poorly defined. Accordingly, we employed a combination of single-cell transcriptomics; three-dimensional imaging with geometric, topological and fractal analyses; and multimodal magnetic resonance imaging with arterial spin labelling to investigate aberrant microvasculature in ADPKD kidneys. Within human ADPKD kidneys with advanced cystic pathology and excretory failure, we identified a molecularly distinct blood microvascular subpopulation, characterised by impaired angiogenic signalling and metabolic dysfunction, differing from endothelial injury profiles observed in non-cystic human kidney diseases. Next, Pkd1 mutant mouse kidneys were examined postnatally, when cystic pathology is well established, but before excretory failure. An aberrant endothelial subpopulation was also detected, concurrent with reduced cortical blood perfusion. Disorganised kidney cortical microvasculature was also present in Pkd1 mutant mouse fetal kidneys when tubular dilation begins. Thus, aberrant features of cystic kidney vasculature are harmonised between human and mouse ADPKD, supporting early targeting of the vasculature as a strategy to ameliorate ADPKD progression.
Affiliation(s)
- Daniyal J. Jafree
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
  - Specialised Foundation Programme in Research, NHS East of England, Cambridge CB21 5XB, UK
- Charith Perera
  - UCL Centre for Advanced Biomedical Imaging, University College London, London WC1E 6DD, UK
- Mary Ball
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
- Daniele Tolomeo
  - UCL Centre for Advanced Biomedical Imaging, University College London, London WC1E 6DD, UK
- Gideon Pomeranz
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Laura Wilson
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Benjamin Davis
  - Central Laser Facility, Science and Technologies Facilities Council, UK Research and Innovation, Didcot OX11 0QX, UK
- William J. Mason
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Eva Maria Funk
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - Lymphovascular Medicine and Translational 3D-Histopathology Research Group, Charité Universitätsmedizin Berlin, Berlin 10117, Germany
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, BIH Center for Regenerative Therapies, Berlin 10117, Germany
- Maria Kolatsi-Joannou
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Radu Polschi
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
- Saif Malik
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Benjamin J. Stewart
  - Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge CB2 1TN, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Karen L. Price
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Hannah Mitchell
  - Mathematical Sciences Research Centre, Queen's University Belfast, Belfast BT7 1NN, UK
- Reza Motallebzadeh
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
  - Research Department of Surgical Biotechnology, Division of Surgery and Interventional Science, University College London, London NW3 2PF, UK
  - UCL Institute of Immunity and Transplantation, University College London, London NW3 2PF, UK
- Yoshiharu Muto
  - Division of Nephrology, Department of Medicine, Washington University in St Louis, St Louis, MO 63110, USA
- Robert Lees
  - Central Laser Facility, Science and Technologies Facilities Council, UK Research and Innovation, Didcot OX11 0QX, UK
- Sarah Needham
  - Central Laser Facility, Science and Technologies Facilities Council, UK Research and Innovation, Didcot OX11 0QX, UK
- Dale Moulding
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
- Jennie C. Chandler
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Sonal Nandanwar
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
  - Department of Mechanical Engineering, University College London, London WC1E 7JE, UK
- Claire L. Walsh
  - UCL Centre for Advanced Biomedical Imaging, University College London, London WC1E 6DD, UK
  - Department of Mechanical Engineering, University College London, London WC1E 7JE, UK
- Paul J. D. Winyard
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
- Peter J. Scambler
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
- René Hägerling
  - Lymphovascular Medicine and Translational 3D-Histopathology Research Group, Charité Universitätsmedizin Berlin, Berlin 10117, Germany
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, BIH Center for Regenerative Therapies, Berlin 10117, Germany
- Menna R. Clatworthy
  - Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge CB2 1TN, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Benjamin D. Humphreys
  - Division of Nephrology, Department of Medicine, Washington University in St Louis, St Louis, MO 63110, USA
- Mark F. Lythgoe
  - UCL Centre for Advanced Biomedical Imaging, University College London, London WC1E 6DD, UK
- Simon Walker-Samuel
  - UCL Centre for Advanced Biomedical Imaging, University College London, London WC1E 6DD, UK
- Adrian S. Woolf
  - School of Biological Sciences, Faculty of Biology Medicine and Health, University of Manchester, Manchester M13 9PT, UK
- David A. Long
  - Developmental Biology and Cancer Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London WC1N 1EH, UK
  - UCL Centre for Kidney and Bladder Health, University College London, London WC1E 6BT, UK
5. Tao Y, Ge S. A distribution-guided Mapper algorithm. BMC Bioinformatics 2025; 26:73. [PMID: 40045218 PMCID: PMC11881416 DOI: 10.1186/s12859-025-06085-5] [Received: 02/08/2024] [Accepted: 02/14/2025] [Indexed: 03/09/2025]
Abstract
BACKGROUND: The Mapper algorithm is an essential tool for exploring the shape of data in topological data analysis. With a dataset as an input, the Mapper algorithm outputs a graph representing the topological features of the whole dataset. This graph is often regarded as an approximation of a Reeb graph of the dataset. The classic Mapper algorithm uses fixed interval lengths and overlapping ratios, which might fail to reveal subtle features of a dataset, especially when the underlying structure is complex. RESULTS: In this work, we introduce a distribution-guided Mapper algorithm named D-Mapper, which utilizes the property of the probability model and data intrinsic characteristics to generate density-guided covers and provide enhanced topological features. Moreover, we introduce a metric accounting for both the quality of overlap clustering and extended persistent homology to measure the performance of Mapper-type algorithms. Our numerical experiments indicate that the D-Mapper outperforms the classic Mapper algorithm in various scenarios. We also apply the D-Mapper to a SARS-CoV-2 coronavirus RNA sequence dataset to explore the topological structure of different virus variants. The results indicate that the D-Mapper algorithm can reveal both the vertical and horizontal evolutionary processes of the viruses. Our code is available at https://github.com/ShufeiGe/D-Mapper . CONCLUSION: The D-Mapper algorithm can generate covers from data based on a probability model. This work demonstrates the power of fusing probabilistic models with Mapper algorithms.
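As a rough illustration of the "fixed interval lengths and overlapping ratios" that the abstract attributes to the classic Mapper algorithm, the following Python sketch builds a uniform overlapping cover of a one-dimensional filter range and assigns data points to its intervals. The function names `uniform_cover` and `pullback` are hypothetical and are not taken from the D-Mapper repository. The clustering and graph-building steps of Mapper, and D-Mapper's density-guided covers, are not shown.

```python
def uniform_cover(lo, hi, n_intervals, overlap):
    """Return n_intervals equal-length intervals covering [lo, hi].

    `overlap` is the fraction (0 <= overlap < 1) of each interval's length
    shared with its neighbour. Hypothetical helper for illustration only.
    """
    if n_intervals < 1 or not 0 <= overlap < 1:
        raise ValueError("need n_intervals >= 1 and 0 <= overlap < 1")
    # n intervals of length L, each shifted by L * (1 - overlap), span hi - lo.
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def pullback(filter_values, cover):
    """For each interval, list the indices of points whose filter value lies in it."""
    return [
        [i for i, v in enumerate(filter_values) if a <= v <= b]
        for a, b in cover
    ]


cover = uniform_cover(0.0, 10.0, n_intervals=4, overlap=0.5)
bins = pullback([1.0, 3.0, 9.0], cover)
```

Because every interval has the same length regardless of how the data are distributed, sparse regions can produce empty bins (as the third interval does here). This is the kind of limitation a data-driven cover is meant to address.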
Affiliation(s)
- Yuyang Tao
  - Institute of Mathematical Sciences, ShanghaiTech University, 393 Middle Huaxia Road, 201210, Shanghai, China
- Shufei Ge
  - Institute of Mathematical Sciences, ShanghaiTech University, 393 Middle Huaxia Road, 201210, Shanghai, China
6. Cuerno M, Guijarro L, Valdés RMA, Comendador FG. Topological data analysis in air traffic management: The shape of big flight data sets. PLoS One 2025; 20:e0318108. [PMID: 40014633 PMCID: PMC11867395 DOI: 10.1371/journal.pone.0318108] [Received: 05/29/2024] [Accepted: 12/18/2024] [Indexed: 03/01/2025]
Abstract
Analyzing flight trajectory data sets poses challenges due to the intricate interconnections among various factors and the high dimensionality of the data. Topological Data Analysis (TDA) is a way of analyzing big data sets that focuses on the topological features these data sets have as point clouds in some metric space. Techniques such as those that TDA provides are suitable for dealing with high dimensionality and intricate interconnections. This paper introduces TDA and its tools and methods as a way to derive meaningful insights from air traffic management (ATM) data. Our focus is on employing TDA to extract valuable information related to airports. Specifically, by utilizing persistence landscapes (a potent TDA tool) we generate footprints for each airport. These footprints, obtained by averaging over a specific time period, are based on the deviation of trajectories and delays. We apply this method to the set of Spanish airports in the Summer Season of 2018. Remarkably, our results align with the established Spanish airport classification and raise intriguing questions for further exploration. This analysis serves as a proof of concept, showcasing the potential application of TDA in the ATM field. While previous works have outlined the general applicability of TDA in aviation, this paper marks the first comprehensive application of TDA to a substantial volume of ATM data. Finally, we present conclusions and guidelines to address future challenges in the ATM domain.
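The persistence landscapes mentioned in the abstract can be averaged pointwise, which is what makes a time-averaged "footprint" possible. The Python sketch below uses the standard definition of the k-th landscape function (the k-th largest "tent" value over all birth-death pairs) and a pointwise mean over several diagrams. The function names `landscape` and `mean_landscape` and the toy diagrams are assumptions for illustration and do not reproduce the paper's pipeline or data.

```python
def landscape(diagram, k, t):
    """Value at t of the k-th persistence landscape (k = 1 is the outermost).

    `diagram` is a list of (birth, death) pairs. Each pair contributes a tent
    function max(0, min(t - birth, death - t)); the k-th landscape is the
    k-th largest of these values.
    """
    tents = sorted(
        (max(0.0, min(t - b, d - t)) for b, d in diagram), reverse=True
    )
    return tents[k - 1] if k <= len(tents) else 0.0


def mean_landscape(diagrams, k, grid):
    """Pointwise average of the k-th landscape over several diagrams."""
    return [
        sum(landscape(dg, k, t) for dg in diagrams) / len(diagrams)
        for t in grid
    ]


# Toy example: two hypothetical daily diagrams averaged on a two-point grid.
daily = [[(0.0, 4.0)], [(0.0, 2.0)]]
footprint = mean_landscape(daily, k=1, grid=[1.0, 2.0])
```

Unlike persistence diagrams themselves, landscapes are ordinary functions, so the mean is well defined and can be compared across airports or time windows.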
Affiliation(s)
- Manuel Cuerno
  - Department of Mathematics, CUNEF University, Madrid, Spain
  - Department of Mathematics, Universidad Autónoma de Madrid and ICMAT CSIC-UAM-UCM-UC3M, Madrid, Spain
- Luis Guijarro
  - Department of Mathematics, Universidad Autónoma de Madrid and ICMAT CSIC-UAM-UCM-UC3M, Madrid, Spain
- Rosa María Arnaldo Valdés
  - Department of Aerospace Systems, Air Transportation and Airports, E.T.S.I. Aeronáutica y del Espacio, Universidad Politécnica de Madrid, Madrid, Spain
- Fernando Gómez Comendador
  - Department of Aerospace Systems, Air Transportation and Airports, E.T.S.I. Aeronáutica y del Espacio, Universidad Politécnica de Madrid, Madrid, Spain
7. Huang K, Lidbury BA, Thomas N, Gooley PR, Armstrong CW. Machine learning and multi-omics in precision medicine for ME/CFS. J Transl Med 2025; 23:68. [PMID: 39810236 PMCID: PMC11731168 DOI: 10.1186/s12967-024-05915-z] [Received: 07/11/2024] [Accepted: 11/25/2024] [Indexed: 01/16/2025]
Abstract
Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a complex and multifaceted disorder that defies simplistic characterisation. Traditional approaches to diagnosing and treating ME/CFS have often fallen short due to the condition's heterogeneity and the lack of validated biomarkers. The growing field of precision medicine offers a promising approach that focuses on the genetic and molecular underpinnings of individual patients. In this review, we explore how machine learning and multi-omics (genomics, transcriptomics, proteomics, and metabolomics) can transform precision medicine in ME/CFS research and healthcare. We provide an overview of machine learning concepts for analysing large-scale biological data and highlight key advancements in multi-omics biomarker discovery, data quality and integration strategies, while reflecting on ME/CFS case study examples. We also highlight several priorities, including the critical need for applying robust computational tools and collaborative data-sharing initiatives in the endeavour to unravel the biological intricacies of ME/CFS.
Affiliation(s)
- Katherine Huang
  - Department of Biochemistry and Pharmacology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3052, Australia
- Brett A Lidbury
  - The National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Canberra, ACT, 2601, Australia
- Natalie Thomas
  - Department of Biochemistry and Pharmacology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3052, Australia
- Paul R Gooley
  - Department of Biochemistry and Pharmacology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3052, Australia
- Christopher W Armstrong
  - Department of Biochemistry and Pharmacology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, 3052, Australia
8. Valerio J, Vasconcelos-Filho JE, Stosic B, de Oliveira WR, Santana FM, Antonino ACD, Duarte-Neto PJ. Topological analysis of the three-dimensional radiodensity distribution of fish otoliths: Point sampling effects on dimensionality reduction. Micron 2025; 188:103731. [PMID: 39471532 DOI: 10.1016/j.micron.2024.103731] [Received: 07/05/2024] [Revised: 10/09/2024] [Accepted: 10/18/2024] [Indexed: 11/01/2024]
Abstract
Otoliths are calcified structures found in the inner ears of teleost fish, pivotal in marine biology for studies on metabolism, age, growth, and the identification of fish stocks, potentially leading to sustainable management practices. An important feature of this structure is its density, as it corresponds to modifications in the crystalline form of calcium carbonate during the fish's lifetime, resulting in variations in its final shape. The internal and external 3D radiodensity of otoliths from different species was obtained utilizing micro-computed tomography; however, an appropriate methodology for describing and conducting comparative studies on these data appears to be absent from the current literature. Therefore, we study otolith density variations from 3D computed tomography images, employing the Ball Mapper technique of Topological Data Analysis. We focus on reducing the computational cost of this analysis by applying probabilistic sampling and assessing its effects on the density variations provided by the Ball Mapper graph. To determine the sample size, we used the topology to establish what we term "Topological Sample Validation", which provided the minimum resolution with the same density information as the raw data. Sample representativeness was validated through non-parametric statistical tests on the density variable. Network properties, based on the network's structural characteristics, allowed the similarity between graphs to be evaluated. Despite the small sample size, remarkable correlations were obtained between age and network variables. Additionally, the Ball Mapper technique proved effective as a preprocessing algorithm for tomographic images, enabling the segmentation of undesired features in the object of interest.
Affiliation(s)
- João Valerio
  - Graduate Program in Biometry and Applied Statistics, Federal Rural University of Pernambuco, Recife, Pernambuco, Brazil
  - Department of Agricultural Engineering, Federal University of Maranhão, Chapadinha, Maranhão, Brazil
- Jonas E Vasconcelos-Filho
  - Graduate Program in Biometry and Applied Statistics, Federal Rural University of Pernambuco, Recife, Pernambuco, Brazil
- Borko Stosic
  - Graduate Program in Biometry and Applied Statistics, Federal Rural University of Pernambuco, Recife, Pernambuco, Brazil
  - Department of Statistics and Informatics, Federal Rural University of Pernambuco, Recife, Brazil
- Wilson R de Oliveira
  - Graduate Program in Biometry and Applied Statistics, Federal Rural University of Pernambuco, Recife, Pernambuco, Brazil
  - Department of Statistics and Informatics, Federal Rural University of Pernambuco, Recife, Brazil
- Francisco M Santana
  - Department of Fishery and Aquaculture, Federal Rural University of Pernambuco, Recife, Brazil
- Antonio C D Antonino
  - Department of Nuclear Energy, Federal University of Pernambuco, Recife, Pernambuco, Brazil
- Paulo J Duarte-Neto
  - Graduate Program in Biometry and Applied Statistics, Federal Rural University of Pernambuco, Recife, Pernambuco, Brazil
  - Department of Statistics and Informatics, Federal Rural University of Pernambuco, Recife, Brazil
9. Guardieiro V, de Oliveira FI, Doraiswamy H, Nonato LG, Silva C. TopoMap++: A Faster and More Space Efficient Technique to Compute Projections with Topological Guarantees. IEEE Trans Vis Comput Graph 2025; 31:229-239. [PMID: 39255150 DOI: 10.1109/tvcg.2024.3456365] [Indexed: 09/12/2024]
Abstract
High-dimensional data, characterized by many features, can be difficult to visualize effectively. Dimensionality reduction techniques, such as PCA, UMAP, and t-SNE, address this challenge by projecting the data into a lower-dimensional space while preserving important relationships. TopoMap is another technique that excels at preserving the underlying structure of the data, leading to interpretable visualizations. In particular, TopoMap maps the high-dimensional data into a visual space, guaranteeing that the 0-dimensional persistence diagram of the Rips filtration of the visual space matches the one from the high-dimensional data. However, the original TopoMap algorithm can be slow and its layout can be too sparse for large and complex datasets. In this paper, we propose three improvements to TopoMap: 1) a more space-efficient layout, 2) a significantly faster implementation, and 3) a novel TreeMap-based representation that makes use of the topological hierarchy to aid the exploration of the projections. These advancements make TopoMap, now referred to as TopoMap++, a more powerful tool for visualizing high-dimensional data which we demonstrate through different use case scenarios.
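The guarantee described in the abstract concerns the 0-dimensional persistence diagram of the Rips filtration. A standard fact is that, for a finite point cloud, every connected component is born at scale 0 and the finite death times equal the edge lengths of a Euclidean minimum spanning tree. The Python sketch below computes those death times with Kruskal's algorithm and a union-find structure. It illustrates the invariant only and is not the TopoMap or TopoMap++ layout algorithm; the function name `h0_deaths` and the sample points are hypothetical, and the scale convention (edge length rather than half the edge length) is an assumption.

```python
import math
from itertools import combinations


def h0_deaths(points):
    """Sorted finite death times of the 0-dimensional Rips persistence diagram.

    Equivalent to the sorted edge lengths of a minimum spanning tree,
    computed here with Kruskal's algorithm and union-find.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component root, compressing the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components, so one of them dies at w
            parent[ri] = rj
            deaths.append(w)
    return deaths


high_dim = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
layout = [(0.0,), (1.0,), (3.0,)]  # a hypothetical lower-dimensional placement
same_h0 = h0_deaths(high_dim) == h0_deaths(layout)
```

Here the three-dimensional points and the one-dimensional placement share the same death times, which is the property a projection with this topological guarantee must preserve.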
10. Rocha HL, Aguilar B, Getz M, Shmulevich I, Macklin P. A multiscale model of immune surveillance in micrometastases gives insights on cancer patient digital twins. NPJ Syst Biol Appl 2024; 10:144. [PMID: 39627216 PMCID: PMC11614875 DOI: 10.1038/s41540-024-00472-z] [Received: 05/01/2024] [Accepted: 11/15/2024] [Indexed: 12/06/2024]
Abstract
Metastasis is the leading cause of death in patients with cancer, driving considerable scientific and clinical interest in immunosurveillance of micrometastases. We investigated this process by creating a multiscale mathematical model to study the interactions between the immune system and the progression of micrometastases in general epithelial tissue. We analyzed the parameter space of the model using high-throughput computing resources to generate over 100,000 virtual patient trajectories. We demonstrated that the model could recapitulate a wide variety of virtual patient trajectories, including uncontrolled growth, partial response, and complete immune response to tumor growth. We classified the virtual patients and identified key patient parameters with the greatest effect on the simulated immunosurveillance. We highlight the lessons derived from this analysis and their impact on the nascent field of cancer patient digital twins (CPDTs). While CPDTs could enable clinicians to systematically dissect the complexity of cancer in each individual patient and inform treatment choices, our work shows that key challenges remain before we can reach this vision. In particular, we show that there remain considerable uncertainties in immune responses, unreliable patient stratification, and unpredictable personalized treatment. Nonetheless, we also show that in spite of these challenges, patient-specific models suggest strategies to increase control of clinically undetectable micrometastases even without complete parameter certainty.
Affiliation(s)
- Heber L Rocha
- Intelligent Systems Engineering, Indiana University, Bloomington, IN, USA
| | | | - Michael Getz
- Intelligent Systems Engineering, Indiana University, Bloomington, IN, USA
| | | | - Paul Macklin
- Intelligent Systems Engineering, Indiana University, Bloomington, IN, USA.
| |
|
11
|
Moldenhauer S, Potluri N, Xie Y, Southwell AL. A tool to automate assessment of regional brain atrophy in mouse models of neurodegenerative disease. bioRxiv 2024:2024.11.30.626190. [PMID: 39651151 PMCID: PMC11623679 DOI: 10.1101/2024.11.30.626190] [Indexed: 12/11/2024]
Abstract
As life expectancy rises, so too does the prevalence of neurodegenerative diseases. Neurodegeneration causes progressive regional brain atrophy, typically initiating prior to symptom onset. Researchers measure the impact of potential treatments on atrophy in mouse models to assess their effectiveness. This is important because treatments designed to combat neuropathology are more likely to modify the disease, in contrast to symptom management. Magnetic resonance imaging, while accurate in measurement of brain region structure volumes, is prohibitively expensive. Stereological volume assessment, the process of estimating the volume of individual 3D brain regions from imaged 2D brain sections, is therefore more commonly used. This involves manually tracing brain region(s) of interest in regularly spaced imaged cross-sections to determine their 2D area, followed by application of the Cavalieri principle to estimate the volume. The pertinent caveats of this approach are the labor-intensive manual tracing process and potential inaccuracies that arise due to human variation. To overcome these challenges, we have created a Neuropathology Assessment Tool (NAT) to automate regional brain tracing and identification using artificial intelligence (AI) and concepts from topological data analysis. The NAT was validated by comparing manual and NAT analysis of striatal volume in Huntington disease model mice. The NAT detected striatal atrophy with higher efficiency, 93.8% agreement with manual measurements, and lower inter-group variability. The NAT will increase efficiency of preclinical neuropathology assessment, allowing for a greater number of experimental therapies to be tested and facilitating drug discovery for intractable neurodegenerative diseases.
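The Cavalieri step described in this abstract can be illustrated with a minimal sketch. It assumes regularly spaced sections and estimates the volume as the section spacing multiplied by the sum of the traced areas. The function name, the section spacing, and the example areas are hypothetical and are not taken from the NAT.

```python
def cavalieri_volume(section_areas_mm2, section_spacing_mm):
    """Estimate a region's volume from areas traced on regularly spaced sections.

    Cavalieri estimator: V ~ d * sum(A_i), where d is the distance between
    consecutive analysed sections and A_i is the traced area on section i.
    """
    if section_spacing_mm <= 0:
        raise ValueError("section spacing must be positive")
    if any(area < 0 for area in section_areas_mm2):
        raise ValueError("areas must be non-negative")
    return section_spacing_mm * sum(section_areas_mm2)


# Hypothetical example: areas (mm^2) traced on every 8th section of 25 micron
# thickness, so consecutive analysed sections are 8 * 0.025 = 0.2 mm apart.
areas = [1.5, 2.0, 2.5, 2.0]
volume = cavalieri_volume(areas, section_spacing_mm=0.2)
print(f"estimated volume: {volume:.2f} mm^3")
```

The estimate depends only on the traced areas and the spacing, which is why tracing variability between human raters feeds directly into the volume.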
|
12
|
Kokkanti A, Atanasiu A, Kolbin D, Adalsteinsson D, Bloom K, Vasquez PA. TopoLoop: A new tool for chromatin loop detection in live cells via single-particle tracking. J Chem Phys 2024; 161:204105. [PMID: 39575737 PMCID: PMC11604096 DOI: 10.1063/5.0236090] [Received: 08/30/2024] [Accepted: 10/28/2024] [Indexed: 11/27/2024] Open
Abstract
We present a novel method for identifying topological features of chromatin domains in live cells using single-particle tracking and topological data analysis (TDA). By applying TDA to particle trajectories, we can effectively detect complex spatial patterns, such as loops, that are often missed by traditional time series analysis. Using simulations of polymer bead-spring chains, we have validated the accuracy of our method and determined its limitations for detecting loops. Our approach offers a promising avenue for exploring the topological complexity of chromatin in living cells using TDA techniques.
Affiliation(s)
- Aryan Kokkanti
- Department of Biology, University of North Carolina at Chapel Hill, 622 Fordham Hall, CB3280, Chapel Hill, North Carolina 27599, USA
| | - Andrew Atanasiu
- Department of Biology, University of North Carolina at Chapel Hill, 622 Fordham Hall, CB3280, Chapel Hill, North Carolina 27599, USA
| | - Daniel Kolbin
- Department of Biology, University of North Carolina at Chapel Hill, 622 Fordham Hall, CB3280, Chapel Hill, North Carolina 27599, USA
| | - David Adalsteinsson
- Department of Mathematics, University of North Carolina at Chapel Hill, 120 E Cameron Avenue, CB3250, Chapel Hill, North Carolina 27599, USA
| | - Kerry Bloom
- Department of Biology, University of North Carolina at Chapel Hill, 622 Fordham Hall, CB3280, Chapel Hill, North Carolina 27599, USA
| | - Paula A. Vasquez
- Department of Mathematics, University of South Carolina, 1523 Greene St., LC417, Columbia, South Carolina 29208, USA
| |
|
13
|
Singh Y, Farrelly C, Hathaway QA, Carlsson G. Visualizing radiological data bias through persistence images. Oncotarget 2024; 15:787-789. [PMID: 39535539 PMCID: PMC11559657 DOI: 10.18632/oncotarget.28670] [Received: 10/28/2024] [Indexed: 11/16/2024] Open
Abstract
Persistence images, derived from topological data analysis, emerge as a powerful tool for visualizing and mitigating biases in radiological data interpretation and AI model development. This technique transforms complex topological features into stable, interpretable representations, offering unique insights into medical imaging data structure. By providing intuitive visualizations, persistence images enable the identification of subtle structural differences and potential biases in data acquisition, interpretation, and AI model training. Persistence images can also facilitate stratified sampling, matching statistics, and noise filtration, enhancing the accuracy and equity of radiological analysis. Despite challenges in computational complexity and workflow integration, persistence images show promise in developing more accurate, equitable, and trustworthy AI systems in radiology, potentially improving patient outcomes and personalized healthcare delivery.
Affiliation(s)
- Yashbir Singh
- Correspondence to: Yashbir Singh, Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
| | | | | | | |
|
14
|
Singh Y, Farrelly C, Hathaway QA, Carlsson G. Persistence landscapes: Charting a path to unbiased radiological interpretation. Oncotarget 2024; 15:790-792. [PMID: 39535533 PMCID: PMC11559655 DOI: 10.18632/oncotarget.28671] [Received: 10/28/2024] [Indexed: 11/16/2024] Open
Abstract
Persistence landscapes, a sophisticated tool from topological data analysis, offer a promising approach to address biases in radiological interpretation and AI model development. By transforming complex topological features into statistically analyzable functions, they enable robust comparisons between populations and datasets. Persistence landscapes excel in noise filtration, fusion bias mitigation, and enhancing machine learning models. Despite challenges in computation and integration, they show potential to improve the accuracy and equity of radiological analysis, particularly in multi-modal imaging and AI-assisted interpretation.
Affiliation(s)
- Yashbir Singh
- Correspondence to: Yashbir Singh, Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
| | | | | | | |
|
15
|
Panconi L, Euchner J, Tashev SA, Makarova M, Herten DP, Owen DM, Nieves DJ. Mapping membrane biophysical nano-environments. Nat Commun 2024; 15:9641. [PMID: 39511199 PMCID: PMC11544141 DOI: 10.1038/s41467-024-53883-1] [Received: 06/18/2023] [Accepted: 10/25/2024] [Indexed: 11/15/2024] Open
Abstract
The mammalian plasma membrane is known to contain domains with varying lipid composition and biophysical properties. However, studying these membrane lipid domains presents challenges due to their predicted morphological similarity to the bulk membrane and their scale being below the classical resolution limit of optical microscopy. To address this, we combine the solvatochromic probe di-4-ANEPPDHQ, which reports on its biophysical environment through changes in its fluorescence emission, with spectrally resolved single-molecule localisation microscopy. The resulting data comprises nanometre-precision localisation coordinates and a generalised polarisation value related to the probe's environment, i.e., a marked point pattern. We introduce quantification algorithms based on topological data analysis (PLASMA) to detect and map nano-domains in this marked data, demonstrating their effectiveness in both artificial membranes and live cells. By leveraging environmentally sensitive fluorophores, multi-modal single molecule localisation microscopy, and advanced analysis methods, we achieve nanometre scale mapping of membrane properties and assess changes in response to external perturbation with methyl-β-cyclodextrin. This methodology represents an integrated toolset for investigating marked point pattern data at nanometre spatial scales.
Affiliation(s)
- Luca Panconi
- Department of Immunology and Immunotherapy, School of Infection, Inflammation and Immunology, College of Medicine and Health, University of Birmingham, Birmingham, UK
- School of Physics and Astronomy, College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
| | - Jonas Euchner
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
- Department of Cardiovascular Sciences, School of Medical Sciences, College of Medicine and Health, University of Birmingham, Birmingham, UK
- School of Chemistry, College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
| | - Stanimir A Tashev
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
- Department of Cardiovascular Sciences, School of Medical Sciences, College of Medicine and Health, University of Birmingham, Birmingham, UK
- School of Chemistry, College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
| | - Maria Makarova
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
- School of Biosciences, College of Life and Environmental Science, University of Birmingham, Birmingham, UK
- Department of Metabolism and Systems Science, School of Medical Sciences, College of Medicine and Health, University of Birmingham, Birmingham, UK
| | - Dirk-Peter Herten
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
- Department of Cardiovascular Sciences, School of Medical Sciences, College of Medicine and Health, University of Birmingham, Birmingham, UK
- School of Chemistry, College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
| | - Dylan M Owen
- Department of Immunology and Immunotherapy, School of Infection, Inflammation and Immunology, College of Medicine and Health, University of Birmingham, Birmingham, UK
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
- School of Mathematics, College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
| | - Daniel J Nieves
- Department of Immunology and Immunotherapy, School of Infection, Inflammation and Immunology, College of Medicine and Health, University of Birmingham, Birmingham, UK.
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK.
| |
|
16
|
Rai A, Nath Sharma B, Rabindrajit Luwang S, Nurujjaman M, Majhi S. Identifying extreme events in the stock market: A topological data analysis. Chaos 2024; 34:103106. [PMID: 39352199 DOI: 10.1063/5.0220424] [Received: 05/25/2024] [Accepted: 09/09/2024] [Indexed: 10/03/2024]
Abstract
This paper employs Topological Data Analysis (TDA) to detect extreme events (EEs) in the stock market at a continental level. Previous approaches, which analyzed stock indices separately, could not detect EEs for multiple time series in one go. TDA provides a robust framework for such analysis and identifies the EEs during the crashes for different indices. The TDA analysis shows that the L1 and L2 norms and the Wasserstein distance (WD) of the world's leading indices rise abruptly during the crashes, surpassing a threshold of μ + 4σ, where μ and σ are the mean and the standard deviation of the norm or WD, respectively. Our study identified the stock index crashes of the 2008 financial crisis and the COVID-19 pandemic across continents as EEs. Given that different sectors in an index behave differently, a sector-wise analysis was conducted during the COVID-19 pandemic for the Indian stock market. The sector-wise results show that, after the occurrence of an EE, strong crashes surpassing μ + 2σ were observed for an extended period in the banking, automobile, IT, realty, energy, and metal sectors, whereas no significant spikes were noted for the pharmaceutical and FMCG sectors. Hence, TDA also proves successful in identifying the duration of shocks after the occurrence of EEs. This also indicates that the banking sector continued to face stress and remained volatile even after the crash. This study demonstrates the applicability of TDA as a powerful analytical tool to study EEs in various fields.
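The thresholding rule in this abstract (flag a value when it exceeds the mean plus a multiple of the standard deviation) can be sketched as follows. This is only an illustration: the function name and the toy series are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator the authors used.

```python
from statistics import mean, pstdev


def flag_extreme_events(values, k=4.0):
    """Return (threshold, indices) for values exceeding mean + k * std.

    Assumption: the population standard deviation (pstdev) is used; the
    source abstract does not specify population vs. sample std.
    """
    mu = mean(values)
    sigma = pstdev(values)
    threshold = mu + k * sigma
    flagged = [i for i, v in enumerate(values) if v > threshold]
    return threshold, flagged


# Hypothetical series standing in for a norm or Wasserstein-distance curve:
# a quiet baseline with one abrupt spike at index 20.
series = [0.9, 1.1] * 15
series[20] = 10.0

threshold, flagged = flag_extreme_events(series, k=4.0)
print(f"threshold = {threshold:.3f}, flagged indices = {flagged}")
```

Lowering `k` to 2.0 corresponds to the looser threshold the abstract uses for the sector-wise analysis.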
Affiliation(s)
- Anish Rai
- Department of Physics, National Institute of Technology Sikkim, Ravangla, Sikkim 737139, India
| | - Buddha Nath Sharma
- Department of Physics, National Institute of Technology Sikkim, Ravangla, Sikkim 737139, India
| | | | - Md Nurujjaman
- Department of Physics, National Institute of Technology Sikkim, Ravangla, Sikkim 737139, India
| | - Sushovan Majhi
- Data Science Program, George Washington University, Washington, DC 20052, USA
| |
|
17
|
Hathaway QA, Jamthikar AD, Rajiv N, Chaitman BR, Carson JL, Yanamala N, Sengupta PP. Cardiac ultrasomics for acute myocardial infarction risk stratification and prediction of all-cause mortality: a feasibility study. Echo Res Pract 2024; 11:22. [PMID: 39278898 PMCID: PMC11403884 DOI: 10.1186/s44156-024-00057-w] [Received: 05/16/2024] [Accepted: 07/23/2024] [Indexed: 09/18/2024] Open
Abstract
BACKGROUND: Current risk stratification tools for acute myocardial infarction (AMI) have limitations, particularly in predicting mortality. This study utilizes cardiac ultrasound radiomics (i.e., ultrasomics) to risk stratify AMI patients when predicting all-cause mortality. RESULTS: The study included 197 patients: (a) retrospective internal cohort (n = 155) of non-ST-elevation myocardial infarction (n = 63) and ST-elevation myocardial infarction (n = 92) patients, and (b) external cohort from the multicenter Door-To-Unload in ST-segment-elevation myocardial infarction [DTU-STEMI] Pilot Trial (n = 42). Echocardiography images of apical 2, 3, and 4-chamber were processed through an automated deep-learning pipeline to extract ultrasomic features. Unsupervised machine learning (topological data analysis) generated AMI clusters followed by a supervised classifier to generate individual predicted probabilities. Validation included assessing the incremental value of predicted probabilities over the Global Registry of Acute Coronary Events (GRACE) risk score 2.0 to predict 1-year all-cause mortality in the internal cohort and infarct size in the external cohort. Three phenogroups were identified: Cluster A (high-risk), Cluster B (intermediate-risk), and Cluster C (low-risk). Cluster A patients had decreased LV ejection fraction (P < 0.01) and global longitudinal strain (P = 0.03) and increased mortality at 1-year (log rank P = 0.05). Ultrasomics features alone (C-Index: 0.74 vs. 0.70, P = 0.04) and combined with global longitudinal strain (C-Index: 0.81 vs. 0.70, P < 0.01) increased prediction of mortality beyond the GRACE 2.0 score. In the DTU-STEMI clinical trial, Cluster A was associated with larger infarct size (> 10% LV mass, P < 0.01), compared to remaining clusters. CONCLUSIONS: Ultrasomics-based phenogroup clustering, augmented by TDA and supervised machine learning, provides a novel approach for AMI risk stratification.
Affiliation(s)
- Quincy A Hathaway
- Division of Cardiovascular Disease and Hypertension, Department of Medicine, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA
| | - Ankush D Jamthikar
- Division of Cardiovascular Disease and Hypertension, Department of Medicine, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA
| | - Nivedita Rajiv
- Division of Cardiovascular Disease and Hypertension, Department of Medicine, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA
| | - Bernard R Chaitman
- Department of Medicine, St. Louis University School of Medicine, St. Louis, MO, USA
| | - Jeffrey L Carson
- Division of General Internal Medicine, Department of Medicine, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA
| | - Naveena Yanamala
- Division of Cardiovascular Disease and Hypertension, Department of Medicine, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA
| | - Partho P Sengupta
- Division of Cardiovascular Disease and Hypertension, Department of Medicine, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA.
- Rutgers Robert Wood Johnson Medical School, Division of Cardiovascular Disease and Hypertension, 125 Patterson St, New Brunswick, NJ, 08901, USA.
| |
|
18
|
Maciejewski K, Czerwinska P. Scoping Review: Methods and Applications of Spatial Transcriptomics in Tumor Research. Cancers (Basel) 2024; 16:3100. [PMID: 39272958 PMCID: PMC11394603 DOI: 10.3390/cancers16173100] [Received: 07/19/2024] [Revised: 08/30/2024] [Accepted: 08/30/2024] [Indexed: 09/15/2024] Open
Abstract
Spatial transcriptomics (ST) examines gene expression within its spatial context on tissue, linking morphology and function. Advances in ST resolution and throughput have led to an increase in scientific interest, notably in cancer research. This scoping study reviews the challenges and practical applications of ST, summarizing current methods, trends, and data analysis techniques for ST in neoplasm research. We analyzed 41 articles published by the end of 2023 alongside public data repositories. The findings indicate cancer biology is an important focus of ST research, with a rising number of studies each year. Visium (10x Genomics, Pleasanton, CA, USA) is the leading ST platform, and SCTransform from Seurat R library is the preferred method for data normalization and integration. Many studies incorporate additional data types like single-cell sequencing and immunohistochemistry. Common ST applications include discovering the composition and function of tumor tissues in the context of their heterogeneity, characterizing the tumor microenvironment, or identifying interactions between cells, including spatial patterns of expression and co-occurrence. However, nearly half of the studies lacked comprehensive data processing protocols, hindering their reproducibility. By recommending greater transparency in sharing analysis methods and adapting single-cell analysis techniques with caution, this review aims to improve the reproducibility and reliability of future studies in cancer research.
Affiliation(s)
- Kacper Maciejewski
- Undergraduate Research Group “Biobase”, Poznan University of Medical Sciences, 61-701 Poznan, Poland;
| | - Patrycja Czerwinska
- Undergraduate Research Group “Biobase”, Poznan University of Medical Sciences, 61-701 Poznan, Poland;
- Department of Cancer Immunology, Poznan University of Medical Sciences, 61-866 Poznan, Poland
- Department of Diagnostics and Cancer Immunology, Greater Poland Cancer Centre, 61-866 Poznan, Poland
| |
|
19
|
Ghafuri J, Jassim S. Singular-Value-Decomposition-Based Matrix Surgery. Entropy (Basel) 2024; 26:701. [PMID: 39202171 PMCID: PMC11353412 DOI: 10.3390/e26080701] [Received: 06/03/2024] [Revised: 07/24/2024] [Accepted: 08/15/2024] [Indexed: 09/03/2024]
Abstract
This paper is motivated by the need to stabilise the impact of deep learning (DL) training for medical image analysis on the conditioning of convolution filters in relation to model overfitting and robustness. We present a simple strategy to reduce square matrix condition numbers and investigate its effect on the spatial distributions of point clouds of well- and ill-conditioned matrices. For a square matrix, the SVD surgery strategy works by: (1) computing its singular value decomposition (SVD), (2) changing a few of the smaller singular values relative to the largest one, and (3) reconstructing the matrix by reverse SVD. Applying SVD surgery on CNN convolution filters during training acts as spectral regularisation of the DL model without requiring the learning of extra parameters. The fact that the further away a matrix is from the non-invertible matrices, the higher its condition number is suggests that the spatial distributions of square matrices and those of their inverses are correlated to their condition number distributions. We shall examine this assertion empirically by showing that applying various versions of SVD surgery on point clouds of matrices leads to bringing their persistence diagrams (PDs) closer to the matrices of the point clouds of their inverses.
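The three steps listed in this abstract can be sketched with NumPy. The abstract does not say how the smaller singular values are changed, so the clamping rule below (raising every singular value to at least the largest one divided by a target condition number) is an assumption chosen for illustration, and the names `svd_surgery` and `max_condition` are hypothetical.

```python
import numpy as np


def svd_surgery(matrix, max_condition=10.0):
    """Reduce the 2-norm condition number of a square matrix.

    Assumed rule (not from the source): clamp each singular value to be
    no smaller than sigma_max / max_condition.
    """
    # (1) Compute the SVD; singular values come back in descending order.
    u, s, vt = np.linalg.svd(matrix)
    # (2) Raise the small singular values relative to the largest one.
    floor = s[0] / max_condition
    s_new = np.maximum(s, floor)
    # (3) Reconstruct the matrix from the modified singular values.
    return u @ np.diag(s_new) @ vt


# Hypothetical, nearly singular 2x2 example.
a = np.array([[1.0, 2.0], [2.0, 4.001]])
b = svd_surgery(a, max_condition=10.0)
print(f"condition number before: {np.linalg.cond(a):.1f}")
print(f"condition number after:  {np.linalg.cond(b):.1f}")
```

Because only the singular values are altered, the singular vectors of the matrix are preserved, and no extra trainable parameters are introduced.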
Affiliation(s)
- Jehan Ghafuri
- School of Computing, The University of Buckingham, Buckingham MK18 1EG, UK;
| | | |
|
20
|
El-Yaagoubi AB, Chung MK, Ombao H. Dynamic topological data analysis: a novel fractal dimension-based testing framework with application to brain signals. Front Neuroinform 2024; 18:1387400. [PMID: 39071176 PMCID: PMC11272560 DOI: 10.3389/fninf.2024.1387400] [Received: 02/17/2024] [Accepted: 06/21/2024] [Indexed: 07/30/2024] Open
Abstract
Topological data analysis (TDA) is increasingly recognized as a promising tool in the field of neuroscience, unveiling the underlying topological patterns within brain signals. However, most TDA related methods treat brain signals as if they were static, i.e., they ignore potential non-stationarities and irregularities in the statistical properties of the signals. In this study, we develop a novel fractal dimension-based testing approach that takes into account the dynamic topological properties of brain signals. By representing EEG brain signals as a sequence of Vietoris-Rips filtrations, our approach accommodates the inherent non-stationarities and irregularities of the signals. The application of our novel fractal dimension-based testing approach in analyzing dynamic topological patterns in EEG signals during an epileptic seizure episode exposes noteworthy alterations in total persistence across 0, 1, and 2-dimensional homology. These findings imply a more intricate influence of seizures on brain signals, extending beyond mere amplitude changes.
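As a small illustration of the total persistence summary mentioned in this abstract, the sketch below handles only 0-dimensional homology of a Vietoris-Rips filtration on a point cloud. It relies on the standard fact that every connected component is born at scale 0 and that components merge exactly at the edge lengths of a minimum spanning tree, so the total finite persistence equals the sum of those edge lengths. The function name and points are hypothetical, the filtration scale is taken to be the pairwise distance (some libraries use half of it), and this is not the authors' pipeline.

```python
from itertools import combinations
from math import dist


def h0_total_persistence(points):
    """Sum of finite 0-dimensional persistence in a Vietoris-Rips filtration.

    Uses Kruskal's algorithm with union-find: each edge that merges two
    components marks the death of one component at that edge length.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    total = 0.0
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge at this scale
            total += length          # birth is 0, so persistence = length
    return total


# Hypothetical point cloud: a tight cluster of three points and one outlier.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
total = h0_total_persistence(points)
print(f"total H0 persistence: {total:.3f}")
```

Higher-dimensional total persistence (loops and voids) requires a full persistent homology computation and is not covered by this shortcut.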
Affiliation(s)
- Anass B. El-Yaagoubi
- Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Moo K. Chung
- Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States
| | - Hernando Ombao
- Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| |
|
21
|
Arango AS, Park H, Tajkhorshid E. Topological Learning Approach to Characterizing Biological Membranes. J Chem Inf Model 2024; 64:5242-5252. [PMID: 38912752 PMCID: PMC12009557 DOI: 10.1021/acs.jcim.4c00552] [Indexed: 06/25/2024]
Abstract
Biological membranes play key roles in cellular compartmentalization, structure, and signaling pathways. At varying temperatures, individual membrane lipids sample from different configurations, a process that frequently leads to higher-order phase behavior and phenomena. Here, we present a persistent homology (PH)-based method for quantifying the structural features of individual and bulk lipids, providing local and contextual information on lipid tail organization. Our method leverages the mathematical machinery of algebraic topology and machine learning to infer temperature-dependent structural information on lipids from static coordinates. To train our model, we generated multiple molecular dynamics trajectories of dipalmitoyl-phosphatidylcholine membranes at varying temperatures. A fingerprint was then constructed for each set of lipid coordinates by PH filtration, in which interaction spheres were grown around the lipid atoms while tracking their intersections. The sphere filtration formed a simplicial complex that captures enduring key topological features of the configuration landscape using homology, yielding persistence data. Following fingerprint extraction for physiologically relevant temperatures, the persistence data were used to train an attention-based neural network for assignment of effective temperature values to selected membrane regions. Our persistent homology-based method captures the local structural effects, via effective temperature, of lipids adjacent to other membrane constituents, e.g., sterols and proteins. This topological learning approach can predict lipid effective temperatures from static coordinates across multiple spatial resolutions. The tool, called MembTDA, can be accessed at https://github.com/hyunp2/Memb-TDA.
Affiliation(s)
- Andres S Arango
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Hyun Park
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Emad Tajkhorshid
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| |
|
22
|
Levenson RM, Singh Y, Rieck B, Hathaway QA, Farrelly C, Rozenblit J, Prasanna P, Erickson B, Choudhary A, Carlsson G, Sarkar D. Advancing Precision Medicine: Algebraic Topology and Differential Geometry in Radiology and Computational Pathology. Lab Invest 2024; 104:102060. [PMID: 38626875 PMCID: PMC12054847 DOI: 10.1016/j.labinv.2024.102060] [Received: 12/15/2023] [Revised: 04/08/2024] [Accepted: 04/10/2024] [Indexed: 05/19/2024] Open
Abstract
Precision medicine aims to provide personalized care based on individual patient characteristics, rather than guideline-directed therapies for groups of diseases or patient demographics. Images, both radiology- and pathology-derived, are a major source of information on presence, type, and status of disease. Exploring the mathematical relationship of pixels in medical imaging ("radiomics") and cellular-scale structures in digital pathology slides ("pathomics") offers powerful tools for extracting both qualitative and, increasingly, quantitative data. These analytical approaches, however, may be significantly enhanced by applying additional methods arising from fields of mathematics such as differential geometry and algebraic topology that remain underexplored in this context. Geometry's strength lies in its ability to provide precise local measurements, such as curvature, that can be crucial for identifying abnormalities at multiple spatial levels. These measurements can augment the quantitative features extracted in conventional radiomics, leading to more nuanced diagnostics. By contrast, topology serves as a robust shape descriptor, capturing essential features such as connected components and holes. The field of topological data analysis was initially founded to explore the shape of data, with functional network connectivity in the brain being a prominent example. Increasingly, its tools are now being used to explore organizational patterns of physical structures in medical images and digitized pathology slides. By leveraging tools from both differential geometry and algebraic topology, researchers and clinicians may be able to obtain a more comprehensive, multi-layered understanding of medical images and contribute to precision medicine's armamentarium.
Affiliation(s)
- Richard M Levenson
- Department of Pathology and Laboratory Medicine, University of California Davis, Davis, California.
| | - Yashbir Singh
- Department of Radiology, Mayo Clinic, Rochester, Minnesota.
| | - Bastian Rieck
- Helmholtz Munich and Technical University of Munich, Munich, Germany
| | - Quincy A Hathaway
- Department of Medical Education, West Virginia University, Morgantown, West Virginia
| | | | | | - Prateek Prasanna
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
| | | | | | - Gunnar Carlsson
- Department of Mathematics, Stanford University, Stanford, California
| | - Deepa Sarkar
- Institute of Genomic Health, Icahn School of Medicine, Mount Sinai, New York
| |
|
23
|
Rolland J, Boutin R, Eveillard D, Delahaye B. Datascape: exploring heterogeneous dataspace. Sci Rep 2024; 14:7041. [PMID: 38580694 PMCID: PMC10997776 DOI: 10.1038/s41598-024-52493-7] [Received: 08/24/2023] [Accepted: 01/19/2024] [Indexed: 04/07/2024] Open
Abstract
Data science is a powerful field for gaining insights, comparing, and predicting behaviors from datasets. However, the diversity of methods and hypotheses needed to abstract a dataset exhibits a lack of genericity. Moreover, the shape of a dataset, which structures its contained information and uncertainties, is rarely considered. Inspired by state-of-the-art manifold learning and hull estimation algorithms, we propose a novel framework, the datascape, that leverages topology and graph theory to abstract heterogeneous datasets. Built upon the combination of a nearest neighbor graph, a set of convex hulls, and a metric distance that respects the shape of the data, the datascape allows exploration of the dataset's underlying space. We show that the datascape can uncover underlying functions from simulated datasets, build predictive algorithms with performance close to state-of-the-art algorithms, and reveal insightful geodesic paths between points. It demonstrates versatility through ecological, medical, and simulated data use cases.
Affiliation(s)
- Jakez Rolland
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, 44322, Nantes, France.
- Bio Logbook, 44200, Nantes, France.
| | | | - Damien Eveillard
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, 44322, Nantes, France
| | - Benoit Delahaye
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, 44322, Nantes, France
| |
|
24
|
Nguyen KC, Jameson CD, Baldwin SA, Nardini JT, Smith RC, Haugh JM, Flores KB. Quantifying collective motion patterns in mesenchymal cell populations using topological data analysis and agent-based modeling. Math Biosci 2024; 370:109158. [PMID: 38373479 PMCID: PMC10966690 DOI: 10.1016/j.mbs.2024.109158] [Received: 09/06/2023] [Revised: 02/06/2024] [Accepted: 02/11/2024] [Indexed: 02/21/2024]
Abstract
Fibroblasts in a confluent monolayer are known to adopt elongated morphologies in which cells are oriented parallel to their neighbors. We collected and analyzed new microscopy movies to show that confluent fibroblasts are motile and that neighboring cells often move in anti-parallel directions in a collective motion phenomenon we refer to as "fluidization" of the cell population. We used machine learning to perform cell tracking for each movie and then leveraged topological data analysis (TDA) to show that time-varying point-clouds generated by the tracks contain significant topological information content that is driven by fluidization, i.e., the anti-parallel movement of individual neighboring cells and neighboring groups of cells over long distances. We then utilized the TDA summaries extracted from each movie to perform Bayesian parameter estimation for the D'Orsogna model, an agent-based model (ABM) known to produce a wide array of different patterns, including patterns that are qualitatively similar to fluidization. Although the D'Orsogna ABM is a phenomenological model that only describes inter-cellular attraction and repulsion, the estimated region of D'Orsogna model parameter space was consistent across all movies, suggesting that a specific level of inter-cellular repulsion force at close range may be a mechanism that helps drive fluidization patterns in confluent mesenchymal cell populations.
Affiliation(s)
- Kyle C Nguyen
- Biomathematics Graduate Program, North Carolina State University, Raleigh, NC 27607, USA; Center for Research in Scientific Computation, North Carolina State University, Raleigh, NC 27607, USA.
| | | | - Scott A Baldwin
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA
| | - John T Nardini
- Department of Mathematics and Statistics, The College of New Jersey, Ewing, NJ 08628, USA
| | - Ralph C Smith
- Department of Mathematics, North Carolina State University, Raleigh, NC 27607, USA
| | - Jason M Haugh
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA
| | - Kevin B Flores
- Center for Research in Scientific Computation, North Carolina State University, Raleigh, NC 27607, USA; Department of Mathematics, North Carolina State University, Raleigh, NC 27607, USA
| |
|
25
|
Singer B, Meling D, Hirsch-Hoffmann M, Michels L, Kometer M, Smigielski L, Dornbierer D, Seifritz E, Vollenweider FX, Scheidegger M. Psilocybin enhances insightfulness in meditation: a perspective on the global topology of brain imaging during meditation. Sci Rep 2024; 14:7211. [PMID: 38531905 PMCID: PMC10966054 DOI: 10.1038/s41598-024-55726-x] [Received: 08/19/2022] [Accepted: 02/27/2024] [Indexed: 03/28/2024] Open
Abstract
In this study, for the first time, we explored a dataset of functional magnetic resonance images collected during focused attention and open monitoring meditation before and after a five-day psilocybin-assisted meditation retreat using a recently established approach, based on the Mapper algorithm from topological data analysis. After generating subject-specific maps for two groups (psilocybin vs. placebo, 18 subjects/group) of experienced meditators, organizational principles were uncovered using graph topological tools, including the optimal transport (OT) distance, a geometrically rich measure of similarity between brain activity patterns. This revealed characteristics of the topology (i.e. shape) in space (i.e. abstract space of voxels) and time dimension of whole-brain activity patterns during different styles of meditation and psilocybin-induced alterations. Most interestingly, we found that (psilocybin-induced) positive derealization, which fosters insightfulness specifically when accompanied by enhanced open-monitoring meditation, was linked to the OT distance between open-monitoring and resting state. Our findings suggest that enhanced meta-awareness through meditation practice in experienced meditators combined with potential psilocybin-induced positive alterations in perception mediate insightfulness. Together, these findings provide a novel perspective on meditation and psychedelics that may reveal potential novel brain markers for positive synergistic effects between mindfulness practices and psilocybin.
Affiliation(s)
- Berit Singer
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland.
| | - Daniel Meling
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland
- Department of Psychosomatic Medicine and Psychotherapy, Medical Center - University of Freiburg, Freiburg, Germany
| | - Matthias Hirsch-Hoffmann
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland
| | - Lars Michels
- Department of Neuroradiology, University Hospital Zurich, Neuroscience Center Zurich (ZNZ), University of Zurich and ETH Zurich, Zurich, Switzerland
| | - Michael Kometer
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland
| | - Lukasz Smigielski
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland
| | - Dario Dornbierer
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland
| | - Erich Seifritz
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland
| | - Franz X Vollenweider
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland.
| | - Milan Scheidegger
- Department of Adult Psychiatry and Psychotherapy, Psychiatric University Clinic Zurich and University of Zurich, Zurich, Switzerland
- Department of Neuroradiology, University Hospital Zurich, Neuroscience Center Zurich (ZNZ), University of Zurich and ETH Zurich, Zurich, Switzerland
| |
|
26
|
Dervić E, Sorger J, Yang L, Leutner M, Kautzky A, Thurner S, Kautzky-Willer A, Klimek P. Unraveling cradle-to-grave disease trajectories from multilayer comorbidity networks. NPJ Digit Med 2024; 7:56. [PMID: 38454004 PMCID: PMC10920888 DOI: 10.1038/s41746-024-01015-w] [Received: 06/20/2023] [Accepted: 01/18/2024] [Indexed: 03/09/2024] Open
Abstract
We aim to comprehensively identify typical life-spanning trajectories and critical events that impact patients' hospital utilization and mortality. We use a unique dataset containing 44 million records of almost all inpatient stays from 2003 to 2014 in Austria to investigate disease trajectories. We develop a new, multilayer disease network approach to quantitatively analyze how co-occurrences of two or more diagnoses form and evolve over the life course of patients. Nodes represent diagnoses in age groups of ten years; each age group makes up a layer of the comorbidity multilayer network. Intra-layer links encode a significant correlation between diagnoses (p < 0.001, relative risk > 1.5), while inter-layer links encode correlations between diagnoses across different age groups. We use an unsupervised clustering algorithm for detecting typical disease trajectories as overlapping clusters in the multilayer comorbidity network. We identify critical events in a patient's career as points where initially overlapping trajectories start to diverge towards different states. We identified 1260 distinct disease trajectories (618 for females, 642 for males) that on average contain 9 (IQR 2-6) different diagnoses that cover up to 70 years (mean 23 years). We found 70 pairs of diverging trajectories that share some diagnoses at younger ages but develop into markedly different groups of diagnoses at older ages. The disease trajectory framework can help us to identify critical events as specific combinations of risk factors that put patients at high risk for different diagnoses decades later. Our findings enable a data-driven integration of personalized life-course perspectives into clinical decision-making.
Affiliation(s)
- Elma Dervić
- Complexity Science Hub Vienna, Vienna, Austria
- Supply Chain Intelligence Institute Austria (ASCII), Vienna, Austria
- Medical University of Vienna, Section for Science of Complex Systems, CeMSIIS, Vienna, Austria
| | | | | | - Michael Leutner
- Medical University of Vienna, Department of Internal Medicine III, Clinical Division of Endocrinology and Metabolism, Vienna, Austria
| | - Alexander Kautzky
- Medical University of Vienna, Department of Psychiatry and Psychotherapy, Vienna, Austria
| | - Stefan Thurner
- Complexity Science Hub Vienna, Vienna, Austria
- Medical University of Vienna, Section for Science of Complex Systems, CeMSIIS, Vienna, Austria
- Santa Fe Institute, Santa Fe, NM, USA
| | - Alexandra Kautzky-Willer
- Medical University of Vienna, Department of Internal Medicine III, Clinical Division of Endocrinology and Metabolism, Vienna, Austria
- Gender Institute, Gars am Kamp, Austria
| | - Peter Klimek
- Complexity Science Hub Vienna, Vienna, Austria.
- Supply Chain Intelligence Institute Austria (ASCII), Vienna, Austria.
- Medical University of Vienna, Section for Science of Complex Systems, CeMSIIS, Vienna, Austria.
| |
|
27
|
Li M, Liu Z, Jiang N, Laws B, Tiskevich C, Moose SP, Topp CN. Topological data analysis expands the genotype to phenotype map for 3D maize root system architecture. Front Plant Sci 2024; 14:1260005. [PMID: 38288407 PMCID: PMC10822944 DOI: 10.3389/fpls.2023.1260005] [Received: 07/17/2023] [Accepted: 12/27/2023] [Indexed: 01/31/2024]
Abstract
A central goal of biology is to understand how genetic variation produces phenotypic variation, which has been described as a genotype to phenotype (G to P) map. The plant form is continuously shaped by intrinsic developmental and extrinsic environmental inputs, and therefore plant phenomes are highly multivariate and require comprehensive approaches to fully quantify. Yet a common assumption in plant phenotyping efforts is that a few pre-selected measurements can adequately describe the relevant phenome space. Our poor understanding of the genetic basis of root system architecture is at least partially a result of this incongruence. Root systems are complex 3D structures that are most often studied as 2D representations measured with relatively simple univariate traits. In prior work, we showed that persistent homology, a topological data analysis method that does not pre-suppose the salient features of the data, could expand the phenotypic trait space and identify new G to P relations from a commonly used 2D root phenotyping platform. Here we extend the work to entire 3D root system architectures of maize seedlings from a mapping population that was designed to understand the genetic basis of maize-nitrogen relations. Using a panel of 84 univariate traits, persistent homology methods developed for 3D branching, and multivariate vectors of the collective trait space, we found that each method captures distinct information about root system variation as evidenced by the majority of non-overlapping QTL, and hence that root phenotypic trait space is not easily exhausted. The work offers a data-driven method for assessing 3D root structure and highlights the importance of non-canonical phenotypes for more accurate representations of the G to P map.
Affiliation(s)
- Mao Li
- Donald Danforth Plant Science Center, St. Louis, MO, United States
| | - Zhengbin Liu
- Donald Danforth Plant Science Center, St. Louis, MO, United States
| | - Ni Jiang
- Donald Danforth Plant Science Center, St. Louis, MO, United States
| | - Benjamin Laws
- Donald Danforth Plant Science Center, St. Louis, MO, United States
| | - Christine Tiskevich
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Stephen P. Moose
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | | |
|
28
|
Hernández-Lemus E, Miramontes P, Martínez-García M. Topological Data Analysis in Cardiovascular Signals: An Overview. Entropy (Basel) 2024; 26:67. [PMID: 38248193 PMCID: PMC10814033 DOI: 10.3390/e26010067] [Received: 12/01/2023] [Revised: 01/04/2024] [Accepted: 01/10/2024] [Indexed: 01/23/2024]
Abstract
Topological data analysis (TDA) is a recent approach for analyzing and interpreting complex data sets based on ideas from a branch of mathematics called algebraic topology. TDA has proven useful to disentangle non-trivial data structures in a broad range of data analytics problems including the study of cardiovascular signals. Here, we aim to provide an overview of the application of TDA to cardiovascular signals and its potential to enhance the understanding of cardiovascular diseases and their treatment in the form of a literature or narrative review. We first introduce the concept of TDA and its key techniques, including persistent homology, Mapper, and multidimensional scaling. We then discuss the use of TDA in analyzing various cardiovascular signals, including electrocardiography, photoplethysmography, and arterial stiffness. We also discuss the potential of TDA to improve the diagnosis and prognosis of cardiovascular diseases, as well as its limitations and challenges. Finally, we outline future directions for the use of TDA in cardiovascular signal analysis and its potential impact on clinical practice. Overall, TDA shows great promise as a powerful tool for the analysis of complex cardiovascular signals and may offer significant insights into the understanding and management of cardiovascular diseases.
Affiliation(s)
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City 14610, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
| | - Pedro Miramontes
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City 14610, Mexico
- Department of Mathematics, Sciences School, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
| | | |
|
29
|
Bhattacharya A, Mondal S, De S, Mukhopadhyay A, Sen S. Lean blowout detection using topological data analysis. Chaos 2024; 34:013102. [PMID: 38170473 DOI: 10.1063/5.0156500] [Received: 04/30/2023] [Accepted: 12/01/2023] [Indexed: 01/05/2024]
Abstract
Modern lean premixed combustors are operated in ultra-lean mode to conform to strict emission norms. However, this causes the combustors to become prone to lean blowout (LBO). Online monitoring of combustion dynamics may help to avoid LBO and help the combustor run more safely and reliably. Previous studies have suggested various techniques for early prediction of LBO in single-burner combustors. In contrast, early detection of LBO in multi-burner combustors has been little explored to date. Recent studies have discovered significantly different combustion dynamics between multi-burner combustors and single-burner combustors. In the present paper, we show that some well-established early LBO detection techniques suitable for single-burner combustors are less effective at early detection of LBO in multi-burner combustors. To resolve this, we propose a novel tool, topological data analysis (TDA), for real-time LBO prediction in a wide range of combustor configurations. We find that the TDA metrics are computationally cheap and follow monotonic trends during the transition to LBO. This indicates that the TDA metrics can be used to fine-tune the LBO safety margin, which is a desirable feature from a practical implementation point of view. Furthermore, we show that the sublevel set TDA metrics show approximately monotonic changes during the transition to LBO even with low sampling-rate signals. Sublevel set TDA is computationally inexpensive and does not require phase-space embedding. Therefore, TDA can potentially be used for real-time monitoring of combustor dynamics with simple, low-cost, and low sampling-rate sensors.
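As a rough illustration of why sublevel set TDA needs no phase-space embedding, the sketch below computes 0-dimensional sublevel-set persistence directly from a sampled 1D signal with a union-find pass. It is a generic textbook construction, not the authors' implementation; the function names `sublevel_persistence_0d` and `total_persistence`, and the choice of total persistence as a summary metric, are assumptions made for this example.

```python
def sublevel_persistence_0d(signal):
    """Return (birth, death) pairs of 0-dim sublevel-set persistence of a 1D signal."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])  # sweep values from low to high
    parent = [None] * n   # None means the sample has not entered the sublevel set yet
    birth = {}            # root index -> value at which its component was born
    pairs = []

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at the merge value.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < signal[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((min(signal), float("inf")))
    return pairs


def total_persistence(pairs):
    """Sum of finite lifetimes: one simple scalar summary (an assumed metric)."""
    return sum(d - b for b, d in pairs if d != float("inf"))


example_pairs = sublevel_persistence_0d([0, 2, 1, 3, 0.5])
example_total = total_persistence(example_pairs)
```

Each finite pair matches a local minimum with the local maximum at which its basin merges into an older one, so the cost is dominated by a single sort of the samples.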
Affiliation(s)
- Arijit Bhattacharya
- Department of Mechanical Engineering, Institute of Engineering and Management, Kolkata 700091, India
- Department of Mechanical Engineering, Jadavpur University, Kolkata 700032, India
| | - Sabyasachi Mondal
- Department of Mechanical Engineering, Jadavpur University, Kolkata 700032, India
| | - Somnath De
- Department of Aerospace Engineering, Indian Institute of Technology Madras, Chennai 600036, India
| | | | - Swarnendu Sen
- Department of Mechanical Engineering, Jadavpur University, Kolkata 700032, India
| |
|
30
|
Guzmán-Vargas L, Zabaleta-Ortega A, Guzmán-Sáenz A. Simplicial complex entropy for time series analysis. Sci Rep 2023; 13:22696. [PMID: 38123652 PMCID: PMC10733285 DOI: 10.1038/s41598-023-49958-6] [Received: 10/13/2023] [Accepted: 12/13/2023] [Indexed: 12/23/2023] Open
Abstract
The complex behavior of many systems in nature requires the application of robust methodologies capable of identifying changes in their dynamics. In the case of time series (which are sensed values of a system during a time interval), several methods have been proposed to evaluate their irregularity. However, for some types of dynamics such as stochastic and chaotic, new approaches are required that can provide a better characterization of them. In this paper we present the simplicial complex approximate entropy, which is based on the conditional probability of the occurrence of elements of a simplicial complex. Our results show that this entropy measure provides a wide range of values with details not easily identifiable with standard methods. In particular, we show that our method is able to quantify the irregularity in simulated random sequences and those from low-dimensional chaotic dynamics. Furthermore, it is possible to consistently differentiate cardiac interbeat sequences from healthy subjects and from patients with heart failure, as well as to identify changes between dynamical states of coupled chaotic maps. Our results highlight the importance of the structures revealed by the simplicial complexes, which holds promise for applications of this approach in various contexts.
Affiliation(s)
- Lev Guzmán-Vargas
- Unidad Profesional Interdisciplinaria en Ingeniería y Tecnologías Avanzadas, Instituto Politécnico Nacional, 07340, Mexico City, Mexico.
| | - Alvaro Zabaleta-Ortega
- Unidad Profesional Interdisciplinaria en Ingeniería y Tecnologías Avanzadas, Instituto Politécnico Nacional, 07340, Mexico City, Mexico
| | - Aldo Guzmán-Sáenz
- Topological Data Analysis in Genomics, Thomas J. Watson Research Center, Yorktown Heights, NY, USA
| |
|
31
|
Madeleine T, Podoliak N, Buchnev O, Membrillo Solis I, Orlova T, van Rossem M, Kaczmarek M, D’Alessandro G, Brodzki J. Topological Learning for the Classification of Disorder: An Application to the Design of Metasurfaces. ACS Nano 2023; 18. [PMID: 38108267 PMCID: PMC10796169 DOI: 10.1021/acsnano.3c08776] [Received: 09/13/2023] [Revised: 12/01/2023] [Accepted: 12/06/2023] [Indexed: 12/19/2023]
Abstract
Structural disorder can improve the optical properties of metasurfaces, whether it is emerging from some large-scale fabrication methods or explicitly designed and built lithographically. For example, correlated disorder, induced by a minimum inter-nanostructure distance or by hyperuniformity properties, is particularly beneficial for light extraction. Inspired by topology, we introduce numerical descriptors to provide quantitative measures of disorder with universal properties, suitable to treat both uncorrelated and correlated disorder at all length scales. The accuracy of these topological descriptors is illustrated both theoretically and experimentally by using them to design plasmonic metasurfaces with controlled disorder that we then correlate to the strength of their surface lattice resonances. These descriptors are an example of topological tools that can be used for the fast and accurate design of disordered structures or as an aid in improving their fabrication methods.
Affiliation(s)
- Tristan Madeleine
- Mathematical Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | - Nina Podoliak
- Physics and Astronomy, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | - Oleksandr Buchnev
- Optoelectronics Research Centre and Centre for Photonic Metamaterials, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | | | - Tetiana Orlova
- Physics and Astronomy, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Infochemistry Scientific Center, ITMO University, 9 Lomonosova Street, Saint-Petersburg, 191002, Russia
| | - Maria van Rossem
- Physics and Astronomy, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | - Malgosia Kaczmarek
- Physics and Astronomy, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | | | - Jacek Brodzki
- Mathematical Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
| |
|
32
|
El-Yaagoubi AB, Chung MK, Ombao H. Statistical inference for dependence networks in topological data analysis. Front Artif Intell 2023; 6:1293504. [PMID: 38156039 PMCID: PMC10752923 DOI: 10.3389/frai.2023.1293504] [Received: 09/13/2023] [Accepted: 11/22/2023] [Indexed: 12/30/2023] Open
Abstract
Topological data analysis (TDA) provides tools that are becoming increasingly popular for analyzing multivariate time series data. One key aspect in analyzing multivariate time series is dependence between components. One application is in brain signal analysis. In particular, various dependence patterns in brain networks may be linked to specific tasks and cognitive processes. These dependence patterns may be altered by various neurological and cognitive impairments such as Alzheimer's and Parkinson's diseases, as well as attention deficit hyperactivity disorder (ADHD). Because there is no ground-truth with known dependence patterns in real brain signals, testing new TDA methods on multivariate time series is still a challenge. Our goal here is to develop novel statistical inference procedures via simulations. Simulations are useful for generating some null distributions of a test statistic (for hypothesis testing), forming confidence regions, and for evaluating the performance of proposed TDA methods. To the best of our knowledge, there are no methods that simulate multivariate time series data with potentially complex user-specified connectivity patterns. In this paper we present a novel approach to simulate multivariate time series with a specific number of cycles/holes in its dependence network. Furthermore, we also provide a procedure for generating higher dimensional topological features.
Affiliation(s)
- Anass B. El-Yaagoubi
- Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Moo K. Chung
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States
| | - Hernando Ombao
- Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| |
|
33
|
Panconi L, Tansell A, Collins AJ, Makarova M, Owen DM. Three-dimensional topology-based analysis segments volumetric and spatiotemporal fluorescence microscopy. Biol Imaging 2023; 4:e1. [PMID: 38516632 PMCID: PMC10951800 DOI: 10.1017/s2633903x23000260] [Received: 08/02/2023] [Revised: 11/13/2023] [Accepted: 12/01/2023] [Indexed: 03/23/2024]
Abstract
Image analysis techniques provide objective and reproducible statistics for interpreting microscopy data. At higher dimensions, three-dimensional (3D) volumetric and spatiotemporal data highlight additional properties and behaviors beyond the static 2D focal plane. However, increased dimensionality carries increased complexity, and existing techniques for general segmentation of 3D data are either primitive, or highly specialized to specific biological structures. Borrowing from the principles of 2D topological data analysis (TDA), we formulate a 3D segmentation algorithm that implements persistent homology to identify variations in image intensity. From this, we derive two separate variants applicable to spatial and spatiotemporal data, respectively. We demonstrate that this analysis yields both sensitive and specific results on simulated data and can distinguish prominent biological structures in fluorescence microscopy images, regardless of their shape. Furthermore, we highlight the efficacy of temporal TDA in tracking cell lineage and the frequency of cell and organelle replication.
Affiliation(s)
- Luca Panconi
- Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
| | - Amy Tansell
- College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
- School of Mathematics, University of Birmingham, Birmingham, UK
| | | | - Maria Makarova
- School of Biosciences, College of Life and Environmental Science, University of Birmingham, Birmingham, UK
- Institute of Metabolism and Systems Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
| | - Dylan M. Owen
- Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- Centre of Membrane Proteins and Receptors, University of Birmingham, Birmingham, UK
- School of Mathematics, University of Birmingham, Birmingham, UK
| |
|
34
|
Arango AS, Park H, Tajkhorshid E. Topological Learning Approach to Characterizing Biological Membranes. bioRxiv 2023:2023.11.28.569053. [PMID: 38076911 PMCID: PMC10705453 DOI: 10.1101/2023.11.28.569053] [Indexed: 12/20/2023]
Abstract
Biological membranes play key roles in cellular compartmentalization, structure, and signaling pathways. At varying temperatures, individual membrane lipids sample from different configurations, a process that frequently leads to higher-order phase behavior and phenomena. Here we present a persistent homology-based method for quantifying the structural features of individual and bulk lipids, providing local and contextual information on lipid tail organization. Our method leverages the mathematical machinery of algebraic topology and machine learning to infer temperature-dependent structural information of lipids from static coordinates. To train our model, we generated multiple molecular dynamics trajectories of DPPC membranes at varying temperatures. A fingerprint was then constructed for each set of lipid coordinates by a persistent homology filtration, in which interaction spheres were grown around the lipid atoms while tracking their intersections. The sphere filtration formed a simplicial complex that captures enduring key topological features of the configuration landscape, using homology, yielding persistence data. Following fingerprint extraction for physiologically relevant temperatures, the persistence data were used to train an attention-based neural network for assignment of effective temperature values to selected membrane regions. Our persistent homology-based method captures the local structural effects, via effective temperature, of lipids adjacent to other membrane constituents, e.g. sterols and proteins. This topological learning approach can predict lipid effective temperatures from static coordinates across multiple spatial resolutions. The tool, called MembTDA, can be accessed at https://github.com/hyunp2/Memb-TDA.
Affiliation(s)
- Andres S Arango
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Hyun Park
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Emad Tajkhorshid
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| |
|
35
|
Zabaleta-Ortega A, Masoller C, Guzmán-Vargas L. Topological data analysis of the synchronization of a network of Rössler chaotic electronic oscillators. Chaos 2023; 33:113110. [PMID: 37921586 DOI: 10.1063/5.0167523] [Received: 07/13/2023] [Accepted: 10/13/2023] [Indexed: 11/04/2023]
Abstract
The study of synchronization allows a better understanding of the exchange of information among systems. In this work, we study experimental data recorded from a set of Rössler-like chaotic electronic oscillators arranged in a complex network, where the interactions between the oscillators are given in terms of a connectivity matrix, and their intensity is controlled by a global coupling parameter. We use the zero and one persistent homology groups to characterize the point clouds obtained from the signals recorded from pairs of oscillators. We show that the normalized persistent entropy (NPE) allows us to characterize the effective coupling between pairs of oscillators because it tends to increase with the coupling strength and to decrease with the distance between the oscillators. We also observed that pairs of oscillators that have similar degrees and are nearest neighbors tend to have higher NPE values than pairs with different degrees. However, large variability is found in the NPE values. Comparing the NPE behavior with that of the phase-locking value (PLV, commonly used to evaluate the synchronization of phase oscillators), we find that for large enough coupling, PLV only displays a monotonic increase, while NPE shows a richer behavior that captures variations in the behavior of the oscillators. This is because PLV only captures coupling-induced phase changes, while NPE also captures amplitude changes. Moreover, when we consider the same network but with Kuramoto phase oscillators, we also find that NPE captures the transition to synchronization (as it increases with the coupling strength), and it also decreases with the distance between the oscillators. Therefore, we propose NPE as a data analysis technique to try to differentiate pairs of oscillators that have strong effective coupling because they are first or near neighbors, from those that have weaker coupling because they are distant neighbors.
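The two quantities compared in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the normalization of persistent entropy by the logarithm of the number of bars is one common convention and is an assumption here, as are the function names. The PLV is computed in its usual form, the modulus of the mean unit phasor of the phase difference.

```python
import numpy as np

def normalized_persistent_entropy(diagram):
    """Shannon entropy of bar lifetimes, divided by log(number of bars).

    `diagram` is an array of (birth, death) pairs with finite deaths.
    Assumption: normalization by log(n); other conventions exist.
    """
    d = np.asarray(diagram, dtype=float).reshape(-1, 2)
    lifetimes = d[:, 1] - d[:, 0]
    lifetimes = lifetimes[lifetimes > 0]  # ignore zero-length bars
    if lifetimes.size < 2:
        return 0.0  # entropy of a single bar is zero
    p = lifetimes / lifetimes.sum()
    entropy = -np.sum(p * np.log(p))
    return float(entropy / np.log(lifetimes.size))

def phase_locking_value(phase_a, phase_b):
    """Modulus of the time-averaged unit phasor of the phase difference."""
    diff = np.asarray(phase_a, dtype=float) - np.asarray(phase_b, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * diff))))

# Bars of equal length give maximal normalized entropy.
npe_equal = normalized_persistent_entropy([[0, 1], [0, 1], [2, 3]])
# One dominant bar gives a value well below 1.
npe_skewed = normalized_persistent_entropy([[0, 100], [0, 1]])
# A constant phase lag is perfect phase locking.
t = np.linspace(0.0, 10.0, 500)
plv_locked = phase_locking_value(t, t + 0.7)
```

Because the PLV depends only on phases, rescaling a signal's amplitude leaves it unchanged, which is consistent with the abstract's remark that NPE can respond to amplitude changes that PLV misses.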
Affiliation(s)
- A Zabaleta-Ortega
- Unidad Profesional Interdisciplinaria en Ingeniería y Tecnologías Avanzadas, Instituto Politécnico Nacional, 07340 Ciudad de México, Mexico
| | - C Masoller
- Departament de Física, Universitat Politècnica de Catalunya, Rambla St. Nebridi 22, 08222 Terrassa, Spain
| | - L Guzmán-Vargas
- Unidad Profesional Interdisciplinaria en Ingeniería y Tecnologías Avanzadas, Instituto Politécnico Nacional, 07340 Ciudad de México, Mexico
| |
|
36
|
Wang H, Huang G, Zhao Z, Cheng L, Juncker-Jensen A, Nagy ML, Lu X, Zhang X, Chen DZ. CCF-GNN: A Unified Model Aggregating Appearance, Microenvironment, and Topology for Pathology Image Classification. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3179-3193. [PMID: 37027573 DOI: 10.1109/tmi.2023.3249343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/19/2023]
Abstract
Pathology images contain rich information of cell appearance, microenvironment, and topology features for cancer analysis and diagnosis. Among such features, topology becomes increasingly important in analysis for cancer immunotherapy. By analyzing geometric and hierarchically structured cell distribution topology, oncologists can identify densely-packed and cancer-relevant cell communities (CCs) for making decisions. Compared to commonly-used pixel-level Convolution Neural Network (CNN) features and cell-instance-level Graph Neural Network (GNN) features, CC topology features are at a higher level of granularity and geometry. However, topological features have not been well exploited by recent deep learning (DL) methods for pathology image classification due to lack of effective topological descriptors for cell distribution and gathering patterns. In this paper, inspired by clinical practice, we analyze and classify pathology images by comprehensively learning cell appearance, microenvironment, and topology in a fine-to-coarse manner. To describe and exploit topology, we design Cell Community Forest (CCF), a novel graph that represents the hierarchical formulation process of big-sparse CCs from small-dense CCs. Using CCF as a new geometric topological descriptor of tumor cells in pathology images, we propose CCF-GNN, a GNN model that successively aggregates heterogeneous features (e.g., appearance, microenvironment) from cell-instance-level, cell-community-level, into image-level for pathology image classification. Extensive cross-validation experiments show that our method significantly outperforms alternative methods on H&E-stained and immunofluorescence images for disease grading tasks with multiple cancer types. Our proposed CCF-GNN establishes a new topological data analysis (TDA) based method, which facilitates integrating multi-level heterogeneous features of point clouds (e.g., for cells) into a unified DL framework.
|
37
|
Gonzalez-Castillo J, Fernandez IS, Lam KC, Handwerker DA, Pereira F, Bandettini PA. Manifold learning for fMRI time-varying functional connectivity. Front Hum Neurosci 2023; 17:1134012. [PMID: 37497043 PMCID: PMC10366614 DOI: 10.3389/fnhum.2023.1134012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Accepted: 06/21/2023] [Indexed: 07/28/2023] Open
Abstract
Whole-brain functional connectivity (FC) measured with functional MRI (fMRI) evolves over time in meaningful ways at temporal scales going from years (e.g., development) to seconds [e.g., within-scan time-varying FC (tvFC)]. Yet, our ability to explore tvFC is severely constrained by its large dimensionality (several thousands). To overcome this difficulty, researchers often seek to generate low dimensional representations (e.g., 2D and 3D scatter plots) hoping those will retain important aspects of the data (e.g., relationships to behavior and disease progression). Limited prior empirical work suggests that manifold learning techniques (MLTs), namely those seeking to infer a low dimensional non-linear surface (i.e., the manifold) where most of the data lies, are good candidates for accomplishing this task. Here we explore this possibility in detail. First, we discuss why one should expect tvFC data to lie on a low dimensional manifold. Second, we estimate the intrinsic dimension (ID; i.e., minimum number of latent dimensions) of tvFC data manifolds. Third, we describe the inner workings of three state-of-the-art MLTs: Laplacian Eigenmaps (LEs), T-distributed Stochastic Neighbor Embedding (T-SNE), and Uniform Manifold Approximation and Projection (UMAP). For each method, we empirically evaluate its ability to generate neuro-biologically meaningful representations of tvFC data, as well as its robustness against hyper-parameter selection. Our results show that tvFC data has an ID that ranges between 4 and 26, and that ID varies significantly between rest and task states. We also show how all three methods can effectively capture subject identity and task being performed: UMAP and T-SNE can capture these two levels of detail concurrently, but LE could only capture one at a time. We observed substantial variability in embedding quality across MLTs, and within-MLT as a function of hyper-parameter selection. To help alleviate this issue, we provide heuristics that can inform future studies. Finally, we also demonstrate the importance of feature normalization when combining data across subjects and the role that temporal autocorrelation plays in the application of MLTs to tvFC data. Overall, we conclude that while MLTs can be useful to generate summary views of labeled tvFC data, their application to unlabeled data such as resting-state remains challenging.
Affiliation(s)
- Javier Gonzalez-Castillo
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Isabel S. Fernandez
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Ka Chun Lam
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD, United States
| | - Daniel A. Handwerker
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Francisco Pereira
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD, United States
| | - Peter A. Bandettini
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
- Functional Magnetic Resonance Imaging (FMRI) Core, National Institute of Mental Health, Bethesda, MD, United States
| |
|
38
|
Klaila G, Vutov V, Stefanou A. Supervised topological data analysis for MALDI mass spectrometry imaging applications. BMC Bioinformatics 2023; 24:279. [PMID: 37430224 DOI: 10.1186/s12859-023-05402-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Accepted: 06/26/2023] [Indexed: 07/12/2023] Open
Abstract
BACKGROUND: Matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI MSI) displays significant potential for applications in cancer research, especially in tumor typing and subtyping. Lung cancer is the primary cause of tumor-related deaths, where the most lethal entities are adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). Distinguishing between these two common subtypes is crucial for therapy decisions and successful patient management. RESULTS: We propose a new algebraic topological framework, which obtains intrinsic information from MALDI data and transforms it to reflect topological persistence. Our framework offers two main advantages. Firstly, topological persistence aids in distinguishing the signal from noise. Secondly, it compresses the MALDI data, saving storage space and reducing computational time for subsequent classification tasks. We present an algorithm that efficiently implements our topological framework, relying on a single tuning parameter. Afterwards, logistic regression and random forest classifiers are employed on the extracted persistence features, thereby accomplishing an automated tumor (sub-)typing process. To demonstrate the competitiveness of our proposed framework, we conduct experiments on a real-world MALDI dataset using cross-validation. Furthermore, we showcase the effectiveness of the single denoising parameter by evaluating its performance on synthetic MALDI images with varying levels of noise. CONCLUSION: Our empirical experiments demonstrate that the proposed algebraic topological framework successfully captures and leverages the intrinsic spectral information from MALDI data, leading to competitive results in classifying lung cancer subtypes. Moreover, the framework's ability to be fine-tuned for denoising highlights its versatility and potential for enhancing data analysis in MALDI applications.
Affiliation(s)
- Gideon Klaila
- Institute for Algebra, Geometry, Topology and their Applications (ALTA), University of Bremen, 28359, Bremen, Germany.
| | - Vladimir Vutov
- Institute for Statistics, University of Bremen, 28359, Bremen, Germany
| | - Anastasios Stefanou
- Institute for Algebra, Geometry, Topology and their Applications (ALTA), University of Bremen, 28359, Bremen, Germany
| |
|
39
|
Derwae H, Nijs M, Geysels A, Waelkens E, De Moor B. Spatiochemical Characterization of the Pancreas Using Mass Spectrometry Imaging and Topological Data Analysis. Anal Chem 2023. [PMID: 37402207 DOI: 10.1021/acs.analchem.2c05606] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/06/2023]
Abstract
Mass Spectrometry Imaging (MSI) is a technique used to identify the spatial distribution of molecules in tissues. An MSI experiment results in large amounts of high dimensional data, so efficient computational methods are needed to analyze the output. Topological Data Analysis (TDA) has proven to be effective in all kinds of applications. TDA focuses on the topology of the data in high dimensional space. Looking at the shape of a high dimensional data set can lead to new or different insights. In this work, we investigate the use of Mapper, a form of TDA, applied to MSI data. Mapper is used to find data clusters inside two healthy mouse pancreas data sets. The results are compared to previous work using UMAP for MSI data analysis on the same data sets. This work finds that the proposed technique discovers the same clusters in the data as UMAP and is also able to uncover new clusters, such as an additional ring structure inside the pancreatic islets and a better defined cluster containing blood vessels. The technique can be used for a large variety of data types and sizes and can be optimized for specific applications. It is also computationally similar to UMAP for clustering. Mapper is a very interesting method, especially for biomedical applications.
Affiliation(s)
- Helena Derwae
- STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
| | - Melanie Nijs
- STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
| | - Axel Geysels
- STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
| | - Etienne Waelkens
- Department of Cellular and Molecular Medicine, KU Leuven, 3001 Leuven, Belgium
| | - Bart De Moor
- STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
- Fellow IEEE, SIAM at STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, 3001 Leuven, Belgium
| |
|
40
|
Manjunath S, Perea JA, Sathyanarayana A. Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023; 2023:1-4. [PMID: 38083500 DOI: 10.1109/embc40787.2023.10340674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
Topological data analysis (TDA) is an emerging technique for biological signal processing. TDA leverages the invariant topological features of signals in a metric space for robust analysis of signals even in the presence of noise. In this paper, we leverage TDA on brain connectivity networks derived from electroencephalogram (EEG) signals to identify statistical differences between pediatric patients with obstructive sleep apnea (OSA) and pediatric patients without OSA. We leverage a large corpus of data, and show that TDA enables us to see a statistical difference between the brain dynamics of the two groups. Clinical relevance: This establishes the potential of topological data analysis as a tool to identify obstructive sleep apnea without requiring a full polysomnogram study, and provides an initial investigation towards easier and more scalable obstructive sleep apnea diagnosis.
|
41
|
Venkat A, Bhaskar D, Krishnaswamy S. Multiscale geometric and topological analyses for characterizing and predicting immune responses from single cell data. Trends Immunol 2023; 44:551-563. [PMID: 37301677 DOI: 10.1016/j.it.2023.05.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/02/2023] [Revised: 05/03/2023] [Accepted: 05/04/2023] [Indexed: 06/12/2023]
Abstract
Single cell genomics has revolutionized our ability to map immune heterogeneity and responses. With the influx of large-scale data sets from diverse modalities, the resolution achieved has supported the long-held notion that immune cells are naturally organized into hierarchical relationships, characterized at multiple levels. Such a multigranular structure corresponds to key geometric and topological features. Given that differences between an effective and ineffective immunological response may not be found at one level, there is vested interest in characterizing and predicting outcomes from such features. In this review, we highlight single cell methods and principles for learning geometric and topological properties of data at multiple scales, discussing their contributions to immunology. Ultimately, multiscale approaches go beyond classical clustering, revealing a more comprehensive picture of cellular heterogeneity.
Affiliation(s)
- Aarthi Venkat
- Computational Biology and Bioinformatics Program, Yale University, New Haven, CT, USA
| | | | - Smita Krishnaswamy
- Computational Biology and Bioinformatics Program, Yale University, New Haven, CT, USA; Department of Genetics, Yale University, New Haven, CT, USA; Department of Computer Science, Yale University, New Haven, CT, USA.
| |
|
42
|
Zhang M, Chowdhury S, Saggar M. Temporal Mapper: Transition networks in simulated and real neural dynamics. Netw Neurosci 2023; 7:431-460. [PMID: 37397880 PMCID: PMC10312258 DOI: 10.1162/netn_a_00301] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/29/2022] [Accepted: 12/07/2022] [Indexed: 07/26/2023] Open
Abstract
Characterizing large-scale dynamic organization of the brain relies on both data-driven and mechanistic modeling, which demand, respectively, a low and a high level of prior knowledge and assumptions about how constituents of the brain interact. However, the conceptual translation between the two is not straightforward. The present work aims to provide a bridge between data-driven and mechanistic modeling. We conceptualize brain dynamics as a complex landscape that is continuously modulated by internal and external changes. The modulation can induce transitions from one stable brain state (attractor) to another. Here, we provide a novel method, Temporal Mapper, built upon established tools from the field of topological data analysis to retrieve the network of attractor transitions from time series data alone. For theoretical validation, we use a biophysical network model to induce transitions in a controlled manner, which provides simulated time series equipped with a ground-truth attractor transition network. Our approach reconstructs the ground-truth transition network from simulated time series data better than existing time-varying approaches. For empirical relevance, we apply our approach to fMRI data gathered during a continuous multitask experiment. We found that occupancy of the high-degree nodes and cycles of the transition network was significantly associated with subjects' behavioral performance. Taken together, we provide an important first step toward integrating data-driven and mechanistic modeling of brain dynamics.
Affiliation(s)
- Mengsen Zhang
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
- Department of Psychiatry, University of North Carolina at Chapel Hill, NC, USA
| | - Samir Chowdhury
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - Manish Saggar
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| |
|
43
|
Ryu H, Habeck C, Stern Y, Lee S. Persistent homology-based functional connectivity and its association with cognitive ability: Life-span study. Hum Brain Mapp 2023; 44:3669-3683. [PMID: 37067099 PMCID: PMC10203816 DOI: 10.1002/hbm.26304] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/29/2022] [Revised: 03/10/2023] [Accepted: 03/25/2023] [Indexed: 04/18/2023] Open
Abstract
Brain-segregation attributes in resting-state functional networks have been widely investigated to understand cognition and cognitive aging using various approaches [e.g., average connectivity within/between networks and brain system segregation (BSS)]. While these approaches have assumed that resting-state functional networks operate in a modular structure, a complementary perspective assumes that a core-periphery or rich club structure accounts for brain functions where the hubs are tightly interconnected to each other to allow for integrated processing. In this article, we apply a novel method, persistent homology (PH), to develop an alternative to standard functional connectivity by quantifying the pattern of information during the integrated processing. We also investigate whether PH-based functional connectivity explains cognitive performance and compare the amount of variability in explaining cognitive performance for three sets of independent variables: (1) PH-based functional connectivity, (2) graph theory-based measures, and (3) BSS. Resting-state functional connectivity data were extracted from 279 healthy participants, and cognitive ability scores were generated in four domains (fluid reasoning, episodic memory, vocabulary, and processing speed). The results first highlight that the pattern of brain-information flow over whole brain regions (i.e., integrated processing) accounts for more variance of cognitive abilities than other methods. The results also show that fluid reasoning and vocabulary performance significantly decrease as the strength of the additional information flow on functional connectivity with the shortest path increases. While PH has been applied to functional connectivity analysis in recent studies, our results demonstrate the potential utility of PH-based functional connectivity in understanding cognitive function.
Affiliation(s)
- Hyunnam Ryu
- Cognitive Neuroscience Division of the Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Mental Health Data Science, New York State Psychiatric Institute, New York, New York, USA
| | - Christian Habeck
- Cognitive Neuroscience Division of the Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
| | - Yaakov Stern
- Cognitive Neuroscience Division of the Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
| | - Seonjoo Lee
- Mental Health Data Science, New York State Psychiatric Institute, New York, New York, USA
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York, USA
- Department of Psychiatry, Columbia University, New York, New York, USA
| |
|
44
|
Malek AA, Alias MA, Razak FA, Noorani MSM, Mahmud R, Zulkepli NFS. Persistent Homology-Based Machine Learning Method for Filtering and Classifying Mammographic Microcalcification Images in Early Cancer Detection. Cancers (Basel) 2023; 15:cancers15092606. [PMID: 37174071 PMCID: PMC10177619 DOI: 10.3390/cancers15092606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 03/23/2023] [Accepted: 03/30/2023] [Indexed: 05/15/2023] Open
Abstract
Microcalcifications in mammogram images are primary indicators for detecting the early stages of breast cancer. However, dense tissues and noise in the images make it challenging to classify the microcalcifications. Currently, preprocessing procedures such as noise removal techniques are applied directly on the images, which may produce a blurry effect and loss of image details. Further, most of the features used in classification models focus on local information of the images and are often burdened with details, resulting in data complexity. This research proposed a filtering and feature extraction technique using persistent homology (PH), a powerful mathematical tool used to study the structure of complex datasets and patterns. The filtering process is not performed directly on the image matrix but through the diagrams arising from PH. These diagrams will enable us to distinguish prominent characteristics of the image from noise. The filtered diagrams are then vectorised using PH features. Supervised machine learning models are trained on the MIAS and DDSM datasets to evaluate the extracted features' efficacy in discriminating between benign and malignant classes and to obtain the optimal filtering level. This study reveals that appropriate PH filtering levels and features can improve classification accuracy in early cancer detection.
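Filtering on the persistence diagram rather than on the image, as this abstract describes, can be sketched as discarding short-lived points and then summarising what remains as a feature vector. The threshold rule and the three summary features below are illustrative assumptions, not the filtering levels or PH features used in the paper; the function names are hypothetical.

```python
import numpy as np

def filter_diagram(diagram, min_persistence):
    """Keep only (birth, death) pairs whose lifetime is at least the threshold."""
    d = np.asarray(diagram, dtype=float).reshape(-1, 2)
    persistence = d[:, 1] - d[:, 0]
    return d[persistence >= min_persistence]

def summary_features(diagram):
    """Vectorise a diagram as [number of points, total lifetime, longest lifetime]."""
    d = np.asarray(diagram, dtype=float).reshape(-1, 2)
    persistence = d[:, 1] - d[:, 0]
    if persistence.size == 0:
        return np.zeros(3)  # empty diagram maps to the zero vector
    return np.array([persistence.size, persistence.sum(), persistence.max()])

# Toy diagram: one short-lived (noise-like) point and two longer-lived ones.
raw = [[0.0, 0.1], [0.0, 2.0], [1.0, 1.5]]
kept = filter_diagram(raw, min_persistence=0.5)
features = summary_features(kept)
```

A vector like `features` could then be passed to any supervised classifier; choosing the threshold is the tuning step the abstract refers to as finding the optimal filtering level.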
Affiliation(s)
- Aminah Abdul Malek
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Mathematical Sciences Studies, College of Computing, Informatics and Media, Universiti Teknologi MARA (UiTM) Negeri Sembilan Branch, Seremban Campus, Seremban 70300, Negeri Sembilan, Malaysia
| | - Mohd Almie Alias
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Centre for Modelling and Data Analysis (DELTA), Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
| | - Fatimah Abdul Razak
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Centre for Modelling and Data Analysis (DELTA), Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
| | - Mohd Salmi Md Noorani
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
| | - Rozi Mahmud
- Department of Radiology and Imaging, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia (UPM), Serdang 43400, Selangor, Malaysia
| | | |
|
45
|
Ji J, Venderley J, Zhang H, Lei M, Ruan G, Patel N, Chung YM, Giesting R, Miller L. Assessing nocturnal scratch with actigraphy in atopic dermatitis patients. NPJ Digit Med 2023; 6:72. [PMID: 37100893 PMCID: PMC10133290 DOI: 10.1038/s41746-023-00821-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 04/04/2023] [Indexed: 04/28/2023] Open
Abstract
Nocturnal scratch is one major factor leading to impaired quality of life in atopic dermatitis (AD) patients. Therefore, objectively quantifying nocturnal scratch events aids in assessing the disease state, treatment effect, and AD patients' quality of life. In this paper, we describe the use of actigraphy, highly predictive topological features, and a model-ensembling approach to develop an assessment of nocturnal scratch events by measuring scratch duration and intensity. Our assessment is tested in a clinical setting against the ground truth obtained from video recordings. The new approach addresses unmet challenges in existing studies, such as the lack of generalizability to real-world applications, the failure to capture finger scratches, and the limitations in the evaluation due to imbalanced data in the current literature. Furthermore, the performance evaluation shows agreement between derived digital endpoints and the video annotation ground truth, as well as patient-reported outcomes, which demonstrated the validity of the new assessment of nocturnal scratch.
Affiliation(s)
- Ju Ji
- Eli Lilly & Company, Inc., Indianapolis, IN, USA.
| | | | - Hui Zhang
- Eli Lilly & Company, Inc., Indianapolis, IN, USA
| | - Mengjue Lei
- Eli Lilly & Company, Inc., Indianapolis, IN, USA
| | | | - Neel Patel
- Eli Lilly & Company, Inc., Indianapolis, IN, USA
| | - Yu-Min Chung
- Eli Lilly & Company, Inc., Indianapolis, IN, USA
| | | | - Leah Miller
- Eli Lilly & Company, Inc., Indianapolis, IN, USA
| |
|
46
|
Weidner J, Neitzel C, Gote M, Deck J, Küntzelmann K, Pilarczyk G, Falk M, Hausmann M. Advanced image-free analysis of the nano-organization of chromatin and other biomolecules by Single Molecule Localization Microscopy (SMLM). Comput Struct Biotechnol J 2023; 21:2018-2034. [PMID: 36968017 PMCID: PMC10030913 DOI: 10.1016/j.csbj.2023.03.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 03/08/2023] [Accepted: 03/08/2023] [Indexed: 03/11/2023] Open
Abstract
The cell, as a system of many components governed by the laws of physics and chemistry, drives molecular functions that have an impact on the spatial organization of these systems and vice versa. Since the relationship between structure and function is an almost universal rule, and not only in biology, appropriate methods are required to parameterize the relationship between the structure and function of biomolecules and their networks, the mechanisms of the processes in which they are involved, and the mechanisms of regulation of these processes. Single molecule localization microscopy (SMLM), which we focus on here, offers a significant advantage for the quantitative parametrization of molecular organization: it provides matrices of coordinates of fluorescently labeled biomolecules that can be directly subjected to advanced mathematical analytical procedures without the need for laborious and sometimes misleading image processing. Here, we propose mathematical tools for comprehensive quantitative computer data analysis of SMLM point patterns that include Ripley distance frequency analysis, persistent homology analysis, persistent 'imaging', principal component analysis and co-localization analysis. The application of these methods is explained using artificial datasets simulating different, potentially possible and interpretatively important situations. Illustrative analyses of real complex biological SMLM data are presented to emphasize the applicability of the proposed algorithms. This manuscript demonstrates the extraction of features and parameters quantifying the influence of chromatin (re)organization on genome function, offering a novel approach to study chromatin architecture at the nanoscale. Moreover, the ability to adapt the proposed algorithms to analyze essentially any molecular organization, e.g., membrane receptors or protein trafficking in the cytosol, offers broad flexibility of use.
Affiliation(s)
- Jonas Weidner
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
| | - Charlotte Neitzel
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
| | - Martin Gote
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
| | - Jeanette Deck
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
| | - Kim Küntzelmann
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
| | - Götz Pilarczyk
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
| | - Martin Falk
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
- Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 612 00 Brno, Czech Republic
| | - Michael Hausmann
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
| |
|
47
|
Rather AA, Chachoo MA. Robust correlation estimation and UMAP assisted topological analysis of omics data for disease subtyping. Comput Biol Med 2023; 155:106640. [PMID: 36774889 DOI: 10.1016/j.compbiomed.2023.106640] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/07/2022] [Revised: 01/08/2023] [Accepted: 02/05/2023] [Indexed: 02/10/2023]
Abstract
Deciphering the information hidden in gene expression assays to identify disease subtypes is of significant importance in precision medicine. However, computational limitations thwart this process due to the intricacy of biological networks and the curse of dimensionality of gene expression data. Therefore, clustering in such scenarios often becomes the first choice of exploratory data analysis to identify natural structures and intrinsic patterns in the data. However, the sparse and high-dimensional nature of omics data prevents conventional clustering algorithms from discovering subtypes that are clinically relevant and statistically significant. Hence, coupling non-linear dimensionality reduction techniques with clustering often becomes imperative to improve the clustering results. In this study, we present a robust pipeline to discover disease subtypes with clinical relevance. Specifically, we focus on discovering patient sub-groups whose residual life patterns are remarkably different from those of other sub-groups. This is significant because, by refining prognosis, subtyping can reduce uncertainty in approximating a patient's expected outcome. The methodology presented is based on robust correlation estimation, UMAP (a non-linear dimensionality reduction method), and Mapper (a tool from topology). Notably, we suggest a method for improving the robustness of the correlation matrix of gene expression data in order to improve the clustering results. The performance of the model is evaluated by applying it to five cancer datasets obtained through TCGA, and comparisons are performed with the state-of-the-art methods NEMO, RSC-OTRI, and SNF with regard to the log-rank test and Restricted Life Expectancy Difference. For example, in the GBM dataset, the minimum separation between any two discovered subtypes is 221 days, which is significantly higher than that of the other methodologies. We also compared the results without using the robust correlation-based estimate and observed that robust correlation improves separability between survival curves significantly. From the results we infer that our methodology performs better than the other methodologies with regard to separating survival curves of patient sub-groups, despite using single-omics profiles of patients compared to the multiple-omics profiles used by SNF and NEMO. Pathway over-representation analysis is performed on the final clustering results to investigate the biological underpinnings characterizing each subtype.
Affiliation(s)
- Arif Ahmad Rather
- Department of Computer Sciences, University of Kashmir, Srinagar, JK, India.
| | | |
|
48
|
Ye X, Sun F, Xiang S. TREPH: A Plug-In Topological Layer for Graph Neural Networks. Entropy (Basel) 2023; 25:331. [PMID: 36832697 PMCID: PMC9954936 DOI: 10.3390/e25020331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 02/04/2023] [Accepted: 02/08/2023] [Indexed: 06/18/2023]
Abstract
Topological Data Analysis (TDA) is an approach to analyzing the shape of data using techniques from algebraic topology. The staple of TDA is Persistent Homology (PH). Recent years have seen a trend of combining PH and Graph Neural Networks (GNNs) in an end-to-end manner to capture topological features from graph data. Though effective, these methods are limited by the shortcomings of PH: incomplete topological information and irregular output format. Extended Persistent Homology (EPH), a variant of PH, addresses these problems elegantly. In this paper, we propose a plug-in topological layer for GNNs, termed Topological Representation with Extended Persistent Homology (TREPH). Taking advantage of the uniformity of EPH, a novel aggregation mechanism is designed to collate topological features of different dimensions to the local positions determining their living processes. The proposed layer is provably differentiable and more expressive than PH-based representations, which in turn are strictly stronger than message-passing GNNs in expressive power. Experiments on real-world graph classification tasks demonstrate the competitiveness of TREPH compared with state-of-the-art approaches.
Affiliation(s)
- Xue Ye
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 101408, China
| | - Fang Sun
- School of Mathematical Sciences, Capital Normal University, Beijing 100048, China
| | - Shiming Xiang
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 101408, China
| |
|
49
|
Gonzalez-Castillo J, Fernandez I, Lam KC, Handwerker DA, Pereira F, Bandettini PA. Manifold Learning for fMRI time-varying FC. bioRxiv 2023:2023.01.14.523992. [PMID: 36789436 PMCID: PMC9928030 DOI: 10.1101/2023.01.14.523992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Whole-brain functional connectivity (FC) measured with functional MRI (fMRI) evolves over time in meaningful ways at temporal scales going from years (e.g., development) to seconds (e.g., within-scan time-varying FC (tvFC)). Yet, our ability to explore tvFC is severely constrained by its large dimensionality (several thousands). To overcome this difficulty, researchers seek to generate low dimensional representations (e.g., 2D and 3D scatter plots) expected to retain its most informative aspects (e.g., relationships to behavior, disease progression). Limited prior empirical work suggests that manifold learning techniques (MLTs), namely those seeking to infer a low dimensional non-linear surface (i.e., the manifold) where most of the data lies, are good candidates for accomplishing this task. Here we explore this possibility in detail. First, we discuss why one should expect tvFC data to lie on a low dimensional manifold. Second, we estimate the intrinsic dimension (ID; i.e., the minimum number of latent dimensions) of tvFC data manifolds. Third, we describe the inner workings of three state-of-the-art MLTs: Laplacian Eigenmaps (LE), T-distributed Stochastic Neighbor Embedding (T-SNE), and Uniform Manifold Approximation and Projection (UMAP). For each method, we empirically evaluate its ability to generate neuro-biologically meaningful representations of tvFC data, as well as its robustness against hyper-parameter selection. Our results show that tvFC data has an ID that ranges between 4 and 26, and that ID varies significantly between rest and task states. We also show how all three methods can effectively capture subject identity and the task being performed: UMAP and T-SNE can capture these two levels of detail concurrently, but LE could only capture one at a time. We observed substantial variability in embedding quality across MLTs, and within each MLT as a function of hyper-parameter selection. To help alleviate this issue, we provide heuristics that can inform future studies. Finally, we also demonstrate the importance of feature normalization when combining data across subjects and the role that temporal autocorrelation plays in the application of MLTs to tvFC data. Overall, we conclude that while MLTs can be useful to generate summary views of labeled tvFC data, their application to unlabeled data such as resting-state remains challenging.
Affiliation(s)
| | - Isabel Fernandez
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD
| | - Ka Chun Lam
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD
| | - Daniel A Handwerker
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD
| | - Francisco Pereira
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD
| | - Peter A Bandettini
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD
- FMRI Core, National Institute of Mental Health, Bethesda, MD
| |
|
50
|
De Lara MLD. Persistent homology classification algorithm. PeerJ Comput Sci 2023; 9:e1195. [PMID: 37346603 PMCID: PMC10280283 DOI: 10.7717/peerj-cs.1195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 12/01/2022] [Indexed: 06/23/2023]
Abstract
Data classification is an important aspect of machine learning, as it is utilized to solve issues in a wide variety of contexts. There are numerous classifiers, but there is no single best-performing classifier for all types of data, as the no free lunch theorem implies. Topological data analysis is an emerging topic concerned with the shape of data. One of the key tools in this field for analyzing the shape or topological properties of a dataset is persistent homology, an algebraic topology-based method for estimating the topological features of a space of points that persist across several resolutions. This study proposes a supervised learning classification algorithm that makes use of persistent homology between training data classes in the form of persistence diagrams to predict the output category of new observations. Validation of the developed algorithm was performed on real-world and synthetic datasets. The performance of the proposed classification algorithm on these datasets was compared to that of the most widely used classifiers. Validation runs demonstrated that the proposed persistent homology classification algorithm performed on par with, if not better than, the majority of classifiers considered.
Affiliation(s)
- Mark Lexter D. De Lara
- Institute of Mathematical Sciences and Physics, College of Arts and Sciences, University of the Philippines Los Baños, College, Los Baños, Laguna, Philippines
- Institute of Mathematics, University of the Philippines Diliman, Quezon City, Metro Manila, Philippines
| |
|