1
Li Q, Nichols C, Welner RS, Chen JY, Ku WS, Yue Z. Toden-E: Topology-Based and Density-Based Ensembled Clustering for the Development of Super-PAG in Functional Genomics using PAG Network and LLM. bioRxiv 2024:2024.10.20.619308. [PMID: 39484450 PMCID: PMC11526983 DOI: 10.1101/2024.10.20.619308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
The integrative analysis of gene sets, networks, and pathways is pivotal for deciphering omics data in translational biomedical research. To significantly increase gene coverage and enhance the utility of pathways, annotated gene lists, and gene signatures from diverse sources, we introduced pathways, annotated gene lists, and gene signatures (PAGs) enriched with metadata to represent biological functions. Furthermore, we established PAG-PAG networks by leveraging gene member similarity and gene regulations. However, in practice, high similarity in functional descriptions or gene membership often leads to redundant PAGs, hindering the interpretation from a fuzzy enriched PAG list. In this study, we developed todenE (topology-based and density-based ensemble) clustering, pioneering in integrating topology-based and density-based clustering methods to detect PAG communities leveraging the PAG network and Large Language Models (LLM). In computational genomics annotation, the genes can be grouped/clustered through the gene relationships and gene functions via guilt by association. Similarly, PAGs can be grouped into higher-level clusters, forming concise functional representations called Super-PAGs. TodenE captures PAG-PAG similarity and encapsulates functional information through LLM, in characterizing network-based functional Super-PAGs. In synthetic data, we introduced a metric called the Disparity Index (DI), measuring the connectivity of gene neighbors to gauge clusterability. We compared multiple clustering algorithms to identify the best method for generating performance-driven clusters. In non-simulated data (Gene Ontology), by leveraging transfer learning and LLM, we formed a language-based similarity embedding. TodenE utilizes this embedding together with the topology-based embedding to generate putative Super-PAGs with superior performance in semantic and gene member inclusiveness.
3
Laufer VA, Tiwari HK, Reynolds RJ, Danila MI, Wang J, Edberg JC, Kimberly RP, Kottyan LC, Harley JB, Mikuls TR, Gregersen PK, Absher DM, Langefeld CD, Arnett DK, Bridges SL. Genetic influences on susceptibility to rheumatoid arthritis in African-Americans. Hum Mol Genet 2020; 28:858-874. [PMID: 30423114 DOI: 10.1093/hmg/ddy395] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 08/21/2018] [Revised: 11/05/2018] [Accepted: 11/09/2018] [Indexed: 12/29/2022]
Abstract
Large meta-analyses of rheumatoid arthritis (RA) susceptibility in European (EUR) and East Asian (EAS) populations have identified >100 RA risk loci, but genome-wide studies of RA in African-Americans (AAs) are absent. To address this disparity, we performed an analysis of 916 AA RA patients and 1392 controls and aggregated our data with genotyping data from >100 000 EUR and Asian RA patients and controls. We identified two novel risk loci that appear to be specific to AAs: GPC5 and RBFOX1 (P_AA < 5 × 10⁻⁹). Most RA risk loci are shared across different ethnicities, but among discordant loci, we observed strong enrichment of variants having large effect sizes. We found strong evidence of effect concordance for only 3 of the 21 largest effect index variants in EURs. We used the trans-ethnic fine-mapping algorithm PAINTOR3 to prioritize risk variants in >90 RA risk loci. Addition of AA data to those of EUR and EAS descent enabled identification of seven novel high-confidence candidate pathogenic variants (defined by posterior probability > 0.8). In summary, our trans-ethnic analyses are the first to include AAs, identify several new RA risk loci and point to candidate pathogenic variants that may underlie this common autoimmune disease. These findings may lead to better ways to diagnose or stratify treatment approaches in RA.
Affiliation(s)
- Vincent A Laufer
  - Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Hemant K Tiwari
  - Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
- Richard J Reynolds
  - Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Maria I Danila
  - Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Jelai Wang
  - Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
- Jeffrey C Edberg
  - Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Robert P Kimberly
  - Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Leah C Kottyan
  - Center for Autoimmune Genetics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- John B Harley
  - Center for Autoimmune Genetics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - United States Department of Veterans Affairs Medical Center, Cincinnati, OH, USA
- Ted R Mikuls
  - VA Nebraska-Western Iowa Health Care System and the Department of Internal Medicine, University of Nebraska Medical Center, Omaha, NE, USA
- Peter K Gregersen
  - Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore-LIJ Health System, Manhasset, NY, USA
- Devin M Absher
  - Hudson Alpha Institute for Biotechnology, Huntsville, AL, USA
- Carl D Langefeld
  - Department of Biostatistical Sciences, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Donna K Arnett
  - University of Kentucky College of Public Health, Lexington, KY, USA
- S Louis Bridges
  - Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
4
Plant D, Barton A. Adding value to real-world data: the role of biomarkers. Rheumatology (Oxford) 2020; 59:31-38. [PMID: 31329972 PMCID: PMC6909909 DOI: 10.1093/rheumatology/kez113] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/18/2018] [Revised: 02/28/2019] [Indexed: 12/13/2022]
Abstract
Adding biomarker information to real world datasets (e.g. biomarker data collected into disease/drug registries) can enhance mechanistic understanding of intra-patient differences in disease trajectories and differences in important clinical outcomes. Biomarkers can detect pathologies present early in disease potentially paving the way for preventative intervention strategies, which may help patients to avoid disability, poor treatment outcome, disease sequelae and premature mortality. However, adding biomarker data to real world datasets comes with a number of important challenges including sample collection and storage, study design and data analysis and interpretation. In this narrative review we will consider the benefits and challenges of adding biomarker data to real world datasets and discuss how biomarker data have added to our understanding of complex diseases, focusing on rheumatoid arthritis.
Affiliation(s)
- Darren Plant
  - Manchester Academic Health Science Centre, The University of Manchester, Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, UK
  - Manchester Academic Health Science Centre, NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester, UK
- Anne Barton
  - Manchester Academic Health Science Centre, The University of Manchester, Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, UK
  - Manchester Academic Health Science Centre, NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester, UK
5
Wermuth PJ, Piera-Velazquez S, Rosenbloom J, Jimenez SA. Existing and novel biomarkers for precision medicine in systemic sclerosis. Nat Rev Rheumatol 2019; 14:421-432. [PMID: 29789665 DOI: 10.1038/s41584-018-0021-9] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 02/07/2023]
Abstract
The discovery and validation of biomarkers resulting from technological advances in the analysis of genomic, transcriptomic, lipidomic and metabolomic pathways involved in the pathogenesis of complex human diseases have led to the development of personalized and rationally designed approaches for the clinical management of such disorders. Although some of these approaches have been applied to systemic sclerosis (SSc), an unmet need remains for validated, non-invasive biomarkers to aid in the diagnosis of SSc, as well as in the assessment of disease progression and response to therapeutic interventions. Advances in global transcriptomic technology over the past 15 years have enabled the assessment of microRNAs that circulate in the blood of patients and the analysis of the macromolecular content of a diverse group of lipid bilayer membrane-enclosed extracellular vesicles, such as exosomes and other microvesicles, which are released by all cells into the extracellular space and circulation. Such advances have provided new opportunities for the discovery of biomarkers in SSc that could potentially be used to improve the design and evaluation of clinical trials and that will undoubtedly enable the development of personalized and individualized medicine for patients with SSc.
Affiliation(s)
- Peter J Wermuth
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - The Joan and Joel Rosenbloom Center for Fibrosis Research, Thomas Jefferson University, Philadelphia, PA, USA
- Sonsoles Piera-Velazquez
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - The Joan and Joel Rosenbloom Center for Fibrosis Research, Thomas Jefferson University, Philadelphia, PA, USA
- Joel Rosenbloom
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - The Joan and Joel Rosenbloom Center for Fibrosis Research, Thomas Jefferson University, Philadelphia, PA, USA
- Sergio A Jimenez
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - The Joan and Joel Rosenbloom Center for Fibrosis Research, Thomas Jefferson University, Philadelphia, PA, USA
  - The Scleroderma Center, Thomas Jefferson University, Philadelphia, PA, USA