1
Ahmed MH, Samia NSN, Singh G, Gupta V, Mishal MFM, Hossain A, Suman KH, Raza A, Dutta AK, Labony MA, Sultana J, Faysal EH, Alnasser SM, Alam P, Azam F. An immuno-informatics approach for annotation of hypothetical proteins and multi-epitope vaccine designed against the Mpox virus. J Biomol Struct Dyn 2024; 42:5288-5307. [PMID: 37519185 DOI: 10.1080/07391102.2023.2239921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Accepted: 06/09/2023] [Indexed: 08/01/2023]
Abstract
A worrying new outbreak of Monkeypox (Mpox) in humans is caused by the Mpox virus (MpoxV). The pathogen has roughly 28 hypothetical proteins of unknown structure, function, and pathogenicity. Using reliable bioinformatics tools, we attempted to analyze the MpoxV genome, identify the role of hypothetical proteins (HPs), and design a potential candidate vaccine. Out of 28, we identified seven hypothetical proteins using multi-server validation with high confidence for the occurrence of conserved domains. Their physical, chemical, and functional characterizations, including molecular weight, theoretical isoelectric point, 3D structures, GRAVY value, subcellular localization, functional motifs, antigenicity, and virulence factors, were performed. We predicted possible cytotoxic T cell (CTL), helper T cell (HTL), and linear and conformational B cell epitopes, which were combined in a 219 amino acid multi-epitope vaccine with human β defensin as a linker. This multi-epitope vaccine was structurally modelled and docked with toll-like receptor-3 (TLR-3). The vaccine-TLR-3 docked complexes exhibited stable interactions based on RMSD and RMSF analyses. Additionally, the modelled vaccine was cloned in silico in an E. coli host to check expression of the final vaccine construct. Our results point to a potentially immunogenic and safe vaccine candidate, which would require further experimental validation. Communicated by Ramaswamy H. Sarma.
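The GRAVY value listed among the physicochemical characterizations can be illustrated with a short sketch. GRAVY (grand average of hydropathy) is conventionally the mean Kyte-Doolittle hydropathy over all residues of a sequence. The abstract does not name the tool used, so the function name `gravy` and the example peptide are illustrative assumptions, not the authors' pipeline.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
# (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    peptide = "MKTLLV"  # hypothetical peptide, not taken from the paper
    print(f"GRAVY({peptide}) = {gravy(peptide):.3f}")
```

A positive GRAVY indicates an overall hydrophobic sequence and a negative value an overall hydrophilic one.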
Affiliation(s)
- Md Hridoy Ahmed
  - Department of Genetic Engineering and Biotechnology, University of Chittagong, Chittagong, Bangladesh
- Nure Sharaf Nower Samia
  - Department of Life Sciences (DLS), School of Environment and Life Sciences (SELS), Independent University, Dhaka, Bangladesh
- Gagandeep Singh
  - Kusuma School of Biological Sciences, Indian Institute of Technology, New Delhi, India
  - Section of Microbiology, Central Ayurveda Research Institute, Jhansi CCRAS, Ministry of Ayush, India
- Vandana Gupta
  - Department of Microbiology, Ram Lal Anand College, University of Delhi, New Delhi, India
- Alomgir Hossain
  - Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, Bangladesh
- Adnan Raza
  - Bioscience department, COMSATS University of Islamabad, Islamabad, Pakistan
- Amit Kumar Dutta
  - Department of Microbiology, University of Rajshahi, Rajshahi, Bangladesh
- Moriom Akhter Labony
  - Department of Genetic Engineering and Biotechnology, University of Chittagong, Chittagong, Bangladesh
- Jakia Sultana
  - Department of Botany, University of Rajshahi, Rajshahi, Bangladesh
- Sulaiman Mohammed Alnasser
  - Department of Pharmacology and Toxicology, Unaizah College of Pharmacy, Qassim University, Buraydah, Saudi Arabia
- Prawez Alam
  - Department of Pharmacognosy, College of Pharmacy, Prince Sattam Bin Abdulaziz University, Al Kharj, Saudi Arabia
- Faizul Azam
  - Department of Pharmaceutical Chemistry and Pharmacognosy, Unaizah College of Pharmacy, Qassim University, Buraydah, Saudi Arabia
2
Yamaguchi T, Ikegami M, Aruga T, Kanemasa Y, Horiguchi SI, Kawai K, Takao M, Yamada T, Ishida H. Genomic landscape of comprehensive genomic profiling in patients with malignant solid tumors in Japan. Int J Clin Oncol 2024:10.1007/s10147-024-02554-8. [PMID: 38795236 DOI: 10.1007/s10147-024-02554-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 05/14/2024] [Indexed: 05/27/2024]
Abstract
BACKGROUND Comprehensive genomic profiling (CGP) can aid the discovery of clinically useful, candidate antitumor agents; however, the variant annotations sometimes differ among the various types of CGP tests as well as the public database. The aim of this study is to clarify the genomic landscape of evaluating detected variants in patients with a malignant solid tumor. METHODS The present, cross-sectional study used data from 57,084 patients with a malignant solid tumor who underwent CGP at the Center for Cancer Genomics and Advanced Therapeutics (C-CAT) between June 1, 2019 and August 18, 2023. The pathogenicity of the variants was annotated using public databases. RESULTS As a result of re-annotation of the detected variants, 20.1% were pathogenic and 1.4% were benign. The mean number of pathogenic variants was 4.30 (95% confidence interval: 4.27-4.32) per patient. Of the entire cohort, 5.7% had no pathogenic variant. The co-occurrence of the genes depended on the tumor type. Germline findings were detected in 6.2%, 8.8%, and 15.8% of the patients using a tumor/normal panel, tumor-only panel, and liquid panel, respectively, with the most common gene being BRCA2 followed by TP53 and BRCA1. CONCLUSIONS The detected variants should be re-annotated because several benign variants or variants of unknown significance were included in the CGP, and the genomic landscape derived from these results will help researchers and physicians interpret the results of CGP tests. The method of extracting presumptive, germline, pathogenic variants from patients using a tumor-only panel or circulating tumor DNA panel requires improvement.
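The per-patient mean with a 95% confidence interval reported in the RESULTS can be illustrated with a minimal sketch. The abstract does not state how the interval was computed; the normal-approximation formula (mean ± 1.96 × standard error) used here is an assumption, and the variant counts are made-up toy values, not study data.

```python
import math
import statistics


def mean_ci95(values):
    """Return (mean, lower, upper) using a normal-approximation 95% CI."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation.
    std_err = statistics.stdev(values) / math.sqrt(n)
    half_width = 1.96 * std_err
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Hypothetical pathogenic-variant counts for a handful of patients.
    counts = [3, 5, 4, 6, 2, 4, 5, 7, 3, 4]
    mean, lower, upper = mean_ci95(counts)
    print(f"mean = {mean:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

With a cohort as large as the one described, the standard error shrinks with the square root of the sample size, which is why the reported interval is so narrow.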
Affiliation(s)
- Tatsuro Yamaguchi
  - Department of Clinical Genetics, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan.
- Masachika Ikegami
  - Department of Clinical Genetics, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
  - Department of Musculoskeletal Oncology, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, 3-18-22 Honkomagome, Bunkyo-Ku, Tokyo, 113-8677, Japan
- Tomoyuki Aruga
  - Department of Clinical Genetics, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
  - Department of Surgery, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
- Yusuke Kanemasa
  - Department of Clinical Genetics, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
  - Department of Medical Oncology, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
- Shin-Ichiro Horiguchi
  - Department of Pathology, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
- Kazushige Kawai
  - Department of Surgery, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
- Misato Takao
  - Department of Surgery, Tokyo Metropolitan Cancer and Infectious Diseases Center Komagome Hospital, Tokyo, Japan
- Takeshi Yamada
  - Department of Surgery, Nihon Medical University, Tokyo, Japan
- Hideyuki Ishida
  - Department of Digestive Tract and General Surgery, Saitama Medical Center, Saitama Medical University, Kawagoe, Japan
3
Qin J, Wang J, Chen J, Xu J, Liu S, Deng D, Li F. Homozygous variant in DRC3 (LRRC48) gene causes asthenozoospermia and male infertility. J Hum Genet 2024:10.1038/s10038-024-01253-6. [PMID: 38769386 DOI: 10.1038/s10038-024-01253-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 04/16/2024] [Accepted: 04/16/2024] [Indexed: 05/22/2024]
Abstract
Human infertility affects 10-15% of couples. Asthenozoospermia accounts for 18% of men with infertility and is a common male infertility phenotype. The nexin-dynein regulatory complex (N-DRC) is a large protein complex in the sperm flagellum that connects adjacent doublets of microtubules. Defects in the N-DRC can disrupt cilia/flagellum movement, resulting in primary ciliary dyskinesia and male infertility. Using whole-exome sequencing, we identified a pathological homozygous variant of the dynein regulatory complex subunit 3 (DRC3) gene, which expresses leucine-rich repeat-containing protein 48, a component of the N-DRC, in a patient with asthenozoospermia. The variant ENST00000313838.12: c.644dup (p.Glu216GlyfsTer36) causes premature translational arrest of DRC3, resulting in a dysfunctional DRC3 protein. The patient's semen count, color, and pH were normal according to the reference values of the World Health Organization guidelines; however, sperm motility and progressive motility were reduced. DRC3 protein was not detected in the patient's sperm and the ultrastructure of the patient's sperm flagella was destroyed. More importantly, the DRC3 variant reduced its interaction with other components of the N-DRC, including dynein regulatory complex subunits 1, 2, 4, 5, 7, and 8. Our data not only revealed the essential biological functions of DRC3 in sperm flagellum movement and structure but also provided a new basis for the clinical genetic diagnosis of male infertility.
Affiliation(s)
- Jiao Qin
  - Department of Andrology/Sichuan Human Sperm Bank, West China Second University Hospital, Sichuan University, Chengdu, China
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu, China
- Jinyu Wang
  - Department of Medical Genetics, West China Second University Hospital of Sichuan University, Chengdu, 610041, China
- Jianhai Chen
  - Department of Ecology and Evolution, Biological Sciences Division, The University of Chicago, 1101 E 57th Street, Chicago, IL, 60637, USA
- Jinyan Xu
  - Department of Andrology/Sichuan Human Sperm Bank, West China Second University Hospital, Sichuan University, Chengdu, China
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu, China
- Shanling Liu
  - Department of Medical Genetics, West China Second University Hospital of Sichuan University, Chengdu, 610041, China.
- Dong Deng
  - Department of Obstetrics, Key Laboratory of Birth Defects and Related Disease of Women and Children of MOE, State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University, Chengdu, 610041, China.
- Fuping Li
  - Department of Andrology/Sichuan Human Sperm Bank, West China Second University Hospital, Sichuan University, Chengdu, China.
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu, China.
4
Benner L, Muron S, Gomez JG, Oliver B. OVO Positively Regulates Essential Maternal Pathways by Binding Near the Transcriptional Start Sites in the Drosophila Female Germline. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.11.01.565184. [PMID: 38076814 PMCID: PMC10705541 DOI: 10.1101/2023.11.01.565184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2023]
Abstract
Differentiation of female germline stem cells into a mature oocyte includes the expression of RNAs and proteins that drive early embryonic development in Drosophila. We have little insight into what activates the expression of these maternal factors. One candidate is the zinc-finger protein OVO. OVO is required for female germline viability and has been shown to positively regulate its own expression, as well as a downstream target, ovarian tumor, by binding to the transcriptional start site (TSS). To find additional OVO targets in the female germline and further elucidate OVO's role in oocyte development, we performed ChIP-seq to determine genome-wide OVO occupancy, as well as RNA-seq comparing hypomorphic and wild type rescue ovo alleles. OVO preferentially binds in close proximity to target TSSs genome-wide, is associated with open chromatin, transcriptionally active histone marks, and OVO-dependent expression. Motif enrichment analysis on OVO ChIP peaks identified a 5'-TAACNGT-3' OVO DNA binding motif spatially enriched near TSSs. However, the OVO DNA binding motif does not exhibit precise motif spacing relative to the TSS characteristic of RNA Polymerase II complex binding core promoter elements. Integrated genomics analysis showed that 525 genes that are bound and increase in expression downstream of OVO are known to be essential maternally expressed genes. These include genes involved in anterior/posterior/germ plasm specification (bcd, exu, swa, osk, nos, aub, pgc, gcl), egg activation (png, plu, gnu, wisp, C(3)g, mtrm), translational regulation (cup, orb, bru1, me31B), and vitelline membrane formation (fs(1)N, fs(1)M3, clos). This suggests that OVO is a master transcriptional regulator of oocyte development and is responsible for the expression of structural components of the egg as well as maternally provided RNAs that are required for early embryonic development.
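The 5'-TAACNGT-3' motif described above can be located in a DNA sequence with a simple two-strand scan. This is a minimal sketch of motif matching only, not the motif enrichment analysis the authors ran (the abstract does not name their software); the function names and the example sequences are hypothetical. `N` is treated as any of A, C, G, or T.

```python
import re

MOTIF = "TAACNGT"  # OVO DNA binding motif as written in the abstract
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string containing only A, C, G, T, N."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a motif into a lookahead regex so overlapping hits are found."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile(f"(?=({body}))")


def find_motif_hits(sequence: str, motif: str = MOTIF):
    """Return sorted (0-based start, strand) tuples for motif occurrences."""
    seq = sequence.upper()
    hits = []
    # A hit on the minus strand appears as the reverse complement on the plus strand.
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in motif_to_regex(pattern).finditer(seq):
            hits.append((match.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGTAACAGTCCACTGTTAGG"  # hypothetical sequence
    for start, strand in find_motif_hits(example):
        print(start, strand)
```

Given hit positions, the distance of each hit to an annotated TSS could then be tabulated to examine spatial enrichment of the kind the abstract describes.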
Affiliation(s)
- Leif Benner
  - Section of Developmental Genomics, Laboratory of Biochemistry and Genetics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Savannah Muron
  - Section of Developmental Genomics, Laboratory of Biochemistry and Genetics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
- Jillian G Gomez
  - Section of Developmental Genomics, Laboratory of Biochemistry and Genetics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
- Brian Oliver
  - Section of Developmental Genomics, Laboratory of Biochemistry and Genetics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
5
Hu Z, Chen J, Olatoye MO, Zhang H, Lin Z. Transcriptome-wide expression landscape and starch synthesis pathway co-expression network in sorghum. THE PLANT GENOME 2024:e20448. [PMID: 38602082 DOI: 10.1002/tpg2.20448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
The gene expression landscape across different tissues and developmental stages reflects their biological functions and evolutionary patterns. Integrative and comprehensive analyses of all transcriptomic data in an organism are instrumental to obtaining a comprehensive picture of gene expression landscape. Such studies are still very limited in sorghum, which limits the discovery of the genetic basis underlying complex agricultural traits in sorghum. We characterized the genome-wide expression landscape for sorghum using 873 RNA-sequencing (RNA-seq) datasets representing 19 tissues. Our integrative analysis of these RNA-seq data provides the most comprehensive transcriptomic atlas for sorghum, which will be valuable for the sorghum research community for functional characterizations of sorghum genes. Based on the transcriptome atlas, we identified 595 housekeeping genes (HKGs) and 2080 tissue-specific expression genes (TEGs) for the 19 tissues. We identified different gene features between HKGs and TEGs, and we found that HKGs have experienced stronger selective constraints than TEGs. Furthermore, we built a transcriptome-wide co-expression network (TW-CEN) comprising 35 modules with each module enriched in specific Gene Ontology terms. High-connectivity genes in TW-CEN tend to express at high levels while undergoing intensive selective pressure. We also built global and seed-preferential co-expression networks of starch synthesis pathways, which indicated that photosynthesis and microtubule-based movement play important roles in starch synthesis. The global transcriptome atlas of sorghum generated by this study provides an important functional genomics resource for trait discovery and insight into starch synthesis regulation in sorghum.
Affiliation(s)
- Zhenbin Hu
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
- Junhao Chen
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
- Marcus O Olatoye
  - USDA-ARS, Forage Seed and Cereal Research Unit, Prosser, Washington, USA
- Hengyou Zhang
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design and Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China
- Zhenguo Lin
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
6
Zou J, Zhang H, Wu Z, Hu W, Zhang T, Xie H, Huang Y, Zhou H. TIGD1 Is an Independent Prognostic Factor that Promotes the Progression of Colon Cancer. Cancer Biother Radiopharm 2024; 39:223-235. [PMID: 36508261 DOI: 10.1089/cbr.2022.0052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open
Abstract
Background: Trigger transposable element-derived 1 (TIGD1) is a human-specific gene, but no studies have been conducted to determine its mechanism of action. Our aim is to ascertain the function and mode of action of TIGD1 in the development of colon cancer. Materials and Methods: The authors used bioinformatics to analyze the relationship between TIGD1 and the clinical characteristics of colon cancer, as well as its prognosis. A series of cell assays were conducted to assess the function of TIGD1 in the proliferation and migration of colon cancer, and flow cytometry was used to explore its effects on apoptosis and the cell cycle. Results: The authors discovered that the expression of TIGD1 was remarkably elevated in colon cancer. Clinical correlation analysis demonstrated that TIGD1 expression was elevated in the tissues of advanced-stage patients, and it was remarkably elevated in individuals with both lymph node and distant metastasis. Further, the authors found that individuals showing elevated TIGD1 expression levels had a shortened survival time. Univariate and multivariate Cox regression analyses revealed that TIGD1 was an independent prognostic factor. Overexpression of the TIGD1 gene remarkably enhances the proliferation and metastasis of colon cancer cells and suppresses apoptosis. In addition, the overexpression of TIGD1 can enhance the transition of tumor cells from the G1 toward the S phase. Western blot results suggested that TIGD1 may promote the malignant activity of colon cancer cells via the Wnt/β-catenin signaling pathway, Bcl-2, N-cadherin, BAX, E-cadherin, CDK6, and CyclinD1. Conclusions: TIGD1 may be an independent prognostic factor in the advancement of colon cancer, and therefore function as a therapeutic target.
Affiliation(s)
- Junwei Zou
  - Department of Gastrointestinal Surgery, The Second Affiliated Hospital of Wannan Medical College, Wuhu, China
- Hesong Zhang
  - Department of Hepatobiliary Surgery, The Second People's Hospital of Wuhu, Wuhu, China
- Zhaoying Wu
  - Department of Gastrointestinal Surgery, The Second Affiliated Hospital of Wannan Medical College, Wuhu, China
- Weichao Hu
  - Department of Gastroenterology, The Second Affiliated Hospital of Wannan Medical College, Wuhu, China
- Tingting Zhang
  - Department of Gastroenterology, The Second Affiliated Hospital of Wannan Medical College, Wuhu, China
- Hao Xie
  - Department of Gastrointestinal Surgery, The Second Affiliated Hospital of Wannan Medical College, Wuhu, China
- Yong Huang
  - Department of Gastrointestinal Surgery, The Second Affiliated Hospital of Wannan Medical College, Wuhu, China
- Hailang Zhou
  - Department of Gastroenterology, Lianshui County People's Hospital, Huai'an, China
7
Ruiz-Serra V, Valentini S, Madroñero S, Valencia A, Porta-Pardo E. 3Dmapper: a command line tool for BioBank-scale mapping of variants to protein structures. Bioinformatics 2024; 40:btae171. [PMID: 38565273 PMCID: PMC11018535 DOI: 10.1093/bioinformatics/btae171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 02/09/2024] [Accepted: 03/30/2024] [Indexed: 04/04/2024] Open
Abstract
MOTIVATION The interpretation of genomic data is crucial to understand the molecular mechanisms of biological processes. Protein structures play a vital role in facilitating this interpretation by providing functional context to genetic coding variants. However, mapping genes to proteins is a tedious and error-prone task due to inconsistencies in data formats. Over the past two decades, numerous tools and databases have been developed to automatically map annotated positions and variants to protein structures. However, most of these tools are web-based and not well-suited for large-scale genomic data analysis. RESULTS To address this issue, we introduce 3Dmapper, a stand-alone command-line tool developed in Python and R. It systematically maps annotated protein positions and variants to protein structures, providing a solution that is both efficient and reliable. AVAILABILITY AND IMPLEMENTATION https://github.com/vicruiser/3Dmapper.
Affiliation(s)
- Victoria Ruiz-Serra
  - Barcelona Supercomputing Center (BSC)
  - Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Spain
- Samuel Valentini
  - Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento 38123, Italy
- Sergi Madroñero
  - Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Spain
- Alfonso Valencia
  - Barcelona Supercomputing Center (BSC)
  - Institució Catalana de Recerca Avançada (ICREA)
- Eduard Porta-Pardo
  - Barcelona Supercomputing Center (BSC)
  - Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Spain
8
Suwakulsiri W, Xu R, Rai A, Chen M, Shafiq A, Greening DW, Simpson RJ. Transcriptomic analysis and fusion gene identifications of midbody remnants released from colorectal cancer cells reveals they are molecularly distinct from exosomes and microparticles. Proteomics 2024:e2300058. [PMID: 38470197 DOI: 10.1002/pmic.202300058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 02/25/2024] [Accepted: 02/27/2024] [Indexed: 03/13/2024]
Abstract
Previously, we reported that human primary (SW480) and metastatic (SW620) colorectal cancer (CRC) cells release three classes of membrane-encapsulated extracellular vesicles (EVs): midbody remnants (MBRs), exosomes (Exos), and microparticles (MPs). We reported that MBRs were molecularly distinct at the protein level. To gain further biochemical insights into MBRs, Exos, and MPs and their emerging role in CRC, we performed, and report here for the first time, a comprehensive transcriptome and long noncoding RNA sequencing analysis and fusion gene identification of these three EV classes using the next-generation RNA sequencing technique. Differential transcript expression analysis revealed that MBRs have a distinct transcriptomic profile compared to Exos and MPs, with a high enrichment of mitochondrial transcripts and lncRNA/pseudogene transcripts that are predicted to bind to ribonucleoprotein complexes, spliceosome, and RNA/stress granule proteins. A salient finding from this study is a high enrichment of several fusion genes in MBRs compared to Exos, MPs, and cell lysates from their parental cells, such as MSH2 (the gene encoding DNA mismatch repair protein MSH2). This suggests potential EV-liquid biopsy targets for cancer detection. Importantly, the expression of cancer progression-related transcripts found in EV classes derived from SW480 (EGFR) and SW620 (MET and MACCA1) cell lines reflects their parental cell types. Our study reports RNA and fusion gene compositions within MBRs (including Exos and MPs) that could have an impact on EV functionality in cancer progression and detection using EV-based RNA/fusion gene candidates for cancer biomarkers.
Affiliation(s)
- Wittaya Suwakulsiri
  - Department of Biochemistry and Chemistry, La Trobe Institute for Molecular Science (LIMS), School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, Victoria, Australia
  - School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Darlington, New South Wales, Australia
- Rong Xu
  - Nanobiotechnology Laboratory, Australia Centre for Blood Diseases, Centre Clinical School, Monash University, Melbourne, Victoria, Australia
- Alin Rai
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Baker Department of Cardiovascular Research, Translation and Implementation, La Trobe University, Melbourne, Victoria, Australia
  - Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
- Maoshan Chen
  - Laboratory of Radiation Biology, Department of Blood Transfusion, Laboratory Medicine Centre, The Second Affiliated Hospital, Army Medical University, Chongqing, China
- Adnan Shafiq
  - Department of Cell & Developmental Biology, School of Medicine, Vanderbilt University, Nashville, Tennessee, USA
- David W Greening
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Baker Department of Cardiovascular Research, Translation and Implementation, La Trobe University, Melbourne, Victoria, Australia
  - Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
- Richard J Simpson
  - Department of Biochemistry and Chemistry, La Trobe Institute for Molecular Science (LIMS), School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, Victoria, Australia
9
Zhao L, Li Z, Huang B, Mi D, Xu D, Sun Y. Integrating evolutionarily conserved mechanism of response to radiation for exploring novel Caenorhabditis elegans radiation-responsive genes for estimation of radiation dose associated with spaceflight. CHEMOSPHERE 2024; 351:141148. [PMID: 38211791 DOI: 10.1016/j.chemosphere.2024.141148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 12/09/2023] [Accepted: 01/06/2024] [Indexed: 01/13/2024]
Abstract
During space exploration, space radiation is widely recognized as an inescapable perilous stressor, owing to its capacity to induce genomic DNA damage and escalate the likelihood of detrimental health outcomes. Rapid and reliable estimation of space radiation dose holds paramount significance in accurately assessing the health risks associated with spaceflight. However, the identification of space radiation-responsive genes, with their potential to serve as early indicators for diagnosing radiation dose associated with spaceflight, continues to pose a significant challenge. In this study, based on the evolutionarily conserved mechanism of radiation response, an in silico analysis method of homologous comparison was performed to identify the Caenorhabditis elegans orthologues of human radiation-responsive genes with possible roles in the major processes of response to radiation, and thereby to explore the potential C. elegans radiation-responsive genes for evaluating the levels of space radiation exposure. The results showed that there were 60 known C. elegans radiation-responsive genes and 211 C. elegans orthologues of human radiation-responsive genes implicated in the major processes of response to radiation. Through an investigation of all available transcriptomic datasets obtained from space-flown C. elegans, it was observed that the expression levels of the majority of these putative C. elegans radiation-responsive genes identified in this study were notably changed across various spaceflight conditions. Furthermore, this study indicated that within the identified genes, 19 known C. elegans radiation-responsive genes and 40 newly identified C. elegans orthologues of human radiation-responsive genes exhibited a remarkable positive correlation with the duration of spaceflight. Moreover, a noteworthy presence of substantial multi-collinearity among the majority of these identified genes was observed. 
This observation lends support to the possibility of treating each identified gene as an independent indicator of radiation dose in space. Ultimately, a subset of 15 potential radiation-responsive genes was identified, presenting the most promising indicators for estimation of radiation dose associated with spaceflight in C. elegans.
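The reported positive correlation between gene expression and spaceflight duration can be illustrated with a small sketch. The abstract does not say which correlation statistic was used; Pearson's r is assumed here purely for illustration, and the durations and expression values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return s_xy / math.sqrt(s_xx * s_yy)


if __name__ == "__main__":
    # Hypothetical flight durations (days) and expression fold changes of one gene.
    duration_days = [4, 8, 11, 16, 180]
    fold_change = [1.1, 1.3, 1.2, 1.6, 3.9]
    print(f"r = {pearson_r(duration_days, fold_change):.3f}")
```

The same function applied to pairs of genes, rather than a gene and the flight duration, would give a first look at how strongly the candidate indicators track one another.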
Affiliation(s)
- Lei Zhao
  - Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian, 116026, Liaoning, China.
- Zejun Li
  - Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian, 116026, Liaoning, China
- Baohang Huang
  - Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian, 116026, Liaoning, China
- Dong Mi
  - College of Science, Dalian Maritime University, Dalian, 116026, Liaoning, China
- Dan Xu
  - Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian, 116026, Liaoning, China
- Yeqing Sun
  - Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian, 116026, Liaoning, China.
10
Vahab N, Bonu T, Kuhlmann L, Ramialison M, Tyagi S. Uncovering co-regulatory modules and gene regulatory networks in the heart through machine learning-based analysis of large-scale epigenomic data. Comput Biol Med 2024; 171:108068. [PMID: 38354497 DOI: 10.1016/j.compbiomed.2024.108068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 12/30/2023] [Accepted: 01/27/2024] [Indexed: 02/16/2024]
Abstract
The availability of large-scale epigenomic data from various cell types and conditions has yielded valuable insights for evaluating and learning features predicting the co-binding of transcription factors (TF). However, prior attempts to develop models predicting motif co-occurrence lacked scalability for globally analyzing any motif combination or making cross-species predictions. Moreover, mapping co-regulatory modules (CRM) to gene regulatory networks (GRN) is crucial for understanding underlying function. Currently, no comprehensive pipeline exists for large-scale, rapid, and accurate CRM and GRN identification. In this study, we analyzed and evaluated different TF binding characteristics facilitating biologically significant co-binding to identify all potential clusters of co-binding TFs. We curated the UniBind database, containing ChIP-Seq data from over 1983 samples and 232 TFs, and implemented two machine learning models to predict CRMs and the potential regulatory networks they operate on. Two machine learning models, Convolution Neural Networks (CNN) and Random Forest Classifier (RFC), used to predict co-binding between TFs, were compared using precision-recall Receiver Operating Characteristic (ROC) curves. CNN outperformed RFC (AUC 0.94 vs. 0.88) and achieved higher F1 scores (0.938 vs. 0.872). The CRMs generated by the clustering algorithm were validated against ChipAtlas and MCOT, revealing additional motifs forming CRMs. We predicted 200k CRMs for 50k+ human genes, validated against recent CRM prediction methods with 100% overlap. Further, we narrowed our focus to study heart-related regulatory motifs, filtering the generated CRMs to report 1784 Cardiac CRMs containing at least four cardiac TFs. Identified cardiac CRMs revealed potential novel regulators like ARID3A and RXRB for SCAD, including known TFs like PPARG for F11R. Our findings highlight the importance of the NKX family of transcription factors in cardiac development and provide potential targets for further investigation in cardiac disease.
Affiliation(s)
- Naima Vahab
  - School of Computational Technologies, RMIT University, Melbourne VIC 3000, Australia; Department of Infectious Diseases, Alfred Hospital, Prahran VIC 3008, Australia
- Tarun Bonu
  - Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia
- Levin Kuhlmann
  - Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia
- Sonika Tyagi
  - School of Computational Technologies, RMIT University, Melbourne VIC 3000, Australia; Department of Infectious Diseases, Alfred Hospital, Prahran VIC 3008, Australia
11
Abdolvand M, Chermahini ZM, Bahaloo S, Emami MH, Fahim A, Rahimi H, Amjadi E, Maghool F, Rohani F, Dadkhah M, Farhadian N, Vatandoust N, Abdolvand S, Darehsari MR, Chehelgerdi M, Beni FA, Khodadoostan M, Hemati S, Salehi M. New long noncoding RNA biomarkers and ceRNA networks on miR-616-3p in colorectal cancer: Bioinformatics-based study. J Res Med Sci 2024; 29:10. [PMID: 38524750 PMCID: PMC10956565 DOI: 10.4103/jrms.jrms_786_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 10/18/2023] [Accepted: 11/01/2023] [Indexed: 03/26/2024]
Abstract
Background: Long noncoding RNAs (lncRNAs) contribute to cancer development by acting as competing endogenous RNAs (ceRNAs) that absorb microRNAs (miRNAs). We aimed to discover a novel regulatory axis in colorectal cancer (CRC) and potential biomarkers based on miR-616-3p. Materials and Methods: The Gene Expression Omnibus (GEO) database was mined for differentially expressed lncRNAs (DELs) and mRNAs. LncRNAs and mRNAs were predicted using the RegRNA and TargetScan databases. A combination of the cBioPortal and Ensembl databases was used to locate the mRNAs. ceRNA networks were built with Cytoscape 3.7.1. Quantitative real-time polymerase chain reaction (qRT-PCR) was used to confirm the expression levels of these RNA molecules. Statistical analyses were performed with GraphPad Prism 9. Results: qRT-PCR showed that Linc01282, lnc-MYADM-1:1, and Zinc Finger Protein 347 (ZNF347) were overexpressed, whereas salt-inducible kinase 1 (SIK1) and miR-616-3p were downregulated. Conclusion: These results identify unique, previously unreported lncRNAs as CRC prognostic biomarkers, as well as prospective mRNAs as new treatment targets and predictive biomarkers for CRC. In addition, our study uncovered unexplored ceRNA networks that should be studied further in CRC.
Affiliation(s)
- Mohammad Abdolvand
  - Cellular, Molecular and Genetics Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
  - Medical Genetics Research Center of Genome, Isfahan University of Medical Sciences, Isfahan, Iran
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Zahra Mohammadi Chermahini
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Sahar Bahaloo
  - Department of Biology, Faculty of Sciences, Yazd University, Yazd, Iran
- Mohammad Hassan Emami
  - Poursina Hakim Digestive Diseases Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Alireza Fahim
  - Poursina Hakim Digestive Diseases Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Hojjatolah Rahimi
  - Poursina Hakim Digestive Diseases Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Elham Amjadi
  - Poursina Hakim Digestive Diseases Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Fatemeh Maghool
  - Poursina Hakim Digestive Diseases Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Fattah Rohani
  - Faculty of Veterinary Medicine of Shahrekord, Shahrekord, Iran
- Mina Dadkhah
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Nooshin Farhadian
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Nasimeh Vatandoust
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Shirin Abdolvand
  - Department of Genetics, Faculty of Sciences, Islamic Azad University, Shahrekord, Iran
- Mohammad Chehelgerdi
  - Department of Genetics, Faculty of Sciences, Islamic Azad University, Shahrekord, Iran
- Faeze Ahmadi Beni
  - Cellular, Molecular and Genetics Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
  - Medical Genetics Research Center of Genome, Isfahan University of Medical Sciences, Isfahan, Iran
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Mahsa Khodadoostan
  - Department of Gastroenterology and Hepatology, AlZahra Hospital, Isfahan University of Medical Sciences, Isfahan, Iran
- Simin Hemati
  - Department of Radiooncology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Mansoor Salehi
  - Cellular, Molecular and Genetics Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
  - Medical Genetics Research Center of Genome, Isfahan University of Medical Sciences, Isfahan, Iran
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
12
Coussement L, Van Criekinge W, De Meyer T. Quantitative transcriptomic and epigenomic data analysis: a primer. Bioinform Adv 2024; 4:vbae019. [PMID: 38586118 PMCID: PMC10997052 DOI: 10.1093/bioadv/vbae019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 02/01/2024] [Accepted: 02/09/2024] [Indexed: 04/09/2024]
Abstract
The advent of microarray and second-generation sequencing technology has revolutionized the field of molecular biology, allowing researchers to quantitatively assess transcriptomic and epigenomic features in a comprehensive and cost-efficient manner. Moreover, technical advancements have pushed the resolution of these sequencing techniques to the single-cell level. As a result, the bottleneck of molecular biology research has shifted from the bench to the subsequent omics data analysis. Even though most methodologies share the same general strategy, state-of-the-art literature typically focuses on data-type-specific approaches and already assumes expert knowledge. Here, however, we aim to provide conceptual insight into the principles of genome-wide quantitative transcriptomic and epigenomic (including open chromatin assay) data analysis by describing a generic workflow. By starting from a general framework and its assumptions, the need for alternative or additional data-analytical solutions when working with specific data types becomes clear, and these solutions are introduced accordingly. Thus, we aim to enable readers with basic omics expertise to deepen their conceptual and statistical understanding of general strategies and pitfalls in omics data analysis and to facilitate subsequent progression to more specialized literature.
Affiliation(s)
- Louis Coussement
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, 9000, Belgium
- Wim Van Criekinge
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, 9000, Belgium
- Tim De Meyer
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, 9000, Belgium
13
Shahzaib M, Khan UM, Azhar MT, Atif RM, Khan SH, Zaman QU, Rana IA. Phylogenomic curation of Ovate Family Proteins (OFPs) in the U's Triangle of Brassica L. indicates stress-induced growth modulation. PLoS One 2024; 19:e0297473. [PMID: 38277374 PMCID: PMC10817133 DOI: 10.1371/journal.pone.0297473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 01/06/2024] [Indexed: 01/28/2024] Open
Abstract
The Ovate Family Protein (OFP) gene family encodes a class of proteins that are involved in regulating plant growth and development. To date, there is no report of the simultaneous functional characterization of this gene family in all members of U's Triangle of Brassica. Here, we retrieved a combined total of 256 OFP protein sequences and analyzed their chromosomal localization, gene structure, conserved protein motif domains, and the pattern of cis-acting regulatory elements. The abundance of light-responsive elements like G-box, MRE, and GT1 motif suggests that OFPs are sensitive to light stimuli. The protein-protein interaction network analysis revealed that OFP05 and its orthologous genes were involved in regulating transcriptional repression through their interaction with homeodomain transcription factors like KNAT and BLH. The presence of domains like DNA binding 2 and its superfamily suggests the involvement of OFPs in regulating gene expression. The biotic and abiotic stress and tissue-specific expression analyses of the RNA-seq datasets revealed that some of the genes, such as BjuOFP30, and BnaOFP27, BolOFP11, and BolOFP10, were highly upregulated in the seed coat at the mature stage and in roots under various chemical stress conditions, respectively, which suggests their crucial role in plant growth and development processes. Experimental validation of prominent BnaOFPs such as BnaOFP27 confirmed their involvement in regulating gene expression under salinity, heavy metal, drought, heat, and cold stress. The GO and KEGG pathway enrichment analysis also sheds light on the involvement of OFPs in regulating plant growth and development. These findings have the potential to serve as a forerunner for future studies on the functional diversity of the OFP gene family in Brassica and other plant species.
Affiliation(s)
- Muhammad Shahzaib
  - Centre of Agricultural Biochemistry and Biotechnology, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
  - Centre for Advanced Studies in Agriculture and Food Security, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
- Uzair Muhammad Khan
  - Centre for Advanced Studies in Agriculture and Food Security, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
- Muhammad Tehseen Azhar
  - Department of Plant Breeding and Genetics, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
- Rana Muhammad Atif
  - Centre for Advanced Studies in Agriculture and Food Security, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
  - Department of Plant Breeding and Genetics, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
- Sultan Habibullah Khan
  - Centre of Agricultural Biochemistry and Biotechnology, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
  - Centre for Advanced Studies in Agriculture and Food Security, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
- Qamar U. Zaman
  - Hainan Yazhou Bay Seed Laboratory, Sanya Nanfan Research Institute of Hainan University, Sanya, China
  - College of Tropical Crops, Hainan University, Haikou, China
- Iqrar Ahmad Rana
  - Centre of Agricultural Biochemistry and Biotechnology, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
  - Centre for Advanced Studies in Agriculture and Food Security, University of Agriculture, Faisalabad, Faisalabad, Punjab, Pakistan
14
Quesnel-Vallières M, Jewell S, Lynch KW, Thomas-Tikhonenko A, Barash Y. MAJIQlopedia: an encyclopedia of RNA splicing variations in human tissues and cancer. Nucleic Acids Res 2024; 52:D213-D221. [PMID: 37953365 PMCID: PMC10767883 DOI: 10.1093/nar/gkad1043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 10/11/2023] [Accepted: 11/02/2023] [Indexed: 11/14/2023] Open
Abstract
Quantification of RNA splicing variations based on RNA-Sequencing can reveal tissue- and disease-specific splicing patterns. To study such splicing variations, we introduce MAJIQlopedia, an encyclopedia of splicing variations that encompasses 86 human tissues and 41 cancer datasets. MAJIQlopedia reports annotated and unannotated splicing events for a total of 486 175 alternative splice junctions in normal tissues and 338 317 alternative splice junctions in cancer. This database, available at https://majiq.biociphers.org/majiqlopedia/, includes a user-friendly interface that provides graphical representations of junction usage quantification for each junction across all tissue or cancer types. To demonstrate case usage of MAJIQlopedia, we review splicing variations in genes WT1, MAPT and BIN1, which all have known tissue or cancer-specific splicing variations. We also use MAJIQlopedia to highlight novel splicing variations in FDX1 and MEGF9 in normal tissues, and we uncover a novel exon inclusion event in RPS6KA6 that only occurs in two cancer types. Users can download the database, request the addition of data to the webtool, or install a MAJIQlopedia server to integrate proprietary data. MAJIQlopedia can serve as a reference database for researchers seeking to understand what splicing variations exist in genes of interest, and those looking to understand tissue- or cancer-specific splice isoform usage.
Affiliation(s)
- Mathieu Quesnel-Vallières
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA 19104, USA
- San Jewell
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Kristen W Lynch
  - Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Andrei Thomas-Tikhonenko
  - Division of Cancer Pathobiology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Pathology & Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
- Yoseph Barash
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
15
Zhang K, Liang J, Fu Y, Chu J, Fu L, Wang Y, Li W, Zhou Y, Li J, Yin X, Wang H, Liu X, Mou C, Wang C, Wang H, Dong X, Yan D, Yu M, Zhao S, Li X, Ma Y. AGIDB: a versatile database for genotype imputation and variant decoding across species. Nucleic Acids Res 2024; 52:D835-D849. [PMID: 37889051 PMCID: PMC10767904 DOI: 10.1093/nar/gkad913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 10/05/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023] Open
Abstract
The high cost of large-scale, high-coverage whole-genome sequencing has limited its application in genomics and genetics research. The common approach has been to impute whole-genome sequence variants obtained from a few individuals for a larger population of interest individually genotyped using SNP chips. An alternative involves low-coverage whole-genome sequencing (lcWGS) of all individuals in the larger population, followed by imputation to sequence resolution. To overcome the limitations of processing lcWGS data and to meet specific genotype imputation requirements, we developed AGIDB (https://agidb.pro), a website comprising tools and a database with an unprecedented sample size and comprehensive variant decoding for animals. AGIDB integrates whole-genome sequencing and chip data from 17 360 and 174 945 individuals, respectively, across 89 species to identify over one billion variants, totaling 688.57 TB of processed data. AGIDB focuses on integrating multiple genotype imputation scenarios. It also provides user-friendly searching and data analysis modules that enable comprehensive annotation of genetic variants for specific populations. To meet a wide range of research requirements, AGIDB offers downloadable reference panels for each species in addition to its extensive dataset, variant decoding and utility tools. We hope that AGIDB will become a key foundational resource in genetics and breeding, providing robust support to researchers.
Affiliation(s)
- Kaili Zhang
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Jiete Liang
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Yuhua Fu
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Jinyu Chu
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Liangliang Fu
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
  - The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan 430070, China
- Yongfei Wang
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Wangjiao Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- You Zhou
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Jinhua Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Xiaoxiao Yin
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Haiyan Wang
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xiaolei Liu
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Chunyan Mou
  - College of Animal Science and Technology, Southwest University, Chongqing 402460, China
- Chonglong Wang
  - Key Laboratory of Pig Molecular Quantitative Genetics of Anhui Academy of Agricultural Sciences, Anhui Provincial Key Laboratory of Livestock and Poultry Product Safety Engineering, Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Heng Wang
  - College of Animal Science and Technology, Shandong Agricultural University, Taian 271018, China
- Xinxing Dong
  - Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Dawei Yan
  - Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Mei Yu
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Shuhong Zhao
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou 510642, China
- Xinyun Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Yunlong Ma
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou 510642, China
16
Kwon S, Safer J, Nguyen DT, Hoksza D, May P, Arbesfeld JA, Rubin AF, Campbell AJ, Burgin A, Iqbal S. Genomics 2 Proteins portal: A resource and discovery tool for linking genetic screening outputs to protein sequences and structures. bioRxiv 2024:2024.01.02.573913. [PMID: 38260256 PMCID: PMC10802383 DOI: 10.1101/2024.01.02.573913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics technologies have enabled the detection and generation of variants at an unprecedented scale. However, efficient tools and resources are needed to link these two disparate data types - to "map" variants onto protein structures, to better understand how the variation causes disease and thereby design therapeutics. Here we present the Genomics 2 Proteins Portal (G2P; g2p.broadinstitute.org/): a human proteome-wide resource that maps 19,996,443 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the G2P portal generalizes the capability of linking genomics to proteins beyond databases by allowing users to interactively upload protein residue-wise annotations (variants, scores, etc.) as well as the protein structure to establish the connection. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotype.
17
Guardiola O, Iavarone F, Nicoletti C, Ventre M, Rodríguez C, Pisapia L, Andolfi G, Saccone V, Patriarca EJ, Puri PL, Minchiotti G. CRIPTO-based micro-heterogeneity of mouse muscle satellite cells enables adaptive response to regenerative microenvironment. Dev Cell 2023; 58:2896-2913.e6. [PMID: 38056454 PMCID: PMC10855569 DOI: 10.1016/j.devcel.2023.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 07/01/2023] [Accepted: 11/10/2023] [Indexed: 12/08/2023]
Abstract
Skeletal muscle repair relies on heterogeneous populations of satellite cells (SCs). The mechanisms that regulate SC homeostasis and state transition during activation are currently unknown. Here, we investigated the emerging role of non-genetic micro-heterogeneity, i.e., intrinsic cell-to-cell variability of a population, in this process. We demonstrate that micro-heterogeneity of the membrane protein CRIPTO in mouse-activated SCs (ASCs) identifies metastable cell states that allow a rapid response of the population to environmental changes. Mechanistically, CRIPTO micro-heterogeneity is generated and maintained through a process of intracellular trafficking coupled with active shedding of CRIPTO from the plasma membrane. Irreversible perturbation of CRIPTO micro-heterogeneity affects the balance of proliferation, self-renewal, and myogenic commitment in ASCs, resulting in increased self-renewal in vivo. Our findings demonstrate that CRIPTO micro-heterogeneity regulates the adaptive response of ASCs to microenvironmental changes, providing insights into the role of intrinsic heterogeneity in preserving stem cell population diversity during tissue repair.
Affiliation(s)
- Ombretta Guardiola
  - Stem Cell Fate Laboratory, Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy; Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy
- Francescopaolo Iavarone
  - Stem Cell Fate Laboratory, Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy; Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy
- Chiara Nicoletti
  - Development, Aging and Regeneration Program, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
- Maurizio Ventre
  - Department of Chemical, Materials and Industrial Production Engineering, University of Naples "Federico II", Naples 80125, Italy; Center for Advanced Biomaterials for Healthcare@CRIB, Istituto Italiano di Tecnologia, Naples 80125, Italy
- Cristina Rodríguez
  - Stem Cell Fate Laboratory, Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy; Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy
- Laura Pisapia
  - Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy
- Gennaro Andolfi
  - Stem Cell Fate Laboratory, Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy; Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy
- Valentina Saccone
  - IRCCS Fondazione Santa Lucia, Rome 00143, Italy; Department of Life Sciences and Public Health, Università Cattolica del Sacro Cuore, Rome 00168, Italy
- Eduardo J Patriarca
  - Stem Cell Fate Laboratory, Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy; Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy
- Pier Lorenzo Puri
  - Development, Aging and Regeneration Program, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
- Gabriella Minchiotti
  - Stem Cell Fate Laboratory, Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy; Institute of Genetics and Biophysics "A. Buzzati-Traverso", CNR, Naples 80131, Italy
18
Chao H, Zhang S, Hu Y, Ni Q, Xin S, Zhao L, Ivanisenko VA, Orlov YL, Chen M. Integrating omics databases for enhanced crop breeding. J Integr Bioinform 2023; 20:jib-2023-0012. [PMID: 37486120 PMCID: PMC10777369 DOI: 10.1515/jib-2023-0012] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/27/2023] [Accepted: 06/12/2023] [Indexed: 07/25/2023] Open
Abstract
Crop plant breeding involves selecting and developing new plant varieties with desirable traits such as increased yield, improved disease resistance, and enhanced nutritional value. With the development of high-throughput technologies, such as genomics, transcriptomics, and metabolomics, crop breeding has entered a new era. However, to effectively use these technologies, integration of multi-omics data from different databases is required. Integration of omics data provides a comprehensive understanding of the biological processes underlying plant traits and their interactions. This review highlights the importance of integrating omics databases in crop plant breeding, discusses available omics data and databases, describes integration challenges, and highlights recent developments and potential benefits. Taken together, the integration of omics databases is a critical step towards enhancing crop plant breeding and improving global food security.
Affiliation(s)
- Haoyu Chao
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Shilong Zhang
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Yueming Hu
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Qingyang Ni
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Saige Xin
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Liang Zhao
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Vladimir A. Ivanisenko
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russia
- Yuriy L. Orlov
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russia
  - Agrarian and Technological Institute, Peoples’ Friendship University of Russia, Moscow 117198, Russia
  - The Digital Health Institute, I.M. Sechenov First Moscow State Medical University of the Russian Ministry of Health (Sechenov University), Moscow 119991, Russia
- Ming Chen
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
19
Chakraborty A, Bisht MS, Saxena R, Mahajan S, Pulikkan J, Sharma VK. Genome sequencing and de novo and reference-based genome assemblies of Bos indicus breeds. Genes Genomics 2023; 45:1399-1408. [PMID: 37231295 DOI: 10.1007/s13258-023-01401-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/07/2023] [Accepted: 05/12/2023] [Indexed: 05/27/2023]
Abstract
BACKGROUND Indian cattle breeds (Bos indicus) are known for their remarkable adaptability to hot and humid climates, higher nutritional quality of milk, better disease tolerance, and greater ability to perform on poor feed compared to taurine cattle (Bos taurus). Distinct phenotypic differences are observed among the B. indicus breeds; however, whole genome sequences were unavailable for these indigenous breeds. OBJECTIVE We aimed to perform whole genome sequencing to construct draft genome assemblies of four B. indicus breeds: Ongole, Kasargod Dwarf, Kasargod Kapila, and Vechur (the smallest cattle in the world). METHODS We sequenced the whole genomes using Illumina short-read technology, and constructed de novo and reference-based genome assemblies of these native B. indicus breeds for the first time. RESULTS The draft de novo genome assemblies of the B. indicus breeds ranged from 1.98 to 3.42 Gbp. We also constructed the mitochondrial genome assemblies (~16.3 Kbp) and the previously unavailable 18S rRNA marker gene sequences of these B. indicus breeds. The genome assemblies helped to identify the bovine genes related to distinct phenotypic characteristics and other biological processes for this species compared to B. taurus, which are plausibly responsible for providing better adaptive traits. We also identified the genes that showed sequence variation in dwarf and non-dwarf breeds of B. indicus compared to B. taurus. CONCLUSIONS The genome assemblies of these Indian cattle breeds, the 18S rRNA marker genes, and the identification of the distinct genes in B. indicus breeds compared to B. taurus will help in future studies on these cattle species.
Affiliation(s)
- Abhisek Chakraborty
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
- Manohar S Bisht
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
- Rituja Saxena
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
- Shruti Mahajan
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
- Joby Pulikkan
  - Department of Genomic Science, Central University of Kerala, Kasaragod, India
- Vineet K Sharma
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
20
Kidwai S, Barbiero P, Meijerman I, Tonda A, Perez‐Pardo P, Liò P, Maitland‐van der Zee AH, Oberski DL, Kraneveld AD, Lopez‐Rincon A. A robust mRNA signature obtained via recursive ensemble feature selection predicts the responsiveness of omalizumab in moderate-to-severe asthma. Clin Transl Allergy 2023; 13:e12306. [PMID: 38006387 PMCID: PMC10655633 DOI: 10.1002/clt2.12306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2023] [Revised: 09/01/2023] [Accepted: 10/11/2023] [Indexed: 11/27/2023] Open
Abstract
BACKGROUND Not being well controlled by therapy with inhaled corticosteroids and long-acting β2 agonist bronchodilators is a major concern for severe-asthma patients. The current treatment option for these patients is the use of biologicals such as the anti-IgE treatment omalizumab as an add-on therapy. Despite the accepted use of omalizumab, patients do not always benefit from it. Therefore, there is a need to identify reliable biomarkers as predictors of omalizumab response. METHODS Two novel computational algorithms, the machine-learning-based Recursive Ensemble Feature Selection (REFS) and the rule-based Logic Explainable Networks (LEN), were used on openly accessible mRNA expression data from moderate-to-severe asthma patients to identify genes as predictors of omalizumab response. RESULTS With REFS, the number of features was reduced from 28,402 genes to 5 genes while obtaining a cross-validated accuracy of 0.975. The 5 responsiveness-predictive genes encode the following proteins: Coiled-coil domain-containing protein 113 (CCDC113), Solute Carrier Family 26 Member 8 (SLC26A8), Protein Phosphatase 1 Regulatory Subunit 3D (PPP1R3D), C-Type Lectin Domain Family 4 Member C (CLEC4C) and LOC100131780 (not annotated). The LEN algorithm found 4 of the same genes as REFS: CCDC113, SLC26A8, PPP1R3D and LOC100131780. Literature research showed that these 4 responsiveness-predicting genes are associated with mucosal immunity, cell metabolism, and airway remodeling. CONCLUSION AND CLINICAL RELEVANCE Both computational methods identify the same 4 genes as predictors of omalizumab response in moderate-to-severe asthma patients. The high accuracy obtained indicates that our approach has potential in clinical settings. Future studies in relevant cohort data should validate our computational approach.
Affiliation(s)
- Sarah Kidwai
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Science, Faculty of Science, Utrecht University, Utrecht, The Netherlands
- Pietro Barbiero
  - Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
- Irma Meijerman
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Science, Faculty of Science, Utrecht University, Utrecht, The Netherlands
- Paula Perez‐Pardo
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Science, Faculty of Science, Utrecht University, Utrecht, The Netherlands
- Pietro Liò
  - Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
- Daniel L. Oberski
  - Department of Data Science, University Medical Center Utrecht, Utrecht, The Netherlands
- Aletta D. Kraneveld
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Science, Faculty of Science, Utrecht University, Utrecht, The Netherlands
- Alejandro Lopez‐Rincon
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Science, Faculty of Science, Utrecht University, Utrecht, The Netherlands
  - Department of Data Science, University Medical Center Utrecht, Utrecht, The Netherlands
21
Yang P, Hubert SM, Futreal PA, Song X, Zhang J, Lee JJ, Wistuba I, Yuan Y, Zhang J, Li Z. A novel Bayesian model for assessing intratumor heterogeneity of tumor infiltrating leukocytes with multi-region gene expression sequencing. bioRxiv 2023:2023.10.24.563820. [PMID: 37961165 PMCID: PMC10634795 DOI: 10.1101/2023.10.24.563820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/15/2023]
Abstract
Intratumor heterogeneity (ITH) of tumor-infiltrating leukocytes (TILs) is an important phenomenon of cancer biology with potentially profound clinical impacts. Multi-region gene expression sequencing data provide a promising opportunity to explore TILs and their intratumor heterogeneity for each subject. Although several existing methods are available to infer the proportions of TILs, considerable methodological gaps exist for evaluating intratumor heterogeneity of TILs with multi-region gene expression data. Here, we develop ICeITH (immune cell estimation reveals intratumor heterogeneity), a Bayesian hierarchical model that borrows cell type profiles as prior knowledge to decompose mixed bulk data while accounting for the within-subject correlations among tumor samples. ICeITH quantifies intratumor heterogeneity by the variability of targeted cellular compositions. Through extensive simulation studies, we demonstrate that ICeITH is more accurate in measuring relative cellular abundance and evaluating intratumor heterogeneity compared with existing methods. We also assess the ability of ICeITH to stratify patients by their intratumor heterogeneity score and associate the estimations with the survival outcomes. Finally, we apply ICeITH to two multi-region gene expression datasets from lung cancer studies to classify patients into different risk groups according to the ITH estimations of targeted TILs that shape either pro- or anti-tumor processes. In conclusion, ICeITH is a useful tool to evaluate intratumor heterogeneity of TILs from multi-region gene expression data.
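The idea of scoring heterogeneity as the variability of a cell type's composition across regions of one tumor can be illustrated with a minimal sketch. This is not the ICeITH model: it assumes the per-region proportions have already been estimated, and the function name `ith_score`, the subject labels, and the toy numbers are all hypothetical.

```python
from statistics import pvariance


def ith_score(region_proportions):
    """Population variance of one cell type's proportion across tumor regions.

    Assumption: proportions were estimated beforehand by some deconvolution
    step; this helper only summarises their spread within one subject.
    """
    if len(region_proportions) < 2:
        raise ValueError("need at least two regions per subject")
    return pvariance(region_proportions)


# Made-up proportions of a targeted cell type in three regions per subject.
subjects = {
    "S1": [0.20, 0.20, 0.20],  # same composition in every region
    "S2": [0.05, 0.20, 0.35],  # composition differs between regions
}

scores = {subject: ith_score(props) for subject, props in subjects.items()}
```

In this toy setting the subject with identical regional proportions gets a score of zero, and the score grows as the regions diverge; a patient stratification would then threshold or rank such scores.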
Affiliation(s)
- Peng Yang
  - Department of Statistics, Rice University, Houston, Texas 77005, U.S.A
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A
- Shawna M. Hubert
  - Department of Thoracic Head Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- P. Andrew Futreal
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Xingzhi Song
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jianhua Zhang
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- J. Jack Lee
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A
- Ignacio Wistuba
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ying Yuan
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A
- Jianjun Zhang
  - Department of Thoracic Head Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ziyi Li
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A
22
Musyaffa FA, Rapp K, Gohlke H. LISTER: Semiautomatic Metadata Extraction from Annotated Experiment Documentation in eLabFTW. J Chem Inf Model 2023; 63:6224-6238. [PMID: 37773594 DOI: 10.1021/acs.jcim.3c00744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/01/2023]
Abstract
The availability of scientific methods, code, and data is key for reproducing an experiment. Research data should be made available following the FAIR principle (findable, accessible, interoperable, and reusable). For that, the annotation of research data with metadata is central. However, existing research data management workflows often require that metadata be created by the corresponding researchers, which takes effort and time. Here, we developed LISTER as a methodological and algorithmic solution to create and extract metadata from annotated, template-based experimental documentation using minimum effort. We focused on tailoring the integration between existing platforms by using eLabFTW as the electronic lab notebook and adopting the ISA (investigation, study, assay) model as the abstract data model framework. LISTER consists of four components: annotation language to support metadata extraction; customized eLabFTW entries using specific hierarchies, templates, and tags to structure reusable scientific documentation; a "container" concept in eLabFTW, making metadata of a particular container content extractable along with its underlying, related experiments via a single click; a Python-based app to enable easy-to-use, semiautomated metadata extraction from eLabFTW entries. LISTER outputs metadata in machine-readable .json and human-readable .xlsx formats, and Material and Methods (MM) descriptions in .docx format that could be used in a thesis or manuscript. The metadata can be used as a basis to create or extend ontologies, which, when applied to the published research data, will significantly enhance its value. DSpace is used as a data cataloging platform for hosting the extracted metadata and research data. We applied LISTER to computational biophysical chemistry, protein biochemistry, and molecular biology, and our concept should be extendable to other life science areas.
Affiliation(s)
- Fathoni A Musyaffa
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Kirsten Rapp
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Holger Gohlke
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
23
Ahmed F, Yang YJ, Samantasinghar A, Kim YW, Ko JB, Choi KH. Network-based drug repurposing for HPV-associated cervical cancer. Comput Struct Biotechnol J 2023; 21:5186-5200. [PMID: 37920815 PMCID: PMC10618120 DOI: 10.1016/j.csbj.2023.10.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Revised: 10/17/2023] [Accepted: 10/17/2023] [Indexed: 11/04/2023] Open
Abstract
In women, cervical cancer (CC) is the fourth most common cancer worldwide, with an average of 604,000 cases and 342,000 deaths per year. Approximately 50% of high-grade CC are attributed to human papillomavirus (HPV) types 16 and 18. The chance of CC in HPV-positive patients is 6 times higher than in HPV-negative patients, which demands timely and effective treatment. Repurposing of drugs is considered a viable approach to drug discovery that makes use of existing drugs, thus potentially reducing the time and costs associated with de novo drug discovery. In this study, we present an integrative drug repurposing framework based on a systems biology-enabled network medicine platform. First, we built an HPV-induced CC protein interaction network named HPV2C following the CC signatures defined by the omics dataset obtained from the GEO database. Second, the drug target interaction (DTI) data obtained from DrugBank and related databases were used to model the DTI network, followed by drug target network proximity analysis of HPV-host associated key targets and DTIs in the human protein interactome. This analysis identified 142 potential anti-HPV repurposable drugs to target HPV-induced CC pathways. Third, according to the literature survey, 51 of the predicted drugs are already used for CC and 33 of the remaining drugs have anti-viral activity. Gene set enrichment analysis of potential drugs in drug-gene signatures and in HPV-induced CC-specific transcriptomic data in human cell lines additionally validated the predictions. Finally, 13 drug combinations were found using a network based on overlapping exposure. To summarize, the study provides an effective network-based technique to quickly identify suitable repurposable drugs and drug combinations that target HPV-associated CC.
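To make the notion of drug target network proximity concrete, the sketch below computes one commonly used measure: the average, over a drug's targets, of the shortest-path distance to the nearest disease gene in an interaction network. The abstract does not state which proximity measure the authors used, so this choice is an assumption, as are the function names and the toy network. Real analyses typically also compare the value against randomised target sets, which is omitted here.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path hop counts from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closest_proximity(graph, drug_targets, disease_genes):
    """Mean distance from each drug target to its nearest disease gene."""
    total = 0
    for target in drug_targets:
        dist = bfs_distances(graph, target)
        reachable = [dist[g] for g in disease_genes if g in dist]
        if not reachable:
            raise ValueError(f"no disease gene reachable from {target!r}")
        total += min(reachable)
    return total / len(drug_targets)


# Hypothetical undirected toy interactome: A-B-C-D chain with E attached to B.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["B"],
}
disease = {"C", "D"}

far_drug = closest_proximity(toy_graph, ["A", "E"], disease)   # targets two hops away
near_drug = closest_proximity(toy_graph, ["B", "D"], disease)  # one target is a disease gene
```

A smaller value means the drug's targets sit closer to the disease module, which is the intuition behind ranking repurposing candidates by proximity.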
Affiliation(s)
- Faheem Ahmed
  - Department of Mechatronics Engineering, Jeju National University, South Korea
- Young Jin Yang
  - Korea Institute of Industrial Technology, 102 Jejudaehak-ro, Jeju-si 63243, South Korea
- Young Woo Kim
  - Korea Institute of Industrial Technology, 102 Jejudaehak-ro, Jeju-si 63243, South Korea
- Jeong Beom Ko
  - Korea Institute of Industrial Technology, 102 Jejudaehak-ro, Jeju-si 63243, South Korea
- Kyung Hyun Choi
  - Department of Mechatronics Engineering, Jeju National University, South Korea
24
Sajeev A, BharathwajChetty B, Vishwa R, Alqahtani MS, Abbas M, Sethi G, Kunnumakkara AB. Crosstalk between Non-Coding RNAs and Wnt/β-Catenin Signaling in Head and Neck Cancer: Identification of Novel Biomarkers and Therapeutic Agents. Noncoding RNA 2023; 9:63. [PMID: 37888209 PMCID: PMC10610319 DOI: 10.3390/ncrna9050063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2023] [Revised: 09/25/2023] [Accepted: 10/08/2023] [Indexed: 10/28/2023] Open
Abstract
Head and neck cancers (HNC) encompass a broad spectrum of neoplastic disorders characterized by significant morbidity and mortality. While contemporary therapeutic interventions offer promise, challenges persist due to tumor recurrence and metastasis. Central to HNC pathogenesis is the aberration in numerous signaling cascades. Prominently, the Wnt signaling pathway has been critically implicated in the etiology of HNC, as supported by a plethora of research. Equally important, variations in the expression of non-coding RNAs (ncRNAs) have been found to modulate key cancer phenotypes such as cellular proliferation, epithelial-mesenchymal transition, metastatic potential, recurrence, and treatment resistance. This review aims to provide an exhaustive insight into the multifaceted influence of ncRNAs on HNC, with specific emphasis on their interactions with the Wnt/β-catenin (WBC) signaling axis. We further delineate the effect of ncRNAs in either exacerbating or attenuating HNC progression via interference with WBC signaling. An overview of the mechanisms underlying the interplay between ncRNAs and WBC signaling is also presented. In addition, we describe the potential of various ncRNAs in enhancing the efficacy of chemotherapeutic and radiotherapeutic modalities. In summary, this assessment posits the potential of ncRNAs as therapeutic agents targeting the WBC signaling pathway in HNC management.
Affiliation(s)
- Anjana Sajeev
  - Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati 781039, Assam, India
- Bandari BharathwajChetty
  - Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati 781039, Assam, India
- Ravichandran Vishwa
  - Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati 781039, Assam, India
- Mohammed S. Alqahtani
  - Radiological Sciences Department, College of Applied Medical Sciences, King Khalid University, Abha 61421, Saudi Arabia
  - BioImaging Unit, Space Research Centre, Michael Atiyah Building, University of Leicester, Leicester LE1 7RH, UK
- Mohamed Abbas
  - Electrical Engineering Department, College of Engineering, King Khalid University, Abha 61421, Saudi Arabia
- Gautam Sethi
  - Department of Pharmacology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117600, Singapore
- Ajaikumar B. Kunnumakkara
  - Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati 781039, Assam, India
25
Cole R, Holroyd N, Tracey A, Berriman M, Viney M. The parasitic nematode Strongyloides ratti exists predominantly as populations of long-lived asexual lineages. Nat Commun 2023; 14:6427. [PMID: 37833369 PMCID: PMC10575991 DOI: 10.1038/s41467-023-42250-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2022] [Accepted: 10/05/2023] [Indexed: 10/15/2023] Open
Abstract
Nematodes are important parasites of people and animals, and in natural ecosystems they are a major ecological force. Strongyloides ratti is a common parasitic nematode of wild rats and we have investigated its population genetics using single-worm, whole-genome sequencing. We find that S. ratti populations in the UK consist of mixtures of mainly asexual lineages that are widely dispersed across a host population. These parasite lineages are likely very old and may have originated in Asia, from where rats originated. Genes that underlie the parasitic phase of the parasite's life cycle are hyperdiverse compared with the rest of the genome, and this may allow the parasites to maximise their fitness in a diverse host population. These patterns of parasitic nematode population genetics have not been found before and may also apply to Strongyloides spp. that infect people, which will affect how we should approach their control.
Affiliation(s)
- Rebecca Cole
  - School of Biological Sciences, University of Bristol, Bristol, BS8 1TQ, UK
- Nancy Holroyd
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Alan Tracey
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Matt Berriman
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - School of Infection & Immunity, University of Glasgow, 120 University Place, Glasgow, G12 8TA, UK
- Mark Viney
  - School of Biological Sciences, University of Bristol, Bristol, BS8 1TQ, UK
  - Department of Evolution, Ecology and Behaviour, University of Liverpool, Liverpool, L69 7ZB, UK
26
Khani F, Hooper WF, Wang X, Chu TR, Shah M, Winterkorn L, Sigouros M, Conteduca V, Pisapia D, Wobker S, Walker S, Graff JN, Robinson B, Mosquera JM, Sboner A, Elemento O, Robine N, Beltran H. Evolution of structural rearrangements in prostate cancer intracranial metastases. NPJ Precis Oncol 2023; 7:91. [PMID: 37704749 PMCID: PMC10499931 DOI: 10.1038/s41698-023-00435-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Accepted: 08/08/2023] [Indexed: 09/15/2023] Open
Abstract
Intracranial metastases in prostate cancer are uncommon but clinically aggressive. A detailed molecular characterization of prostate cancer intracranial metastases would improve our understanding of their pathogenesis and the search for new treatment strategies. We evaluated the clinical and molecular characteristics of 36 patients with prostate cancer metastatic to either the dura or brain parenchyma. We performed whole genome sequencing (WGS) of 10 intracranial prostate cancer metastases, as well as WGS of primary prostate tumors from men who later developed metastatic disease (n = 6) and non-brain prostate cancer metastases (n = 36). This first whole genome sequencing study of prostate intracranial metastases led to several new insights. First, there was a higher diversity of complex structural alterations in prostate cancer intracranial metastases compared to primary tumor tissues. Chromothripsis and chromoplexy events seemed to dominate, yet there were few enrichments of specific categories of structural variants compared with non-brain metastases. Second, aberrations involving the AR gene, including AR enhancer gain, were observed in 7/10 (70%) of intracranial metastases, as well as recurrent loss-of-function aberrations involving TP53 in 8/10 (80%), RB1 in 2/10 (20%), BRCA2 in 2/10 (20%), and activation of the PI3K/AKT/PTEN pathway in 8/10 (80%). These alterations were frequently present in tumor tissues from other sites of disease obtained concurrently or sequentially from the same individuals. Third, clonality analysis points to genomic factors and evolutionary bottlenecks that contribute to metastatic spread in patients with prostate cancer. These results describe the aggressive molecular features underlying intracranial metastasis that may inform future diagnostic and treatment approaches.
Affiliation(s)
- Francesca Khani
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Xiaofei Wang
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Michael Sigouros
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Vincenza Conteduca
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - Department of Medical and Surgical Sciences, Unit of Medical Oncology and Biomolecular Therapy, University of Foggia, Policlinico Riuniti, Foggia, Italy
- David Pisapia
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Sara Wobker
  - Department of Pathology and Laboratory Medicine, UNC Chapel Hill, Chapel Hill, NC, USA
- Sydney Walker
  - Department of Medical Oncology, Oregon Health Sciences University, Portland, OR, USA
- Julie N Graff
  - Department of Medical Oncology, Oregon Health Sciences University, Portland, OR, USA
- Brian Robinson
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Juan Miguel Mosquera
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Andrea Sboner
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
  - Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- Olivier Elemento
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Himisha Beltran
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
27
Scammell BH, Tchio C, Song Y, Nishiyama T, Louie TL, Dashti HS, Nakatochi M, Zee PC, Daghlas I, Momozawa Y, Cai J, Ollila HM, Redline S, Wakai K, Sofer T, Suzuki S, Lane JM, Saxena R. Multi-ancestry genome-wide analysis identifies shared genetic effects and common genetic variants for self-reported sleep duration. Hum Mol Genet 2023; 32:2797-2807. [PMID: 37384397 PMCID: PMC10656946 DOI: 10.1093/hmg/ddad101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2022] [Revised: 06/01/2023] [Accepted: 06/02/2023] [Indexed: 07/01/2023] Open
Abstract
Both short (≤6 h per night) and long sleep duration (≥9 h per night) are associated with increased risk of chronic diseases. Despite evidence linking habitual sleep duration and risk of disease, the genetic determinants of sleep duration in the general population are poorly understood, especially outside of European (EUR) populations. Here, we report that a polygenic score of 78 European ancestry sleep duration single-nucleotide polymorphisms (SNPs) is associated with sleep duration in an African (n = 7288; P = 0.003), an East Asian (n = 13 618; P = 6 × 10⁻⁴) and a South Asian (n = 7485; P = 0.025) genetic ancestry cohort, but not in a Hispanic/Latino cohort (n = 8726; P = 0.71). Furthermore, in a pan-ancestry (N = 483 235) meta-analysis of genome-wide association studies (GWAS) for habitual sleep duration, 73 loci are associated with genome-wide statistical significance. Follow-up of five loci (near HACD2, COG5, PRR12, SH3RF1 and KCNQ5) identified expression-quantitative trait loci for PRR12 and COG5 in brain tissues and pleiotropic associations with cardiovascular and neuropsychiatric traits. Overall, our results suggest that the genetic basis of sleep duration is at least partially shared across diverse ancestry groups.
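A polygenic score of the kind mentioned above is, in its simplest form, a weighted sum of effect-allele dosages. The sketch below shows that arithmetic only; the SNP identifiers, weights, and dosages are invented for illustration, and the abstract does not describe how the authors weighted or scaled their 78-SNP score.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) over the scored SNPs.

    `weights` maps SNP id -> per-allele effect size (hypothetical values here);
    `dosages` maps SNP id -> number of effect alleles carried by one person.
    """
    missing = [snp for snp in weights if snp not in dosages]
    if missing:
        raise KeyError(f"dosage missing for SNPs: {missing}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Invented example: three SNPs instead of the 78 used in the study.
weights = {"snp1": 0.5, "snp2": -0.25, "snp3": 1.0}
person = {"snp1": 2, "snp2": 1, "snp3": 0}

score = polygenic_score(person, weights)  # 0.5*2 + (-0.25)*1 + 1.0*0
```

Testing the association then amounts to regressing reported sleep duration on this score within each ancestry cohort, usually with covariates such as age, sex, and ancestry principal components.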
Collapse
Affiliation(s)
- B H Scammell
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
| | - C Tchio
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Cardiovascular Research Institute, Morehouse School of Medicine, Atlanta, GA 30310, USA
| | - Y Song
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
| | - T Nishiyama
- Department of Public Health, Nagoya City University Graduate School of Medicine, Nagoya 467-8701, Japan
| | - T L Louie
- Department of Biostatistics, University of Washington, Seattle, WA 98105, USA
| | - H S Dashti
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - M Nakatochi
- Public Health Informatics Unit, Department of Integrated Health Sciences, Nagoya University Graduate School of Medicine, Nagoya 467-8701, Japan
| | - P C Zee
- Center for Circadian and Sleep Medicine, Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
| | - I Daghlas
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
| | - Y Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan
| | - J Cai
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - H M Ollila
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Institute for Molecular Medicine, HiLIFE, University of Helsinki, Helsinki 00014, Finland
| | - S Redline
- Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA 02115, USA
| | - K Wakai
- Department of Preventive Medicine, Nagoya University Graduate School of Medicine, Nagoya 467-8701, Japan
| | - T Sofer
- Department of Biostatistics, University of Washington, Seattle, WA 98105, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA 02115, USA
| | - S Suzuki
- Department of Public Health, Nagoya City University Graduate School of Medicine, Nagoya 467-8701, Japan
| | - J M Lane
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA 02115, USA
| | - R Saxena
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02215, USA
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| |
|
28
|
Kabir M, Stuart HM, Lopes FM, Fotiou E, Keavney B, Doig AJ, Woolf AS, Hentges KE. Predicting congenital renal tract malformation genes using machine learning. Sci Rep 2023; 13:13204. [PMID: 37580336 PMCID: PMC10425350 DOI: 10.1038/s41598-023-38110-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 07/03/2023] [Indexed: 08/16/2023] Open
Abstract
Congenital renal tract malformations (RTMs) are the major cause of severe kidney failure in children. Studies to date have identified defined genetic causes for only a minority of human RTMs. While some RTMs may be caused by poorly defined environmental perturbations affecting organogenesis, it is likely that numerous causative genetic variants have yet to be identified. Unfortunately, the speed of discovering further genetic causes for RTMs is limited by challenges in prioritising candidate genes harbouring sequence variants. Here, we exploited the computer-based artificial intelligence methodology of supervised machine learning to identify genes with a high probability of being involved in renal development. These genes, when mutated, are promising candidates for causing RTMs. With this methodology, the machine learning classifier determines which attributes are common to renal development genes and identifies genes possessing these attributes. Here we report the validation of an RTM gene classifier and provide predictions of the RTM association status for all protein-coding genes in the mouse genome. Overall, our predictions, whilst not definitive, can inform the prioritisation of genes when evaluating patient sequence data for genetic diagnosis. This knowledge of renal developmental genes will accelerate the processes of reaching a genetic diagnosis for patients born with RTMs.
Affiliation(s)
- Mitra Kabir
- Division of Evolution, Infection and Genomics, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK
| | - Helen M Stuart
- Division of Evolution, Infection and Genomics, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK
- Manchester Centre for Genomic Medicine, St. Mary's Hospital, Health Innovation Manchester, Manchester University Foundation NHS Trust, Manchester, M13 9WL, UK
| | - Filipa M Lopes
- Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PL, UK
| | - Elisavet Fotiou
- Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, The University of Manchester, Manchester, M13 9PL, UK
- C.B.B Lifeline Biotech Ltd, 5 Propontidos Street, Strovolos, 2033, Nicosia, Cyprus
| | - Bernard Keavney
- Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, The University of Manchester, Manchester, M13 9PL, UK
- Manchester Heart Institute, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
| | - Andrew J Doig
- Division of Neuroscience, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Stopford Building, Manchester, M13 9BL, UK
| | - Adrian S Woolf
- Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PL, UK
- Department of Nephrology, Royal Manchester Children's Hospital, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
| | - Kathryn E Hentges
- Division of Evolution, Infection and Genomics, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK.
| |
|
29
|
Chakraborty A, Mondal S, Mahajan S, Sharma VK. High-quality genome assemblies provide clues on the evolutionary advantage of blue peafowl over green peafowl. Heliyon 2023; 9:e18571. [PMID: 37576271 PMCID: PMC10412995 DOI: 10.1016/j.heliyon.2023.e18571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 07/14/2023] [Accepted: 07/20/2023] [Indexed: 08/15/2023] Open
Abstract
An intriguing example of differential adaptability is the case of two Asian peafowl species, Pavo cristatus (blue peafowl) and Pavo muticus (green peafowl), where the former has a "Least Concern" conservation status and the latter is an "Endangered" species. To understand the genetic basis of this differential adaptability of the two peafowl species, a comparative analysis of these species is much needed to gain the genomic and evolutionary insights. Thus, we constructed a high-quality genome assembly of blue peafowl with an N50 value of 84.81 Mb (pseudochromosome-level assembly), and a high-confidence coding gene set to perform the genomic and evolutionary analyses of blue and green peafowls with 49 other avian species. The analyses revealed adaptive evolution of genes related to neuronal development, immunity, and skeletal muscle development in these peafowl species. Major genes related to axon guidance such as NEO1 and UNC5, semaphorin (SEMA), and ephrin receptor showed adaptive evolution in peafowl species. However, blue peafowl showed the presence of 42% more coding genes compared to the green peafowl along with a higher number of species-specific gene clusters, segmental duplicated genes and expanded gene families, and comparatively higher evolution in neuronal and developmental pathways. Blue peafowl also showed longer branch length compared to green peafowl in the species phylogenetic tree. These genomic insights obtained from the high-quality genome assembly of P. cristatus constructed in this study provide new clues on the superior adaptability of the blue peafowl over green peafowl despite having a recent species divergence time.
Affiliation(s)
- Abhisek Chakraborty
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, 462066, Madhya Pradesh, India
| | - Samuel Mondal
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, 462066, Madhya Pradesh, India
| | - Shruti Mahajan
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, 462066, Madhya Pradesh, India
| | - Vineet K. Sharma
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, 462066, Madhya Pradesh, India
| |
|
30
|
Abduljaleel Z, Melebari S, Athar M, Dehlawi S, Udhaya Kumar S, Aziz SA, Dannoun AI, Malik SM, Thasleem J, George Priya Doss C. SARS-CoV-2 vaccine breakthrough infections (VBI) by Omicron variant (B.1.1.529) and consequences in structural and functional impact. Cell Signal 2023:110798. [PMID: 37423342 DOI: 10.1016/j.cellsig.2023.110798] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/04/2023] [Revised: 06/18/2023] [Accepted: 07/04/2023] [Indexed: 07/11/2023]
Abstract
This study investigated the efficacy of existing vaccines against hospitalization and infection due to the Omicron variant of COVID-19, particularly for those who received two doses of Moderna or Pfizer vaccines and one dose of Johnson & Johnson vaccine or who were vaccinated more than five months before. A total of 36 variants in Omicron's spike protein, targeted by all three vaccinations, have made antibodies less effective at neutralizing the virus. The genotyping of the SARS-CoV-2 viral sequence revealed clinically significant variants such as E484K in three genetic mutations (T95I, D614G, and del142-144). A woman showed two of these mutations, indicating a potential risk of infection after successful immunization, as recently reported by Hacisuleyman (2021). We examine the effects of mutations on domains (NID, RBM, and SD2) found at the interfaces of the spike domains Omicron B.1.1.529, Delta/B.1.1.529, Alpha/B.1.1.7, VUM B.1.526, B.1.575.2, and B.1.1214 (formerly VOI Iota). We tested the affinity of Omicron for ACE2 in the wild-type and mutant spike proteins using atomistic molecular dynamics simulations. According to the binding free energies calculated during mutagenesis, the ACE2 bound Omicron spikes more strongly than the wild strain SARS-CoV-2. T95I, D614G, and E484K are three substitutions that significantly contribute to RBD, corresponding to ACE2 binding energies and a doubling of the electrostatic potential of Omicron spike proteins. The Omicron appears to bind to ACE2 with greater affinity, increasing its infectivity and transmissibility. The spike virus was designed to strengthen antibody immune evasion through binding while boosting receptor binding by enhancing IgG and IgM antibodies that stimulate human β-cell, as opposed to the wild strain, which has more vital stimulation of both antibodies.
Affiliation(s)
- Zainularifeen Abduljaleel
- Science and Technology Unit, Umm Al-Qura University, P.O. Box 715, Makkah 21955, Saudi Arabia; Department of Medical Genetics, Faculty of Medicine, Umm Al-Qura University, P.O. Box 715, Makkah 21955, Saudi Arabia.
| | - Sami Melebari
- Department of Molecular Biology, The Regional Laboratory, Ministry of Health (MOH), Makkah, Saudi Arabia
| | - Mohammed Athar
- Science and Technology Unit, Umm Al-Qura University, P.O. Box 715, Makkah 21955, Saudi Arabia; Department of Medical Genetics, Faculty of Medicine, Umm Al-Qura University, P.O. Box 715, Makkah 21955, Saudi Arabia
| | - Saied Dehlawi
- Department of Molecular Biology, The Regional Laboratory, Ministry of Health (MOH), Makkah, Saudi Arabia
| | - S Udhaya Kumar
- Laboratory of Integrative Genomics, Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology (VIT), Vellore 632014, Tamil Nadu, India
| | - Syed A Aziz
- Department of Pathology and Lab Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON K1H 8M5, Canada
| | - Anas Ibrahim Dannoun
- Department of Medical Genetics, Faculty of Medicine, Umm Al-Qura University, P.O. Box 715, Makkah 21955, Saudi Arabia
| | - Shaheer M Malik
- Department of Chemistry, Faculty of Applied Sciences, Umm Al-Qura University, Makkah, Saudi Arabia
| | - Jasheela Thasleem
- Jamal Mohamed College, Bharathidasan University, 7, Race Course Road, Kaja Nagar, Tiruchirappalli, Tamil Nadu 620020, India
| | - C George Priya Doss
- Laboratory of Integrative Genomics, Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology (VIT), Vellore 632014, Tamil Nadu, India
| |
|
31
|
Fernández Álvarez J, Navas González FJ, León Jurado JM, González Ariza A, Martínez Martínez MA, Pastrana CI, Pizarro Inostroza MG, Delgado Bermejo JV. Discriminant canonical tool for inferring the effect of αS1, αS2, β, and κ casein haplotypes and haplogroups on zoometric/linear appraisal breeding values in Murciano-Granadina goats. Front Vet Sci 2023; 10:1138528. [PMID: 37483293 PMCID: PMC10360128 DOI: 10.3389/fvets.2023.1138528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 06/19/2023] [Indexed: 07/25/2023] Open
Abstract
Genomic tools have shown promising results in maximizing breeding outcomes, but their impact has not yet been explored. This study aimed to outline the effect of the individual haplotypes of each component of the casein complex (αS1, β, αS2, and κ-casein) on zoometric/linear appraisal breeding values. A discriminant canonical analysis was performed to study the relationship between the predicted breeding value for 17 zoometric/linear appraisal traits and the aforementioned casein gene haplotypic sequences. The analysis considered a total of 41,323 zoometric/linear appraisal records from 22,727 primiparous does, 17,111 multiparous does, and 1,485 bucks registered in the Murciano-Granadina goat breed herdbook. Results suggest that, although a lack of significant differences (p > 0.05) was reported across the predictive breeding values of zoometric/linear appraisal traits for αS1, αS2, and κ casein, significant differences were found for β casein (p < 0.05). The presence of β casein haplotypic sequences GAGACCCC, GGAACCCC, GGAACCTC, GGAATCTC, GGGACCCC, GGGATCTC, and GGGGCCCC, linked to differential combinations of increased quantities of higher quality milk in terms of its composition, may also be connected to increased zoometric/linear appraisal predicted breeding values. Selection must be performed carefully, given the fact that the consideration of apparently desirable animals that present the haplotypic sequence GGGATCCC in the β casein gene, due to their positive predicted breeding values for certain zoometric/linear appraisal traits such as rear insertion height, bone quality, anterior insertion, udder depth, rear legs side view, and rear legs rear view, may lead to an indirect selection against the other zoometric/linear appraisal traits and in turn lead to an inefficient selection toward an optimal dairy morphological type in Murciano-Granadina goats.
Contrastingly, the consideration of animals presenting the GGAACCCC haplotypic sequence involves also considering animals that increase the genetic potential for all zoometric/linear appraisal traits, thus making them recommendable as breeding animals. The relevance of this study relies on the fact that the information derived from these analyses will enhance the selection of breeding individuals, in which a desirable dairy type is indirectly sought, through the haplotypic sequences in the β casein locus, which is not currently routinely considered in the Murciano-Granadina goat breeding program.
Affiliation(s)
| | - José M. León Jurado
- Agropecuary Provincial Centre, Diputación Provincial de Córdoba, Córdoba, Spain
| | - Antonio González Ariza
- Department of Genetics, University of Córdoba, Córdoba, Spain
- Agropecuary Provincial Centre, Diputación Provincial de Córdoba, Córdoba, Spain
| | - María G. Pizarro Inostroza
- Department of Genetics, University of Córdoba, Córdoba, Spain
- Animal Breeding Consulting, S.L., Córdoba Science and Technology Park Rabanales, Córdoba, Spain
| |
|
32
|
Luzuriaga-Neira AR, Ritchie AM, Payne BL, Carrillo-Parramon O, Liberles DA, Alvarez-Ponce D. Highly Abundant Proteins Are Highly Thermostable. Genome Biol Evol 2023; 15:evad112. [PMID: 37399326 DOI: 10.1093/gbe/evad112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/08/2023] [Indexed: 07/05/2023] Open
Abstract
Highly abundant proteins tend to evolve slowly (a trend called E-R anticorrelation), and a number of hypotheses have been proposed to explain this phenomenon. The misfolding avoidance hypothesis attributes the E-R anticorrelation to the abundance-dependent toxic effects of protein misfolding. To avoid these toxic effects, protein sequences (particularly those of highly expressed proteins) would be under selection to fold properly. One prediction of the misfolding avoidance hypothesis is that highly abundant proteins should exhibit high thermostability (i.e., a highly negative free energy of folding, ΔG). Thus far, only a handful of analyses have tested for a relationship between protein abundance and thermostability, producing contradictory results. These analyses have been limited by 1) the scarcity of ΔG data, 2) the fact that these data have been obtained by different laboratories and under different experimental conditions, 3) the problems associated with using proteins' melting energy (Tm) as a proxy for ΔG, and 4) the difficulty of controlling for potentially confounding variables. Here, we use computational methods to compare the free energy of folding of pairs of human-mouse orthologous proteins with different expression levels. Even though the effect size is limited, the most highly expressed ortholog is often the one with a more negative ΔG of folding, indicating that highly expressed proteins are often more thermostable.
Affiliation(s)
- Andrew M Ritchie
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, USA
| | - David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, USA
| |
|
33
|
Zabardast A, Tamer EG, Son YA, Yılmaz A. An automated framework for evaluation of deep learning models for splice site predictions. Sci Rep 2023; 13:10221. [PMID: 37353532 PMCID: PMC10290104 DOI: 10.1038/s41598-023-34795-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 05/08/2023] [Indexed: 06/25/2023] Open
Abstract
A novel framework for the automated evaluation of various deep learning-based splice site detectors is presented. The framework eliminates time-consuming development and experimenting activities for different codebases, architectures, and configurations to obtain the best models for a given RNA splice site dataset. RNA splicing is a cellular process in which pre-mRNAs are processed into mature mRNAs and used to produce multiple mRNA transcripts from a single gene sequence. Since the advancement of sequencing technologies, many splice site variants have been identified and associated with the diseases. So, RNA splice site prediction is essential for gene finding, genome annotation, disease-causing variants, and identification of potential biomarkers. Recently, deep learning models performed highly accurately for classifying genomic signals. Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and its bidirectional version (BLSTM), Gated Recurrent Unit (GRU), and its bidirectional version (BGRU) are promising models. During genomic data analysis, CNN's locality feature helps where each nucleotide correlates with other bases in its vicinity. In contrast, BLSTM can be trained bidirectionally, allowing sequential data to be processed from forward and reverse directions. Therefore, it can process 1-D encoded genomic data effectively. Even though both methods have been used in the literature, a performance comparison was missing. To compare selected models under similar conditions, we have created a blueprint for a series of networks with five different levels. As a case study, we compared CNN and BLSTM models' learning capabilities as building blocks for RNA splice site prediction in two different datasets. 
Overall, CNN performed better with [Formula: see text] accuracy ([Formula: see text] improvement), [Formula: see text] F1 score ([Formula: see text] improvement), and [Formula: see text] AUC-PR ([Formula: see text] improvement) in human splice site prediction. Likewise, an outperforming performance with [Formula: see text] accuracy ([Formula: see text] improvement), [Formula: see text] F1 score ([Formula: see text] improvement), and [Formula: see text] AUC-PR ([Formula: see text] improvement) is achieved in C. elegans splice site prediction. Overall, our results showed that CNN learns faster than BLSTM and BGRU. Moreover, CNN performs better at extracting sequence patterns than BLSTM and BGRU. To our knowledge, no other framework is developed explicitly for evaluating splice detection models to decide the best possible model in an automated manner. So, the proposed framework and the blueprint would help selecting different deep learning models, such as CNN vs. BLSTM and BGRU, for splice site analysis or similar classification tasks and in different problems.
Affiliation(s)
- Amin Zabardast
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Elif Güney Tamer
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Yeşim Aydın Son
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Arif Yılmaz
- Institute of Data Science, Maastricht University, Maastricht, The Netherlands.
| |
|
34
|
Kumar S, Agrawal A, Vindal V. BCLncRDB: a comprehensive database of LncRNAs associated with breast cancer. Funct Integr Genomics 2023; 23:178. [PMID: 37227514 DOI: 10.1007/s10142-023-01112-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/11/2023] [Accepted: 05/17/2023] [Indexed: 05/26/2023]
Abstract
Breast cancer, the most common cancer in women, is characterized by high morbidity and mortality worldwide. Recent evidence has shown that long non-coding RNAs (lncRNAs) play a crucial role in the development and progression of breast cancer. However, despite increasing data and evidence indicating the implication of lncRNAs in breast cancer, no web resource or database exists primarily for lncRNAs associated with only breast cancer. Therefore, we developed a manually curated, comprehensive database, "BCLncRDB," for lncRNAs associated with breast cancer. For this, we collected, processed, and analyzed available data on breast cancer-associated lncRNAs from different sources, including previously published research articles, the Gene Expression Omnibus (GEO) Database of the National Centre for Biotechnology Information (NCBI), The Cancer Genome Atlas (TCGA), and the Ensembl database; subsequently, these data were hosted at BCLncRDB for public access. Currently, the database contains 5324 unique breast cancer-lncRNA associations and has the following features: (i) a user-friendly, easy-to-use web interface for searching and browsing about lncRNAs of the user's interest, (ii) differentially expressed and methylated lncRNAs, (iii) stage- and subtype-specific lncRNAs, and (iv) drugs, subcellular localization, sequence, and chromosome information of these lncRNAs. Thus, the BCLncRDB provides a one-stop dedicated platform for exploring breast cancer-related lncRNAs to advance and support the ongoing research on this disease. The BCLncRDB is publicly available for use at http://sls.uohyd.ac.in/new/bclncrdb_v1 .
Affiliation(s)
- Swapnil Kumar
- Department of Biotechnology & Bioinformatics, School of Life Sciences, South Campus, University of Hyderabad, Prof. C. R. Rao Road, Gachibowli, Hyderabad, 500046, India
| | - Avantika Agrawal
- Department of Biotechnology & Bioinformatics, School of Life Sciences, South Campus, University of Hyderabad, Prof. C. R. Rao Road, Gachibowli, Hyderabad, 500046, India
| | - Vaibhav Vindal
- Department of Biotechnology & Bioinformatics, School of Life Sciences, South Campus, University of Hyderabad, Prof. C. R. Rao Road, Gachibowli, Hyderabad, 500046, India.
| |
|
35
|
Wu Y, Jin M, Fernandez M, Hart KL, Liao A, Ge X, Fernandes SM, McDonald T, Chen Z, Röth D, Ghoda LY, Marcucci G, Kalkum M, Pillai RK, Danilov AV, Li JJ, Chen J, Brown JR, Rosen ST, Siddiqi T, Wang L. METTL3-Mediated m6A Modification Controls Splicing Factor Abundance and Contributes to Aggressive CLL. Blood Cancer Discov 2023; 4:228-245. [PMID: 37067905 PMCID: PMC10150290 DOI: 10.1158/2643-3230.bcd-22-0156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 01/30/2023] [Accepted: 03/10/2023] [Indexed: 04/18/2023] Open
Abstract
RNA splicing dysregulation underlies the onset and progression of cancers. In chronic lymphocytic leukemia (CLL), spliceosome mutations leading to aberrant splicing occur in ∼20% of patients. However, the mechanism for splicing defects in spliceosome-unmutated CLL cases remains elusive. Through an integrative transcriptomic and proteomic analysis, we discover that proteins involved in RNA splicing are posttranscriptionally upregulated in CLL cells, resulting in splicing dysregulation. The abundance of splicing complexes is an independent risk factor for poor prognosis. Moreover, increased splicing factor expression is highly correlated with the abundance of METTL3, an RNA methyltransferase that deposits N6-methyladenosine (m6A) on mRNA. METTL3 is essential for cell growth in vitro and in vivo and controls splicing factor protein expression in a methyltransferase-dependent manner through m6A modification-mediated ribosome recycling and decoding. Our results uncover METTL3-mediated m6A modification as a novel regulatory axis in driving splicing dysregulation and contributing to aggressive CLL. SIGNIFICANCE METTL3 controls widespread splicing factor abundance via translational control of m6A-modified mRNA, contributes to RNA splicing dysregulation and disease progression in CLL, and serves as a potential therapeutic target in aggressive CLL. See related commentary by Janin and Esteller, p. 176. This article is highlighted in the In This Issue feature, p. 171.
Affiliation(s)
- Yiming Wu
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
| | - Meiling Jin
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
| | - Mike Fernandez
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
| | - Kevyn L. Hart
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
| | - Aijun Liao
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
| | - Xinzhou Ge
- Department of Statistics, University of California, Los Angeles, California
- Department of Computational Medicine, University of California, Los Angeles, California
| | - Stacey M. Fernandes
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Tinisha McDonald
- The Hematopoietic Tissue Biorepository, City of Hope National Comprehensive Cancer Center, Duarte, California
- Department of Hematological Malignancies Translational Sciences, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California
| | - Zhenhua Chen
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
| | - Daniel Röth
- Department of Molecular Imaging and Therapy, Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, California
| | - Lucy Y. Ghoda
- The Hematopoietic Tissue Biorepository, City of Hope National Comprehensive Cancer Center, Duarte, California
- Department of Hematological Malignancies Translational Sciences, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California
| | - Guido Marcucci
- The Hematopoietic Tissue Biorepository, City of Hope National Comprehensive Cancer Center, Duarte, California
- Department of Hematological Malignancies Translational Sciences, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California
- Department of Hematology & Hematopoietic Cell Transplantation, City of Hope Comprehensive Cancer Center, Duarte, California
| | - Markus Kalkum
- Department of Molecular Imaging and Therapy, Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, California
| | - Raju K. Pillai
- Department of Pathology, City of Hope National Comprehensive Cancer Center, Duarte, California
| | - Alexey V. Danilov
- Department of Hematology & Hematopoietic Cell Transplantation, City of Hope Comprehensive Cancer Center, Duarte, California
- Toni Stephenson Lymphoma Center, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, California
- Department of Computational Medicine, University of California, Los Angeles, California
| | - Jianjun Chen
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
| | - Jennifer R. Brown
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Steven T. Rosen
- Department of Hematology & Hematopoietic Cell Transplantation, City of Hope Comprehensive Cancer Center, Duarte, California
- Toni Stephenson Lymphoma Center, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California
| | - Tanya Siddiqi
- Department of Hematology & Hematopoietic Cell Transplantation, City of Hope Comprehensive Cancer Center, Duarte, California
- Toni Stephenson Lymphoma Center, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California
| | - Lili Wang
- Department of Systems Biology, Beckman Research Institute, City of Hope National Comprehensive Cancer Center, Monrovia, California
- Toni Stephenson Lymphoma Center, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California
| |
|
36
|
Zhao L, Zhang G, Tang A, Huang B, Mi D. Microgravity alters the expressions of DNA repair genes and their regulatory miRNAs in space-flown Caenorhabditis elegans. Life Sciences in Space Research 2023; 37:25-38. [PMID: 37087176 DOI: 10.1016/j.lssr.2023.02.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/08/2022] [Revised: 11/14/2022] [Accepted: 02/06/2023] [Indexed: 05/03/2023]
Abstract
During spaceflight, multiple unique hazardous factors, particularly microgravity and space radiation, can induce different types of DNA damage, which pose a constant threat to genomic integrity and stability of living organisms. Although organisms have evolved different kinds of conserved DNA repair pathways to eliminate this DNA damage on Earth, the impact of space microgravity on the expressions of these DNA repair genes and their regulatory miRNAs has not been fully explored. In this study, we integrated all existing datasets, including both transcriptional and miRNA microarrays in wild-type (WT) Caenorhabditis elegans that were exposed to the treatments of spaceflight (SF), spaceflight control with a 1g centrifugal device (SC), and ground control (GC) in three space experiments with the periods of 4, 8 and 16.5 days. The results of principal component analysis showed the gene expression patterns for five major DNA repair pathways (i.e., non-homologous end joining (NHEJ), homologous recombination (HR), mismatch repair (MMR), nucleotide excision repair (NER), and base excision repair (BER)) were well separated and clustered between SF/GC and SC/GC treatments after three spaceflights. In the 16.5-days space experiment, we also selected the datasets of dys-1 mutant and ced-1 mutant of C. elegans, which respectively presented microgravity-insensitivity and radiosensitivity. Compared to the WT C. elegans flown in the 16.5-days spaceflight, the separation distances between SF and SC samples were significantly reduced in the dys-1 mutant, while greatly enhanced in the ced-1 mutant for five DNA repair pathways. By comparing the results of differential expression analysis in SF/GC versus SC/GC samples, we found the DNA repair genes annotated in the pathways of BER and NER were prominently down-regulated under microgravity during both the 4- and 8-days spaceflights. 
By contrast, under microgravity the genes annotated in MMR were predominantly up-regulated during the 4-days spaceflight, and those annotated in HR were mainly up-regulated during the 8-days spaceflight. Moreover, most of the DNA repair genes annotated in the pathways of BER, NER, MMR, and HR were up-regulated under microgravity during the 16.5-days spaceflight. Using miRNA-mRNA integrated analysis, we determined the regulatory networks of differentially expressed DNA repair genes and their regulatory miRNAs in WT C. elegans after three spaceflights. The differentially expressed miRNAs under SF and SC treatments were analyzed relative to GC conditions for all three spaceflights, and some altered miRNAs that responded to SF and SC could regulate the expression of corresponding DNA repair genes annotated in different DNA repair pathways. In summary, these findings indicate that microgravity can significantly alter the expression patterns of DNA repair genes and their regulatory miRNAs in space-flown C. elegans. The alterations in the expression of DNA repair genes and the dominant DNA repair pathways under microgravity are possibly related to the spaceflight period. In addition, key miRNAs are identified as post-transcriptional regulators of the expression of various DNA repair genes under microgravity. These altered miRNAs that responded to microgravity may be implicated in regulating diverse DNA repair processes in space-flown C. elegans.
Affiliation(s)
- Lei Zhao
- Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026, Liaoning, China
- Ge Zhang
- Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026, Liaoning, China
- Aiping Tang
- College of Science, Dalian Maritime University, Dalian 116026, Liaoning, China
- Baohang Huang
- Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026, Liaoning, China
- Dong Mi
- College of Science, Dalian Maritime University, Dalian 116026, Liaoning, China
|
37
|
Moreno A, Taffet A, Tjahjono E, Anderson QL, Kirienko NV. Examining Sporadic Cancer Mutations Uncovers a Set of Genes Involved in Mitochondrial Maintenance. Genes (Basel) 2023; 14:1009. [PMID: 37239369 PMCID: PMC10218105 DOI: 10.3390/genes14051009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 04/25/2023] [Accepted: 04/27/2023] [Indexed: 05/28/2023] Open
Abstract
Mitochondria are key organelles for cellular health and metabolism and the activation of programmed cell death processes. Although pathways for regulating and re-establishing mitochondrial homeostasis have been identified over the past twenty years, the consequences of disrupting genes that regulate other cellular processes, such as division and proliferation, on affecting mitochondrial function remain unclear. In this study, we leveraged insights about increased sensitivity to mitochondrial damage in certain cancers, or genes that are frequently mutated in multiple cancer types, to compile a list of candidates for study. RNAi was used to disrupt orthologous genes in the model organism Caenorhabditis elegans, and a series of assays were used to evaluate these genes' importance for mitochondrial health. Iterative screening of ~1000 genes yielded a set of 139 genes predicted to play roles in mitochondrial maintenance or function. Bioinformatic analyses indicated that these genes are statistically interrelated. Functional validation of a sample of genes from this set indicated that disruption of each gene caused at least one phenotype consistent with mitochondrial dysfunction, including increased fragmentation of the mitochondrial network, abnormal steady-state levels of NADH or ROS, or altered oxygen consumption. Interestingly, RNAi-mediated knockdown of these genes often also exacerbated α-synuclein aggregation in a C. elegans model of Parkinson's disease. Additionally, human orthologs of the gene set showed enrichment for roles in human disorders. This gene set provides a foundation for identifying new mechanisms that support mitochondrial and cellular homeostasis.
Affiliation(s)
- Natalia V. Kirienko
- Department of BioSciences, Rice University, 6100 Main St, MS140, Houston, TX 77005, USA; (A.M.); (A.T.); (E.T.); (Q.L.A.)
|
38
|
Alsaafin A, Safarpoor A, Sikaroudi M, Hipp JD, Tizhoosh HR. Learning to predict RNA sequence expressions from whole slide images with applications for search and classification. Commun Biol 2023; 6:304. [PMID: 36949169 PMCID: PMC10033650 DOI: 10.1038/s42003-023-04583-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/22/2022] [Accepted: 02/13/2023] [Indexed: 03/24/2023] Open
Abstract
Deep learning methods are widely applied in digital pathology to address clinical challenges such as prognosis and diagnosis. As one of the most recent applications, deep models have also been used to extract molecular features from whole slide images. Although molecular tests carry rich information, they are often expensive, time-consuming, and require additional tissue to sample. In this paper, we propose tRNAsformer, an attention-based topology that can learn both to predict the bulk RNA-seq from an image and represent the whole slide image of a glass slide simultaneously. The tRNAsformer uses multiple instance learning to solve a weakly supervised problem while the pixel-level annotation is not available for an image. We conducted several experiments and achieved better performance and faster convergence in comparison to the state-of-the-art algorithms. The proposed tRNAsformer can assist as a computational pathology tool to facilitate a new generation of search and classification methods by combining the tissue morphology and the molecular fingerprint of the biopsy samples.
Affiliation(s)
- Areej Alsaafin
- Rhazes Lab, Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
- Kimia Lab, University of Waterloo, Waterloo, ON, Canada
- Jason D Hipp
- Division of Computational Pathology and AI, Mayo Clinic, Rochester, MN, USA
- H R Tizhoosh
- Rhazes Lab, Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
- Kimia Lab, University of Waterloo, Waterloo, ON, Canada
|
39
|
Figaschewski M, Sürün B, Tiede T, Kohlbacher O. The personalized cancer network explorer (PeCaX) as a visual analytics tool to support molecular tumor boards. BMC Bioinformatics 2023; 24:88. [PMID: 36890446 PMCID: PMC9993744 DOI: 10.1186/s12859-023-05194-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2021] [Accepted: 02/17/2023] [Indexed: 03/10/2023] Open
Abstract
BACKGROUND Personalized oncology represents a shift in cancer treatment from conventional methods to target-specific therapies, where decisions are made based on the patient-specific tumor profile. Selection of the optimal therapy relies on a complex interdisciplinary analysis and interpretation of the identified variants by experts in molecular tumor boards. With up to hundreds of somatic variants identified in a tumor, this process requires visual analytics tools to guide and accelerate the annotation process. RESULTS The Personal Cancer Network Explorer (PeCaX) is a visual analytics tool supporting the efficient annotation, navigation, and interpretation of somatic genomic variants through functional annotation, drug target annotation, and visual interpretation within the context of biological networks. Starting with somatic variants in a VCF file, PeCaX enables users to explore these variants through a web-based graphical user interface. The most prominent feature of PeCaX is the combination of clinical variant annotation and gene-drug networks with an interactive visualization. This reduces the time and effort the user needs to invest to arrive at a treatment suggestion and helps to generate new hypotheses. PeCaX is provided as a platform-independent containerized software package for local or institution-wide deployment. PeCaX is available for download at https://github.com/KohlbacherLab/PeCaX-docker.
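The abstract names a VCF file as the starting point. As a rough illustration of what such input looks like once parsed, the sketch below reads the eight fixed VCF columns into dictionaries. It is not PeCaX code; the function names, the example records, and the `GENE` INFO key are hypothetical and chosen only for illustration.

```python
# Minimal sketch of reading VCF data lines; not taken from PeCaX.
VCF_COLUMNS = ("chrom", "pos", "id", "ref", "alt", "qual", "filter", "info")


def parse_info(field):
    """Turn 'GENE=GENE_A;SOMATIC' into {'GENE': 'GENE_A', 'SOMATIC': True}."""
    info = {}
    if field == ".":
        return info  # '.' means the INFO column is empty
    for item in field.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # keys without '=' are flags
    return info


def parse_vcf(text):
    """Return one dict per variant record, skipping meta and header lines."""
    variants = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        record = dict(zip(VCF_COLUMNS, fields[:8]))
        record["pos"] = int(record["pos"])
        record["alt"] = record["alt"].split(",")  # several ALT alleles allowed
        record["info"] = parse_info(record["info"])
        variants.append(record)
    return variants


# Hypothetical two-record file used only to exercise the parser.
example = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12345\t.\tA\tT\t50\tPASS\tGENE=GENE_A;SOMATIC",
    "chr2\t67890\t.\tG\tC,T\t.\tPASS\t.",
])
variants = parse_vcf(example)
for v in variants:
    print(v["chrom"], v["pos"], v["ref"], v["alt"], v["info"])
```

A real annotation pipeline would go on to attach functional and drug-target annotations to each record; that step is outside this sketch.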
Affiliation(s)
- Mirjam Figaschewski
- Department of Computer Science, University of Tübingen, Tübingen, Germany; Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Bilge Sürün
- Department of Computer Science, University of Tübingen, Tübingen, Germany; Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Thorsten Tiede
- Department of Computer Science, University of Tübingen, Tübingen, Germany; Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Oliver Kohlbacher
- Department of Computer Science, University of Tübingen, Tübingen, Germany; Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany; Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
|
40
|
Assessing the Genomics Structure of Dorper and White Dorper Variants, and Dorper Populations in South Africa and Hungary. Biology 2023; 12:biology12030386. [PMID: 36979078 PMCID: PMC10045292 DOI: 10.3390/biology12030386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 02/20/2023] [Accepted: 02/22/2023] [Indexed: 03/05/2023]
Abstract
Dorper sheep was developed for meat production in arid and semi-arid regions under extensive production systems in South Africa. Two variants with distinct head and neck colors were bred during their development process. White Dorper have a white coat while Dorper have a black head and neck. Both variants have grown in popularity around the world. Therefore, understanding the genomic architecture between South African Dorpers and Dorper populations adapted to other climatic regions, as well as genomic differences between Dorper and White Dorper variants is vital for their molecular management. Using the ovine 50K SNP chip, this study compared the genetic architecture of Dorper variants between populations from South Africa and Hungary. The Dorper populations in both countries had high genetic diversity levels, although Dorper in Hungary showed high levels of inbreeding. White Dorpers from both countries were genetically closely related, while Dorpers were distantly related according to principal component analysis and neighbor-joining tree. Additionally, whereas all groups displayed unique selection signatures for local adaptation, Dorpers from Hungary had a similar linkage disequilibrium decay. Environmental differences and color may have influenced the genetic differentiation between the Dorpers. For their molecular management and prospective genomic selection, it is crucial to understand the Dorper sheep’s genomic architecture, and the results of this study can be interpreted as a step in this direction.
|
41
|
Quan X, Cai W, Xi C, Wang C, Yan L. AIMedGraph: a comprehensive multi-relational knowledge graph for precision medicine. Database (Oxford) 2023; 2023:7059703. [PMID: 36856726 PMCID: PMC9976745 DOI: 10.1093/database/baad006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/28/2022] [Revised: 02/01/2023] [Accepted: 02/10/2023] [Indexed: 03/02/2023]
Abstract
The development of high-throughput molecular testing techniques has enabled the large-scale exploration of the underlying molecular causes of diseases and the development of targeted treatment for specific genetic alterations. However, knowledge to interpret the impact of genetic variants on disease or treatment is distributed in different databases, scientific literature studies and clinical guidelines. AIMedGraph was designed to comprehensively collect and interrogate standardized information about genes, genetic alterations and their therapeutic and diagnostic relevance and build a multi-relational, evidence-based knowledge graph. Graph database Neo4j was used to represent precision medicine knowledge as nodes and edges in AIMedGraph. Entities in the current release include 30 340 diseases/phenotypes, 26 140 genes, 187 541 genetic variants, 2821 drugs, 15 125 clinical trials and 797 911 supporting literature studies. Edges in this release cover 621 731 drug interactions, 9279 drug susceptibility impacts, 6330 pharmacogenomics effects, 30 339 variant pathogenicity and 1485 drug adverse reactions. The knowledge graph technique enables hidden knowledge inference and provides insight into potential disease or drug molecular mechanisms. Database URL: http://aimedgraph.tongshugene.net:8201.
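The idea of representing knowledge as typed nodes and labeled edges, and then inferring "hidden" links by following several edges, can be sketched in a few lines. This is a toy in-memory structure, not Neo4j and not AIMedGraph's schema; the node types, relation names, and entity identifiers (`GENE_A`, `VAR_1`, `DRUG_X`) are all hypothetical.

```python
# Toy multi-relational graph; schema and entities are illustrative assumptions.
from collections import defaultdict


class KnowledgeGraph:
    def __init__(self):
        self.node_types = {}            # node id -> type label
        self.edges = defaultdict(list)  # source id -> [(relation, target id)]

    def add_node(self, node_id, node_type):
        self.node_types[node_id] = node_type

    def add_edge(self, source, relation, target):
        for node in (source, target):
            if node not in self.node_types:
                raise KeyError(f"unknown node: {node}")
        self.edges[source].append((relation, target))

    def neighbors(self, source, relation):
        """Targets reachable from source over edges with the given relation."""
        return [t for r, t in self.edges.get(source, []) if r == relation]

    def drugs_for_gene(self, gene):
        """Two-hop inference: gene -> variant -> drug the variant is sensitive to."""
        drugs = []
        for variant in self.neighbors(gene, "HAS_VARIANT"):
            for drug in self.neighbors(variant, "SENSITIVE_TO"):
                if drug not in drugs:
                    drugs.append(drug)
        return drugs


kg = KnowledgeGraph()
kg.add_node("GENE_A", "Gene")
kg.add_node("VAR_1", "Variant")
kg.add_node("VAR_2", "Variant")
kg.add_node("DRUG_X", "Drug")
kg.add_edge("GENE_A", "HAS_VARIANT", "VAR_1")
kg.add_edge("GENE_A", "HAS_VARIANT", "VAR_2")
kg.add_edge("VAR_1", "SENSITIVE_TO", "DRUG_X")
print(kg.drugs_for_gene("GENE_A"))
```

No direct gene-to-drug edge is stored here; the association is recovered only by traversal, which is the kind of inference the abstract attributes to the knowledge graph technique.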
Affiliation(s)
- Xueping Quan
- Correspondence may also be addressed to Xueping Quan. Tel: +8621-58886662;
- Weijing Cai
- Department of Innovative Technology, Shanghai Tongshu Biotechnology Research Institute, No26 and 28, 377 Lane of Shanlian Road, Baoshan District, Shanghai 200444, China
- Chenghang Xi
- Department of Artificial Intelligence, Shanghai Tongshu Biotechnology Research Institute, No26 and 28, 377 Lane of Shanlian Road, Baoshan District, Shanghai 200444, China
- Chunxiao Wang
- Department of Innovative Technology, Shanghai Tongshu Biotechnology Research Institute, No26 and 28, 377 Lane of Shanlian Road, Baoshan District, Shanghai 200444, China
|
42
|
Nithya C, Kiran M, Nagarajaram HA. Dissection of hubs and bottlenecks in a protein-protein interaction network. Comput Biol Chem 2023; 102:107802. [PMID: 36603332 DOI: 10.1016/j.compbiolchem.2022.107802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 11/20/2022] [Accepted: 12/08/2022] [Indexed: 12/23/2022]
Abstract
Analysis of degree centrality in conjunction with betweenness centrality of proteins in a human protein-protein interaction network revealed three categories of centrally important proteins: a) proteins with high degree and betweenness (hub-bottlenecks denoted as MX), b) proteins with high betweenness and low degree (non-hub-bottlenecks/pure bottlenecks denoted as PB) and c) proteins with high degree and low betweenness (hub-non-bottlenecks/pure hubs denoted as PH). When subjected to a detailed statistical analysis of their molecular-level properties, the proteins belonging to each of these categories were found to be associated with distinct canonical molecular properties, i.e., "molecular markers". The MX proteins are a) conformationally versatile, mainly comprising essential proteins, b) the targets for interactions by the proteins of viral and bacterial pathogens, c) evolutionarily constrained, involved in multiple pathways, enriched with disease genes and d) involved in functions such as protein stabilization, phosphorylation, and mRNA slicing processes. PB proteins are a) enriched with extracellular and cancer-related proteins, b) enriched with approved drug targets and c) involved in cell-cell signaling processes. Finally, PH proteins are a) structurally versatile and b) enriched with essential proteins primarily involved in housekeeping processes (transcription and replication). The fact that the proteins belonging to these three categories form three distinct sets in terms of their molecular properties reveals the existence of a trichotomy among hubs and bottlenecks, and this knowledge is of paramount importance when prioritizing protein targets for further studies, such as drug design and disease association studies, based on their network centrality values.
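The three categories follow from crossing two centrality measures. A minimal sketch on a toy graph is given below, using plain Python with Brandes' algorithm for betweenness. The toy network, the node names, and the cutoffs (`deg_cut=3`, `btw_cut=10`) are assumptions made for illustration; the abstract does not state the thresholds the authors used.

```python
# Sketch: label nodes as MX / PB / PH from degree and betweenness.
# Graph and cutoffs are illustrative assumptions, not the paper's values.
from collections import deque
from itertools import combinations


def build_graph(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def betweenness(graph):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        order, preds = [], {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):       # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


def classify(graph, deg_cut, btw_cut):
    btw = betweenness(graph)
    labels = {}
    for v, nbrs in graph.items():
        hub, bottleneck = len(nbrs) >= deg_cut, btw[v] >= btw_cut
        if hub and bottleneck:
            labels[v] = "MX"
        elif bottleneck:
            labels[v] = "PB"
        elif hub:
            labels[v] = "PH"
        else:
            labels[v] = "other"
    return labels


# Two 4-node cliques joined through a single bridge node "x".
side_a = ["a1", "a2", "a3", "a4"]
side_b = ["b1", "b2", "b3", "b4"]
edges = list(combinations(side_a, 2)) + list(combinations(side_b, 2))
edges += [("a1", "x"), ("x", "b1")]
graph = build_graph(edges)
labels = classify(graph, deg_cut=3, btw_cut=10)
print(labels)
```

In this toy network the bridge node has only two neighbors but lies on every path between the cliques (a pure bottleneck), the clique members attached to it are both hubs and bottlenecks, and the remaining clique members are hubs that no shortest path passes through.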
Affiliation(s)
- Chandramohan Nithya
- Department of Biotechnology and Bioinformatics, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana 500046, India
- Manjari Kiran
- Department of Systems and Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana 500046, India
|
43
|
Kumar A, Schrader AW, Boroojeny AE, Asadian M, Lee J, Song YJ, Zhao SD, Han HS, Sinha S. Intracellular Spatial Transcriptomic Analysis Toolkit (InSTAnT). Research Square 2023:rs.3.rs-2481749. [PMID: 36747718 PMCID: PMC9901031 DOI: 10.21203/rs.3.rs-2481749/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Imaging-based spatial transcriptomics technologies such as MERFISH offer snapshots of cellular processes in unprecedented detail, but new analytic tools are needed to realize their full potential. We present InSTAnT, a computational toolkit for extracting molecular relationships from spatial transcriptomics data at the intra-cellular resolution. InSTAnT detects gene pairs and modules with interesting patterns of mutual co-localization within and across cells, using specialized statistical tests and graph mining. We showcase the toolkit on datasets profiling a human cancer cell line and hypothalamic preoptic region of mouse brain. We performed rigorous statistical assessment of discovered co-localization patterns, found supporting evidence from databases and RNA interactions, and identified subcellular domains associated with RNA-colocalization. We identified several novel cell type-specific gene co-localizations in the brain. Intra-cellular spatial patterns discovered by InSTAnT mirror diverse molecular relationships, including RNA interactions and shared sub-cellular localization or function, providing a rich compendium of testable hypotheses regarding molecular functions.
Affiliation(s)
- Anurendra Kumar
- College of Computing, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Alex W. Schrader
- Department of Chemistry, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Marisa Asadian
- Department of Chemistry, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Juyeon Lee
- Department of Chemistry, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- You Jin Song
- Department of Cell and Developmental Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Sihai Dave Zhao
- Department of Statistics, University of Illinois Urbana-Champaign, Urbana, IL, 61820, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Hee-Sun Han
- Department of Chemistry, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Saurabh Sinha
- The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology, Atlanta, GA, 30318, USA
|
44
|
Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. Cell Reports Methods 2023; 3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 05/22/2023]
Abstract
Gene regulation is a central topic in cell biology. Advances in omics technologies and the accumulation of omics data have provided better opportunities for gene regulation studies than ever before. For this reason deep learning, as a data-driven predictive modeling approach, has been successfully applied to this field during the past decade. In this article, we aim to give a brief yet comprehensive overview of representative deep-learning methods for gene regulation. Specifically, we discuss and compare the design principles and datasets used by each method, creating a reference for researchers who wish to replicate or improve existing methods. We also discuss the common problems of existing approaches and prospectively introduce the emerging deep-learning paradigms that will potentially alleviate them. We hope that this article will provide a rich and up-to-date resource and shed light on future research directions in this area.
Affiliation(s)
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Elva Gao
- The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xiaopeng Xu
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
|
45
|
Zan CF, Wei WF, Li JA, Shi MP, Cong L, Gu MY, Chen YH, Wang SY, Li ZH. Circulating exosomal lncRNA contributes to the pathogenesis of spinal cord injury in rats. Neural Regen Res 2023; 18:889-894. [DOI: 10.4103/1673-5374.353504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022] Open
|
46
|
Kuhn L, Vincent T, Hammann P, Zuber H. Exploring Protein Interactome Data with IPinquiry: Statistical Analysis and Data Visualization by Spectral Counts. Methods Mol Biol 2023; 2426:243-265. [PMID: 36308692 DOI: 10.1007/978-1-0716-1967-4_11] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/16/2023]
Abstract
Immunoprecipitation mass spectrometry (IP-MS) is a popular method for the identification of protein-protein interactions. This approach is particularly powerful when information is collected without a priori knowledge and has been successfully used as a first key step for the elucidation of many complex protein networks. IP-MS consists of the affinity purification of a protein of interest and its interacting proteins, followed by protein identification and quantification by mass spectrometry analysis. We developed an R package, named IPinquiry, dedicated to IP-MS analysis and based on the spectral count quantification method. The main purpose of this package is to provide a simple R pipeline with a limited number of processing steps to facilitate data exploration for biologists. This package allows users to perform differential analysis of protein accumulation between two groups of IP experiments, to retrieve protein annotations, to export results, and to create different types of graphics. Here we describe the step-by-step procedure for an interactome analysis using IPinquiry, from data loading to result export and plot production.
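To make the notion of comparing spectral counts between two groups of IP experiments concrete, the sketch below computes a library-size-normalized log2 fold change per protein. It is a deliberately simple stand-in and does not reproduce IPinquiry's statistics or its R interface; the function names, the pseudocount, and the example counts are assumptions for illustration, and both tables are assumed to list the same proteins.

```python
# Illustrative log2 fold change on spectral counts; not IPinquiry's method.
import math


def mean_fractions(table, pseudocount=1.0):
    """table: protein -> spectral counts per replicate.

    Each count is converted to a fraction of its replicate's total (with a
    pseudocount so zero counts stay finite), then averaged over replicates.
    """
    n_rep = len(next(iter(table.values())))
    n_prot = len(table)
    totals = [sum(counts[i] for counts in table.values()) for i in range(n_rep)]
    return {
        protein: sum(
            (counts[i] + pseudocount) / (totals[i] + pseudocount * n_prot)
            for i in range(n_rep)
        ) / n_rep
        for protein, counts in table.items()
    }


def log2_fold_changes(ip, control, pseudocount=1.0):
    ip_frac = mean_fractions(ip, pseudocount)
    ctrl_frac = mean_fractions(control, pseudocount)
    return {p: math.log2(ip_frac[p] / ctrl_frac[p]) for p in ip}


# Hypothetical counts: two replicates per group.
ip = {"candidate_partner": [30, 30], "background": [70, 70]}
control = {"candidate_partner": [0, 0], "background": [100, 100]}
lfc = log2_fold_changes(ip, control)
for protein, value in lfc.items():
    print(f"{protein}: {value:.2f}")
```

A fold change alone says nothing about significance; a dedicated tool would add a count-based statistical test and multiple-testing correction on top of this.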
Affiliation(s)
- Lauriane Kuhn
- Plateforme protéomique Strasbourg Esplanade du CNRS, Université de Strasbourg, Strasbourg, France
- Timothée Vincent
- Institut de biologie moléculaire des plantes, CNRS, Université de Strasbourg, Strasbourg, France
- Philippe Hammann
- Plateforme protéomique Strasbourg Esplanade du CNRS, Université de Strasbourg, Strasbourg, France
- Hélène Zuber
- Institut de biologie moléculaire des plantes, CNRS, Université de Strasbourg, Strasbourg, France
|
47
|
Silva JM, Qi W, Pinho AJ, Pratas D. AlcoR: alignment-free simulation, mapping, and visualization of low-complexity regions in biological data. Gigascience 2023; 12:giad101. [PMID: 38091509 PMCID: PMC10716826 DOI: 10.1093/gigascience/giad101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 09/29/2023] [Accepted: 11/07/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Low-complexity data analysis is the area that addresses the search and quantification of regions in sequences of elements that contain low-complexity or repetitive elements. For example, these can be tandem repeats, inverted repeats, homopolymer tails, GC-biased regions, similar genes, and hairpins, among many others. Identifying these regions is crucial because of their association with regulatory and structural characteristics. Moreover, their identification provides positional and quantity information where standard assembly methodologies face significant difficulties because of substantial higher depth coverage (mountains), ambiguous read mapping, or where sequencing or reconstruction defects may occur. However, the capability to distinguish low-complexity regions (LCRs) in genomic and proteomic sequences is a challenge that depends on the model's ability to find them automatically. Low-complexity patterns can be implicit through specific or combined sources, such as algorithmic or probabilistic, and recurring to different spatial distances-namely, local, medium, or distant associations. FINDINGS This article addresses the challenge of automatically modeling and distinguishing LCRs, providing a new method and tool (AlcoR) for efficient and accurate segmentation and visualization of these regions in genomic and proteomic sequences. The method enables the use of models with different memories, providing the ability to distinguish local from distant low-complexity patterns. The method is reference and alignment free, providing additional methodologies for testing, including a highly flexible simulation method for generating biological sequences (DNA or protein) with different complexity levels, sequence masking, and a visualization tool for automatic computation of the LCR maps into an ideogram style. We provide illustrative demonstrations using synthetic, nearly synthetic, and natural sequences showing the high efficiency and accuracy of AlcoR. 
As large-scale results, we use AlcoR to provide, for the first time, a whole-chromosome low-complexity map of a recent complete human genome and of the haplotype-resolved chromosome pairs of a heterozygous diploid African cassava cultivar. CONCLUSIONS The AlcoR method enables fast sequence characterization through data complexity analysis, and is ideally suited to scenarios involving new or unknown sequences. AlcoR is implemented in C using multithreading to increase computational speed, is flexible for multiple applications, and has no external dependencies. The tool accepts any sequence in FASTA format. The source code is freely provided at https://github.com/cobilab/alcor.
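As a rough intuition for what "segmenting low-complexity regions" means, the sketch below flags stretches of a sequence whose sliding-window Shannon entropy falls under a cutoff. This is a much simpler technique than the one the abstract describes and is not AlcoR's algorithm; the window size, the threshold, and the example sequence are arbitrary assumptions.

```python
# Sliding-window Shannon entropy as a stand-in for low-complexity detection.
# Not AlcoR's method; window and threshold are arbitrary illustrative values.
import math
from collections import Counter


def shannon_entropy(window):
    """Entropy in bits per symbol of the symbol frequencies in the window."""
    n = len(window)
    return -sum(c / n * math.log2(c / n) for c in Counter(window).values())


def low_complexity_regions(seq, window=8, threshold=1.0):
    """Merged half-open (start, end) intervals covered by low-entropy windows."""
    regions = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= threshold:
            start, end = i, i + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


# A 12-base homopolymer run embedded between two mixed-composition flanks.
seq = "ACGTACGT" + "A" * 12 + "CGTACGTA"
regions = low_complexity_regions(seq)
print(regions)
```

Frequency-based entropy sees only composition inside one window, so it misses repeats such as "ACGTACGT" that look maximally diverse by symbol counts. Capturing those, and patterns at longer distances, is what the models with different memories mentioned in the abstract are meant to address.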
Affiliation(s)
- Jorge M Silva
- IEETA, Institute of Electronics and Informatics Engineering of Aveiro, and LASI, Intelligent Systems Associate Laboratory, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
- Department of Electronics Telecommunications and Informatics, University of Aveiro, Campus Universitario de Santiago, 3810-193, Aveiro, Portugal
- Weihong Qi
- Functional Genomics Center Zurich, ETH Zurich and University of Zurich, Winterthurerstrasse, 190, 8057, Zurich, Switzerland
- SIB, Swiss Institute of Bioinformatics, 1202, Geneva, Switzerland
- Armando J Pinho
- IEETA, Institute of Electronics and Informatics Engineering of Aveiro, and LASI, Intelligent Systems Associate Laboratory, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
- Department of Electronics Telecommunications and Informatics, University of Aveiro, Campus Universitario de Santiago, 3810-193, Aveiro, Portugal
- Diogo Pratas
- IEETA, Institute of Electronics and Informatics Engineering of Aveiro, and LASI, Intelligent Systems Associate Laboratory, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
- Department of Electronics Telecommunications and Informatics, University of Aveiro, Campus Universitario de Santiago, 3810-193, Aveiro, Portugal
- Department of Virology, University of Helsinki, Haartmaninkatu, 3, 00014 Helsinki, Finland
|
48
|
Robinson JT, Thorvaldsdottir H, Turner D, Mesirov JP. igv.js: an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV). Bioinformatics 2022; 39:6958554. [PMID: 36562559 PMCID: PMC9825295 DOI: 10.1093/bioinformatics/btac830] [Citation(s) in RCA: 79] [Impact Index Per Article: 39.5] [Received: 07/19/2022] [Revised: 11/29/2022] [Accepted: 12/22/2022] [Indexed: 12/24/2022] Open
Abstract
SUMMARY igv.js is an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV). It can be easily dropped into any web page with a single line of code and has no external dependencies. The viewer runs completely in the web browser, with no backend server and no data pre-processing required. AVAILABILITY AND IMPLEMENTATION The igv.js JavaScript component can be installed from NPM at https://www.npmjs.com/package/igv. The source code is available at https://github.com/igvteam/igv.js under the MIT open-source license. IGV-Web, the end-user application built around igv.js, is available at https://igv.org/app. The source code is available at https://github.com/igvteam/igv-webapp under the MIT open-source license. SUPPLEMENTARY INFORMATION Supplementary information is available at Bioinformatics online.
Collapse
Affiliation(s)
- Douglass Turner
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Jill P Mesirov
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Moores Cancer Center, University of California San Diego, La Jolla, CA 92037, USA
|
49
|
Esposito M, Gualandi N, Spirito G, Ansaloni F, Gustincich S, Sanges R. Transposons Acting as Competitive Endogenous RNAs: In-Silico Evidence from Datasets Characterised by L1 Overexpression. Biomedicines 2022; 10:biomedicines10123279. [PMID: 36552034 PMCID: PMC9776036 DOI: 10.3390/biomedicines10123279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Key Words] [Received: 10/30/2022] [Revised: 12/07/2022] [Accepted: 12/11/2022] [Indexed: 12/23/2022]
Abstract
LINE-1 (L1) elements are transposable elements that replicate within the genome through RNA intermediates. The vast majority of L1 copies in the human genome are inactive, and only between 100 and 150 copies can still mobilize. During evolution, they could have been positively selected for beneficial cellular functions. Nonetheless, L1 deregulation can be detrimental to the cell, causing diseases such as cancer. The activity of miRNAs, a class of small non-coding RNAs that cause degradation or translational inhibition of their target transcripts, is a fundamental mechanism for controlling transcript levels in somatic cells. Beyond this, competitive endogenous RNAs (ceRNAs), mostly circular and non-coding RNAs, have been shown to compete for binding of the same set of miRNAs that target protein-coding genes. In this study, we investigated whether autonomously transcribed L1s may act as ceRNAs by analyzing public datasets in silico. We observed that genes sharing miRNA target sites with L1 tend to be upregulated when L1s are overexpressed, suggesting that L1s might act as ceRNAs. This finding will help in interpreting transcriptomic responses in contexts characterized by the specific activation of transposons.
Affiliation(s)
- Mauro Esposito
- Computational Genomics Laboratory, Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- Nicolò Gualandi
- Computational Genomics Laboratory, Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- Giovanni Spirito
- Computational Genomics Laboratory, Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- CMP3vda, via Lavoratori Vittime del Col Du Mont 28, 11100 Aosta, Italy
- Federico Ansaloni
- Computational Genomics Laboratory, Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- Central RNA Laboratory, Istituto Italiano di Tecnologia, 16132 Genova, Italy
- Stefano Gustincich
- CMP3vda, via Lavoratori Vittime del Col Du Mont 28, 11100 Aosta, Italy
- Central RNA Laboratory, Istituto Italiano di Tecnologia, 16132 Genova, Italy
- Remo Sanges
- Computational Genomics Laboratory, Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- Central RNA Laboratory, Istituto Italiano di Tecnologia, 16132 Genova, Italy
- Correspondence:
|
50
|
Liu Q, Peng X, Shen M, Qian Q, Xing J, Li C, Gregory R. Ribo-uORF: a comprehensive data resource of upstream open reading frames (uORFs) based on ribosome profiling. Nucleic Acids Res 2022; 51:D248-D261. [PMID: 36440758 PMCID: PMC9825487 DOI: 10.1093/nar/gkac1094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 10/27/2022] [Accepted: 11/22/2022] [Indexed: 11/29/2022]
Abstract
Upstream open reading frames (uORFs) are typically defined as translation sites located within the 5' untranslated region upstream of the main protein coding sequence (CDS) of messenger RNAs (mRNAs). Although uORFs are prevalent in eukaryotic mRNAs and modulate the translation of downstream CDSs, a comprehensive resource for uORFs is currently lacking. We developed Ribo-uORF (http://rnainformatics.org.cn/RiboUORF) to serve as a comprehensive functional resource for uORF analysis based on ribosome profiling (Ribo-seq) data. Ribo-uORF currently supports six species: human, mouse, rat, zebrafish, fruit fly, and worm. Ribo-uORF includes 501 554 actively translated uORFs and 107 914 upstream translation initiation sites (uTIS), which were identified from 1495 Ribo-seq and 77 quantitative translation initiation sequencing (QTI-seq) datasets, respectively. We also developed mRNAbrowse to visualize items such as uORFs, cis-regulatory elements, genetic variations, eQTLs, GWAS-based associations, RNA modifications, and RNA editing. Ribo-uORF provides a very intuitive web interface for conveniently browsing, searching, and visualizing uORF data. Finally, uORFscan and UTR5var were developed in Ribo-uORF to precisely identify uORFs and analyze the influence of genetic mutations on uORFs using user-uploaded datasets. Ribo-uORF should greatly facilitate studies of uORFs and their roles in mRNA translation and posttranscriptional control of gene expression.
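The uORF definition in this abstract can be illustrated with a minimal, sequence-only sketch. It is not the method used by Ribo-uORF or uORFscan, which rely on ribosome profiling evidence. The function name `find_uorfs` is hypothetical, and the sketch assumes an ATG start codon, the three standard stop codons, and a uORF fully contained in the 5' UTR.

```python
# Standard stop codons (DNA alphabet); an assumption of this sketch.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5, start_codon="ATG"):
    """Return (start, end, sequence) for each ORF fully inside a 5' UTR.

    Coordinates are 0-based and end-exclusive. Hypothetical helper for
    illustration only; it ignores translation evidence entirely.
    """
    # Accept RNA or DNA input in either case.
    seq = utr5.upper().replace("U", "T")
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != start_codon:
            continue
        # Walk downstream codon by codon, staying in the start codon's frame.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3, seq[start:pos + 3]))
                break
    return uorfs


if __name__ == "__main__":
    print(find_uorfs("GGATGAAATAGCC"))  # [(2, 11, 'ATGAAATAG')]
```

Start codons with no in-frame stop inside the UTR are skipped here; uORFs that extend into the main CDS would need the full transcript sequence.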
Affiliation(s)
- Qi Liu
- To whom correspondence should be addressed. Tel: +86 020 87596559;
- Mengyuan Shen
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
- Guangdong Rice Engineering Laboratory, Guangzhou 510640, China
- Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
- Qian Qian
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
- Guangdong Rice Engineering Laboratory, Guangzhou 510640, China
- Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
- Junlian Xing
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
- Guangdong Rice Engineering Laboratory, Guangzhou 510640, China
- Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
- Chen Li
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
- Guangdong Rice Engineering Laboratory, Guangzhou 510640, China
- Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
- Richard I Gregory
- Correspondence may also be addressed to Richard I. Gregory. Tel: +1 617 919 2273;
|