1
Korhonen PK, Wang T, Young ND, Byrne JJ, Campos TL, Chang BC, Taki AC, Gasser RB. Analysis of Haemonchus embryos at single cell resolution identifies two eukaryotic elongation factors as intervention target candidates. Comput Struct Biotechnol J 2024; 23:1026-1035. [PMID: 38435301 PMCID: PMC10907403 DOI: 10.1016/j.csbj.2024.01.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 01/14/2024] [Accepted: 01/15/2024] [Indexed: 03/05/2024]
Abstract
Advances in single cell technologies are allowing investigations of a wide range of biological processes and pathways in animals, such as the multicellular model organism Caenorhabditis elegans - a free-living nematode. However, there has been limited application of such technology to related parasitic nematodes which cause major diseases of humans and animals worldwide. With no vaccines against the vast majority of parasitic nematodes and treatment failures due to drug resistance or inefficacy, new intervention targets are urgently needed, preferably informed by a deep understanding of these nematodes' cellular and molecular biology - which is presently lacking for most worms. Here, we created the first single cell atlas for an early developmental stage of Haemonchus contortus - a highly pathogenic, C. elegans-related parasitic nematode. We obtained and curated RNA sequence (snRNA-seq) data from single nuclei from embryonating eggs of H. contortus (150,000 droplets), and selected high-quality transcriptomic data for > 14,000 single nuclei for analysis, and identified 19 distinct clusters of cells. Guided by comparative analyses with C. elegans, we were able to reproducibly assign seven cell clusters to body wall muscle, hypodermis, neuronal, intestinal or seam cells, and identified eight genes that were transcribed in all cell clusters/types, three of which were inferred to be essential in H. contortus. Two of these genes (i.e. Hc-eef-1A and Hc-eef1G), coding for eukaryotic elongation factors (called Hc-eEF1A and Hc-eEF1G), were also demonstrated to be transcribed and expressed in all key developmental stages of H. contortus. Together with these findings, sequence- and structure-based comparative analyses indicated the potential of Hc-eEF1A and/or Hc-eEF1G as intervention targets within the protein biosynthesis machinery of H. contortus. Future work will focus on single cell studies of all key developmental stages and tissues of H. contortus, and on evaluating the suitability of the two elongation factor proteins as drug targets in H. contortus and related nematodes, with a view to finding new nematocidal drug candidates.
Affiliation(s)
- Pasi K. Korhonen
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
- Tao Wang
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
- Neil D. Young
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
- Joseph J. Byrne
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
- Tulio L. Campos
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
- Bill C.H. Chang
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
- Aya C. Taki
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
- Robin B. Gasser
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria 3010, Australia
2
Shanley HT, Taki AC, Nguyen N, Wang T, Byrne JJ, Ang CS, Leeming MG, Williamson N, Chang BCH, Jabbar A, Sleebs BE, Gasser RB. Comparative structure activity and target exploration of 1,2-diphenylethynes in Haemonchus contortus and Caenorhabditis elegans. Int J Parasitol Drugs Drug Resist 2024; 25:100534. [PMID: 38554597 PMCID: PMC10992699 DOI: 10.1016/j.ijpddr.2024.100534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 03/14/2024] [Accepted: 03/17/2024] [Indexed: 04/01/2024]
Abstract
Infections and diseases caused by parasitic nematodes have a major adverse impact on the health and productivity of animals and humans worldwide. The control of these parasites often relies heavily on the treatment with commercially available chemical compounds (anthelmintics). However, the excessive or uncontrolled use of these compounds in livestock animals has led to major challenges linked to drug resistance in nematodes. Therefore, there is a need to develop new anthelmintics with novel mechanism(s) of action. Recently, we identified a small molecule, designated UMW-9729, with nematocidal activity against the free-living model organism Caenorhabditis elegans. Here, we evaluated UMW-9729's potential as an anthelmintic in a structure-activity relationship (SAR) study in C. elegans and the highly pathogenic, blood-feeding Haemonchus contortus (barber's pole worm), and explored the compound-target relationship using thermal proteome profiling (TPP). First, we synthesised and tested 25 analogues of UMW-9729 for their nematocidal activity in both H. contortus (larvae and adults) and C. elegans (young adults), establishing a preliminary nematocidal pharmacophore for both species. We identified several compounds with marked activity against either H. contortus or C. elegans which had greater efficacy than UMW-9729, and found a significant divergence in compound bioactivity between these two nematode species. We also identified a UMW-9729 analogue, designated 25, that moderately inhibited the motility of adult female H. contortus in vitro. Subsequently, we inferred three H. contortus proteins (HCON_00134350, HCON_00021470 and HCON_00099760) and five C. elegans proteins (F30A10.9, F15B9.8, B0361.6, DNC-4 and UNC-11) that interacted directly with UMW-9729; however, no conserved protein target was shared between the two nematode species. 
Future work aims to extend the SAR investigation in these and other parasitic nematode species, and validate individual proteins identified here as possible targets of UMW-9729. Overall, the present study evaluates this anthelmintic candidate and highlights some challenges associated with early anthelmintic investigation.
Affiliation(s)
- Harrison T Shanley
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia; Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, 3052, Australia
- Aya C Taki
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Nghi Nguyen
  - Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, 3052, Australia
- Tao Wang
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Joseph J Byrne
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Ching-Seng Ang
  - Melbourne Mass Spectrometry and Proteomics Facility, The Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Michael G Leeming
  - Melbourne Mass Spectrometry and Proteomics Facility, The Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Nicholas Williamson
  - Melbourne Mass Spectrometry and Proteomics Facility, The Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Bill C H Chang
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Abdul Jabbar
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Brad E Sleebs
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia; Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, 3052, Australia
- Robin B Gasser
  - Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, Victoria, 3010, Australia
3
Hu W, Li M, Xiao H, Guan L. Essential genes identification model based on sequence feature map and graph convolutional neural network. BMC Genomics 2024; 25:47. [PMID: 38200437 PMCID: PMC10777564 DOI: 10.1186/s12864-024-09958-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 01/01/2024] [Indexed: 01/12/2024]
Abstract
BACKGROUND: Essential genes encode functions that play a vital role in the life activities of organisms, encompassing growth, development, immune system functioning, and cell structure maintenance. Conventional experimental techniques for identifying essential genes are resource-intensive and time-consuming, and the accuracy of current machine learning models needs further enhancement. Therefore, it is crucial to develop a robust computational model to accurately predict essential genes.
RESULTS: In this study, we introduce GCNN-SFM, a computational model for identifying essential genes in organisms, based on graph convolutional neural networks (GCNN). GCNN-SFM integrates a graph convolutional layer, a convolutional layer, and a fully connected layer to model and extract features from gene sequences of essential genes. Initially, the gene sequence is transformed into a feature map using coding techniques. Subsequently, a multi-layer GCN is employed to perform graph convolution operations, effectively capturing both local and global features of the gene sequence. Further feature extraction is performed, followed by integrating convolution and fully-connected layers to generate prediction results for essential genes. The gradient descent algorithm is used to iteratively update the model parameters so as to minimize the cross-entropy loss function, thereby enhancing the accuracy of the prediction results. Meanwhile, model parameters are tuned to determine the optimal parameter combination that yields the best prediction performance during training.
CONCLUSIONS: Experimental evaluation demonstrates that GCNN-SFM surpasses various advanced essential gene prediction models and achieves an average accuracy of 94.53%. This study presents a novel and effective approach for identifying essential genes, which has significant implications for biology and genomics research.
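The abstract above describes the pipeline only in general terms. As a rough illustration of the graph-convolution idea, the following sketch assumes a one-hot nucleotide encoding and a chain graph that links neighbouring sequence positions; both are assumptions made here for illustration, not details taken from the paper, and all function names are hypothetical. It applies a single generic step of the form ReLU(Â·H·W), where Â is the symmetrically normalised adjacency matrix.

```python
import math

NUCLEOTIDES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 one-hot values (assumed encoding)."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]

def chain_adjacency(n):
    """Adjacency with self-loops for a chain graph: position i links to i-1 and i+1."""
    return [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]

def normalize(adj):
    """Symmetric normalisation D^(-1/2) A D^(-1/2), with D the degree matrix."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weights):
    """One graph-convolution step: ReLU(adj_norm @ features @ weights)."""
    out = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

# Toy input: a 4-base sequence and identity weights (illustrative values only).
features = one_hot("ACGT")
adj_norm = normalize(chain_adjacency(len(features)))
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
hidden = gcn_layer(adj_norm, features, identity)
```

Each output row mixes a position's own encoding with those of its immediate neighbours, which is how stacked graph-convolution layers can propagate local sequence context over longer ranges.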
Affiliation(s)
- Wenxing Hu
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Mengshan Li
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Haiyang Xiao
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
- Lixin Guan
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China
4
Ma J, Song J, Young ND, Chang BCH, Korhonen PK, Campos TL, Liu H, Gasser RB. 'Bingo'-a large language model- and graph neural network-based workflow for the prediction of essential genes from protein data. Brief Bioinform 2023; 25:bbad472. [PMID: 38152979 PMCID: PMC10753293 DOI: 10.1093/bib/bbad472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 10/22/2023] [Accepted: 11/28/2023] [Indexed: 12/29/2023]
Abstract
The identification and characterization of essential genes are central to our understanding of the core biological functions in eukaryotic organisms, and have important implications for the treatment of diseases caused by, for example, cancers and pathogens. Given the major constraints in testing the functions of genes of many organisms in the laboratory, due to the absence of in vitro cultures and/or gene perturbation assays for most metazoan species, there has been a need to develop in silico tools for the accurate prediction or inference of essential genes to underpin systems biological investigations. Major advances in machine learning approaches provide unprecedented opportunities to overcome these limitations and accelerate the discovery of essential genes on a genome-wide scale. Here, we developed and evaluated a large language model- and graph neural network (LLM-GNN)-based approach, called 'Bingo', to predict essential protein-coding genes in the metazoan model organisms Caenorhabditis elegans and Drosophila melanogaster as well as in Mus musculus and Homo sapiens (a HepG2 cell line) by integrating LLM and GNNs with adversarial training. Bingo predicts essential genes under two 'zero-shot' scenarios with transfer learning, showing promise to compensate for a lack of high-quality genomic and proteomic data for non-model organisms. In addition, the attention mechanisms and GNNExplainer were employed to reveal the functional sites and structural domains that contribute most to essentiality. In conclusion, Bingo provides the prospect of being able to accurately infer the essential genes of little- or under-studied organisms of interest, and provides a biological explanation for gene essentiality.
Affiliation(s)
- Jiani Ma
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Jiangning Song
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Neil D Young
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Bill C H Chang
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Pasi K Korhonen
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Tulio L Campos
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
  - Bioinformatics Core Facility, Instituto Aggeu Magalhães, Fundação Oswaldo Cruz (IAM-Fiocruz), Recife, Pernambuco, Brazil
- Hui Liu
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Robin B Gasser
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
5
Sen S, Woodhouse MR, Portwood JL, Andorf CM. Maize Feature Store: A centralized resource to manage and analyze curated maize multi-omics features for machine learning applications. Database (Oxford) 2023; 2023:baad078. [PMID: 37935586 PMCID: PMC10634621 DOI: 10.1093/database/baad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 09/16/2023] [Accepted: 10/19/2023] [Indexed: 11/09/2023]
Abstract
The big-data analysis of complex data associated with maize genomes accelerates genetic research and improves agronomic traits. As a result, efforts have increased to integrate diverse datasets and extract meaning from these measurements. Machine learning models are a powerful tool for gaining knowledge from large and complex datasets. However, these models must be trained on high-quality features to succeed. Currently, there are no solutions to host maize multi-omics datasets with end-to-end solutions for evaluating and linking features to target gene annotations. Our work presents the Maize Feature Store (MFS), a versatile application that combines features built on complex data to facilitate exploration, modeling and analysis. Feature stores allow researchers to rapidly deploy machine learning applications by managing and providing access to frequently used features. We populated the MFS for the maize reference genome with over 14 000 gene-based features based on published genomic, transcriptomic, epigenomic, variomic and proteomics datasets. Using the MFS, we created an accurate pan-genome classification model with an AUC-ROC score of 0.87. The MFS is publicly available through the maize genetics and genomics database.
Database URL: https://mfs.maizegdb.org/
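For context on the AUC-ROC figure quoted above: the metric equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are made-up toy values, not MFS data, and the function name is hypothetical.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 3 positives, 3 negatives, one positive ranked below a negative.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_roc(labels, scores)  # 8 of 9 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a reported 0.87 means that roughly 87% of positive/negative pairs are ordered correctly by the classifier.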
Affiliation(s)
- Shatabdi Sen
  - Department of Plant Pathology & Microbiology, Iowa State University, 1344 Advanced Teaching & Research Bldg, 2213 Pammel Dr, Ames, IA 50011, USA
- Margaret R Woodhouse
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
  - Department of Computer Science, Iowa State University, Atanasoff Hall, 2434 Osborn Dr, Ames, IA 50011, USA
6
Al-Anzi BF, Khajah M, Fakhraldeen SA. Predicting and explaining the impact of genetic disruptions and interactions on organismal viability. Bioinformatics 2022; 38:4088-4099. [PMID: 35861390 PMCID: PMC9438956 DOI: 10.1093/bioinformatics/btac519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 06/30/2022] [Accepted: 07/20/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Existing computational models can predict single- and double-mutant fitness but they do have limitations. First, they are often tested via evaluation metrics that are inappropriate for imbalanced datasets. Second, all of them only predict a binary outcome (viable or not, and negatively interacting or not). Third, most are uninterpretable black box machine learning models.
RESULTS: Budding yeast datasets were used to develop high-performance Multinomial Regression (MN) models capable of predicting the impact of single, double and triple genetic disruptions on viability. These models are interpretable and give realistic non-binary predictions and can predict negative genetic interactions (GIs) in triple-gene knockouts. They are based on a limited set of gene features and their predictions are influenced by the probability of the target gene participating in molecular complexes or pathways. Furthermore, the MN models have utility in other organisms such as fission yeast, fruit flies and humans, with the single gene fitness MN model being able to distinguish essential genes necessary for cell-autonomous viability from those required for multicellular survival. Finally, our models exceed the performance of previous models, without sacrificing interpretability.
AVAILABILITY AND IMPLEMENTATION: All code and processed datasets used to generate results and figures in this manuscript are available at our GitHub repository at https://github.com/KISRDevelopment/cell_viability_paper. The repository also contains a link to the GI prediction website that lets users search for GIs using the MN models.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
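To make the contrast with binary classifiers concrete, a multinomial (softmax) regression assigns a probability to each of several outcome classes, and its per-class weights can be read directly as feature effects. The sketch below is a generic illustration only: the class names, the two input features, and every weight value are hypothetical and are not taken from the paper or its repository.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(features, weights, biases):
    """Multinomial regression: one linear score per class, then softmax."""
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)

# Hypothetical three-class viability outcome and hand-picked parameters.
CLASSES = ["lethal", "reduced fitness", "normal"]
WEIGHTS = [
    [2.0, 1.5],    # lethal: both features push towards lethality
    [0.5, 0.2],    # reduced fitness
    [-1.0, -0.5],  # normal
]
BIASES = [-1.0, 0.0, 1.0]

# Two assumed gene features, e.g. scaled network degree and complex membership.
probs = predict_proba([1.0, 1.0], WEIGHTS, BIASES)
predicted = CLASSES[probs.index(max(probs))]
```

Because each class has its own weight vector, inspecting the weights shows which features push a gene towards a given outcome, which is the kind of interpretability the abstract contrasts with black-box models.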
Affiliation(s)
- Saja A Fakhraldeen
  - Ecosystem-based Management of Marine Resources Program, Kuwait Institute for Scientific Research, Safat, 13109, Kuwait
7
The impact of species-wide gene expression variation on Caenorhabditis elegans complex traits. Nat Commun 2022; 13:3462. [PMID: 35710766 PMCID: PMC9203580 DOI: 10.1038/s41467-022-31208-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 02/09/2022] [Accepted: 06/08/2022] [Indexed: 12/15/2022]
Abstract
Phenotypic variation in organism-level traits has been studied in Caenorhabditis elegans wild strains, but the impacts of differences in gene expression and the underlying regulatory mechanisms are largely unknown. Here, we use natural variation in gene expression to connect genetic variants to differences in organismal-level traits, including drug and toxicant responses. We perform transcriptomic analyses on 207 genetically distinct C. elegans wild strains to study natural regulatory variation of gene expression. Using this massive dataset, we perform genome-wide association mappings to investigate the genetic basis underlying gene expression variation and reveal complex genetic architectures. We find a large collection of hotspots enriched for expression quantitative trait loci across the genome. We further use mediation analysis to understand how gene expression variation could underlie organism-level phenotypic variation for a variety of complex traits. These results reveal the natural diversity in gene expression and possible regulatory mechanisms in this keystone model organism, highlighting the promise of using gene expression variation to understand how phenotypic diversity is generated.
8
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Rupa Bhowmick
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
- Chandrakala Meena
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Ram Rup Sarkar
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
9
Herath HMPD, Taki AC, Rostami A, Jabbar A, Keiser J, Geary TG, Gasser RB. Whole-organism phenotypic screening methods used in early-phase anthelmintic drug discovery. Biotechnol Adv 2022; 57:107937. [PMID: 35271946 DOI: 10.1016/j.biotechadv.2022.107937] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 01/12/2022] [Revised: 02/24/2022] [Accepted: 03/03/2022] [Indexed: 01/17/2023]
Abstract
Diseases caused by parasitic helminths (worms) represent a major global health burden in both humans and animals. As vaccines against helminths have yet to achieve a prominent role in worm control, anthelmintics are the primary tool to limit production losses and disease due to helminth infections in both human and veterinary medicine. However, the excessive and often uncontrolled use of these drugs has led to widespread anthelmintic resistance in these worms - particularly of animals - to almost all commercially available anthelmintics, severely compromising control. Thus, there is a major demand for the discovery and development of new classes of anthelmintics. A key component of the discovery process is screening libraries of compounds for anthelmintic activity. Given the need for, and major interest by the pharmaceutical industry in, novel anthelmintics, we considered it both timely and appropriate to re-examine screening methods used for anthelmintic discovery. Thus, we reviewed current literature (1977-2021) on whole-worm phenotypic screening assays developed and used in academic laboratories, with a particular focus on those employed to discover nematocides. This review reveals that at least 50 distinct phenotypic assays with low-, medium- or high-throughput capacity were developed over this period, with more recently developed methods being quantitative, semi-automated and higher throughput. The main features assessed or measured in these assays include worm motility, growth/development, morphological changes, viability/lethality, pharyngeal pumping, egg hatching, larval migration, CO2- or ATP-production and/or enzyme activity. Recent progress in assay development has led to the routine application of practical, cost-effective, medium- to high-throughput whole-worm screening assays in academic or public-private partnership (PPP) contexts, and major potential for novel high-content, high-throughput platforms in the near future. 
Complementing this progress are major advances in the molecular data sciences, computational biology and informatics, which are likely to further enable and accelerate anthelmintic drug discovery and development.
Affiliation(s)
- H M P Dilrukshi Herath
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria, Australia
- Aya C Taki
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria, Australia
- Ali Rostami
  - Infectious Diseases and Tropical Medicine Research Center, Health Research Institute, Babol University of Medical Sciences, Babol, Iran
- Abdul Jabbar
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria, Australia
- Jennifer Keiser
  - Department of Medical Parasitology and Infection Biology, Swiss Tropical and Public Health Institute, CH-4051 Basel, Switzerland
- Timothy G Geary
  - Institute of Parasitology, McGill University, Sainte Anne-de-Bellevue, Quebec H9X3V9, Canada; School of Biological Sciences, Queen's University-Belfast, Belfast, Ireland
- Robin B Gasser
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria, Australia
10
Abstract
DNA is central to the propagation and evolution of most living organisms due to the essential process of its self-replication. Yet it also encodes factors that permit epigenetic (not included in DNA sequence) flow of information from parents to their offspring and beyond. The known mechanisms of epigenetic inheritance include chemical modifications of DNA and chromatin, as well as regulatory RNAs. All these factors can modulate gene expression programs in the ensuing generations. The nematode Caenorhabditis elegans is recognized as a pioneer organism in transgenerational epigenetic inheritance research. Recent advances in C. elegans epigenetics include the discoveries of control mechanisms that limit the duration of RNA-based epigenetic inheritance, periodic DNA motifs that counteract epigenetic silencing establishment, new mechanistic insights into epigenetic inheritance carried by sperm, and the tantalizing examples of inheritance of sensory experiences. This review aims to highlight new findings in epigenetics research in C. elegans with the main focus on transgenerational epigenetic phenomena dependent on small RNAs.
Affiliation(s)
- Alla Grishok
  - Department of Biochemistry, BU Genome Science Institute, Boston University School of Medicine, 72 E. Concord St. K422, Boston, MA 02118, USA
11
Campos TL, Korhonen PK, Hofmann A, Gasser RB, Young ND. Harnessing model organism genomics to underpin the machine learning-based prediction of essential genes in eukaryotes - Biotechnological implications. Biotechnol Adv 2021; 54:107822. [PMID: 34461202 DOI: 10.1016/j.biotechadv.2021.107822] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/09/2021] [Revised: 08/17/2021] [Accepted: 08/24/2021] [Indexed: 12/17/2022]
Abstract
The availability of high-quality genomes and advances in functional genomics have enabled large-scale studies of essential genes in model eukaryotes, including the 'elegant worm' (Caenorhabditis elegans; Nematoda) and the 'vinegar fly' (Drosophila melanogaster; Arthropoda). However, this is not the case for other, much less-studied organisms, such as socioeconomically important parasites, for which functional genomic platforms usually do not exist. Thus, there is a need to develop innovative techniques or approaches for the prediction, identification and investigation of essential genes. A key approach that could enable the prediction of such genes is machine learning (ML). Here, we undertake an historical review of experimental and computational approaches employed for the characterisation of essential genes in eukaryotes, with a particular focus on model ecdysozoans (C. elegans and D. melanogaster), and discuss the possible applicability of ML-approaches to organisms such as socioeconomically important parasites. We highlight some recent results showing that high-performance ML, combined with feature engineering, allows a reliable prediction of essential genes from extensive, publicly available 'omic data sets, with major potential to prioritise such genes (with statistical confidence) for subsequent functional genomic validation. These findings could 'open the door' to fundamental and applied research areas. Evidence of some commonality in the essential gene-complement between these two organisms indicates that an ML-engineering approach could find broader applicability to ecdysozoans such as parasitic nematodes or arthropods, provided that suitably large and informative data sets become/are available for proper feature engineering, and for the robust training and validation of algorithms. 
This area warrants detailed exploration to, for example, facilitate the identification and characterisation of essential molecules as novel targets for drugs and vaccines against parasitic diseases. This focus is particularly important, given the substantial impact that such diseases have worldwide, and the current challenges associated with their prevention and control and with drug resistance in parasite populations.
Affiliation(s)
- Tulio L Campos
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia; Bioinformatics Core Facility, Instituto Aggeu Magalhães, Fundação Oswaldo Cruz (IAM-Fiocruz), Recife, Pernambuco, Brazil
- Pasi K Korhonen
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Andreas Hofmann
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
|
12
|
Campos TL, Korhonen PK, Young ND. Cross-Predicting Essential Genes between Two Model Eukaryotic Species Using Machine Learning. Int J Mol Sci 2021; 22:5056. [PMID: 34064595 PMCID: PMC8150380 DOI: 10.3390/ijms22105056] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/03/2021] [Revised: 05/07/2021] [Accepted: 05/08/2021] [Indexed: 12/24/2022] Open
Abstract
Experimental studies of Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular and cellular processes in metazoans at large. Since the publication of their genomes, functional genomic investigations have identified genes that are essential or non-essential for survival in each species. Recently, a range of features linked to gene essentiality have been inferred using a machine learning (ML)-based approach, allowing essentiality predictions within a species. Nevertheless, predictions between species are still elusive. Here, we undertake a comprehensive study using ML to discover and validate features of essential genes common to both C. elegans and D. melanogaster. We demonstrate that the cross-species prediction of gene essentiality is possible using a subset of features linked to nucleotide/protein sequences, protein orthology and subcellular localisation, single-cell RNA-seq, and histone methylation markers. Complementary analyses showed that essential genes are enriched for transcription and translation functions and are preferentially located away from heterochromatin regions of C. elegans and D. melanogaster chromosomes. The present work should enable the cross-prediction of essential genes between model and non-model metazoans.
Affiliation(s)
- Tulio L. Campos
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010, Australia
- Bioinformatics Core Facility, Instituto Aggeu Magalhães, Fundação Oswaldo Cruz (IAM-Fiocruz), Recife 50740-465, PE, Brazil
- Pasi K. Korhonen
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010, Australia
- Neil D. Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010, Australia
|
13
|
Young ND, Stroehlein AJ, Kinkar L, Wang T, Sohn WM, Chang BCH, Kaur P, Weisz D, Dudchenko O, Aiden EL, Korhonen PK, Gasser RB. High-quality reference genome for Clonorchis sinensis. Genomics 2021; 113:1605-1615. [PMID: 33677057 DOI: 10.1016/j.ygeno.2021.03.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/10/2020] [Revised: 01/18/2021] [Accepted: 03/01/2021] [Indexed: 12/13/2022]
Abstract
The Chinese liver fluke, Clonorchis sinensis, causes the disease clonorchiasis, affecting ~35 million people in regions of China, Vietnam, Korea and the Russian Far East. Chronic clonorchiasis causes cholangitis and can induce a malignant cancer, called cholangiocarcinoma, in the biliary system. Control in endemic regions is challenging, and often relies largely on chemotherapy with one anthelmintic, called praziquantel. Routine treatment carries a significant risk of inducing resistance to this anthelmintic in the fluke, such that the discovery of new interventions is considered important. It is hoped that the use of molecular technologies will assist this endeavour by enabling the identification of drug or vaccine targets involved in crucial biological processes and/or pathways in the parasite. Although draft genomes of C. sinensis have been published, their assemblies are fragmented. In the present study, we tackle this genome fragmentation issue by utilising, in an integrated way, advanced (second- and third-generation) DNA sequencing and informatic approaches to build a high-quality reference genome for C. sinensis, with chromosome-level contiguity and curated gene models. This substantially enhanced genome provides a resource that could accelerate fundamental and applied molecular investigations of C. sinensis, clonorchiasis and/or cholangiocarcinoma, and assist in the discovery of new interventions against what is a highly significant, but neglected, disease complex.
Affiliation(s)
- Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
- Andreas J Stroehlein
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
- Liina Kinkar
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
- Tao Wang
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
- Woon-Mok Sohn
- Department of Parasitology and Institute of Health Sciences, School of Medicine, Gyeongsang National University, Jinju, Republic of Korea
- Bill C H Chang
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
- Parwinder Kaur
- UWA School of Agriculture and Environment, Faculty of Science, University of Western Australia, Perth, Western Australia 6009, Australia
- David Weisz
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Olga Dudchenko
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
- Erez Lieberman Aiden
- UWA School of Agriculture and Environment, Faculty of Science, University of Western Australia, Perth, Western Australia 6009, Australia; The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA; Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech, Pudong 201210, China
- Pasi K Korhonen
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
- Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia
|