1
Yang X, Yu X, Ming Y, Liu H, Zhu W, Yan B, Huang H, Ding L, Qian X, Wang Y, Wu K, Niu M, Yan Q, Huang X, Wang C, Wang Y, He Z. The vertical distribution and metabolic versatility of complete ammonia oxidizing communities in mangrove sediments. ENVIRONMENTAL RESEARCH 2025; 277:121602. [PMID: 40222470 DOI: 10.1016/j.envres.2025.121602] [Received: 01/31/2025] [Revised: 04/10/2025] [Accepted: 04/11/2025] [Indexed: 04/15/2025]
Abstract
Recently discovered complete ammonia-oxidizing (comammox) microorganisms can completely oxidize ammonia to nitrate and play an important role in the nitrogen (N) cycle across various ecosystems. However, little is known about the vertical distribution and metabolic versatility of comammox communities in mangrove ecosystems. Here we profiled comammox communities from deep sediments (up to 5 m) in a mangrove wetland by combining metagenome sequencing and physicochemical properties analysis. Our results showed that the relative abundance of comammox bacteria (23.2 %) was higher than ammonia-oxidizing bacteria (AOB, 12.0 %), but lower than ammonia-oxidizing archaea (AOA, 64.8 %). The abundance of comammox communities significantly (p < 0.01) decreased with the sediment depth, and dissolved organic carbon and total sulfur appeared to be major environmental factors influencing the nitrifying microbial community structure. We also recovered a high-quality metagenome-assembled genome (MAG) of comammox bacteria (Nitrospira sp. bin2030) affiliated with comammox clade A. Nitrospira sp. bin2030 possessed diverse metabolic processes, not only the key genes for ammonia oxidation and urea utilization in the N cycle, but also key genes involved in carbon and energy metabolisms, sulfur metabolism, and environmental adaptation (e.g., oxidative stress, salinity, temperature, heavy metal tolerance). The findings advance our understanding of vertical distribution and metabolic versatility of comammox communities in mangrove sediments, having important implications for quantifying their contribution to nitrification processes in mangrove ecosystems.
Affiliation(s)
- Xinlei Yang
  - College of Marine Sciences, South China Agricultural University, Guangzhou, 510642, China; Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Xiaoli Yu
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Yuzhen Ming
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Huanping Liu
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Wengen Zhu
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Bozhi Yan
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Huaxia Huang
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Lang Ding
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Xin Qian
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Yukun Wang
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Kun Wu
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Mingyang Niu
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Qingyun Yan
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Xiaohong Huang
  - College of Marine Sciences, South China Agricultural University, Guangzhou, 510642, China
- Cheng Wang
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Yuejun Wang
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
- Zhili He
  - Marine Synthetic Ecology Research Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Marine Science, School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai, 519082, China
2
Asadi S, Soorni A, Mehrabi R, Talebi M. Exploring effector candidates in Rhynchosporium commune: insights into their expression dynamics during barley infection. Sci Rep 2025; 15:17667. [PMID: 40399472 PMCID: PMC12095539 DOI: 10.1038/s41598-025-02572-0] [Received: 02/04/2025] [Accepted: 05/14/2025] [Indexed: 05/23/2025]
Abstract
Rhynchosporium commune is a fungal pathogen that causes scald disease in barley, leading to significant yield losses and reduced grain quality in susceptible cultivars. Effector proteins secreted by R. commune play crucial roles in manipulating host defenses and facilitating infection. This study therefore aimed to identify and characterize effector candidates (ECs) in R. commune using a comprehensive bioinformatics approach combined with experimental validation. Initially, a dataset of 12,211 genes from the R. commune strain UK7 genome was analyzed to identify potential ECs, resulting in the selection of 48 candidate proteins. These candidates were further validated using RNA-Seq analysis, which confirmed significant expression of 27 ECs during infection. Our analysis re-identified key effectors, including CZT06923 and CZT13833, with 100% identity to NIP3 and NIP2, respectively, in R. commune. Novel ECs, such as CZT07600, CZT13755, and CZT13375, were identified with lower identity to NIP2, suggesting potential variants. Additionally, structural analysis revealed that the EC CZT07873 shows significant structural similarity to a known fungal effector. qRT-PCR validation confirmed the differential expression of CZS93219 and CZT13755, with peak expression at 9 and 12 dpi, respectively. This comprehensive approach enhances our understanding of R. commune's pathogenic mechanisms and provides insights into potential targets for developing disease management strategies in barley cultivation.
Affiliation(s)
- Samin Asadi
  - Department of Biotechnology, College of Agriculture, Isfahan University of Technology, Isfahan, Iran
- Aboozar Soorni
  - Department of Biotechnology, College of Agriculture, Isfahan University of Technology, Isfahan, Iran
- Rahim Mehrabi
  - Department of Biotechnology, College of Agriculture, Isfahan University of Technology, Isfahan, Iran
  - Keygene N.V., 6700 AE, Wageningen, The Netherlands
- Majid Talebi
  - Department of Biotechnology, College of Agriculture, Isfahan University of Technology, Isfahan, Iran
3
Joshi S, Mohapatra S, Kumar D, Joshi A, Iyer M, Sowdhamini R. GenDiS3 database: census on the prevalence of protein domain superfamilies of known structure in the entire sequence database. Database (Oxford) 2025; 2025:baaf035. [PMID: 40343712 PMCID: PMC12063530 DOI: 10.1093/database/baaf035] [Received: 09/20/2024] [Revised: 01/08/2025] [Accepted: 04/09/2025] [Indexed: 05/11/2025]
Abstract
Despite the vast amount of sequence data available, a significant disparity exists between the number of protein sequences identified and the relatively few structures that have been resolved. This disparity highlights the challenge in structural biology to bridge the gap between sequence information and 3D structural data, and the necessity for robust databases capable of linking distant homologs to known structures. Studies have indicated that there are a limited number of structural folds, despite the vast diversity of proteins. Hence, computational tools can enhance our ability to classify protein sequences well before their structures are determined or their functions are characterized, thereby bridging the gap between sequence and structural data. GenDiS (Genomic Distribution of Superfamilies) is a repository with information on the genomic distribution of protein domain superfamilies, involving a one-time computational exercise to search for trusted homologs of protein domains of known structures against the vast sequence database. We have updated this database employing advanced bioinformatics tools, including DELTA-BLAST (domain enhanced lookup time accelerated BLAST) for initial detection of hits and HMMSCAN for validation, significantly improving the accuracy of domain identification. Using these tools, over 151 million sequence homologs for 2060 superfamilies [SCOPe (Structural Classification of Proteins extended)] were identified, and 116 million of them were validated as true positives. Through a case study on glycolysis-related enzymes, variations in domain architectures of these enzymes are explored, revealing evolutionary changes and functional diversity among these essential proteins. We present another case, the LOG gene, in which users can find significant mutations across the evolutionary lineage. The GenDiS database, GenDiS3, and the associated tools made available at https://caps.ncbs.res.in/gendis3/ offer a powerful resource for researchers in functional annotation and evolutionary studies. Database URL: https://caps.ncbs.res.in/gendis3/.
Affiliation(s)
- Sarthak Joshi
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore 560065, India
- Shailendu Mohapatra
  - Computational Biology, Institute of Bioinformatics and Applied Biotechnology, Bangalore 560100, India
- Dhwani Kumar
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore 560065, India
- Adwait Joshi
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore 560065, India
- Meenakshi Iyer
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore 560065, India
- Ramanathan Sowdhamini
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore 560065, India
  - Computational Biology, Institute of Bioinformatics and Applied Biotechnology, Bangalore 560100, India
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
4
Pieramico G, Makkinayeri S, Guidotti R, Basti A, Voso D, Lucarelli D, D’Andrea A, L’Abbate T, Romani GL, Pizzella V, Marzetti L. Robustness of brain state identification in synthetic phase-coupled neurodynamics using Hidden Markov Models. Front Syst Neurosci 2025; 19:1548437. [PMID: 40342833 PMCID: PMC12058723 DOI: 10.3389/fnsys.2025.1548437] [Received: 12/19/2024] [Accepted: 04/01/2025] [Indexed: 05/11/2025]
Abstract
Hidden Markov Models (HMMs) have emerged as a powerful tool for analyzing time series of neural activity. Gaussian HMMs and their time-resolved extension, Time-Delay Embedded HMMs (TDE-HMMs), have been instrumental in detecting discrete brain states in the form of temporal sequences of large-scale brain networks. To assess the performance of Gaussian HMMs and TDE-HMMs in this context, we conducted simulations that generated synthetic data representing multiple phase-coupled interactions between different cortical regions to mimic real neural data. Our study demonstrates that TDE-HMM performs better than Gaussian HMM in accurately detecting brain states from synthetic phase-coupled interaction data. Finally, for TDE-HMMs, we manipulated key parameters such as phase coupling variability, state duration, and influence of volume conduction effect to evaluate the models' performance under varying conditions.
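As an illustration of the time-delay embedding step that separates a TDE-HMM from a plain Gaussian HMM, the following minimal Python sketch stacks lagged copies of a single-channel signal so that each row describes a short window around one time point. The function name `time_delay_embed`, the parameter `max_lag`, and the toy signal are hypothetical and not taken from the paper; a full pipeline would go on to fit an HMM to the embedded data, which this sketch does not attempt.

```python
def time_delay_embed(signal, max_lag):
    """Return one row per valid time point, each holding the samples
    from max_lag steps before to max_lag steps after that point.

    Hypothetical helper for illustration only.
    """
    width = 2 * max_lag + 1  # number of samples in each window
    n_rows = len(signal) - width + 1  # windows that fit entirely inside the signal
    return [list(signal[start:start + width]) for start in range(n_rows)]


# Toy single-channel signal (made-up values, not data from the study).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
embedded = time_delay_embed(signal, max_lag=1)
for row in embedded:
    print(row)
```

Each row of `embedded` carries local temporal structure, so a state model fitted to these rows can respond to oscillatory content rather than only to instantaneous amplitude.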
Affiliation(s)
- Giulia Pieramico
  - Department of Engineering and Geology, University of Chieti-Pescara, Pescara, Abruzzo, Italy
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Saeed Makkinayeri
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
  - Department of Neuroscience, Imaging and Clinical Sciences, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Roberto Guidotti
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
  - Department of Neuroscience, Imaging and Clinical Sciences, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Alessio Basti
  - Department of Engineering and Geology, University of Chieti-Pescara, Pescara, Abruzzo, Italy
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Domenico Voso
  - Department of Neuroscience, Imaging and Clinical Sciences, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Delia Lucarelli
  - Department of Neuroscience, Imaging and Clinical Sciences, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Antea D’Andrea
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
  - Department of Neuroscience, Imaging and Clinical Sciences, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Teresa L’Abbate
  - Department of Engineering and Geology, University of Chieti-Pescara, Pescara, Abruzzo, Italy
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Gian Luca Romani
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Vittorio Pizzella
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
  - Department of Neuroscience, Imaging and Clinical Sciences, University of Chieti-Pescara, Chieti, Abruzzo, Italy
- Laura Marzetti
  - Department of Engineering and Geology, University of Chieti-Pescara, Pescara, Abruzzo, Italy
  - Institute for Advanced Biomedical Technologies, University of Chieti-Pescara, Chieti, Abruzzo, Italy
5
Mitchell RAC. Identification of universal grass genes and estimates of their monocot-/commelinid-/grass-specificity. BIOINFORMATICS ADVANCES 2025; 5:vbaf079. [PMID: 40417655 PMCID: PMC12098945 DOI: 10.1093/bioadv/vbaf079] [Received: 01/21/2025] [Revised: 03/20/2025] [Accepted: 04/04/2025] [Indexed: 05/27/2025]
Abstract
Motivation: Where experiments identify sets of grass genes of unknown function, e.g. underlying a QTL or co-expressed in a transcriptome, it is useful to know which of these genes are common to all grasses (universal) and whether they likely have monocot-/commelinid-/grass-specific function. Results: A pipeline used data on 16 grass full genomes from Ensembl Plants to generate 13 312 highly conserved, universal groups of grass protein-coding genes. Validation steps showed that 98.8% of these groups also had gene matches in recently sequenced genomes from two major grass clades not used in the pipeline. Comparison with many non-grass genomes identified 4609 of these groups as likely of monocot-/commelinid-/grass-specific function. Both grouping of genes and specificity were defined using hidden Markov model (HMM) profiles of the groups. The HMM-based approach performed better than simple percentage identity in discriminating between test sets of known specific and non-specific genes. The results give novel insight into the nature of monocot-/commelinid-/grass-specific genes. Researchers can use the universal_grass_peps database to gain evidence for their experimentally identified grass genes being involved in monocot-/commelinid-/grass-specific traits. Availability and implementation: The universal_grass_peps database is available for download at https://data.rothamsted.ac.uk/dataset/universal_grass_peps.
Affiliation(s)
- Rowan A C Mitchell
  - Sustainable Soils and Crops, Rothamsted Research, Harpenden, Hertfordshire AL5 2JQ, United Kingdom
6
Kalal V, Jha BK. Cancer detection with various classification models: A comprehensive feature analysis using HMM to extract a nucleotide pattern. Comput Biol Chem 2024; 113:108215. [PMID: 39378821 DOI: 10.1016/j.compbiolchem.2024.108215] [Received: 07/12/2024] [Revised: 09/04/2024] [Accepted: 09/15/2024] [Indexed: 10/10/2024]
Abstract
This work presents a novel feature extraction method for identifying complex patterns in genomic sequences by employing the Hidden Markov Model (HMM). In this study, we use HMM to identify gene nucleotide patterns that are specific to malignant and non-malignant cells. Crucial genetic components DNA and RNA are involved in many biological processes that impact both healthy and malignant cells. Early patient identification is essential to successful cancer diagnosis and therapy. Varying nucleotide patterns indicate different cellular responses, which are important to understanding the molecular causes of cancer and associated disorders. We present a detailed study of nucleotide patterns in whole raw nucleotide sequences with variations in both protein sequence (CDS) and non-protein sequence (NCDS) in both malignant and non-malignant cells. Nucleotide prediction has been achieved while computational expenses are reduced by using the proposed HMM for feature extraction and selection. The classification models implemented in this work for cancer detection are Gradient-Boosted Decision Trees (GBDT), Random Forests (RF), Decision Trees (DT), and Support Vector Machines (SVM) with kernels. The suggested classification model's accuracy and 10-fold cross-validation have been validated via comprehensive case studies. The results reveal that DT and ensemble learning techniques significantly differentiate between malignant and non-malignant DNA sequences. SVM with suitable kernels improves cancer detection accuracy significantly. Combining feature reduction approaches with nucleotide pattern classifiers based on Hidden Markov models improves performance and ensures reliable cancer detection.
Affiliation(s)
- Vijay Kalal
  - Department of Mathematics, School of Technology, Pandit Deendayal Energy University, Raysan, Gandhinagar, Gujarat 382007, India
- Brajesh Kumar Jha
  - Department of Mathematics, School of Technology, Pandit Deendayal Energy University, Raysan, Gandhinagar, Gujarat 382007, India
7
Pinto Y, Bhatt AS. Sequencing-based analysis of microbiomes. Nat Rev Genet 2024; 25:829-845. [PMID: 38918544 DOI: 10.1038/s41576-024-00746-6] [Accepted: 05/15/2024] [Indexed: 06/27/2024]
Abstract
Microbiomes occupy a range of niches and, in addition to having diverse compositions, they have varied functional roles that have an impact on agriculture, environmental sciences, and human health and disease. The study of microbiomes has been facilitated by recent technological and analytical advances, such as cheaper and higher-throughput DNA and RNA sequencing, improved long-read sequencing and innovative computational analysis methods. These advances are providing a deeper understanding of microbiomes at the genomic, transcriptional and translational level, generating insights into their function and composition at resolutions beyond the species level.
Affiliation(s)
- Yishay Pinto
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Medicine, Divisions of Hematology and Blood & Marrow Transplantation, Stanford University, Stanford, CA, USA
- Ami S Bhatt
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Medicine, Divisions of Hematology and Blood & Marrow Transplantation, Stanford University, Stanford, CA, USA
8
Bouchet C, Umair S, Stasiuk S, Grant W, Green P, Knight J. Target screening using RNA interference in the sheep abomasal nematode parasite Haemonchus contortus. Mol Biochem Parasitol 2024; 260:111648. [PMID: 39004228 DOI: 10.1016/j.molbiopara.2024.111648] [Received: 01/31/2024] [Revised: 06/25/2024] [Accepted: 07/11/2024] [Indexed: 07/16/2024]
Abstract
RNA interference (RNAi) in parasitic nematodes has been described as a valuable tool for screening putative targets that could be used as novel drug and/or vaccine candidates. This study aimed to set up a pipeline to identify potential targets using RNAi for vaccine/anti-parasite therapy development against Haemonchus contortus, a blood-feeding abomasal nematode parasite. The available H. contortus sequence data were mined for targets, which were tested for essentiality using RNAi electroporation assays. A total of 56 genes were identified and tested for knockdown using electroporation of first-stage larvae (L1) of H. contortus with the target double-stranded RNA. Electroporation of L1 proved to be effective overall; 17 targets had a strong phenotype and a significant reduction in live H. contortus, and another 24 had a moderate phenotype with a significant reduction in larval development. A total of 28 targets showed a significant reduction in the development of H. contortus larvae to the infective stage (L3) following the RNAi assay. Down-regulation of target transcript levels was evaluated in some targets by semi-quantitative PCR. Four out of five genes tested showed complete knockdown of mRNA levels via semi-quantitative PCR, whereas the knockdown was partial for one. In conclusion, the results confirm that the RNAi pathway is active in H. contortus and that several target genes have the potential to be investigated further as possible vaccine candidates.
Affiliation(s)
- Saleh Umair
  - AgResearch Ltd, Private Bag 11-008, Palmerston North, New Zealand
- Susan Stasiuk
  - AgResearch Ltd, Private Bag 11-008, Palmerston North, New Zealand; Department of Parasitology, University of Calgary, Alberta, Canada
- Warwick Grant
  - AgResearch Ltd, Private Bag 11-008, Palmerston North, New Zealand; Department of Physiology Anatomy and Microbiology, School of Life Sciences, La Trobe University, Bundoora 3083, Australia
- Peter Green
  - AgResearch Ltd, Private Bag 11-008, Palmerston North, New Zealand
9
Telek A, Molnár Z, Takács K, Varga B, Grolmusz V, Tasnádi G, Vértessy BG. Discovery and biocatalytic characterization of opine dehydrogenases by metagenome mining. Appl Microbiol Biotechnol 2024; 108:101. [PMID: 38229296 PMCID: PMC10787698 DOI: 10.1007/s00253-023-12871-z] [Received: 06/13/2023] [Revised: 11/29/2023] [Accepted: 12/06/2023] [Indexed: 01/18/2024]
Abstract
Enzymatic processes play an increasing role in synthetic organic chemistry which requires the access to a broad and diverse set of enzymes. Metagenome mining is a valuable and efficient way to discover novel enzymes with unique properties for biotechnological applications. Here, we report the discovery and biocatalytic characterization of six novel metagenomic opine dehydrogenases from a hot spring environment (mODHs) (EC 1.5.1.X). These enzymes catalyze the asymmetric reductive amination between an amino acid and a keto acid resulting in opines which have defined biochemical roles and represent promising building blocks for pharmaceutical applications. The newly identified enzymes exhibit unique substrate specificity and higher thermostability compared to known examples. The feature that they preferably utilize negatively charged polar amino acids is so far unprecedented for opine dehydrogenases. We have identified two spatially correlated positions in their active sites that govern this substrate specificity and demonstrated a switch of substrate preference by site-directed mutagenesis. While they still suffer from a relatively narrow substrate scope, their enhanced thermostability and the orthogonality of their substrate preference make them a valuable addition to the toolbox of enzymes for reductive aminations. Importantly, enzymatic reductive aminations with highly polar amines are very rare in the literature. Thus, the preparative-scale enzymatic production, purification, and characterization of three highly functionalized chiral secondary amines lend a special significance to our work in filling this gap. KEY POINTS: • Six new opine dehydrogenases have been discovered from a hot spring metagenome • The newly identified enzymes display a unique substrate scope • Substrate specificity is governed by two correlated active-site residues.
Grants
- K119493 National Research, Development and Innovation Office
- K135231 National Research, Development and Innovation Office
- VEKOP-2.3.2-16-2017-00013 National Research, Development and Innovation Office
- NKP-2018-1.2.1-NKP-2018-00005 National Research, Development and Innovation Office
- TKP2021-EGA-02 National Research, Development and Innovation Office
- ÚNKP-22-4-II-BME-158 National Research, Development and Innovation Office
- RRF-2.3.1-21-2022-000 15 National Research, Development and Innovation Office
- C1580174 Nemzeti Kutatási, Fejlesztési és Innovaciós Alap
- ELTE TKP 2021-NKTA-62 Nemzeti Kutatási, Fejlesztési és Innovaciós Alap
- 2022-1.2.2-TÉT-IPARI-UZ-2022-00003 Nemzeti Kutatási, Fejlesztési és Innovaciós Alap
- Budapest University of Technology and Economics
Affiliation(s)
- András Telek
  - Department of Applied Biotechnology, Budapest University of Technology and Economics, Budapest, Hungary
  - Servier Research Institute of Medicinal Chemistry, Budapest, Hungary
- Zsófia Molnár
  - Institute of Molecular Life Sciences, Research Centre for Natural Sciences, HUN-REN, Budapest, Hungary
  - Department of Organic Chemistry and Technology, Budapest University of Technology and Economics, Budapest, Hungary
- Kristóf Takács
  - PIT Bioinformatics Group, Institute of Mathematics, Eötvös University, Budapest, Hungary
- Bálint Varga
  - PIT Bioinformatics Group, Institute of Mathematics, Eötvös University, Budapest, Hungary
- Vince Grolmusz
  - PIT Bioinformatics Group, Institute of Mathematics, Eötvös University, Budapest, Hungary
- Gábor Tasnádi
  - Servier Research Institute of Medicinal Chemistry, Budapest, Hungary
- Beáta G Vértessy
  - Department of Applied Biotechnology, Budapest University of Technology and Economics, Budapest, Hungary
  - Institute of Molecular Life Sciences, Research Centre for Natural Sciences, HUN-REN, Budapest, Hungary
10
Lee CP, Le XH, Gawryluk RMR, Casaretto JA, Rothstein SJ, Millar AH. EARLY NODULIN93 acts via cytochrome c oxidase to alter respiratory ATP production and root growth in plants. THE PLANT CELL 2024; 36:4716-4731. [PMID: 39179507 PMCID: PMC11530774 DOI: 10.1093/plcell/koae242] [Received: 12/01/2023] [Revised: 07/24/2024] [Accepted: 08/20/2024] [Indexed: 08/26/2024]
Abstract
EARLY NODULIN 93 (ENOD93) has been genetically associated with biological nitrogen fixation in legumes and nitrogen use efficiency in cereals, but its precise function is unknown. We show that hidden Markov models define ENOD93 as a homolog of the N-terminal domain of RESPIRATORY SUPERCOMPLEX FACTOR 2 (RCF2). RCF2 regulates cytochrome oxidase (CIV), influencing the generation of a mitochondrial proton motive force in yeast (Saccharomyces cerevisiae). Knockout of ENOD93 in Arabidopsis (Arabidopsis thaliana) causes a short root phenotype and early flowering. ENOD93 is associated with a protein complex the size of CIV in mitochondria, but neither CIV abundance nor its activity changed in ruptured organelles of enod93. However, a progressive loss of ADP-dependent respiration rate was observed in intact enod93 mitochondria, which could be recovered in complemented lines. Mitochondrial membrane potential was higher in enod93 in a CIV-dependent manner, but ATP synthesis and ADP depletion rates progressively decreased. The respiration rate of whole enod93 seedlings was elevated, and root ADP content was nearly double that in wild type without a change in ATP content. We propose that ENOD93 and HYPOXIA-INDUCED GENE DOMAIN 2 (HIGD2) are the functional equivalent of yeast RCF2 but have remained undiscovered in many eukaryotic lineages because they are encoded by 2 distinct genes.
Affiliation(s)
- Chun Pong Lee
  - School of Molecular Sciences, University of Western Australia, Crawley, WA 6009, Australia
- Xuyen H Le
  - School of Molecular Sciences, University of Western Australia, Crawley, WA 6009, Australia
- Ryan M R Gawryluk
  - Department of Biology, University of Victoria, Victoria, BC V8W 2Y2, Canada
- José A Casaretto
  - Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
- Steven J Rothstein
  - Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
- A Harvey Millar
  - School of Molecular Sciences, University of Western Australia, Crawley, WA 6009, Australia
11
|
Wang B, Mount S. Latent Dirichlet allocation mixture models for nucleotide sequence analysis. NAR Genom Bioinform 2024; 6:lqae099. [PMID: 39131816 PMCID: PMC11310860 DOI: 10.1093/nargab/lqae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 06/13/2024] [Accepted: 07/23/2024] [Indexed: 08/13/2024] Open
Abstract
Strings of nucleotides carrying biological information are typically described as sequence motifs represented by weight matrices or consensus sequences. However, many signals in DNA or RNA are recognized by multiple factors in temporal sequence, consist of distinct alternative motifs, or are best described by base composition. Here we apply the latent Dirichlet allocation (LDA) mixture model to nucleotide sequences. Using positions in an alignment of human or Drosophila splice sites as samples, we show that LDA readily identifies motifs, including such elusive cases as the intron branch site. Using whole sequences with positional k-mers as features, LDA can identify sequence subtypes enriched in long vs. short introns. LDA with bulk k-mers can reliably distinguish reading frame and species of origin in coding sequences from humans and Drosophila. We find that LDA is a useful model for describing heterogeneous signals, for assigning individual sequences to subtypes, and for identifying and characterizing sequences that do not fit recognized subtypes. Because LDA topic models are interpretable, they also aid the discovery of new motifs, even those present in a small fraction of samples. In summary, LDA can identify and characterize signals in nucleotide sequences, including candidate regulatory factors involved in biological processes.
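The two feature types named in this abstract, positional k-mers and bulk k-mers, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `KMER@position` token format, and the toy sequences are assumptions chosen for illustration. The resulting count matrix is the kind of input a topic-model implementation (for example scikit-learn's `LatentDirichletAllocation`) would accept.

```python
from collections import Counter


def bulk_kmers(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq, ignoring where it occurs."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def positional_kmers(seq: str, k: int) -> Counter:
    """Count k-mers tagged with their start position (hypothetical 'GT@0' format)."""
    seq = seq.upper()
    return Counter(f"{seq[i:i + k]}@{i}" for i in range(len(seq) - k + 1))


def count_matrix(docs, vocab):
    """Build a dense sequences-by-features count matrix as nested lists."""
    return [[doc[token] for token in vocab] for doc in docs]


# Toy aligned sequences (made up for illustration, not real splice sites).
seqs = ["GTAAGT", "GTGAGT", "GTAAGA"]
docs = [positional_kmers(s, 2) for s in seqs]
vocab = sorted(set().union(*docs))
matrix = count_matrix(docs, vocab)
```

Positional tokens keep the alignment column, so a topic can capture a motif at a fixed offset; bulk tokens discard position and reflect composition only.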
Affiliation(s)
- Bixuan Wang
  - Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Stephen M Mount
  - Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
|
12
|
Raiyn J, Rayan A, Abu-Lafi S, Rayan A. From Sequence to Solution: Intelligent Learning Engine Optimization in Drug Discovery and Protein Analysis. BIOTECH 2024; 13:33. [PMID: 39311335 PMCID: PMC11417716 DOI: 10.3390/biotech13030033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 08/22/2024] [Accepted: 08/29/2024] [Indexed: 09/26/2024] Open
Abstract
This study introduces the intelligent learning engine (ILE) optimization technology, a novel approach designed to revolutionize screening processes in bioinformatics, cheminformatics, and a range of other scientific fields. By focusing on the efficient and precise identification of candidates with desirable characteristics, the ILE technology marks a significant leap forward in addressing the complexities of candidate selection in drug discovery, protein classification, and beyond. The study's primary objective is to address the challenges associated with optimizing screening processes to efficiently select candidates across various fields, including drug discovery and protein classification. The methodology employed involves a detailed algorithmic process that includes dataset preparation, encoding of protein sequences, sensor nucleation, and optimization, culminating in the empirical evaluation of molecular activity indexing, homology-based modeling, and classification of proteins such as G-protein-coupled receptors. This process showcases the method's success in multiple sequence alignment, protein identification, and classification. Key results demonstrate the ILE's superior accuracy in protein classification and virtual high-throughput screening, with a notable breakthrough in drug development for assessing drug-induced long QT syndrome risks through hERG potassium channel interaction analysis. The technology showcased exceptional results in the formulation and evaluation of novel cancer drug candidates, highlighting its potential for significant advancements in pharmaceutical innovations. The findings underline the ILE optimization technology as a transformative tool in screening processes due to its proven effectiveness and broad applicability across various domains. 
This breakthrough contributes substantially to the fields of systems optimization and holds promise for diverse applications, enhancing the process of selecting candidate molecules with target properties and advancing drug discovery, protein classification, and modeling.
Affiliation(s)
- Jamal Raiyn
  - Computer Science Department, Faculty of Science, Al-Qasemi Academic College, Baka EL-Garbiah 30100, Israel
- Adam Rayan
  - NGS Ac-Tech—Next Generation Scholars Ltd., Kabul 2496300, Israel
- Saleh Abu-Lafi
  - Faculty of Pharmacy, Al-Quds University, Abu-Dies 144, Palestine
- Anwar Rayan
  - NGS Ac-Tech—Next Generation Scholars Ltd., Kabul 2496300, Israel
  - Science and Technology Department, Faculty of Science, Al-Qasemi Academic College, Baka EL-Garbiah 30100, Israel
|
13
|
Ho LYL, Pan L, Meng F, Ho KTM, Liu F, Wu MT, Lei HI, Bhachu G, Wang X, Dahlsten O, Sun Y, Lee PH, Tan GYA. Quantum modeling simulates nutrient effect of bioplastic polyhydroxyalkanoate (PHA) production in Pseudomonas putida. Sci Rep 2024; 14:18255. [PMID: 39107357 PMCID: PMC11303679 DOI: 10.1038/s41598-024-68727-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 07/26/2024] [Indexed: 08/10/2024] Open
Abstract
Polyhydroxyalkanoates (PHAs) could be used to make sustainable, biodegradable plastics. However, the precise and accurate mechanistic modeling of PHA biosynthesis, especially medium-chain-length PHA (mcl-PHA), for yield improvement remains a challenge to biology. PHA biosynthesis is typically triggered by nitrogen limitation and tends to peak at an optimal carbon-to-nitrogen (C/N) ratio. Specifically, simulation of the underlying dynamic regulation mechanisms for the PHA bioprocess is a bottleneck owing to excessive model complexity and current modeling philosophies for uncertainty. To address this issue, we proposed a quantum-like decision-making model that encodes gene expression and regulation events as hidden layers through the general transformation of a density matrix, which uses the interference of probability amplitudes to provide an empirical-level description of PHA biosynthesis. We implemented our framework to model the biosynthesis of mcl-PHA in Pseudomonas putida with respect to external C/N ratios, showing a maximum PHA production of 13.81% cell dry mass (CDM) at a C/N ratio of 40:1. The results also suggest the degree of P. putida's preference for channeling carbon towards PHA production as part of the bacterium's adaptive behavior to nutrient stress using quantum formalism. Generic parameters (kD, kN and θ) obtained from this quantum formulation, representing P. putida's PHA biosynthesis with respect to external C/N ratios, are discussed. This work offers a new perspective on the use of quantum theory for PHA production, demonstrating its application potential for other bioprocesses.
Affiliation(s)
- Lawrence Yuk Lung Ho
  - Department of Architecture and Civil Engineering, City University of Hong Kong, Hong Kong SAR, China
- Li Pan
  - Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
- Fei Meng
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Kin Tung Michael Ho
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Feiyang Liu
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Ming-Tsung Wu
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Hei I Lei
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Govind Bhachu
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Xin Wang
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Oscar Dahlsten
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Yanni Sun
  - Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
- Po-Heng Lee
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Giin Yu Amy Tan
  - Department of Architecture and Civil Engineering, City University of Hong Kong, Hong Kong SAR, China
|
14
|
Bhushan V, Nita-Lazar A. Recent Advancements in Subcellular Proteomics: Growing Impact of Organellar Protein Niches on the Understanding of Cell Biology. J Proteome Res 2024; 23:2700-2722. [PMID: 38451675 PMCID: PMC11296931 DOI: 10.1021/acs.jproteome.3c00839] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 03/08/2024]
Abstract
The mammalian cell is a complex entity, with membrane-bound and membrane-less organelles playing vital roles in regulating cellular homeostasis. Organellar protein niches drive discrete biological processes and cell functions, thus maintaining cell equilibrium. Cellular processes such as signaling, growth, proliferation, motility, and programmed cell death require dynamic protein movements between cell compartments. Aberrant protein localization is associated with a wide range of diseases. Therefore, analyzing the subcellular proteome of the cell can provide a comprehensive overview of cellular biology. With recent advancements in mass spectrometry, imaging technology, computational tools, and deep machine learning algorithms, studies pertaining to subcellular protein localization and their dynamic distributions are gaining momentum. These studies reveal changing interaction networks because of "moonlighting proteins" and serve as a discovery tool for disease network mechanisms. Consequently, this review aims to provide a comprehensive repository for recent advancements in subcellular proteomics subcontexting methods, challenges, and future perspectives for method developers. In summary, subcellular proteomics is crucial to the understanding of the fundamental cellular mechanisms and the associated diseases.
Affiliation(s)
- Vanya Bhushan
  - Functional Cellular Networks Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Aleksandra Nita-Lazar
  - Functional Cellular Networks Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
|
15
|
Trgovec-Greif L, Hellinger HJ, Mainguy J, Pfundner A, Frishman D, Kiening M, Webster NS, Laffy PW, Feichtinger M, Rattei T. VOGDB-Database of Virus Orthologous Groups. Viruses 2024; 16:1191. [PMID: 39205165 PMCID: PMC11360334 DOI: 10.3390/v16081191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 07/21/2024] [Accepted: 07/23/2024] [Indexed: 09/04/2024] Open
Abstract
Computational models of homologous protein groups are essential in sequence bioinformatics. Due to the diversity and rapid evolution of viruses, the grouping of protein sequences from virus genomes is particularly challenging. The low sequence similarities of homologous genes in viruses require specific approaches for sequence- and structure-based clustering. Furthermore, the annotation of virus genomes in public databases is not as consistent and up to date as for many cellular genomes. To tackle these problems, we have developed VOGDB, which is a database of virus orthologous groups. VOGDB is a multi-layer database that progressively groups viral genes into groups connected by increasingly remote similarity. The first layer is based on pair-wise sequence similarities, the second layer is based on the sequence profile alignments, and the third layer uses predicted protein structures to find the most remote similarity. VOGDB groups allow for more sensitive homology searches of novel genes and increase the chance of predicting annotations or inferring phylogeny. VOGDB uses all virus genomes from RefSeq and partially reannotates them. VOGDB is updated with every RefSeq release. The unique feature of VOGDB is the inclusion of both prokaryotic and eukaryotic viruses in the same clustering process, which makes it possible to explore old evolutionary relationships of the two groups. VOGDB is freely available at vogdb.org under the CC BY 4.0 license.
Affiliation(s)
- Lovro Trgovec-Greif
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, 1030 Vienna, Austria
  - Doctoral School of Microbiology and Environmental Systems Science, University of Vienna, 1030 Vienna, Austria
- Hans-Jörg Hellinger
  - Doctoral School of Microbiology and Environmental Systems Science, University of Vienna, 1030 Vienna, Austria
  - Armaments and Defence Technology Agency, Austria
- Alexander Pfundner
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, 1030 Vienna, Austria
  - Doctoral School of Microbiology and Environmental Systems Science, University of Vienna, 1030 Vienna, Austria
- Dmitrij Frishman
  - Department of Bioinformatics, School of Life Sciences, Technical University Munich, 85350 Freising, Germany
- Michael Kiening
  - Department of Bioinformatics, School of Life Sciences, Technical University Munich, 85350 Freising, Germany
- Nicole Suzanne Webster
  - Australian Institute of Marine Science, PMB no3 Townsville MC, Townsville 4810, Australia
  - Institute for Marine and Antarctic Studies, University of Tasmania, Hobart 7000, Australia
  - Australian Centre for Ecogenomics, University of Queensland, Brisbane 4072, Australia
- Patrick William Laffy
  - Australian Institute of Marine Science, PMB no3 Townsville MC, Townsville 4810, Australia
- Michael Feichtinger
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, 1030 Vienna, Austria
- Thomas Rattei
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, 1030 Vienna, Austria
|
16
|
García Mesa JJ, Zhu Z, Cartwright RA. COATi: Statistical Pairwise Alignment of Protein-Coding Sequences. Mol Biol Evol 2024; 41:msae117. [PMID: 38869090 PMCID: PMC11255384 DOI: 10.1093/molbev/msae117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 04/26/2024] [Accepted: 05/28/2024] [Indexed: 06/14/2024] Open
Abstract
Sequence alignment is an essential method in bioinformatics and the basis of many analyses, including phylogenetic inference, ancestral sequence reconstruction, and gene annotation. Sequencing artifacts and errors made during genome assembly, such as abiological frameshifts and incorrect early stop codons, can impact downstream analyses leading to erroneous conclusions in comparative and functional genomic studies. More significantly, while indels can occur both within and between codons in natural sequences, most amino-acid- and codon-based aligners assume that indels only occur between codons. This mismatch between biology and alignment algorithms produces suboptimal alignments and errors in downstream analyses. To address these issues, we present COATi, a statistical, codon-aware pairwise aligner that supports complex insertion-deletion models and can handle artifacts present in genomic data. COATi allows users to reduce the amount of discarded data while generating more accurate sequence alignments. COATi can infer indels both within and between codons, leading to improved sequence alignments. We applied COATi to a dataset containing orthologous protein-coding sequences from humans and gorillas and conclude that 41% of indels occurred between codons, agreeing with previous work in other species. We also applied COATi to semiempirical benchmark alignments and find that it outperforms several popular alignment programs on several measures of alignment quality and accuracy.
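The distinction between indels that fall between codons and indels that fall within codons can be made concrete with a small sketch. It does not reproduce COATi's statistical model; it only classifies gaps in an already finished pairwise alignment by codon phase. The function names and toy sequences are hypothetical, and the reference sequence is assumed to start in frame at a codon boundary.

```python
def gap_runs(aligned: str):
    """Yield (start_column, length) for each run of '-' in an aligned sequence."""
    i = 0
    while i < len(aligned):
        if aligned[i] == "-":
            j = i
            while j < len(aligned) and aligned[j] == "-":
                j += 1
            yield i, j - i
            i = j
        else:
            i += 1


def indel_phases(ref: str, query: str):
    """Return (kind, column, length, phase) for each indel in a pairwise alignment.

    Phase is the number of reference bases before the indel, modulo 3.
    Phase 0 means the indel starts between codons; phase 1 or 2 means it
    starts inside a codon. Assumes ref begins at a codon start.
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")
    events = []
    # Gaps in the query are deletions; gaps in the reference are insertions.
    for kind, gapped in (("deletion", query), ("insertion", ref)):
        for start, length in gap_runs(gapped):
            ref_bases_before = sum(1 for c in ref[:start] if c != "-")
            events.append((kind, start, length, ref_bases_before % 3))
    return sorted(events, key=lambda e: e[1])


# Toy alignment: a 3-base deletion starting after the 4th reference base,
# so it begins inside the second codon (phase 1).
events = indel_phases("ATGCCCAAATTT", "ATGC---AATTT")
```

An aligner that only allows gaps between codons would be forced to shift such an event to phase 0, which is the mismatch the abstract describes.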
Affiliation(s)
- Juan José García Mesa
  - The Biodesign Institute, Arizona State University, Tempe, AZ, USA
  - Ira A. Fulton Schools of Engineering, Arizona State University, Tempe, AZ, USA
- Ziqi Zhu
  - The Biodesign Institute, Arizona State University, Tempe, AZ, USA
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Reed A Cartwright
  - The Biodesign Institute, Arizona State University, Tempe, AZ, USA
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
|
17
|
Li B, Ming D. GATSol, an enhanced predictor of protein solubility through the synergy of 3D structure graph and large language modeling. BMC Bioinformatics 2024; 25:204. [PMID: 38824535 PMCID: PMC11549816 DOI: 10.1186/s12859-024-05820-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 05/29/2024] [Indexed: 06/03/2024] Open
Abstract
BACKGROUND Protein solubility is a critically important physicochemical property closely related to protein expression. For example, it is one of the main factors to be considered in the design and production of antibody drugs and a prerequisite for realizing various protein functions. Although several solubility prediction models have emerged in recent years, many of these models are limited to capturing information embedded in one-dimensional amino acid sequences, resulting in unsatisfactory predictive performance. RESULTS In this study, we introduce a novel Graph Attention network-based protein Solubility model, GATSol, which represents the 3D structure of proteins as a protein graph. In addition to the node features of amino acids extracted by a state-of-the-art protein large language model, GATSol utilizes amino acid distance maps generated using the latest AlphaFold technology. Rigorous testing on the independent eSOL and Saccharomyces cerevisiae test datasets has shown that GATSol outperforms most recently introduced models, especially with respect to the coefficient of determination R², which reaches 0.517 and 0.424, respectively. It outperforms the current state-of-the-art GraphSol by 18.4% on the S. cerevisiae test set. CONCLUSIONS GATSol captures 3D structural features of proteins by building protein graphs, which significantly improves the accuracy of protein solubility prediction. Recent advances in protein structure modeling allow our method to incorporate spatial structure features extracted from predicted structures into the model by relying only on the input of protein sequences, which simplifies the entire graph neural network prediction process, making it more user-friendly and efficient. As a result, GATSol may help prioritize highly soluble proteins, ultimately reducing the cost and effort of experimental work. The source code and data of the GATSol model are freely available at https://github.com/binbinbinv/GATSol.
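Two ideas in this abstract lend themselves to a short sketch: turning a residue distance map into graph edges, and scoring predictions with the coefficient of determination. The code below is an illustration under stated assumptions, not GATSol's implementation: the function names, the toy coordinates, and the 8.0 distance cutoff are all hypothetical.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between residue coordinates."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_edges(dmap, cutoff):
    """Undirected edges (i, j), i < j, for residue pairs within the cutoff."""
    n = len(dmap)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if dmap[i][j] <= cutoff]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy coordinates for four residues (made up); the cutoff is an assumed value.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
edges = contact_edges(distance_map(coords), cutoff=8.0)
```

With these toy inputs the first three residues form a fully connected triangle and the distant fourth residue has no edges. A model that always predicts the mean of the targets scores an R² of 0, which gives a baseline for reading the reported values.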
Affiliation(s)
- Bin Li
  - College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 South Puzhu Road, Jiangbei New District, Nanjing, 211816, Jiangsu, People's Republic of China
- Dengming Ming
  - College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 South Puzhu Road, Jiangbei New District, Nanjing, 211816, Jiangsu, People's Republic of China
|
18
|
Yu R, Huang Z, Lam TYC, Sun Y. Utilizing profile hidden Markov model databases for discovering viruses from metagenomic data: a comprehensive review. Brief Bioinform 2024; 25:bbae292. [PMID: 39003531 PMCID: PMC11246558 DOI: 10.1093/bib/bbae292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 05/19/2024] [Accepted: 06/04/2024] [Indexed: 07/15/2024] Open
Abstract
Profile hidden Markov models (pHMMs) are able to achieve high sensitivity in remote homology search, making them popular choices for detecting novel or highly diverged viruses in metagenomic data. However, many existing pHMM databases have different design focuses, making it difficult for users to decide the proper one to use. In this review, we provide a thorough evaluation and comparison for multiple commonly used profile HMM databases for viral sequence discovery in metagenomic data. We characterized the databases by comparing their sizes, their taxonomic coverage, and the properties of their models using quantitative metrics. Subsequently, we assessed their performance in virus identification across multiple application scenarios, utilizing both simulated and real metagenomic data. We aim to offer researchers a thorough and critical assessment of the strengths and limitations of different databases. Furthermore, based on the experimental results obtained from the simulated and real metagenomic data, we provided practical suggestions for users to optimize their use of pHMM databases, thus enhancing the quality and reliability of their findings in the field of viral metagenomics.
Affiliation(s)
- Runzhou Yu
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China
- Ziyi Huang
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China
- Theo Y C Lam
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China
- Yanni Sun
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China
|
19
|
Torcello-Requena A, Murphy ARJ, Lidbury IDEA, Pitt FD, Stark R, Millard AD, Puxty RJ, Chen Y, Scanlan DJ. A distinct, high-affinity, alkaline phosphatase facilitates occupation of P-depleted environments by marine picocyanobacteria. Proc Natl Acad Sci U S A 2024; 121:e2312892121. [PMID: 38713622 PMCID: PMC11098088 DOI: 10.1073/pnas.2312892121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 04/06/2024] [Indexed: 05/09/2024] Open
Abstract
Marine picocyanobacteria of the genera Prochlorococcus and Synechococcus, the two most abundant phototrophs on Earth, thrive in oligotrophic oceanic regions. While it is well known that specific lineages are exquisitely adapted to prevailing in situ light and temperature regimes, much less is known of the molecular machinery required to facilitate occupancy of these low-nutrient environments. Here, we describe a hitherto unknown alkaline phosphatase, Psip1, that has a substantially higher affinity for phosphomonoesters than other well-known phosphatases like PhoA, PhoX, or PhoD and is restricted to clade III Synechococcus and a subset of high light I-adapted Prochlorococcus strains, suggesting niche specificity. We demonstrate that Psip1 has undergone convergent evolution with PhoX, requiring both iron and calcium for activity and likely possessing identical key residues around the active site, despite generally very low sequence homology. Interrogation of metagenomes and transcriptomes from TARA oceans and an Atlantic Meridional transect shows that psip1 is abundant and highly expressed in picocyanobacterial populations from the Mediterranean Sea and north Atlantic gyre, regions well recognized to be phosphorus (P)-deplete. Together, this identifies psip1 as an important oligotrophy-specific gene for P recycling in these organisms. Furthermore, psip1 is not restricted to picocyanobacteria and is abundant and highly transcribed in some α-proteobacteria and eukaryotic algae, suggesting that such a high-affinity phosphatase is important across the microbial taxonomic world to occupy low-P environments.
Affiliation(s)
- Andrew R. J. Murphy
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
- Ian D. E. A. Lidbury
  - Molecular Microbiology: Biochemistry to Disease, School of Biosciences, University of Sheffield, Sheffield S10 2TN, United Kingdom
- Frances D. Pitt
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
- Richard Stark
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
- Andrew D. Millard
  - Centre for Phage Research, Department of Genetics and Genome Biology, University of Leicester, Leicester LE1 7RH, United Kingdom
- Richard J. Puxty
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
- Yin Chen
  - School of Biosciences, University of Birmingham, Birmingham B15 2TT, United Kingdom
- David J. Scanlan
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
|
20
|
García Sánchez N, Ugarte Carro E, Prieto-Santamaría L, Rodríguez-González A. Protein sequence analysis in the context of drug repurposing. BMC Med Inform Decis Mak 2024; 24:122. [PMID: 38741115 DOI: 10.1186/s12911-024-02531-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 05/08/2024] [Indexed: 05/16/2024] Open
Abstract
MOTIVATION Drug repurposing speeds up the development of new treatments, being less costly, risky, and time consuming than de novo drug discovery. There are numerous biological elements that contribute to the development of diseases and, as a result, to the repurposing of drugs. METHODS In this article, we analysed the potential role of protein sequences in drug repurposing scenarios. For this purpose, we embedded the protein sequences using four state-of-the-art methods and validated their capacity to encapsulate essential biological information through visualization. Then, we compared the differences in sequence distance between protein-drug target pairs in drug repurposing and non-repurposing data. Thus, we were able to uncover patterns that define protein sequences in repurposing cases. RESULTS We found statistically significant sequence distance differences between protein pairs in the repurposing data and the remaining protein pairs in the non-repurposing data. In this manner, we verified the potential of using numerical representations of sequences to generate repurposing hypotheses in the future.
Affiliation(s)
- Natalia García Sánchez
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, 28223, Spain
- Esther Ugarte Carro
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, 28223, Spain
- Lucía Prieto-Santamaría
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, 28223, Spain
  - ETS de Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, Madrid, 28660, Spain
- Alejandro Rodríguez-González
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, 28223, Spain
  - ETS de Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, Madrid, 28660, Spain
|
21
|
Devi R, Goyal P, Verma B, Hussain S, Chowdhary F, Arora P, Gupta S. A transcriptome-wide identification of ATP-binding cassette (ABC) transporters revealed participation of ABCB subfamily in abiotic stress management of Glycyrrhiza glabra L. BMC Genomics 2024; 25:315. [PMID: 38532362 DOI: 10.1186/s12864-024-10227-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 03/15/2024] [Indexed: 03/28/2024] Open
Abstract
A transcriptome-wide survey revealed a total of 181 ABC transporters in G. glabra, which were phylogenetically classified into six subfamilies. Protein-protein interactions revealed nine putative GgABCBs (-B6, -B14, -B15, -B25, -B26, -B31, -B40, -B42 and -B44) corresponding to five AtABC orthologs (-B1, -B4, -B11, -B19 and -B21). Significant transcript accumulation of ABCB6 (31.8-fold), -B14 (147.5-fold), -B15 (17-fold), -B25 (19.7-fold), -B26 (18.31-fold), -B31 (61.89-fold), -B40 (1273-fold) and -B42 (51-fold) was observed under the influence of auxin. The auxin transport-specific inhibitor N-1-naphthylphthalamic acid was effective only at the higher concentration (10 µM), where it downregulated the expression of ABCBs, PINs (PIN FORMED) and TWD1 (TWISTED DWARF 1) genes in shoot tissues, while their expression was enhanced in root tissues. Further, qRT-PCR analysis under various growth conditions (in vitro, field and growth chamber) and under abiotic stresses revealed differential expression, implicating ABCBs in stress management. Seven of the nine genes were shown to be involved in the stress physiology of the plant. GgABCB6, 15, 25 and ABCB31 were induced under multiple stresses, while GgABCB26, 40 and 42 were triggered exclusively under drought stress. No study of the ABC transporters from G. glabra has been available to date. The present investigation gives insight into auxin transport, which has been found to be associated with plant growth architecture; this knowledge will help to understand the association between auxin transport and plant responses under various conditions.
Collapse
Affiliation(s)
- Ritu Devi
- Plant Biotechnology Division, Jammu, India
- CSIR-Indian Institute of Integrative Medicine, Canal Road, Jammu, 180001, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
| | - Pooja Goyal
- Plant Biotechnology Division, Jammu, India
- CSIR-Indian Institute of Integrative Medicine, Canal Road, Jammu, 180001, India
- Registered from Guru Nanak Dev University, Amritsar, India
| | - Bhawna Verma
- Plant Biotechnology Division, Jammu, India
- CSIR-Indian Institute of Integrative Medicine, Canal Road, Jammu, 180001, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
| | - Shahnawaz Hussain
- Plant Biotechnology Division, Jammu, India
- CSIR-Indian Institute of Integrative Medicine, Canal Road, Jammu, 180001, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
| | - Fariha Chowdhary
- Plant Biotechnology Division, Jammu, India
- CSIR-Indian Institute of Integrative Medicine, Canal Road, Jammu, 180001, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
| | - Palak Arora
- Plant Biotechnology Division, Jammu, India
- CSIR-Indian Institute of Integrative Medicine, Canal Road, Jammu, 180001, India
| | - Suphla Gupta
- Plant Biotechnology Division, Jammu, India.
- CSIR-Indian Institute of Integrative Medicine, Canal Road, Jammu, 180001, India.
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India.
| |
|
22
|
Benedetti F, Mongodin EF, Badger JH, Munawwar A, Cellini A, Yuan W, Silvestri G, Kraus CN, Marini S, Rathinam CV, Salemi M, Tettelin H, Gallo RC, Zella D. Bacterial DnaK reduces the activity of anti-cancer drugs cisplatin and 5FU. J Transl Med 2024; 22:269. [PMID: 38475767 PMCID: PMC10935962 DOI: 10.1186/s12967-024-05078-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 03/07/2024] [Indexed: 03/14/2024] Open
Abstract
BACKGROUND Chemotherapy is a primary treatment for cancer, but its efficacy is often limited by cancer-associated bacteria (CAB) that impair tumor suppressor functions. Our previous research found that Mycoplasma fermentans DnaK, a chaperone protein, impairs p53 activities, which are essential for most anti-cancer chemotherapeutic responses. METHODS To investigate the role of DnaK in chemotherapy, we treated cancer cell lines with M. fermentans DnaK and then with commonly used p53-dependent anti-cancer drugs (cisplatin and 5FU). We evaluated the cells' survival in the presence or absence of a DnaK-binding peptide (ARV-1502). We also validated our findings using primary tumor cells from a novel DnaK knock-in mouse model. To provide a broader context for the clinical significance of these findings, we investigated human primary cancer sequencing datasets from The Cancer Genome Atlas (TCGA). We identified F. nucleatum as a CAB carrying DnaK with an amino acid composition highly similar to M. fermentans DnaK. Therefore, we investigated the effect of F. nucleatum DnaK on the anti-cancer activity of cisplatin and 5FU. RESULTS Our results show that both M. fermentans and F. nucleatum DnaKs reduce the effectiveness of cisplatin and 5FU. However, the use of ARV-1502 effectively restored the drugs' anti-cancer efficacy. CONCLUSIONS Our findings offer a practical framework for designing and implementing novel personalized anti-cancer strategies by targeting specific bacterial DnaKs in patients with poor response to chemotherapy, underscoring the potential for microbiome-based personalized cancer therapies.
Affiliation(s)
- Francesca Benedetti
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Emmanuel F Mongodin
- Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
| | - Jonathan H Badger
- Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Bethesda, MD, USA
| | - Arshi Munawwar
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Ashley Cellini
- Pathology Biorepository Shared Service, University of Maryland Greenebaum Comprehensive Cancer Center, Baltimore, MD, 21201, USA
| | - Weirong Yuan
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Giovannino Silvestri
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | | | - Simone Marini
- Emerging Pathogens Institute, University of Florida, Gainesville, FL, USA
- Department of Epidemiology, University of Florida, Gainesville, FL, USA
| | - Chozha V Rathinam
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Marco Salemi
- Emerging Pathogens Institute, University of Florida, Gainesville, FL, USA
- Department of Pathology, University of Florida, Gainesville, FL, USA
| | - Hervé Tettelin
- Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
| | - Robert C Gallo
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA.
| | - Davide Zella
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, MD, USA.
| |
|
23
|
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS OMEGA 2024; 9:9921-9945. [PMID: 38463314 PMCID: PMC10918679 DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being probed with increasing frequency. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising on the best experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements like the design of novel biological components, optimal experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of ANNs (Artificial Neural Networks). I have divided this review into three sections. In the first section, I describe the predictive potential and basics of ML along with myriad applications in synthetic biology, especially in engineering cells, activity of proteins, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe the challenges hindering progress in ML/DL and synthetic biology, along with their solutions.
Affiliation(s)
- Manoj Kumar Goshisht
- Department of Chemistry, Natural and Applied Sciences, University of Wisconsin-Green Bay, Green Bay, Wisconsin 54311-7001, United States
| |
|
24
|
Kou X, Zhao Z, Xu X, Li C, Wu J, Zhang S. Identification and expression analysis of ATP-binding cassette (ABC) transporters revealed its role in regulating stress response in pear (Pyrus bretchneideri). BMC Genomics 2024; 25:169. [PMID: 38347517 PMCID: PMC10863237 DOI: 10.1186/s12864-024-10063-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Accepted: 01/29/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND ATP-binding cassette (ABC) transporter proteins constitute a plant gene superfamily crucial for growth, development, and responses to environmental stresses. Despite their identification in various plants such as maize, rice, and Arabidopsis, little is known about ABC transporters in pear. To investigate the functions of ABC transporters in pear development and abiotic stress response, we conducted an extensive analysis of the ABC gene family in the pear genome. RESULTS In this study, 177 ABC transporter genes were successfully identified in the pear genome and classified into seven subfamilies: 8 ABCAs, 40 ABCBs, 24 ABCCs, 8 ABCDs, 9 ABCEs, 8 ABCFs, and 80 ABCGs. Ten motifs were common among all ABC transporter proteins, while distinct motif structures were observed for each subfamily. Distribution analysis revealed 85 PbrABC transporter genes across 17 chromosomes, driven primarily by WGD and dispersed duplication. Cis-regulatory element analysis of PbrABC promoters indicated associations with phytohormones and stress responses. Tissue-specific expression profiles demonstrated varied expression levels across tissues, suggesting diverse functions in development. Furthermore, several PbrABC genes responded to abiotic stresses, with 82 genes sensitive to salt stress, including 40 upregulated and 23 downregulated genes. Additionally, 91 genes were responsive to drought stress, with 22 upregulated and 36 downregulated genes. These findings highlight the pivotal role of PbrABC genes in abiotic stress responses. CONCLUSION This study provides evolutionary insights into PbrABC transporter genes, establishing a foundation for future research on their functions in pear. The identified motifs, distribution patterns, and stress-responsive expressions contribute to understanding the regulatory mechanisms of ABC transporters in pear. The observed tissue-specific expression profiles suggest diverse roles in developmental processes. Notably, the significant responses to salt and drought stress emphasize the importance of PbrABC genes in mediating adaptive responses. Overall, our study advances the understanding of PbrABC transporter genes in pear, opening avenues for further investigations in plant molecular biology and stress physiology.
Affiliation(s)
- Xiaobing Kou
- School of Life Sciences, Nantong University, Nantong, 226019, Jiangsu, People's Republic of China.
| | - Zhen Zhao
- Centre of Pear Engineering Technology Research, State Key Laboratory of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
| | - Xinqi Xu
- School of Life Sciences, Nantong University, Nantong, 226019, Jiangsu, People's Republic of China
| | - Chang Li
- School of Life Sciences, Nantong University, Nantong, 226019, Jiangsu, People's Republic of China
| | - Juyou Wu
- Centre of Pear Engineering Technology Research, State Key Laboratory of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
| | - Shaoling Zhang
- Centre of Pear Engineering Technology Research, State Key Laboratory of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China.
| |
|
25
|
Ramazi S, Tabatabaei SAH, Khalili E, Nia AG, Motarjem K. Analysis and review of techniques and tools based on machine learning and deep learning for prediction of lysine malonylation sites in protein sequences. Database (Oxford) 2024; 2024:baad094. [PMID: 38245002 PMCID: PMC10799748 DOI: 10.1093/database/baad094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 11/30/2023] [Accepted: 12/20/2023] [Indexed: 01/22/2024]
Abstract
Post-translational modifications are crucial molecular mechanisms that regulate diverse cellular processes. Malonylation of proteins, a reversible post-translational modification of lysine (K) residues, is linked to a variety of biological functions, such as cellular regulation and pathogenesis. This modification plays a crucial role in metabolic pathways, mitochondrial functions, fatty acid oxidation and other life processes. Accurately identifying malonylation sites is crucial to understanding the molecular mechanism of malonylation, but experimental identification can be challenging and costly. Recently, approaches based on machine learning (ML) have been suggested to address this issue. It has been demonstrated that these procedures improve accuracy while lowering costs and time constraints. However, these approaches also have specific shortcomings, including inappropriate feature extraction from protein sequences, high-dimensional features and inefficient underlying classifiers. As a result, there is an urgent need for effective predictors and calculation methods. In this study, we provide a comprehensive analysis and review of existing prediction models, tools and benchmark datasets for predicting malonylation sites in protein sequences, followed by a comparison study. The review consists of the specifications of benchmark datasets, an explanation of features and encoding methods, descriptions of the prediction approaches and their embedded ML or deep learning models, and a description and comparison of the existing tools in this domain. To evaluate and compare the prediction capability of the tools, a new set of data has been extracted from the most recently updated database, and the tools have been assessed on the extracted data. Finally, a hybrid architecture consisting of several classifiers, including classical ML models and a deep learning model, has been proposed to ensemble the prediction results. This approach demonstrates better performance than all prediction tools included in this study (the source codes of the models presented in this manuscript are available in https://github.com/Malonylation). Database URL: https://github.com/A-Golshan/Malonylation.
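As an illustration only: the abstract does not say how the classifier outputs are combined, so the sketch below assumes soft voting (averaging predicted probabilities), which is one common ensembling rule. The function name, the example scores and the 0.5 threshold are hypothetical and are not taken from the paper or its repository.

```python
def soft_vote(site_probabilities, threshold=0.5):
    """Average per-model probabilities for one lysine site and apply a cutoff.

    site_probabilities: scores in [0, 1], one per model (hypothetical values).
    Returns (mean_probability, predicted_malonylated).
    """
    if not site_probabilities:
        raise ValueError("need at least one model score")
    mean_probability = sum(site_probabilities) / len(site_probabilities)
    return mean_probability, mean_probability >= threshold


# Made-up scores from three hypothetical models for two candidate sites.
print(soft_vote([0.75, 0.5, 0.25]))
print(soft_vote([0.75, 0.25, 0.25]))
```

Averaging probabilities lets a confident model outweigh two weakly dissenting ones, which a plain majority vote over hard labels would not allow.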
Affiliation(s)
| | - Seyed Amir Hossein Tabatabaei
- Department of Computer Science, Faculty of Mathematical Sciences, University of Guilan, Namjoo St. Postal, Rasht 41938-33697, Iran
- Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Jalal AleAhmad, Tehran 14117-13116, Iran
| | - Elham Khalili
- Department of Plant Sciences, Faculty of Science, Tarbiat Modares University, Jalal AleAhmad, Tehran 14117-13116, Iran
| | - Amirhossein Golshan Nia
- Department of Mathematics and Computer Science, Amirkabir University of Technology, No. 350, Hafez Ave, Tehran 15916-34311, Iran
| | - Kiomars Motarjem
- Department of Statistics, Faculty of Mathematical Sciences, Tarbiat Modares University, Jalal AleAhmad, Tehran 14117-13116, Iran
| |
|
26
|
Augustijn HE, Roseboom AM, Medema MH, van Wezel GP. Harnessing regulatory networks in Actinobacteria for natural product discovery. J Ind Microbiol Biotechnol 2024; 51:kuae011. [PMID: 38569653 PMCID: PMC10996143 DOI: 10.1093/jimb/kuae011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 04/02/2024] [Indexed: 04/05/2024]
Abstract
Microbes typically live in complex habitats where they need to rapidly adapt to continuously changing growth conditions. To do so, they produce an astonishing array of natural products with diverse structures and functions. Actinobacteria stand out for their prolific production of bioactive molecules, including antibiotics, anticancer agents, antifungals, and immunosuppressants. Attention has been directed especially towards the identification of the compounds they produce and the mining of the large diversity of biosynthetic gene clusters (BGCs) in their genomes. However, the current return on investment in random screening for bioactive compounds is low, while it is hard to predict which of the millions of BGCs should be prioritized. Moreover, many of the BGCs for yet undiscovered natural products are silent or cryptic under laboratory growth conditions. To identify ways to prioritize and activate these BGCs, knowledge regarding the way their expression is controlled is crucial. Intricate regulatory networks control global gene expression in Actinobacteria, governed by a staggering number of up to 1000 transcription factors per strain. This review highlights recent advances in experimental and computational methods for characterizing and predicting transcription factor binding sites and their applications to guide natural product discovery. We propose that regulation-guided genome mining approaches will open new avenues toward eliciting the expression of BGCs, as well as prioritizing subsets of BGCs for expression using synthetic biology approaches. ONE-SENTENCE SUMMARY This review provides insights into advances in experimental and computational methods aimed at predicting transcription factor binding sites and their applications to guide natural product discovery.
Affiliation(s)
- Hannah E Augustijn
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Anna M Roseboom
- Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Gilles P van Wezel
- Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Netherlands Institute for Ecology (NIOO-KNAW), Wageningen, The Netherlands
| |
|
27
|
Singh S, Sahani H. Current Advancement and Future Prospects: Biomedical Nanoengineering. Curr Radiopharm 2024; 17:120-137. [PMID: 38058099 DOI: 10.2174/0118744710274376231123063135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 10/19/2023] [Accepted: 10/27/2023] [Indexed: 12/08/2023]
Abstract
Recent advancements in biomedicine have seen a significant reliance on nanoengineering, as traditional methods often fall short in harnessing the unique attributes of biomaterials. Nanoengineering has emerged as a valuable approach to enhance and enrich the performance and functionalities of biomaterials, driving research and development in the field. This review emphasizes the most prevalent biomaterials used in biomedicine, including polymers, nanocomposites, and metallic materials, and explores the pivotal role of nanoengineering in developing biomedical treatments and processes. Particularly, the review highlights research focused on gaining an in-depth understanding of material properties and effectively enhancing material performance through molecular dynamics simulations, all from a nanoengineering perspective.
Affiliation(s)
- Sonia Singh
- Institute of Pharmaceutical Research, GLA University, 17 km Stone, NH-2, Mathura-Delhi Road Mathura, Chaumuhan, Uttar Pradesh, 281406, India
| | - Hrishika Sahani
- Lifecell International Pvt. Ltd., NSP Office, Pearls Business Park, 8th Floor Office No-804, Netaji Subhash Palace Delhi, 110034, India
| |
|
28
|
Reyes-Umana V, Ewens SD, Meier DAO, Coates JD. Integration of molecular and computational approaches paints a holistic portrait of obscure metabolisms. mBio 2023; 14:e0043123. [PMID: 37855625 PMCID: PMC10746228 DOI: 10.1128/mbio.00431-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2023] Open
Abstract
Microorganisms are essential drivers of earth's geochemical cycles. However, the significance of elemental redox cycling mediated by microorganisms is often underestimated beyond the most well-studied nutrient cycles. Phosphite, (per)chlorate, and iodate are each considered esoteric substrates metabolized by microorganisms. However, recent investigations have indicated that these metabolisms are widespread and ubiquitous, affirming a need to continue studying the underlying microbiology to understand their biogeochemical effects and their interface with each other and our biosphere. This review focuses on combining canonical techniques of culturing microorganisms with modern omic approaches to further our understanding of obscure metabolic pathways and elucidate their importance in global biogeochemical cycles. Using these approaches, marker genes of interest have already been identified for phosphite, (per)chlorate, and iodate using traditional microbial physiology and genetics. Subsequently, their presence was queried to reveal the distribution of metabolic pathways in the environment using publicly available databases. In conjunction with each other, computational and experimental techniques provide a more comprehensive understanding of the location of these microorganisms, their underlying biochemistry and genetics, and how they tie into our planet's geochemical cycles.
Affiliation(s)
- Victor Reyes-Umana
- Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
| | - Sophia D. Ewens
- Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
| | - David A. O. Meier
- Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
| | - John D. Coates
- Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
| |
|
29
|
Dean EA, Kimmel GJ, Frank MJ, Bukhari A, Hossain NM, Jain MD, Dahiya S, Miklos DB, Altrock PM, Locke FL. Circulating tumor DNA adds specificity to PET after axicabtagene ciloleucel in large B-cell lymphoma. Blood Adv 2023; 7:4608-4618. [PMID: 37126659 PMCID: PMC10448428 DOI: 10.1182/bloodadvances.2022009426] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/28/2022] [Revised: 03/07/2023] [Accepted: 03/30/2023] [Indexed: 05/03/2023] Open
Abstract
We examined the meaning of metabolically active lesions on 1-month restaging nuclear imaging of patients with relapsed/refractory large B-cell lymphoma receiving axicabtagene ciloleucel (axi-cel) by assessing the relationship between total metabolic tumor volume (MTV) on positron emission tomography (PET) scans and circulating tumor DNA (ctDNA) in the plasma. In this prospective multicenter sample collection study, MTV was retrospectively calculated via commercial software at baseline, 1, and 3 months after chimeric antigen receptor (CAR) T-cell therapy; ctDNA was available before and after axi-cel administration. Spearman correlation coefficient (rs) was used to study the relationship between the variables, and a mathematical model was constructed to describe tumor dynamics 1 month after CAR T-cell therapy. The median time between baseline scan and axi-cel infusion was 33 days (range, 1-137 days) for all 57 patients. For 41 of the patients with imaging within 33 days of axi-cel or imaging before that time but no bridging therapy, the correlation at baseline became stronger (rs, 0.61; P < .0001) compared with all patients (rs, 0.38; P = .004). Excluding patients in complete remission with no measurable residual disease, ctDNA and MTV at 1 month did not correlate (rs, 0.28; P = .11) but correlated at 3 months (rs, 0.79; P = .0007). Modeling of tumor dynamics, which incorporated ctDNA and inflammation as part of MTV, recapitulated the outcomes of patients with positive radiologic 1-month scans. Our results suggested that nonprogressing hypermetabolic lesions on 1-month PET represent ongoing treatment responses, and their composition may be elucidated by concurrently examining the ctDNA.
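For readers unfamiliar with the statistic, the Spearman coefficient (rs) reported above is the Pearson correlation computed on ranks rather than raw values. The following is a minimal sketch under that definition; the function names and the paired values are invented for illustration and are not study data or the authors' code.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rs(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented paired measurements (e.g. tumor volume vs. ctDNA level).
volume = [12.0, 40.0, 7.5, 88.0, 25.0]
ctdna = [3.1, 9.8, 4.0, 30.2, 6.6]
print(spearman_rs(volume, ctdna))
```

Because only ranks enter the calculation, rs is insensitive to monotonic rescaling of either variable, which suits quantities such as tumor volume and ctDNA that span orders of magnitude.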
Affiliation(s)
- Erin A. Dean
- Department of Blood and Marrow Transplant and Cellular Immunotherapy, H. Lee Moffitt Cancer and Research Institute, Tampa, FL
- Division of Hematology and Oncology, Department of Medicine, University of Florida, Gainesville, FL
| | - Gregory J. Kimmel
- Department of Integrated Mathematical Oncology, Moffitt Research Institute, Tampa, FL
| | - Matthew J. Frank
- Division of Blood and Stem Cell Transplantation, Department of Medicine, Stanford University, Stanford, CA
| | - Ali Bukhari
- Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD
- Division of Hematology and Oncology, Department of Internal Medicine, Wright-Patterson Medical Center, Wright-Patterson Air Force Base, OH
| | - Nasheed M. Hossain
- Cell Therapy and Transplant Program, Division of Hematology/Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
| | - Michael D. Jain
- Department of Blood and Marrow Transplant and Cellular Immunotherapy, H. Lee Moffitt Cancer and Research Institute, Tampa, FL
| | - Saurabh Dahiya
- Division of Blood and Stem Cell Transplantation, Department of Medicine, Stanford University, Stanford, CA
- Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD
| | - David B. Miklos
- Division of Blood and Stem Cell Transplantation, Department of Medicine, Stanford University, Stanford, CA
| | - Philipp M. Altrock
- Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Ploen, Germany
| | - Frederick L. Locke
- Department of Blood and Marrow Transplant and Cellular Immunotherapy, H. Lee Moffitt Cancer and Research Institute, Tampa, FL
| |
|
30
|
Laufer VA, Glover TW, Wilson TE. Applications of advanced technologies for detecting genomic structural variation. MUTATION RESEARCH. REVIEWS IN MUTATION RESEARCH 2023; 792:108475. [PMID: 37931775 PMCID: PMC10792551 DOI: 10.1016/j.mrrev.2023.108475] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/26/2023] [Revised: 09/07/2023] [Accepted: 11/02/2023] [Indexed: 11/08/2023]
Abstract
Chromosomal structural variation (SV) encompasses a heterogenous class of genetic variants that exerts strong influences on human health and disease. Despite their importance, many structural variants (SVs) have remained poorly characterized at even a basic level, a discrepancy predicated upon the technical limitations of prior genomic assays. However, recent advances in genomic technology can identify and localize SVs accurately, opening new questions regarding SV risk factors and their impacts in humans. Here, we first define and classify human SVs and their generative mechanisms, highlighting characteristics leveraged by various SV assays. We next examine the first-ever gapless assembly of the human genome and the technical process of assembling it, which required third-generation sequencing technologies to resolve structurally complex loci. The new portions of that "telomere-to-telomere" and subsequent pangenome assemblies highlight aspects of SV biology likely to develop in the near-term. We consider the strengths and limitations of the most promising new SV technologies and when they or longstanding approaches are best suited to meeting salient goals in the study of human SV in population-scale genomics research, clinical, and public health contexts. It is a watershed time in our understanding of human SV when new approaches are expected to fundamentally change genomic applications.
Affiliation(s)
- Vincent A Laufer
- Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
| | - Thomas W Glover
- Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
| | - Thomas E Wilson
- Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
| |
|
31
|
Saad KR, Kumar G, Puthusseri B, Srinivasa SM, Giridhar P, Shetty NP. Genome-wide identification of MATE, functional analysis and molecular dynamics of DcMATE21 involved in anthocyanin accumulation in Daucus carota. PHYTOCHEMISTRY 2023; 210:113676. [PMID: 37059287 DOI: 10.1016/j.phytochem.2023.113676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Revised: 04/09/2023] [Accepted: 04/11/2023] [Indexed: 06/19/2023]
Abstract
Anthocyanins are a subclass of flavonoids that are synthesized in the endoplasmic reticulum and then transported to the vacuole in plants. Multidrug and toxic compound extrusion (MATE) transporters are a family of membrane transporters that transport ions and secondary metabolites, such as anthocyanins, in plants. Although various studies on MATE transporters have been carried out in different plant species, this is the first comprehensive report to mine the Daucus carota genome to identify the MATE gene family. Our study identified 45 DcMATEs through genome-wide analysis and detected five segmental and six tandem duplications in the genome. The chromosome distribution, phylogenetic analysis, and cis-regulatory elements revealed the structural diversity and numerous functions associated with the DcMATEs. In addition, we analyzed RNA-seq data obtained from the European Nucleotide Archive to screen for the expression of DcMATEs involved in anthocyanin biosynthesis. Among the identified DcMATEs, DcMATE21 correlated with anthocyanin content in the different D. carota varieties. In addition, the expression of DcMATE21 and anthocyanin biosynthesis genes was correlated under abscisic acid, methyl jasmonate, sodium nitroprusside, salicylic acid, and phenylalanine treatments, which was substantiated by anthocyanin accumulation in the in vitro cultures. Further molecular membrane dynamics of DcMATE21 with anthocyanin (cyanidin-3-glucoside) identified the binding pocket, showing extensive H-bond interactions with 10 crucial amino acids present in transmembrane helices 7, 8, and 10 of DcMATE21. The current investigation, using RNA-seq, in vitro cultures, and molecular dynamics studies, revealed the involvement of DcMATE21 in anthocyanin accumulation in in vitro cultures of D. carota.
Affiliation(s)
- Kirti R Saad
- Plant Cell Biotechnology Department, CSIR-Central Food Technological Research Institute, Mysore, 570 020, Karnataka, India.
| | - Gyanendra Kumar
- Plant Cell Biotechnology Department, CSIR-Central Food Technological Research Institute, Mysore, 570 020, Karnataka, India.
| | - Bijesh Puthusseri
- Plant Cell Biotechnology Department, CSIR-Central Food Technological Research Institute, Mysore, 570 020, Karnataka, India.
| | - Sudhanva M Srinivasa
- Faculty of Natural Sciences, Adichunchanagiri University, BG Nagara, 571448, Karnataka, India.
| | - Parvatam Giridhar
- Plant Cell Biotechnology Department, CSIR-Central Food Technological Research Institute, Mysore, 570 020, Karnataka, India.
| | - Nandini P Shetty
- Plant Cell Biotechnology Department, CSIR-Central Food Technological Research Institute, Mysore, 570 020, Karnataka, India.
|
32
|
Formaglio P, Wosniack ME, Tromer RM, Polli JG, Matos YB, Zhong H, Raposo EP, da Luz MGE, Amino R. Plasmodium sporozoite search strategy to locate hotspots of blood vessel invasion. Nat Commun 2023; 14:2965. [PMID: 37221182 DOI: 10.1038/s41467-023-38706-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Accepted: 05/10/2023] [Indexed: 05/25/2023] Open
Abstract
Plasmodium sporozoites actively migrate in the dermis and enter blood vessels to infect the liver. Despite their importance for malaria infection, little is known about these cutaneous processes. We combine intravital imaging in a rodent malaria model and statistical methods to unveil the parasite strategy to reach the bloodstream. We determine that sporozoites display a high-motility mode with a superdiffusive Lévy-like pattern known to optimize the location of scarce targets. When encountering blood vessels, sporozoites frequently switch to a subdiffusive low-motility behavior associated with probing for intravasation hotspots, marked by the presence of pericytes. Hence, sporozoites present anomalous diffusive motility, alternating between superdiffusive tissue exploration and subdiffusive local vessel exploitation, thus optimizing the sequential tasks of seeking blood vessels and pericyte-associated sites of privileged intravasation.
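One common way to tell superdiffusive from subdiffusive motion is the scaling exponent of the mean squared displacement (MSD), where MSD grows roughly as lag^alpha: alpha above 1 suggests superdiffusion and alpha below 1 suggests subdiffusion. The abstract does not say which statistics the authors used, so the sketch below is only a generic illustration; the function names, the 2D track format, and the toy trajectory are all assumptions.

```python
import math

def mean_squared_displacement(track, lag):
    """Time-averaged MSD of a 2D track (list of (x, y) tuples) at a lag in frames."""
    n_pairs = len(track) - lag
    total = 0.0
    for i in range(n_pairs):
        dx = track[i + lag][0] - track[i][0]
        dy = track[i + lag][1] - track[i][1]
        total += dx * dx + dy * dy
    return total / n_pairs

def anomalous_exponent(track, max_lag):
    """Least-squares slope of log(MSD) against log(lag).

    Assumes the MSD is strictly positive at every lag (the track must move).
    """
    lags = range(1, max_lag + 1)
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(mean_squared_displacement(track, lag)) for lag in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Hypothetical toy track: straight-line (ballistic) motion, the superdiffusive extreme.
straight_track = [(float(t), 0.0) for t in range(50)]
alpha = anomalous_exponent(straight_track, max_lag=10)
```

For the straight-line toy track the MSD at lag k is exactly k squared, so the fitted exponent comes out at about 2. Real tracks would need enough frames per lag for the estimate to be stable.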
Affiliation(s)
- Pauline Formaglio
- Institut Pasteur, Université Paris Cité, Malaria Infection and Immunity Unit, 75015, Paris, France
| | | | - Raphael M Tromer
- Departamento de Física Teórica e Experimental, Universidade Federal do Rio Grande do Norte, 59078- 970, Natal-RN, Brazil
| | - Jaderson G Polli
- Departamento de Física, Universidade Federal do Paraná, 81531-980, Curitiba-PR, Brazil
| | - Yuri B Matos
- Departamento de Física, Universidade Federal do Paraná, 81531-980, Curitiba-PR, Brazil
| | - Hang Zhong
- Institut Pasteur, Université Paris Cité, Malaria Infection and Immunity Unit, 75015, Paris, France
| | - Ernesto P Raposo
- Laboratório de Física Teórica e Computacional, Departamento de Física, Universidade Federal de Pernambuco, 50670-901, Recife-PE, Brazil
| | - Marcos G E da Luz
- Departamento de Física, Universidade Federal do Paraná, 81531-980, Curitiba-PR, Brazil.
| | - Rogerio Amino
- Institut Pasteur, Université Paris Cité, Malaria Infection and Immunity Unit, 75015, Paris, France.
|
33
|
Long F, Wu H, Li H, Zuo W, Ao Q. Genome-Wide Analysis of MYB Transcription Factors and Screening of MYBs Involved in the Red Color Formation in Rhododendron delavayi. Int J Mol Sci 2023; 24:ijms24054641. [PMID: 36902072 PMCID: PMC10037418 DOI: 10.3390/ijms24054641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/12/2022] [Revised: 02/23/2023] [Accepted: 02/24/2023] [Indexed: 03/06/2023] Open
Abstract
Flower color is one of the crucial traits of ornamental plants. Rhododendron delavayi Franch. is a famous ornamental plant species distributed in the mountain areas of Southwest China. This plant has red inflorescence and young branchlets. However, the molecular basis of the color formation of R. delavayi is unclear. In this study, 184 MYB genes were identified based on the released genome of R. delavayi. These genes included 78 1R-MYB, 101 R2R3-MYB, 4 3R-MYB, and 1 4R-MYB. The MYBs were divided into 35 subgroups using phylogenetic analysis of the MYBs of Arabidopsis thaliana. The members of the same subgroup in R. delavayi had similar conserved domains and motifs, gene structures, and promoter cis-acting elements, which indicates that their functions are relatively conserved. In addition, the transcriptomes (based on a unique molecular identifier strategy) and color differences of the spotted petals, unspotted petals, spotted throat, unspotted throat, and branchlet cortex were measured. The results showed significant differences in the expression levels of R2R3-MYB genes. Weighted co-expression network analysis between the transcriptome and chromatic aberration values of five types of red samples showed that the MYBs were the most important TFs involved in color formation, of which seven were R2R3-MYB and three were 1R-MYB. Two R2R3-MYB genes (DUH019226.1 and DUH019400.1) had the highest connectivity in the whole regulation network, and they were identified as hub genes for red color formation. These two MYB hub genes provide references for the study of transcriptional regulation of the red color formation of R. delavayi.
Affiliation(s)
- Fenfang Long
- College of Agriculture, Guizhou University, Guiyang 550025, China
| | - Hairong Wu
- College of Agriculture, Guizhou University, Guiyang 550025, China
| | - Huie Li
- College of Agriculture, Guizhou University, Guiyang 550025, China
| | - Weiwei Zuo
- College of Agriculture, Guizhou University, Guiyang 550025, China
| | - Qian Ao
- College of Agriculture, Guizhou University, Guiyang 550025, China
|
34
|
Busi SB, de Nies L, Pramateftaki P, Bourquin M, Kohler TJ, Ezzat L, Fodelianakis S, Michoud G, Peter H, Styllas M, Tolosano M, De Staercke V, Schön M, Galata V, Wilmes P, Battin T. Glacier-Fed Stream Biofilms Harbor Diverse Resistomes and Biosynthetic Gene Clusters. Microbiol Spectr 2023; 11:e0406922. [PMID: 36688698 PMCID: PMC9927545 DOI: 10.1128/spectrum.04069-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/08/2022] [Accepted: 12/22/2022] [Indexed: 01/24/2023] Open
Abstract
Antimicrobial resistance (AMR) is a universal phenomenon, the origins of which lie in natural ecological interactions such as competition within niches, within and between micro- to higher-order organisms. To study these phenomena, it is crucial to examine the origins of AMR in pristine environments, i.e., those with limited anthropogenic influence. In this context, epilithic biofilms residing in glacier-fed streams (GFSs) are an excellent model system to study diverse, intra- and inter-domain, ecological crosstalk. We assessed the resistomes of epilithic biofilms from GFSs across the Southern Alps (New Zealand) and the Caucasus (Russia) and observed that both bacteria and eukaryotes encoded twenty-nine distinct AMR categories. Of these, beta-lactam, aminoglycoside, and multidrug resistance were both abundant and taxonomically distributed in most of the bacterial and eukaryotic phyla. AMR-encoding phyla included Bacteroidota and Proteobacteria among the bacteria, alongside Ochrophyta (algae) among the eukaryotes. Additionally, biosynthetic gene clusters (BGCs) involved in the production of antibacterial compounds were identified across all phyla in the epilithic biofilms. Furthermore, we found that several bacterial genera (Flavobacterium, Polaromonas, Superphylum Patescibacteria) encode both antimicrobial resistance genes (ARGs) and BGCs within close proximity of each other, demonstrating their capacity to simultaneously influence and compete within the microbial community. Our findings help unravel how naturally occurring BGCs and AMR contribute to the epilithic biofilms' mode of life in GFSs. Additionally, we report that eukaryotes may serve as AMR reservoirs owing to their potential for encoding ARGs. Importantly, these observations may be generalizable and potentially extended to other environments that may be more or less impacted by human activity.
IMPORTANCE Antimicrobial resistance is an omnipresent phenomenon in anthropogenically influenced ecosystems. However, its role in shaping microbial community dynamics in pristine environments is relatively unknown. Using metagenomics, we report the presence of antimicrobial resistance genes and their associated pathways in epilithic biofilms within glacier-fed streams. Importantly, we observe biosynthetic gene clusters associated with antimicrobial resistance in both pro- and eukaryotes in these biofilms. Understanding the role of resistance in the context of this pristine environment and complex biodiversity may shed light on previously uncharacterized mechanisms of cross-domain interactions.
Affiliation(s)
- Susheel Bhanu Busi
- Systems Ecology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Laura de Nies
- Systems Ecology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Paraskevi Pramateftaki
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Massimo Bourquin
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Tyler J. Kohler
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Leïla Ezzat
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Stilianos Fodelianakis
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Grégoire Michoud
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Hannes Peter
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Michail Styllas
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Matteo Tolosano
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Vincent De Staercke
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Martina Schön
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Valentina Galata
- Systems Ecology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Paul Wilmes
- Systems Ecology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Tom Battin
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
|
35
|
Oliveira LS, Reyes A, Dutilh BE, Gruber A. Rational Design of Profile HMMs for Sensitive and Specific Sequence Detection with Case Studies Applied to Viruses, Bacteriophages, and Casposons. Viruses 2023; 15:519. [PMID: 36851733 PMCID: PMC9966878 DOI: 10.3390/v15020519] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/29/2022] [Revised: 02/01/2023] [Accepted: 02/09/2023] [Indexed: 02/15/2023] Open
Abstract
Profile hidden Markov models (HMMs) are a powerful way of modeling biological sequence diversity and constitute a very sensitive approach to detecting divergent sequences. Here, we report the development of protocols for the rational design of profile HMMs. These methods were implemented on TABAJARA, a program that can be used to either detect all biological sequences of a group or discriminate specific groups of sequences. By calculating position-specific information scores along a multiple sequence alignment, TABAJARA automatically identifies the most informative sequence motifs and uses them to construct profile HMMs. As a proof-of-principle, we applied TABAJARA to generate profile HMMs for the detection and classification of two viral groups presenting different evolutionary rates: bacteriophages of the Microviridae family and viruses of the Flavivirus genus. We obtained conserved models for the generic detection of any Microviridae or Flavivirus sequence, and profile HMMs that can specifically discriminate Microviridae subfamilies or Flavivirus species. In another application, we constructed Cas1 endonuclease-derived profile HMMs that can discriminate CRISPRs and casposons, two evolutionarily related transposable elements. We believe that the protocols described here, and implemented on TABAJARA, constitute a generic toolbox for generating profile HMMs for the highly sensitive and specific detection of sequence classes.
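To make the idea of position-specific information scores along a multiple sequence alignment concrete, the sketch below computes a standard per-column information content (maximum entropy minus observed Shannon entropy, in bits). This is a generic textbook measure and is not claimed to be TABAJARA's scoring scheme; the function name, the gap handling, and the toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter

def column_information(alignment, alphabet_size=4):
    """Per-column information content in bits: log2(alphabet_size) - Shannon entropy.

    Gap characters ('-') are ignored; a column made only of gaps scores 0.
    """
    scores = []
    for column in zip(*alignment):
        residues = [symbol for symbol in column if symbol != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(math.log2(alphabet_size) - entropy)
    return scores

# Hypothetical toy DNA alignment: columns 1-2 fully conserved, column 4 fully variable.
toy_alignment = ["ACGT", "ACGA", "ACTC", "ACAG"]
scores = column_information(toy_alignment)
```

In this toy case the two conserved columns reach the 2-bit maximum for a four-letter alphabet, while the fully variable last column carries no information. Windows of consistently high-scoring columns are the kind of region one would expect a motif-selection step to favor.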
Affiliation(s)
- Liliane S. Oliveira
- Department of Parasitology, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo 05508-000, SP, Brazil
| | - Alejandro Reyes
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá 111711, Colombia
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO 63108, USA
| | - Bas E. Dutilh
- Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich-Schiller-University Jena, 07743 Jena, Germany
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
| | - Arthur Gruber
- Department of Parasitology, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo 05508-000, SP, Brazil
- European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
|
36
|
Zhao M, Eadeh FR, Nguyen TN, Gupta P, Admoni H, Gonzalez C, Woolley AW. Teaching agents to understand teamwork: Evaluating and predicting collective intelligence as a latent variable via Hidden Markov Models. COMPUTERS IN HUMAN BEHAVIOR 2023. [DOI: 10.1016/j.chb.2022.107524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
37
|
Wang Z, Tang X, Liu J, Ying Z. Subtask analysis of process data through a predictive model. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2023; 76:211-235. [PMID: 36317951 DOI: 10.1111/bmsp.12290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 08/26/2022] [Indexed: 06/16/2023]
Abstract
Response process data collected from human-computer interactive items contain detailed information about respondents' behavioural patterns and cognitive processes. Such data are valuable sources for analysing respondents' problem-solving strategies. However, the irregular data format and the complex structure make standard statistical tools difficult to apply. This article develops a computationally efficient method for exploratory analysis of such process data. The new approach segments a lengthy individual process into a sequence of short subprocesses to achieve complexity reduction, easy clustering and meaningful interpretation. Each subprocess is considered a subtask. The segmentation is based on sequential action predictability using a parsimonious predictive model combined with the Shannon entropy. Simulation studies are conducted to assess the performance of the new method. We use a case study of PIAAC 2012 to demonstrate how exploratory analysis for process data can be carried out with the new approach.
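The segmentation idea can be illustrated with a small sketch: given a predicted probability distribution for each action, compute its Shannon entropy and start a new subprocess wherever the next action is hard to predict. The abstract does not give the predictive model or the cutting rule, so the fixed threshold, the function names, and the toy probabilities below are hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution; zero terms are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def segment_by_entropy(actions, predicted_probs, threshold):
    """Split an action sequence into subprocesses.

    predicted_probs[i] is the (assumed) model distribution over candidates for
    action i. A new segment starts at action i when that distribution's entropy
    exceeds the threshold, i.e. when the action was hard to predict.
    """
    if not actions:
        return []
    segments = [[actions[0]]]
    for action, probs in zip(actions[1:], predicted_probs[1:]):
        if shannon_entropy(probs) > threshold:
            segments.append([action])
        else:
            segments[-1].append(action)
    return segments

# Hypothetical process: the fourth action is unpredictable (uniform over 4 options).
actions = ["open", "type", "submit", "open", "click"]
predicted_probs = [
    [0.5, 0.5],
    [0.9, 0.1],
    [0.8, 0.2],
    [0.25, 0.25, 0.25, 0.25],
    [0.95, 0.05],
]
segments = segment_by_entropy(actions, predicted_probs, threshold=1.0)
```

With these toy inputs the process is cut once, before the second "open", giving two short subprocesses that could then be clustered or interpreted as subtasks.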
Affiliation(s)
- Zhi Wang
- Department of Statistics, Columbia University, New York City, NY, USA
| | - Xueying Tang
- Department of Mathematics, University of Arizona, Tucson, Arizona, USA
| | - Jingchen Liu
- Department of Statistics, Columbia University, New York City, NY, USA
| | - Zhiliang Ying
- Department of Statistics, Columbia University, New York City, NY, USA
|
38
|
Lee FS, Anderson AG, Olafson BD. Benchmarking TriadAb using targets from the second antibody modeling assessment. Protein Eng Des Sel 2023; 36:gzad013. [PMID: 37864287 DOI: 10.1093/protein/gzad013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/05/2023] [Revised: 08/10/2023] [Indexed: 10/22/2023] Open
Abstract
Computational modeling and design of antibodies have become an integral part of today's research and development in antibody therapeutics. Here we describe the Triad Antibody Homology Modeling (TriadAb) package, a functionality of the Triad protein design platform that predicts the structure of any heavy and light chain sequences of an antibody Fv domain using template-based modeling. To gauge the performance of TriadAb, we benchmarked it against the results of the Second Antibody Modeling Assessment (AMA-II). On average, TriadAb produced main-chain carbonyl root-mean-square deviations between models and experimentally determined structures of 1.10 Å, 1.45 Å, 1.41 Å, 3.04 Å, 1.47 Å, 1.27 Å, and 1.63 Å in the framework and the six complementarity-determining regions (H1, H2, H3, L1, L2, L3), respectively. The inaugural results are comparable to those reported in AMA-II, corroborating our internal bench-based experience that models generated using TriadAb are sufficiently accurate and useful for antibody engineering using the sequence design capabilities provided by Triad.
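For readers unfamiliar with the metric, a root-mean-square deviation over matched atoms can be computed as in the sketch below. It assumes the two coordinate sets are already superposed and listed in the same atom order; the abstract does not describe the superposition or atom-selection procedure used in the benchmark, and the function name and toy coordinates are illustrative only.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD in the input units between two equally long lists of (x, y, z) tuples.

    No superposition is performed: the structures are assumed to be pre-aligned.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical two-atom example: the model is shifted by 1 Å along z.
model_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
reference_atoms = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
deviation = rmsd(model_atoms, reference_atoms)
```

Because every atom in the toy model is displaced by exactly 1 Å, the RMSD is 1 Å. Per-region values such as those quoted for the framework and each CDR would come from restricting the atom lists to the residues of that region.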
|
39
|
Adibi P, Kalani S, Zahabi SJ, Asadi H, Bakhtiar M, Heidarpour MR, Roohafza H, Shahoon H, Amouzadeh M. Emotion recognition support system: Where physicians and psychiatrists meet linguists and data engineers. World J Psychiatry 2023; 13:1-14. [PMID: 36687372 PMCID: PMC9850871 DOI: 10.5498/wjp.v13.i1.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/19/2022] [Revised: 09/18/2022] [Accepted: 12/21/2022] [Indexed: 01/13/2023] Open
Abstract
An important factor in the course of daily medical diagnosis and treatment is the caregiving physician's understanding of patients' emotional states. However, patients usually avoid voicing their emotions when describing their somatic symptoms and complaints to their non-psychiatrist doctor. On the other hand, clinicians usually lack the required expertise (or time) and have difficulty picking up the various verbal and non-verbal emotional signals of their patients. As a result, in many cases, there is an emotion recognition barrier between the clinician and the patients, making all patients seem the same except for their different somatic symptoms. In particular, we aim to identify and combine the approaches of three major disciplines (psychology, linguistics, and data science) for detecting emotions from verbal communication and propose an integrated solution for emotion recognition support. Such a platform may give emotional guides and indices to the clinician based on verbal communication at the consultation time.
Affiliation(s)
- Peyman Adibi
- Isfahan Gastroenterology and Hepatology Research Center, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran
| | - Simindokht Kalani
- Department of Psychology, University of Isfahan, Isfahan 8174673441, Iran
| | - Sayed Jalal Zahabi
- Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 8415683111, Iran
| | - Homa Asadi
- Department of Linguistics, University of Isfahan, Isfahan 8174673441, Iran
| | - Mohsen Bakhtiar
- Department of Linguistics, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran
| | - Mohammad Reza Heidarpour
- Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 8415683111, Iran
| | - Hamidreza Roohafza
- Department of Psychocardiology, Cardiac Rehabilitation Research Center, Cardiovascular Research Institute (WHO-Collaborating Center), Isfahan University of Medical Sciences, Isfahan 8187698191, Iran
| | - Hassan Shahoon
- Isfahan Gastroenterology and Hepatology Research Center, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran
| | - Mohammad Amouzadeh
- Department of Linguistics, University of Isfahan, Isfahan 8174673441, Iran
- School of International Studies, Sun Yat-sen University, Zhuhai 519082, Guangdong Province, China
|
40
|
Abstract
This chapter outlines the myriad applications of machine learning (ML) in synthetic biology, specifically in engineering cell and protein activity, and metabolic pathways. Though by no means comprehensive, the chapter highlights several prominent computational tools applied in the field and their potential use cases. The examples detailed reinforce how ML algorithms can enhance synthetic biology research by providing data-driven insights into the behavior of living systems, even without detailed knowledge of their underlying mechanisms. By doing so, ML promises to increase the efficiency of research projects by modeling hypotheses in silico that can then be tested through experiments. While challenges related to training dataset generation and computational costs remain, ongoing improvements in ML tools are paving the way for smarter and more streamlined synthetic biology workflows that can be readily employed to address grand challenges across manufacturing, medicine, engineering, agriculture, and beyond.
Affiliation(s)
- Brendan Fu-Long Sieow
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- NUS Graduate School for Integrative Sciences and Engineering Programme, National University of Singapore, Singapore, Singapore
| | - Ryan De Sotto
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Zhi Ren Darren Seet
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - In Young Hwang
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Matthew Wook Chang
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore.
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
|
41
|
Baldrighi GN, Nova A, Bernardinelli L, Fazia T. A Pipeline for Phasing and Genotype Imputation on Mixed Human Data (Parents-Offspring Trios and Unrelated Subjects) by Reviewing Current Methods and Software. LIFE (BASEL, SWITZERLAND) 2022; 12:life12122030. [PMID: 36556394 PMCID: PMC9781110 DOI: 10.3390/life12122030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022] [Indexed: 12/09/2022]
Abstract
Genotype imputation has become an essential prerequisite when performing association analysis. It is a computational technique that allows us to infer genetic markers that have not been directly genotyped, thereby increasing statistical power in subsequent association studies, which consequently has a crucial impact on the identification of causal variants. Many features need to be considered when choosing the proper algorithm for imputation, including the target sample on which it is performed, i.e., related individuals, unrelated individuals, or both. Problems could arise when dealing with a target sample made up of mixed data, composed of both related and unrelated individuals, especially since the scientific literature on this topic is not sufficiently clear. To shed light on this issue, we examined existing algorithms and software for performing phasing and imputation on mixed human data from SNP arrays, specifically when related subjects belong to trios. By discussing the advantages and limitations of the current algorithms, we identified LD-based methods as being the most suitable for reconstruction of haplotypes in this specific context, and we proposed a feasible pipeline that can be used for imputing genotypes in both phased and unphased human data.
|
42
|
Zhai W, Duan Y, Zhang X, Xu G, Li H, Shi J, Xu Z, Zhang X. Sequence and thermodynamic characteristics of terminators revealed by FlowSeq and the discrimination of terminators strength. Synth Syst Biotechnol 2022; 7:1046-1055. [PMID: 35845313 PMCID: PMC9257418 DOI: 10.1016/j.synbio.2022.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 06/11/2022] [Accepted: 06/11/2022] [Indexed: 11/24/2022] Open
Abstract
The intrinsic terminator in prokaryotes forms a secondary RNA structure and terminates transcription. However, leaky transcription is common because terminator strength varies. Apart from the representative hairpin and U-tract structure, the detailed sequence and thermodynamic features of terminators were not completely clear, and the effect of the terminator on upstream gene expression was also unclear. Thus, it is still challenging to use terminators to control expression with high precision. Here, in E. coli, we first determined the effect of the 3′-end sequences, including spacer sequences and terminator sequences, on the expression of upstream and downstream genes. Second, a terminator mutation library was constructed, and the thermodynamic and sequence features associated with differences in termination efficiency were analyzed using the FlowSeq technique. The results showed that, under the regulation of terminators, the expression of upstream and downstream genes was negatively correlated (r=−0.60), and terminators with lower free energy correlated with higher upstream gene expression. Meanwhile, terminators with a longer stem, a more compact loop, and a perfect U-tract structure favored transcription termination. Finally, a terminator strength classification model was established, and a verification experiment based on 20 synthetic terminators indicated that the model can distinguish strong and weak terminators to a certain extent. The results help to elucidate the role of terminators in gene expression, the key factors identified are crucial for the rational design of terminators, and the model provides a method for terminator strength prediction.
Affiliation(s)
- Weiji Zhai
- Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
| | - Yanting Duan
- Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
| | - Xiaomei Zhang
- School of Life Science and Health Engineering, Jiangnan University, Wuxi, 214122, China
- Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, 214122, China
| | - Guoqiang Xu
- Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
| | - Hui Li
- School of Life Science and Health Engineering, Jiangnan University, Wuxi, 214122, China
- Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, 214122, China
| | - Jinsong Shi
- School of Life Science and Health Engineering, Jiangnan University, Wuxi, 214122, China
- Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, 214122, China
| | - Zhenghong Xu
- Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
| | - Xiaojuan Zhang
- Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Corresponding author. Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China.
|
43
|
Luo R, Pan W, Liu W, Tian Y, Zeng Y, Li Y, Li Z, Cui L. The barley DIR gene family: An expanded gene family that is involved in stress responses. Front Genet 2022; 13:1042772. [PMID: 36406120 PMCID: PMC9667096 DOI: 10.3389/fgene.2022.1042772] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/13/2022] [Accepted: 10/24/2022] [Indexed: 09/09/2023] Open
Abstract
Gene family expansion plays a central role in adaptive divergence and, ultimately, speciation is influenced by phenotypic diversity in different environments. Barley (Hordeum vulgare) is the fourth most important cereal crop in the world and is used for brewing purposes, animal feed, and human food. Systematic characterization of expanded gene families is instrumental in the research of the evolutionary history of barley and understanding of the molecular function of their gene products. A total of 31,750 conserved orthologous groups (OGs) were identified using eight genomes/subgenomes, of which 1,113 and 6,739 were rapidly expanded and contracted OGs in barley, respectively. Five expanded OGs containing 20 barley dirigent genes (HvDIRs) were identified. HvDIRs from the same OG were phylogenetically clustered with similar gene structure and domain organization. In particular, 7 and 5 HvDIRs from OG0000960 and OG0001516, respectively, contributed greatly to the expansion of the DIR-c subfamily. Tandem duplication was the driving force for the expansion of the barley DIR gene family. Nucleotide diversity and haplotype network analysis revealed that the expanded HvDIRs experienced severe bottleneck events during barley domestication, and can thus be considered as potential domestication-related candidate genes. The expression profile and co-expression network analysis revealed the critical roles of the expanded HvDIRs in various biological processes, especially in stress responses. HvDIR18, HvDIR19, and HvDIR63 could serve as excellent candidates for further functional genomics studies to improve the production of barley products. Our study revealed that the HvDIR family was significantly expanded in barley and might be involved in different developmental processes and stress responses. Thus, besides providing a framework for future functional genomics and metabolomics studies, this study also identified HvDIRs as candidates for use in improving barley crop resistance to biotic and abiotic stresses.
Affiliation(s)
- Ruihan Luo
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
| | - Wenqiu Pan
- State Key Laboratory of Crop Stress Biology in Arid Areas and College of Agronomy, Northwest A&F University, Yangling, Shaanxi, China
| | - Wenqiang Liu
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
| | - Yuan Tian
- Xintai Urban and Rural Development Group Co., Ltd., Taian, Shandong, China
| | - Yan Zeng
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
| | - Yihan Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
| | - Zhimin Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
| | - Licao Cui
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
| |
|
44
|
Liu Z, Fan H, Ma Z. Comparison of SWEET gene family between maize and foxtail millet through genomic, transcriptomic, and proteomic analyses. THE PLANT GENOME 2022; 15:e20226. [PMID: 35713030 DOI: 10.1002/tpg2.20226] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/18/2021] [Accepted: 04/14/2021] [Indexed: 06/15/2023]
Abstract
Foxtail millet [Setaria italica (L.) P. Beauv.] does not show high yield and biomass compared with maize (Zea mays L.) although it is a C4 crop with the potential for high productivity. Because SWEET genes, which are important for sugar transport in plants, play critical roles in biomass production and seed filling in crops, a genome-wide, transcriptomic, and proteomic comparison of the SWEET gene family between these two species would provide some clues for resolving this issue. In our study, 24 SWEET genes were identified in foxtail millet and maize. Sequence-based bioinformatics combined with gene expression analyses identified several candidate functional orthologs in these two species. A comparative analysis of the expression characteristics of SWEET genes and proteins between maize and foxtail millet indicates that not only do some critical major SWEET proteins show significant upregulation in maize compared with their orthologs in foxtail millet, but also more maize SWEET genes than foxtail millet SWEET genes show high expression, suggesting that, compared with foxtail millet, maize possesses a higher capacity for sugar transport, the crucial determinant of crop yield and biomass. These results provide a basis for revealing why foxtail millet exhibits low yield and biomass although it is a C4 crop with the potential for high productivity.
Affiliation(s)
- Zheng Liu
- State Key Laboratory of North China Crop Improvement and Regulation, College of Agronomy, Hebei Agricultural Univ., Baoding, Hebei, 071001, People's Republic of China
| | - Hui Fan
- State Key Laboratory of North China Crop Improvement and Regulation, College of Agronomy, Hebei Agricultural Univ., Baoding, Hebei, 071001, People's Republic of China
| | - Zhiying Ma
- State Key Laboratory of North China Crop Improvement and Regulation, College of Agronomy, Hebei Agricultural Univ., Baoding, Hebei, 071001, People's Republic of China
| |
|
45
|
Duality Between the Local Score of One Sequence and Constrained Hidden Markov Model. Methodol Comput Appl Probab 2022. [DOI: 10.1007/s11009-021-09856-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
46
|
Dirmeier S, Beerenwinkel N. Structured hierarchical models for probabilistic inference from perturbation screening data. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Affiliation(s)
- Simon Dirmeier
- Department of Biosystems Science and Engineering, ETH Zurich
| | | |
|
47
|
Wang S, Kim M, Jiang X, Harmanci AO. Evaluation of vicinity-based hidden Markov models for genotype imputation. BMC Bioinformatics 2022; 23:356. [PMID: 36038834 PMCID: PMC9422108 DOI: 10.1186/s12859-022-04896-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2021] [Accepted: 08/08/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The decreasing cost of DNA sequencing has led to a great increase in our knowledge about genetic variation. While population-scale projects bring important insight into genotype-phenotype relationships, the cost of performing whole-genome sequencing on large samples is still prohibitive. In-silico genotype imputation coupled with genotyping-by-arrays is a cost-effective and accurate alternative for genotyping of common and uncommon variants. Imputation methods compare the genotypes of the typed variants with large population-specific reference panels and estimate the genotypes of untyped variants by making use of linkage disequilibrium patterns. The most accurate imputation methods are based on the Li-Stephens hidden Markov model (HMM), which treats the sequence of each chromosome as a mosaic of the haplotypes from the reference panel. RESULTS Here we assess the accuracy of vicinity-based HMMs, where each untyped variant is imputed using the typed variants in a small window around itself (as small as 1 centimorgan). Locality-based imputation has recently been used by machine learning-based genotype imputation approaches. We assess how the parameters of the vicinity-based HMMs impact the imputation accuracy in a comprehensive set of benchmarks and show that vicinity-based HMMs can accurately impute common and uncommon variants. CONCLUSIONS Our results indicate that locality-based imputation models can be effectively used for genotype imputation. The parameter settings that we identified can be used in future methods, and vicinity-based HMMs can be used for re-structuring and parallelizing new imputation methods. The source code for the vicinity-based HMM implementations is publicly available at https://github.com/harmancilab/LoHaMMer.
Affiliation(s)
- Su Wang
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Miran Kim
- Department of Mathematics, Hanyang University, Seoul, 04763, Republic of Korea
| | - Xiaoqian Jiang
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Arif Ozgun Harmanci
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| |
|
48
|
Real-Time Assembly Support System with Hidden Markov Model and Hybrid Extensions. MATHEMATICS 2022. [DOI: 10.3390/math10152725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
This paper presents a context-aware adaptive assembly assistance system meant to support factory workers by embedding predictive capabilities. The research is focused on the predictor which suggests the next assembly step. Hidden Markov models are analyzed for this purpose. Several prediction methods have been previously evaluated and the prediction by partial matching, which was the most efficient, is considered in this work as a component of a hybrid model together with an optimally configured hidden Markov model. The experimental results show that the hidden Markov model is a viable choice to predict the next assembly step, whereas the hybrid predictor is even better, outperforming in some cases all the other models. Nevertheless, an assembly assistance system meant to support factory workers needs to embed multiple models to exhibit valuable predictive capabilities.
|
49
|
Rosic N. Genome Mining as an Alternative Way for Screening the Marine Organisms for Their Potential to Produce UV-Absorbing Mycosporine-like Amino Acid. Mar Drugs 2022; 20:478. [PMID: 35892946 PMCID: PMC9394291 DOI: 10.3390/md20080478] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Revised: 07/21/2022] [Accepted: 07/21/2022] [Indexed: 12/10/2022] Open
Abstract
Mycosporine-like amino acids (MAAs) are small molecules with robust ultraviolet (UV)-absorbing capacities and a huge potential to be used as an environmentally friendly natural sunscreen. MAAs, temperature- and light-stable compounds, demonstrate powerful photoprotective capacities and the ability to capture light in the UV-A and UV-B ranges without the production of damaging free radicals. The biotechnological uses of these secondary metabolites have often been limited by the small quantities recovered from natural resources, variation in MAA expression profiles, and limited success in heterologous expression systems. Overcoming these obstacles requires a better understanding of MAA biosynthesis and its regulatory processes. MAAs are produced to a certain extent via a four-enzyme pathway, including genes encoding the enzymes dehydroquinate synthase, O-methyltransferase, adenosine triphosphate grasp, and a nonribosomal peptide synthetase. However, there are substantial genetic discrepancies in the MAA genetic pathway in different species, suggesting further complexity of this pathway that is yet to be fully explored. In recent years, the application of genome-mining approaches has allowed the identification of biosynthetic gene clusters (BGCs) that resulted in the discovery of many new compounds from unconventional sources. This review explores the use of novel genomics tools for linking BGCs and secondary metabolites, including MAAs, based on the available omics data, and evaluates the potential of using novel genome-mining tools to reveal a cryptic potential for new bioproduct screening approaches and to uncover new MAA producers.
Affiliation(s)
- Nedeljka Rosic
- Faculty of Health, Southern Cross University, Gold Coast, QLD 4225, Australia;
- Marine Ecology Research Centre, Southern Cross University, Lismore, NSW 2480, Australia
| |
|
50
|
Zhang W, Meng Q, Wang J, Guo F. HDIContact: a novel predictor of residue-residue contacts on hetero-dimer interfaces via sequential information and transfer learning strategy. Brief Bioinform 2022; 23:6599074. [PMID: 35653713 DOI: 10.1093/bib/bbac169] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Revised: 03/07/2022] [Accepted: 04/16/2022] [Indexed: 11/12/2022] Open
Abstract
Proteins maintain the functional order of the cell by interacting with other proteins. Determination of protein complex structural information gives biological insights for the research of diseases and drugs. Recently, a breakthrough has been made in protein monomer structure prediction. However, due to the limited number of known protein structures and homologous sequences of complexes, the prediction of residue-residue contacts on hetero-dimer interfaces is still a challenge. In this study, we have developed a deep learning framework for inferring inter-protein residue contacts from sequential information, called HDIContact. We utilized a transfer learning strategy to produce a Multiple Sequence Alignment (MSA) two-dimensional (2D) embedding based on patterns of concatenated MSA, which could reduce the influence of noise on MSA caused by mismatched sequences or less homology. For the MSA 2D embedding, HDIContact took advantage of a two-channel Bi-directional Long Short-Term Memory (BiLSTM) network to capture the 2D context of residue pairs. Our comprehensive assessment on the Escherichia coli (E. coli) test dataset showed that HDIContact outperformed other state-of-the-art methods, with top precision of 65.96%, an Area Under the Receiver Operating Characteristic curve (AUROC) of 83.08% and an Area Under the Precision Recall curve (AUPR) of 25.02%. In addition, we analyzed the potential of HDIContact for human-virus protein-protein complexes, achieving top five precision of 80% on O75475-P04584 related to Human Immunodeficiency Virus. All experiments indicated that our method was a valuable technical tool for predicting inter-protein residue contacts, which would be helpful for understanding protein-protein interaction mechanisms.
Affiliation(s)
- Wei Zhang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Qiaozhen Meng
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| |
|