1
Sahibzada KI, Shahid S, Akhter M, Faisal M, Abd El Rahman RA, Imran M, Lv Y, Wei D, Hu Y. Advancing Enzyme-Based Detoxification Prediction with ToxZyme: An Ensemble Machine Learning Approach. Toxins (Basel) 2025; 17:171. [PMID: 40278669 PMCID: PMC12031443 DOI: 10.3390/toxins17040171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2025] [Revised: 03/20/2025] [Accepted: 03/28/2025] [Indexed: 04/26/2025] Open
Abstract
The accurate prediction of enzymes with environmental detoxification functions is crucial, not only to achieve a better understanding of bioremediation strategies, but also to alleviate environmental pollution. In the present study, a novel machine learning model was introduced which classifies enzymes by their toxin degradation ability. In this model, two different sets of data were used: enzymes that can catalyze toxin degradation as a positive dataset and non-toxin-degrading enzymes as a negative dataset. Further, a comparison of multiple classifiers was performed to find the best model, and a Random Forest (RF) classifier was selected due to its strong performance. To enhance the accuracy, we combined RF with a Deep Neural Network (DNN), forming an ensemble model which effectively integrated both techniques. This combination achieved 95% precision, surpassing individual models. Our ensemble model not only ensures high prediction accuracy but also reliably differentiates toxin-degrading enzymes from non-degrading ones. This study highlights the power of combining classical machine learning with deep learning to advance prediction. Our model represents a significant step in enzyme classification and serves as a valuable resource for environmental biotechnology, food nutrition, and health applications.
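The abstract does not state how the RF and DNN outputs were combined. As a minimal sketch, assuming a simple soft-voting scheme (averaging the two models' positive-class probabilities) and entirely hypothetical scores and labels, the combination step and the precision metric could look like this:

```python
def soft_vote(rf_probs, dnn_probs, threshold=0.5):
    """Average two models' positive-class probabilities and threshold the mean.

    Soft voting is an assumption here; the abstract does not name the scheme.
    """
    if len(rf_probs) != len(dnn_probs):
        raise ValueError("both models must score the same samples")
    return [int((p + q) / 2 >= threshold) for p, q in zip(rf_probs, dnn_probs)]


def precision(y_true, y_pred):
    """Precision = TP / (TP + FP); returns 0.0 when nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical scores for five enzymes (1 = toxin-degrading, 0 = non-degrading).
rf_probs = [0.9, 0.4, 0.7, 0.2, 0.6]
dnn_probs = [0.8, 0.7, 0.2, 0.1, 0.9]
y_true = [1, 1, 0, 0, 1]

y_pred = soft_vote(rf_probs, dnn_probs)
print(y_pred, precision(y_true, y_pred))
```

In this toy case each model alone misclassifies one sample at the 0.5 threshold, while the averaged score classifies all five correctly, which is the kind of gain an ensemble is meant to provide.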
Affiliation(s)
- Kashif Iqbal Sahibzada
- College of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Department of Health Professional Technologies, Faculty of Allied Health Sciences, The University of Lahore, Lahore 54570, Pakistan
- Shumaila Shahid
- School of Biochemistry and Biotechnology, University of the Punjab, Lahore 54570, Pakistan
- Mohsina Akhter
- School of Biological Sciences, University of the Punjab, Lahore 54570, Pakistan
- Muhammad Faisal
- Chemical Engineering, School for Engineering of Matter, Transport and Energy (SEMTE), Arizona State University, Tempe, AZ 85281, USA
- University Institute of Biochemistry and Biotechnology, PMAS-Arid Agriculture University Rawalpindi, Rawalpindi 46000, Pakistan
- Reham A. Abd El Rahman
- Department of Clinical Laboratory Science, College of Applied Medical Sciences, University of Hafer Al Batin UHB, Hafer Al Batin 39524, Saudi Arabia
- Muhammad Imran
- College of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Yangyong Lv
- College of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Dongqing Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, China
- Qihe Laboratory, Qishui Guang East, Qibin District, Hebi 458030, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang 473006, China
- Yuansen Hu
- College of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
2
Sahibzada KI, Shahid S, Akhter M, Abid R, Azhar M, Hu Y, Wei DQ. HIV OctaScanner: A Machine Learning Approach to Unveil Proteolytic Cleavage Dynamics in HIV-1 Protease Substrates. J Chem Inf Model 2025; 65:640-648. [PMID: 39807569 DOI: 10.1021/acs.jcim.4c01808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2025]
Abstract
The rise of resistance to antiretroviral drugs due to mutations in human immunodeficiency virus-1 (HIV-1) protease is a major obstacle to effective treatment. These mutations alter the drug-binding pocket of the protease and reduce the drug efficacy by disrupting interactions with inhibitors. Traditional methods, such as biochemical assays and structural biology, are crucial for studying enzyme function but are time-consuming and labor-intensive. To address these challenges, we developed HIV OctaScanner, a machine learning algorithm that predicts the proteolytic cleavage activity of octameric substrates at the HIV-1 protease cleavage sites. The algorithm uses a Random Forest (RF) classifier and achieves a prediction accuracy of 89% in the identification of cleavable octamers. This innovative approach facilitates the rapid screening of potential substrates for HIV-1 protease, providing critical insights into enzyme function and guiding the development of more effective therapeutic strategies. By improving the accuracy of substrate identification, HIV OctaScanner has the potential to support the development of next generation antiretroviral treatments.
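The abstract does not describe how octamers are extracted or encoded. As an illustrative sketch only, assuming a sliding 8-residue window and a plain one-hot encoding over the 20 standard amino acids (both are assumptions, as is the example sequence), candidate substrates could be prepared for a classifier like this:

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def octamers(sequence):
    """Yield (start_index, octamer) for every 8-residue window of a sequence."""
    for i in range(len(sequence) - 7):
        yield i, sequence[i:i + 8]


def one_hot(octamer):
    """Encode an octamer as a flat 8 x 20 binary feature vector."""
    if len(octamer) != 8:
        raise ValueError("expected exactly 8 residues")
    features = []
    for residue in octamer:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"non-standard residue: {residue!r}")
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


# Hypothetical 10-residue stretch used purely for illustration.
windows = list(octamers("SQNYPIVQNL"))
encoded = [one_hot(octamer) for _, octamer in windows]
print(windows)
```

Each 160-element vector has exactly eight ones, one per position, and could be passed to any classifier that accepts fixed-length numeric features.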
Affiliation(s)
- Kashif Iqbal Sahibzada
- College of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Department of Health Professional Technologies, Faculty of Allied Health Sciences, The University of Lahore, Lahore 54570, Pakistan
- Shumaila Shahid
- School of Biochemistry and Biotechnology, University of the Punjab, Lahore 54590, Pakistan
- Mohsina Akhter
- School of Biological Sciences, University of the Punjab, Lahore 54590, Pakistan
- Rizwan Abid
- School of Biochemistry and Biotechnology, University of the Punjab, Lahore 54590, Pakistan
- Muteeba Azhar
- School of Biochemistry and Biotechnology, University of the Punjab, Lahore 54590, Pakistan
- Yuansen Hu
- College of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
- Qihe Laboratory, Qishui Guang East, Qibin District, Hebi, Henan 458030, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan 473006, P.R. China
3
Mastrantonio V, Libro P, Di Martino J, Matera M, Bellini R, Castrignanò T, Urbanelli S, Porretta D. Integrated de novo transcriptome of Culex pipiens mosquito larvae as a resource for genetic control strategies. Sci Data 2024; 11:471. [PMID: 38724521 PMCID: PMC11082219 DOI: 10.1038/s41597-024-03285-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 04/19/2024] [Indexed: 05/12/2024] Open
Abstract
We present a de novo transcriptome of the mosquito vector Culex pipiens, assembled from sequences of susceptible and insecticide-resistant larvae. The high quality of the assembly was confirmed by TransRate and BUSCO. A mapping percentage of up to 94.8% was obtained by aligning contigs to Nr, SwissProt, and TrEMBL, with 27,281 sequences mapping to all three databases simultaneously. A total of 14,966 ORFs were also functionally annotated using the eggNOG database. Among them, we identified ORF sequences of the main gene families involved in insecticide resistance. Therefore, this resource stands as a valuable reference for further studies of differential gene expression as well as for identifying genes of interest for genetic-based control tools.
Affiliation(s)
- Pietro Libro
- Department of Ecological and Biological Sciences, Tuscia University, Largo dell'Università snc, 01100, Viterbo, Italy
- Jessica Di Martino
- Department of Ecological and Biological Sciences, Tuscia University, Largo dell'Università snc, 01100, Viterbo, Italy
- Michele Matera
- Envu, 2022 ES Deutschland GmbH, Germany, Monheim, Germany
- Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, United Kingdom
- Romeo Bellini
- Centro Agricoltura Ambiente "G. Nicoli", Via Sant'Agata 835, 40014, Crevalcore, Italy
- Tiziana Castrignanò
- Department of Ecological and Biological Sciences, Tuscia University, Largo dell'Università snc, 01100, Viterbo, Italy
- Sandra Urbanelli
- Department of Environmental Biology, Sapienza University of Rome, 00185, Rome, Italy
- Daniele Porretta
- Department of Environmental Biology, Sapienza University of Rome, 00185, Rome, Italy
4
Ahsan MU, Barbier F, Hayward A, Powell R, Hofman H, Parfitt SC, Wilkie J, Beveridge CA, Mitter N. Molecular Cues for Phenological Events in the Flowering Cycle in Avocado. PLANTS (BASEL, SWITZERLAND) 2023; 12:2304. [PMID: 37375929 DOI: 10.3390/plants12122304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 06/09/2023] [Accepted: 06/09/2023] [Indexed: 06/29/2023]
Abstract
Reproductively mature horticultural trees undergo an annual flowering cycle that repeats each year of their reproductive life. This annual flowering cycle is critical for horticultural tree productivity. However, the molecular events underlying the regulation of flowering in tropical tree crops such as avocado are not fully understood or documented. In this study, we investigated the potential molecular cues regulating the yearly flowering cycle in avocado for two consecutive crop cycles. Homologues of flowering-related genes were identified and assessed for their expression profiles in various tissues throughout the year. Avocado homologues of known floral genes FT, AP1, LFY, FUL, SPL9, CO and SEP2/AGL4 were upregulated at the typical time of floral induction for avocado trees growing in Queensland, Australia. We suggest these are potential candidate markers for floral initiation in these crops. In addition, DAM and DRM1, which are associated with endodormancy, were downregulated at the time of floral bud break. In this study, a positive correlation between CO activation and FT in avocado leaves to regulate flowering was not seen. Furthermore, the SOC1-SPL4 model described in annual plants appears to be conserved in avocado. Lastly, no correlation of juvenility-related miRNAs miR156, miR172 with any phenological event was observed.
Affiliation(s)
- Muhammad Umair Ahsan
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
- Francois Barbier
- School of Biological Sciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Alice Hayward
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
- Rosanna Powell
- School of Biological Sciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Helen Hofman
- Department of Agriculture and Fisheries, Queensland Government, Bundaberg, QLD 4670, Australia
- Siegrid Carola Parfitt
- Department of Agriculture and Fisheries, Queensland Government, Bundaberg, QLD 4670, Australia
- John Wilkie
- Department of Agriculture and Fisheries, Queensland Government, Bundaberg, QLD 4670, Australia
- Neena Mitter
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
5
Solares E, Morales-Cruz A, Balderas RF, Focht E, Ashworth VETM, Wyant S, Minio A, Cantu D, Arpaia ML, Gaut BS. Insights into the domestication of avocado and potential genetic contributors to heterodichogamy. G3 (BETHESDA, MD.) 2023; 13:jkac323. [PMID: 36477810 PMCID: PMC9911064 DOI: 10.1093/g3journal/jkac323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2022] [Revised: 11/11/2022] [Accepted: 11/15/2022] [Indexed: 12/13/2022]
Abstract
The domestication history of the avocado (Persea americana) remains unclear. We created a reference genome from the Gwen varietal, which is closely related to the economically dominant Hass varietal. Our genome assembly had an N50 of 3.37 megabases, a BUSCO score of 91%, and was scaffolded with a genetic map, producing 12 pseudo-chromosomes with 49,450 genes. We used the Gwen genome as a reference to investigate population genomics, based on a sample of 34 resequenced accessions that represented the 3 botanical groups of P. americana. Our analyses were consistent with 3 separate domestication events; we estimated that the Mexican group diverged from the Lowland (formerly known as "West Indian") and Guatemalan groups >1 million years ago. We also identified putative targets of selective sweeps in domestication events; within the Guatemalan group, putative candidate genes were enriched for fruit development and ripening. We also investigated divergence between heterodichogamous flowering types, providing preliminary evidence for potential candidate genes involved in pollination and floral development.
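For readers unfamiliar with the N50 statistic quoted above: it is the length of the shortest contig in the smallest set of longest contigs that together span at least half of the assembly. A minimal sketch of the standard calculation, using hypothetical toy contig lengths rather than the Gwen assembly:

```python
def n50(lengths):
    """Return the N50 of a list of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length


# Hypothetical toy assembly, lengths in arbitrary units.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

Here the two longest contigs (8 + 8 = 16) already cover half of the total length of 32, so the toy N50 is 8.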
Affiliation(s)
- Edwin Solares
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA 92697-2525, USA
- Abraham Morales-Cruz
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA 92697-2525, USA
- Rosa Figueroa Balderas
- Department of Viticulture and Enology, University of California, Davis, Davis, CA 95616, USA
- Eric Focht
- Department of Botany and Plant Sciences, University of California, Riverside, Riverside, CA 92521, USA
- Vanessa E T M Ashworth
- Department of Botany and Plant Sciences, University of California, Riverside, Riverside, CA 92521, USA
- Skylar Wyant
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA 92697-2525, USA
- Andrea Minio
- Department of Viticulture and Enology, University of California, Davis, Davis, CA 95616, USA
- Dario Cantu
- Department of Viticulture and Enology, University of California, Davis, Davis, CA 95616, USA
- Mary Lu Arpaia
- Department of Botany and Plant Sciences, University of California, Riverside, Riverside, CA 92521, USA
- Brandon S Gaut
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA 92697-2525, USA
6
Song M, Wang H, Fan Z, Huang H, Ma H. Advances in sequencing and key character analysis of mango (Mangifera indica L.). HORTICULTURE RESEARCH 2023; 10:uhac259. [PMID: 37601702 PMCID: PMC10433700 DOI: 10.1093/hr/uhac259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 11/19/2022] [Indexed: 08/22/2023]
Abstract
Mango (Mangifera indica L.) is an important fruit crop in tropical and subtropical countries associated with many agronomic and horticultural problems, such as susceptibility to pathogens, including powdery mildew and anthracnose, poor yield and quality, and short shelf life. Conventional breeding techniques exhibit significant limitations in improving mango quality due to the characteristics of long ripening, self-incompatibility, and high genetic heterozygosity. In recent years, much emphasis has been placed on identification of key genes controlling a certain trait through genomic association analysis and directly breeding new varieties through transgene or genotype selection of offspring. This paper reviews the latest research progress on the genome and transcriptome sequencing of mango fruit. The rapid development of genome sequencing and bioinformatics provides effective strategies for identifying, labeling, cloning, and manipulating many genes related to economically important traits. Preliminary verification of the functions of mango genes has been conducted, including genes related to flowering regulation, fruit development, and polyphenol biosynthesis. Importantly, modern biotechnology can refine existing mango varieties to meet the market demand with high economic benefits.
Affiliation(s)
- Miaoyu Song
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Haomiao Wang
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Zhiyi Fan
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Hantang Huang
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Huiqin Ma
- College of Horticulture, China Agricultural University, Beijing 100193, China
- State Key Laboratory of Agrobiotechnology, China Agricultural University, Beijing 100083, China
7
He J, Lyu R, Luo Y, Xiao J, Xie L, Wen J, Li W, Pei L, Cheng J. A phylotranscriptome study using silica gel-dried leaf tissues produces an updated robust phylogeny of Ranunculaceae. Mol Phylogenet Evol 2022; 174:107545. [PMID: 35690374 DOI: 10.1016/j.ympev.2022.107545] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/20/2021] [Revised: 06/01/2022] [Accepted: 06/02/2022] [Indexed: 11/16/2022]
Abstract
The utility of transcriptome data in plant phylogenetics has gained popularity in recent years. However, because RNA degrades much more easily than DNA, the logistics of obtaining fresh tissues has become a major limiting factor for widely applying this method. Here, we used Ranunculaceae to test whether silica-dried plant tissues could be used for RNA extraction and subsequent phylogenomic studies. We sequenced 27 transcriptomes, 21 from silica gel-dried (SD-samples) and six from liquid nitrogen-preserved (LN-samples) leaf tissues, and downloaded 27 additional transcriptomes from GenBank. Our results showed that although the LN-samples produced slightly better reads than the SD-samples, there were no significant differences in RNA quality and quantity, assembled contig lengths and numbers, and BUSCO comparisons between two treatments. Using these data, we conducted phylogenomic analyses, including concatenated- and coalescent-based phylogenetic reconstruction, molecular dating, coalescent simulation, phylogenetic network estimation, and whole genome duplication (WGD) inference. The resulting phylogeny was consistent with previous studies with higher resolution and statistical support. The 11 core Ranunculaceae tribes grouped into two chromosome type clades (T- and R-types), with high support. Discordance among gene trees is likely due to hybridization and introgression, ancient genetic polymorphism and incomplete lineage sorting. Our results strongly support one ancient hybridization event within the R-type clade and three WGD events in Ranunculales. Evolution of the three Ranunculaceae chromosome types is likely not directly related to WGD events. By clearly resolving the Ranunculaceae phylogeny, we demonstrated that SD-samples can be used for RNA-seq and phylotranscriptomic studies of angiosperms.
Affiliation(s)
- Jian He
- School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083, PR China
- Rudan Lyu
- School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083, PR China
- Yike Luo
- School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083, PR China
- Jiamin Xiao
- School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083, PR China
- Lei Xie
- School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083, PR China
- Jun Wen
- Department of Botany, National Museum of Natural History, MRC 166, Smithsonian Institution, Washington, DC 20013-7012, USA
- Wenhe Li
- School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083, PR China
- Linying Pei
- Beijing Engineering Technology Research Center for Garden Plants, Beijing Forestry University Forest Science Co. Ltd., Beijing 100083, PR China
- Jin Cheng
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, PR China
8
Suresh BV, Choudhary P, Aggarwal PR, Rana S, Singh RK, Ravikesavan R, Prasad M, Muthamilarasan M. De novo transcriptome analysis identifies key genes involved in dehydration stress response in kodo millet (Paspalum scrobiculatum L.). Genomics 2022; 114:110347. [PMID: 35337948 DOI: 10.1016/j.ygeno.2022.110347] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/24/2021] [Revised: 02/08/2022] [Accepted: 03/18/2022] [Indexed: 01/14/2023]
Abstract
Kodo millet (Paspalum scrobiculatum L.) is a small millet species known for its excellent nutritional and climate-resilient traits. To understand the genes and pathways underlying dehydration stress tolerance of kodo millet, the transcriptome of cultivar 'CO3' subjected to dehydration stress (0 h, 3 h, and 6 h) was sequenced. The study generated 239.1 million clean reads that identified 9201, 9814, and 2346 differentially expressed genes (DEGs) in 0 h vs. 3 h, 0 h vs. 6 h, and 3 h vs. 6 h libraries, respectively. The DEGs were found to be associated with vital molecular pathways, including hormone metabolism and signaling, antioxidant scavenging, photosynthesis, and cellular metabolism, and were validated using qRT-PCR. Also, a higher abundance of uncharacterized genes expressed during stress warrants further studies to characterize this class of genes to understand their role in dehydration stress response. Altogether, the study provides insights into the transcriptomic response of kodo millet during dehydration stress.
Affiliation(s)
- Bonthala Venkata Suresh
- Quantitative Genetics and Genomics of Plants, Heinrich Heine University, Düsseldorf 40225, Germany
- Pooja Choudhary
- Department of Plant Sciences, School of Life Sciences, University of Hyderabad, Hyderabad 500046, Telangana, India
- Pooja Rani Aggarwal
- Department of Plant Sciences, School of Life Sciences, University of Hyderabad, Hyderabad 500046, Telangana, India
- Sumi Rana
- Department of Plant Sciences, School of Life Sciences, University of Hyderabad, Hyderabad 500046, Telangana, India
- Rajasekaran Ravikesavan
- Department of Millets, Centre for Plant Breeding and Genetics, Tamil Nadu Agricultural University, Coimbatore, Tamil Nadu, India
- Manoj Prasad
- Department of Plant Sciences, School of Life Sciences, University of Hyderabad, Hyderabad 500046, Telangana, India; National Institute of Plant Genome Research, New Delhi 110067, India
- Mehanathan Muthamilarasan
- Department of Plant Sciences, School of Life Sciences, University of Hyderabad, Hyderabad 500046, Telangana, India
9
Zhao W, Chou J, Li J, Xu Y, Li Y, Hao Y. Impacts of Extreme Climate Events on Future Rice Yields in Global Major Rice-Producing Regions. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:4437. [PMID: 35457305 PMCID: PMC9031651 DOI: 10.3390/ijerph19084437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 03/30/2022] [Accepted: 04/04/2022] [Indexed: 12/02/2022]
Abstract
Under the dual impacts of climate change and COVID-19, there are great risks to the world's food security. Rice is one of the three major food crops of the world. Assessing the impact of climate change on future rice production is very important for ensuring global food security. This article divides the world's main rice-producing regions into four regions and uses a multivariate nonlinear model based on historical economic and climatic data to explore the impacts of historical extreme climatic events and economic factors on rice yield. Based on these historical models, future climatic data, and economic data under different shared socioeconomic pathways (SSPs), the yields of four major rice-producing regions of the world under different climate change scenarios (SSP126, SSP245, and SSP585) are predicted. The research results reveal that under different climate change scenarios, extreme high-temperature events (Tx90p) and extreme precipitation events (Rx5day, R99pTOT) in the four major rice-producing regions have an upward trend in the future. Extreme low-temperature events (Tn10p) have a downward trend. In the rice-producing regions of Southeast Asia and South America, extreme precipitation events will increase significantly in the future. The prediction results of this model indicate that the rice output of these four major rice-producing regions will show an upward trend in the future. Although extreme precipitation events will have a negative impact on rice production, future increases in rice planting areas, economic development, and population growth will all contribute to an increase in rice production. The increase in food demand caused by population growth also brings uncertainty to global food security. This research is helpful for further understanding climate change trends and risks to global rice-production areas in the future and provides an important reference for global rice-production planning and risk management.
Affiliation(s)
- Weixing Zhao
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
- Jieming Chou
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
- Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai 510275, China
- Jiangnan Li
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
- Yuan Xu
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
- Yuanmeng Li
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
- Yidan Hao
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
10
Mostafa S, Wang Y, Zeng W, Jin B. Floral Scents and Fruit Aromas: Functions, Compositions, Biosynthesis, and Regulation. FRONTIERS IN PLANT SCIENCE 2022; 13:860157. [PMID: 35360336 PMCID: PMC8961363 DOI: 10.3389/fpls.2022.860157] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 01/22/2022] [Accepted: 02/09/2022] [Indexed: 05/27/2023]
Abstract
Floral scents and fruit aromas are crucial volatile organic compounds (VOCs) in plants. They are used in defense mechanisms, along with mechanisms to attract pollinators and seed dispersers. In addition, they are economically important for the quality of crops, as well as quality in the perfume, cosmetics, food, drink, and pharmaceutical industries. Floral scents and fruit aromas share many volatile organic compounds in flowers and fruits. Volatile compounds are classified as terpenoids, phenylpropanoids/benzenoids, fatty acid derivatives, and amino acid derivatives. Many genes and transcription factors regulating the synthesis of volatiles have been discovered. In this review, we summarize recent progress in volatile function, composition, biosynthetic pathway, and metabolism regulation. We also discuss unresolved issues and research perspectives, providing insight into improvements and applications of plant VOCs.
Affiliation(s)
- Salma Mostafa
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, China
- Department of Floriculture, Faculty of Agriculture, Alexandria University, Alexandria, Egypt
- Yun Wang
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, China
- Wen Zeng
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, China
- Biao Jin
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, China
11
Raghavan V, Kraft L, Mesny F, Rigerte L. A simple guide to de novo transcriptome assembly and annotation. Brief Bioinform 2022; 23:6514404. [PMID: 35076693 PMCID: PMC8921630 DOI: 10.1093/bib/bbab563] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022] Open
Abstract
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Affiliation(s)
- Venket Raghavan
- Corresponding authors: Venket Raghavan, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany; Louis Kraft, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
- Louis Kraft
- Corresponding authors: Venket Raghavan, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany; Louis Kraft, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
12
|
Constructing a de novo transcriptome and a reference proteome for the bivalve Scrobicularia plana: Comparative analysis of different assembly strategies and proteomic analysis. Genomics 2021; 113:1543-1553. [PMID: 33774165 DOI: 10.1016/j.ygeno.2021.03.025] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/01/2020] [Revised: 03/17/2021] [Accepted: 03/21/2021] [Indexed: 11/20/2022]
Abstract
Scrobicularia plana is a coastal and estuarine bivalve widely used in ecotoxicological studies. However, the molecular mechanisms underlying S. plana responses to pollutants are poorly understood because molecular databases are lacking. Thus, in this study we present a holistic approach to assess a robust reference transcriptome and proteome of this clam. A mixture of control and metal-exposed individuals was used for mRNA isolation. Four sets of high-quality, filtered, preprocessed reads were generated (two quality scores and two sequenced lengths) and assembled with the Mira, Ray and Trinity algorithms. The sixty-four generated assemblies were refined, filtered and evaluated for their proteomic quality. Eight assemblies had top Detonate scores, but one was selected for its compactness and biological representation. It was generated (i) from the highest-quality dataset (Q20L100), (ii) using the Trinity algorithm with all k-mers (AtKa), (iii) removing redundancy with CD-HIT (RR80), and (iv) filtering out poor contigs (F), and was accordingly named Q20L100AtKaRR80F. S. plana proteomic analysis revealed 10,017 peptide groups that corresponded to 2066 proteins with a wide coverage of molecular functions and biological processes, confirming the strength of the database generated.
|
13
|
Alvarez RV, Mariño-Ramírez L, Landsman D. Transcriptome annotation in the cloud: complexity, best practices, and cost. Gigascience 2021; 10:giaa163. [PMID: 33511996 PMCID: PMC7845158 DOI: 10.1093/gigascience/giaa163] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/02/2020] [Revised: 11/13/2020] [Accepted: 12/23/2020] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: The NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) initiative provides NIH-funded researchers cost-effective access to commercial cloud providers, such as Amazon Web Services (AWS) and Google Cloud Platform (GCP). These cloud providers represent an alternative for the execution of large computational biology experiments like transcriptome annotation, which is a complex analytical process that requires the interrogation of multiple biological databases with several advanced computational tools. The core components of annotation pipelines published since 2012 are BLAST sequence alignments against annotated databases of nucleotide or protein sequences, run almost exclusively on networked on-premises compute systems.
FINDINGS: We compare multiple BLAST sequence alignments using AWS and GCP. We prepared several Jupyter Notebooks with all the code required to submit computing jobs to the batch system on each cloud provider. We consider how the number of query transcripts in the input files affects cost and processing time. We tested compute instances with 16, 32, and 64 vCPUs on each cloud provider. Four classes of timing results were collected: the total run time, the time for transferring the BLAST databases to the instance local solid-state disk drive, the time to execute the CWL script, and the time for the creation, set-up, and release of an instance. This study aims to establish an estimate of the cost and compute time needed for the execution of multiple BLAST runs in a cloud environment.
CONCLUSIONS: We demonstrate that public cloud providers are a practical alternative for the execution of advanced computational biology experiments at low cost. Using our cloud recipes, the BLAST alignments required to annotate a transcriptome with ∼500,000 transcripts can be processed in <2 hours with a compute cost of ∼$200-$250. In our opinion, for BLAST-based workflows, the choice of cloud platform is not dependent on the workflow but, rather, on the specific details and requirements of the cloud provider. These choices include the accessibility for institutional use, the technical knowledge required for effective use of the platform services, and the availability of open source frameworks such as APIs to deploy the workflow.
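As a rough illustration of the figures quoted in this abstract, the sketch below splits a list of query identifiers into fixed-size batches (one batch per submitted job) and converts the reported total cost into a per-transcript cost. It is not the authors' notebook code: the function names, the toy identifiers, and the batch size of 4 are hypothetical, and only the ∼500,000 transcripts and ∼$200-$250 values come from the abstract.

```python
from itertools import islice


def batch_queries(records, batch_size):
    """Yield successive lists holding at most batch_size records each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def cost_per_transcript(total_cost_usd, n_transcripts):
    """Average compute cost per query transcript, in US dollars."""
    return total_cost_usd / n_transcripts


# Hypothetical toy input: ten transcript identifiers, four per job.
toy_ids = [f"tx{i}" for i in range(10)]
toy_batches = list(batch_queries(toy_ids, 4))
print([len(batch) for batch in toy_batches])

# Figures taken from the abstract: ~500,000 transcripts for ~$200-$250.
low = cost_per_transcript(200, 500_000)
high = cost_per_transcript(250, 500_000)
print(f"{low:.4f} to {high:.4f} USD per transcript")
```

Under these numbers the compute cost works out to a small fraction of a cent per transcript; the batch size that actually balances cost against run time depends on the instance type and is the trade-off the study measures.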
Affiliation(s)
- Roberto Vera Alvarez
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, 9000 Rockville Pike, Bethesda, MD 20890, USA
- Leonardo Mariño-Ramírez
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, 9000 Rockville Pike, Bethesda, MD 20890, USA
- David Landsman
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, 9000 Rockville Pike, Bethesda, MD 20890, USA
|
14
|
Chromosome-Scale Assembly and Annotation of the Macadamia Genome (Macadamia integrifolia HAES 741). G3-GENES GENOMES GENETICS 2020; 10:3497-3504. [PMID: 32747341 PMCID: PMC7534425 DOI: 10.1534/g3.120.401326] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/02/2022]
Abstract
Macadamia integrifolia is a representative of the large basal eudicot family Proteaceae and the main progenitor species of the Australian native nut crop macadamia. Since its commercialisation in Hawaii fewer than 100 years ago, global production has expanded rapidly. However, genomic resources are limited in comparison to other horticultural crops. The first draft assembly of M. integrifolia had good coverage of the functional gene space but its high fragmentation has restricted its use in comparative genomics and association studies. Here we have generated an improved assembly of cultivar HAES 741 (4,094 scaffolds, 745 Mb, N50 413 kb) using a combination of Illumina paired and PacBio long read sequences. Scaffolds were anchored to 14 pseudo-chromosomes using seven genetic linkage maps. This assembly has improved contiguity and coverage, with >120 Mb of additional sequence. Following annotation, 34,274 protein-coding genes were predicted, representing 90% of the expected gene content. Our results indicate that the macadamia genome is repetitive and heterozygous. The total repeat content was 55% and genome-wide heterozygosity, estimated by read mapping, was 0.98% or an average of one SNP per 102 bp. This is the first chromosome-scale genome assembly for macadamia and the Proteaceae. It is expected to be a valuable resource for breeding, gene discovery, conservation and evolutionary genomics.
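Two of the statistics in this abstract can be checked with a few lines of arithmetic. The sketch below computes N50 using its standard definition (the length of the scaffold at which the running total of lengths, sorted longest first, reaches half the assembly size) and converts a heterozygosity percentage into an average spacing between SNPs. The function names and the toy scaffold lengths are hypothetical; only the 0.98% value comes from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def bp_per_snp(heterozygosity_percent):
    """Average number of base pairs per heterozygous site."""
    return 100.0 / heterozygosity_percent


# Hypothetical toy scaffold lengths (not the macadamia assembly).
toy_scaffolds = [10, 8, 6, 4, 2]
print(n50(toy_scaffolds))

# Heterozygosity reported in the abstract: 0.98%.
print(round(bp_per_snp(0.98)))
```

For the reported 0.98% heterozygosity, the spacing rounds to 102 bp, which matches the "one SNP per 102 bp" figure in the abstract.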
|