1
|
Hurtado M, Suarez-Álvarez S, Castander-Olarieta A, Montalbán IA, Goicoechea PG, López de Heredia U, Marino D, Moncaleán P. Physiological and molecular response to drought in somatic plants from Pinus radiata embryonal masses induced at high temperatures. PLANT PHYSIOLOGY AND BIOCHEMISTRY : PPB 2025; 224:109886. [PMID: 40262399 DOI: 10.1016/j.plaphy.2025.109886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2025] [Accepted: 04/04/2025] [Indexed: 04/24/2025]
Abstract
Drought and heat are among the major abiotic stresses in forest trees and are directly related to the consequences of climate change. Many responses to abiotic stresses in plants have been associated with plant memory, but the mechanisms underlying this phenomenon remain unclear. Somatic embryogenesis, which is considered one of the most important methods for large-scale vegetative propagation of plants, is also used to induce stress and to study the mechanisms involved in adaptation to abiotic stress. Specifically, heat stress during the initiation stage of somatic embryogenesis has been shown to affect the differential expression of stress-related genes in pines. Modifications caused by a previous stress could eventually influence the stress tolerance of somatic plants years later. In this study we analysed the response to drought in 2-year-old radiata pine somatic plants, derived from embryonal masses initiated at 60 °C, at the physiological, transcriptomic and amino acid accumulation levels. Our results showed a more pronounced response to drought in plants from the 60 °C treatment, which presented lower values in several physiological parameters as well as higher proline and tyrosine levels. Additionally, the transcriptomic response to drought was stronger in heat-primed plants than in control plants, suggesting a memory acquired two years before.
Affiliation(s)
- Mikel Hurtado
- Department Forestry Sciences, NEIKER-BRTA, Instituto Vasco de Investigación y Desarrollo Agrario, Campus Agroalimentario de Arkaute, Ctra N-104 km 355, Arkaute, Álava, 01192, Spain; Department of Plant Biology and Ecology, Facultad de Ciencia y Tecnología, Universidad del País Vasco-Euskal Herriko Unibertsitatea (UPV/EHU), Barrio Sarriena s/n, Leioa, Bizkaia, 48940, Spain
| | - Sonia Suarez-Álvarez
- Department Plant Production. NEIKER-BRTA, Instituto Vasco de Investigación y Desarrollo Agrario, Campus Agroalimentario de Arkaute, Ctra N-104 km 355, Arkaute, Álava, 01192, Spain
| | - Ander Castander-Olarieta
- Department Forestry Sciences, NEIKER-BRTA, Instituto Vasco de Investigación y Desarrollo Agrario, Campus Agroalimentario de Arkaute, Ctra N-104 km 355, Arkaute, Álava, 01192, Spain
| | - Itziar A Montalbán
- Department Forestry Sciences, NEIKER-BRTA, Instituto Vasco de Investigación y Desarrollo Agrario, Campus Agroalimentario de Arkaute, Ctra N-104 km 355, Arkaute, Álava, 01192, Spain
| | - Pablo G Goicoechea
- Department Forestry Sciences, NEIKER-BRTA, Instituto Vasco de Investigación y Desarrollo Agrario, Campus Agroalimentario de Arkaute, Ctra N-104 km 355, Arkaute, Álava, 01192, Spain
| | - Unai López de Heredia
- GI en Desarrollo de Especies y Comunidades Leñosas (WooSP), Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Ciudad Universitaria s/n, 28040, Madrid, Spain
| | - Daniel Marino
- Department of Plant Biology and Ecology, Facultad de Ciencia y Tecnología, Universidad del País Vasco-Euskal Herriko Unibertsitatea (UPV/EHU), Barrio Sarriena s/n, Leioa, Bizkaia, 48940, Spain
| | - Paloma Moncaleán
- Department Forestry Sciences, NEIKER-BRTA, Instituto Vasco de Investigación y Desarrollo Agrario, Campus Agroalimentario de Arkaute, Ctra N-104 km 355, Arkaute, Álava, 01192, Spain.
2
|
Mora‐Márquez F, Nuño JC, Soto Á, López de Heredia U. Missing genotype imputation in non-model species using self-organizing maps. Mol Ecol Resour 2025; 25:e13992. [PMID: 38970328 PMCID: PMC11887599 DOI: 10.1111/1755-0998.13992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 05/30/2024] [Accepted: 06/26/2024] [Indexed: 07/08/2024]
Abstract
Current methodologies of genome-wide single-nucleotide polymorphism (SNP) genotyping produce large amounts of missing data that may affect statistical inference and bias the outcome of experiments. Genotype imputation is routinely used in well-studied species to buffer the impact on downstream analyses, and several algorithms are available to fill in missing genotypes. The lack of reference haplotype panels precludes the use of these methods in genomic studies on non-model organisms. As an alternative, machine learning algorithms are employed to explore the genotype data and to estimate the missing genotypes. Here, we propose an imputation method based on self-organizing maps (SOM), a widely used type of neural network formed by spatially distributed neurons that cluster similar inputs into nearby neurons. The method explores genotype datasets to select SNP loci, builds binary vectors from the genotypes, and initializes and trains a neural network for each query missing SNP genotype. The SOM-derived clustering is then used to impute the best genotype. To automate the imputation process, we have implemented gtImputation, an open-source application programmed in Python3 with a user-friendly GUI. The method's performance was validated by comparing its accuracy, precision and sensitivity on several benchmark genotype datasets with those of other available imputation algorithms. Our approach produced highly accurate and precise genotype imputations even for SNPs with alleles at low frequency and outperformed other algorithms, especially for datasets from mixed populations with unrelated individuals.
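The imputation idea summarized in this abstract can be illustrated with a minimal sketch. It is not the gtImputation implementation: the one-hot genotype encoding, the one-dimensional neuron layout, the decay schedules, the majority vote, and every function name below are assumptions chosen only to keep the example short and dependency-free.

```python
import math
import random
from collections import Counter

# Assumed encoding: a diploid genotype (0, 1 or 2 copies of the alternate
# allele) becomes a 3-bit one-hot vector. gtImputation may encode differently.
ONE_HOT = {0: [1, 0, 0], 1: [0, 1, 0], 2: [0, 0, 1]}


def encode(genotypes):
    """Concatenate the one-hot codes of a list of genotypes."""
    return [bit for g in genotypes for bit in ONE_HOT[g]]


def best_unit(weights, vector):
    """Index of the neuron whose weight vector is closest to `vector`."""
    def sq_dist(w):
        return sum((a - b) ** 2 for a, b in zip(w, vector))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(vectors, n_units=4, epochs=30, seed=0):
    """Train a tiny SOM whose neurons are arranged on a line."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        rate = 0.5 * decay                      # decaying learning rate
        sigma = max(n_units / 2 * decay, 0.3)   # shrinking neighbourhood width
        for x in vectors:
            c = best_unit(weights, x)
            for i, w in enumerate(weights):
                # Neurons near the winning unit are pulled harder toward x.
                h = math.exp(-((i - c) ** 2) / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += rate * h * (x[k] - w[k])
    return weights


def impute(train_predictors, train_targets, query_predictors, **som_args):
    """Impute a missing genotype by majority vote within the query's neuron."""
    vectors = [encode(p) for p in train_predictors]
    weights = train_som(vectors, **som_args)
    unit = best_unit(weights, encode(query_predictors))
    votes = Counter(t for v, t in zip(vectors, train_targets)
                    if best_unit(weights, v) == unit)
    if not votes:  # no training sample landed on this neuron
        votes = Counter(train_targets)
    return votes.most_common(1)[0][0]


# Toy data (invented): genotypes at three predictor SNPs per individual and
# the known genotype at the SNP that is missing in the query individual.
predictors = [[0, 0, 0], [0, 0, 1], [2, 2, 2], [2, 1, 2]]
targets = [0, 0, 2, 2]
print(impute(predictors, targets, [0, 0, 0]))
```

In this sketch a separate map is trained for each query genotype, mirroring the per-query training mentioned in the abstract; how the predictor SNP loci are selected is left out entirely.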
Affiliation(s)
- Fernando Mora‐Márquez
- GI en Especies Leñosas (WooSp), Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Ciudad Universitaria, Madrid, Spain
| | - Juan Carlos Nuño
- GI en Especies Leñosas (WooSp), Dpto. Matemática Aplicada, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Ciudad Universitaria, Madrid, Spain
| | - Álvaro Soto
- GI en Especies Leñosas (WooSp), Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Ciudad Universitaria, Madrid, Spain
| | - Unai López de Heredia
- GI en Especies Leñosas (WooSp), Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Ciudad Universitaria, Madrid, Spain
3
|
Mora-Márquez F, Hurtado M, López de Heredia U. gymnotoa-db: a database and application to optimize functional annotation in gymnosperms. Database (Oxford) 2025; 2025:baaf019. [PMID: 40052362 PMCID: PMC11886576 DOI: 10.1093/database/baaf019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2024] [Revised: 02/11/2025] [Accepted: 02/18/2025] [Indexed: 03/09/2025]
Abstract
Gymnosperms are a clade of non-flowering plants that includes about 1000 living species. Due to their complex genomes and the lack of genomic resources, functional annotation in gymnosperm genomics and transcriptomics remains limited. Here we present gymnotoa-db, a novel, publicly accessible relational database designed to facilitate functional annotation in gymnosperms. This database stores non-redundant records of gymnosperm proteins, encompassing taxonomic and functional information. The complementary software, gymnotoa-app, enables users to download gymnotoa-db and execute a comprehensive functional annotation pipeline for high-throughput sequencing-derived DNA or cDNA sequences. gymnotoa-app's user-friendly interface and efficient algorithms streamline the functional annotation process, making it an invaluable tool for researchers studying gymnosperms. We compared gymnotoa-app's performance against other annotation tools utilizing disparate reference databases. Our results demonstrate gymnotoa-app's superior ability to accurately annotate gymnosperm transcripts, recovering a greater number of transcripts and unique, non-redundant Gene Ontology terms. gymnotoa-db's distinctive features include comprehensive coverage through a non-redundant dataset of gymnosperm protein sequences and robust functional information that integrates data from multiple ontology systems, including GO, KEGG, EC, and MetaCyc, while keeping the taxonomic context, including Arabidopsis homologs. Database URL: https://blogs.upm.es/gymnotoa-db/2024/09/19/gymnotoa-app/.
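To make the notion of a relational store of non-redundant protein records with multi-ontology annotations more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, identifiers and sequences are all hypothetical and do not reflect the actual gymnotoa-db schema, which the abstract does not describe.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE protein (
            protein_id TEXT PRIMARY KEY,
            species    TEXT NOT NULL,          -- taxonomic context
            sequence   TEXT NOT NULL UNIQUE    -- UNIQUE keeps records non-redundant
        );
        CREATE TABLE annotation (
            protein_id TEXT NOT NULL REFERENCES protein(protein_id),
            ontology   TEXT NOT NULL,          -- e.g. GO, KEGG, EC, MetaCyc
            term       TEXT NOT NULL,
            PRIMARY KEY (protein_id, ontology, term)
        );
    """)
    return con


def add_protein(con, protein_id, species, sequence):
    """Insert a protein unless an identical sequence is already stored."""
    seen = con.execute(
        "SELECT 1 FROM protein WHERE sequence = ?", (sequence,)
    ).fetchone()
    if seen:
        return False
    con.execute(
        "INSERT INTO protein VALUES (?, ?, ?)", (protein_id, species, sequence)
    )
    return True


def add_annotation(con, protein_id, ontology, term):
    """Attach an ontology term to a protein; exact duplicates are ignored."""
    con.execute(
        "INSERT OR IGNORE INTO annotation VALUES (?, ?, ?)",
        (protein_id, ontology, term),
    )


def terms_for(con, protein_id, ontology):
    """Return the sorted terms of one ontology system for one protein."""
    rows = con.execute(
        "SELECT term FROM annotation WHERE protein_id = ? AND ontology = ? "
        "ORDER BY term",
        (protein_id, ontology),
    )
    return [row[0] for row in rows]


# Invented example records: the identifiers and sequences are placeholders.
con = build_demo_db()
add_protein(con, "P1", "Pinus radiata", "MAAA")
add_protein(con, "P2", "Pinus pinaster", "MAAA")  # same sequence, so skipped
add_annotation(con, "P1", "GO", "GO:0006950")
print(terms_for(con, "P1", "GO"))
```

Keeping annotations in a separate table keyed by ontology name is one simple way to let a single protein record carry terms from several ontology systems at once.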
Affiliation(s)
- Fernando Mora-Márquez
- GI en Desarrollo de Especies y Comunidades Leñosas (WooSP), Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, José Antonio Novais 10, Madrid 28040, Spain
| | - Mikel Hurtado
- GI en Desarrollo de Especies y Comunidades Leñosas (WooSP), Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, José Antonio Novais 10, Madrid 28040, Spain
| | - Unai López de Heredia
- GI en Desarrollo de Especies y Comunidades Leñosas (WooSP), Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, José Antonio Novais 10, Madrid 28040, Spain
4
|
Ko G, Kim PG, Yoon BH, Kim J, Song W, Byeon I, Yoon J, Lee B, Kim YK. Closha 2.0: a bio-workflow design system for massive genome data analysis on high performance cluster infrastructure. BMC Bioinformatics 2024; 25:353. [PMID: 39533201 PMCID: PMC11558834 DOI: 10.1186/s12859-024-05963-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024] Open
Abstract
BACKGROUND The explosive growth of next-generation sequencing (NGS) data has resulted in ultra-large-scale datasets and significant computational challenges. As the cost of NGS has decreased, the amount of genomic data has surged globally. However, the cost and complexity of the computational resources required continue to be substantial barriers to leveraging big data. A promising solution to these computational challenges is cloud computing, which provides researchers with the necessary CPUs, memory, storage, and software tools. RESULTS Here, we present Closha 2.0, a cloud computing service that offers a user-friendly platform for analyzing massive genomic datasets. Closha 2.0 is designed to provide a cloud-based environment that enables all genomic researchers, including those with limited or no programming experience, to easily analyze their genomic data. The new 2.0 version of Closha has more user-friendly features than the previous 1.0 version. First, the workbench features a script editor that supports Python, R, and shell script programming, enabling users to write scripts and integrate them into their pipelines. This functionality is particularly useful for downstream analysis. Second, Closha 2.0 runs on containers, which execute each tool in an independent environment. This provides a stable environment and prevents dependency issues and version conflicts among tools. Additionally, users can execute each step of a pipeline individually, allowing them to test applications at each stage and adjust parameters to achieve the desired results. We also updated a high-speed data transmission tool called GBox that facilitates the rapid transfer of large datasets. CONCLUSIONS The analysis pipelines on Closha 2.0 are reproducible, with all analysis parameters and inputs being permanently recorded. Closha 2.0 simplifies multi-step analysis with drag-and-drop functionality and provides a user-friendly interface for genomic scientists to obtain accurate results from NGS data. Closha 2.0 is freely available at https://www.kobic.re.kr/closha2.
Affiliation(s)
- Gunhwan Ko
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea
| | - Pan-Gyu Kim
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea
| | - Byung-Ha Yoon
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea
| | - JaeHee Kim
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea
| | - Wangho Song
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea
| | - IkSu Byeon
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea
| | - JongCheol Yoon
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea
| | - Byungwook Lee
- Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea.
| | - Young-Kuk Kim
- Department of Bio-AI Convergence, Chungnam National University, Daejeon, 34134, Korea.
5
|
Martínez-García M, Hernández-Lemus E. Data Integration Challenges for Machine Learning in Precision Medicine. Front Med (Lausanne) 2022; 8:784455. [PMID: 35145977 PMCID: PMC8821900 DOI: 10.3389/fmed.2021.784455] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 09/27/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022] Open
Abstract
A main goal of Precision Medicine is to incorporate and integrate the vast corpora held in different databases on the molecular and environmental origins of disease into analytic frameworks, allowing the development of individualized, context-dependent diagnostic and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at predicting personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize, and integrate large datasets combining structured and unstructured formats. This needs to be done while constrained by different levels of confidentiality, ideally within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful approaches to medical data analytics under the currently demanding performance conditions of personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we review some of these constraints and discuss possible avenues to overcome current challenges.
Affiliation(s)
- Mireya Martínez-García
- Clinical Research Division, National Institute of Cardiology ‘Ignacio Chávez’, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
6
|
La Ferlita A, Alaimo S, Di Bella S, Martorana E, Laliotis GI, Bertoni F, Cascione L, Tsichlis PN, Ferro A, Bosotti R, Pulvirenti A. RNAdetector: a free user-friendly stand-alone and cloud-based system for RNA-Seq data analysis. BMC Bioinformatics 2021; 22:298. [PMID: 34082707 PMCID: PMC8173825 DOI: 10.1186/s12859-021-04211-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/08/2020] [Accepted: 05/20/2021] [Indexed: 12/13/2022] Open
Abstract
Background RNA-Seq is a well-established technology extensively used for transcriptome profiling, allowing the analysis of coding and non-coding RNA molecules. However, this technology produces a vast amount of data requiring more sophisticated computational approaches for its analysis than other traditional technologies such as Real-Time PCR or microarrays, strongly discouraging non-expert users. For this reason, dozens of pipelines have been deployed for the analysis of RNA-Seq data. Although interesting, these present several limitations, and their usage requires a technical background, which may be uncommon in small research laboratories. Therefore, the application of these technologies in such contexts is still limited and causes a clear bottleneck in knowledge advancement. Results Motivated by these considerations, we have developed RNAdetector, a new free, cross-platform and user-friendly RNA-Seq data analysis software that can be used locally or in cloud environments through an easy-to-use Graphical User Interface, allowing the analysis of coding and non-coding RNAs from RNA-Seq datasets of any sequenced biological species. Conclusions RNAdetector is a new software tool that fills an essential gap between the needs of biomedical and research labs to process RNA-Seq data and their common lack of technical background in performing such analysis, which usually leads them to outsource such steps to third-party bioinformatics facilities or to use expensive commercial software. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04211-7.
Affiliation(s)
- Alessandro La Ferlita
- Department of Clinical and Experimental Medicine, Bioinformatics Unit, University of Catania, Catania, Italy; Department of Cancer Biology and Genetics, The Ohio State University, Columbus, OH, USA; Department of Physics and Astronomy, University of Catania, Catania, Italy
| | - Salvatore Alaimo
- Department of Clinical and Experimental Medicine, Bioinformatics Unit, University of Catania, Catania, Italy
| | | | - Emanuele Martorana
- Regional Referral Centre for Rare Lung Diseases, A. O. U. "Policlinico-Vittorio Emanuele", Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
| | - Georgios I Laliotis
- Department of Cancer Biology and Genetics, The Ohio State University, Columbus, OH, USA
| | | | | | - Philip N Tsichlis
- Department of Cancer Biology and Genetics, The Ohio State University, Columbus, OH, USA
| | - Alfredo Ferro
- Department of Clinical and Experimental Medicine, Bioinformatics Unit, University of Catania, Catania, Italy
| | | | - Alfredo Pulvirenti
- Department of Clinical and Experimental Medicine, Bioinformatics Unit, University of Catania, Catania, Italy.