1
Chen LP. Classification and prediction for multi-cancer data with ultrahigh-dimensional gene expressions. PLoS One 2022; 17:e0274440. [PMID: 36107929 PMCID: PMC9477337 DOI: 10.1371/journal.pone.0274440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Accepted: 08/28/2022] [Indexed: 11/29/2022] Open
Abstract
Analysis of gene expression data is an attractive topic in bioinformatics, and a typical application is to classify and predict individuals’ diseases or tumors by treating gene expression values as predictors. A primary challenge comes from ultrahigh dimensionality, which means that (i) many predictors in the dataset might be non-informative, and (ii) pairwise dependence structures may exist among the high-dimensional predictors, yielding a network structure. While many supervised learning methods have been developed, prediction performance is expected to suffer if the impacts of ultrahigh dimensionality are not carefully addressed. In this paper, we propose a new statistical learning algorithm for multi-class classification with ultrahigh-dimensional gene expression data. The proposed algorithm employs a model-free feature screening method to retain informative gene expression values from the ultrahigh-dimensional data, and then constructs predictive models that accommodate the network structures of the selected gene expressions. Unlike existing supervised learning methods that build predictive models on the entire dataset, our approach is able to identify informative predictors and dependence structures for gene expression. Through analysis of a real dataset, we find that the proposed algorithm gives precise classification as well as accurate prediction, and outperforms some commonly used supervised learning methods.
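The two-stage idea in this abstract (screen features marginally, then model only the retained ones) can be illustrated with a minimal sketch. The abstract does not specify the screening statistic, so the score below is an assumption: a simple per-feature ratio of between-class to within-class variation, standing in for the paper's model-free screening measure. The function names, the toy data, and `k` are hypothetical, and the network-structure modelling step is not reproduced.

```python
from statistics import mean


def screening_scores(X, y):
    """Score each feature by between-class over within-class variation.

    Assumption: this ratio is only a stand-in for a model-free
    screening statistic; it is not the measure used in the paper.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            centre = mean(values)
            between += len(values) * (centre - overall) ** 2
            within += sum((v - centre) ** 2 for v in values)
        # Small constant avoids division by zero for constant features.
        scores.append(between / (within + 1e-12))
    return scores


def screen_top_k(X, y, k):
    """Return the (sorted) indices of the k highest-scoring features."""
    scores = screening_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data: 6 samples, 3 features, 3 classes (all values hypothetical).
X = [
    [1.0, 5.0, 0.2],
    [1.1, 4.0, 0.9],
    [3.0, 5.5, 0.1],
    [3.2, 4.5, 0.8],
    [5.0, 5.0, 0.3],
    [5.1, 4.2, 0.7],
]
y = ["A", "A", "B", "B", "C", "C"]

kept = screen_top_k(X, y, k=1)
print(kept)
```

Only the first feature separates the three toy classes, so it is the one retained; a downstream classifier would then be fitted on the retained columns alone.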
Affiliation(s)
- Li-Pang Chen
- Department of Statistics, National Chengchi University, Taipei, Taiwan, ROC
2
Bajo-Morales J, Prieto-Prieto JC, Herrera LJ, Rojas I, Castillo-Secilla D. COVID-19 Biomarkers Recognition & Classification Using Intelligent Systems. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220328125029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
Abstract
Background:
SARS-CoV-2 has paralyzed mankind due to its high transmissibility and its associated mortality, causing millions of infections and deaths worldwide. The search for gene expression biomarkers from the host transcriptional response to infection may help understand the underlying mechanisms by which the virus causes COVID-19. This research proposes a smart methodology integrating different RNA-Seq datasets from SARS-CoV-2, other respiratory diseases, and healthy patients.
Methods:
The proposed pipeline exploits the functionality of the ‘KnowSeq’ R/Bioc package, integrating different data sources and attaining a significantly larger gene expression dataset, thus endowing the results with higher statistical significance and robustness in comparison with previous studies in the literature. A detailed preprocessing step was carried out to homogenize the samples and build a clinical decision system for SARS-CoV-2. It uses machine learning techniques such as a feature selection algorithm and a supervised classification system. This clinical decision system uses the most differentially expressed genes among different diseases (including SARS-CoV-2) to develop a four-class classifier.
Results:
The multiclass classifier designed can discern SARS-CoV-2 samples, reaching an accuracy equal to 91.5%, a mean F1-Score equal to 88.5%, and a SARS-CoV-2 AUC equal to 94% by using only 15 genes as predictors. A biological interpretation of the gene signature extracted reveals relations with processes involved in viral responses.
Conclusion:
This work proposes a COVID-19 gene signature composed of 15 genes, selected by applying the ‘minimum Redundancy Maximum Relevance’ feature selection algorithm. The integration of several RNA-Seq datasets was successful, allowing for a considerably larger number of samples and therefore providing greater statistical significance to the results than previous studies. A biological interpretation of the selected genes was also provided.
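The ‘minimum Redundancy Maximum Relevance’ selection named in this abstract can be sketched as a greedy loop: at each step, pick the gene whose relevance to the class label, minus its average redundancy with the genes already chosen, is largest. The sketch below is a generic illustration, not the KnowSeq implementation. It assumes expression values have already been discretised, uses plain mutual information for both terms, and all gene names and data are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    return sum(
        (c / n) * log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in count_ab.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR (difference form) over discretised features.

    features: dict mapping a gene name to its list of discrete values.
    Returns the names of the k selected genes in selection order.
    """
    relevance = {
        name: mutual_information(values, labels)
        for name, values in features.items()
    }
    selected = []
    remaining = sorted(features)  # sorted for deterministic tie-breaking
    while remaining and len(selected) < k:
        best_name, best_score = None, float("-inf")
        for name in remaining:
            redundancy = 0.0
            if selected:
                redundancy = sum(
                    mutual_information(features[name], features[s])
                    for s in selected
                ) / len(selected)
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        remaining.remove(best_name)
    return selected


# Hypothetical four-class toy problem with binary (low/high) expression.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "gene_a": [0, 0, 0, 0, 1, 1, 1, 1],       # separates {0,1} from {2,3}
    "gene_a_copy": [0, 0, 0, 0, 1, 1, 1, 1],  # redundant duplicate of gene_a
    "gene_b": [0, 0, 1, 1, 0, 0, 1, 1],       # separates {0,2} from {1,3}
}

selected = mrmr(features, labels, k=2)
print(selected)
```

All three toy genes are equally relevant on their own, but the duplicate adds nothing once `gene_a` is chosen, so the redundancy penalty steers the second pick to `gene_b`.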
Affiliation(s)
- Javier Bajo-Morales
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Juan Carlos Prieto-Prieto
- Nuclear Medicine Department, IMIBIC, University Hospital Reina Sofia, Menéndez Pidal Avenue, 14004, Córdoba, Spain
- Luis Javier Herrera
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Ignacio Rojas
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Daniel Castillo-Secilla
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
3
Castillo-Secilla D, Gálvez JM, Carrillo-Perez F, Verona-Almeida M, Redondo-Sánchez D, Ortuno FM, Herrera LJ, Rojas I. KnowSeq R-Bioc package: The automatic smart gene expression tool for retrieving relevant biological knowledge. Comput Biol Med 2021; 133:104387. [PMID: 33872966 DOI: 10.1016/j.compbiomed.2021.104387] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/24/2020] [Revised: 04/05/2021] [Accepted: 04/05/2021] [Indexed: 02/07/2023]
Abstract
The KnowSeq R/Bioc package is designed as a powerful, scalable and modular software package focused on automating and assembling renowned bioinformatic tools with new features and functionalities. It comprises a unified environment to perform complex gene expression analyses, covering all the processing steps needed to identify a gene signature for a specific disease and to gather understandable knowledge. This process may be initiated from raw files either available at well-known platforms or provided by the users themselves, and in either case coming from different information sources and different transcriptomic technologies. The pipeline makes use of a set of advanced algorithms, including the adaptation of a novel procedure for the selection of the most representative genes in a given multiclass problem. Similarly, an intelligent system able to classify new patients is embedded, giving the user the opportunity to choose among a number of well-known and widespread classification and feature selection methods in bioinformatics. Furthermore, KnowSeq is engineered to automatically produce a complete and detailed HTML report of the whole process, which is also modular and scalable. Biclass breast cancer and multiclass lung cancer case studies were addressed to rigorously assess the usability and efficiency of KnowSeq. The models built using the differentially expressed genes obtained from both experiments reach high classification rates. Furthermore, biological knowledge was extracted in terms of Gene Ontologies, pathways and related diseases with the aim of helping the expert in the decision-making process. KnowSeq is available at Bioconductor (https://bioconductor.org/packages/KnowSeq), GitHub (https://github.com/CasedUgr/KnowSeq) and Docker (https://hub.docker.com/r/casedugr/knowseq).
Affiliation(s)
- Daniel Castillo-Secilla
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero 2, 18014, Granada, Spain
- Juan Manuel Gálvez
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero 2, 18014, Granada, Spain
- Francisco Carrillo-Perez
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero 2, 18014, Granada, Spain
- Marta Verona-Almeida
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero 2, 18014, Granada, Spain
- Daniel Redondo-Sánchez
- Instituto de Investigación Biosanitaria de Granada, Non-Communicable Disease and Cancer Epidemiology Group, ibs.GRANADA, Avda. de Madrid, 15. Pabellón de Consultas Externas 2, 2a Planta, CP, 18012, Granada, Spain
- Francisco Manuel Ortuno
- Clinical Bioinformatics Area, Fundación Andaluza Progreso y Salud (FPS), Hospital Universitario Virgen del Rocío, Avenida Manuel Siurot s/n, 41013, Sevilla, Spain
- Luis Javier Herrera
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero 2, 18014, Granada, Spain
- Ignacio Rojas
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero 2, 18014, Granada, Spain
4
Wong HN, Lewies A, Haigh M, Viljoen JM, Wentzel JF, Haynes RK, du Plessis LH. Anti-Melanoma Activities of Artemisone and Prenylated Amino-Artemisinins in Combination With Known Anticancer Drugs. Front Pharmacol 2020; 11:558894. [PMID: 33117161 PMCID: PMC7552967 DOI: 10.3389/fphar.2020.558894] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/04/2020] [Accepted: 09/08/2020] [Indexed: 12/24/2022] Open
Abstract
The most frequently occurring cancers are those of the skin, with melanoma being the leading cause of death due to skin cancer. Breakthroughs in chemotherapy have been achieved in certain cases, though only marginal advances have been made in treatment of metastatic melanoma. Strategies aimed at inducing redox dysregulation by use of reactive oxygen species (ROS) inducers present a promising approach to cancer chemotherapy. Here we use a rational combination of an oxidant drug combined with a redox or pro-oxidant drug to optimize the cytotoxic effect. Thus we demonstrate for the first time enhanced activity of the amino-artemisinin artemisone and novel prenylated piperazine derivatives derived from dihydroartemisinin as the oxidant component, and elesclomol-Cu(II) as the redox component, against human malignant melanoma cells A375 in vitro. The combinations caused a dose dependent decrease in cell numbers and increase in apoptosis. The results indicate that oxidant-redox drug combinations have considerable potential and warrant further investigation.
Affiliation(s)
- Ho Ning Wong
- Centre of Excellence for Pharmaceutical Sciences (Pharmacen™), North-West University, Potchefstroom, South Africa
- Angélique Lewies
- Centre of Excellence for Pharmaceutical Sciences (Pharmacen™), North-West University, Potchefstroom, South Africa
- Michaela Haigh
- Centre of Excellence for Pharmaceutical Sciences (Pharmacen™), North-West University, Potchefstroom, South Africa
- Joe M Viljoen
- Centre of Excellence for Pharmaceutical Sciences (Pharmacen™), North-West University, Potchefstroom, South Africa
- Johannes F Wentzel
- Centre of Excellence for Pharmaceutical Sciences (Pharmacen™), North-West University, Potchefstroom, South Africa
- Richard K Haynes
- Centre of Excellence for Pharmaceutical Sciences (Pharmacen™), North-West University, Potchefstroom, South Africa
- Lissinda H du Plessis
- Centre of Excellence for Pharmaceutical Sciences (Pharmacen™), North-West University, Potchefstroom, South Africa
5
Van Eeckhout A, Garcia-Caurel E, Ossikovski R, Lizana A, Rodríguez C, González-Arnay E, Campos J. Depolarization metric spaces for biological tissues classification. J Biophotonics 2020; 13:e202000083. [PMID: 32406967 DOI: 10.1002/jbio.202000083] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/09/2020] [Revised: 05/01/2020] [Accepted: 05/08/2020] [Indexed: 05/02/2023]
Abstract
Classification of tissues is an important problem in biomedicine. An efficient tissue classification protocol allows, for instance, the guided recognition of structures in processed images, or discrimination between healthy and unhealthy regions (e.g., early detection of cancer). In this framework, we study the potential of some polarimetric metrics, the so-called depolarization spaces, for the classification of biological tissues. The analysis is performed using 120 biological ex vivo samples of three different tissue types. Based on this data collection, we provide for the first time a comparison among these depolarization spaces, as well as with the most commonly used depolarization metrics, in terms of discrimination of biological samples. The results illustrate how to determine the set of depolarization metrics that optimizes tissue classification efficiency. In that sense, the results show the value of the method, which is general and can be applied to study multiple types of biological samples, including human tissues. This can be useful, for instance, to improve and boost applications related to optical biopsy.
Affiliation(s)
- Albert Van Eeckhout
- Grup d'Òptica, Physics Department, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Enric Garcia-Caurel
- LPICM, CNRS, École Polytechnique, Université Paris-Saclay, Palaiseau, France
- Razvigor Ossikovski
- LPICM, CNRS, École Polytechnique, Université Paris-Saclay, Palaiseau, France
- Angel Lizana
- Grup d'Òptica, Physics Department, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Carla Rodríguez
- Grup d'Òptica, Physics Department, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Emilio González-Arnay
- Departamento de Anatomía, Histología y Neurociencia, Universidad Autónoma de Madrid, Madrid, Spain
- Servicio de Anatomía Patológica, Hospital Universitario de Canarias, Santa Cruz de Tenerife, Spain
- Juan Campos
- Grup d'Òptica, Physics Department, Universitat Autònoma de Barcelona, Bellaterra, Spain
6
Du Y, Kou P, Marraiki N, Elgorban A. Fucoxanthin modulates the development of 7, 12-dimethyl benz (a) anthracene-induced skin carcinogenesis in swiss albino mice in vivo. Pharmacogn Mag 2020. [DOI: 10.4103/pm.pm_292_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022] Open
7
Galvez JM, Castillo-Secilla D, Herrera LJ, Valenzuela O, Caba O, Prados JC, Ortuno FM, Rojas I. Towards Improving Skin Cancer Diagnosis by Integrating Microarray and RNA-Seq Datasets. IEEE J Biomed Health Inform 2019; 24:2119-2130. [PMID: 31871000 DOI: 10.1109/jbhi.2019.2953978] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
Many clinical studies have revealed the high biological similarities existing among different skin pathological states. These similarities create difficulties in the efficient diagnosis of skin cancer, and encourage the study and design of new intelligent clinical decision support systems. In this sense, gene expression analysis can help find differentially expressed genes (DEGs) that simultaneously discern multiple skin pathological states in a single test. The integration of multiple heterogeneous transcriptomic datasets requires different pipeline stages to be properly designed: from suitable batch merging and efficient biomarker selection to automated classification assessment. This article presents a novel approach addressing all these technical issues, with the intention of providing new insights into skin cancer diagnosis. Although further efforts will be needed in the search for better biomarkers recognizing specific skin pathological states, our study found a panel of 8 highly relevant multiclass DEGs for discerning up to 10 skin pathological states: 2 healthy skin conditions a priori, 2 cataloged precancerous skin diseases and 6 cancerous skin states. Their diagnostic power on new samples was widely tested with previously trained classification models. Robust performance metrics, namely the overall and mean multiclass F1-score, exceeded 94% and 80%, respectively. Clinicians should give special attention to the highlighted multiclass DEGs that show large gene expression changes, and understand their biological relationship to different skin pathological states.
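Multiclass performance figures such as those quoted in this abstract are commonly computed from per-class F1-scores. The sketch below shows one conventional way to obtain overall accuracy and the unweighted mean (macro) F1; whether the article's "overall" and "mean multiclass" F1 follow exactly these definitions is an assumption, and the class labels and predictions are hypothetical.

```python
def per_class_f1(y_true, y_pred):
    """Return a dict mapping each class label to its one-vs-rest F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical three-class example with one misclassified sample.
y_true = ["healthy", "healthy", "nevus", "nevus", "melanoma", "melanoma"]
y_pred = ["healthy", "healthy", "nevus", "melanoma", "melanoma", "melanoma"]

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Because the macro average weights every class equally, a single poorly recognised class pulls the mean F1 down more than it pulls down accuracy, which is one reason the two figures can differ noticeably in problems with many classes.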
8
Morais-Rodrigues F, Silv Erio-Machado R, Kato RB, Rodrigues DLN, Valdez-Baez J, Fonseca V, San EJ, Gomes LGR, Dos Santos RG, Vinicius Canário Viana M, da Cruz Ferraz Dutra J, Teixeira Dornelles Parise M, Parise D, Campos FF, de Souza SJ, Ortega JM, Barh D, Ghosh P, Azevedo VAC, Dos Santos MA. Analysis of the microarray gene expression for breast cancer progression after the application modified logistic regression. Gene 2019; 726:144168. [PMID: 31759986 DOI: 10.1016/j.gene.2019.144168] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/05/2019] [Revised: 09/21/2019] [Accepted: 10/11/2019] [Indexed: 01/02/2023]
Abstract
Methods based on statistics and linear algebra have been increasingly used to address emerging questions in the microarray literature. Microarray technology is a long-used tool in the global analysis of gene expression, allowing for the simultaneous investigation of hundreds or thousands of genes in a sample. The resulting data are characterized by a small sample size and a large number of features, which produces a non-square, rank-deficient matrix that can admit countless solutions in classifiers. To avoid the 'curse of dimensionality', many authors have performed feature selection or reduced the size of the data matrix. In this work, we introduce a new logistic regression-based model to classify breast cancer tumor samples based on microarray expression data, including all gene expression features and without reducing the microarray data matrix. If the user still deems it necessary to perform feature reduction, it can be done after the application of the methodology while still maintaining a good classification. This methodology allowed the correct classification of breast cancer sample data sets from Gene Expression Omnibus (GEO) data series GSE65194, GSE20711, and GSE25055, which contain the microarray data of said breast cancer samples. Classification had a minimum performance of 80% (sensitivity and specificity), and explored all possible data combinations, including breast cancer subtypes. This methodology highlighted genes not yet studied in breast cancer, some of which have been observed in Gene Regulatory Networks (GRNs). In this work we examine the patterns and features of a GRN composed of transcription factors (TFs) in MCF-7 breast cancer cell lines, providing valuable information regarding breast cancer. In particular, some genes have associated αi* parameters with extreme positive or negative values and, as such, can be identified as breast cancer prediction genes. We indicate that the PKN2, MKL1, MED23, CUL5 and GLI genes demonstrate a tumor suppressor profile, and that the MTR, ITGA2B, TELO2, MRPL9, MTTL1, WIPI1, KLHL20, PI4KB, FOLR1 and SHC1 genes demonstrate an oncogenic profile. We propose that these may serve as potential breast cancer prediction genes, and should be prioritized for further clinical studies on breast cancer. This new model allows for the assignment of values to the αi* parameters associated with gene expression. It was noted that some αi* parameters are associated with genes previously described as breast cancer biomarkers, as well as other genes not yet studied in relation to this disease.
Affiliation(s)
- Francielly Morais-Rodrigues
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Rita Silv Erio-Machado
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Rodrigo Bentes Kato
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Diego Lucas Neres Rodrigues
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Juan Valdez-Baez
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Vagner Fonseca
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil; KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Emmanuel James San
- KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Lucas Gabriel Rodrigues Gomes
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Roselane Gonçalves Dos Santos
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Marcus Vinicius Canário Viana
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil; Federal University of Pará, UFPA, Brazil
- Joyce da Cruz Ferraz Dutra
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Mariana Teixeira Dornelles Parise
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Doglas Parise
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Frederico F Campos
- Department of Computer Science, Federal University of Minas Gerais, Brazil Av Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- José Miguel Ortega
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Debmalya Barh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal 721172, India
- Preetam Ghosh
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Vasco A C Azevedo
- Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Marcos A Dos Santos
- Department of Computer Science, Federal University of Minas Gerais, Brazil Av Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
9
Liang R, Xie J, Zhang C, Zhang M, Huang H, Huo H, Cao X, Niu B. Identifying Cancer Targets Based on Machine Learning Methods via Chou's 5-steps Rule and General Pseudo Components. Curr Top Med Chem 2019; 19:2301-2317. [PMID: 31622219 DOI: 10.2174/1568026619666191016155543] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/14/2019] [Revised: 07/19/2019] [Accepted: 08/26/2019] [Indexed: 01/09/2023]
Abstract
In recent years, the successful implementation of the Human Genome Project has made people realize that genetic, environmental and lifestyle factors should be considered together when studying cancer, owing to the complexity and varied forms of the disease. The increasing availability and growth rate of 'big data' derived from various omics technologies opens a new window for the study and therapy of cancer. In this paper, we introduce the application of machine learning methods to cancer big data, including artificial neural networks, support vector machines, ensemble learning and naïve Bayes classifiers.
Affiliation(s)
- Ruirui Liang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Jiayang Xie
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Chi Zhang
- Foshan Huaxia Eye Hospital, Huaxia Eye Hospital Group, Foshan 528000, China
- Mengying Zhang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Hai Huang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Haizhong Huo
- Department of General Surgery, Shanghai Ninth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Xin Cao
- Zhongshan Hospital, Institute of Clinical Science, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Bing Niu
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
10
Leukemia multiclass assessment and classification from Microarray and RNA-seq technologies integration at gene expression level. PLoS One 2019; 14:e0212127. [PMID: 30753220 PMCID: PMC6372182 DOI: 10.1371/journal.pone.0212127] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 09/18/2018] [Accepted: 01/27/2019] [Indexed: 12/13/2022] Open
Abstract
In recent years, the number of available biological experiments has increased significantly due to the widespread use of massive sequencing data. Furthermore, continuous developments in machine learning and high-performance computing are allowing faster and more efficient analysis and processing of this type of data. However, biological information about a given disease is normally scattered, owing to the use of different sequencing technologies and different manufacturers in different experiments over the years and around the world. Thus, it is of paramount importance to attain a correct integration of biologically related data in order to achieve genuine benefits from them. For this purpose, this work presents an integration of multiple microarray and RNA-seq platforms, which has led to the design of a multiclass study by collecting samples from the four main types of leukemia, quantified at gene expression level. Subsequently, in order to find a set of differentially expressed genes with the highest discernment capability among different types of leukemia, an innovative parameter referred to as coverage is presented here. This parameter allows assessing the number of different pathologies that a certain gene is able to discern. It was evaluated together with other widely known parameters by means of an ANOVA statistical test, which corroborated its filtering power when the identified genes are subjected to a machine learning process at multiclass level. The optimal tuning of the evaluated gene extraction parameters by means of this statistical test led to the selection of 42 highly relevant expressed genes. Using the minimum-Redundancy Maximum-Relevance (mRMR) feature selection algorithm, these genes were reordered and assessed with four different classification techniques. Outstanding results were achieved by considering only the first ten genes of the ranking. Finally, specific literature was consulted on this last subset of genes, revealing that practically all of them are associated with biological processes related to leukemia. In light of these results, this study underlines the relevance of considering a new parameter that facilitates the identification of highly valid expressed genes for simultaneously discerning multiple types of leukemia.
11
Grafanaki K, Anastasakis D, Kyriakopoulos G, Skeparnias I, Georgiou S, Stathopoulos C. Translation regulation in skin cancer from a tRNA point of view. Epigenomics 2018; 11:215-245. [PMID: 30565492 DOI: 10.2217/epi-2018-0176] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 12/15/2022] Open
Abstract
Protein synthesis is a central and dynamic process, frequently deregulated in cancer through aberrant activation or expression of translation initiation factors and tRNAs. The discovery of tRNA-derived fragments, a new class of abundant and, in some cases, stress-induced small noncoding RNAs, has perplexed the epigenomics landscape and highlights the emerging regulatory role of tRNAs in translation and beyond. The skin is the largest organ of the human body; it maintains homeostasis of its multiple layers through regulatory networks that induce translational reprogramming and modulate tRNA transcription, modification and fragmentation in response to various stress signals, such as UV irradiation. In this review, we summarize recent knowledge on the role of translation regulation and tRNA biology in the alarming prevalence of skin cancer.
Affiliation(s)
- Katerina Grafanaki
- Department of Biochemistry, School of Medicine, University of Patras, 26504 Patras, Greece; Department of Dermatology, School of Medicine, University of Patras, 26504 Patras, Greece
- Dimitrios Anastasakis
- National Institute of Musculoskeletal & Arthritis & Skin, NIH, 50 South Drive, Room 1152, Bethesda, MD 20892, USA
- George Kyriakopoulos
- Department of Biochemistry, School of Medicine, University of Patras, 26504 Patras, Greece
- Ilias Skeparnias
- Department of Biochemistry, School of Medicine, University of Patras, 26504 Patras, Greece
- Sophia Georgiou
- Department of Dermatology, School of Medicine, University of Patras, 26504 Patras, Greece