1
Kervarrec T, Appenzeller S, Gramlich S, Coyaud E, Bachiri K, Appay R, Macagno N, Tallet A, Bonenfant C, Lecorre Y, Kapfer J, Kettani S, Srinivas N, Lei KC, Lange A, Becker JC, Sarosi EM, Sartelet H, von Deimling A, Touzé A, Guyétant S, Samimi M, Schrama D, Houben R. Analyses of combined Merkel cell carcinomas with neuroblastic components suggests that loss of T antigen expression in Merkel cell carcinoma may result in cell cycle arrest and neuroblastic transdifferentiation. J Pathol 2024; 264:112-124. [PMID: 39049595 DOI: 10.1002/path.6304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 02/28/2024] [Accepted: 05/08/2024] [Indexed: 07/27/2024]
Abstract
Merkel cell carcinoma (MCC) is an aggressive skin cancer frequently caused by genomic integration of the Merkel cell polyomavirus (MCPyV). MCPyV-negative cases often present as combined MCCs, which represent a distinctive subset of tumors characterized by association of an MCC with a second tumor component, mostly squamous cell carcinoma. Up to now, only exceptional cases of combined MCC with neuroblastic differentiation have been reported. Herein we describe two additional combined MCCs with neuroblastic differentiation and provide comprehensive morphologic, immunohistochemical, transcriptomic, genetic and epigenetic characterization of these tumors, which both arose in elderly men and appeared as an isolated inguinal adenopathy. Microscopic examination revealed biphasic tumors combining a poorly differentiated high-grade carcinoma with a poorly differentiated neuroblastic component lacking signs of proliferation. Immunohistochemical investigation revealed keratin 20 and MCPyV T antigen (TA) in the MCC parts, while neuroblastic differentiation was confirmed in the other component in both cases. A clonal relation of the two components can be deduced from 20 and 14 shared acquired point mutations detected by whole exome analysis in both combined tumors, respectively. Spatial transcriptomics demonstrated a lower expression of stem cell marker genes such as SOX2 and MCM2 in the neuroblastic component. Interestingly, although the neuroblastic part lacked TA expression, the same genomic MCPyV integration and the same large T-truncating mutations were observed in both tumor parts. Given that neuronal transdifferentiation upon TA repression has been reported for MCC cell lines, the most likely scenario for the two combined MCC/neuroblastic tumors is that neuroblastic transdifferentiation resulted from loss of TA expression in a subset of MCC cells. Indeed, DNA methylation profiling suggests an MCC-typical cellular origin for the combined MCC/neuroblastomas. 
© 2024 The Author(s). The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
MESH Headings
- Humans
- Carcinoma, Merkel Cell/pathology
- Carcinoma, Merkel Cell/virology
- Carcinoma, Merkel Cell/genetics
- Carcinoma, Merkel Cell/metabolism
- Male
- Skin Neoplasms/pathology
- Skin Neoplasms/genetics
- Skin Neoplasms/virology
- Skin Neoplasms/metabolism
- Antigens, Viral, Tumor/genetics
- Antigens, Viral, Tumor/metabolism
- Cell Transdifferentiation
- Merkel cell polyomavirus/genetics
- Cell Cycle Checkpoints/genetics
- Biomarkers, Tumor/genetics
- Biomarkers, Tumor/metabolism
- Aged, 80 and over
- Aged
- Neoplasms, Complex and Mixed/pathology
- Neoplasms, Complex and Mixed/genetics
- Neoplasms, Complex and Mixed/metabolism
- Neuroblastoma/pathology
- Neuroblastoma/genetics
- Neuroblastoma/metabolism
Affiliation(s)
- Thibault Kervarrec
- Department of Pathology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours, France
- "Biologie des infections à polyomavirus" team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- CARADERM Network
- Silke Appenzeller
- Comprehensive Cancer Center Mainfranken, University Hospital of Würzburg, Würzburg, Germany
- Susanne Gramlich
- Institute of Pathology, University of Würzburg, Würzburg, Germany
- Kamel Bachiri
- PRISM INSERM U1192, Université de Lille, Lille, France
- Romain Appay
- Department of Pathology, Université de Marseille, Assistance publique des Hopitaux de Marseille, Marseille, France
- Nicolas Macagno
- CARADERM Network
- Department of Pathology, Université de Marseille, Assistance publique des Hopitaux de Marseille, Marseille, France
- Anne Tallet
- Platform of Somatic Tumor Molecular Genetics, Centre Hospitalier Universitaire de Tours, Tours, France
- Christine Bonenfant
- Platform of Somatic Tumor Molecular Genetics, Centre Hospitalier Universitaire de Tours, Tours, France
- Yannick Lecorre
- Dermatology Department, LUNAM Université, CHU Angers, Angers, France
- Nalini Srinivas
- Department of Translational Skin Cancer Research and Dermatology, University Hospital Essen, Essen, Germany
- Kuan Cheok Lei
- Department of Translational Skin Cancer Research and Dermatology, University Hospital Essen, Essen, Germany
- German Cancer Consortium (DKTK), Partner Site Essen/Düsseldorf and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Anja Lange
- Bioinformatics & Computational Biophysics, University Duisburg-Essen, Essen, Germany
- Jürgen C Becker
- Department of Translational Skin Cancer Research and Dermatology, University Hospital Essen, Essen, Germany
- German Cancer Consortium (DKTK), Partner Site Essen/Düsseldorf and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Eva Maria Sarosi
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
- Hervé Sartelet
- Laboratoire de Biopathologie, CHRU de Nancy, Nancy, France
- INSERM U1256, Université de Lorraine, Nancy, France
- Andreas von Deimling
- Department of Neuropathology, Institute of Pathology, Ruprecht-Karls-University, Heidelberg, Germany
- Clinical Cooperation Unit Neuropathology, German Cancer Research Center (DKFZ), and German Cancer Consortium (DKTK), Heidelberg, Germany
- Antoine Touzé
- "Biologie des infections à polyomavirus" team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- Serge Guyétant
- Department of Pathology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours, France
- "Biologie des infections à polyomavirus" team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- Mahtab Samimi
- "Biologie des infections à polyomavirus" team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- CARADERM Network
- Department of Dermatology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours, France
- David Schrama
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
- Roland Houben
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
2
Obraitis D, Li D. Blood virome research in myalgic encephalomyelitis/chronic fatigue syndrome: challenges and opportunities. Curr Opin Virol 2024; 68-69:101437. [PMID: 39537445 PMCID: PMC11795702 DOI: 10.1016/j.coviro.2024.101437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 08/22/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]
Abstract
Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a debilitating disease with a complex clinical presentation and an unknown etiology. Various viral infections have been proposed as potential triggers of ME/CFS onset, but no specific pathogen has been identified in all cases of postinfectious ME/CFS. The symptomatology of the postacute sequelae of SARS-CoV-2, or long COVID, mirrors that of ME/CFS, with nearly half of long COVID patients meeting ME/CFS diagnostic criteria. The influx of newly diagnosed patients has reinvigorated interest in ME/CFS pathogenesis research, with an emphasis on viral triggers. This review summarizes the current understanding of ME/CFS research on viral triggers, including blood virome screening studies. To further elucidate the molecular basis of ME/CFS, there is a need to develop innovative bioinformatics tools capable of analyzing complex virome data and integrating multiomics information.
Affiliation(s)
- Dominic Obraitis
- Department of Immunology and Molecular Microbiology, Texas Tech University Health Sciences Center, Lubbock, TX 79430, USA; Neuroscience and Behavior Program, Florida Atlantic University, Boca Raton, FL 33431, USA
- Dawei Li
- Department of Immunology and Molecular Microbiology, Texas Tech University Health Sciences Center, Lubbock, TX 79430, USA.
3
Qiao L, Li C, Lin W, He X, Mi J, Tong Y, Gao J. ViroISDC: a method for calling integration sites of hepatitis B virus based on feature encoding. BMC Bioinformatics 2024; 25:177. [PMID: 38704528 PMCID: PMC11070082 DOI: 10.1186/s12859-024-05763-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 03/26/2024] [Indexed: 05/06/2024] Open
Abstract
BACKGROUND: Hepatitis B virus (HBV) integrates into human chromosomes and can lead to genomic instability and hepatocarcinogenesis. Current tools for HBV integration site detection lack accuracy and stability.
RESULTS: This study proposes a deep learning-based method, named ViroISDC, for detecting integration sites. ViroISDC generates corresponding grammar rules and encodes the characteristics of the language data to predict integration sites accurately. Compared with Lumpy, Pindel, Seeksv, and SurVirus, ViroISDC exhibits better overall performance and is less sensitive to sequencing depth and integration sequence length, displaying good reliability, stability, and generality. Further downstream analysis of integrated sites detected by ViroISDC reveals the integration patterns and features of HBV. It is observed that HBV integration exhibits specific chromosomal preferences and tends to integrate into cancerous tissue. Moreover, HBV integration frequency was higher in males than females, and high-frequency integration sites were more likely to be present on hepatocarcinogenesis- and anti-cancer-related genes, validating the reliability of ViroISDC.
CONCLUSIONS: The ViroISDC pipeline exhibits superior precision, stability, and reliability across various datasets when compared to similar software. It is invaluable in exploring HBV infection in the human body, holding significant implications for the diagnosis, treatment, and prognosis assessment of HCC.
Affiliation(s)
- Lei Qiao
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Chang Li
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Wei Lin
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Xiaoqi He
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Jia Mi
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Yigang Tong
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China.
- Jingyang Gao
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China.
4
Kervarrec T, Appenzeller S, Tallet A, Jullie ML, Sohier P, Guillonneau F, Rütten A, Berthon P, Le Corre Y, Hainaut-Wierzbicka E, Blom A, Beneton N, Bens G, Nardin C, Aubin F, Dinulescu M, Visée S, Herfs M, Touzé A, Guyétant S, Samimi M, Houben R, Schrama D. Detection of wildtype Merkel cell polyomavirus genomic sequence and VP1 transcription in a subset of Merkel cell carcinoma. Histopathology 2024; 84:356-368. [PMID: 37830288 DOI: 10.1111/his.15068] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/20/2023] [Revised: 09/16/2023] [Accepted: 09/24/2023] [Indexed: 10/14/2023]
Abstract
AIMS: Merkel cell carcinoma (MCC) is frequently caused by the Merkel cell polyomavirus (MCPyV). Characteristic for these virus-positive (VP) MCC is MCPyV integration into the host genome and truncation of the viral oncogene Large T antigen (LT), with full-length LT expression considered as incompatible with MCC growth. Genetic analysis of a VP-MCC/trichoblastoma combined tumour demonstrated that virus-driven MCC can arise from an epithelial cell. Here we describe two further cases of VP-MCC combined with an adnexal tumour, i.e. one trichoblastoma and one poroma.
METHODS AND RESULTS: Whole-genome sequencing of MCC/trichoblastoma again provided evidence of a trichoblastoma-derived MCC. Although an MCC-typical LT-truncating mutation was detected, we could not determine an integration site and we additionally detected a wildtype sequence encoding full-length LT. Similarly, Sanger sequencing of the combined MCC/poroma revealed coding sequences for both truncated and full-length LT. Moreover, in situ RNA hybridization demonstrated expression of a late region mRNA encoding the viral capsid protein VP1 in both combined tumours as well as in a few cases of pure MCC.
CONCLUSION: The data presented here suggest the presence of wildtype MCPyV genomes and VP1 transcription in a subset of MCC.
Affiliation(s)
- Thibault Kervarrec
- Department of Pathology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours, France
- "Biologie des Infections à Polyomavirus" Team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- Silke Appenzeller
- Comprehensive Cancer Center Mainfranken, University Hospital of Würzburg, Würzburg, Germany
- Anne Tallet
- Platform of Somatic Tumor Molecular Genetics, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours, France
- Marie-Laure Jullie
- Department of Pathology, Hôpital Haut-Lévêque, CHU de Bordeaux, CARADERM Network, Pessac, France
- Pierre Sohier
- Faculté de Médecine, Université Paris Cité, Paris, France
- Department of Pathology, Hôpital Cochin, AP-HP.Centre-Université Paris Cité, Paris, France
- Francois Guillonneau
- 3P5 Proteomics, Hôpital Cochin, AP-HP, Centre-Université Paris Cité, Paris, France
- Patricia Berthon
- "Biologie des Infections à Polyomavirus" Team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- Yannick Le Corre
- Dermatology Department, LUNAM Université, CHU Angers, Angers, France
- Astrid Blom
- Department of General and Oncologic Dermatology, CARADERM Network Ambroise-Paré hospital, APHP & Research Unit EA 4340, University of Versailles-Saint-Quentin-en-Yvelines, Paris-Saclay University, Boulogne-Billancourt, France
- Guido Bens
- Dermatology Department, CHR d'Orléans, Orléans, France
- Dermatology Department, CH de Blois, Blois, France
- Charline Nardin
- Dermatology Department, Inserm 1098, Université de Franche Comté, CHU Besançon, Besançon, France
- Francois Aubin
- Dermatology Department, Inserm 1098, Université de Franche Comté, CHU Besançon, Besançon, France
- Monica Dinulescu
- Dermatology Department, CHR Rennes, Rennes, France
- Institut Dermatologique du Grand Ouest (IDGO), Rennes, France
- Sebastien Visée
- Department of Pathology, Centre Hospitalier d'Angoulème, Angoulème, France
- Michael Herfs
- Laboratory of Experimental Pathology, GIGA-Cancer, University of Liège, Liège, Belgium
- Antoine Touzé
- "Biologie des Infections à Polyomavirus" Team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- Serge Guyétant
- Department of Pathology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours, France
- "Biologie des Infections à Polyomavirus" Team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- Mahtab Samimi
- "Biologie des Infections à Polyomavirus" Team, UMR INRAE ISP 1282, Université de Tours, Tours, France
- Department of Dermatology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours, France
- Roland Houben
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
- David Schrama
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
5
Lee CC, Ye R, Tubbs JD, Baum L, Zhong Y, Leung SYJ, Chan SC, Wu KYK, Cheng PKJ, Chow LP, Leung PWL, Sham PC. Third-generation genome sequencing implicates medium-sized structural variants in chronic schizophrenia. Front Neurosci 2023; 16:1058359. [PMID: 36711134 PMCID: PMC9874699 DOI: 10.3389/fnins.2022.1058359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 12/14/2022] [Indexed: 01/13/2023] Open
Abstract
Background: Schizophrenia (SCZ) is a heterogeneous psychiatric disorder, with significant contribution from genetic factors particularly for chronic cases with negative symptoms and cognitive deficits. To date, Genome Wide Association Studies (GWAS) and exome sequencing have associated SCZ with a number of single nucleotide polymorphisms (SNPs) and copy number variants (CNVs), but there is still missing heritability. Medium-sized structural variants (SVs) are difficult to detect using SNP arrays or second generation sequencing, and may account for part of the missing heritability of SCZ.
Aims and objectives: To identify SVs associated with severe chronic SCZ across the whole genome.
Study design: 10 multiplex families with probands suffering from chronic SCZ with negative symptoms and cognitive deficits were recruited, with all their affected members demonstrating uni-lineal inheritance. Control subjects comprised one affected member from the affected lineage, and unaffected members from each paternal and maternal lineage.
Methods: Third generation sequencing was applied to peripheral blood samples from 10 probands and 5 unaffected controls. Bioinformatic tools were used to identify SVs from the long sequencing reads, with confirmation of findings in probands by short-read Illumina sequencing, Sanger sequencing and visual manual validation with Integrated Genome Browser.
Results: In the 10 probands, we identified and validated 88 SVs (mostly in introns and medium-sized), within 79 genes, which were absent in the 5 unaffected control subjects. These 79 genes were enriched in 20 biological pathways which were related to brain development, neuronal migration, neurogenesis, neuronal/synaptic function, learning/memory, and hearing. These identified SVs also showed evidence for enrichment of genes that are highly expressed in the adolescent striatum.
Conclusion: A substantial part of the missing heritability in SCZ may be explained by medium-sized SVs detectable only by third generation sequencing. We have identified a number of such SVs potentially conferring risk for SCZ, which implicate multiple brain-related genes and pathways. In addition to previously-identified pathways involved in SCZ such as neurodevelopment and neuronal/synaptic functioning, we also found novel evidence for enrichment in hearing-related pathways and genes expressed in the adolescent striatum.
Affiliation(s)
- Chi Chiu Lee
- Department of Psychiatry, Kwai Chung Hospital, Hong Kong, Hong Kong SAR, China (corresponding author)
- Rui Ye
- Department of Psychiatry, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
- Justin D. Tubbs
- Department of Psychiatry, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
- Larry Baum
- Department of Psychiatry, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
- State Key Laboratory of Brain and Cognitive Sciences, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
- Yuanxin Zhong
- Department of Psychiatry, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
- Shuk Yan Joey Leung
- Department of Psychiatry, Kwai Chung Hospital, Hong Kong, Hong Kong SAR, China
- Sheung Chun Chan
- Department of Psychiatry, Tai Po Hospital, Hong Kong, Hong Kong SAR, China
- Kit Ying Kitty Wu
- Kowloon West Cluster, Hospital Authority, Hong Kong, Hong Kong SAR, China
- Po Kwan Jamie Cheng
- Department of Clinical Psychology, Yan Chai Hospital, Hong Kong, Hong Kong SAR, China
- Lai Ping Chow
- Department of Psychiatry, Kwai Chung Hospital, Hong Kong, Hong Kong SAR, China
- Patrick W. L. Leung
- Department of Psychology, The Chinese University of Hong Kong, Hong Kong, Hong Kong SAR, China
- Pak Chung Sham
- Department of Psychiatry, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
- State Key Laboratory of Brain and Cognitive Sciences, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
- Centre for PanorOmic Sciences, The University of Hong Kong, Hong Kong, Hong Kong SAR, China
6
Feng H, Xiang Y, Wang X, Xue W, Yue Z. MTAGCN: predicting miRNA-target associations in Camellia sinensis var. assamica through graph convolution neural network. BMC Bioinformatics 2022; 23:271. [PMID: 35820798 PMCID: PMC9275082 DOI: 10.1186/s12859-022-04819-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Accepted: 07/01/2022] [Indexed: 11/10/2022] Open
Abstract
Background: MicroRNAs (miRNAs) play a central role in diverse biological processes of Camellia sinensis var. assamica (CSA) through their associations with target mRNAs, including CSA growth, development and stress response. However, although the experimental methods of CSA miRNA-target identification are costly and time-consuming, few computational methods have been developed to tackle the CSA miRNA-target association prediction problem.
Results: In this paper, we constructed a heterogeneous network for CSA miRNA and targets by integrating rich biological information, including a miRNA similarity network, a target similarity network, and a miRNA-target association network. We then proposed a deep learning framework of graph convolution networks with layer attention mechanism, named MTAGCN. In particular, MTAGCN uses the attention mechanism to combine embeddings of multiple graph convolution layers, employing the integrated embedding to score the unobserved CSA miRNA-target associations.
Discussion: Comprehensive experiment results on two tasks (balanced task and unbalanced task) demonstrated that our proposed model achieved better performance than the classic machine learning and existing graph convolution network-based methods. The analysis of these results could offer valuable information for understanding complex CSA miRNA-target association mechanisms and would make a contribution to precision plant breeding.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04819-3.
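The layer-attention step mentioned in this abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the MTAGCN implementation: the softmax-normalised attention weights, the sigmoid of a dot product used as the association score, the toy embedding values, and the hypothetical names `combine_layers` and `score_pair`.

```python
import math


def softmax(logits):
    """Normalise raw attention logits into weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layers(layer_embeddings, attention_logits):
    """Attention-weighted sum of one node's embeddings from several layers."""
    weights = softmax(attention_logits)
    dim = len(layer_embeddings[0])
    return [
        sum(w * emb[i] for w, emb in zip(weights, layer_embeddings))
        for i in range(dim)
    ]


def score_pair(mirna_vec, target_vec):
    """Assumed scoring rule: sigmoid of the dot product of two embeddings."""
    dot = sum(a * b for a, b in zip(mirna_vec, target_vec))
    return 1.0 / (1.0 + math.exp(-dot))


# Toy embeddings from two graph convolution layers (made-up numbers).
mirna_layers = [[1.0, 0.0], [0.0, 1.0]]
target_layers = [[0.5, 0.5], [0.5, 0.5]]
logits = [0.0, 0.0]  # equal logits give equal weight to both layers

mirna = combine_layers(mirna_layers, logits)
target = combine_layers(target_layers, logits)
score = score_pair(mirna, target)
print(mirna, target, score)
```

In a trained model the attention logits would be learned parameters rather than fixed values; they are fixed here only to keep the sketch self-contained.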
Affiliation(s)
- Haisong Feng
- School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Ying Xiang
- School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Xiaosong Wang
- School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Wei Xue
- School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Zhenyu Yue
- School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China.
7
Chau JFT, Yu MHC, Chui MMC, Yeung CCW, Kwok AWC, Zhuang X, Lee R, Fung JLF, Lee M, Mak CCY, Ng NYT, Chung CCY, Chan MCY, Tsang MHY, Chan JCK, Chan KYK, Kan ASY, Chung PHY, Yang W, Lee SL, Chan GCF, Tam PKH, Lau YL, Yeung KS, Chung BHY, Tang CSM. Comprehensive analysis of recessive carrier status using exome and genome sequencing data in 1543 Southern Chinese. NPJ Genom Med 2022; 7:23. [PMID: 35314707 PMCID: PMC8938515 DOI: 10.1038/s41525-022-00287-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/06/2021] [Accepted: 01/21/2022] [Indexed: 12/31/2022] Open
Abstract
Traditional carrier screening has been utilized for the detection of carriers of genetic disorders. Since a comprehensive assessment of the carrier frequencies of recessive conditions in the Southern Chinese population is not yet available, we performed a secondary analysis on the spectrum and carrier status for 315 genes causing autosomal recessive disorders in 1543 Southern Chinese individuals with next-generation sequencing data, 1116 with exome sequencing and 427 with genome sequencing data. Our data revealed that 1 in 2 people (47.8% of the population) was a carrier for one or more recessive conditions, and 1 in 12 individuals (8.30% of the population) was a carrier for treatable inherited conditions. In alignment with current American College of Obstetricians and Gynecologists (ACOG) pan-ethnic carrier recommendations, 1 in 26 individuals were identified as carriers of cystic fibrosis, thalassemia, and spinal muscular atrophy in the Southern Chinese population. When the >1% expanded carrier screening rate recommendation by ACOG was used, 11 diseases were found to meet the criteria in the Southern Chinese population. Approximately 1 in 3 individuals (35.5% of the population) were carriers of these 11 conditions. If the 1 in 200 carrier frequency threshold is used, an additional seven genes would meet the criteria, and 2 in 5 individuals (38.7% of the population) would be detected as a carrier. This study provides a comprehensive catalogue of the carrier spectrum and frequency in the Southern Chinese population and can serve as a reference for careful evaluation of the conditions to be included in expanded carrier screening for Southern Chinese people.
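The abstract reports carrier rates both as percentages and as "1 in N" ratios. A minimal sketch of that conversion follows, using only the percentages quoted above; the function name `one_in_n` and the dictionary labels are hypothetical and do not come from the study.

```python
def one_in_n(percent: float) -> float:
    """Convert a carrier frequency in percent to the N of '1 in N'."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return 100.0 / percent


# Percentages quoted in the abstract; the labels are illustrative shorthand.
reported = {
    "any recessive condition": 47.8,
    "treatable inherited condition": 8.30,
    "11 conditions above the 1% threshold": 35.5,
    "conditions above the 1-in-200 threshold": 38.7,
}

ratios = {label: round(one_in_n(pct), 1) for label, pct in reported.items()}
for label, n in ratios.items():
    print(f"{label}: about 1 in {n}")
```

The "1 in 2", "1 in 12", "1 in 3" and "2 in 5" figures in the abstract are coarser roundings of these values.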
Affiliation(s)
- Jeffrey Fong Ting Chau
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Mullin Ho Chung Yu
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Martin Man Chun Chui
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Cyrus Chun Wing Yeung
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Aaron Wing Cheung Kwok
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Xuehan Zhuang
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Ryan Lee
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Jasmine Lee Fong Fung
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Mianne Lee
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Christopher Chun Yu Mak
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Nicole Ying Ting Ng
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Claudia Ching Yan Chung
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Marcus Chun Yin Chan
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Mandy Ho Yin Tsang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Joshua Chun Ki Chan
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Kelvin Yuen Kwong Chan
- Prenatal Diagnostic Laboratory, Department of Obstetrics and Gynaecology, Tsan Yuk Hospital, Hong Kong SAR, China
- Anita Sik Yau Kan
- Prenatal Diagnostic Laboratory, Department of Obstetrics and Gynaecology, Tsan Yuk Hospital, Hong Kong SAR, China
- Patrick Ho Yu Chung
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Wanling Yang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- So Lun Lee
- Department of Paediatrics and Adolescent Medicine, Duchess of Kent Children's Hospital, Hong Kong SAR, China
- Godfrey Chi Fung Chan
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Paul Kwong Hang Tam
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Li Dak-Sum Research Centre, The University of Hong Kong-Karolinska Institute Collaboration in Regenerative Medicine, Hong Kong SAR, China
- Yu Lung Lau
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Kit San Yeung
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.
- Brian Hon Yin Chung
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.
- Clara Sze Man Tang
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.
- Li Dak-Sum Research Centre, The University of Hong Kong-Karolinska Institute Collaboration in Regenerative Medicine, Hong Kong SAR, China.
8
Scott S, Hallwirth CV, Hartkopf F, Grigson S, Jain Y, Alexander IE, Bauer DC, Wilson LOW. Isling: a tool for detecting integration of wild-type viruses and clinical vectors. J Mol Biol 2021; 434:167408. [PMID: 34929203 DOI: 10.1016/j.jmb.2021.167408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 12/09/2021] [Accepted: 12/13/2021] [Indexed: 10/19/2022]
Abstract
Detecting viral and vector integration events is a key step when investigating interactions between viral and host genomes. This is relevant in several fields, including virology, cancer research and gene therapy. For example, investigating integrations of wild-type viruses such as human papillomavirus and hepatitis B virus has proven to be crucial for understanding the role of these integrations in cancer. Furthermore, identifying the extent of vector integration is vital for determining the potential for genotoxicity in gene therapies. To address these questions, we developed isling, the first tool specifically designed for identifying integrations of both wild-type viruses and vectors from next-generation sequencing data. Isling addresses complexities in integration behaviour including integration of fragmented genomes and integration junctions with ambiguous locations in a host or vector genome, and can also flag possible vector recombinations. We show that isling is up to 1.6-fold faster and up to 170% more accurate than other viral integration tools, and performs well on both simulated and real datasets. Isling is therefore an efficient and application-agnostic tool that will enable a broad range of investigations into viral and vector integration. These include comparisons between integrations of wild-type viruses and gene therapy vectors, as well as assessing the genotoxicity of vectors and understanding the role of viruses in cancer.
Affiliation(s)
- Suzanne Scott
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, North Ryde, Australia; Gene Therapy Research Unit, Children's Medical Research Institute, Westmead, Australia; The Sydney Children's Hospitals Network, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Claus V Hallwirth
- Gene Therapy Research Unit, Children's Medical Research Institute, Westmead, Australia; The Sydney Children's Hospitals Network, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Felix Hartkopf
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Susanna Grigson
- College of Science and Engineering, Flinders University, Adelaide, Australia
- Yatish Jain
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, North Ryde, Australia
- Ian E Alexander
- Gene Therapy Research Unit, Children's Medical Research Institute, Westmead, Australia; The Sydney Children's Hospitals Network, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia; Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
- Denis C Bauer
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, North Ryde, Australia; Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia; Macquarie University, Department of Biomedical Sciences, Faculty of Medicine and Health Science, Macquarie Park, Australia
- Laurence O W Wilson
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, North Ryde, Australia; Macquarie University, Applied BioSciences, Faculty of Science and Engineering, Macquarie Park, Australia
|
9
|
Genomic Landscapes of Epstein-Barr Virus in Pulmonary Lymphoepithelioma-like Carcinoma. J Virol 2021; 96:e0169321. [PMID: 34908446 DOI: 10.1128/jvi.01693-21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
Abstract
Epstein-Barr virus (EBV) infection is associated with multiple malignancies, including pulmonary lymphoepithelioma-like carcinoma (pLELC), a particular subtype of primary lung cancer. However, the genomic characteristics of EBV related to pLELC remain unclear. Here, we obtained the whole-genome dataset of EBV isolated from 78 pLELC patients and 37 healthy controls using EBV-captured sequencing. Compared to the reference genome (NC_007605), a total of 3995 variations were detected across pLELC-derived EBV sequences, with the mutational hotspots located in latent genes. Combined with 180 published EBV sequences derived from healthy people in Southern China, we performed a genome-wide association study and identified 32 variations significantly related to pLELC (p < 2.56×10⁻⁵, Bonferroni correction), with the top signal, SNP T7327C (OR = 1.22, p = 2.39×10⁻¹⁵), located in the origin of plasmid replication (OriP). Population structure analysis of EBV isolates in East Asia showed that the EBV strains derived from pLELC were more similar to those from nasopharyngeal carcinoma (NPC) than to those from other EBV-associated diseases. In addition, typical latency type-II infection was recognized for EBV of pLELC at both transcription and methylation levels. Taken together, we defined the global view of EBV genomic profiles in pLELC patients for the first time, providing new insights that deepen our understanding of this rare EBV-associated primary lung carcinoma. Importance: Pulmonary lymphoepithelioma-like carcinoma (pLELC) is a rare, distinctive subtype of primary lung cancer closely associated with Epstein-Barr virus (EBV) infection. Here, we gave the first overview of pLELC-derived EBV at the level of genome, methylation and transcription. We obtained the EBV sequence dataset from 78 primary pLELC patients, revealed the sequence diversity across the EBV genome, and detected variability in known immune epitopes. Genome-wide association analysis combining 217 healthy controls identified significant variations related to the risk of pLELC. Meanwhile, we characterized the integration landscapes of EBV at the genome-wide level. These results provide new insight for understanding EBV's role in pLELC tumorigenesis.
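The Bonferroni-corrected threshold quoted in this abstract follows the usual rule of dividing the family-wise error rate by the number of tests. The sketch below is only an illustration of that arithmetic: the abstract states neither the error rate nor the number of tests, so the value α = 0.05 and the helper names `bonferroni_threshold` and `implied_n_tests` are assumptions introduced here, not details from the paper.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def implied_n_tests(alpha: float, threshold: float) -> float:
    """Back-calculate how many tests a given Bonferroni threshold implies."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Assumption: a conventional family-wise error rate; not stated in the abstract.
ALPHA = 0.05
# Threshold as reported in the abstract.
REPORTED_THRESHOLD = 2.56e-5

if __name__ == "__main__":
    n = implied_n_tests(ALPHA, REPORTED_THRESHOLD)
    print(f"Implied number of tests at alpha={ALPHA}: {n:.1f}")
    print(f"Threshold for 1000 tests: {bonferroni_threshold(ALPHA, 1000):.2e}")
```

Under the assumed α of 0.05, the reported threshold would correspond to roughly 1950 independent tests; a different α would imply a different count, so this figure should not be read as the study's actual number of tested variants.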
|
10
|
Jia W, Xu C, Li SC. Resolving complex structures at oncovirus integration loci with conjugate graph. Brief Bioinform 2021; 22:6359003. [PMID: 34463709 DOI: 10.1093/bib/bbab359] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/03/2021] [Revised: 08/10/2021] [Accepted: 08/12/2021] [Indexed: 01/10/2023] Open
Abstract
Oncovirus integrations cause copy number variations and complex structural variations (SVs) on host genomes. However, the understanding of how inserted viral DNA impacts the local genome remains limited. The linear structure of the oncovirus integrated local genomic map (LGM) will lay the foundations to understand how oncovirus integrations emerge and compromise the host genome's functioning. We propose a conjugate graph model to reconstruct the rearranged LGM at integrated loci. Simulation tests prove the reliability and credibility of the algorithm. Applications of the algorithm to whole-genome sequencing data of human papillomavirus (HPV) and hepatitis B virus (HBV)-infected cancer samples yielded biological insights into oncovirus integrations. We observed four patterns by which oncovirus integrations affect the host genome in the HPV- and HBV-integrated cancer samples: coding-frame truncation, hyper-amplification of a tumor gene, and viral cis-regulation inserted at a single intron or at an intergenic region. We found that focal duplications and host SVs are frequent in the HPV-integrated LGMs, while focal deletions are prevalent in HBV-integrated LGMs. Furthermore, from the results yielded by our method, we found that enhanced microhomology-mediated end joining might lead to both HPV and HBV integrations and conjectured that HPV integrations might mainly occur during the DNA replication process. The conjugate graph algorithm code and LGM construction pipeline are available at https://github.com/deepomicslab/FuseSV.
Affiliation(s)
- Wenlong Jia
- Department of Computer Science, City University of Hong Kong, Hong Kong
- Chang Xu
- Department of Computer Science, City University of Hong Kong, Hong Kong
- Shuai Cheng Li
- Department of Computer Science, City University of Hong Kong, Hong Kong
|
11
|
Yuan Y, Bayer PE, Batley J, Edwards D. Current status of structural variation studies in plants. Plant Biotechnol J 2021; 19:2153-2163. [PMID: 34101329 PMCID: PMC8541774 DOI: 10.1111/pbi.13646] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Received: 02/18/2021] [Revised: 05/31/2021] [Accepted: 06/03/2021] [Indexed: 05/23/2023]
Abstract
Structural variations (SVs) including gene presence/absence variations and copy number variations are a common feature of genomes in plants and, together with single nucleotide polymorphisms and epigenetic differences, are responsible for the heritable phenotypic diversity observed within and between species. Understanding the contribution of SVs to plant phenotypic variation is important for plant breeders to assist in producing improved varieties. The low resolution of early genetic technologies and inefficient methods have previously limited our understanding of SVs in plants. However, with the rapid expansion in genomic technologies, it is possible to assess SVs with an ever-greater resolution and accuracy. Here, we review the current status of SV studies in plants, examine the roles that SVs play in phenotypic traits, compare current technologies and assess future challenges for SV studies.
Affiliation(s)
- Yuxuan Yuan
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
- School of Life Sciences and State Key Laboratory for Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong SAR, China
- Philipp E. Bayer
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
- Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
- David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
|
12
|
Liu T, Chen J, Zhang Q, Hippe K, Hunt C, Le T, Cao R, Tang H. The Development of Machine Learning Methods in discriminating Secretory Proteins of Malaria Parasite. Curr Med Chem 2021; 29:807-821. [PMID: 34636289 DOI: 10.2174/0929867328666211005140625] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/24/2021] [Revised: 07/28/2021] [Accepted: 08/15/2021] [Indexed: 11/22/2022]
Abstract
Malaria caused by Plasmodium falciparum is one of the major infectious diseases in the world. It is essential to exploit an effective method to predict secretory proteins of malaria parasites to develop effective cures and treatment. Biochemical assays can provide details for accurate identification of the secretory proteins, but these methods are expensive and time-consuming. In this paper, we summarized the machine learning-based identification algorithms and compared the construction strategies between different computational methods. Also, we discussed the use of machine learning to improve the ability of algorithms to identify proteins secreted by malaria parasites.
Affiliation(s)
- Ting Liu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Jiamao Chen
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Qian Zhang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Kyle Hippe
- Department of Computer Science, Pacific Lutheran University, United States
- Cassandra Hunt
- Department of Computer Science, Pacific Lutheran University, United States
- Thu Le
- Department of Computer Science, Pacific Lutheran University, United States
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, United States
- Hua Tang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
|
13
|
Zhao YW, Zhang S, Ding H. Recent development of machine learning methods in sumoylation sites prediction. Curr Med Chem 2021; 29:894-907. [PMID: 34525906 DOI: 10.2174/0929867328666210915112030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/10/2021] [Revised: 07/24/2021] [Accepted: 08/07/2021] [Indexed: 11/22/2022]
Abstract
Sumoylation is an important reversible post-translational modification of proteins that mediates a variety of cellular processes. SUMO-modified proteins can change their subcellular localization, activity and stability. Sumoylation also plays an important role in cellular processes such as transcriptional regulation and signal transduction. Abnormal sumoylation is involved in many diseases, including neurodegeneration and immune-related diseases, as well as the development of cancer. Therefore, identification of sumoylation sites (SUMO sites) is fundamental to understanding their molecular mechanisms and regulatory roles. In contrast to labor-intensive and costly experimental approaches, computational prediction of sumoylation sites in silico has attracted much attention for its accuracy, convenience and speed. At present, many computational prediction models have been used to identify SUMO sites, but this work has not been comprehensively summarized and reviewed. Therefore, the research progress of relevant models is summarized and discussed in this paper. We briefly summarize the development of bioinformatics methods for sumoylation site prediction, focusing mainly on benchmark dataset construction, feature extraction, machine learning methods, published results and online tools. We hope the review will be helpful to wet-lab researchers.
Affiliation(s)
- Yi-Wei Zhao
- School of Medicine, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shihua Zhang
- College of Life Science and Health, Wuhan University of Science and Technology, Wuhan 430065, China
- Hui Ding
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
14
|
Maroilley T, Li X, Oldach M, Jean F, Stasiuk SJ, Tarailo-Graovac M. Deciphering complex genome rearrangements in C. elegans using short-read whole genome sequencing. Sci Rep 2021; 11:18258. [PMID: 34521941 PMCID: PMC8440550 DOI: 10.1038/s41598-021-97764-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/06/2021] [Accepted: 08/30/2021] [Indexed: 12/14/2022] Open
Abstract
Genomic rearrangements cause congenital disorders, cancer, and complex diseases in humans. Yet, they are still understudied in rare diseases because their detection is challenging, despite the advent of whole genome sequencing (WGS) technologies. Short-read (srWGS) and long-read WGS approaches are regularly compared, and the latter is commonly recommended in studies focusing on genomic rearrangements. However, srWGS is currently the most economical, accurate, and widely supported technology. In Caenorhabditis elegans (C. elegans), such variants, induced by various mutagenesis processes, have been used for decades to balance large genomic regions by preventing chromosomal crossover events and allowing the maintenance of lethal mutations. Interestingly, those chromosomal rearrangements have rarely been characterized on a molecular level. To evaluate the ability of srWGS to detect various types of complex genomic rearrangements, we sequenced three balancer strains using short-read Illumina technology. By experimentally validating the breakpoints uncovered by srWGS, we showed that, by combining several types of analyses, srWGS enables the detection of a reciprocal translocation (eT1), a free duplication (sDp3), a large deletion (sC4), and chromoanagenesis events. Thus, applying srWGS to decipher real complex genomic rearrangements in model organisms may help in designing efficient bioinformatics pipelines for the systematic detection of complex rearrangements in human genomes.
Affiliation(s)
- Tatiana Maroilley
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
- Xiao Li
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
- Matthew Oldach
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
- Francesca Jean
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
- Susan J Stasiuk
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
- Maja Tarailo-Graovac
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
|
15
|
Yang YH, Wang JS, Yuan SS, Liu ML, Su W, Lin H, Zhang ZY. A Survey for Predicting ATP Binding Residues of Proteins Using Machine Learning Methods. Curr Med Chem 2021; 29:789-806. [PMID: 34514982 DOI: 10.2174/0929867328666210910125802] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2021] [Revised: 06/29/2021] [Accepted: 07/04/2021] [Indexed: 11/22/2022]
Abstract
Protein-ligand interactions are necessary for the majority of protein functions. Adenosine-5'-triphosphate (ATP) is one such ligand that plays a vital role as a coenzyme in providing energy for cellular activities, catalyzing biological reactions and signaling. Knowing the ATP binding residues of proteins is helpful for the annotation of protein function and for drug design. However, due to the huge influx of protein sequences into databases in the post-genome era, experimentally identifying ATP binding residues is cost-ineffective and time-consuming. To address this problem, computational methods have been developed to predict ATP binding residues. In this review, we briefly summarize the application of machine learning methods in detecting ATP binding residues of proteins. We expect this review will be helpful for further research.
Affiliation(s)
- Yu-He Yang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Jia-Shu Wang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shi-Shi Yuan
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Meng-Lu Liu
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Su
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lin
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Zhao-Yue Zhang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
16
|
Ye X, Ren W, Liu D, Li X, Li W, Wang X, Meng FL, Yeap LS, Hou Y, Zhu S, Casellas R, Zhang H, Wu K, Pan-Hammarström Q. Genome-wide mutational signatures revealed distinct developmental paths for human B cell lymphomas. J Exp Med 2021; 218:211517. [PMID: 33136155 PMCID: PMC7608067 DOI: 10.1084/jem.20200573] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 03/27/2020] [Revised: 07/31/2020] [Accepted: 09/18/2020] [Indexed: 12/11/2022] Open
Abstract
Both somatic hypermutation (SHM) and class switch recombination (CSR) are initiated by activation-induced cytidine deaminase (AID). Dysregulation of these processes has been linked to B cell lymphomagenesis. Here we performed an in-depth analysis of diffuse large B cell lymphoma (DLBCL) and follicular lymphoma (FL) genomes. We characterized seven genomic mutational signatures, including two B cell tumor-specific signatures, one of which is novel and associated with aberrant SHM. We further identified two major mutational signatures (K1 and K2) of clustered mutations (kataegis) resulting from the activities of AID or error-prone DNA polymerase η, respectively. K1 was associated with the immunoglobulin (Ig) switch region mutations/translocations and the ABC subtype of DLBCL, whereas K2 was related to the Ig variable region mutations and the GCB subtype of DLBCL and FL. Similar patterns were also observed in chronic lymphocytic leukemia subtypes. Thus, alterations associated with aberrant CSR and SHM activities can be linked to distinct developmental paths for different subtypes of B cell lymphomas.
Affiliation(s)
- Xiaofei Ye
- Department of Lymphoma, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center of Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China; BGI-Shenzhen, Shenzhen, China; Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Weicheng Ren
- Department of Lymphoma, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center of Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China; Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Dongbing Liu
- BGI-Shenzhen, Shenzhen, China; Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, Shenzhen, China
- Xiaobo Li
- BGI-Shenzhen, Shenzhen, China; Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, Shenzhen, China
- Wei Li
- Department of Lymphoma, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center of Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
- Xianhuo Wang
- Department of Lymphoma, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center of Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
- Fei-Long Meng
- State Key Laboratory of Molecular Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China
- Leng-Siew Yeap
- Shanghai Institute of Immunology, State Key Laboratory of Oncogenes and Related Genes, Department of Immunology and Microbiology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Rafael Casellas
- Genomics and Immunity, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD; Center of Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD
- Huilai Zhang
- Department of Lymphoma, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center of Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
- Kui Wu
- BGI-Shenzhen, Shenzhen, China; Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, Shenzhen, China
- Qiang Pan-Hammarström
- Department of Lymphoma, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center of Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China; Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
|
17
|
Li Y, Pu F, Wang J, Zhou Z, Zhang C, He F, Ma Z, Zhang J. Machine Learning Methods in Prediction of Protein Palmitoylation Sites: A Brief Review. Curr Pharm Des 2021; 27:2189-2198. [PMID: 33183190 DOI: 10.2174/1381612826666201112142826] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/13/2020] [Accepted: 07/27/2020] [Indexed: 11/22/2022]
Abstract
Protein palmitoylation is a fundamental and reversible post-translational lipid modification that is involved in a series of biological processes. Although a large number of experimental studies have explored the molecular mechanism behind the palmitoylation process, computational methods have attracted much attention for their good performance in predicting palmitoylation sites compared with expensive and time-consuming biochemical experiments. The prediction of protein palmitoylation sites helps to reveal the underlying biological mechanism. Therefore, research on the application of machine learning methods to predict palmitoylation sites has become a hot topic in bioinformatics and has promoted development in related fields. In this review, we briefly introduce recent developments in predicting protein palmitoylation sites using machine learning-based methods and discuss their benefits and drawbacks. A perspective on machine learning-based methods for predicting palmitoylation sites is also provided. We hope the review can serve as a guide for related fields.
Affiliation(s)
- Yanwen Li
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Feng Pu
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Jingru Wang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Zhiguo Zhou
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Chunhua Zhang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Fei He
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Zhiqiang Ma
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Jingbo Zhang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
|
18
|
Min X, Lu F, Li C. Sequence-Based Deep Learning Frameworks on Enhancer-Promoter Interactions Prediction. Curr Pharm Des 2021; 27:1847-1855. [PMID: 33234095 DOI: 10.2174/1381612826666201124112710] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/15/2020] [Revised: 07/29/2020] [Accepted: 08/06/2020] [Indexed: 11/22/2022]
Abstract
Enhancer-promoter interactions (EPIs) in the human genome are of great significance to transcriptional regulation, which tightly controls gene expression. Identification of EPIs can help us better decipher gene regulation and understand disease mechanisms. However, experimental methods to identify EPIs are constrained by funds, time, and manpower, while computational methods using DNA sequences and genomic features are viable alternatives. Deep learning methods have shown promising prospects in classification and have been utilized to identify EPIs. In this survey, we specifically focus on sequence-based deep learning methods and conduct a comprehensive review of the literature. First, we briefly introduce existing sequence-based frameworks for EPI prediction and their technical details. After that, we elaborate on the datasets, pre-processing methods, and evaluation strategies. Finally, we conclude with the challenges these methods face and suggest several future opportunities. We hope this review will provide a useful reference for further studies on enhancer-promoter interactions.
Affiliation(s)
- Xiaoping Min
- School of Informatics, Xiamen University, Xiamen 361005, China
- Fengqing Lu
- School of Informatics, Xiamen University, Xiamen 361005, China
- Chunyan Li
- Graduate School, Yunnan Minzu University, Kunming 650504, China
|
19
|
Meng Y, Jin M. HFS-SLPEE: A Novel Hierarchical Feature Selection and Second Learning Probability Error Ensemble Model for Precision Cancer Diagnosis. Front Cell Dev Biol 2021; 9:696359. [PMID: 34277640 PMCID: PMC8278475 DOI: 10.3389/fcell.2021.696359] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/16/2021] [Accepted: 05/19/2021] [Indexed: 11/15/2022] Open
Abstract
The emergence of high-throughput RNA-seq data has offered unprecedented opportunities for cancer diagnosis. However, capturing biological data with highly nonlinear and complex associations has been challenging for most existing approaches to cancer diagnosis. In this study, we propose a novel hierarchical feature selection and second learning probability error ensemble model (named HFS-SLPEE) for precision cancer diagnosis. Specifically, we first integrated protein-coding gene expression profiles, non-coding RNA expression profiles, and DNA methylation data to provide rich information; afterward, we designed a novel hierarchical feature selection method, which takes the CpG-gene biological associations into account and can select a compact set of superior features; next, we used four individual classifiers with significant differences and apparent complementarity to build the heterogeneous classifiers; lastly, we developed a second learning probability error ensemble model called SLPEE to thoroughly learn the new data consisting of classifier-predicted class probability values and the actual label, further realizing the self-correction of diagnosis errors. Benchmarking comparisons on TCGA showed that HFS-SLPEE performs better than the state-of-the-art approaches. Moreover, we analyzed in depth 10 groups of selected features and found several novel HFS-SLPEE-predicted epigenomics and epigenetics biomarkers for breast invasive carcinoma (BRCA) (e.g., TSLP and ADAMTS9-AS2), lung adenocarcinoma (LUAD) (e.g., HBA1 and CTB-43E15.1), and kidney renal clear cell carcinoma (KIRC) (e.g., IRX2 and BMPR1B-AS1).
Affiliation(s)
- Min Jin
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
20
|
Grassi L, Harris C, Zhu J, Hardman C, Hatton D. DetectIS: a pipeline to rapidly detect exogenous DNA integration sites using DNA or RNA paired-end sequencing data. Bioinformatics 2021; 37:4230-4232. [PMID: 33978747 PMCID: PMC9502153 DOI: 10.1093/bioinformatics/btab366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2020] [Revised: 05/07/2021] [Accepted: 05/11/2021] [Indexed: 11/13/2022] Open
Abstract
Motivation: Recombinant DNA technology is widely used for different applications in biology, medicine and biotechnology. Viral transduction and plasmid transfection are among the most frequently used techniques to generate recombinant cell lines. Many of these methods result in the random integration of the plasmid into the host genome. Rapid identification of the integration sites is highly desirable in order to characterize these engineered cell lines. Results: We developed detectIS: a pipeline specifically designed to identify genomic integration sites of exogenous DNA, either a plasmid containing one or more transgenes or a virus. The pipeline is based on a Nextflow workflow combined with a Singularity image containing all the necessary software, ensuring high reproducibility and scalability of the analysis. We tested it on simulated datasets and RNA-seq data from a human sample infected with Hepatitis B virus. Comparisons with other state-of-the-art tools show that our method can identify the integration site in different recombinant cell lines, with accurate results, lower computational demand and shorter execution times. Availability and implementation: The Nextflow workflow, the Singularity image and a test dataset are available at https://github.com/AstraZeneca/detectIS. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Luigi Grassi
- Biopharmaceutical Development, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Claire Harris
- Biopharmaceutical Development, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Jie Zhu
- Biopharmaceutical Development, BioPharmaceuticals R&D, AstraZeneca, Gaithersburg, USA
- Colin Hardman
- Data Science & Artificial Intelligence, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Diane Hatton
- Biopharmaceutical Development, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK

21
Pischedda E, Crava C, Carlassara M, Zucca S, Gasmi L, Bonizzoni M. ViR: a tool to solve intrasample variability in the prediction of viral integration sites using whole genome sequencing data. BMC Bioinformatics 2021; 22:45. [PMID: 33541262 PMCID: PMC7863434 DOI: 10.1186/s12859-021-03980-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2020] [Accepted: 01/27/2021] [Indexed: 12/16/2022] Open
Abstract
Background: Several bioinformatics pipelines have been developed to detect sequences from viruses that integrate into the human genome because of the health relevance of these integrations, such as in the persistence of viral infection and/or in generating genotoxic effects, often progressing into cancer. Recent genomics and metagenomics analyses have shown that viruses also integrate into the genome of non-model organisms (i.e., arthropods, fish, plants, vertebrates). However, studies of endogenous viral elements (EVEs) in non-model organisms have rarely gone beyond their characterization from reference genome assemblies. In non-model organisms, we lack a thorough understanding of the widespread occurrence of EVEs and their biological relevance, apart from sporadic cases which nevertheless point to significant roles of EVEs in immunity and regulation of expression. The concomitance of repetitive DNA, duplications and/or assembly fragmentations in a genome sequence and intrasample variability in whole-genome sequencing (WGS) data can cause misalignments when mapping data to a genome assembly. This phenomenon hinders our ability to properly identify integration sites. Results: To fill this gap, we developed ViR, a pipeline which solves the dispersion of reads due to intrasample variability in sequencing data from both single and pooled DNA samples, thus improving the detection of integration sites. We tested ViR on both in silico and real sequencing data from a non-model organism, the arboviral vector Aedes albopictus. Potential viral integrations predicted by ViR were molecularly validated, supporting the accuracy of ViR results. Conclusion: ViR will open new avenues to explore the biology of EVEs, especially in non-model organisms. Importantly, while we developed ViR with the identification of EVEs in mind, its application can be extended to detect any lateral transfer event, provided that an ad hoc sequence to interrogate is supplied.
Affiliation(s)
- Elisa Pischedda
- Department of Biology and Biotechnology, University of Pavia, 27100, Pavia, Italy
- Cristina Crava
- Department of Biology and Biotechnology, University of Pavia, 27100, Pavia, Italy; ERI BIOTECMED, Universitat de Valencia, 46010, Valencia, Spain
- Martina Carlassara
- Department of Biology and Biotechnology, University of Pavia, 27100, Pavia, Italy
- Leila Gasmi
- Department of Biology and Biotechnology, University of Pavia, 27100, Pavia, Italy
- Mariangela Bonizzoni
- Department of Biology and Biotechnology, University of Pavia, 27100, Pavia, Italy

22
Liu R, Wei L, Zhang P. A deep learning framework for drug repurposing via emulating clinical trials on real-world patient data. NAT MACH INTELL 2021; 3:68-75. [PMID: 35603127 PMCID: PMC9119409 DOI: 10.1038/s42256-020-00276-w] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2020] [Accepted: 11/16/2020] [Indexed: 02/03/2023]
Abstract
Drug repurposing is an effective strategy to identify new uses for existing drugs, providing the quickest possible transition from bench to bedside. Real-world data, such as electronic health records and insurance claims, provide information on large cohorts of users for many drugs. Here we present an efficient and easily customized framework for generating and testing multiple candidates for drug repurposing using a retrospective analysis of real-world data. Building upon well-established causal inference and deep learning methods, our framework emulates randomized clinical trials for drugs present in a large-scale medical claims database. We demonstrate our framework on a coronary artery disease cohort of millions of patients. We successfully identify drugs and drug combinations that substantially improve the coronary artery disease outcomes but haven't been indicated for treating coronary artery disease, paving the way for drug repurposing.
Affiliation(s)
- Ruoqi Liu
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Lai Wei
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Ping Zhang
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Translational Data Analytics Institute, The Ohio State University, Columbus, OH, USA

23
Ding Y, Tang J, Guo F. The Computational Models of Drug-target Interaction Prediction. Protein Pept Lett 2020; 27:348-358. [PMID: 30968771 DOI: 10.2174/0929866526666190410124110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2019] [Revised: 02/22/2019] [Accepted: 04/02/2019] [Indexed: 12/19/2022]
Abstract
The identification of drug-target interactions (DTIs) is an important process in drug discovery and medical research. However, traditional experimental methods for DTI identification remain time-consuming, extremely expensive and challenging. In the past ten years, various computational methods have been developed to identify potential DTIs. This paper summarizes these identification methods and introduces several state-of-the-art computational methods, covering network-based and machine learning-based approaches. In particular, machine learning-based methods, including supervised and semi-supervised models, differ essentially in how they handle negative samples. Although these computational models have achieved significant improvements in DTI identification, network-based and machine learning-based methods each have their own disadvantages. The methods are evaluated on four benchmark data sets using the Area Under the Precision-Recall curve (AUPR).
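As an illustration of the AUPR metric named in this abstract, the following minimal sketch computes average precision, a common step-wise estimate of the area under the precision-recall curve. The function name `average_precision` and the toy labels and scores are hypothetical and are not taken from the paper; ties between scores are not treated specially.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: list of 0/1 ground-truth values (1 = interacting drug-target pair).
    scores: list of predicted scores, higher meaning more likely to interact.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidate pairs from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank of each true positive.
            precision_sum += hits / rank
    return precision_sum / total_positives


# Toy example (invented numbers): two true interactions among four candidates.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_aupr = average_precision(toy_labels, toy_scores)
```

AUPR is often preferred over ROC-based measures for DTI benchmarks because known interactions are usually far outnumbered by non-interacting or unlabeled pairs.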
Affiliation(s)
- Yijie Ding
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States; School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Fei Guo
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China

24
Abstract
Background: Proteins are the basic building blocks of the body; they are complex systems whose structure plays a key role in activation, catalysis, messaging and disease states. Careful investigation of protein structure is therefore necessary for the diagnosis of diseases and for drug design. Protein structures are described at different levels of complexity: primary (chain), secondary (helical), tertiary (3D) and quaternary structure. Analyzing the complex 3D structure of a protein is difficult, but the structure can be analyzed as a network of interconnections between its components, where amino acids are considered nodes and the interconnections between them are edges.
Objective: Many studies have shown that the small-world network concept provides new opportunities to investigate networks of biological systems. The objective of this paper is to analyze protein structure using the small-world concept.
Methods: Protein is analyzed using the small-world network concept, specifically the extreme case in which the degree distribution follows a power law. To verify the proposed approach, a dataset of the oncogene protein structure is analyzed using Python programming.
Results: The protein structure is plotted as a network of amino acids (Residue Interaction Graph, RIG) using the distance matrix of nodes with a given threshold; various centrality measures (i.e., degree distribution, degree-betweenness correlation and betweenness-closeness correlation) are then calculated for 1323 nodes and the graphs are plotted.
Conclusion: Hubs with higher degree centrality exist but are few in number, and they are expected to be robust toward harmful effects of mutations with new functions.
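The graph construction described in the Results can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy coordinates and the 6.0 distance threshold are all hypothetical, and only the degree distribution is computed.

```python
import math
from collections import Counter


def residue_interaction_graph(coords, threshold):
    """Connect two residues when their 3D distance is at most `threshold`.

    coords: list of (x, y, z) tuples, one per residue (for example C-alpha atoms).
    Returns an adjacency dict mapping residue index -> set of neighbor indices.
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def degree_distribution(adjacency):
    """Map each degree value to the number of residues having that degree."""
    return Counter(len(neighbors) for neighbors in adjacency.values())


# Toy coordinates (invented): four residues close together and one far away.
toy_coords = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (3.8, 3.8, 0), (20, 20, 20)]
toy_graph = residue_interaction_graph(toy_coords, threshold=6.0)
toy_degrees = {node: len(neighbors) for node, neighbors in toy_graph.items()}
toy_distribution = degree_distribution(toy_graph)
```

In a real analysis the coordinates would come from a structure file, and the degree distribution would be inspected for the heavy tail that indicates a small number of hub residues.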
Affiliation(s)
- Neetu Kumari
- Department of Computer Science, Banaras Hindu University, Varanasi, India
- Anshul Verma
- Department of Computer Science, Banaras Hindu University, Varanasi, India

25
Zhang Y, Feng T, Wang S, Dong R, Yang J, Su J, Wang B. A Novel XGBoost Method to Identify Cancer Tissue-of-Origin Based on Copy Number Variations. Front Genet 2020; 11:585029. [PMID: 33329723 PMCID: PMC7716814 DOI: 10.3389/fgene.2020.585029] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2020] [Accepted: 10/05/2020] [Indexed: 01/18/2023] Open
Abstract
The discovery of cancer of unknown primary (CUP) is of great significance in designing more effective treatments and improving the diagnostic efficiency in cancer patients. In this study, we develop a machine learning model for tracing the tissue of origin of CUP with high accuracy after feature engineering and model evaluation. Based on copy number variation data consisting of 4,566 training cases and 1,262 independent validation cases, an XGBoost classifier is applied to 10 types of cancer. Extremely randomized trees (Extra tree) are used for dimension reduction so that fewer variables replace the original high-dimensional variables. The features with the top 300 weights are selected and principal component analysis is applied to eliminate noise. We find that the XGBoost classifier achieves the highest overall accuracy of 0.8913 in 10-fold cross-validation on the training samples and 0.7421 on the independent validation datasets for predicting tumor tissue of origin. Furthermore, by contrasting various performance indices, such as precision and recall rate, the experimental results show that the XGBoost classifier significantly improves the classification performance for various tumors with less prediction error, as compared to other classifiers, such as K-nearest neighbors (KNN), Bayes, support vector machine (SVM), and Adaboost. Our method can infer the tissue of origin for the 10 cancer types with acceptable accuracy in both cross-validation and independent validation data. It may be used as an auxiliary diagnostic method to determine the actual clinicopathological status of a specific cancer.
Affiliation(s)
- Yulin Zhang
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, China
- Tong Feng
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, China
- Shudong Wang
- College of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao, China
- Ruyi Dong
- Geneis (Beijing) Co., Ltd., Beijing, China
- Jionglong Su
- School of AI and Advanced Computing, XJTLU Entrepreneur College (Taicang), Xi’an Jiaotong-Liverpool University, Suzhou, China
- Bo Wang
- Geneis (Beijing) Co., Ltd., Beijing, China

26
Yang L, Gao H, Wu K, Zhang H, Li C, Tang L. Identification of Cancerlectins By Using Cascade Linear Discriminant Analysis and Optimal g-gap Tripeptide Composition. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190730103156] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
Abstract
Background: Lectins are a diverse group of glycoproteins or glycoconjugate proteins that can be extracted from plants, invertebrates and higher animals. Cancerlectins, a kind of lectin, play a key role in the interaction of tumor cells with each other and are being employed as therapeutic agents. A full understanding of cancerlectins is significant because it provides a tool for the future direction of cancer therapy.
Objective: To develop an accurate, practically useful and time-saving tool to identify cancerlectins. A novel sequence-based method is proposed, along with an accompanying webserver for accessing the tool.
Methods: First, protein features were extracted using a new feature-construction scheme termed g-gap tripeptide composition. A proposed cascade linear discriminant analysis (Cascade LDA) was then used to alleviate the difficulties of high dimensionality, with Analysis of Variance (ANOVA) as the feature importance criterion. Finally, a Support Vector Machine (SVM) was used as the classifier to identify cancerlectins.
Results: The proposed method achieved an accuracy of 91.34%, with a sensitivity of 89.89%, a specificity of 92.48% and a Matthews correlation coefficient of 0.8318, based on only 13 fusion features in jackknife cross-validation; this result is superior to other published methods in this domain.
Conclusion: In this study, a new method based only on the primary structure of proteins is proposed, and experimental results show that it could be a promising tool to identify cancerlectins. An open-access webserver is made available in this work to facilitate other related work.
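The abstract does not define g-gap tripeptide composition, so the sketch below rests on an assumption: a g-gap tripeptide is taken to be three residues at positions i, i+g+1 and i+2(g+1), and the feature vector holds the normalized frequency of each of the 8,000 possible triplets. The function name and the toy sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_tripeptide_composition(sequence, g):
    """Frequency of residue triplets whose members are separated by g residues.

    Assumed definition: the triplet starting at position i consists of the
    residues at i, i + (g + 1) and i + 2 * (g + 1). With g = 0 this reduces
    to the ordinary tripeptide composition.
    """
    step = g + 1
    counts = {"".join(triplet): 0 for triplet in product(AMINO_ACIDS, repeat=3)}
    total = 0
    for i in range(len(sequence) - 2 * step):
        triplet = sequence[i] + sequence[i + step] + sequence[i + 2 * step]
        if triplet in counts:  # skip triplets containing non-standard residues
            counts[triplet] += 1
            total += 1
    if total == 0:
        return {triplet: 0.0 for triplet in counts}
    return {triplet: count / total for triplet, count in counts.items()}


# Toy sequence (invented): with g = 1 the only triplets are A-D-F and C-E-G.
toy_features = g_gap_tripeptide_composition("ACDEFG", g=1)
```

A vector of 8,000 features per value of g is far larger than most training sets in this field, which is why a dimension-reduction step such as the Cascade LDA mentioned in the Methods is needed before classification.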
Affiliation(s)
- Liangwei Yang
- Center for Informational Biology, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Hui Gao
- Center for Informational Biology, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Keyu Wu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Haotian Zhang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Changyu Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Lixia Tang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China

27
Wang C, Zhang H, Li Z, Zhou X, Cheng Y, Chen R. White Blood Cell Image Segmentation Based on Color Component Combination and Contour Fitting. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017102310] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Background: White Blood Cell (WBC) image segmentation plays a key role in cell morphology analysis. However, WBC segmentation remains a challenging task due to the diversity of WBCs under different staining conditions.
Objective: In this paper, we propose a novel WBC segmentation method based on color component combination and contour fitting to segment WBC images accurately.
Methods: The proposed method first uses color component combination and image thresholding to segment the nucleus, then uses a color prior to remove the image background, extracts the initial WBC contour via Canny edge detection, and finally detects and closes any unclosed WBC contour by contour fitting. Cytoplasm segmentation is then achieved by subtracting the nucleus region from the WBC region.
Results: Experimental results on 100 WBC images under the rapid staining condition and 50 WBC images under the standard staining condition showed that the proposed method improved the segmentation accuracy of white blood cells under both conditions.
Conclusion: The proposed use of color component combination and contour fitting is effective for the WBC segmentation task.
Affiliation(s)
- Chuansheng Wang
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
- Hong Zhang
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
- Zuoyong Li
- Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang University, Fuzhou, China
- Xiaogen Zhou
- College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
- Yong Cheng
- School of Information Mechanical & Electrical Engineering, Jiangsu Open University, Nanjing, China
- Rongyan Chen
- Department of Clinical Laboratory, the People's Hospital Affiliated to Fujian University of Traditional Chinese Medicine, Fuzhou, China

28
Zhuang X, Ye R, So MT, Lam WY, Karim A, Yu M, Ngo ND, Cherny SS, Tam PKH, Garcia-Barcelo MM, Tang CSM, Sham PC. A random forest-based framework for genotyping and accuracy assessment of copy number variations. NAR Genom Bioinform 2020; 2:lqaa071. [PMID: 33575619 PMCID: PMC7671382 DOI: 10.1093/nargab/lqaa071] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2020] [Revised: 08/18/2020] [Accepted: 08/26/2020] [Indexed: 12/24/2022] Open
Abstract
Detection of copy number variations (CNVs) is essential for uncovering genetic factors underlying human diseases. However, CNV detection by current methods is prone to error, and precisely identifying CNVs from paired-end whole genome sequencing (WGS) data is still challenging. Here, we present a framework, CNV-JACG, for Judging the Accuracy of CNVs and Genotyping using paired-end WGS data. CNV-JACG is based on a random forest model trained on 21 distinctive features characterizing the CNV region and its breakpoints. Using the data from the 1000 Genomes Project, Genome in a Bottle Consortium, the Human Genome Structural Variation Consortium and in-house technical replicates, we show that CNV-JACG has superior sensitivity over the latest genotyping method, SV2, particularly for the small CNVs (≤1 kb). We also demonstrate that CNV-JACG outperforms SV2 in terms of Mendelian inconsistency in trios and concordance between technical replicates. Our study suggests that CNV-JACG would be a useful tool in assessing the accuracy of CNVs to meet the ever-growing needs for uncovering the missing heritability linked to CNVs.
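The Mendelian inconsistency measure mentioned in this abstract can be illustrated with a simplified sketch. It assumes a biallelic CNV whose genotype is encoded as the number of variant alleles carried (0, 1 or 2); this encoding, the function names and the toy trios are assumptions made for illustration and do not describe how CNV-JACG computes the statistic.

```python
def possible_child_genotypes(father, mother):
    """Genotypes a child can inherit, given the parents' variant allele counts.

    A parent with 0 variant alleles transmits 0, a parent with 2 transmits 1,
    and a heterozygous parent (1) can transmit either 0 or 1.
    """
    transmittable = {0: {0}, 1: {0, 1}, 2: {1}}
    return {
        from_father + from_mother
        for from_father in transmittable[father]
        for from_mother in transmittable[mother]
    }


def mendelian_inconsistency_rate(trios):
    """Fraction of (father, mother, child) genotype calls that violate inheritance."""
    if not trios:
        raise ValueError("at least one trio is required")
    violations = sum(
        1
        for father, mother, child in trios
        if child not in possible_child_genotypes(father, mother)
    )
    return violations / len(trios)


# Toy calls (invented): the second and fourth trios cannot arise by inheritance.
toy_trios = [(0, 0, 0), (0, 0, 1), (1, 1, 2), (2, 2, 1)]
toy_rate = mendelian_inconsistency_rate(toy_trios)
```

A lower rate suggests more reliable genotype calls, although real inconsistencies can also reflect de novo events rather than genotyping error.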
Affiliation(s)
- Xuehan Zhuang
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Rui Ye
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Man-Ting So
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Wai-Yee Lam
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Anwarul Karim
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Michelle Yu
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Ngoc Diem Ngo
- National Hospital of Pediatrics, Ha Noi 100000, Vietnam
- Stacey S Cherny
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Paul Kwong-Hang Tam
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Clara Sze-Man Tang
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Pak Chung Sham
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China

29
Xie W, Feng YE. Prediction of the Disordered Regions of Intrinsically Disordered Proteins Based on the Molecular Functions. Protein Pept Lett 2020; 27:279-286. [PMID: 30819075 DOI: 10.2174/0929866526666190226160629] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2018] [Revised: 01/03/2019] [Accepted: 02/08/2019] [Indexed: 01/29/2023]
Abstract
BACKGROUND: Intrinsically disordered proteins lack a well-defined three-dimensional structure under physiological conditions while still possessing essential biological functions. They take part in various physiological processes such as signal transduction, transcription and posttranslational modification. The disordered regions are the main functional sites of intrinsically disordered proteins, so research on disordered regions has become a hot topic. OBJECTIVE: In this paper, we aim to analyze the features of disordered regions with different molecular functions and to predict the different disordered regions using valid features. METHODS: According to their molecular functions, we first divided the intrinsically disordered proteins in the DisProt database into six classes. Then, we extracted four features using bioinformatics methods, namely Amino Acid Index (AAIndex), codon frequency (Codon), three kinds of protein secondary structure compositions (3PSS) and Chemical Shifts (CSs), and used these features to predict the disordered regions of the different functions with a Support Vector Machine (SVM). RESULTS: The best overall accuracy was 99.29%, using chemical shifts (CSs) as the feature. In feature fusion, the overall accuracy reached 88.70% using CSs+AAIndex as features and 86.09% using CSs+AAIndex+Codon+3PSS as features. CONCLUSION: We predicted and analyzed the disordered regions based on their molecular functions. The results showed that prediction performance can be improved by adding chemical shifts and AAIndex as features, especially chemical shifts. Moreover, the chemical shift was the most effective feature in the prediction. We hope that our results will be constructive for the study of intrinsically disordered proteins.
Affiliation(s)
- WeiXia Xie
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China
- Yong E Feng
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China

30
Chen X, Kost J, Li D. Comprehensive comparative analysis of methods and software for identifying viral integrations. Brief Bioinform 2020; 20:2088-2097. [PMID: 30102374 DOI: 10.1093/bib/bby070] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2018] [Revised: 07/02/2018] [Accepted: 07/12/2018] [Indexed: 12/13/2022] Open
Abstract
Many viruses are capable of integrating in the human genome, particularly viruses involved in tumorigenesis. Viral integrations can be considered genetic markers for discovering virus-caused cancers and inferring cancer cell development. Next-generation sequencing (NGS) technologies have been widely used to screen for viral integrations in cancer genomes, and a number of bioinformatics tools have been developed to detect viral integrations using NGS data. However, there has been no systematic comparison of the methods or software. In this study, we performed a comprehensive comparative analysis of the designs, performance, functionality and limitations among the existing methods and software for detecting viral integrations. We further compared the sensitivity, precision and runtime of integration detection of four representative tools. Our analyses showed that each of the existing software had its own merits; however, none of them were sufficient for parallel or accurate virome-wide detection. After carefully evaluating the limitations shared by the existing methods, we proposed strategies and directions for developing virome-wide integration detection.
Affiliation(s)
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Jason Kost
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA; Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA; Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA; Cancer Center, University of Vermont, Burlington, Vermont 05405, USA

31
Liang Y, Wang H, Yang J, Li X, Dai C, Shao P, Tian G, Wang B, Wang Y. A Deep Learning Framework to Predict Tumor Tissue-of-Origin Based on Copy Number Alteration. Front Bioeng Biotechnol 2020; 8:701. [PMID: 32850687 PMCID: PMC7419421 DOI: 10.3389/fbioe.2020.00701] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2020] [Accepted: 06/04/2020] [Indexed: 12/18/2022] Open
Abstract
Cancer of unknown primary site (CUPS) is a type of metastatic tumor for which the sites of tumor origin cannot be determined. Precise diagnosis of the tissue origin for metastatic CUPS is crucial for developing treatment schemes to improve patient prognosis. Recently, there have been many studies using various cancer biomarkers to predict the tissue-of-origin (TOO) of CUPS. However, only very few of them use copy number alteration (CNA) to trace TOO. In this paper, a two-step computational framework called CNA_origin is introduced to predict the tissue-of-origin of a tumor from its gene CNA levels. CNA_origin sets up a deep-learning network mainly composed of an autoencoder and a convolutional neural network (CNN). Based on real datasets released from the public database, CNA_origin had an overall accuracy of 83.81% on 10-fold cross-validation and 79% on independent datasets for predicting tumor origin, improving the accuracy by 7.75% and 9.72%, respectively, compared with the method published in a previous paper. Our results suggest that the autoencoder model can extract key characteristics of CNA and that the CNN classifier model developed in this study can predict the origin of tumors robustly and effectively. CNA_origin was written in Python and can be downloaded from https://github.com/YingLianghnu/CNA_origin.
Affiliation(s)
- Ying Liang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
- Haifeng Wang
- Department of Urology, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, China
- Xiong Li
- School of Software, East China Jiaotong University, Nanchang, China
- Chan Dai
- Geneis (Beijing) Co. Ltd., Beijing, China
- Peng Shao
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
- Geng Tian
- Geneis (Beijing) Co. Ltd., Beijing, China
- Bo Wang
- Geneis (Beijing) Co. Ltd., Beijing, China
- Yinglong Wang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China

32
Guan ZX, Li SH, Zhang ZM, Zhang D, Yang H, Ding H. A Brief Survey for MicroRNA Precursor Identification Using Machine Learning Methods. Curr Genomics 2020; 21:11-25. [PMID: 32655294 PMCID: PMC7324890 DOI: 10.2174/1389202921666200214125102] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2019] [Revised: 01/24/2020] [Accepted: 01/30/2020] [Indexed: 11/22/2022] Open
Abstract
MicroRNAs (miRNAs), a group of short non-coding RNA molecules, can regulate gene expression. Many diseases are associated with abnormal expression of miRNAs. Therefore, accurate identification of miRNA precursors (pre-miRNAs) is necessary. In the past 10 years, experimental methods, comparative genomics methods, and artificial intelligence methods have been used to identify pre-miRNAs. However, experimental methods and comparative genomics methods have disadvantages, such as being time-consuming. In contrast, machine learning-based methods are a better choice. This review therefore summarizes the current advances in pre-miRNA recognition based on computational methods, including the construction of benchmark datasets, feature extraction methods, prediction algorithms, and the results of the models. We also provide information about the predictors currently available. Finally, we give future perspectives on the identification of pre-miRNAs. The review provides scholars with a comprehensive background on pre-miRNA identification using machine learning methods, which can help researchers gain a clear understanding of the progress of research in this field.
Affiliation(s)
- Zheng-Xing Guan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shi-Hao Li
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Zi-Mei Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dan Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China

33
Feng P, Wang Z. Recent Advances in Computational Methods for Identifying Anticancer Peptides. Curr Drug Targets 2020; 20:481-487. [PMID: 30068270 DOI: 10.2174/1389450119666180801121548] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2018] [Revised: 05/28/2018] [Accepted: 05/28/2018] [Indexed: 01/10/2023]
Abstract
Anticancer peptides (ACPs) are small peptides that can kill cancer cells without damaging normal cells. In recent years, ACPs have been used pre-clinically for cancer treatment. Therefore, accurate identification of ACPs will promote their clinical applications. In contrast to labor-intensive experimental techniques, a series of computational methods have been proposed for identifying ACPs. In this review, we briefly summarize the current progress in computational identification of ACPs. The challenges and future perspectives in developing reliable methods for identification of ACPs are also discussed. We anticipate that this review will provide novel insights for future research on anticancer peptides.
Affiliation(s)
- Pengmian Feng
- School of Public Health, North China University of Science and Technology, Tangshan, 063000, China
| | - Zhenyi Wang
- Center for Genomics and Computational Biology, School of Life Science, North China University of Science and Technology, Tangshan, 063000, China
| |
|
34
|
Tan JX, Lv H, Wang F, Dao FY, Chen W, Ding H. A Survey for Predicting Enzyme Family Classes Using Machine Learning Methods. Curr Drug Targets 2020; 20:540-550. [PMID: 30277150 DOI: 10.2174/1389450119666181002143355] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 07/21/2018] [Revised: 08/17/2018] [Accepted: 09/04/2018] [Indexed: 12/13/2022]
Abstract
Enzymes are proteins that act as biological catalysts to speed up cellular biochemical processes. According to their main Enzyme Commission (EC) numbers, enzymes are divided into six categories: EC-1: oxidoreductase; EC-2: transferase; EC-3: hydrolase; EC-4: lyase; EC-5: isomerase; and EC-6: ligase (synthetase). Different enzymes have different biological functions and substrates. Therefore, knowing which family an enzyme belongs to can help infer its catalytic mechanism and provide information about the relevant biological function. With the large number of protein sequences flowing into databanks in the post-genomic age, annotating the family of an enzyme is very important. Since experimental methods are not cost-effective, bioinformatics tools will be a great help for accurately classifying enzyme families. In this review, we summarize the application of machine learning methods to the prediction of enzyme family from different aspects. We hope that this review will provide insights and inspiration for research on enzyme family classification.
Affiliation(s)
- Jiu-Xin Tan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Hao Lv
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Fang Wang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Wei Chen
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China; Gordon Life Science Institute, Boston, MA 02478, United States
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| |
|
35
|
Huang M, He M, Guo Y, Li H, Shen S, Xie Y, Li X, Xiao H, Fang L, Li D, Peng B, Liang L, Yu J, Kuang M, Xu L, Peng S. The Influence of Immune Heterogeneity on the Effectiveness of Immune Checkpoint Inhibitors in Multifocal Hepatocellular Carcinomas. Clin Cancer Res 2020; 26:4947-4957. [PMID: 32527942 DOI: 10.1158/1078-0432.ccr-19-3840] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 11/24/2019] [Revised: 03/17/2020] [Accepted: 06/04/2020] [Indexed: 11/16/2022]
Abstract
PURPOSE: Immune checkpoint inhibitor therapy is emerging as a promising option for patients with advanced hepatocellular carcinoma. We aimed to investigate the heterogeneity of different tumor nodules of the same patient with multifocal hepatocellular carcinomas in response to immunotherapy and its molecular mechanisms. EXPERIMENTAL DESIGN: We obtained 45 surgical tumor samples, including 33 small and 12 large nodules, from 12 patients with multifocal hepatocellular carcinoma and evaluated genomic and immune heterogeneity among tumors through whole-genome sequencing and RNA sequencing. IHC was performed to validate the expression of immune markers. The responses to anti-programmed cell death protein-1 (PD-1) therapy in patients with multifocal hepatocellular carcinoma were evaluated. RESULTS: The small and large tumors within the same patient presented with similar genomic characteristics, indicating the same genomic origin. We further found that the small tumors had higher immune cell infiltration, including more CD8+ T cells, M1 macrophages, and monocytes, as compared with large tumors. In addition, the expression of an interferon signature predictive of response to anti-PD-1 therapy was significantly upregulated in the small tumors. Moreover, immune pathways were more active and proliferation pathways less active in the small tumors. In keeping with this, we found that small nodules were more sensitive to anti-PD-1 therapy than large nodules in patients with multifocal hepatocellular carcinoma. CONCLUSIONS: The small tumors in patients with multifocal hepatocellular carcinoma had higher immune cell infiltration and upregulation of immune pathways as compared with the large tumors, which can partially explain the different responses of small and large tumors in the same patient to anti-PD-1 therapy.
Affiliation(s)
- Manling Huang
- Department of Gastroenterology and Hepatology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Minghui He
- Department of Liver Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Yu Guo
- Department of Liver Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Heping Li
- Department of Oncology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Shunli Shen
- Department of Liver Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Yubin Xie
- Institute of Precision Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Xiaoxing Li
- Institute of Precision Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Han Xiao
- Division of Interventional Ultrasound, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Lujing Fang
- Institute of Precision Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Dongming Li
- Department of Liver Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Baogang Peng
- Department of Liver Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Lijian Liang
- Department of Liver Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Jun Yu
- Institute of Precision Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Ming Kuang
- Department of Liver Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Institute of Precision Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Division of Interventional Ultrasound, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Lixia Xu
- Department of Oncology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Institute of Precision Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Sui Peng
- Department of Gastroenterology and Hepatology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Institute of Precision Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Clinical Trial Unit, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| |
|
36
|
Li N, Yang J, Zhu W, Liang Y. MVSC: A Multi-variation Simulator of Cancer Genome. Comb Chem High Throughput Screen 2020; 23:326-333. [PMID: 32183666 DOI: 10.2174/1386207323666200317121136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/17/2019] [Revised: 11/29/2019] [Accepted: 02/27/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Many forms of variation exist in the genome, and these are the main causes of individual phenotypic differences. The detection of variants, especially those located in the tumor genome, still faces many challenges due to the complexity of the genome structure. Thus, performance assessment of variation detection tools using next-generation sequencing platforms is urgently needed. METHOD: We have created a software package called the Multi-Variation Simulator of Cancer genomes (MVSC) to simulate common genomic variants, including single nucleotide polymorphisms, small insertion and deletion polymorphisms, and structural variations (SVs), which are analogous to human somatically acquired variations. Three sets of variations embedded in genomic sequences in different periods were dynamically and sequentially simulated one by one. RESULTS: In cancer genome simulation, complex SVs are important because this type of variation is characteristic of the tumor genome structure. Overlapping variations of different sizes can also coexist in the same genome regions, adding to the complexity of cancer genome architecture. Our results show that MVSC can efficiently simulate a variety of genomic variants that cannot be simulated by existing software packages. CONCLUSION: The MVSC-simulated variants can be used to assess the performance of existing tools designed to detect SVs in next-generation sequencing data, and we also find that MVSC is memory- and time-efficient compared with similar software packages.
Affiliation(s)
- Ning Li
- School of Information and Electronic Engineering, Wuzhou University, Wuzhou, China
| | - Jialiang Yang
- Department of Mathematics and Statistics, Hainan Normal University, Haikou, Hainan 571158, China
| | - Wen Zhu
- Department of Mathematics and Statistics, Hainan Normal University, Haikou, Hainan 571158, China; College of Computer Science and Electronic Engineering, Hunan University, Hunan, China
| | - Ying Liang
- College of Computer Science and Electronic Engineering, Hunan University, Hunan, China; College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330000, China
| |
|
37
|
Hu Y, Zhao T, Zhang N, Zhang Y, Cheng L. A Review of Recent Advances and Research on Drug Target Identification Methods. Curr Drug Metab 2019; 20:209-216. [PMID: 30251599 DOI: 10.2174/1389200219666180925091851] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/01/2017] [Revised: 01/01/2018] [Accepted: 08/02/2018] [Indexed: 12/14/2022]
Abstract
BACKGROUND: From a therapeutic viewpoint, understanding how drugs bind and regulate the functions of their target proteins to protect against disease is crucial. The identification of drug targets plays a significant role in drug discovery and in studying the mechanisms of diseases. Therefore, the development of methods to identify drug targets has become a popular topic. METHODS: We systematically review recent work on identifying drug targets from the perspectives of data and method. We compiled several databases that collect data more comprehensively and introduced several commonly used databases. We then divided the methods into two categories, biological experiments and machine learning, each of which is subdivided into subclasses and described in detail. RESULTS: Machine learning algorithms make up the majority of new methods. Generally, an optimal set of features is chosen to predict successful new drug targets with similar properties. The most widely used features include sequence properties, network topological features, structural properties, and subcellular locations. Since various machine learning methods exist, improving their performance requires combining a better subset of features and choosing the appropriate model for the various datasets involved. CONCLUSION: The application of experimental and computational methods in protein drug target identification has become increasingly popular in recent years. Current biological and computational methods still have many limitations due to unbalanced and incomplete datasets or imperfect feature selection methods.
Affiliation(s)
- Yang Hu
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Tianyi Zhao
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Ningyi Zhang
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Ying Zhang
- Department of Pharmacy, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin 150088, China
| | - Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| |
|
38
|
Kervarrec T, Aljundi M, Appenzeller S, Samimi M, Maubec E, Cribier B, Deschamps L, Sarma B, Sarosi EM, Berthon P, Levy A, Bousquet G, Tallet A, Touzé A, Guyétant S, Schrama D, Houben R. Polyomavirus-Positive Merkel Cell Carcinoma Derived from a Trichoblastoma Suggests an Epithelial Origin of this Merkel Cell Carcinoma. J Invest Dermatol 2019; 140:976-985. [PMID: 31759946 DOI: 10.1016/j.jid.2019.09.026] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/27/2019] [Revised: 08/22/2019] [Accepted: 09/19/2019] [Indexed: 12/20/2022]
Abstract
Merkel cell carcinoma (MCC), an aggressive neuroendocrine carcinoma of the skin, is to date the only human cancer known to be frequently caused by a polyomavirus. However, it is a matter of debate which cells are targeted by the Merkel cell polyomavirus (MCPyV) to give rise to the phenotypically multifaceted MCC cells. To assess the lineage of origin of MCPyV-positive MCC, genetic analysis of a very rare tumor combining benign trichoblastoma and MCPyV-positive MCC was conducted by massive parallel sequencing. Although MCPyV was found to be integrated only in the MCC part, six somatic mutations were shared by both tumor components. The mutational overlap between the trichoblastoma and MCPyV-positive MCC parts of the combined tumor implies that MCPyV integration occurred in an epithelial tumor cell before MCC development. Therefore, our report demonstrates that MCPyV-positive MCC can derive from the epithelial lineage.
Affiliation(s)
- Thibault Kervarrec
- Department of Pathology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours Cedex, France; Biologie des infections à polyomavirus team, UMR INRA ISP 1282, Université de Tours, Tours, France; Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany.
| | - Mohanad Aljundi
- Department of Dermatology, Avicenne University Hospital, Bobigny, France
| | - Silke Appenzeller
- Core Unit Bioinformatics, Comprehensive Cancer Center Mainfranken, University Hospital of Würzburg, Würzburg, Germany
| | - Mahtab Samimi
- Biologie des infections à polyomavirus team, UMR INRA ISP 1282, Université de Tours, Tours, France; Department of Dermatology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours Cedex, France
| | - Eve Maubec
- Department of Dermatology, Avicenne University Hospital, Bobigny, France
| | - Bernard Cribier
- Dermatology Clinic, Hôpitaux Universitaires & Université de Strasbourg, Hôpital Civil, Strasbourg, France
| | - Bhavishya Sarma
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
| | - Eva-Maria Sarosi
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
| | - Patricia Berthon
- Biologie des infections à polyomavirus team, UMR INRA ISP 1282, Université de Tours, Tours, France
| | - Annie Levy
- Department of Pathology, Avicenne University Hospital, Bobigny, France
| | - Guilhem Bousquet
- Department of Medical Oncology, Avicenne University Hospital, Bobigny, France
| | - Anne Tallet
- Platform of Somatic Tumor Molecular Genetics, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours Cedex, France
| | - Antoine Touzé
- Biologie des infections à polyomavirus team, UMR INRA ISP 1282, Université de Tours, Tours, France
| | - Serge Guyétant
- Department of Pathology, Université de Tours, Centre Hospitalier Universitaire de Tours, Tours Cedex, France; Biologie des infections à polyomavirus team, UMR INRA ISP 1282, Université de Tours, Tours, France
| | - David Schrama
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
| | - Roland Houben
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
| |
|
39
|
Wang F, Guan ZX, Dao FY, Ding H. A Brief Review of the Computational Identification of Antifreeze Protein. CURR ORG CHEM 2019. [DOI: 10.2174/1385272823666190718145613] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Many cold-adapted organisms produce antifreeze proteins (AFPs) to counter the freezing of cell fluids by controlling the growth of ice crystals. AFPs have been found in various species, including vertebrates, invertebrates, plants, bacteria, and fungi. AFPs from fish, insects, and plants display high diversity. Thus, the identification of AFPs is a challenging task in computational proteomics. With the accumulation of AFPs and the development of machine learning methods, it is possible to construct a high-throughput tool to identify AFPs in a timely manner. In this review, we briefly review the application of machine learning methods to antifreeze protein identification from different aspects, including published benchmark datasets, sequence descriptors, classification algorithms, and published methods. We hope that this review will produce new ideas and directions for research on identifying antifreeze proteins.
Affiliation(s)
- Fang Wang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Zheng-Xing Guan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| |
|
40
|
Wei HH, Yang W, Tang H, Lin H. The Development of Machine Learning Methods in Cell-Penetrating Peptides Identification: A Brief Review. Curr Drug Metab 2019; 20:217-223. [DOI: 10.2174/1389200219666181010114750] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/19/2018] [Revised: 05/21/2018] [Accepted: 08/02/2018] [Indexed: 11/22/2022]
Abstract
Background: Cell-penetrating peptides (CPPs) are important short peptides that facilitate cellular uptake of various molecules. CPPs can transport drug molecules through the plasma membrane and deliver these molecules to different cellular organelles. Thus, CPP identification and related mechanisms have been extensively explored. In order to reveal the penetration mechanisms of a large number of CPPs, it is necessary to develop convenient and fast methods for CPP identification. Methods: Biochemical experiments can provide precise details for accurately identifying CPPs, but these methods are expensive and laborious. To overcome these disadvantages, several computational methods have been developed to identify CPPs. We have reviewed the development of machine learning methods in CPP identification. This review provides insight into CPP identification. Results: We summarized the machine learning-based CPP identification methods and compared the construction strategies of 11 different computational methods. Furthermore, we pointed out the limitations and difficulties in predicting CPPs. Conclusion: In this review, the latest studies on CPP identification using machine learning methods were reported. We also discussed the future development direction of CPP recognition with computational methods.
Affiliation(s)
- Huan-Huan Wei
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Wuritu Yang
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Hua Tang
- Department of Pathophysiology, Southwest Medical University, Luzhou, China
| | - Hao Lin
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| |
|
41
|
Nagel S, Pommerenke C, MacLeod RAF, Meyer C, Kaufmann M, Fähnrich S, Drexler HG. Deregulated expression of NKL homeobox genes in T-cell lymphomas. Oncotarget 2019; 10:3227-3247. [PMID: 31143370 PMCID: PMC6524933 DOI: 10.18632/oncotarget.26929] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/22/2019] [Accepted: 04/29/2019] [Indexed: 11/25/2022] Open
Abstract
Recently, we presented a scheme, termed "NKL-code", which describes physiological expression patterns of NKL homeobox genes in early hematopoiesis and in lymphopoiesis, including the main stages of T-, B- and NK-cell development. Aberrant activity of these genes underlies the generation of hematological malignancies, notably T-cell leukemia. Here, we searched for deregulated NKL homeobox genes in the main entities of T-cell lymphomas, comprising angioimmunoblastic T-cell lymphoma (AITL), anaplastic large cell lymphoma (ALCL), adult T-cell leukemia/lymphoma (ATLL), hepatosplenic T-cell lymphoma (HSTL), NK/T-cell lymphoma (NKTL) and peripheral T-cell lymphoma (PTCL). Our data revealed altogether 19 aberrantly overexpressed genes in these types, demonstrating the involvement of deregulated NKL homeobox genes in T-cell lymphomas as well. For detailed analysis we focused on the NKL homeobox gene MSX1, which is normally expressed in NK-cells. MSX1 was overexpressed in subsets of HSTL patients and in the HSTL-derived sister cell lines DERL-2 and DERL-7, which served as models to characterize mechanisms of deregulation. We performed karyotyping, genomic and expression profiling, and whole genome sequencing to reveal mutated and deregulated gene candidates, including the fusion gene CD53-PDGFRB. Subsequent knockdown experiments allowed the reconstruction of an aberrant network involved in MSX1 deregulation, including the chromatin factor AUTS2 and the mutated histone HIST1H3B(K27M). The gene encoding AUTS2 is located at chromosome 7q11 and may represent a basic target of the HSTL hallmark aberration i(7q). Taken together, our findings highlight an oncogenic role for deregulated NKL homeobox genes in T-cell lymphoma and identify MSX1 as a novel player in HSTL, implicated in aberrant NK- and T-cell differentiation.
Affiliation(s)
- Stefan Nagel
- Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Claudia Pommerenke
- Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Roderick A F MacLeod
- Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Corinna Meyer
- Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Maren Kaufmann
- Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Silke Fähnrich
- Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Hans G Drexler
- Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| |
|
42
|
Maroilley T, Tarailo-Graovac M. Uncovering Missing Heritability in Rare Diseases. Genes (Basel) 2019; 10:E275. [PMID: 30987386 PMCID: PMC6523881 DOI: 10.3390/genes10040275] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/01/2019] [Revised: 03/29/2019] [Accepted: 04/01/2019] [Indexed: 12/14/2022] Open
Abstract
The problem of 'missing heritability' affects both common and rare diseases, hindering discovery, diagnosis, and patient care. The 'missing heritability' concept has been mainly associated with common and complex diseases where promising modern technological advances, like genome-wide association studies (GWAS), were unable to uncover the complete genetic mechanism of the disease/trait. Although rare diseases (RDs) have low prevalence individually, collectively they are common. Furthermore, multi-level genetic and phenotypic complexity, when combined with the individual rarity of these conditions, poses an important challenge in the quest to identify causative genetic changes in RD patients. In recent years, high throughput sequencing has accelerated discovery and diagnosis in RDs. However, despite the several-fold increase (from ~10% using traditional to ~40% using genome-wide genetic testing) in finding genetic causes of these diseases in RD patients, the majority of RDs, as is the case for common diseases, still face the 'missing heritability' problem. This review outlines the key role of high throughput sequencing in uncovering the genetics behind RDs, with a particular focus on genome sequencing. We review current advances and challenges of sequencing technologies, bioinformatics approaches, and resources.
Affiliation(s)
- Tatiana Maroilley
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada.
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada.
| | - Maja Tarailo-Graovac
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada.
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada.
| |
|
43
|
Wang X, Li H, Gao P, Liu Y, Zeng W. Combining Support Vector Machine with Dual g-gap Dipeptides to Discriminate between Acidic and Alkaline Enzymes. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180925125912] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
The catalytic activity of an enzyme differs from that of an inorganic catalyst. In a high-temperature, strongly acidic, or strongly alkaline environment, the structure of the enzyme is destroyed and it loses its activity. Although biochemical experiments can measure the optimal pH of an enzyme, these methods are inefficient and costly. To solve these problems, a computational model can be established to determine whether the optimal environment of an enzyme is acidic or alkaline. First, we introduce a new feature called dual g-gap dipeptide composition to formulate enzyme samples. Subsequently, the best features are selected using the F value calculated from analysis of variance. Finally, a support vector machine is used to build a prediction model for distinguishing acidic from alkaline enzymes. An overall accuracy of 95.9% was achieved with jackknife cross-validation, which indicates that our method is effective and efficient for predicting acidic and alkaline enzymes. The feature proposed in this paper could also be applied in other fields of bioinformatics.
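As an illustration of the feature named in this abstract, the following is a minimal sketch of a g-gap dipeptide composition: the frequency of each ordered residue pair separated by g intervening positions, giving a 400-dimensional vector. The abstract does not define the "dual" variant, so `dual_g_gap_features` below is a hypothetical reading that simply concatenates the vectors for two gap values; the function names and the choice to skip non-standard residues are assumptions, not the authors' implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order (AA, AC, ..., YY).
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def g_gap_dipeptide_composition(seq, g):
    """Return 400 frequencies of residue pairs separated by g positions.

    g = 0 gives the ordinary (adjacent) dipeptide composition.
    Pairs containing non-standard residues are skipped (an assumption).
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[p] / total for p in DIPEPTIDES]


def dual_g_gap_features(seq, g1, g2):
    """Hypothetical 'dual' feature: concatenate two g-gap compositions."""
    return g_gap_dipeptide_composition(seq, g1) + g_gap_dipeptide_composition(seq, g2)
```

In a pipeline like the one described, such vectors would then be ranked by an ANOVA F value and the top-ranked features passed to a support vector machine; those steps are not shown here.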
Affiliation(s)
- Xianfang Wang
- School of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
| | - Hongfei Li
- School of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
| | - Peng Gao
- School of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
| | - Yifeng Liu
- School of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
| | - Wenjing Zeng
- TianJiabing Middle School of Chengdu, Chengdu 610011, China
| |
|
44
|
Yonge F, Weixia X. Identification of Mitochondrial Proteins of Malaria Parasite Adding the New Parameter. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180608100348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
Malaria is a serious infectious disease caused by Plasmodium falciparum (P. falciparum). Mitochondrial proteins of P. falciparum are regarded as effective drug targets against malaria. Thus, it is necessary to accurately identify mitochondrial proteins of the malaria parasite. Many algorithms have been proposed for the prediction of mitochondrial proteins of the malaria parasite and have yielded good results. However, the parameters used by these methods were primarily based on amino acid sequences. In this study, we added a novel parameter for predicting mitochondrial proteins of the malaria parasite based on protein secondary structure. Firstly, we extracted three feature parameters, namely, three kinds of protein secondary structure compositions (3PSS), 20 amino acid compositions (20AAC) and 400 dipeptide compositions (400DC), and used the analysis of variance (ANOVA) to screen the 400 dipeptides. Secondly, we adopted these features to predict mitochondrial proteins of the malaria parasite by using a support vector machine (SVM). Finally, we found that 1) adding the feature of protein secondary structure (3PSS) can indeed improve the prediction accuracy, which demonstrates that protein secondary structure is a valid feature in the prediction of mitochondrial proteins of the malaria parasite; 2) feature combination can improve the prediction results, and feature selection can reduce the dimension and simplify the calculation. We achieved a sensitivity (Sn) of 98.16%, a specificity (Sp) of 97.64% and an overall accuracy (Acc) of 97.88%, with a Matthews correlation coefficient (MCC) of 0.957, by using 3PSS + 20AAC + 34DC as the feature set in 15-fold cross-validation. This result was compared with that of similar work on the same dataset, showing the superiority of our work.
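The four measures reported in this abstract (Sn, Sp, Acc and MCC) have standard definitions in terms of confusion-matrix counts. The following Python sketch shows those definitions; the function name is hypothetical, and the counts in the usage note are invented for illustration and are not the paper's data.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute Sn, Sp, Acc and MCC from confusion-matrix counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives and false negatives. Ratios with a zero denominator
    are reported as 0.0 (a convention chosen for this sketch).
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sn = ratio(tp, tp + fn)                    # sensitivity (recall)
    sp = ratio(tn, tn + fp)                    # specificity
    acc = ratio(tp + tn, tp + tn + fp + fn)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, denom)      # Matthews correlation coefficient
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}
```

For example, `binary_metrics(90, 80, 20, 10)` uses made-up counts for 100 positive and 100 negative samples. Unlike accuracy, MCC stays informative when the two classes are of very different sizes, which is one reason it is commonly reported alongside Sn and Sp.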
Affiliation(s)
- Feng Yonge
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China
| | - Xie Weixia
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China
| |
|
45
|
Kong L, Zhang L, Han X, Lv J. Protein Structural Class Prediction Based on Distance-related Statistical Features from Graphical Representation of Predicted Secondary Structure. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180914110451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Protein structural class prediction is beneficial to protein structure and function analysis. Exploring good feature representations is a key step for this prediction task. Prior works have demonstrated the effectiveness of secondary structure-based feature extraction methods, especially for low-similarity protein sequences. However, the prediction accuracies still remain limited. To explore the potential of secondary structure information, a novel feature extraction method based on a generalized chaos game representation of predicted secondary structure is proposed. Each protein sequence is converted into a 20-dimensional distance-related statistical feature vector to characterize the distribution of secondary structure elements and segments. The feature vectors are then fed into a support vector machine classifier to predict the protein structural class. Our experiments on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640) show that the proposed method achieves superior performance to the state-of-the-art methods. It is anticipated that our method could be extended to other graphical representations of protein sequences and be helpful in future protein research.
Affiliation(s)
- Liang Kong
- School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao, China
| | - Lichao Zhang
- College of Sciences, Northeastern University, Shenyang, China
| | | | - Jinfeng Lv
- School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao, China
| |
|
46
|
Chen X, Kost J, Sulovari A, Wong N, Liang WS, Cao J, Li D. A virome-wide clonal integration analysis platform for discovering cancer viral etiology. Genome Res 2019; 29:819-830. [PMID: 30872350 PMCID: PMC6499315 DOI: 10.1101/gr.242529.118] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 07/31/2018] [Accepted: 03/11/2019] [Indexed: 12/31/2022]
Abstract
Oncoviral infection is responsible for 12%–15% of cancer in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contamination and noncausal viruses complicate the process of identifying genuine oncoviruses. Here, we propose a novel strategy to address these challenges by performing virome-wide screening of early-stage clonal viral integrations. To implement this strategy, we developed VIcaller, a novel platform for identifying viral integrations that are derived from any characterized viruses and shared by a large proportion of tumor cells using whole-genome sequencing (WGS) data. The sensitivity and precision were confirmed with simulated and benchmark cancer data sets. By applying this platform to cancer WGS data sets with proven or speculated viral etiology, we newly identified or confirmed clonal integrations of hepatitis B virus (HBV), human papillomavirus (HPV), Epstein-Barr virus (EBV), and BK Virus (BKV), suggesting the involvement of these viruses in early stages of tumorigenesis in affected tumors, such as HBV in TERT and KMT2B (also known as MLL4) gene loci in liver cancer, HPV and BKV in bladder cancer, and EBV in non-Hodgkin's lymphoma. We also showed the capacity of VIcaller to identify integrations from some uncharacterized viruses. This is the first study to systematically investigate the strategy and method of virome-wide screening of clonal integrations to identify oncoviruses. Searching clonal viral integrations with our platform has the capacity to identify virus-caused cancers and discover cancer viral etiologies.
Affiliation(s)
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
| | - Jason Kost
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
| | - Arvis Sulovari
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
| | - Nathalie Wong
- Department of Anatomical and Cellular Pathology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, NT, Hong Kong 999077, P.R. China
| | - Winnie S Liang
- Translational Genomics Research Institute, Phoenix, Arizona 85004, USA
| | - Jian Cao
- Division of Medical Oncology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA.,Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA
| | - Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA.,Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA.,Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA
| |
|
47
|
Yang W, Zhu XJ, Huang J, Ding H, Lin H. A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization. Curr Bioinform 2019. [DOI: 10.2174/1574893613666181113131415] [Citation(s) in RCA: 111] [Impact Index Per Article: 18.5] [Indexed: 01/08/2023]
Abstract
Background: The location of proteins in a cell can provide important clues to their functions in various biological processes. Thus, the application of machine learning methods to the prediction of protein subcellular localization has become a hotspot in bioinformatics. As one of the key organelles, the Golgi apparatus is in charge of protein storage, packaging, and distribution. Objective: Identifying the location of proteins within the Golgi apparatus will provide in-depth insights into their functions. Thus, machine learning-based methods for predicting protein location in the Golgi apparatus have been extensively explored. The development of protein sub-Golgi localization prediction should be reviewed to provide a comprehensive background for the field. Method: The benchmark datasets, feature extraction, machine learning methods and published results were summarized. Results: We briefly introduced recent progress in protein sub-Golgi localization prediction using machine learning methods and discussed their advantages and disadvantages. Conclusion: We pointed out the prospects of machine learning methods in protein sub-Golgi localization prediction.
Affiliation(s)
- Wuritu Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Xiao-Juan Zhu
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Jian Huang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| |
|
48
|
Tang CSM, Li P, Lai FPL, Fu AX, Lau ST, So MT, Lui KNC, Li Z, Zhuang X, Yu M, Liu X, Ngo ND, Miao X, Zhang X, Yi B, Tang S, Sun X, Zhang F, Liu H, Liu Q, Zhang R, Wang H, Huang L, Dong X, Tou J, Cheah KSE, Yang W, Yuan Z, Yip KYL, Sham PC, Tam PKH, Garcia-Barcelo MM, Ngan ESW. Identification of Genes Associated With Hirschsprung Disease, Based on Whole-Genome Sequence Analysis, and Potential Effects on Enteric Nervous System Development. Gastroenterology 2018; 155:1908-1922.e5. [PMID: 30217742 DOI: 10.1053/j.gastro.2018.09.012] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 05/28/2018] [Revised: 08/28/2018] [Accepted: 09/05/2018] [Indexed: 12/27/2022]
Abstract
BACKGROUND & AIMS Hirschsprung disease, or congenital aganglionosis, is believed to be oligogenic, that is, caused by multiple genetic factors. We performed whole-genome sequence analyses of patients with Hirschsprung disease to identify genetic factors that contribute to disease development and analyzed the functional effects of these variants. METHODS We performed whole-genome sequence analyses of 443 patients with short-segment disease, recruited from hospitals in China and Vietnam, and 493 ethnically matched individuals without Hirschsprung disease (controls). We performed genome-wide association analyses and gene-based rare-variant burden tests to identify rare and common disease-associated variants and study their interactions. We obtained induced pluripotent stem cell (iPSC) lines from 4 patients with Hirschsprung disease and 2 control individuals, and we used these to generate enteric neural crest cells for transcriptomic analyses. We assessed the neuronal lineage differentiation capability of iPSC-derived enteric neural crest cells using an in vitro differentiation assay. RESULTS We identified 4 susceptibility loci, including 1 in the phospholipase D1 gene (PLD1) (P = 7.4 × 10⁻⁷). The patients had a significant excess of rare protein-altering variants in genes previously associated with Hirschsprung disease and in the β-secretase 2 gene (BACE2) (P = 2.9 × 10⁻⁶). The epistatic effects of common and rare variants across these loci provided a sensitized background that increased risk for the disease. In studies of the iPSCs, we observed common and distinct pathways associated with variants in RET that affect risk. In functional assays, we found variants in BACE2 to protect enteric neurons from apoptosis. We propose that alterations in BACE1 signaling via amyloid β precursor protein and BACE2 contribute to pathogenesis of Hirschsprung disease.
CONCLUSIONS In whole-genome sequence analyses of patients with Hirschsprung disease, we identified rare and common variants associated with disease risk. Using iPSCs, we discovered some functional effects of these variants.
Affiliation(s)
- Clara Sze-Man Tang
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China; Dr Li Dak-Sum Research Centre, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Peng Li
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Frank Pui-Ling Lai
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China; Dr Li Dak-Sum Research Centre, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Alexander Xi Fu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Sin-Ting Lau
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China; Dr Li Dak-Sum Research Centre, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Man Ting So
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Kathy Nga-Chu Lui
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Zhixin Li
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China; Dr Li Dak-Sum Research Centre, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Xuehan Zhuang
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Michelle Yu
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Xuelai Liu
- Hebei Medical University Second Hospital, Shijiazhuang, Hebei, China
| | - Ngoc D Ngo
- National Hospital of Pediatrics, Ha Noi, Viet Nam
| | - Xiaoping Miao
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Xi Zhang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Bin Yi
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Shaotao Tang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Xiaobing Sun
- Department of Paediatric Surgery, Shandong Medical University, Shandong, China
| | - Furen Zhang
- Shandong Provincial Institute of Dermatology and Venereology, Shandong Academy of Medical Sciences, Jinan, Shandong, China
| | - Hong Liu
- Shandong Provincial Institute of Dermatology and Venereology, Shandong Academy of Medical Sciences, Jinan, Shandong, China
| | - Qiji Liu
- The Key Laboratory for Experimental Teratology of the Ministry of Education, Shandong University School of Medicine, Jinan, Shandong, China
| | - Ruizhong Zhang
- Guangzhou Women and Children's Medical Center, Guangzhou, Guangdong, China
| | - Hualong Wang
- Changchun Children's Hospital, Changchun, Jilin, China
| | - Liuming Huang
- Bayi Children's Hospital, General Hospital of Beijing Military Region, Beijing, China
| | - Xiao Dong
- Shenzhen Children's Hospital, Shenzhen, Guangdong, China
| | - Jinfa Tou
- Zhejiang Children's Hospital, Hangzhou, Zhejiang, China
| | - Kathryn Song-Eng Cheah
- School of Biological Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Wanling Yang
- Department of Pediatrics and Adolescent Medicine, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Zhenwei Yuan
- Department of Paediatric Surgery, Shengjing Hospital, China Medical University, Shenyang, China
| | - Kevin Yuk-Lap Yip
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Pak-Chung Sham
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China; Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Paul Kwang-Hang Tam
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Maria-Mercè Garcia-Barcelo
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| | - Elly Sau-Wai Ngan
- Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Pokfulam, Hong Kong, China
| |
|
49
|
Wang P, Zhu W, Liao B, Cai L, Peng L, Yang J. Predicting Influenza Antigenicity by Matrix Completion With Antigen and Antiserum Similarity. Front Microbiol 2018; 9:2500. [PMID: 30405563 PMCID: PMC6206390 DOI: 10.3389/fmicb.2018.02500] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 05/30/2018] [Accepted: 10/01/2018] [Indexed: 12/20/2022] Open
Abstract
The rapid mutation of influenza viruses, especially in the two surface proteins hemagglutinin (HA) and neuraminidase (NA), has enabled them to escape from population immunity, which has become a key challenge for influenza vaccine design. Thus, it is crucial to predict influenza antigenic evolution and identify new antigenic variants in a timely manner. However, traditional experimental methods for selecting vaccine strains, such as the hemagglutination inhibition (HI) assay, are time- and labor-intensive, while popular computational methods are less sensitive, which presents the need for more accurate algorithms. In this study, we have proposed a novel low-rank matrix completion model, MCAAS, to infer antigenic distances between antigens and antisera based on partially revealed antigenic distances, virus similarity based on HA protein sequences, and vaccine similarity based on vaccine strains. The model exploits the correlations of viruses and vaccines in serological tests as well as the ability of HAs from viruses and vaccine strains to inform influenza antigenicity. We also compared the effects of a comprehensive set of 65 amino acid substitution matrices in predicting influenza antigenicity. We applied MCAAS to H3N2 seasonal influenza virus data. Our model achieved a 10-fold cross-validation root-mean-squared error (RMSE) of 0.5982, significantly outperforming existing computational methods such as antigenic cartography, AntigenMap and BMCSI. We also constructed the antigenic map and studied the association between genetic and antigenic evolution of H3N2 influenza viruses. Finally, our analyses showed that the homologous structure derived amino acid substitution matrix (HSDM) is the most powerful in predicting influenza antigenicity, which is consistent with previous studies.
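The error measure quoted in this abstract, RMSE over held-out entries of a partially observed antigen-by-antiserum matrix, can be sketched as follows. This only illustrates the evaluation metric, not the MCAAS model itself. The function name and the dictionary layout keyed by (antigen index, antiserum index) are assumptions of this sketch.

```python
import math


def held_out_rmse(true_values, predicted_values):
    """Root-mean-squared error over held-out matrix entries.

    Both arguments map (row, column) index pairs to distances.
    Only the entries present in true_values are scored, which mirrors
    evaluating a completed matrix on the cells hidden during training.
    """
    if not true_values:
        raise ValueError("no held-out entries to score")
    squared_error = sum(
        (true_values[key] - predicted_values[key]) ** 2
        for key in true_values
    )
    return math.sqrt(squared_error / len(true_values))
```

In a k-fold setting, the observed entries would be split into k groups, each group hidden in turn, and the resulting per-fold errors pooled or averaged; the abstract does not state which of these aggregation choices was used.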
Affiliation(s)
- Peng Wang
- College of Information Science and Engineering, Hunan University, Changsha, Changsha, China
| | - Wen Zhu
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Bo Liao
- College of Information Science and Engineering, Hunan University, Changsha, Changsha, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Lijun Cai
- College of Information Science and Engineering, Hunan University, Changsha, Changsha, China
| | - Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Jialiang Yang
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine At Mount Sinai, New York, NY, United States
| |
|
50
|
Yang H, Lv H, Ding H, Chen W, Lin H. iRNA-2OM: A Sequence-Based Predictor for Identifying 2'-O-Methylation Sites in Homo sapiens. J Comput Biol 2018; 25:1266-1277. [PMID: 30113871 DOI: 10.1089/cmb.2018.0004] [Citation(s) in RCA: 119] [Impact Index Per Article: 17.0] [Indexed: 01/27/2023] Open
Abstract
2'-O-methylation plays an important biological role in gene expression. Owing to the explosive increase in genomic sequencing data, it is necessary to develop a method for quickly and efficiently identifying whether a sequence contains a 2'-O-methylation site. As a complement to experimental techniques, a computational method may help to identify 2'-O-methylation sites. In this study, based on experimental 2'-O-methylation data from Homo sapiens, we proposed a support vector machine-based model to predict 2'-O-methylation sites in H. sapiens. In this model, the RNA sequences were encoded with the optimal features obtained from feature selection. In the fivefold cross-validation test, the accuracy reached 97.95%.
Affiliation(s)
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Hao Lv
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Wei Chen
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.,Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, China
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| |
|