1
|
Bing P, Liu W, Zhai Z, Li J, Guo Z, Xiang Y, He B, Zhu L. A novel approach for denoising electrocardiogram signals to detect cardiovascular diseases using an efficient hybrid scheme. Front Cardiovasc Med 2024; 11:1277123. [PMID: 38699582 PMCID: PMC11064874 DOI: 10.3389/fcvm.2024.1277123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 03/25/2024] [Indexed: 05/05/2024] Open
Abstract
Background Electrocardiogram (ECG) signals are inevitably contaminated with various kinds of noise during acquisition and transmission. This noise can yield misleading information about cardiac health and prevent specialists from making a correct analysis. Methods In this paper, an efficient strategy is proposed to denoise ECG signals. It employs a time-frequency framework based on the S-transform (ST) and combines bi-dimensional empirical mode decomposition (BEMD) with non-local means (NLM). The ST first maps an ECG signal into a subspace in the time-frequency domain; the BEMD then decomposes the ST-based time-frequency representation (TFR) into a series of sub-TFRs at different scales; finally, the NLM removes noise and restores ECG signal characteristics based on structural self-similarity. Results The proposed method is validated using numerous ECG signals from the MIT-BIH arrhythmia database, with several different types of noise at varying signal-to-noise ratios (SNR) taken into account. The experimental results show that the proposed technique is superior to the existing wavelet-based approach and to NLM filtering, achieving higher SNR and structural similarity index measure (SSIM) and lower root mean squared error (RMSE) and percent root mean square difference (PRD). Conclusions The proposed method not only significantly suppresses the noise present in ECG signals but also better preserves their characteristics, and is therefore more suitable for ECG signal processing.
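The abstract reports SNR, RMSE and PRD but does not spell out their formulas. As a minimal sketch, assuming the definitions commonly used in ECG denoising studies (the paper's exact definitions may differ), the three metrics can be computed from a clean reference signal and a denoised signal as follows. The function names are illustrative and not taken from the paper.

```python
import math


def snr_db(clean, denoised):
    """Output SNR in dB: power of the clean signal over power of the residual error."""
    signal_power = sum(x * x for x in clean)
    error_power = sum((x - y) ** 2 for x, y in zip(clean, denoised))
    return 10.0 * math.log10(signal_power / error_power)


def rmse(clean, denoised):
    """Root mean squared error between the clean and denoised signals."""
    squared_error = sum((x - y) ** 2 for x, y in zip(clean, denoised))
    return math.sqrt(squared_error / len(clean))


def prd(clean, denoised):
    """Percent root mean square difference, relative to the clean signal energy."""
    squared_error = sum((x - y) ** 2 for x, y in zip(clean, denoised))
    signal_energy = sum(x * x for x in clean)
    return 100.0 * math.sqrt(squared_error / signal_energy)


# Toy example (made-up samples, not MIT-BIH data): one sample is off by 1.
clean = [1.0, 2.0, 3.0, 4.0]
denoised = [1.0, 2.0, 3.0, 5.0]
print(snr_db(clean, denoised), rmse(clean, denoised), prd(clean, denoised))
```

Under these definitions SNR and PRD carry the same information, since SNR = -20 log10(PRD / 100). The sketch does not guard against a perfect reconstruction, where the error power is zero and the SNR is undefined.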
Affiliation(s)
- Pingping Bing
- Hunan Provincial Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha, China
| | - Wei Liu
- College of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing, China
| | - Zhixing Zhai
- College of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing, China
| | - Jianghao Li
- Hunan Provincial Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha, China
| | - Zhiqun Guo
- Hunan Provincial Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha, China
| | - Yanrui Xiang
- Hunan Provincial Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha, China
| | - Binsheng He
- Hunan Provincial Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha, China
| | - Lemei Zhu
- Hunan Provincial Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha, China
|
2
|
Zhu L, Xu R, Yang L, Shi W, Zhang Y, Liu J, Li X, Zhou J, Bing P. Minimal residual disease (MRD) detection in solid tumors using circulating tumor DNA: a systematic review. Front Genet 2023; 14:1172108. [PMID: 37636270 PMCID: PMC10448395 DOI: 10.3389/fgene.2023.1172108] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/23/2023] [Accepted: 04/20/2023] [Indexed: 08/29/2023] Open
Abstract
Minimal residual disease (MRD) refers to a very small number of residual tumor cells in the body during or after treatment, representing the persistence of the tumor and the possibility of clinical progression. Circulating tumor DNA (ctDNA) is a DNA fragment actively secreted by tumor cells or released into the circulatory system during apoptosis or necrosis of tumor cells, and it is emerging as a non-invasive biomarker for dynamically monitoring therapeutic effect and predicting recurrence. The feasibility of ctDNA for MRD detection and the revolution in ctDNA-based liquid biopsies provide a potential method for cancer monitoring. In this review, we summarize the main methods of ctDNA detection (PCR-based sequencing and next-generation sequencing) and their advantages and disadvantages. Additionally, we review the significance of ctDNA analysis in guiding adjuvant therapy and predicting relapse in lung, breast, colon, and other cancers. Finally, MRD detection still faces many challenges, such as a lack of standardization, misleading false-negative or false-positive results, and the need for validation in large independent cohorts to improve clinical outcomes.
Affiliation(s)
- Lemei Zhu
- Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha, China
- Academician Workstation, Changsha Medical University, Changsha, China
- School of Public Health, Changsha Medical University, Changsha, China
| | - Ran Xu
- Geneis Beijing Co., Ltd., Beijing, China
| | | | - Wei Shi
- Geneis Beijing Co., Ltd., Beijing, China
| | - Yuan Zhang
- Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha, China
- Academician Workstation, Changsha Medical University, Changsha, China
- School of Public Health, Changsha Medical University, Changsha, China
| | - Juan Liu
- Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha, China
- Academician Workstation, Changsha Medical University, Changsha, China
- School of Public Health, Changsha Medical University, Changsha, China
| | - Xi Li
- Department of Orthopedics, Xiangya Hospital Central South University, Changsha, China
| | - Jun Zhou
- Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha, China
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Pingping Bing
- Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha, China
- Academician Workstation, Changsha Medical University, Changsha, China
|
3
|
Liu H, Bing P, Zhang M, Tian G, Ma J, Li H, Bao M, He K, He J, He B, Yang J. MNNMDA: Predicting human microbe-disease association via a method to minimize matrix nuclear norm. Comput Struct Biotechnol J 2023; 21:1414-1423. [PMID: 36824227 PMCID: PMC9941872 DOI: 10.1016/j.csbj.2022.12.053] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/26/2022] [Revised: 12/29/2022] [Accepted: 12/30/2022] [Indexed: 01/03/2023] Open
Abstract
Identifying the potential associations between microbes and diseases is the first step in revealing the pathological mechanisms of microbe-associated diseases. However, traditional culture-based microbial experiments are expensive and time-consuming. Thus, it is critical to prioritize disease-associated microbes by computational methods for further experimental validation. In this study, we proposed a novel method called MNNMDA to predict microbe-disease associations (MDAs) by applying a matrix nuclear norm method to known microbe and disease data. Specifically, we first calculated Gaussian interaction profile kernel similarity and functional similarity for diseases and microbes. Then we constructed a heterogeneous information network by combining the integrated disease similarity network, the integrated microbe similarity network and the known microbe-disease bipartite network. Finally, we formulated the microbe-disease association prediction problem as a low-rank matrix completion problem, which was solved by minimizing the nuclear norm of a matrix with a few regularization terms. We tested the performance of MNNMDA on three datasets, HMDAD, Disbiome, and Combined Data, of small, medium and large size respectively. We also compared MNNMDA with five state-of-the-art methods: KATZHMDA, LRLSHMDA, NTSHMDA, GATMDA, and KGNMDA. MNNMDA achieved areas under the ROC curve (AUROC) of 0.9536 and 0.9364 on HMDAD and Disbiome respectively, better than the AUROCs of the compared methods under 5-fold cross-validation for all microbe-disease associations. It also obtained a relatively good performance, with an AUROC of 0.8858, on the combined data. In addition, MNNMDA was better than the other methods in area under the precision-recall curve (AUPR) under 5-fold cross-validation for all associations, and in both AUROC and AUPR under 5-fold cross-validation for diseases and for microbes. Finally, case studies on colon cancer and inflammatory bowel disease (IBD) also validated the effectiveness of MNNMDA. In conclusion, MNNMDA is an effective method for predicting microbe-disease associations. Availability The code and data for this paper are freely available on GitHub at https://github.com/Haiyan-Liu666/MNNMDA.
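The Gaussian interaction profile (GIP) kernel similarity mentioned in the abstract can be illustrated with a short sketch. It assumes the widely used formulation in which each row of a binary association matrix is an entity's interaction profile and the bandwidth is normalized by the mean squared profile norm. The function name `gip_kernel` and the parameter `gamma_prime` are illustrative; this is not code from the MNNMDA repository.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    K[i][j] = exp(-gamma * ||profiles[i] - profiles[j]||^2), where
    gamma = gamma_prime / (mean squared Euclidean norm of the rows).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy association matrix (made up): rows are microbes, columns are diseases.
associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
similarity = gip_kernel(associations)
```

Microbes with identical association profiles get similarity 1, and similarity decays with the number of differing associations. Transposing the matrix gives the corresponding disease similarity. The sketch assumes at least one known association; an all-zero matrix would cause a division by zero.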
Affiliation(s)
- Haiyan Liu
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,College of Information Engineering, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
| | - Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China
| | - Meijun Zhang
- Geneis Beijing Co., Ltd., Beijing 100102, PR China
| | - Geng Tian
- Geneis Beijing Co., Ltd., Beijing 100102, PR China
| | - Jun Ma
- College of Information Engineering, Changsha Medical University, Changsha 410219, PR China
| | - Haigang Li
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China,School of pharmacy, Changsha Medical University, Changsha 410219, PR China
| | - Meihua Bao
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China,School of pharmacy, Changsha Medical University, Changsha 410219, PR China
| | - Kunhui He
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China,School of pharmacy, Changsha Medical University, Changsha 410219, PR China
| | - Jianjun He
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China,School of pharmacy, Changsha Medical University, Changsha 410219, PR China,Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China.
| | - Binsheng He
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China,School of pharmacy, Changsha Medical University, Changsha 410219, PR China,Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China.
| | - Jialiang Yang
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China,Geneis Beijing Co., Ltd., Beijing 100102, PR China,School of pharmacy, Changsha Medical University, Changsha 410219, PR China,Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China.
|
4
|
He B, Wang K, Xiang J, Bing P, Tang M, Tian G, Guo C, Xu M, Yang J. DGHNE: network enhancement-based method in identifying disease-causing genes through a heterogeneous biomedical network. Brief Bioinform 2022; 23:6712302. [PMID: 36151744 DOI: 10.1093/bib/bbac405] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/16/2022] [Revised: 08/01/2022] [Accepted: 08/21/2022] [Indexed: 12/14/2022] Open
Abstract
The identification of disease-causing genes is critical for mechanistic understanding of disease etiology and for clinical intervention in disease prevention and treatment. Yet existing approaches to this question are inadequate in accuracy and efficiency, demanding computational methods with higher identification power. Here, we proposed a new method called DGHNE to identify disease-causing genes through a heterogeneous biomedical network empowered by network enhancement. First, a disease-disease association network was constructed from the cosine similarity scores between phenotype annotation vectors of diseases, and a new heterogeneous biomedical network was constructed by using disease-gene associations to connect the disease-disease network and the gene-gene network. Then, the heterogeneous biomedical network was further enhanced by using network embedding based on Gaussian random projection. Finally, network propagation was used to identify candidate genes in the enhanced network. We applied DGHNE together with five other methods to the most up-to-date disease-gene association database, DisGeNet. Compared with all other methods, DGHNE displayed the highest area under the receiver operating characteristic curve and under the precision-recall curve, as well as the highest precision and recall, both in global 5-fold cross-validation and in predicting new disease-gene associations. We further applied DGHNE to identify the candidate causal genes of Parkinson's disease and diabetes mellitus, and the genes connecting hyperglycemia and diabetes mellitus. In all cases, the predicted causal genes were enriched in disease-associated gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways, and the gene-disease associations were strongly supported by independent experimental studies.
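The abstract does not specify which network propagation scheme DGHNE uses. As an illustration only, the sketch below uses random walk with restart, one common form of network propagation, on a small undirected graph. It is a swapped-in generic technique rather than the authors' implementation, and the function name, parameters and toy graph are all assumptions.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Propagate scores from seed nodes over a graph given as a dense adjacency matrix.

    Iterates p <- (1 - restart) * W p + restart * p0, where W is the
    column-normalized adjacency matrix and p0 is uniform over the seeds.
    """
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability mass flowing into node i from its neighbours.
            walk = sum(
                adjacency[i][j] / col_sums[j] * p[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            nxt.append((1.0 - restart) * walk + restart * p0[i])
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only seed (made-up example).
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seeds=[0])
```

In a disease-gene setting, the seeds would be nodes already associated with the disease of interest, and unlabelled gene nodes would be ranked by their final scores. On the toy graph the scores fall off with distance from the seed.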
Affiliation(s)
- Binsheng He
- Academician Workstation, Changsha Medical University, Changsha 410219, China.,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China.,School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China
| | - Kun Wang
- School of Mathematical Sciences, Ocean University of China, Qingdao 266100, China
| | - Ju Xiang
- Academician Workstation, Changsha Medical University, Changsha 410219, China
| | - Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha 410219, China.,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China.,School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China
| | - Min Tang
- School of Life Sciences, Jiangsu University, Zhenjiang 212001, Jiangsu, China
| | - Geng Tian
- Geneis (Beijing) Co., Ltd., Beijing 100102, China
| | - Cheng Guo
- Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, NY, 10032, USA
| | - Miao Xu
- Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
| | - Jialiang Yang
- Academician Workstation, Changsha Medical University, Changsha 410219, China.,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China.,School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China.,Geneis (Beijing) Co., Ltd., Beijing 100102, China
|
5
|
Bing P, Liu Y, Liu W, Zhou J, Zhu L. Electrocardiogram classification using TSST-based spectrogram and ConViT. Front Cardiovasc Med 2022; 9:983543. [PMID: 36299867 PMCID: PMC9590285 DOI: 10.3389/fcvm.2022.983543] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/01/2022] [Accepted: 09/22/2022] [Indexed: 11/13/2022] Open
Abstract
As an important auxiliary tool for arrhythmia diagnosis, the electrocardiogram (ECG) is frequently used to detect a variety of cardiovascular diseases caused by arrhythmia, such as cardiac mechanical infarction. In the past few years, ECG classification has remained a challenging problem. This paper presents a novel deep learning model called the convolutional vision transformer (ConViT), which combines the vision transformer (ViT) with the convolutional neural network (CNN), for ECG arrhythmia classification. The unique soft convolutional inductive bias of its gated positional self-attention (GPSA) layers integrates the strengths of the attention mechanism and the convolutional architecture. Moreover, the time-reassigned synchrosqueezing transform (TSST), a newly developed time-frequency analysis (TFA) method in which the time-frequency coefficients are reassigned in the time direction, is employed to sharpen pulse traits for feature extraction. To address class imbalance in the traditional ECG database, the SMOTE algorithm and focal loss (FL) are used for data augmentation and minority-class weighting, respectively. Experiments on the MIT-BIH arrhythmia database indicate that the overall accuracy of the proposed model is as high as 99.5%. Furthermore, the specificity (Spe), F1-Score and positive Matthews Correlation Coefficient (MCC) of supraventricular ectopic beat (S) and ventricular ectopic beat (V) are all above 94%. These results demonstrate that the proposed method is superior to most existing methods.
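The role of focal loss in weighting hard, minority-class examples can be seen in a few lines. This is a minimal sketch of the standard per-sample focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The default values gamma = 2 and alpha = 1 are illustrative assumptions; the abstract does not state the settings used in the paper.

```python
import math


def focal_loss(prob_true_class, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class.

    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy.
    """
    modulating_factor = (1.0 - prob_true_class) ** gamma
    return -alpha * modulating_factor * math.log(prob_true_class)


# A confidently correct beat contributes far less than a badly misclassified one.
easy_example = focal_loss(0.9)
hard_example = focal_loss(0.1)
```

With gamma = 2, a sample predicted at 0.9 has its cross-entropy scaled by (1 - 0.9)^2 = 0.01, while a sample predicted at 0.1 keeps 81% of its cross-entropy. Training is therefore dominated by the hard examples, which in imbalanced ECG data tend to be the rare beat classes.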
Affiliation(s)
- Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Yang Liu
- College of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing, China
| | - Wei Liu
- College of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing, China
| | - Jun Zhou
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Lemei Zhu
- Academician Workstation, Changsha Medical University, Changsha, China
|
6
|
Zhang Y, Lin L, Wu Y, Bing P, Zhou J, Yu W. Upregulation of TIMM8A is correlated with prognosis and immune regulation in BC. Front Oncol 2022; 12:922178. [PMID: 36248992 PMCID: PMC9559820 DOI: 10.3389/fonc.2022.922178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2022] [Accepted: 09/12/2022] [Indexed: 11/25/2022] Open
Abstract
Backgrounds Breast cancer is a common malignant tumor in women. TIMM8A is up-regulated in different cancers. The aim of this work was to clarify the value of TIMM8A in the diagnosis and prognosis of breast cancer (BC), and its association with immune cells, immune checkpoints, and gene mutations. Methods The transcription and expression profiles of TIMM8A in BC and normal tissues were downloaded from The Cancer Genome Atlas (TCGA). The expression of TIMM8A protein was evaluated by human protein map. The correlation between TIMM8A and clinical features was analyzed using the R package to establish a ROC diagnostic curve. cBioPortal and MethSurv were used to identify gene alterations and DNA methylation and their effects on prognosis. The tumor immune estimation resource (TIMER) database and the tumor immune system interaction database (TISIDB) were used to determine the relationship between TIMM8A gene expression levels and immune infiltration. The CTD database was used to predict related drugs that inhibit TIMM8A, and the PubChem database was used to determine the molecular structure of potentially effective small-molecule drugs. Results The expression of TIMM8A in breast cancer tissues was significantly higher than that in normal tissues adjacent to the cancer. ROC curve analysis showed that the AUC value of TIMM8A was 0.679. Kaplan-Meier analysis showed that breast cancer patients with high TIMM8A expression had a poorer prognosis than those with low TIMM8A expression (overall survival HR = 1.83 (1.31-2.54), P < 0.001; 148.5 months vs. 115.4 months, P < 0.001). Methylation levels at seven CpG sites were associated with prognosis. Correlation analysis showed that TIMM8A expression was associated with tumor immune cell infiltration. There was a significant positive correlation of TIMM8A with PD-L1 and CTLA-4 in BC. In addition, CTD database analysis identified 15 small-molecule drugs that target TIMM8A, such as Cyclosporine, Leflunomide, and Tretinoin, which might be effective therapies for targeted inhibition of TIMM8A. Conclusion In breast cancer, up-regulated TIMM8A was significantly related to a lower survival rate and higher immune invasiveness. Our research showed that TIMM8A could be used as a biomarker for poor prognosis of breast cancer and as a potential target of immunotherapy.
Affiliation(s)
- Yu Zhang
- Endoscopy Center, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, China
| | - Lin Lin
- Department of Medical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, China
| | - Yunfei Wu
- Department of Thoracic Surgery, The Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Jun Zhou
- Academician Workstation, Changsha Medical University, Changsha, China
- *Correspondence: Jun Zhou; Wei Yu
| | - Wei Yu
- Department of Clinical Pharmacy, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, China
|
7
|
Xu J, Cui L, Zhuang J, Meng Y, Bing P, He B, Tian G, Kwok Pui C, Wu T, Wang B, Yang J. Evaluating the performance of dropout imputation and clustering methods for single-cell RNA sequencing data. Comput Biol Med 2022; 146:105697. [PMID: 35697529 DOI: 10.1016/j.compbiomed.2022.105697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Revised: 05/16/2022] [Accepted: 06/04/2022] [Indexed: 11/03/2022]
Abstract
Recent advances in single-cell RNA sequencing (scRNA-seq) provide exciting opportunities for transcriptome analysis at single-cell resolution. Clustering individual cells is a key step to reveal cell subtypes and infer cell lineage in scRNA-seq analysis. Although many dedicated algorithms have been proposed, clustering quality remains a computational challenge for scRNA-seq data, which is exacerbated by inflated zero counts due to various technical noise. To address this challenge, we assess the combinations of nine popular dropout imputation methods and eight clustering methods on a collection of 10 well-annotated scRNA-seq datasets with different sample sizes. Our results show that (i) imputation algorithms do typically improve the performance of clustering methods, and the quality of data visualization using t-Distributed Stochastic Neighbor Embedding; and (ii) the performance of a particular combination of imputation and clustering methods varies with dataset size. For example, the combination of single-cell analysis via expression recovery and Sparse Subspace Clustering (SSC) methods usually works well on smaller datasets, while the combination of adaptively-thresholded low-rank approximation and single-cell interpretation via multikernel learning (SIMLR) usually achieves the best performance on larger datasets.
Affiliation(s)
- Junlin Xu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
| | - Lingyu Cui
- College of Life Science, Northeast Forestry University, Harbin, Heilongjiang, 150000, China
| | - Jujuan Zhuang
- School of Science, Dalian Maritime University, Dalian, Liaoning, 116026, China
| | - Yajie Meng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
| | - Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha, 410219, China
| | - Binsheng He
- Academician Workstation, Changsha Medical University, Changsha, 410219, China
| | - Geng Tian
- Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, 266000, China
| | - Choi Kwok Pui
- Department of Statistics and Data Science, Department of Mathematics, National University of Singapore, Singapore, 117546, Republic of Singapore
| | - Taoyang Wu
- School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
| | - Bing Wang
- School of Electrical & Information Engineering, Anhui University of Technology, Anhui, 243002, China.
| | - Jialiang Yang
- Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, 266000, China.
|
8
|
Bing P, Zhou W, Tan S. Study on the Mechanism of Astragalus Polysaccharide in Treating Pulmonary Fibrosis Based on "Drug-Target-Pathway" Network. Front Pharmacol 2022; 13:865065. [PMID: 35370663 PMCID: PMC8964346 DOI: 10.3389/fphar.2022.865065] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/03/2022] [Accepted: 02/16/2022] [Indexed: 02/01/2023] Open
Abstract
Pulmonary fibrosis is a chronic, progressive and irreversible heterogeneous disease of the pulmonary interstitial tissue. Its incidence is increasing year by year worldwide, and it is expected to increase further because of the COVID-19 pandemic. However, there is currently no safe and effective treatment for this disease, so finding drugs with high efficacy and fewer adverse reactions is highly meaningful. Natural astragalus polysaccharide has an anti-pulmonary-fibrosis pharmacological effect with few toxic and side effects, but its mechanism of action is not yet clear. Based on network pharmacology and molecular docking, this study analyzes the mechanism of Astragalus polysaccharides in treating pulmonary fibrosis, providing a theoretical basis for further clinical application. The active components of Astragalus polysaccharides were screened using the Swisstarget database, and the related targets of pulmonary fibrosis were screened using the GeneCards database. Protein-protein interaction network analysis and molecular docking were carried out to verify the docking affinity of the active ingredients. Through screening, we obtained 92 potential targets of Astragalus polysaccharides for treating pulmonary fibrosis, including 11 core targets. Astragalus polysaccharides act on multiple targets and multiple pathways, and their mechanism of action may involve regulating the expression of VCAM1, RELA, CDK2, JUN, CDK1, HSP90AA1, NOS2, SOD1, CASP3, AHSA1, PTGER3 and other genes during the development of pulmonary fibrosis.
Affiliation(s)
- Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Wenhu Zhou
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Songwen Tan
- Academician Workstation, Changsha Medical University, Changsha, China
|
9
|
Xie X, Wang X, Liang Y, Yang J, Wu Y, Li L, Sun X, Bing P, He B, Tian G, Shi X. Evaluating Cancer-Related Biomarkers Based on Pathological Images: A Systematic Review. Front Oncol 2021; 11:763527. [PMID: 34900711 PMCID: PMC8660076 DOI: 10.3389/fonc.2021.763527] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/24/2021] [Accepted: 10/18/2021] [Indexed: 12/12/2022] Open
Abstract
Many diseases are accompanied by changes in certain biochemical indicators, called biomarkers, in cells or tissues. A variety of biomarkers, including proteins, nucleic acids, antibodies, and peptides, have been identified. Tumor biomarkers have been widely used in cancer risk assessment, early screening, diagnosis, prognosis, treatment, and progression monitoring. For example, the number of circulating tumor cells (CTCs) is a prognostic indicator of overall survival in breast cancer, and tumor mutation burden (TMB) can be used to predict the efficacy of immune checkpoint inhibitors. Currently, clinical methods such as polymerase chain reaction (PCR) and next generation sequencing (NGS) are mainly adopted to evaluate these biomarkers, but they are time-consuming and expensive. Pathological image analysis is an essential tool in medical research, disease diagnosis and treatment, functioning by extracting important physiological and pathological information or knowledge from medical images. Recently, deep learning-based analysis of pathological images and morphology to predict tumor biomarkers has attracted great attention from both the medical imaging and machine learning communities, as this combination not only reduces the burden on pathologists but also saves considerable cost and time. Therefore, it is necessary to summarize the current workflow for processing pathological images and the key steps and methods used in each stage, including: (1) pre-processing of pathological images, (2) image segmentation, (3) feature extraction, and (4) feature model construction. This will help people choose better and more appropriate medical image processing methods when predicting tumor biomarkers.
Affiliation(s)
- Xiaoliang Xie
- Department of Colorectal Surgery, General Hospital of Ningxia Medical University, Yinchuan, China.,College of Clinical Medicine, Ningxia Medical University, Yinchuan, China
| | - Xulin Wang
- Department of Oncology Surgery, Central Hospital of Jia Mu Si City, Jia Mu Si, China
| | - Yuebin Liang
- Geneis Beijing Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Jingya Yang
- Geneis Beijing Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China.,School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan, China
| | - Yan Wu
- Geneis Beijing Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Li Li
- Beijing Shanghe Jiye Biotech Co., Ltd., Bejing, China
| | - Xin Sun
- Department of Medical Affairs, Central Hospital of Jia Mu Si City, Jia Mu Si, China
| | - Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Binsheng He
- Academician Workstation, Changsha Medical University, Changsha, China
| | - Geng Tian
- Geneis Beijing Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China.,IBMC-BGI Center, T`he Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, China
| | - Xiaoli Shi
- Geneis Beijing Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| |
Collapse
|
10
|
Yang Z, Zhou Y, Li H, Lei J, Bing P, He B, Li Y. A Facile Route to Pyrazolo[1,2‐a]cinnoline via Rhodium(III)‐catalyzed Annulation of Pyrazolidinones and Iodonium Ylides. Asian J Org Chem 2021. [DOI: 10.1002/ajoc.202100656] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/16/2022]
Affiliation(s)
- Zi Yang
  - Academician Workstation, Changsha Medical University, Changsha 410219, P. R. China
- Yi Zhou
  - Academician Workstation, Changsha Medical University, Changsha 410219, P. R. China
- Haigang Li
  - Academician Workstation, Changsha Medical University, Changsha 410219, P. R. China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China
- Jieni Lei
  - Academician Workstation, Changsha Medical University, Changsha 410219, P. R. China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha 410219, P. R. China
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha 410219, P. R. China
- Yaqian Li
  - Academician Workstation, Changsha Medical University, Changsha 410219, P. R. China
11
Lang J, Zhu R, Sun X, Zhu S, Li T, Shi X, Sun Y, Yang Z, Wang W, Bing P, He B, Tian G. Evaluation of the MGISEQ-2000 Sequencing Platform for Illumina Target Capture Sequencing Libraries. Front Genet 2021; 12:730519. [PMID: 34777467 PMCID: PMC8578046 DOI: 10.3389/fgene.2021.730519] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/25/2021] [Accepted: 09/24/2021] [Indexed: 01/19/2023] Open
Abstract
Illumina is the leading sequencing platform in the global next-generation sequencing (NGS) market. In recent years, MGI Tech has presented a series of new sequencers, including DNBSEQ-T7, MGISEQ-2000 and MGISEQ-200. As a complex application of NGS, cancer-detecting panels place increasing demands on the accuracy and sensitivity of sequencing and data analysis. In this study, we used the same capture DNA libraries, constructed according to the Illumina protocol, to evaluate the performance of the Illumina NextSeq 500 and MGISEQ-2000 sequencing platforms. We found that the two platforms were highly consistent in the results of hotspot mutation analysis; more importantly, we found a significant loss of fragments in the 101-133 bp size range on the MGISEQ-2000 sequencing platform for Illumina libraries, but not for capture DNA libraries prepared according to the MGISEQ protocol. This phenomenon may indicate fragment selection or low fragment ligation efficiency during the DNA circularization step, which is unique to the MGISEQ-2000 sequencing platform. In conclusion, these different sequencing libraries and their corresponding sequencing platforms are compatible with each other, but protocol and platform selection need to be carefully evaluated in light of the research purpose.
Affiliation(s)
- Jidong Lang
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China; Academician Workstation, Changsha Medical University, Changsha, China
- Rongrong Zhu
  - Vascular Surgery Department, Tsinghua University Affiliated Beijing Tsinghua Changgung Hospital, Beijing, China
- Xue Sun
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Siyu Zhu
  - Department of Medicine, School of Medicine, University of California at San Diego, La Jolla, CA, United States
- Tianbao Li
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Xiaoli Shi
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Yanqi Sun
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Zhou Yang
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Weiwei Wang
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Geng Tian
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
12
He B, Hou F, Ren C, Bing P, Xiao X. A Review of Current In Silico Methods for Repositioning Drugs and Chemical Compounds. Front Oncol 2021; 11:711225. [PMID: 34367996 PMCID: PMC8340770 DOI: 10.3389/fonc.2021.711225] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/18/2021] [Accepted: 07/07/2021] [Indexed: 12/23/2022] Open
Abstract
Drug repositioning is a new way of applying the existing therapeutics to new disease indications. Due to the exorbitant cost and high failure rate in developing new drugs, the continued use of existing drugs for treatment, especially anti-tumor drugs, has become a widespread practice. With the assistance of high-throughput sequencing techniques, many efficient methods have been proposed and applied in drug repositioning and individualized tumor treatment. Current computational methods for repositioning drugs and chemical compounds can be divided into four categories: (i) feature-based methods, (ii) matrix decomposition-based methods, (iii) network-based methods, and (iv) reverse transcriptome-based methods. In this article, we comprehensively review the widely used methods in the above four categories. Finally, we summarize the advantages and disadvantages of these methods and indicate future directions for more sensitive computational drug repositioning methods and individualized tumor treatment, which are critical for further experimental validation.
Affiliation(s)
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Fangxing Hou
  - Queen Mary School, Nanchang University, Jiangxi, China
- Changjing Ren
  - School of Science, Dalian Maritime University, Dalian, China; Geneis Beijing Co., Ltd., Beijing, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
- Xiangzuo Xiao
  - Department of Radiology, The First Affiliated Hospital of Nanchang University, Jiangxi, China
13
Liu H, Qiu C, Wang B, Bing P, Tian G, Zhang X, Ma J, He B, Yang J. Evaluating DNA Methylation, Gene Expression, Somatic Mutation, and Their Combinations in Inferring Tumor Tissue-of-Origin. Front Cell Dev Biol 2021; 9:619330. [PMID: 34012960 PMCID: PMC8126648 DOI: 10.3389/fcell.2021.619330] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Received: 10/20/2020] [Accepted: 03/22/2021] [Indexed: 12/18/2022] Open
Abstract
Carcinoma of unknown primary (CUP) is a type of metastatic cancer whose primary tumor site cannot be identified. CUP accounts for approximately 5% of cancer cases in the United States and usually has an unfavorable prognosis, making it a serious threat to public health. Traditional methods for identifying the tissue-of-origin (TOO) of CUP, such as immunohistochemistry, can only resolve around 20% of CUP patients. In recent years, a growing number of studies have suggested that the problem can be addressed by integrating machine learning techniques with large biomedical datasets involving multiple types of biomarkers, including epigenetic, genetic, and gene expression profiles, such as DNA methylation. Different biomarkers play different roles in cancer research; for example, genomic mutations in a patient’s tumor can point to specific anticancer drugs for treatment, while DNA methylation and copy number variation can reveal tumor tissue of origin and molecular classification. However, there has been no systematic comparison of which biomarker is better at identifying the cancer type and site of origin. In addition, it might be possible to further improve inference accuracy by integrating multiple types of biomarkers. In this study, we used primary tumor data rather than metastatic tumor data. Although the use of primary tumors may introduce some bias into our classification model, their tissues of origin are known. In addition, previous studies have suggested that CUP prediction models built from primary tumors can efficiently predict the TOO of metastatic cancers (Lal et al., 2013; Brachtel et al., 2016). We systematically compared the performance of three types of biomarkers (DNA methylation, gene expression profile, and somatic mutation), as well as their combinations, in inferring the TOO of CUP patients.
First, we downloaded the gene expression profile, somatic mutation, and DNA methylation data of 7,224 tumor samples across 21 common cancer types from The Cancer Genome Atlas (TCGA) and generated seven different feature matrices through various combinations. Second, we performed feature selection by the Pearson correlation method. The selected features for each matrix were used to build an XGBoost multi-label classification model, an algorithm shown to be effective in several previous studies, to infer cancer TOO. The performance of each biomarker and combination was compared by 10-fold cross-validation. Our results showed that the TOO tracing accuracy using gene expression profile was the highest, followed by DNA methylation, while somatic mutation performed the worst. Meanwhile, we found that simply combining multiple biomarkers does little to improve prediction accuracy.
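The feature-selection and cross-validation steps named in this abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`pearson`, `select_top_features`, `k_fold_indices`) are hypothetical, the labels are assumed to be a numeric one-vs-rest encoding (the abstract does not say how class labels were encoded for the correlation), and the folds are contiguous rather than shuffled or stratified. The XGBoost classifier itself is left out.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_top_features(rows, labels, k):
    """Return indices of the k columns with the largest |r| against labels.

    Assumption: `labels` is numeric (e.g. 1 for the class of interest, 0 otherwise).
    Ties are broken by the lower column index.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

With the 7,224 samples mentioned above, `k_fold_indices(7224, 10)` produces four test folds of 723 samples and six of 722. In practice the sample order would normally be shuffled, and often stratified by cancer type, before splitting.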
Affiliation(s)
- Haiyan Liu
  - Academician Workstation, Changsha Medical University, Changsha, China; College of Information Engineering, Changsha Medical University, Changsha, China
- Chun Qiu
  - Department of Oncology, Hainan General Hospital, Haikou, China
- Bo Wang
  - Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
- Geng Tian
  - Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Xueliang Zhang
  - Department of Oncology, Jiamusi Cancer Hospital, Jiamusi, China
- Jun Ma
  - College of Information Engineering, Changsha Medical University, Changsha, China
- Bingsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha, China; Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
14
Abstract
Background: Thymidylate synthase (TS) is an important target for folic acid inhibitors such as pemetrexed, which has considerable effects in first-line treatment, second-line treatment, and maintenance therapy for patients with late-stage non-small cell lung cancer (NSCLC). Therefore, detecting mutations in the TYMS gene encoding TS is critical in clinical applications. With the development of next-generation sequencing (NGS) technology, the accuracy of TYMS mutation detection keeps improving. However, traditional methods suffer from false positives and false negatives caused by factors such as limited sequencing read length and sequencing errors.
Objective: A method was needed to overcome the short read length and sequencing errors of NGS and make the detection of TYMS mutations more accurate.
Methods: In this study, we developed a novel method based on "Paired Seed Sequence Distance" (PSSD) to detect variable number of tandem repeat (VNTR) mutations in TYMS.
Results: With the 121 samples validated by Sanger sequencing, the consistency rate of the PSSD method was 85.95% (104/121), higher than that of the strict matching method (78.51%, 95/121). The consistency rate between the two methods was 89.26% (108/121). We also found that the PSSD method was significantly better than the strict matching method, especially in 4R typing.
Conclusion: Our method not only improves the detection rate and accuracy of TYMS VNTR mutations but also avoids problems caused by sequencing errors and limited sequencing length. This method provides a new solution for similar polymorphism analyses and other sequencing analyses.
Affiliation(s)
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
- Geng Tian
  - Geneis (Beijing) Co. Ltd., Beijing, 100102, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
- Jidong Lang
  - Geneis (Beijing) Co. Ltd., Beijing, 100102, China
15
He B, Dai C, Lang J, Bing P, Tian G, Wang B, Yang J. A machine learning framework to trace tumor tissue-of-origin of 13 types of cancer based on DNA somatic mutation. Biochim Biophys Acta Mol Basis Dis 2020; 1866:165916. [PMID: 32771416 DOI: 10.1016/j.bbadis.2020.165916] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 05/28/2020] [Revised: 07/20/2020] [Accepted: 08/03/2020] [Indexed: 12/13/2022]
Abstract
Carcinoma of unknown primary (CUP), defined as metastatic cancer with unknown cancer origin, occurs in 3-5 per 100 cancer patients in the United States. The heterogeneity and metastasis of cancer bring great difficulties to the follow-up diagnosis and treatment of CUP. Multiple methods have been proposed to find the tissue-of-origin (TOO) of CUP. However, the accuracies of computed tomography (CT) and positron emission tomography (PET) in identifying the TOO were 20%-27% and 24%-40%, respectively, which is not enough for determining targeted therapies. In this study, we provide a machine learning framework to trace tumor tissue origin by using gene length-normalized somatic mutation sequencing data. Somatic mutation data were downloaded from the Data Portal (Release 28) of the International Cancer Genome Consortium (ICGC), and 4909 samples from 13 cancers were used to identify the primary site of cancers. Optimal results were obtained with a 600-gene set using the random forest algorithm with 10-fold cross-validation; the average accuracy and F1-score were 0.8822 and 0.8886, respectively, across the 13 types of cancer. In conclusion, we provide an effective computational framework to infer cancer tissue-of-origin by combining DNA sequencing and machine learning techniques, which is promising for assisting the clinical diagnosis of cancers.
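The idea of gene length-normalized mutation features mentioned in this abstract can be sketched in a few lines of Python. The abstract does not give the exact normalization, so this sketch assumes the simplest form: the mutation count per gene divided by the gene length, scaled to mutations per kilobase. The function name, the `per_bases` parameter, and the gene names and lengths in the usage note are all hypothetical.

```python
def length_normalized_mutation_rates(mutation_counts, gene_lengths, per_bases=1000):
    """Return mutations per `per_bases` bases for every gene in `gene_lengths`.

    Assumption: normalization is count / length; the source abstract does not
    state the exact formula. Genes with no recorded mutations get a rate of 0.0.
    """
    rates = {}
    for gene, length in gene_lengths.items():
        if length <= 0:
            raise ValueError(f"gene length must be positive: {gene}")
        rates[gene] = mutation_counts.get(gene, 0) * per_bases / length
    return rates
```

For one sample with counts `{"GENE_A": 4, "GENE_B": 1}` and lengths `{"GENE_A": 2000, "GENE_B": 500, "GENE_C": 1000}`, both mutated genes get the same rate of 2.0 per kilobase even though their raw counts differ, which is the bias the normalization is meant to remove.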
Affiliation(s)
- Bingsheng He
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
- Chan Dai
  - Geneis Beijing Co., Ltd., Beijing 100102, China
- Jidong Lang
  - Geneis Beijing Co., Ltd., Beijing 100102, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
- Geng Tian
  - Geneis Beijing Co., Ltd., Beijing 100102, China
- Bo Wang
  - Geneis Beijing Co., Ltd., Beijing 100102, China
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha 410219, China; Geneis Beijing Co., Ltd., Beijing 100102, China
16
He B, Lu Q, Lang J, Yu H, Peng C, Bing P, Li S, Zhou Q, Liang Y, Tian G. A New Method for CTC Images Recognition Based on Machine Learning. Front Bioeng Biotechnol 2020; 8:897. [PMID: 32850745 PMCID: PMC7423836 DOI: 10.3389/fbioe.2020.00897] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/09/2020] [Accepted: 07/13/2020] [Indexed: 12/18/2022] Open
Abstract
Circulating tumor cells (CTCs) derived from primary and/or metastatic tumors are markers of tumor prognosis and can also be used to monitor therapeutic efficacy and tumor recurrence. CTC enrichment and screening can be automated, but the final counting of CTCs currently requires manual intervention. This not only requires the participation of experienced pathologists but is also prone to human misjudgment. Medical image recognition based on machine learning can effectively reduce the workload and improve the level of automation. We therefore used machine learning to identify CTCs. First, we collected the CTC test results of 600 patients. After immunofluorescence staining, each picture presented a positive CTC cell nucleus and several negative controls. The CTC images were then segmented by image denoising, image filtering, edge detection, and image expansion and contraction techniques using Python's OpenCV library. Subsequently, traditional image recognition methods and machine learning were used to identify CTCs. The machine learning approach was implemented as a convolutional neural network (CNN). We took 2300 cells from 600 patients for training and testing: about 1300 cells were used for training and the rest for testing. The sensitivity and specificity of recognition reached 90.3% and 91.3%, respectively. We will further revise our models, hoping to achieve higher sensitivity and specificity.
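The two reported metrics follow the standard definitions: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The short Python sketch below shows how they could be computed from binary labels. It is only an illustration under the assumption that 1 marks a CTC and 0 a non-CTC cell; the function name is hypothetical and the example labels in the usage note are made up, not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = CTC, 0 = non-CTC)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # CTC correctly recognized
        elif truth == 1 and pred == 0:
            fn += 1  # CTC missed
        elif truth == 0 and pred == 0:
            tn += 1  # non-CTC correctly rejected
        else:
            fp += 1  # non-CTC wrongly called a CTC
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)
```

For example, `sensitivity_specificity([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 1, 0, 0, 1])` counts 3 true positives, 1 false negative, 3 true negatives, and 1 false positive, giving 0.75 for both metrics.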
Affiliation(s)
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Qingqing Lu
  - Geneis (Beijing) Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Jidong Lang
  - Geneis (Beijing) Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Hai Yu
  - Geneis (Beijing) Co., Ltd., Beijing, China
- Chao Peng
  - Geneis (Beijing) Co., Ltd., Beijing, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
- Shijun Li
  - Department of Pathology, Chifeng Municipal Hospital, Chifeng, China
- Qiliang Zhou
  - Academician Workstation, Changsha Medical University, Changsha, China
- Yuebin Liang
  - Geneis (Beijing) Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Geng Tian
  - Geneis (Beijing) Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
17
He B, Zhang Y, Zhou Z, Wang B, Liang Y, Lang J, Lin H, Bing P, Yu L, Sun D, Luo H, Yang J, Tian G. A Neural Network Framework for Predicting the Tissue-of-Origin of 15 Common Cancer Types Based on RNA-Seq Data. Front Bioeng Biotechnol 2020; 8:737. [PMID: 32850691 PMCID: PMC7419649 DOI: 10.3389/fbioe.2020.00737] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 03/05/2020] [Accepted: 06/10/2020] [Indexed: 12/19/2022] Open
Abstract
Sequencing-based identification of tumor tissue-of-origin (TOO) is critical for patients with cancer of unknown primary lesions. Even if the TOO of a tumor can be diagnosed by clinicopathological observation, reevaluation by computational methods can help avoid misdiagnosis. In this study, we developed a neural network (NN) framework using the expression of a 150-gene panel to infer the TOO for 15 common solid tumor types, including lung, breast, liver, colorectal, gastroesophageal, ovarian, cervical, endometrial, pancreatic, bladder, head and neck, thyroid, prostate, kidney, and brain cancers. To begin with, we downloaded the RNA-Seq data of 7,460 primary tumor samples across the above-mentioned 15 cancer types, with each type having between 142 and 1,052 samples, from The Cancer Genome Atlas. Then, we performed feature selection by the Pearson correlation method and analyzed the resulting 150-gene panel; the genes were significantly enriched in GO:2001242 (regulation of intrinsic apoptotic signaling pathway), GO:0009755 (hormone-mediated signaling pathway), and other similar functions. Next, we developed a novel NN model using the 150 genes to predict the TOO for the 15 cancer types. The average prediction sensitivity and precision of the framework are 93.36% and 94.07%, respectively, for the 7,460 tumor samples based on 10-fold cross-validation; moreover, the prediction sensitivity and precision for a few specific cancers, such as prostate cancer, reached 100%. We also tested the trained model on a 20-sample independent dataset of metastatic tumors and achieved 80% accuracy. In summary, we present a highly accurate method to infer tumor TOO, which has potential for clinical implementation.
Affiliation(s)
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Zhen Zhou
  - Department of Radiology, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing, China
- Bo Wang
  - Geneis (Beijing) Co., Ltd., Beijing, China
- Huixin Lin
  - Geneis (Beijing) Co., Ltd., Beijing, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
- Lan Yu
  - Inner Mongolia People's Hospital, Huhhot, China
- Dejun Sun
  - Inner Mongolia People's Hospital, Huhhot, China
- Huaiqing Luo
  - Academician Workstation, Changsha Medical University, Changsha, China
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha, China; Geneis (Beijing) Co., Ltd., Beijing, China
- Geng Tian
  - Geneis (Beijing) Co., Ltd., Beijing, China
18
He B, Zhu R, Yang H, Lu Q, Wang W, Song L, Sun X, Zhang G, Li S, Yang J, Tian G, Bing P, Lang J. Assessing the Impact of Data Preprocessing on Analyzing Next Generation Sequencing Data. Front Bioeng Biotechnol 2020; 8:817. [PMID: 32850708 PMCID: PMC7409520 DOI: 10.3389/fbioe.2020.00817] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/05/2020] [Accepted: 06/26/2020] [Indexed: 11/13/2022] Open
Abstract
Data quality control and preprocessing are often the first step in processing next-generation sequencing (NGS) data of tumors. They not only help evaluate the quality of sequencing data but also help obtain high-quality data for downstream analysis. However, by comparing the analysis results obtained after preprocessing with Cutadapt, FastP, and Trimmomatic against those from raw sequencing data, we found that the frequency of mutation detection showed some fluctuations and differences, and that human leukocyte antigen (HLA) typing directly produced erroneous results. Our research demonstrates the impact of data preprocessing steps on downstream data analysis results. We hope that it can promote the development or optimization of better data preprocessing methods, so that downstream analysis can be more accurate.
Affiliation(s)
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Rongrong Zhu
  - Vascular Surgery Department, Tsinghua University Affiliated Beijing Tsinghua Changgung Hospital, Beijing, China
- Huandong Yang
  - Department of Gastrointestinal Surgery, Yidu Central Hospital of Weifang, Weifang, China
- Lei Song
  - Geneis Beijing Co., Ltd., Beijing, China
- Xue Sun
  - Geneis Beijing Co., Ltd., Beijing, China
- Shijun Li
  - Department of Pathology, Chifeng Municipal Hospital, Chifeng, China
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha, China; Geneis Beijing Co., Ltd., Beijing, China
- Geng Tian
  - Geneis Beijing Co., Ltd., Beijing, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
19
He B, Lang J, Wang B, Liu X, Lu Q, He J, Gao W, Bing P, Tian G, Yang J. TOOme: A Novel Computational Framework to Infer Cancer Tissue-of-Origin by Integrating Both Gene Mutation and Expression. Front Bioeng Biotechnol 2020; 8:394. [PMID: 32509741 PMCID: PMC7248358 DOI: 10.3389/fbioe.2020.00394] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 02/19/2020] [Accepted: 04/08/2020] [Indexed: 02/05/2023] Open
Abstract
Metastatic cancers require further diagnosis to determine their primary tumor sites. However, according to statistics from the United States, the tissue-of-origin of around 5% of tumors cannot be identified by routine medical diagnosis. With the development of machine learning techniques and the accumulation of large cancer datasets in The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), it is now feasible to predict cancer tissue-of-origin with computational tools. A metastatic tumor inherits characteristics from its tissue-of-origin, and both gene expression profiles and somatic mutations have tissue specificity. Thus, we developed a computational framework to infer tumor tissue-of-origin by integrating both gene mutation and expression (TOOme). Specifically, we first perform feature selection on both gene expression and mutation data by a random forest method. The selected features are then used to build a multi-label classification model to infer cancer tissue-of-origin. We adopt a few popular multi-label classification methods and compare them by 10-fold cross-validation. We applied TOOme to TCGA data containing 7,008 non-metastatic samples across 20 solid tumors. Seventy-four genes from the gene expression profiles and six genes from the mutation data were selected by the random forest process; they can be divided into two categories: (1) cancer type-specific genes and (2) genes expressed or mutated in several cancers with different expression levels or mutation rates. Function analysis indicates that the selected genes are significantly enriched in gland development, urogenital system development, hormone metabolic process, thyroid hormone generation, prostate hormone generation, and so on. Among the multi-label classification methods compared, random forest performs best, with a 10-fold cross-validation prediction accuracy of 96%.
We also used the 19 metastatic samples from TCGA and 256 cancer samples downloaded from GEO as independent test data, on which TOOme achieves a prediction accuracy of 89%. The cross-validation accuracy is better than that obtained using gene expression (95%) or gene mutation (53%) alone. In conclusion, TOOme provides a quick yet accurate alternative to traditional medical methods for inferring cancer tissue-of-origin. In addition, methods combining somatic mutation and gene expression outperform those using gene expression or mutation alone.
Affiliation(s)
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Bo Wang
  - Geneis Beijing Co., Ltd., Beijing, China
- Jianjun He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Wei Gao
  - Fujian Provincial Cancer Hospital, Fuzhou, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
- Geng Tian
  - Geneis Beijing Co., Ltd., Beijing, China
20
Liu X, Lang J, Li S, Wang Y, Peng L, Wang W, Han Y, Qi C, Song L, Yang S, Zhang K, Zang G, Pei H, Lu Q, Peng Y, Xi S, Wang W, Yuan D, Bing P, Zhou L, Tian G. Fragment Enrichment of Circulating Tumor DNA With Low-Frequency Mutations. Front Genet 2020; 11:147. [PMID: 32180799 PMCID: PMC7059766 DOI: 10.3389/fgene.2020.00147] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/27/2019] [Accepted: 02/07/2020] [Indexed: 02/03/2023] Open
Abstract
Human blood contains cell-free DNA (cfDNA), with circulating tumor-derived DNAs (ctDNAs) widely used in cancer diagnosis and treatment. However, it is still difficult to efficiently and accurately identify and distinguish specific ctDNAs from normal cfDNA in cancer patient blood samples. In this study, ctDNA fragment length distribution analysis showed that ctDNA fragments are frequently shorter than the normal cfDNAs, which is consistent with previous findings. Interestingly, the ctDNA fragment length was found to be partially associated with the mutant allele frequency, with a low mutant allele frequency (< ~0.6%) associated with a longer ctDNA fragment length when compared to normal cfDNAs. The findings of this study contribute to improving the detection of low-frequency tumor mutations.
Affiliation(s)
- Xiaojun Liu
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Jidong Lang
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Shijun Li
  - Department of Pathology, Chifeng Municipal Hospital, Chifeng, China
- Yuehua Wang
  - Department of Pathology, Chifeng Municipal Hospital, Chifeng, China
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Weitao Wang
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Yingmin Han
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Cuixiao Qi
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Lei Song
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Shuangshuang Yang
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Kaixin Zhang
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Guoliang Zang
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Hong Pei
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Qingqing Lu
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Yonggang Peng
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Shuxue Xi
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Weiwei Wang
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Dawei Yuan
  - Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Pingping Bing
  - Academics Working Station, Changsha Medical University, Changsha, China
- Liqian Zhou
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Geng Tian
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China; Bioinformatics Department, Geneis (Beijing) Co. Ltd., Beijing, China
21
Bai M, Liu H, Xu K, Zhang X, Deng B, Tan C, Deng J, Bing P, Yin Y. Compensation effects of coated cysteamine on meat quality, amino acid composition, fatty acid composition, mineral content in dorsal muscle and serum biochemical indices in finishing pigs offered reduced trace minerals diet. Sci China Life Sci 2019; 62:1550-1553. [PMID: 31418137 DOI: 10.1007/s11427-018-9399-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/06/2019] [Accepted: 05/21/2019] [Indexed: 11/27/2022]
Affiliation(s)
- Miaomiao Bai
  - National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, Hunan Province Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Provincial Engineering Research Center of Healthy Livestock, Key Laboratory of Agro-ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, 410125, China
  - College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
- Hongnan Liu
  - National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, Hunan Province Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Provincial Engineering Research Center of Healthy Livestock, Key Laboratory of Agro-ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, 410125, China
  - Hunan Co-Innovation Center of Animal Production Safety, Changsha, 410128, China
- Kang Xu
  - National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, Hunan Province Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Provincial Engineering Research Center of Healthy Livestock, Key Laboratory of Agro-ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, 410125, China
  - Hunan Co-Innovation Center of Animal Production Safety, Changsha, 410128, China
- Xiaofeng Zhang
  - Hangzhou King Techina Technology Company Academician Expert Workstation, Hangzhou King Techina Technology Co., Ltd., Hangzhou, 311107, China
- Baichuan Deng
  - College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
- Chengquan Tan
  - College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
- Jinping Deng
  - College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
- Pingping Bing
  - Academics Working Station, Changsha Medical University, Changsha, 410219, China
- Yulong Yin
  - National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, Hunan Province Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Provincial Engineering Research Center of Healthy Livestock, Key Laboratory of Agro-ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, 410125, China
  - College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
  - Hunan Co-Innovation Center of Animal Production Safety, Changsha, 410128, China
|
22
|
Bing P, Maode L, Li F, Sheng H. Expression of Renal Transforming Growth Factor-β and Its Receptors in a Rat Model of Chronic Cyclosporine-Induced Nephropathy. Transplant Proc 2006; 38:2176-9. [PMID: 16980035 DOI: 10.1016/j.transproceed.2006.07.015] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 10/24/2022]
Abstract
OBJECTIVE We sought to detect expression of transforming growth factor-beta 1 (TGF-beta1) as well as its receptors type I (TRI) and type II (TRII) in rat kidneys during chronic cyclosporine (CsA)-induced nephropathy. METHODS Twenty-four rats were randomly divided into three experimental groups: group 1, NSD (control, n = 8), was administered a normal sodium diet; group 2, LSD (n = 8), was administered a low sodium diet; and group 3, CsA (n = 8), comprised sodium-depleted rats administered Neoral by gastric gavage as a model of chronic CsA-induced nephropathy. TGF-beta1, TRI, and TRII proteins, as well as TRI and TRII mRNAs, were measured in the CsA-treated rat kidneys by immunohistochemistry and in situ hybridization, respectively. Semiquantitative results were obtained by image analysis. RESULTS The expression of TGF-beta1, TRI, and TRII proteins and of TRI and TRII mRNAs was increased in CsA-treated rat kidneys compared with the NSD or LSD groups (P < .05). CONCLUSION Our study showed that TGF-beta1 and its receptors TRI and TRII were all up-regulated. It may be important to inhibit the expression of TGF-beta1 or its receptors in patients who suffer from chronic CsA-induced nephropathy.
Affiliation(s)
- P Bing
- Surgery Department, West China Hospital, SiChuan University, Guoxuexiang 37, Chengdu 610041, P.R. China.
|
23
|
Bing P, Maode L, Li F, Sheng H. Comparison of Expression of TGF-β1, its Receptors TGFβ1R-I and TGFβ1R-II in Rat Kidneys During Chronic Nephropathy Induced by Cyclosporine and Tacrolimus. Transplant Proc 2006; 38:2180-2. [PMID: 16980036 DOI: 10.1016/j.transproceed.2006.06.102] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
Abstract
OBJECTIVE Chronic rejection is a major cause of graft dysfunction after kidney transplantation. Long-term treatment with cyclosporine (CsA) or tacrolimus (FK506) results in chronic nephrotoxicity. Transforming growth factor-beta1 (TGF-beta1) and its receptors type I (TR-I) and type II (TR-II) are known to contribute to this side effect. The expression of TGF-beta1, TR-I, and TR-II in rat kidneys has not been compared during chronic nephropathy induced by CsA or FK506. METHODS Rat models of chronic CsA- or FK506-induced nephropathy were established using Sandimun Neoral or Prograf administration. The kidneys were dissected, and TGF-beta1, TR-I, and TR-II proteins and TR-I and TR-II mRNAs were measured by immunohistochemistry and in situ hybridization, respectively, to compare the results of the two groups. RESULTS The functional and morphologic studies showed that the nephrotoxic effects of FK506 in the rats were not as significant as those of CsA. Immunohistochemistry and in situ hybridization showed that the expression of renal TGF-beta1, TR-I, and TR-II proteins and of TR-I and TR-II mRNAs was lower in the FK506 group than in the CsA group. CONCLUSION These results showed that both FK506 and CsA display nephrotoxicity, but that the nephrotoxicity of FK506 was less than that of CsA in chronic nephropathy.
Affiliation(s)
- P Bing
- Department of Surgery, West China Hospital, Sichuan University, Guoxuexiang 37, Chengdu 610041, P.R. China.
|