1
Yang CH, Hou MF, Chuang LY, Yang CS, Lin YD. Dimensionality reduction approach for many-objective epistasis analysis. Brief Bioinform 2023; 24:6858949. [PMID: 36458451 DOI: 10.1093/bib/bbac512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 10/07/2022] [Accepted: 10/26/2022] [Indexed: 12/04/2022] Open
Abstract
In epistasis analysis, single-nucleotide polymorphism-single-nucleotide polymorphism interactions (SSIs) among genes may, alongside environmental factors, influence the risk of multifactorial diseases. When identifying SSIs between cases and controls (i.e. binary traits), the model quality score depends on the objective function (i.e. measure) used, because of potential disease model preferences and disease complexities. Our previous study proposed a multiobjective approach-based multifactor dimensionality reduction (MOMDR), with the results indicating that two objective functions could enhance the identification of SSIs with weak marginal effects. However, SSI identification using MOMDR remains a challenge because the optimal combination of objective functions has yet to be investigated. This study extended MOMDR to a many-objective version (i.e. many-objective MDR, MaODR) by integrating various disease probability measures based on a two-way contingency table to improve the identification of SSIs between cases and controls. We introduced an objective function selection approach to determine the optimal measure combination in MaODR among 10 well-known measures. In total, 6 disease models with and 40 disease models without marginal effects were used to evaluate the general algorithms, namely those based on multifactor dimensionality reduction (MDR), MOMDR and MaODR. Our results revealed that, through the application of the objective function selection approach, the MaODR-based three-objective-function model combining correct classification rate, likelihood ratio and normalized mutual information (MaODR-CLN) achieved a detection success rate (accuracy) 6.47% higher than MOMDR and 17.23% higher than MDR. In a Wellcome Trust Case Control Consortium data set, MaODR-CLN successfully identified significant SSIs (P < 0.001) associated with coronary artery disease.
We performed a systematic analysis to identify the optimal measure combination in MaODR among 10 objective functions. Our combination detected SSIs-based binary traits with weak marginal effects and thus reduced spurious variables in the score model. MOAI is freely available at https://sites.google.com/view/maodr/home.
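As a rough illustration of the MDR-style scoring that MOMDR and MaODR build on, the sketch below labels each multi-SNP genotype cell as high or low risk by comparing its case:control ratio with the overall ratio, then computes the correct classification rate (CCR) from the resulting two-way contingency table. The function name `mdr_ccr`, the input layout and the tie handling are assumptions made for illustration; this is not the authors' implementation, and the other nine measures are not shown.

```python
from collections import Counter

def mdr_ccr(genotypes, labels):
    """Correct classification rate of an MDR-style high/low-risk split.

    genotypes: list of tuples, one genotype combination per subject (hypothetical layout)
    labels: list of 1 (case) / 0 (control); assumes at least one control is present
    """
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    threshold = sum(cases.values()) / sum(controls.values())  # overall case:control ratio

    tp = fp = tn = fn = 0
    for cell in set(cases) | set(controls):
        a, b = cases[cell], controls[cell]
        # A cell is "high risk" when its case:control ratio exceeds the overall ratio;
        # ties are treated as low risk (an arbitrary choice for this sketch).
        if a > threshold * b:
            tp += a
            fp += b
        else:
            fn += a
            tn += b
    return (tp + tn) / (tp + fp + tn + fn)
```

With the toy input `genotypes = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1), (0, 1), (0, 0)]` and `labels = [0, 0, 0, 1, 1, 1, 1, 0]`, only the `(1, 1)` cell is labelled high risk and seven of the eight subjects are classified correctly.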
Affiliation(s)
- Cheng-Hong Yang
- Department of Information Management at the Tainan University of Technology, and at the Department of Electronic Engineering at National Kaohsiung University of Science and Technology, Taiwan; Biomedical Engineering, Kaohsiung Medical University, Taiwan
- Ming-Feng Hou
- Kaohsiung Medical University Hospital, and Professor at the Department of Surgery, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering at I-Shou University, Taiwan
- Cheng-San Yang
- Department of Plastic Surgery, and serves as the Medical Matters Secretary of Chia-Yi Christian Hospital, Taiwan
- Yu-Da Lin
- Department of Computer Science and Information Engineering at the National Penghu University of Science and Technology, Taiwan
2
Yang CH, Wu KC, Chuang LY, Chang HW. DeepBarcoding: Deep Learning for Species Classification Using DNA Barcoding. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2158-2165. [PMID: 33600318 DOI: 10.1109/tcbb.2021.3056570] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
DNA barcodes with short sequence fragments are used for species identification. Advances in sequencing technologies have made DNA sequences from different organisms easy and quick to acquire, and DNA barcodes have received growing attention as a result. DNA sequence analysis tools therefore play an increasingly crucial role in species identification. This study proposed deep barcoding, a deep learning framework for species classification using DNA barcodes. Deep barcoding takes raw sequence data as input, represents it through one-hot encoding as a one-dimensional image, and analyzes it with a deep convolutional neural network followed by a fully connected deep neural network. It achieved an average accuracy of >90% for both simulated and real datasets. Although deep learning yields outstanding performance for species classification with DNA sequences, its application remains a challenge. The deep barcoding model is a potential tool for species classification and may help elucidate DNA barcode-based species identification.
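The one-hot representation mentioned in the abstract can be sketched as follows. This is a minimal pure-Python illustration; the base order `ACGT`, the function name `one_hot_encode` and the all-zero row for ambiguous bases such as `N` are assumptions, since the abstract does not specify these details.

```python
BASES = "ACGT"  # assumed channel order; the abstract does not state one

def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Bases outside ACGT (e.g. N) are encoded as all-zero rows (an assumption).
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in sequence.upper():
        row = [0] * len(BASES)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded
```

The resulting length-by-4 matrix can be treated as a one-dimensional image with four channels and passed to a 1D convolutional layer in whichever deep learning library is used.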
3
Chen JB, Yang HS, Moi SH, Chuang LY, Yang CH. Identification of mortality-risk-related missense variant for renal clear cell carcinoma using deep learning. Ther Adv Chronic Dis 2021; 12:2040622321992624. [PMID: 33643601 PMCID: PMC7890720 DOI: 10.1177/2040622321992624] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/20/2020] [Accepted: 01/13/2021] [Indexed: 11/24/2022] Open
Abstract
Introduction: Kidney renal clear cell carcinoma (KIRCC) is a highly heterogeneous and lethal cancer that can arise in patients with renal disease. DeepSurv combines a deep feed-forward neural network with a Cox proportional hazards function and could provide optimized survival results compared with conventional survival analysis. Methods: This study used an improved DeepSurv algorithm to identify candidate genes to be targeted for treatment on the basis of the overall mortality status of KIRCC subjects. All somatic mutation missense variants of KIRCC subjects were extracted from the TCGA-KIRC database. Results: The improved DeepSurv model achieved greater balanced accuracy (95.1%) than the DeepSurv model (75%), and identified 610 high-risk variants associated with overall mortality. The results of gene differential expression analysis also indicated nine KIRCC mortality-risk-related pathways, namely the tRNA charging pathway, the D-myo-inositol-5-phosphate metabolism pathway, the DNA double-strand break repair by nonhomologous end-joining pathway, the superpathway of inositol phosphate compounds, the 3-phosphoinositide degradation pathway, the production of nitric oxide and reactive oxygen species in macrophages pathway, the synaptic long-term depression pathway, the sperm motility pathway, and the role of JAK2 in hormone-like cytokine signaling pathway. The biological findings in this study indicate that the KIRCC mortality-risk-related pathways were more likely to be associated with cancer cell growth, cancer cell differentiation, and immune response inhibition. Conclusion: The results showed that, in the context of KIRCC overall mortality, the improved DeepSurv model effectively classified mortality-related high-risk variants and identified the candidate genes.
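For readers unfamiliar with how a network is coupled to a Cox proportional hazards function, the sketch below computes the standard negative log partial likelihood that DeepSurv-style models minimize, with the network's outputs playing the role of `risk_scores`. The function name and the Breslow-style handling of tied times are assumptions; the abstract does not describe what the "improved" variant changes, so none of that is represented here.

```python
import math

def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log partial likelihood of the Cox model.

    risk_scores: predicted log-risk per subject (e.g. network outputs)
    times: observed follow-up times
    events: 1 if death was observed, 0 if censored
    """
    loss = 0.0
    for h_i, t_i, e_i in zip(risk_scores, times, events):
        if e_i != 1:
            continue  # censored subjects contribute only through others' risk sets
        # Risk set: every subject still under observation at time t_i.
        risk_set_sum = sum(math.exp(h_j) for h_j, t_j in zip(risk_scores, times) if t_j >= t_i)
        loss -= h_i - math.log(risk_set_sum)
    return loss
```

The loss decreases when subjects who die earlier are assigned higher risk scores than those who survive longer, which is the ordering a survival model is trained to learn.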
Affiliation(s)
- Jin-Bor Chen
- Division of Nephrology, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Kaohsiung
- Huai-Shuo Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung
- Sin-Hua Moi
- Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung
- Li-Yeh Chuang
- Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, 415 Jiangong Road, San-Min District, Kaohsiung, 82444
4
Tang J, Wang Y, Luo Y, Fu J, Zhang Y, Li Y, Xiao Z, Lou Y, Qiu Y, Zhu F. Computational advances of tumor marker selection and sample classification in cancer proteomics. Comput Struct Biotechnol J 2020; 18:2012-2025. [PMID: 32802273 PMCID: PMC7403885 DOI: 10.1016/j.csbj.2020.07.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/07/2020] [Revised: 07/06/2020] [Accepted: 07/08/2020] [Indexed: 12/11/2022] Open
Abstract
Cancer proteomics has become a powerful technique for characterizing the protein markers driving malignant transformation, tracing proteome variation triggered by therapeutics, and discovering novel targets and drugs for the treatment of oncologic diseases. To facilitate cancer diagnosis/prognosis and accelerate drug target discovery, a variety of methods for tumor marker identification and sample classification have been developed and successfully applied to cancer proteomic studies. This review article describes the most recent advances in those approaches together with their current applications in cancer-related studies. Firstly, a number of popular feature selection methods are reviewed, with an objective evaluation of their advantages and disadvantages. Secondly, these methods are grouped into three major classes based on their underlying algorithms. Finally, a variety of sample separation algorithms are discussed. This review provides a comprehensive overview of the advances in tumor marker identification and patient/sample/tissue separation, which could serve as guidance for research in cancer proteomics.
Key Words
- ANN, Artificial Neural Network
- ANOVA, Analysis of Variance
- CFS, Correlation-based Feature Selection
- Cancer proteomics
- Computational methods
- DAPC, Discriminant Analysis of Principal Component
- DT, Decision Trees
- EDA, Estimation of Distribution Algorithm
- FC, Fold Change
- GA, Genetic Algorithms
- GR, Gain Ratio
- HC, Hill Climbing
- HCA, Hierarchical Cluster Analysis
- IG, Information Gain
- LDA, Linear Discriminant Analysis
- LIMMA, Linear Models for Microarray Data
- MBF, Markov Blanket Filter
- MWW, Mann–Whitney–Wilcoxon test
- OPLS-DA, Orthogonal Partial Least Squares Discriminant Analysis
- PCA, Principal Component Analysis
- PLS-DA, Partial Least Square Discriminant Analysis
- RF, Random Forest
- RF-RFE, Random Forest with Recursive Feature Elimination
- SA, Simulated Annealing
- SAM, Significance Analysis of Microarrays
- SBE, Sequential Backward Elimination
- SFS, Sequential Forward Selection
- SOM, Self-organizing Map
- SU, Symmetrical Uncertainty
- SVM, Support Vector Machine
- SVM-RFE, Support Vector Machine with Recursive Feature Elimination
- Sample classification
- Tumor marker selection
- sPLSDA, Sparse Partial Least Squares Discriminant Analysis
- t-SNE, t-Distributed Stochastic Neighbor Embedding
- χ2, Chi-square
Affiliation(s)
- Jing Tang
- Department of Bioinformatics, Chongqing Medical University, Chongqing 400016, China; College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yunxia Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yongchao Luo
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Jianbo Fu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yang Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; School of Pharmaceutical Sciences and Innovative Drug Research Centre, Chongqing University, Chongqing 401331, China
- Yi Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ziyu Xiao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yan Lou
- Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou 310000, China
- Yunqing Qiu
- Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou 310000, China
- Feng Zhu
- Department of Bioinformatics, Chongqing Medical University, Chongqing 400016, China; College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
5
Chuang LY, Yang CS, Yang HS, Yang CH. Identification of High-Order Single-Nucleotide Polymorphism Barcodes in Breast Cancer Using a Hybrid Taguchi-Genetic Algorithm: Case-Control Study. JMIR Med Inform 2020; 8:e16886. [PMID: 32554381 PMCID: PMC7351259 DOI: 10.2196/16886] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/04/2019] [Revised: 02/09/2020] [Accepted: 04/08/2020] [Indexed: 12/24/2022] Open
Abstract
Background: Breast cancer imposes a major disease burden on the female population, and it is a highly genome-associated human disease. However, in genetic studies of complex diseases, modern geneticists face challenges in detecting interactions among loci. Objective: This study aimed to investigate whether variations of single-nucleotide polymorphisms (SNPs) are associated with histopathological tumor characteristics in breast cancer patients. Methods: A hybrid Taguchi-genetic algorithm (HTGA) was proposed to identify the high-order SNP barcodes in a breast cancer case-control study. A Taguchi method was used to enhance a genetic algorithm (GA) for identifying high-order SNP barcodes. The Taguchi method was integrated into the GA after the crossover operations in order to systematically optimize the generated offspring and thereby enhance the GA search ability. Results: The proposed HTGA effectively converged to a promising region within the problem space and provided excellent SNP barcode identification. Regression analysis was used to validate the association between breast cancer and the identified high-order SNP barcodes. The maximum OR was less than 1 (range 0.870-0.755) for two- to seven-order SNP barcodes. Conclusions: We systematically evaluated the interaction effects of 26 SNPs within growth factor–related genes for breast carcinogenesis pathways. The HTGA could successfully identify relevant high-order SNP barcodes by evaluating the differences between cases and controls. The validation results showed that the HTGA can provide better fitness values than other methods for the identification of high-order SNP barcodes using breast cancer case-control data sets.
Affiliation(s)
- Cheng-San Yang
- Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
- Huai-Shuo Yang
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City, Taiwan
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City, Taiwan; Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung, Taiwan; College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
6
Yang CH, Kao YK, Chuang LY, Lin YD. Catfish Taguchi-Based Binary Differential Evolution Algorithm for Analyzing Single Nucleotide Polymorphism Interactions in Chronic Dialysis. IEEE Trans Nanobioscience 2018; 17:291-299. [PMID: 29994217 DOI: 10.1109/tnb.2018.2844342] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/08/2022]
Abstract
Single-nucleotide polymorphism (SNP)-SNP interactions are crucial for understanding the association between disease-related multifactorials for disease analysis. Existing statistical methods for determining such interactions are limited by the considerable computation required for evaluating all potential associations between disease-related multifactorials. Identifying SNP-SNP interactions is thus a major challenge in genetic association studies. This paper proposes a catfish Taguchi-based binary differential evolution (CT-BDE) algorithm for identifying SNP-SNP interactions. In the search space, the catfish effect prevents the premature convergence of the population, and the Taguchi method improves the search ability of the BDE algorithm. Hence, the proposed algorithm enables obtaining a favorable solution regarding the identification of high-order SNP-SNP interactions. Additionally, the proposed algorithm applies an effective fitness function derived from a multifactor dimensionality reduction (MDR) operation to evaluate the solutions from BDE-based algorithms. Simulated and real data sets were used to evaluate the ability of several BDE-based algorithms in identifying specific SNP-SNP interactions. We compared the fitness function derived from the MDR operation with that derived according to the difference between cases and controls, by using the different BDE-based algorithms. The results showed that the proposed CT-BDE algorithm applying the fitness function derived from the MDR operation exhibited a superior ability in identifying SNP-SNP interactions compared with the other BDE-based algorithms.
7
Yang CH, Wu KC, Chuang LY, Chang HW. Decision Tree Algorithm-Generated Single-Nucleotide Polymorphism Barcodes of rbcL Genes for 38 Brassicaceae Species Tagging. Evol Bioinform Online 2018; 14:1176934318760856. [PMID: 29551885 PMCID: PMC5846911 DOI: 10.1177/1176934318760856] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/21/2017] [Accepted: 01/24/2018] [Indexed: 01/17/2023] Open
Abstract
DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence longer than 1000 base pairs, which generates a computational burden. Although DNA barcodes were originally envisioned as straightforward species tags, the use of barcode sequences for identification is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies suggest that SNPs may be an ideal target of feature selection for discriminating between different species. We hypothesize that SNP-based barcodes may be more effective than full-length DNA barcode sequences for species discrimination. To address this issue, we tested a ribulose diphosphate carboxylase (rbcL) SNP barcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were used to set up the decision rules that assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using the RSB method. Taken together, this study provides the rationale that the SNP aspect of the DNA barcode for the rbcL gene is a useful and effective sequence for tagging 38 Brassicaceae species.
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Graduate Institute of Clinical Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Kuo-Chuan Wu
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Computer Science and Information Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
- Li-Yeh Chuang
- Department of Chemical Engineering, Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
- Hsueh-Wei Chang
- Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung, Taiwan; Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan; Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan
8
Yang CH, Weng ZJ, Chuang LY, Yang CS. Identification of SNP-SNP interaction for chronic dialysis patients. Comput Biol Med 2017; 83:94-101. [DOI: 10.1016/j.compbiomed.2017.02.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/10/2016] [Revised: 02/14/2017] [Accepted: 02/15/2017] [Indexed: 01/10/2023]
9
Yang CH, Lin YD, Chuang LY, Chang HW. Analysis of high-order SNP barcodes in mitochondrial D-loop for chronic dialysis susceptibility. J Biomed Inform 2016; 63:112-119. [PMID: 27507088 DOI: 10.1016/j.jbi.2016.08.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 04/05/2016] [Revised: 06/26/2016] [Accepted: 08/05/2016] [Indexed: 12/18/2022]
Abstract
OBJECTIVES: Positively identifying disease-associated single nucleotide polymorphism (SNP) markers in genome-wide studies entails the complex association analysis of a huge number of SNPs. The large number of possible SNP barcodes (SNP/genotype combinations) continues to pose serious computational challenges, especially for high-dimensional data. METHODS: We propose a novel method for exploring SNP barcodes based on differential evolution, termed IDE (improved differential evolution). IDE uses a "top combination strategy" to improve the ability of differential evolution to explore high-order SNP barcodes in high-dimensional data. RESULTS: We simulate disease data and use real chronic dialysis data to test four global optimization algorithms. In 48 simulated disease models, we show that IDE outperforms existing global optimization algorithms in terms of exploring ability and power to detect the specific SNP/genotype combinations with a maximum difference between cases and controls. In real data, we show that IDE can be used to evaluate the relative effects of each individual SNP on disease susceptibility. CONCLUSION: IDE generated significant SNP barcodes with less computational complexity than the other algorithms, making IDE ideally suited for analysis of high-order SNP barcodes.
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
- Yu-Da Lin
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
- Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan.
- Hsueh-Wei Chang
- Institute of Medical Science and Technology, National Sun Yat-Sen University, Kaohsiung, Taiwan; Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan.
10
Fu OY, Chang HW, Lin YD, Chuang LY, Hou MF, Yang CH. Breast cancer-associated high-order SNP-SNP interaction of CXCL12/CXCR4-related genes by an improved multifactor dimensionality reduction (MDR-ER). Oncol Rep 2016; 36:1739-47. [PMID: 27461876 DOI: 10.3892/or.2016.4956] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 02/12/2016] [Accepted: 03/03/2016] [Indexed: 11/06/2022] Open
Abstract
In association studies, the combined effects of single nucleotide polymorphism (SNP)-SNP interactions and the problem of imbalanced data between cases and controls are frequently ignored. In the present study, we used an improved multifactor dimensionality reduction (MDR) approach, namely MDR-ER, to detect high-order SNP‑SNP interactions in an imbalanced breast cancer data set containing seven SNPs of chemokine CXCL12/CXCR4 pathway genes. Most individual SNPs were not significantly associated with breast cancer. After MDR‑ER analysis, six significant SNP‑SNP interaction models with seven genes (highest cross‑validation consistency, 10; classification error rates, 41.3‑21.0; and prediction error rates, 47.4‑55.3) were identified. CD4 and VEGFA genes were associated in a 2‑loci interaction model (classification error rate, 41.3; prediction error rate, 47.5; odds ratio (OR), 2.069; 95% bootstrap CI, 1.40‑2.90; P=1.71E‑04), and this pair also appeared in all the best 2‑7‑loci models. When the loci number increased, the classification error rates and P‑values decreased. The powers in 2‑7‑loci in all models were >0.9. The minimum classification error rate of the MDR‑ER‑generated model was shown with the 7‑loci interaction model (classification error rate, 21.0; OR=15.282; 95% bootstrap CI, 9.54‑23.87; P=4.03E‑31). In the epistasis network analysis, the overall effect on breast cancer susceptibility was identified and the SNP order of impact on breast cancer was identified as follows: CD4 = VEGFA > KITLG > CXCL12 > CCR7 = MMP2 > CXCR4. In conclusion, the MDR‑ER can effectively and correctly identify the best SNP‑SNP interaction models in an imbalanced data set for breast cancer cases.
Affiliation(s)
- Ou-Yang Fu
- Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan, R.O.C
- Hsueh-Wei Chang
- Cancer Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan, R.O.C
- Yu-Da Lin
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80778, Taiwan, R.O.C
- Li-Yeh Chuang
- Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I‑Shou University, Kaohsiung 84001, Taiwan, R.O.C
- Ming-Feng Hou
- Cancer Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan, R.O.C
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80778, Taiwan, R.O.C
11
Yang CH, Lin YD, Yen CY, Chuang LY, Chang HW. A systematic gene-gene and gene-environment interaction analysis of DNA repair genes XRCC1, XRCC2, XRCC3, XRCC4, and oral cancer risk. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2016; 19:238-47. [PMID: 25831063 DOI: 10.1089/omi.2014.0121] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Oral cancer is the sixth most common cancer worldwide with a high mortality rate. Biomarkers that anticipate susceptibility, prognosis, or response to treatments are much needed. Oral cancer is a polygenic disease involving complex interactions among genetic and environmental factors, which require multifaceted analyses. Here, we examined in a dataset of 103 oral cancer cases and 98 controls from Taiwan the association between oral cancer risk and the DNA repair genes X-ray repair cross-complementing group (XRCCs) 1-4, and the environmental factors of smoking, alcohol drinking, and betel quid (BQ) chewing. We employed logistic regression, multifactor dimensionality reduction (MDR), and hierarchical interaction graphs for analyzing gene-gene (G×G) and gene-environment (G×E) interactions. We identified a significantly elevated risk of the XRCC2 rs2040639 heterozygous variant among smokers [adjusted odds ratio (OR) 3.7, 95% confidence interval (CI)=1.1-12.1] and alcohol drinkers [adjusted OR=5.7, 95% CI=1.4-23.2]. The best two-factor based G×G interaction of oral cancer included the XRCC1 rs1799782 and XRCC2 rs2040639 [OR=3.13, 95% CI=1.66-6.13]. For the G×E interaction, the estimated OR of oral cancer for two (drinking-BQ chewing), three (XRCC1-XRCC2-BQ chewing), four (XRCC1-XRCC2-age-BQ chewing), and five factors (XRCC1-XRCC2-age-drinking-BQ chewing) were 32.9 [95% CI=14.1-76.9], 31.0 [95% CI=14.0-64.7], 49.8 [95% CI=21.0-117.7] and 82.9 [95% CI=31.0-221.5], respectively. Taken together, the genotypes of XRCC1 rs1799782 and XRCC2 rs2040639 DNA repair genes appear to be significantly associated with oral cancer. These were enhanced by exposure to certain environmental factors. The observations presented here warrant further research in larger study samples to examine their relevance for routine clinical care in oncology.
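To make the odds ratios and confidence intervals quoted above easier to interpret, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. It is a simplification: the abstract reports adjusted ORs from logistic regression, which this function does not reproduce, and the function name and the counts in the usage note are hypothetical.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval from a 2x2 table.

    All four counts must be positive; z=1.96 gives an approximate 95% interval.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, `odds_ratio_ci(20, 10, 10, 20)` uses made-up counts and yields an odds ratio of 4.0 with an interval that lies entirely above 1. The intervals in the abstract are wide for the same reason the standard error grows here: small cell counts.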
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
12
Zhang Z, Wang Z, Mai G, Luo Y, Zhao M, Zhou F. Evolutionary optimization of transcription factor binding motif detection. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2015; 827:261-74. [PMID: 25387969 DOI: 10.1007/978-94-017-9245-5_15] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/03/2023]
Abstract
In all cell types, the transcription of genes into expressed transcripts is strictly controlled by the temporally dynamic orchestration of transcription factor binding activities. Given a set of known binding sites (BSs) of a given transcription factor (TF), computational TFBS screening represents a cost-efficient and large-scale strategy to complement experimental techniques. There are two major classes of computational TFBS prediction algorithms, based on tertiary and primary structures, respectively. A tertiary structure-based algorithm tries to calculate the binding affinity between a query DNA fragment and the tertiary structure of the given TF. Due to the limited number of available TF tertiary structures, primary structure-based TFBS prediction is a necessary complementary technique for large-scale TFBS screening. This study proposes a novel evolutionary algorithm that randomly mutates the weights of different positions in the binding motif of a TF, so that the overall TFBS prediction accuracy is optimized. The comparison with the most widely used algorithm, Position Weight Matrix (PWM), suggests that our algorithm performs better than or at the same level as PWM in all the performance measurements, including sensitivity, specificity, accuracy and Matthews correlation coefficient. Our data also suggest that it is necessary to remove the widely used assumption of independence between motif positions. The supplementary material may be found at: http://www.healthinformaticslab.org/supp/ .
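The Position Weight Matrix baseline named in the abstract can be sketched as below: per-position log-odds scores are built from known binding sites, and a query fragment is scored by summing them. The optional `position_weights` argument only marks where per-position weights of the kind the abstract describes could enter; the evolutionary search itself is not shown. The function names, the pseudocount and the uniform background are assumptions, not details taken from the paper.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build log2-odds columns from equal-length binding sites (uniform background assumed)."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2((counts[base] / total) / background) for base in BASES})
    return pwm

def score(pwm, fragment, position_weights=None):
    """Sum per-position log-odds; position_weights (hypothetical) rescales each position."""
    weights = position_weights or [1.0] * len(pwm)
    return sum(w * column[base] for w, column, base in zip(weights, pwm, fragment))
```

With the toy sites `["ACGT", "ACGT", "ACGA"]`, the fragment `"ACGT"` scores higher than `"TTTT"`. Because each position contributes additively, this scoring embodies the independence assumption between motif positions that the abstract argues should be relaxed.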
Affiliation(s)
- Zhao Zhang
- School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, China
13
Farooqi AA, Yaylim I, Ozkan NE, Zaman F, Halim TA, Chang HW. Restoring TRAIL mediated signaling in ovarian cancer cells. Arch Immunol Ther Exp (Warsz) 2014; 62:459-74. [PMID: 25030086 DOI: 10.1007/s00005-014-0307-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 08/01/2013] [Accepted: 06/26/2014] [Indexed: 02/08/2023]
Abstract
Ovarian cancer has emerged as a multifaceted and genomically complex disease. Genetic/epigenetic mutations, suppression of tumor suppressors, overexpression of oncogenes, rewiring of intracellular signaling cascades and loss of apoptosis are some of the deeply studied mechanisms. In vitro and in vivo studies have highlighted different molecular mechanisms that regulate tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) mediated apoptosis in ovarian cancer. In this review, we highlight how expanded understanding and systematic characterization of ovarian cancer cells have led to the rapid development of new drugs and treatments that target negative regulators of the TRAIL mediated signaling pathway. Wide-ranging synthetic and natural agents have been shown to stimulate mRNA and protein expression of death receptors. This review is divided into sections on programmed cell death protein 4, platelet-derived growth factor signaling and miRNA control of TRAIL mediated signaling in ovarian cancer. Mapatumumab and PRO95780 have been tested for efficacy against ovarian cancer. Use of high-throughput screening assays will aid in dissecting the heterogeneity of this disease and in improving long-term survival, which might be achieved by translating rapidly accumulating information obtained from molecular and cellular studies into clinical research.
Affiliation(s)
- Ammad Ahmad Farooqi
- Laboratory for Translational Oncology and Personalized Medicine, RLMC, 35 km Ferozepur Road, Lahore, Pakistan