1
Hirt J, Khanteymoori A, Hohenhaus M, Kopp MA, Howells DW, Schwab JM, Watzlawick R. Inhibition of the Nogo-pathway in experimental spinal cord injury: a meta-analysis of 76 experimental treatments. Sci Rep 2023; 13:22898. [PMID: 38129508 PMCID: PMC10739940 DOI: 10.1038/s41598-023-49260-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023]
Abstract
Recovery after spinal cord injury (SCI) may be propagated by plasticity-enhancing treatments. The myelin-associated nerve outgrowth inhibitor Nogo-A (Reticulon 4, RTN4) pathway has been shown to restrict neuroaxonal plasticity in experimental SCI models. Early randomized controlled trials are underway to investigate the effect of Nogo-A/Nogo-Receptor (NgR1) pathway blockers. This systematic review and meta-analysis of therapeutic approaches blocking the Nogo-A pathway interrogated their efficacy for functional locomotor recovery after experimental SCI according to a pre-registered study protocol. A total of 51 manuscripts reporting 76 experiments in 1572 animals were identified for meta-analysis. Overall, a neurobehavioral improvement of 18.9% (95% CI 14.5-23.2) was observed. Subgroup analysis (40 experiments, N = 890) revealed SCI-modelling factors associated with outcome variability. Lack of reported randomization and smaller group sizes were associated with larger effect sizes. Delayed treatment start was associated with lower effect sizes. Trim and Fill assessment as well as Egger regression suggested the presence of publication bias. Factoring in theoretically missing studies resulted in a reduced effect size [8.8% (95% CI 2.6-14.9)]. The available data indicate that inhibition of the Nogo-A/NgR1 pathway alters functional recovery after SCI in animal studies, although substantial differences appear for the applied injury mechanisms and other study details. Mirroring other SCI interventions assessed earlier, we identify similar factors associated with outcome heterogeneity.
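To make the idea of a pooled effect size with a 95% CI concrete, here is a minimal sketch of fixed-effect inverse-variance pooling. The abstract does not state which pooling model the authors used, so this is only a generic illustration; the function name `pooled_effect` and the input numbers in the usage note are hypothetical and are not data from the study.

```python
import math


def pooled_effect(effects, std_errors):
    """Fixed-effect inverse-variance pooling (illustrative assumption,
    not necessarily the model used in the meta-analysis above).

    effects     -- per-experiment effect sizes (e.g. % improvement)
    std_errors  -- standard error of each effect size
    Returns (pooled estimate, (lower, upper) bounds of the 95% CI).
    """
    # Each experiment is weighted by the inverse of its variance.
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    # Standard error of the pooled estimate.
    pooled_se = math.sqrt(1.0 / total_weight)
    half_width = 1.96 * pooled_se
    return estimate, (estimate - half_width, estimate + half_width)
```

For example, `pooled_effect([10.0, 20.0], [1.0, 1.0])` weights both hypothetical experiments equally and returns a pooled estimate of 15.0. Smaller standard errors pull the estimate toward the more precise experiments.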
Affiliation(s)
- Julian Hirt
  - Department of Neurology and Experimental Neurology, Charité Campus Mitte, Clinical and Experimental Spinal Cord Injury Research Laboratory (Neuroparaplegiology), Charité - Universitätsmedizin Berlin, Berlin, Germany
- Alireza Khanteymoori
  - Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Breisacher Straße 64, 79106, Freiburg, Germany
- Marc Hohenhaus
  - Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Breisacher Straße 64, 79106, Freiburg, Germany
- Marcel A Kopp
  - Department of Neurology and Experimental Neurology, Charité Campus Mitte, Clinical and Experimental Spinal Cord Injury Research Laboratory (Neuroparaplegiology), Charité - Universitätsmedizin Berlin, Berlin, Germany
- David W Howells
  - Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, VIC, Australia
- Jan M Schwab
  - Department of Neurology and Experimental Neurology, Charité Campus Mitte, Clinical and Experimental Spinal Cord Injury Research Laboratory (Neuroparaplegiology), Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Department of Neurology, Spinal Cord Injury Division (Paraplegiology), The Neurological Institute, The Ohio State University, Wexner Medical Center, Columbus, OH, USA
  - Belford Center for Spinal Cord Injury, Departments of Neuroscience and Physical Medicine and Rehabilitation, The Neurological Institute, The Ohio State University, Wexner Medical Center, Columbus, OH, USA
- Ralf Watzlawick
  - Department of Neurology and Experimental Neurology, Charité Campus Mitte, Clinical and Experimental Spinal Cord Injury Research Laboratory (Neuroparaplegiology), Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Breisacher Straße 64, 79106, Freiburg, Germany
2
Alipanahi R, Safari L, Khanteymoori A. CRISPR genome editing using computational approaches: A survey. Front Bioinform 2023; 2:1001131. [PMID: 36710911 PMCID: PMC9875887 DOI: 10.3389/fbinf.2022.1001131] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/22/2022] [Accepted: 12/19/2022] [Indexed: 01/13/2023]
Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR)-based gene editing has been widely used in various cell types and organisms. To make genome editing with CRISPR far more precise and practical, we must concentrate on the design of optimal gRNA and the selection of appropriate Cas enzymes. Numerous computational tools have been created in recent years to help researchers design the best gRNA for CRISPR research. There are two approaches for designing an appropriate gRNA sequence (one that targets the desired sites with high precision): experimental and prediction-based approaches. It is essential to reduce off-target sites when designing an optimal gRNA. Here we review both traditional and machine learning-based approaches for designing an appropriate gRNA sequence and predicting off-target sites. In this review, we summarize the key characteristics of all available tools (as far as possible) and compare them. Machine learning-based tools and web servers are expected to become the most effective and reliable methods for predicting on-target and off-target activities of CRISPR in the future. However, these predictions are not yet very precise, and the performance of these algorithms, especially deep learning ones, depends on the amount of data used during the training phase. So, as more features are discovered and incorporated into these models, predictions become more in line with experimental observations. We must concentrate on the creation of ideal gRNA and the choice of suitable Cas enzymes in order to make genome editing with CRISPR far more accurate and feasible.
Affiliation(s)
- Leila Safari
  - Department of Computer Engineering, University of Zanjan, Zanjan, Iran. Correspondence: Leila Safari
3
Abbaszadeh O, Azarpeyvand A, Khanteymoori A, Bahari A. Data-Driven and Knowledge-Based Algorithms for Gene Network Reconstruction on High-Dimensional Data. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1545-1557. [PMID: 33119511 DOI: 10.1109/tcbb.2020.3034861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Previous efforts in gene network reconstruction have mainly focused on data-driven modeling, with little attention paid to knowledge-based approaches. Leveraging prior knowledge, however, is a promising paradigm that has been gaining momentum in network reconstruction and computational biology research communities. This paper proposes two new algorithms for reconstructing a gene network from expression profiles with and without prior knowledge in small sample and high-dimensional settings. First, using tools from statistical estimation theory, particularly the empirical Bayesian approach, the current research estimates a covariance matrix via the shrinkage method. Second, the estimated covariance matrix is employed in the penalized normal likelihood method to select the Gaussian graphical model. This formulation allows the application of prior knowledge in the covariance estimation, as well as in the Gaussian graphical model selection. Experimental results on simulated and real datasets show that, compared to state-of-the-art methods, the proposed algorithms achieve better results in terms of both PR and ROC curves. Finally, the present work applies its method to the RNA-seq data of human gastric atrophy patients, which was obtained from the EMBL-EBI database. The source codes and relevant data can be downloaded from: https://github.com/AbbaszadehO/DKGN.
4
Khojasteh H, Khanteymoori A, Olyaee MH. Comparing protein-protein interaction networks of SARS-CoV-2 and (H1N1) influenza using topological features. Sci Rep 2022; 12:5867. [PMID: 35393450 PMCID: PMC8988119 DOI: 10.1038/s41598-022-08574-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/10/2021] [Accepted: 03/03/2022] [Indexed: 01/04/2023]
Abstract
The SARS-CoV-2 pandemic first emerged in late 2019 in China. It has since infected more than 298 million individuals and caused over 5 million deaths globally. The identification of essential proteins in a protein–protein interaction network (PPIN) is not only crucial in understanding the process of cellular life but also useful in drug discovery. There are many centrality measures to detect influential nodes in complex networks. Since the SARS-CoV-2 and (H1N1) influenza PPINs have 553 human proteins in common, analyzing influential proteins and comparing these networks can be an effective step in helping biologists with drug-target prediction. We used 21 centrality measures on SARS-CoV-2 and (H1N1) influenza PPINs to identify essential proteins. We applied principal component analysis and unsupervised machine learning methods to reveal the most informative measures. Notably, some measures had a high level of contribution in comparison to others in both PPINs, namely Decay, Residual closeness, Markov, Degree, Closeness (Latora), Barycenter, Closeness (Freeman), and Lin centralities. We also investigated some graph theory-based properties like the power law, exponential distribution, and robustness. Both PPINs tended toward the properties of scale-free networks, which exposes their heterogeneous nature. Dimensionality reduction and unsupervised learning methods were effective in uncovering appropriate centrality measures.
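As a rough illustration of what two of the named measures compute, the sketch below implements Degree centrality and Closeness (Freeman) centrality on a small undirected graph stored as an adjacency dictionary. The graph in the usage note and the protein labels `P1` to `P4` are hypothetical; the paper's actual networks, tooling, and the other 19 measures are not reproduced here.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Freeman closeness: reachable nodes divided by the sum of
    shortest-path distances to them (unweighted graph, BFS distances)."""
    result = {}
    for source in graph:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
        total = sum(distance.values())
        result[source] = (len(distance) - 1) / total if total else 0.0
    return result


# Hypothetical toy network: P2 interacts with every other protein.
toy_ppin = {
    "P1": {"P2"},
    "P2": {"P1", "P3", "P4"},
    "P3": {"P2"},
    "P4": {"P2"},
}
```

On `toy_ppin`, both measures rank the hub `P2` highest (1.0), while a leaf such as `P1` gets a degree centrality of 1/3 and a closeness of 3/5. Stacking many such per-node scores into a matrix is the kind of input a principal component analysis would then operate on.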
Affiliation(s)
- Hakimeh Khojasteh
  - Department of Computer Engineering, University of Zanjan, Zanjan, Iran
- Mohammad Hossein Olyaee
  - Department of Computer Engineering, Engineering Faculty, University of Gonabad, Zanjan, Gonabad, Iran
5
Ferdowsi A, Khanteymoori A. Discovering Communities in Networks: A Linear Programming Approach Using Max-Min Modularity. Annals of Computer Science and Information Systems 2021. [DOI: 10.15439/2021f65] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
6
Lamprecht AL, Palmblad M, Ison J, Schwämmle V, Al Manir MS, Altintas I, Baker CJO, Ben Hadj Amor A, Capella-Gutierrez S, Charonyktakis P, Crusoe MR, Gil Y, Goble C, Griffin TJ, Groth P, Ienasescu H, Jagtap P, Kalaš M, Kasalica V, Khanteymoori A, Kuhn T, Mei H, Ménager H, Möller S, Richardson RA, Robert V, Soiland-Reyes S, Stevens R, Szaniszlo S, Verberne S, Verhoeven A, Wolstencroft K. Perspectives on automated composition of workflows in the life sciences. F1000Res 2021; 10:897. [PMID: 34804501 PMCID: PMC8573700 DOI: 10.12688/f1000research.54159.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 08/27/2021] [Indexed: 12/29/2022]
Abstract
Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the "big picture" of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation. 
Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future.
Affiliation(s)
- Magnus Palmblad
  - Leiden University Medical Center, 2333 ZA, Leiden, The Netherlands
- Jon Ison
  - French Institute of Bioinformatics, 91057 Évry, France
- Ilkay Altintas
  - University of California San Diego, La Jolla, CA, 92093, USA
- Christopher J. O. Baker
  - University of New Brunswick, Saint John, E2L 4L5, Canada
  - IPSNP Computing Inc., Saint John, E2L 4S6, Canada
- Yolanda Gil
  - University of Southern California, Marina Del Rey, CA, 90292, USA
- Carole Goble
  - Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, 55455, USA
- Paul Groth
  - University of Amsterdam, 1090 GH Amsterdam, The Netherlands
- Hans Ienasescu
  - Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Pratik Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, 55455, USA
- Tobias Kuhn
  - VU Amsterdam, 1081 HV Amsterdam, The Netherlands
- Hailiang Mei
  - Sequencing Analysis Support Core, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
- Steffen Möller
  - IBIMA, Rostock University Medical Center, 18057 Rostock, Germany
- Stian Soiland-Reyes
  - Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
  - Informatics Institute, University of Amsterdam, 1090 GH Amsterdam, The Netherlands
- Robert Stevens
  - Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
- Suzan Verberne
  - Leiden Institute of Advanced Computer Science, Leiden University, 2333 BE Leiden, The Netherlands
- Aswin Verhoeven
  - Leiden University Medical Center, 2333 ZA, Leiden, The Netherlands
- Katherine Wolstencroft
  - Leiden Institute of Advanced Computer Science, Leiden University, 2333 BE Leiden, The Netherlands
7
Gu Q, Kumar A, Bray S, Creason A, Khanteymoori A, Jalili V, Grüning B, Goecks J. Galaxy-ML: An accessible, reproducible, and scalable machine learning toolkit for biomedicine. PLoS Comput Biol 2021; 17:e1009014. [PMID: 34061826 PMCID: PMC8213174 DOI: 10.1371/journal.pcbi.1009014] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/17/2021] [Revised: 06/18/2021] [Accepted: 04/27/2021] [Indexed: 11/25/2022]
Abstract
Supervised machine learning is an essential but difficult-to-use approach in biomedical data analysis. The Galaxy-ML toolkit (https://galaxyproject.org/community/machine-learning/) makes supervised machine learning more accessible to biomedical scientists by enabling them to perform end-to-end reproducible machine learning analyses at large scale using only a web browser. Galaxy-ML extends Galaxy (https://galaxyproject.org), a biomedical computational workbench used by tens of thousands of scientists across the world, with a suite of tools for all aspects of supervised machine learning.
Affiliation(s)
- Qiang Gu
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, Oregon, United States of America
  - The Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon, United States of America
- Anup Kumar
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Simon Bray
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Allison Creason
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, Oregon, United States of America
  - The Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon, United States of America
- Alireza Khanteymoori
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Vahid Jalili
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, Oregon, United States of America
  - The Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon, United States of America
- Björn Grüning
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Jeremy Goecks
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, Oregon, United States of America
  - The Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon, United States of America
8
Pirgazi J, Olyaee MH, Khanteymoori A. KFGRNI: A robust method to inference gene regulatory network from time-course gene data based on ensemble Kalman filter. J Bioinform Comput Biol 2021; 19:2150002. [PMID: 33657986 DOI: 10.1142/s0219720021500025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
A central problem of systems biology is the reconstruction of Gene Regulatory Networks (GRNs) from time series data. Although many attempts have been made to design an efficient method for GRN inference, providing the best solution is still a challenging task. Noise, a low number of samples, and a high number of nodes are the main reasons for the poor performance of existing methods. The present study applies the ensemble Kalman filter algorithm to model a GRN from gene time series data. The inference of a GRN with p genes is decomposed into p subproblems. In each subproblem, the ensemble Kalman filter algorithm identifies the weight of interactions for each target gene. With the use of the ensemble Kalman filter, the expression pattern of the target gene is predicted from the expression patterns of all the remaining genes. The proposed method is compared with several well-known approaches. The results of the evaluation indicate that the proposed method improves inference accuracy and demonstrates better regulatory relations with noisy data.
Affiliation(s)
- Jamshid Pirgazi
  - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
- Mohammad Hossein Olyaee
  - Department of Computer Engineering, Engineering Faculty, University of Gonabad, Gonabad, Iran
- Alireza Khanteymoori
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Germany
  - Department of Computer Engineering, Engineering Faculty, University of Zanjan, Zanjan Province, Iran
9
Olyaee MH, Khanteymoori A, Fazli E. A fuzzy c-means clustering approach for haplotype reconstruction based on minimum error correction. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
10
Zamani F, Olyaee MH, Khanteymoori A. NCMHap: a novel method for haplotype reconstruction based on Neutrosophic c-means clustering. BMC Bioinformatics 2020; 21:475. [PMID: 33092523 PMCID: PMC7579908 DOI: 10.1186/s12859-020-03775-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2020] [Accepted: 09/22/2020] [Indexed: 11/24/2022]
Abstract
BACKGROUND The single individual haplotype problem refers to reconstructing the haplotypes of an individual based on several input fragments sequenced from a specified chromosome. Solving this problem is an important task in computational biology and has many applications in the pharmaceutical industry, clinical decision-making, and genetic diseases. It is known that solving the problem is NP-hard. Although several methods have been proposed to solve the problem, most of them perform poorly when dealing with noisy input fragments. Therefore, proposing a method that is accurate and scalable is a challenging task. RESULTS In this paper, we introduced a method, named NCMHap, which utilizes the Neutrosophic c-means (NCM) clustering algorithm. The NCM algorithm can effectively detect the noise and outliers in the input data. In addition, it can reduce their effects in the clustering process. The proposed method has been evaluated on several benchmark datasets. Comparison with existing methods indicates that, when NCM is tuned with suitable parameters, the results are encouraging. In particular, when the amount of noise increases, it outperforms the compared methods. CONCLUSION The proposed method is validated using simulated and real datasets. The achieved results recommend the application of NCMHap to datasets whose fragments contain a large number of gaps and a large amount of noise.
Affiliation(s)
- Fatemeh Zamani
  - Department of Computer Engineering, University of Zanjan, Zanjan, Iran
- Mohammad Hossein Olyaee
  - Department of Computer Engineering, Faculty of Engineering, University of Gonabad, Gonabad, Iran
11
Olyaee MH, Pirgazi J, Khalifeh K, Khanteymoori A. RCOVID19: Recurrence-based SARS-CoV-2 features using chaos game representation. Data Brief 2020; 32:106144. [PMID: 32835040 PMCID: PMC7411429 DOI: 10.1016/j.dib.2020.106144] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/30/2020] [Revised: 07/28/2020] [Accepted: 08/04/2020] [Indexed: 11/28/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the COVID-19 pandemic. It was first detected in China and spread rapidly to other countries. Several thousand whole genome sequences of SARS-CoV-2 have been reported, and it is important to compare them and identify distinctive evolutionary/mutant markers. Utilizing chaos game representation (CGR) as well as recurrence quantification analysis (RQA) as a powerful nonlinear analysis technique, we proposed an effective process to extract several valuable features from genomic sequences of SARS-CoV-2. The represented features enable us to compare genomic sequences with different lengths. The provided dataset comprises a total of 18 RQA-based features for 4496 instances of SARS-CoV-2.
Affiliation(s)
- Mohammad Hossein Olyaee
  - Faculty of Engineering, Department of Computer Engineering, University of Gonabad, Gonabad, Iran
- Jamshid Pirgazi
  - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
- Khosrow Khalifeh
  - Department of Biology, Faculty of Sciences, University of Zanjan, Zanjan, Iran
12
Olyaee MH, Khanteymoori A, Khalifeh K. Application of Chaotic Laws to Improve Haplotype Assembly Using Chaos Game Representation. Sci Rep 2019; 9:10361. [PMID: 31316124 PMCID: PMC6637069 DOI: 10.1038/s41598-019-46844-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/04/2018] [Accepted: 07/01/2019] [Indexed: 02/06/2023]
Abstract
Sequence data are deposited in the form of unphased genotypes, and it is not possible to directly identify the location of a particular allele on a specific parental chromosome or haplotype. This study employed nonlinear time series modeling approaches to analyze the haplotype sequences obtained by next-generation sequencing (NGS). To evaluate the chaotic behavior of haplotypes, we analyzed their whole sequences, as well as several subsequences from distinct haplotypes, in terms of the SNP distribution on their chromosomes. This analysis utilized chaos game representation (CGR) followed by the application of two different scaling methods. It was found that chaotic behavior clearly exists in most haplotype subsequences. To test the applicability of the proposed model, the present research determined the alleles in gap positions and positions with low coverage by using chromosome subsequences in which 10% of each subsequence's alleles are replaced by gaps. After conversion of the subsequences' CGR into the coordinate series, a Local Projection (LP) method predicted the measure of ambiguous positions in the coordinate series. It was discovered that the average reconstruction rate for all input data is more than 97%, demonstrating that applying this knowledge can effectively improve the reconstruction rate of given haplotypes.
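The conversion of a sequence into a CGR coordinate series can be sketched as follows. This is the standard nucleotide CGR construction (each point lies halfway between the previous point and the corner assigned to the current symbol); the corner assignment, the starting point, and the function name `cgr_points` are assumptions for illustration, and the paper's own encoding of haplotype alleles and its scaling methods may differ.

```python
# Corner assignment on the unit square: one common convention, assumed here.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return the CGR coordinate series of a nucleotide sequence.

    Symbols without a corner (gaps, ambiguity codes) are skipped,
    which is an illustrative choice rather than the paper's rule.
    """
    x, y = start
    points = []
    for symbol in sequence.upper():
        if symbol not in CORNERS:
            continue
        corner_x, corner_y = CORNERS[symbol]
        # Move halfway from the current point toward the symbol's corner.
        x, y = (x + corner_x) / 2.0, (y + corner_y) / 2.0
        points.append((x, y))
    return points
```

Calling `cgr_points("ACGT")` yields four points, the first being `(0.25, 0.25)`. Reading the x or y values in order gives the kind of coordinate series that a time series method can then be applied to.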
Affiliation(s)
- Khosrow Khalifeh
  - Department of Biology, Faculty of Sciences, University of Zanjan, Zanjan, Iran
13
Olyaee MH, Khanteymoori A. AROHap: An effective algorithm for single individual haplotype reconstruction based on asexual reproduction optimization. Comput Biol Chem 2017; 72:1-10. [PMID: 29289750 DOI: 10.1016/j.compbiolchem.2017.12.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/15/2016] [Revised: 11/22/2017] [Accepted: 12/10/2017] [Indexed: 10/18/2022]
Abstract
In this paper, a method for single individual haplotype (SIH) reconstruction using asexual reproduction optimization (ARO) is proposed. Haplotypes, as a set of genetic variations in each chromosome, contain vital information such as the relationship between the human genome and diseases. Finding haplotypes in diploid organisms is a challenging task. Experimental methods are expensive and require special equipment. In the SIH problem, we encounter several fragments, each of which covers some part of the desired haplotype. The main goal is to bi-partition the fragments with minimum error correction (MEC). This problem is known to be NP-hard, and several attempts have been made to solve it using heuristic methods. The current method, AROHap, has two main phases. In the first phase, most of the fragments are clustered based on a practical distance metric. In the second phase, the ARO algorithm, a fast-converging bio-inspired method, is used to improve the initial bi-partitioning of the fragments from the previous step. AROHap is applied to several benchmark datasets. The experimental results demonstrate that satisfactory results were obtained, indicating that AROHap can be used for the SIH reconstruction problem.
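The MEC objective mentioned above can be illustrated with a short sketch: given a candidate haplotype, each fragment is assigned to whichever of the two haplotypes it disagrees with least, and the disagreements are summed. The string encoding (`0`/`1` alleles, `-` for uncovered positions), the assumption that the second haplotype is the complement of the first, and the function names are illustrative choices; this scores a candidate solution and is not the ARO search itself.

```python
def mismatches(fragment, haplotype):
    """Count covered positions where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def complement(haplotype):
    """Second haplotype, assuming every site is heterozygous (illustrative)."""
    return "".join("1" if allele == "0" else "0" for allele in haplotype)


def mec_score(fragments, haplotype):
    """Minimum error correction score of a candidate haplotype pair.

    Each fragment contributes the smaller of its mismatch counts against
    the haplotype and its complement, i.e. the corrections needed after
    placing the fragment in its better-fitting partition.
    """
    other = complement(haplotype)
    return sum(
        min(mismatches(fragment, haplotype), mismatches(fragment, other))
        for fragment in fragments
    )
```

For the hypothetical fragments `["01-0", "10-1", "011-", "-001"]` and candidate haplotype `"0100"`, the first two fragments fit one of the two haplotypes exactly and the last two each need one correction, so `mec_score` returns 2. A heuristic search would try to find the haplotype that minimizes this score.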
Affiliation(s)
- Mohammad-H Olyaee
  - Department of Computer Engineering, University of Zanjan, Zanjan, Iran