1
Lu P, Jiang J. AE-RW: Predicting miRNA-disease associations by using autoencoder and random walk on miRNA-gene-disease heterogeneous network. Comput Biol Chem 2024; 110:108085. [PMID: 38754260 DOI: 10.1016/j.compbiolchem.2024.108085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/04/2024] [Accepted: 04/23/2024] [Indexed: 05/18/2024]
Abstract
Since scientific investigations have demonstrated that aberrant expression of miRNAs contributes to the incidence of numerous complex diseases, precise determination of miRNA-disease relationships greatly advances medical research. To address the inefficiency of conventional experimental approaches, numerous computational methods have been proposed to predict miRNA-disease associations with enhanced accuracy. However, constructing a miRNA-gene-disease heterogeneous network by incorporating gene information has been relatively under-explored in existing computational techniques. Accordingly, this paper puts forward a technique, AE-RW, that predicts miRNA-disease associations by applying an autoencoder and a random walk on a miRNA-gene-disease heterogeneous network. First, we integrate association information and similarities between miRNAs, genes, and diseases to construct a miRNA-gene-disease heterogeneous network. Subsequently, we combine two network feature representations extracted independently via an autoencoder and a random walk procedure. Finally, a deep neural network (DNN) is used to conduct association prediction. The experimental results demonstrate that the AE-RW model achieved an AUC of 0.9478 in 5-fold cross-validation (CV) on the HMDD v3.2 dataset, outperforming five state-of-the-art existing models. Additionally, case studies on breast and lung cancer further validated the superior predictive capabilities of our model.
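The abstract does not say which random walk variant AE-RW uses. As a rough illustration only, the sketch below assumes a random walk with restart on a small homogeneous graph; the function name `random_walk_with_restart`, the `restart` value and the toy adjacency matrix are all hypothetical and are not taken from the paper.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting scores for a walk restarting at `seed`.

    `adj` is a square adjacency matrix given as a list of lists.
    This is an assumed variant, not necessarily the one used in AE-RW.
    """
    n = len(adj)
    # Column-normalise so that each column holds transition probabilities.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    scores = start[:]
    for _ in range(max_iter):
        # One step: follow an edge with prob. 1 - restart, else jump to seed.
        updated = [
            (1 - restart) * sum(trans[i][j] * scores[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Toy path graph 0 - 1 - 2 (hypothetical data).
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = random_walk_with_restart(path, seed=0)
```

In this toy case the scores decay with distance from the seed node, which is the property that lets such walks serve as network features.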
Affiliation(s)
- Pengli Lu
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
- Jicheng Jiang
  - School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
2
Maruyama K, Miyazaki S, Kobayashi R, Hikita H, Tsubone T, Ohnuma K. The migration pattern of cells during the mesoderm and endoderm differentiation from human pluripotent stem cells. In Vitro Cell Dev Biol Anim 2024:10.1007/s11626-024-00904-4. [PMID: 38656570 DOI: 10.1007/s11626-024-00904-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Accepted: 03/16/2024] [Indexed: 04/26/2024]
Abstract
Gastrulation is the first major differentiation process in animal embryos. However, the dynamics of human gastrulation remain mostly unknown owing to ethical limitations. We studied the dynamics of mesoderm and endoderm cell differentiation from human pluripotent stem cells to gain insight into the cellular dynamics of human gastrulation. Human pluripotent stem cells have properties similar to those of the epiblast, which gives rise to the three germ layers. The mesoderm and endoderm were induced with more than 75% purity from human induced pluripotent stem cells. Single-cell dynamics of pluripotent stem cell-derived mesoderm and endoderm cells were traced using time-lapse imaging. Both mesoderm and endoderm cells migrate randomly, accompanied by short-term directional persistence. No substantial differences were detected between mesoderm and endoderm migration. Computer simulations created using the measured parameters revealed that random movement combined with an external force, such as the spreading of cells from the primitive streak area, mimicked the homogeneous discoidal germ layer formation. These results were consistent with the development of amniotes, which suggests that human pluripotent stem cells are a good model for studying human embryogenesis.
Affiliation(s)
- Kenshiro Maruyama
  - Department of Science of Technology Innovation, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata, 940-2188, Japan
- Shota Miyazaki
  - Department of Bioengineering, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata, 940-2188, Japan
- Ryo Kobayashi
  - Department of Bioengineering, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata, 940-2188, Japan
- Haru Hikita
  - Department of Electrical, Electronics and Information Engineering, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata, 940-2188, Japan
- Tadashi Tsubone
  - Department of Electrical, Electronics and Information Engineering, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata, 940-2188, Japan
- Kiyoshi Ohnuma
  - Department of Science of Technology Innovation, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata, 940-2188, Japan.
  - Department of Bioengineering, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata, 940-2188, Japan.
3
Li R, Guan J, Wang Z, Zhou S. A new and effective two-step clustering approach for single cell RNA sequencing data. BMC Genomics 2023; 23:864. [PMID: 37946133 PMCID: PMC10636845 DOI: 10.1186/s12864-023-09577-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2021] [Accepted: 08/10/2023] [Indexed: 11/12/2023] Open
Abstract
BACKGROUND The rapid development of single cell RNA sequencing (scRNA-seq) technology has produced huge amounts of scRNA-seq data, which greatly advance research in many biomedical fields involving tissue heterogeneity, pathogenesis of disease, drug resistance, etc. One major task in scRNA-seq data analysis is to cluster cells in terms of their expression characteristics. Up to now, a number of methods have been proposed to infer cell clusters, yet there is still much room to improve their performance. RESULTS In this paper, we develop a new two-step clustering approach to effectively cluster scRNA-seq data, called TSC (Two-Step Clustering). Specifically, we divide all cells into two types: core cells (those possibly lying around the centers of clusters) and non-core cells (those located in the boundary areas of clusters). We first cluster the core cells by hierarchical clustering (the first step) and then assign the non-core cells to the corresponding nearest clusters (the second step). Extensive experiments on 12 real scRNA-seq datasets show that TSC outperforms the state-of-the-art methods. CONCLUSION TSC is an effective clustering method due to its two-step clustering strategy, and it is a useful tool for scRNA-seq data analysis.
Affiliation(s)
- Ruiyi Li
  - Translational Medical Center for Stem Cell Therapy, Shanghai East Hospital, and School of Medicine, Tongji University, 1239 Siping Road, 200092, Shanghai, China
  - Department of Computer Science and Technology, Tongji University, 4800 Caoan Road, 201804, Shanghai, China
- Jihong Guan
  - Department of Computer Science and Technology, Tongji University, 4800 Caoan Road, 201804, Shanghai, China.
- Zhiye Wang
  - Department of Computer Science and Technology, Tongji University, 4800 Caoan Road, 201804, Shanghai, China
- Shuigeng Zhou
  - Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 2005 Songhu Road, 200438, Shanghai, China.
4
Salcedo MV, Gravel N, Keshavarzi A, Huang LC, Kochut KJ, Kannan N. Predicting protein and pathway associations for understudied dark kinases using pattern-constrained knowledge graph embedding. PeerJ 2023; 11:e15815. [PMID: 37868056 PMCID: PMC10590106 DOI: 10.7717/peerj.15815] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/24/2023] [Accepted: 07/10/2023] [Indexed: 10/24/2023] Open
Abstract
The 534 protein kinases encoded in the human genome constitute a large druggable class of proteins that include both well-studied and understudied "dark" members. Accurate prediction of dark kinase functions is a major bioinformatics challenge. Here, we employ a graph mining approach that uses the evolutionary and functional context encoded in knowledge graphs (KGs) to predict protein and pathway associations for understudied kinases. We propose a new scalable graph embedding approach, RegPattern2Vec, which employs regular pattern constrained random walks to sample diverse aspects of node context within a KG flexibly. RegPattern2Vec learns functional representations of kinases, interacting partners, post-translational modifications, pathways, cellular localization, and chemical interactions from a kinase-centric KG that integrates and conceptualizes data from curated heterogeneous data resources. By contextualizing information relevant to prediction, RegPattern2Vec improves accuracy and efficiency in comparison to other random walk-based graph embedding approaches. We show that the predictions produced by our model overlap with pathway enrichment data produced using experimentally validated Protein-Protein Interaction (PPI) data from both publicly available databases and experimental datasets not used in training. Our model also has the advantage of using the collected random walks as biological context to interpret the predicted protein-pathway associations. We provide high-confidence pathway predictions for 34 dark kinases and present three case studies in which analysis of meta-paths associated with the prediction enables biological interpretation. Overall, RegPattern2Vec efficiently samples multiple node types for link prediction on biological knowledge graphs and the predicted associations between understudied kinases, pseudokinases, and known pathways serve as a conceptual starting point for hypothesis generation and testing.
Affiliation(s)
- Mariah V. Salcedo
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States of America
- Nathan Gravel
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States of America
- Abbas Keshavarzi
  - School of Computing, University of Georgia, Athens, GA, United States of America
- Liang-Chin Huang
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States of America
- Krzysztof J. Kochut
  - School of Computing, University of Georgia, Athens, GA, United States of America
- Natarajan Kannan
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States of America
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States of America
5
Patwary MSA, Das KP. Forecasting stock indices with the COVID-19 infection rate as an exogenous variable. PeerJ Comput Sci 2023; 9:e1532. [PMID: 37705632 PMCID: PMC10495988 DOI: 10.7717/peerj-cs.1532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 07/20/2023] [Indexed: 09/15/2023]
Abstract
Forecasting stock market indices is challenging because stock prices are usually nonlinear and non-stationary. COVID-19 has had a significant impact on stock market volatility, which makes forecasting more challenging. Since the number of confirmed cases significantly impacted the stock price index, it has been considered as a covariate in this analysis. The primary focus of this study is to address the challenge of forecasting volatile stock indices during COVID-19 by employing time series analysis. In particular, the goal is to find the best method to predict future stock price indices in relation to the COVID-19 infection rate. In this study, the effect of covariates has been analyzed for three stock indices: the S&P 500, the Morgan Stanley Capital International (MSCI) world stock index, and the Chicago Board Options Exchange (CBOE) Volatility Index (VIX). Results show that parametric approaches can be good forecasting models for the S&P 500 index and the VIX index. On the other hand, a random walk model can be adopted to forecast the MSCI index. Moreover, among the three random walk forecasting methods for the MSCI index, the naïve method provides the best forecasting model.
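The abstract does not list the three random walk forecasting methods that were compared. As a hedged illustration, the sketch below shows two standard textbook variants, the naïve forecast (repeat the last observation) and the random walk with drift (extrapolate the average historical change); the function names and the sample series are hypothetical.

```python
def naive_forecast(series, horizon):
    """Naive random walk forecast: every future value equals the last one."""
    return [series[-1]] * horizon


def drift_forecast(series, horizon):
    """Random walk with drift: extend the line through first and last points."""
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + h * slope for h in range(1, horizon + 1)]


# Hypothetical index values, not data from the study.
index_values = [1.0, 2.0, 4.0]
flat = naive_forecast(index_values, 2)     # [4.0, 4.0]
trended = drift_forecast(index_values, 2)  # [5.5, 7.0]
```

The naïve forecast is the drift forecast with the slope set to zero, so comparing the two on held-out data indicates whether the historical trend carries any predictive value.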
Affiliation(s)
- Kumer Pial Das
  - Research, Innovation, and Economic Development, University of Louisiana at Lafayette, Lafayette, LA, United States of America
6
Li X, Yuan H, Wu X, Wang C, Wu M, Shi H, Lv Y. MultiDS-MDA: Integrating multiple data sources into heterogeneous network for predicting novel metabolite-drug associations. Comput Biol Med 2023; 162:107067. [PMID: 37276756 DOI: 10.1016/j.compbiomed.2023.107067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/07/2023]
Abstract
Metabolic processes in the human body play an important role in maintaining normal life activities, and the abnormal concentration of metabolites is closely related to the occurrence and development of diseases. The use of drugs is considered to have a major impact on metabolism, and drug metabolites can contribute to efficacy, drug toxicity and drug-drug interaction. However, our understanding of metabolite-drug associations is far from complete, and individual data sources tend to be incomplete and noisy. Therefore, the integration of various types of data sources for inferring reliable metabolite-drug associations is urgently needed. In this study, we proposed a computational framework, MultiDS-MDA, for identifying metabolite-drug associations by integrating multiple data sources, including chemical structure information of metabolites and drugs; metabolite-gene, metabolite-disease, drug-gene and drug-disease relationships; gene ontology (GO) and disease ontology (DO) data; and known metabolite-drug connections. The performance of MultiDS-MDA was evaluated by 5-fold cross-validation, which achieved an area under the ROC curve (AUROC) of 0.911 and an area under the precision-recall curve (AUPRC) of 0.907. Additionally, MultiDS-MDA showed outstanding performance compared with similar approaches. Case studies for three metabolites (cholesterol, thromboxane B2 and coenzyme Q10) and three drugs (simvastatin, pravastatin and morphine) also demonstrated the reliability and efficiency of MultiDS-MDA, and it is anticipated that MultiDS-MDA will serve as a powerful tool for future exploration of metabolite-drug interactions and contribute to drug development and drug combination.
Affiliation(s)
- Xiuhong Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, China
- Hao Yuan
  - College of Bioinformatics Science and Technology, Harbin Medical University, China
- Xiaoliang Wu
  - College of Bioinformatics Science and Technology, Harbin Medical University, China
- Chengyi Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, China
- Meitao Wu
  - College of Bioinformatics Science and Technology, Harbin Medical University, China
- Hongbo Shi
  - College of Bioinformatics Science and Technology, Harbin Medical University, China.
- Yingli Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, China.
7
Zhang GZ, Gao YL. BRWMC: Predicting lncRNA-disease associations based on bi-random walk and matrix completion on disease and lncRNA networks. Comput Biol Chem 2023; 103:107833. [PMID: 36812824 DOI: 10.1016/j.compbiolchem.2023.107833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2022] [Revised: 12/29/2022] [Accepted: 02/15/2023] [Indexed: 02/19/2023]
Abstract
Many experiments have proved that long non-coding RNAs (lncRNAs) in humans have been implicated in disease development. The prediction of lncRNA-disease association is essential in promoting disease treatment and drug development. It is time-consuming and laborious to explore the relationship between lncRNA and diseases in the laboratory. The computation-based approach has clear advantages and has become a promising research direction. This paper proposes a new lncRNA disease association prediction algorithm BRWMC. Firstly, BRWMC constructed several lncRNA (disease) similarity networks based on different measurement angles and fused them into an integrated similarity network by similarity network fusion (SNF). In addition, the random walk method is used to preprocess the known lncRNA-disease association matrix and calculate the estimated scores of potential lncRNA-disease associations. Finally, the matrix completion method accurately predicts the potential lncRNA-disease associations. Under the framework of leave-one-out cross-validation and 5-fold cross-validation, the AUC values obtained by BRWMC are 0.9610 and 0.9739, respectively. In addition, case studies of three common diseases show that BRWMC is a reliable method for prediction.
Affiliation(s)
- Guo-Zheng Zhang
  - School of Computer Science, Qufu Normal University, Rizhao, China
- Ying-Lian Gao
  - Qufu Normal University Library, Qufu Normal University, Rizhao, China.
8
Han S, Hong J, Yun SJ, Koo HJ, Kim TY. PWN: enhanced random walk on a warped network for disease target prioritization. BMC Bioinformatics 2023; 24:105. [PMID: 36944912 PMCID: PMC10031933 DOI: 10.1186/s12859-023-05227-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 03/13/2023] [Indexed: 03/23/2023] Open
Abstract
BACKGROUND Extracting meaningful information from unbiased high-throughput data has been a challenge in diverse areas. Specifically, in the early stages of drug discovery, a considerable amount of data was generated to understand disease biology when identifying disease targets. Several random walk-based approaches have been applied to solve this problem, but they still have limitations. Therefore, we suggest a new method that enhances the effectiveness of high-throughput data analysis with random walks. RESULTS We developed a new random walk-based algorithm named prioritization with a warped network (PWN), which employs a warped network to achieve enhanced performance. Network warping is based on both internal and external features: graph curvature and prior knowledge. CONCLUSIONS We showed that these compositive features synergistically increased the resulting performance when applied to random walk algorithms, which led to PWN consistently achieving the best performance among several other known methods. Furthermore, we performed subsequent experiments to analyze the characteristics of PWN.
Affiliation(s)
- Seokjin Han
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea
- Jinhee Hong
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea
- So Jeong Yun
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea
- Hee Jung Koo
  - Standigm UK Co., Ltd, 50-60 Station Road, Cambridge, CB1 2JH, UK.
- Tae Yong Kim
  - Standigm Inc., 70, Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, 06234, Republic of Korea.
9
Triambak S, Mahapatra D, Barik N, Chutjian A. Plausible explanation for the third COVID-19 wave in India and its implications. Infect Dis Model 2023; 8:183-191. [PMID: 36643865 PMCID: PMC9824946 DOI: 10.1016/j.idm.2023.01.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Revised: 12/29/2022] [Accepted: 01/01/2023] [Indexed: 01/09/2023] Open
Abstract
Recently some of us used a random-walk Monte Carlo simulation approach to study the spread of COVID-19. The calculations were reasonably successful in describing secondary and tertiary waves of infection, in countries such as the USA, India, South Africa and Serbia. However, they failed to predict the observed third wave for India. In this work we present a more complete set of simulations for India, that take into consideration two aspects that were not incorporated previously. These include the stochastic movement of an erstwhile protected fraction of the population, and the reinfection of some recovered individuals because of their exposure to a new variant of the SARS-CoV-2 virus. The extended simulations now show the third COVID-19 wave for India that was missing in the earlier calculations. They also suggest an additional fourth wave, which was indeed observed during approximately the same time period as the model prediction.
Affiliation(s)
- S. Triambak
  - Department of Physics and Astronomy, University of the Western Cape, P/B X17, Bellville, 7535, South Africa (corresponding author)
- D.P. Mahapatra
  - Department of Physics, Utkal University, Vani Vihar, Bhubaneshwar, 751004, India
- N. Barik
  - Department of Physics, Utkal University, Vani Vihar, Bhubaneshwar, 751004, India
- A. Chutjian
  - Armenian Engineers and Scientists of America, 326 Mira Loma Ave., Glendale, CA, 91204, USA
10
Wang C, Shi J, Cai J, Zhang Y, Zheng X, Zhang N. DriverRWH: discovering cancer driver genes by random walk on a gene mutation hypergraph. BMC Bioinformatics 2022; 23:277. [PMID: 35831792 PMCID: PMC9281118 DOI: 10.1186/s12859-022-04788-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/21/2021] [Accepted: 06/08/2022] [Indexed: 12/24/2022] Open
Abstract
Background Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data. A critical challenge in cancer genomics is the identification of a few cancer driver genes whose mutations cause tumor growth. However, the majority of existing computational approaches underuse the co-occurrence mutation information of individuals, which is deemed to be important in tumorigenesis and tumor progression, resulting in a high rate of false positives. Results To make full use of co-mutation information, we present a random walk algorithm, referred to as DriverRWH, on a weighted gene mutation hypergraph model, using somatic mutation data and molecular interaction network data to prioritize candidate driver genes. Applied to tumor samples of different cancer types from The Cancer Genome Atlas, DriverRWH shows significantly better performance than state-of-the-art prioritization methods in terms of the area under the curve scores and the cumulative number of known driver genes recovered in top-ranked candidate genes. In addition, DriverRWH discovers several potential drivers, which are enriched in cancer-related pathways. DriverRWH recovers approximately 50% of known driver genes in the top 30 ranked candidate genes for more than half of the cancer types. DriverRWH is also highly robust to perturbations in the mutation data and gene functional network data. Conclusion DriverRWH is effective across various cancer types in prioritizing cancer driver genes and provides considerable improvement over other tools, with a better balance of precision and sensitivity. It can be a useful tool for detecting potential driver genes and facilitating targeted cancer therapies. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04788-7.
Affiliation(s)
- Chenye Wang
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Junhan Shi
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Jiansheng Cai
  - Department of Mathematics, Weifang University, Weifang, 261061, Shandong, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Xiaoqi Zheng
  - Department of Mathematics, Shanghai Normal University, Shanghai, 200234, China
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China.
11
Mahapatra DP, Triambak S. Towards predicting COVID-19 infection waves: A random-walk Monte Carlo simulation approach. Chaos Solitons Fractals 2022; 156:111785. [PMID: 35035125 PMCID: PMC8743467 DOI: 10.1016/j.chaos.2021.111785] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/11/2021] [Revised: 12/27/2021] [Accepted: 12/30/2021] [Indexed: 06/01/2023]
Abstract
Phenomenological and deterministic models are often used for the estimation of transmission parameters in an epidemic and for the prediction of its growth trajectory. Such analyses are usually based on single peak outbreak dynamics. In light of the present COVID-19 pandemic, there is a pressing need to better understand observed epidemic growth with multiple peak structures, preferably using first-principles methods. Along the lines of our previous work [Physica A 574, 126014 (2021)], here we apply 2D random-walk Monte Carlo calculations to better understand COVID-19 spread through contact interactions. Lockdown scenarios and all other control interventions are imposed through mobility restrictions and a regulation of the infection rate within the stochastically interacting population. The susceptible, infected and recovered populations are tracked over time, with daily infection rates obtained without recourse to the solution of differential equations. The simulations were carried out for population densities corresponding to four countries, India, Serbia, South Africa and USA. In all cases our results capture the observed infection growth rates. More importantly, the simulation model is shown to predict secondary and tertiary waves of infections with reasonable accuracy. This predictive nature of multiple wave structures provides a simple and effective tool that may be useful in planning mitigation strategies during the present pandemic.
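The abstract gives no implementation details, so the following is only a toy sketch of the general idea: agents perform a 2D random walk on a periodic lattice, infection passes between agents sharing a cell, and the susceptible, infected and recovered counts are tallied each step without solving differential equations. Every name and parameter here (`simulate`, `move_prob`, `infect_prob`, `recover_after`, the lattice size) is an assumption made for illustration, not the authors' model.

```python
import random


def simulate(n_agents=200, size=20, steps=50, infect_prob=0.5,
             recover_after=10, move_prob=1.0, seed=0):
    """Toy 2D random-walk epidemic; returns a list of (S, I, R) counts."""
    rng = random.Random(seed)
    pos = [(rng.randrange(size), rng.randrange(size)) for _ in range(n_agents)]
    state = ["S"] * n_agents
    state[0] = "I"                      # a single initial infection
    time_infected = [0] * n_agents
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    history = []
    for _ in range(steps):
        # Mobility: lowering move_prob mimics a mobility restriction.
        for i in range(n_agents):
            if rng.random() < move_prob:
                dx, dy = rng.choice(moves)
                x, y = pos[i]
                pos[i] = ((x + dx) % size, (y + dy) % size)
        # Contact: susceptible agents sharing a cell with an infected one.
        hot_cells = {pos[i] for i in range(n_agents) if state[i] == "I"}
        newly_infected = [
            i for i in range(n_agents)
            if state[i] == "S" and pos[i] in hot_cells
            and rng.random() < infect_prob
        ]
        # Recovery after a fixed number of steps (an assumed rule).
        for i in range(n_agents):
            if state[i] == "I":
                time_infected[i] += 1
                if time_infected[i] >= recover_after:
                    state[i] = "R"
        for i in newly_infected:
            state[i] = "I"
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history


history = simulate()
no_spread = simulate(infect_prob=0.0)
```

Setting `infect_prob=0.0` is a quick sanity check: the single initial case recovers and nobody else is infected.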
Affiliation(s)
- D P Mahapatra
  - Department of Physics, Utkal University, Vani Vihar, Bhubaneshwar 751004, India
- S Triambak
  - Department of Physics and Astronomy, University of the Western Cape, P/B X17, Bellville 7535, South Africa
12
Alpern S, Zeng L. Social Distancing, Gathering, Search Games: Mobile Agents on Simple Networks. Dyn Games Appl 2022; 12:288-311. [PMID: 35127231 PMCID: PMC8809073 DOI: 10.1007/s13235-022-00427-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/03/2022] [Indexed: 06/14/2023]
Abstract
During epidemics, the population is asked to socially distance, with pairs of individuals keeping two meters apart. We model this as a new optimization problem by considering a team of agents placed on the nodes of a network. Their common aim is to achieve pairwise graph distances of at least D, a state we call socially distanced. (If D = 1, they want to be at distinct nodes; if D = 2, they want to be non-adjacent.) We allow only a simple type of motion called a lazy random walk: with probability p (called the laziness parameter), they remain at their current node next period; with complementary probability 1 - p, they move to a random adjacent node. The team seeks the common value of p which achieves social distance in the least expected time, which is the absorption time of a Markov chain. We observe that the same Markov chain, with different goals (absorbing states), models the gathering, or multi-rendezvous problem (all agents at the same node). Allowing distinct laziness for two types of agents (searchers and hider) extends the existing literature on predator-prey search games to multiple searchers. We consider only special networks: line, cycle and grid.
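The lazy random walk described above can be sketched as a small Monte Carlo simulation on a cycle network. The paper treats the problem analytically as a Markov chain absorption time; the simulation below is only an illustrative approximation, and the function names, the cycle size and the trial count are assumptions.

```python
import random


def cycle_distance(a, b, n):
    """Graph distance between nodes a and b on a cycle with n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def lazy_step(pos, n, p, rng):
    """Stay put with probability p, else move to a random adjacent node."""
    if rng.random() < p:
        return pos
    return (pos + rng.choice((-1, 1))) % n


def time_to_distance(n, start, D, p, rng, max_steps=100_000):
    """Steps until all pairwise distances are at least D."""
    positions = list(start)
    for t in range(max_steps + 1):
        distanced = all(
            cycle_distance(a, b, n) >= D
            for i, a in enumerate(positions)
            for b in positions[i + 1:]
        )
        if distanced:
            return t
        positions = [lazy_step(x, n, p, rng) for x in positions]
    raise RuntimeError("not socially distanced within max_steps")


def mean_time(n, start, D, p, trials=2000, seed=0):
    """Monte Carlo estimate of the expected time to social distance."""
    rng = random.Random(seed)
    times = [time_to_distance(n, start, D, p, rng) for _ in range(trials)]
    return sum(times) / len(times)


# Two agents starting at the same node of a 6-cycle, target distance D = 2.
estimate = mean_time(6, (0, 0), D=2, p=0.5, trials=200)
```

Scanning `p` over a grid and comparing the estimates gives a rough numerical view of the optimization the abstract describes, namely which common laziness value minimises the expected time.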
Affiliation(s)
- Steve Alpern
  - Warwick Business School, University of Warwick, Coventry, CV4 7AL UK
- Li Zeng
  - Department of Statistics, University of Warwick, Coventry, CV4 7AL UK
13
Thafar MA, Olayan RS, Albaradei S, Bajic VB, Gojobori T, Essack M, Gao X. DTi2Vec: Drug-target interaction prediction using network embedding and ensemble learning. J Cheminform 2021; 13:71. [PMID: 34551818 PMCID: PMC8459562 DOI: 10.1186/s13321-021-00552-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/24/2020] [Accepted: 09/05/2021] [Indexed: 11/21/2022] Open
Abstract
Drug-target interaction (DTI) prediction is a crucial step in drug discovery and repositioning as it reduces experimental validation costs if done right. Thus, developing in-silico methods to predict potential DTIs has become a competitive research niche, with one of its main focuses being improving the prediction accuracy. Using machine learning (ML) models for this task, specifically network-based approaches, is effective and has shown great advantages over other computational methods. However, ML model development involves upstream hand-crafted feature extraction and other processes that impact prediction accuracy. Thus, network-based representation learning techniques that provide automated feature extraction, combined with traditional ML classifiers that handle the downstream link prediction task, may be better-suited paradigms. Here, we present such a method, DTi2Vec, which identifies DTIs using network representation learning and ensemble learning techniques. DTi2Vec constructs the heterogeneous network and then automatically generates features for each drug and target using a node embedding technique. DTi2Vec demonstrated its ability in drug-target link prediction compared to several state-of-the-art network-based methods, using four benchmark datasets and large-scale data compiled from DrugBank. DTi2Vec showed a statistically significant increase in prediction performance in terms of AUPR. We verified the "novel" predicted DTIs using several databases and scientific literature. DTi2Vec is a simple yet effective method that provides high DTI prediction performance while being scalable and efficient in computation, translating into a powerful drug repositioning tool.
Affiliation(s)
- Maha A Thafar
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- College of Computers and Information Technology, Computer Science Department, Taif University, Taif, Kingdom of Saudi Arabia
| | - Rawan S Olayan
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Somayah Albaradei
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
| | - Vladimir B Bajic
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Takashi Gojobori
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Magbubah Essack
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
|
14
|
Nasiri E, Berahmand K, Rostami M, Dabiri M. A novel link prediction algorithm for protein-protein interaction networks by attributed graph embedding. Comput Biol Med 2021; 137:104772. [PMID: 34450380 DOI: 10.1016/j.compbiomed.2021.104772] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/29/2021] [Revised: 07/29/2021] [Accepted: 08/13/2021] [Indexed: 10/20/2022]
Abstract
The prediction of interactions in protein networks is very critical in various biological processes. In recent years, scientists have focused on computational approaches to predict the interactions of proteins. In protein-protein interaction (PPI) networks, each protein is accompanied by various features, including amino acid sequence, subcellular location, and protein domains. Embedding-based methods have been widely applied for many network analysis tasks, such as link prediction. The Deepwalk algorithm is one of the most popular graph embedding methods that capture the network structure using pure random walking. Here in this paper, we treat the protein-protein interaction prediction problem as a link prediction in attributed networks, and we use an attributed embedding approach to predict the interactions between proteins in the PPI network. In particular, the present paper seeks to present a modified version of Deepwalk based on feature selection for solving link prediction in the protein-protein interaction, which will benefit both network structure and protein features. More specifically the feature selection step consists of two distinct parts. First, a set of relevant features are selected from the original feature set, such that the dimensionality of features is reduced. Second, in the selected set of features, each feature is assigned with a weight based on its significance and therefore the contribution of each feature is distinguished from others. In this method, the new random walk model for link prediction will be introduced by integrating network structure and protein features, based on the assumption that two nodes on the network will be linked since they are nearby in the network. In order to justify the proposal, the authors carry out many experiments on protein-protein interaction networks for comparison with the state-of-the-art network embedding methods. 
The experimental results from the graphs indicate that our proposed approach is more capable compared to other link prediction approaches and increases the accuracy of prediction.
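The walk-generation step that DeepWalk-style embeddings build on can be sketched as follows. This is a minimal illustration of plain, uniform truncated random walks only; it does not implement the feature-selection or feature-weighting modification described in the abstract. The function name `generate_walks`, the toy network `toy_ppi`, and the parameter values are assumptions made for this example.

```python
import random


def generate_walks(adjacency, walk_length=5, walks_per_node=2, seed=0):
    """Generate uniform truncated random walks over an adjacency-list graph.

    adjacency maps each node to a list of its neighbours (hypothetical input format).
    """
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy undirected protein-protein interaction network (illustrative only).
toy_ppi = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P3"],
    "P3": ["P1", "P2", "P4"],
    "P4": ["P3"],
}

walks = generate_walks(toy_ppi)
for walk in walks:
    print(" -> ".join(walk))
```

In a DeepWalk-style pipeline, walks like these would then be treated as sentences for a skip-gram model, so that nodes that co-occur in walks receive similar embeddings.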
Affiliation(s)
- Elahe Nasiri
- Department of Information Technology and Communications, Azarbaijan Shahid Madani University, Tabriz, Iran.
| | - Kamal Berahmand
- School of Computer Sciences, Department of Science and Engineering, Queensland University of Technology, Brisbane, Australia.
| | - Mehrdad Rostami
- Department of Computer Engineering, University of Kurdistan, Sanandaj, Iran.
| | - Mohammad Dabiri
- Department of Plant Biotechnology, University of Kurdistan, Sanandaj, Iran.
|
15
|
Triambak S, Mahapatra DP. A random walk Monte Carlo simulation study of COVID-19-like infection spread. Physica A 2021; 574:126014. [PMID: 33875903 PMCID: PMC8047309 DOI: 10.1016/j.physa.2021.126014] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/02/2020] [Revised: 03/05/2021] [Indexed: 05/30/2023]
Abstract
Recent analysis of early COVID-19 data from China showed that the number of confirmed cases followed a subexponential power-law increase, with a growth exponent of around 2.2 (Maier and Brockmann, 2020). The power-law behavior was attributed to a combination of effective containment and mitigation measures employed as well as behavioral changes by the population. In this work, we report a random walk Monte Carlo simulation study of proximity-based infection spread. Control interventions such as lockdown measures and mobility restrictions are incorporated in the simulations through a single parameter, the size of each step in the random walk process. The step size l is taken to be a multiple of ⟨r⟩, which is the average separation between individuals. Three temporal growth regimes (quadratic, intermediate power-law and exponential) are shown to emerge naturally from our simulations. For l = ⟨r⟩, we get intermediate power-law growth exponents that are in general agreement with available data from China. On the other hand, we obtain a quadratic growth for smaller step sizes l ≲ ⟨r⟩/2, while for large l the growth is found to be exponential. We further performed a comparative case study of early fatality data (under varying levels of lockdown conditions) from three other countries, India, Brazil and South Africa. We show that reasonable agreement with these data can be obtained by incorporating small-world-like connections in our simulations.
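The general idea of a proximity-based random walk simulation, with the step size expressed as a multiple of the mean separation, can be sketched as below. This is an illustrative toy and not the authors' simulation: the unit square with wrap-around movement, the estimate of ⟨r⟩ as 1/√N, the contact radius, the plain (non-wrapped) distance check, and the name `simulate_spread` are all assumptions made for this example.

```python
import math
import random


def simulate_spread(n_people=150, step_multiple=1.0, contact_fraction=0.5,
                    n_steps=30, seed=0):
    """Return the cumulative number of infected walkers after each time step."""
    rng = random.Random(seed)
    mean_sep = 1.0 / math.sqrt(n_people)   # assumed estimate of <r> in a unit square
    step = step_multiple * mean_sep        # step size l as a multiple of <r>
    radius = contact_fraction * mean_sep   # assumed infection (contact) radius

    pos = [(rng.random(), rng.random()) for _ in range(n_people)]
    infected = [False] * n_people
    infected[0] = True                     # a single initial case
    counts = [1]

    for _ in range(n_steps):
        # Every walker takes one step of fixed length in a random direction.
        moved = []
        for x, y in pos:
            theta = rng.uniform(0.0, 2.0 * math.pi)
            moved.append(((x + step * math.cos(theta)) % 1.0,
                          (y + step * math.sin(theta)) % 1.0))
        pos = moved

        # Walkers infected before this step infect anyone within the contact radius.
        sources = [i for i in range(n_people) if infected[i]]
        for i in sources:
            xi, yi = pos[i]
            for j in range(n_people):
                if not infected[j]:
                    xj, yj = pos[j]
                    if math.hypot(xi - xj, yi - yj) <= radius:
                        infected[j] = True
        counts.append(sum(infected))
    return counts


counts = simulate_spread(step_multiple=1.0)
print(counts)
```

Varying `step_multiple` plays the role of the single mobility parameter described in the abstract; reproducing the reported growth regimes would require the authors' actual model and far larger populations.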
Affiliation(s)
- S Triambak
- Department of Physics and Astronomy, University of the Western Cape, P/B X17, Bellville 7535, South Africa
| | - D P Mahapatra
- Department of Physics, Utkal University, Vani Vihar, Bhubaneshwar 751004, India
|
16
|
Dasgupta A, Sengupta S. Scalable Estimation of Epidemic Thresholds via Node Sampling. Sankhya Ser A 2021; 84:321-344. [PMID: 34248309 PMCID: PMC8260572 DOI: 10.1007/s13171-021-00249-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2020] [Accepted: 05/11/2021] [Indexed: 02/06/2023]
Abstract
Infectious or contagious diseases can be transmitted from one person to another through social contact networks. In today's interconnected global society, such contagion processes can cause global public health hazards, as exemplified by the ongoing Covid-19 pandemic. It is therefore of great practical relevance to investigate the network transmission of contagious diseases from the perspective of statistical inference. An important and widely studied boundary condition for contagion processes over networks is the so-called epidemic threshold. The epidemic threshold plays a key role in determining whether a pathogen introduced into a social contact network will cause an epidemic or die out. In this paper, we investigate epidemic thresholds from the perspective of statistical network inference. We identify two major challenges that are caused by high computational and sampling complexity of the epidemic threshold. We develop two statistically accurate and computationally efficient approximation techniques to address these issues under the Chung-Lu modeling framework. The second approximation, which is based on random walk sampling, further enjoys the advantage of requiring data on a vanishingly small fraction of nodes. We establish theoretical guarantees for both methods and demonstrate their empirical superiority.
Affiliation(s)
- Anirban Dasgupta
- Computer Science and Engineering, Indian Institute of Technology, Gandhinagar, Gandhinagar, India
| | - Srijan Sengupta
- Statistics, North Carolina State University, Raleigh, NC USA
|
17
|
Li S, Cao Y, Zhang H, Lu X, Wang T, Xu S, Kong T, Bo C, Li L, Ning S, Wang J, Wang L. Construction of lncRNA-Mediated ceRNA Network for Investigating Immune Pathogenesis of Ischemic Stroke. Mol Neurobiol 2021; 58:4758-4769. [PMID: 34173933 DOI: 10.1007/s12035-021-02426-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/03/2021] [Accepted: 05/11/2021] [Indexed: 11/30/2022]
Abstract
Ischemic stroke (IS) is a common and serious neurological disease. Extensive evidence indicates that activation of the immune system contributes significantly to the development of IS pathology. In recent years, some long non-coding RNAs (lncRNAs), acting as competing endogenous RNAs (ceRNAs), have been reported to affect IS process, especially the immunological response after stroke. However, the roles of lncRNA-mediated ceRNAs in immune pathogenesis of IS are not systemically investigated. In the present study, we generated a global immune-related ceRNA network containing immune-related genes (IRGs), miRNAs, and lncRNAs based on experimentally verified interactions. Further, we excavated an IS immune-related ceRNA (ISIRC) network through mapping significantly differentially expressed IRGs, miRNAs, and lncRNAs of patients with IS into the global network. We analyzed the topological properties of the two networks, respectively, and found that lncRNA NEAT1 and lncRNA KCNQ1OT1 played core roles in aforementioned two immune-related networks. Moreover, the results of functional enrichment analyses revealed that lncRNAs in the ISIRC network were mainly involved in several immune-related biological processes and pathways. Finally, we identified 17 lncRNAs which were highly related to the immune mechanism of IS through performing random walk with restart for the ISIRC network. Importantly, it has been confirmed that NEAT1, KCNQ1OT1, GAS5, and RMRP could regulate immuno-inflammatory response after stroke, such as production of inflammatory factors and activation of the immune cells. Our results suggested that lncRNAs exerted an important role in the immune pathogenesis of IS and provided a new strategy to do research on IS.
Affiliation(s)
- Shuang Li
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Yuze Cao
- Department of Neurology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China
| | - Huixue Zhang
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Xiaoyu Lu
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Tianfeng Wang
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Si Xu
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Tongxiao Kong
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Chunrui Bo
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Lifang Li
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Shangwei Ning
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China.
| | - Jianjian Wang
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China.
| | - Lihua Wang
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China.
|
18
|
Nagatani T, Tainaka KI. Effects of pest control on a food chain in patchy environment: Species-dependent activity range on multilayer graphs. Biosystems 2021; 206:104425. [PMID: 33865913 DOI: 10.1016/j.biosystems.2021.104425] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Revised: 03/31/2021] [Accepted: 04/02/2021] [Indexed: 12/28/2022]
Abstract
Ecosystems on earth are strongly affected by human life. We pay attention to pest control in a patchy environment. To date, many authors have reported the indeterminacy in pest control. Most of these works have been studied in single-habitat systems. In the present article, however, we consider a food chain model (prey, predator and top predator) on five networks of patches, where node and link denote habitable patch and migration path, respectively. Each network includes three layers which represent the activity ranges of respective species. Reaction-migration equations are solved analytically and numerically. It is found the dynamics largely change depending on the geometry of networks. When removal rate of top predator is increased, the so-called "top-down effect" is commonly observed. In this case, the pest control will be successful, but extinction point of top predator largely differs on different networks. When removal rate of intermediate predator is increased, the responses of system become complicated. The responses differ not only for each patch but also for each geometry. Hence, the pest control on intermediate predators may fail.
Affiliation(s)
- Takashi Nagatani
- Department of Mechanical Engineering, Shizuoka University, Hamamatsu, 432-8561, Japan
| | - Kei-Ichi Tainaka
- Department of Mathematical and Systems Engineering, Shizuoka University, Hamamatsu, 432-8561, Japan.
|
19
|
Ko I, Chambers D, Barrett E. Recurrent autonomous autoencoder for intelligent DDoS attack mitigation within the ISP domain. INT J MACH LEARN CYB 2021;:1-23. [PMID: 33786073 DOI: 10.1007/s13042-021-01306-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2020] [Accepted: 03/10/2021] [Indexed: 10/29/2022]
Abstract
The continuous advancement of DDoS attack technology and an increasing number of IoT devices connected on 5G networks escalate the level of difficulty for DDoS mitigation. A growing number of researchers have started to utilise Deep Learning algorithms to improve the performance of DDoS mitigation systems. Real DDoS attack data has no labels, and hence, we present an intelligent attack mitigation (IAM) system, which takes an ensemble approach by employing Recurrent Autonomous Autoencoders (RAA) as basic learners with a majority voting scheme. The RAA is a target-driven, distribution-enabled, and imbalanced clustering algorithm, which is designed to work with the ISP's blackholing mechanism for DDoS flood attack mitigation. It can dynamically select features, decide a reference target (RT), and determine an optimal threshold to classify network traffic. A novel Comparison-Max Random Walk algorithm is used to determine the RT, which is used as an instrument to direct the model to classify the data so that the predicted positives are close or equal to the RT. We also propose Estimated Evaluation Metrics (EEM) to evaluate the performance of unsupervised models. The IAM system is tested with UDP flood, TCP flood, ICMP flood, multi-vector and a real UDP flood attack data. Additionally, to check the scalability of the IAM system, we tested it on every subdivided data set for distributed computing. The average Recall on all data sets was above 98%.
|
20
|
Bijma NN, Filippov AE, Gorb SN. Sisyphus and his rock: Quasi-random walk inspired by the motion of a ball transported by a dung beetle on combined terrain. J Theor Biol 2021; 520:110659. [PMID: 33662373 DOI: 10.1016/j.jtbi.2021.110659] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/10/2019] [Revised: 02/12/2021] [Accepted: 02/25/2021] [Indexed: 10/22/2022]
Abstract
The majority of biologically inspired dynamic problems are essentially defined by the complexity of the contact surface where such motion takes place. From a statistical point of view, such a surface in many biological problems is typically a combination of a universal scale invariant (fractal) component and a well-defined component having a characteristic scale. If the biological object, here a dung ball, or its parts have a size comparable to the dimensions of the surface peculiarities, one can expect a strong influence on the motion. To avoid competition for the same food resource, some dung-feeding insect species form a dung ball and roll it away from the dung pile. In order to quickly escape competition, dung beetles seem to strictly follow an initial bearing. On flat terrain, they manage to roll a dung ball along a nearly perfect straight path. However, on a more realistic terrain, which normally includes both components mentioned above, the motion is more complex. In this study, we numerically model the ball transportation on terrain with different scales of surface profile. A strong correlation is observed between effective ball transportation (time, distance, work) and the ratio of the size of the ball relative to the size of the terrain roughness. Surface irregularities, with a characteristic size comparable to the ball diameter, are negatively correlated to the efficiency of ball transportation. In addition a strong correlation is found between the quasi random noise, numerically simulating the activity of a dung beetle trying to escape from a valley in which it is trapped, and the success in ball transportation.
Affiliation(s)
- Nienke N Bijma
- Functional Morphology and Biomechanics, Zoological Institute, Kiel University, Am Botanischen Garten, 1-9, Kiel 24118, Germany.
| | - Alexander E Filippov
- Functional Morphology and Biomechanics, Zoological Institute, Kiel University, Am Botanischen Garten, 1-9, Kiel 24118, Germany; Donetsk Institute for Physics and Engineering, National Academy of Sciences of Ukraine, Donetsk, Ukraine
| | - Stanislav N Gorb
- Functional Morphology and Biomechanics, Zoological Institute, Kiel University, Am Botanischen Garten, 1-9, Kiel 24118, Germany
|
21
|
Xiao Y, Xiao Z, Feng X, Chen Z, Kuang L, Wang L. A novel computational model for predicting potential LncRNA-disease associations based on both direct and indirect features of LncRNA-disease pairs. BMC Bioinformatics 2020; 21:555. [PMID: 33267800 PMCID: PMC7709313 DOI: 10.1186/s12859-020-03906-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/22/2019] [Accepted: 11/25/2020] [Indexed: 12/25/2022] Open
Abstract
Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and it is useful for the diagnosis and treatment of diseases to get the relationships between lncRNAs and diseases. Due to the high costs and time complexity of traditional bio-experiments, in recent years, more and more computational methods have been proposed by researchers to infer potential lncRNA-disease associations. However, there exist all kinds of limitations in these state-of-the-art prediction methods as well. Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. In FVTLDA, its major novelty lies in the integration of direct and indirect features related to lncRNA-disease associations such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be utilized to predict diseases without known related-lncRNAs and lncRNAs without known related-diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease nor requires any negative samples, which guarantee that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single model prediction techniques, we combine FVTLDA with the Multiple Linear Regression (MLR) and the Artificial Neural Network (ANN) for data analysis respectively. Simulation experiment results show that FVTLDA with MLR can achieve reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (fivefold CV), 10-Fold Cross Validation (tenfold CV) and Leave-One-Out Cross Validation (LOOCV), separately, while FVTLDA with ANN can achieve reliable AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV, and LOOCV respectively.
Furthermore, in case studies of gastric cancer, leukemia and lung cancer, experiment results show that there are 8, 8 and 8 out of top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 out of top 10 candidate lncRNAs predicted by FVTLDA with ANN, having been verified by recent literature. Comparing with the representative prediction model of KATZLDA, comparison results illustrate that FVTLDA with MLR and FVTLDA with ANN can achieve the average case study contrast scores of 0.8429 and 0.8515 respectively, which are both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion: The simulation results show that FVTLDA has good prediction performance, which is a good supplement to future bioinformatics research.
Affiliation(s)
- Yubin Xiao
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, People's Republic of China
| | - Zheng Xiao
- Hunan Province Key Laboratory of Tumor Cellular and Molecular Pathology, Cancer Research Institute, University of South China, Hengyang, 421001, Hunan, People's Republic of China
| | - Xiang Feng
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China
| | - Zhiping Chen
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China
| | - Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, People's Republic of China
| | - Lei Wang
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, People's Republic of China.
|
22
|
Christensen K, Cocconi L, Sendova-Franks AB. Animal intermittent locomotion: A null model for the probability of moving forward in bounded space. J Theor Biol 2020; 510:110533. [PMID: 33181179 DOI: 10.1016/j.jtbi.2020.110533] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/21/2019] [Revised: 10/26/2020] [Accepted: 10/27/2020] [Indexed: 01/06/2023]
Abstract
We present a null model to be compared with biological data to test for intrinsic persistence in movement between stops during intermittent locomotion in bounded space with different geometries and boundary conditions. We describe spatio-temporal properties of the sequence of stopping points r_1, r_2, r_3, … visited by a Random Walker within a bounded space. The path between stopping points is not considered, only the displacement. Since there are no intrinsic correlations in the displacements between stopping points, there is no intrinsic persistence in the movement between them. Hence, this represents a null-model against which to compare empirical data for directional persistence in the movement between stopping points when there is external bias due to the bounded space. This comparison is a necessary first step in testing hypotheses about the function of the stops that punctuate intermittent locomotion in diverse organisms. We investigate the probability of forward movement, defined as a deviation of less than 90° between two successive displacement vectors, as a function of the ratio between the largest displacement between stops that could be performed by the random walker and the system size, α = Δℓ/L_max. As expected, the probability of forward movement is 1/2 when α → 0. However, when α is finite, this probability is less than 1/2 with a minimum value when α = 1. For certain boundary conditions, the minimum value is between 1/3 and 1/4 in 1D while it can be even lower in 2D. The probability of forward movement in 1D is calculated exactly for all values 0 < α ⩽ 1 for several boundary conditions. Analytical calculations for the probability of forward movement are performed in 2D for circular and square bounded regions with one boundary condition. Numerical results for all values 0 < α ⩽ 1 are presented for several boundary conditions. 
The cases of rectangle and ellipse are also considered and an approximate model of the dependence of the forward movement probability on the aspect ratio is provided. Finally, some practical points are presented on how these results can be utilised in the empirical analysis of animal movement in two-dimensional bounded space.
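The 1D behaviour described above can be explored with a small Monte Carlo sketch. It assumes one specific boundary rule that is not taken from the paper: a proposed displacement, drawn uniformly from [-Δℓ, Δℓ], is redrawn until the walker stays inside the interval [0, L]. The function name `forward_probability` and all parameter values are likewise assumptions for illustration.

```python
import random


def forward_probability(alpha, n_steps=200_000, seed=0):
    """Estimate P(forward movement) for a 1D walker confined to [0, L].

    alpha is the ratio of the largest allowed displacement to the system size.
    In 1D, "forward" means two successive displacements share the same sign.
    """
    rng = random.Random(seed)
    length = 1.0
    max_step = alpha * length
    x = rng.uniform(0.0, length)
    previous = None
    forward = 0
    pairs = 0
    for _ in range(n_steps):
        # Assumed boundary rule: redraw the displacement until it stays inside.
        while True:
            d = rng.uniform(-max_step, max_step)
            if 0.0 <= x + d <= length:
                break
        if previous is not None:
            pairs += 1
            if previous * d > 0:
                forward += 1
        x += d
        previous = d
    return forward / pairs


for alpha in (0.01, 0.5, 1.0):
    print(alpha, round(forward_probability(alpha), 3))
```

Under this particular redraw rule with α = 1, each new stopping point is an independent uniform draw on [0, L], so three successive points are monotonically ordered with probability 2/6 and the estimate should sit near 1/3. For small α it should approach 1/2. Other boundary conditions treated in the paper would give different values.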
Affiliation(s)
- Kim Christensen
- Blackett Laboratory, Imperial College London, London SW7 2AZ, UK; Center for Complexity Science, Imperial College London, London SW7 2AZ, UK.
| | - Luca Cocconi
- Blackett Laboratory, Imperial College London, London SW7 2AZ, UK; Center for Complexity Science, Imperial College London, London SW7 2AZ, UK; Theoretical Physics of Biology Laboratory, The Francis Crick Institute, London NW1 1AT, UK
| | - Ana B Sendova-Franks
- School of Biological Sciences, University of Bristol, 24 Tyndall Avenue, Bristol BS8 1TQ, UK
|
23
|
Sotero RC, Sanchez-Rodriguez LM, Moradi N, Dousty M. Estimation of global and local complexities of brain networks: A random walks approach. Netw Neurosci 2020; 4:575-594. [PMID: 32885116 PMCID: PMC7462425 DOI: 10.1162/netn_a_00138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2019] [Accepted: 03/23/2020] [Indexed: 11/29/2022] Open
Abstract
The complexity of brain activity has been observed at many spatial scales and has been proposed to differentiate between mental states and disorders. Here we introduced a new measure of (global) network complexity, constructed as the sum of the complexities of its nodes (i.e., local complexity). The complexity of each node is obtained by comparing the sample entropy of the time series generated by the movement of a random walker on the network resulting from removing the node and its connections, with the sample entropy of the time series obtained from a regular lattice (ordered state) and a random network (disordered state). We studied the complexity of fMRI-based resting-state networks. We found that positively correlated (pos) networks comprising only the positive functional connections have higher complexity than anticorrelation (neg) networks (comprising the negative connections) and the network consisting of the absolute value of all connections (abs). We also observed a significant correlation between complexity and the strength of functional connectivity in the pos network. Our results suggest that the pos network is related to the information processing in the brain and that functional connectivity studies should analyze pos and neg networks separately instead of the abs network, as is commonly done.
Affiliation(s)
- Roberto C. Sotero
- Hotchkiss Brain Institute, University of Calgary, AB, Canada
- Department of Radiology, University of Calgary, AB, Canada
- Biomedical Engineering Graduate Program, University of Calgary, AB, Canada
| | - Lazaro M. Sanchez-Rodriguez
- Hotchkiss Brain Institute, University of Calgary, AB, Canada
- Department of Radiology, University of Calgary, AB, Canada
| | - Narges Moradi
- Hotchkiss Brain Institute, University of Calgary, AB, Canada
- Department of Radiology, University of Calgary, AB, Canada
- Biomedical Engineering Graduate Program, University of Calgary, AB, Canada
| | - Mehdy Dousty
- Institute of Biomaterials and Biomedical Engineering, University of Toronto, ON, Canada
- KITE, Toronto Rehab, University Health Network, Toronto, ON, Canada
|
24
|
Abstract
BACKGROUND: The importance of thermal resources to terrestrial ectotherms has been well documented but less often considered in larger-scale analyses of habitat use and selection, such as those routinely conducted using standard habitat features such as vegetation and physical structure. Selection of habitat based on thermal attributes may be of particular importance for ectothermic species, especially in colder climates. In Canada, Western Rattlesnakes (Crotalus oreganus) reach their northern limits, with limited time to conduct annual migratory movements between hibernacula and summer habitat. We radio-tracked 35 male snakes departing from 10 different hibernacula. We examined coarse-scale differences in migratory movements across the region, and then compared the route of each snake with thermal landscapes and ruggedness GIS maps generated for different periods of the animals' active season. RESULTS: We observed dichotomous habitat use (grasslands versus upland forests) throughout most of the species' northern range, reflected in different migratory movements of male snakes emanating from different hibernacula. Snakes utilizing higher-elevation forests moved further during the course of their annual migrations, and these snakes were more likely to use warmer areas of the landscape. CONCLUSION: In addition to thermal benefits, advantages gained from selective migratory patterns may include prey availability and outbreeding. Testing these alternative hypotheses was beyond the scope of this study, and to collect the data to do so will require overcoming certain challenges. Still, insight into migratory differences between rattlesnake populations and the causal mechanism(s) of migrations will improve our ability to assess the implications of landscape change, management, and efficacy of conservation planning. Our findings suggest that such assessments may need to be tailored to individual dens and the migration strategies of their inhabitants. 
Additionally, local and landscape-scale migration patterns, as detected in this study, will have repercussions for snakes under climate-induced shifts in ecosystem boundaries and thermal regimes.
Affiliation(s)
- Jessica A. Harvey
  - Environmental Science Program, Thompson Rivers University, Kamloops, Canada
  - Victoria, Canada
- Karl W. Larsen
  - Department of Natural Resource Science, Thompson Rivers University, 805 TRU Way, Kamloops, British Columbia V2C 0C8 Canada
|
25
|
Ding Y, Chen B, Lei X, Liao B, Wu FX. Predicting novel CircRNA-disease associations based on random walk and logistic regression model. Comput Biol Chem 2020; 87:107287. [PMID: 32446243 DOI: 10.1016/j.compbiolchem.2020.107287] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/04/2020] [Accepted: 05/09/2020] [Indexed: 12/24/2022]
Abstract
Circular RNAs (circRNAs), a large group of small endogenous noncoding RNA molecules, have been proved to modulate protein-coding genes in the human genome. In recent years, many experimental studies have demonstrated that circRNAs are dysregulated in a number of diseases, and they can serve as biomarkers for disease diagnosis and prognosis. However, it is expensive and time-consuming to identify circRNA-disease associations by biological experiments and few computational models have been proposed for novel circRNA-disease association prediction. In this study, we develop a computational model based on the random walk and the logistic regression (RWLR) to predict circRNA-disease associations. Firstly, a circRNA-circRNA similarity network is constructed by calculating their functional similarity of circRNA based on circRNA-related gene ontology. Then, a random walk with restart is implemented on the circRNA similarity network, and the features of each pair of circRNA-disease are extracted based on the results of the random walk and the circRNA-disease association matrix. Finally, a logistic regression model is used to predict novel circRNA-disease associations. Leave one out validation (LOOCV), five-fold cross validation (5CV) and ten-fold cross validation (10CV) are adopted to evaluate the prediction performance of RWLR, by comparing with the latest two methods PWCDA and DWNN-RLS. The experiment results show that our RWLR has higher AUC values of LOOCV, 5CV and 10CV than the other two latest methods, which demonstrates that RWLR has a better performance than other computational methods. What's more, case studies also illustrate the reliability and effectiveness of RWLR for circRNA-disease association prediction.
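The random walk with restart step described above can be illustrated with a minimal sketch. It assumes a small symmetric similarity matrix, a restart probability of 0.3, and uniform restart mass on the seed nodes; none of these values come from the abstract, and the function name `random_walk_with_restart` is hypothetical. The feature extraction and logistic regression stages of RWLR are not reproduced here.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Return steady-state visiting probabilities for a walk restarted at `seeds`.

    adj   : square list-of-lists of non-negative similarity weights
    seeds : indices of the nodes the walker restarts from
    restart, tol, max_iter : illustrative defaults, not values from the paper
    """
    n = len(adj)
    # Column-normalise the weights so each column is a probability distribution.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] > 0 else 0.0
              for j in range(n)] for i in range(n)]
    # Restart distribution: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        # One step: follow an edge with probability 1 - restart, else jump to a seed.
        nxt = [(1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        converged = sum(abs(a - b) for a, b in zip(nxt, p)) < tol
        p = nxt
        if converged:
            break
    return p


# Toy similarity network: a chain 0 - 1 - 2 - 3 with unit weights.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
scores = random_walk_with_restart(chain, seeds=[0])
print([round(s, 4) for s in scores])
```

On this toy chain the scores decay with distance from the seed, which is the property that makes the steady-state vector usable as a proximity feature.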
Affiliation(s)
- Yulian Ding
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 1L5, Canada
- Bolin Chen
  - School of Computer Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
- Bo Liao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 1L5, Canada
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
|
26
|
Wang W, Smith J, Hejase HA, Liu KJ. Non-parametric and semi-parametric support estimation using SEquential RESampling random walks on biomolecular sequences. Algorithms Mol Biol 2020; 15:7. [PMID: 32322294 PMCID: PMC7164268 DOI: 10.1186/s13015-020-00167-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/21/2019] [Accepted: 04/04/2020] [Indexed: 11/18/2022] Open
Abstract
Non-parametric and semi-parametric resampling procedures are widely used to perform support estimation in computational biology and bioinformatics. Among the most widely used methods in this class is the standard bootstrap method, which consists of random sampling with replacement. While not requiring assumptions about any particular parametric model for resampling purposes, the bootstrap and related techniques assume that sites are independent and identically distributed (i.i.d.). The i.i.d. assumption can be an over-simplification for many problems in computational biology and bioinformatics. In particular, sequential dependence within biomolecular sequences is often an essential biological feature due to biochemical function, evolutionary processes such as recombination, and other factors. To relax the simplifying i.i.d. assumption, we propose a new non-parametric/semi-parametric sequential resampling technique that generalizes “Heads-or-Tails” mirrored inputs, a simple but clever technique due to Landan and Graur. The generalized procedure takes the form of random walks along either aligned or unaligned biomolecular sequences. We refer to our new method as the SERES (or “SEquential RESampling”) method. To demonstrate the performance of the new technique, we apply SERES to estimate support for the multiple sequence alignment problem. Using simulated and empirical data, we show that SERES-based support estimation yields comparable or typically better performance compared to state-of-the-art methods.
|
27
|
Rezaeinia P, Fairley K, Pal P, Meyer FG, Carter RM. Identifying brain network topology changes in task processes and psychiatric disorders. Netw Neurosci 2020; 4:257-273. [PMID: 32181418 PMCID: PMC7069064 DOI: 10.1162/netn_a_00122] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/17/2019] [Accepted: 12/11/2019] [Indexed: 11/04/2022] Open
Abstract
A central goal in neuroscience is to understand how dynamic networks of neural activity produce effective representations of the world. Advances in the theory of graph measures raise the possibility of elucidating network topologies central to the construction of these representations. We leverage a result from the description of lollipop graphs to identify an iconic network topology in functional magnetic resonance imaging data and characterize changes to those networks during task performance and in populations diagnosed with psychiatric disorders. During task performance, we find that task-relevant subnetworks change topology, becoming more integrated by increasing connectivity throughout cortex. Analysis of resting state connectivity in clinical populations shows a similar pattern of subnetwork topology changes; resting scans becoming less default-like with more integrated sensory paths. The study of brain network topologies and their relationship to cognitive models of information processing raises new opportunities for understanding brain function and its disorders.
Affiliation(s)
- Paria Rezaeinia
  - Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA, USA
- Kim Fairley
  - Department of Economics, Leiden University, Leiden, The Netherlands
- Piya Pal
  - Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA, USA
- François G Meyer
  - Department of Applied Mathematics, University of Colorado Boulder, Boulder, CO, USA
- R McKell Carter
  - Institute of Cognitive Science, University of Colorado Boulder, Boulder, CO, USA
|
28
|
Ruiz-Suarez S, Leos-Barajas V, Alvarez-Castro I, Morales JM. Using approximate Bayesian inference for a "steps and turns" continuous-time random walk observed at regular time intervals. PeerJ 2020; 8:e8452. [PMID: 32095333 PMCID: PMC7020826 DOI: 10.7717/peerj.8452] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/29/2019] [Accepted: 12/23/2019] [Indexed: 11/20/2022] Open
Abstract
The study of animal movement is challenging because movement is a process modulated by many factors acting at different spatial and temporal scales. In order to describe and analyse animal movement, several models have been proposed which differ primarily in the temporal conceptualization, namely continuous and discrete time formulations. Naturally, animal movement occurs in continuous time but we tend to observe it at fixed time intervals. To account for the temporal mismatch between observations and movement decisions, we used a state-space model where movement decisions (steps and turns) are made in continuous time. That is, at any time there is a non-zero probability of making a change in movement direction. The movement process is then observed at regular time intervals. As the likelihood function of this state-space model turned out to be intractable yet simulating data is straightforward, we conduct inference using different variations of Approximate Bayesian Computation (ABC). We explore the applicability of this approach as a function of the discrepancy between the temporal scale of the observations and that of the movement process in a simulation study. Simulation results suggest that the model parameters can be recovered if the observation time scale is moderately close to the average time between changes in movement direction. Good estimates were obtained when the scale of observation was up to five times that of the scale of changes in direction. We demonstrate the application of this model to a trajectory of a sheep that was reconstructed in high resolution using information from magnetometer and GPS devices. The state-space model used here allowed us to connect the scales of the observations and movement decisions in an intuitive and easy to interpret way. Our findings underscore the idea that the time scale at which animal movement decisions are made needs to be considered when designing data collection protocols. 
In principle, ABC methods allow inferences to be made about movement processes that are defined in continuous time yet expressed in terms of easily interpreted steps and turns.
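The inference idea in this abstract (an intractable likelihood, but cheap simulation) can be sketched with plain rejection ABC. Everything specific below is an assumption made for illustration: constant speed, exponential waiting times between turns, Gaussian turning angles, a uniform prior on the turning rate, and mean displacement between observations as the single summary statistic. The abstract mentions several ABC variants; this shows only the simplest one and is not the authors' implementation.

```python
import math
import random


def simulate_track(turn_rate, n_obs, dt=1.0, speed=1.0, turn_sd=1.0, rng=random):
    """Simulate a path whose heading changes at exponential waiting times
    (rate `turn_rate`) and return the positions recorded every `dt`."""
    x = y = t = 0.0
    heading = rng.uniform(-math.pi, math.pi)
    next_turn = rng.expovariate(turn_rate)
    track = [(x, y)]
    for k in range(1, n_obs + 1):
        t_obs = k * dt
        # Apply every change of direction that happens before this observation.
        while next_turn < t_obs:
            step = (next_turn - t) * speed
            x += step * math.cos(heading)
            y += step * math.sin(heading)
            t = next_turn
            heading += rng.gauss(0.0, turn_sd)  # assumed turning-angle model
            next_turn = t + rng.expovariate(turn_rate)
        # Move in a straight line up to the observation time.
        step = (t_obs - t) * speed
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        t = t_obs
        track.append((x, y))
    return track


def mean_displacement(track):
    """Summary statistic: mean straight-line distance between observations."""
    dists = [math.hypot(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(track, track[1:])]
    return sum(dists) / len(dists)


def abc_rejection(observed_stat, n_draws, eps, n_obs, rng,
                  prior_low=0.05, prior_high=5.0):
    """Keep the prior draws whose simulated summary is within `eps` of the data."""
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(prior_low, prior_high)  # assumed uniform prior
        stat = mean_displacement(simulate_track(rate, n_obs, rng=rng))
        if abs(stat - observed_stat) < eps:
            accepted.append(rate)
    return accepted


rng = random.Random(1)
observed = mean_displacement(simulate_track(1.0, 200, rng=rng))  # stand-in "data"
posterior = abc_rejection(observed, n_draws=300, eps=0.05, n_obs=200, rng=rng)
print(len(posterior), "accepted draws")
```

The accepted draws approximate the posterior of the turning rate. With frequent turns relative to `dt`, the observed displacement carries less information about individual turns, which is consistent with the abstract's point about the ratio of time scales.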
Affiliation(s)
- Sofia Ruiz-Suarez
  - INIBIOMA (CONICET-Universidad Nacional del Comahue), Rio Negro, Argentina
  - Facultad de Ciencias Económicas, Universidad Nacional de Rosario, Rosario, Argentina
- Vianey Leos-Barajas
  - Department of Statistics, North Carolina State University, Raleigh, United States of America
  - Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, NC, United States of America
|
29
|
Rodríguez J, Jattin J, Soracipa Y. Probabilistic temporal prediction of the deaths caused by traffic in Colombia. Mortality caused by traffic prediction. Accid Anal Prev 2020; 135:105332. [PMID: 31838321 DOI: 10.1016/j.aap.2019.105332] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/08/2019] [Revised: 07/02/2019] [Accepted: 10/15/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND Using probability theory and probabilistic random walks, predictions of the number of cases of a given phenomenon in a given year, such as dengue epidemics, have previously been obtained with precision close to 100%. OBJECTIVE To confirm the applicability of a methodology based on probability and probabilistic random walk to predict the dynamics of deaths from road traffic injuries in Colombia for 2010. METHODOLOGY A total probability space was developed to analyse the probabilistic behaviour of the increases and decreases observed in the variation of the lengths of the death rates caused by traffic in Colombia from 2004 to 2009; from it, the most likely event for 2010 was established and used to predict the death rate for that year. RESULTS The methodology predicted a rate of deaths caused by traffic injuries in Colombia for 2010 of 14.88. Compared with the rate of 12.9 reported by national statistics, the prediction achieved a precision of 86.6%. CONCLUSIONS The applicability of the developed methodology to predict the dynamic behaviour of deaths caused by traffic injuries in Colombia for 2010 by means of a probabilistic random walk was confirmed with good precision, suggesting that this methodology could be useful for verifying the efficacy of national road safety strategies implemented to reduce mortality rates.
Affiliation(s)
- Javier Rodríguez
  - Insight Group, Asociación Colombiana de Neurocirugía, Cra. 79B N° 51-16 Sur. Int. 5, Apt. 102, Kennedy, Bogotá D.C., Colombia
- Jairo Jattin
  - Insight Group, Asociación Colombiana de Neurocirugía, Cra. 79B N° 51-16 Sur. Int. 5, Apt. 102, Kennedy, Bogotá D.C., Colombia
- Yolanda Soracipa
  - Insight Group, Asociación Colombiana de Neurocirugía, Cra. 79B N° 51-16 Sur. Int. 5, Apt. 102, Kennedy, Bogotá D.C., Colombia
|
30
|
Abstract
The abundance of high-throughput data and technical refinements in graph theories have allowed network analysis to become an effective approach for various medical fields. This chapter introduces co-expression, Bayesian, and regression-based network construction methods, which are the basis of network analysis. Various methods in network topology analysis are explained, along with their unique features and applications in biomedicine. Furthermore, we explain the role of network embedding in reducing the dimensionality of networks and outline several popular algorithms used by researchers today. Current literature has implemented different combinations of topology analysis and network embedding techniques, and we outline several studies in the fields of genetic-based disease prediction, drug-target identification, and multi-level omics integration.
|
31
|
Liu H, Zhang W, Nie L, Ding X, Luo J, Zou L. Predicting effective drug combinations using gradient tree boosting based on features extracted from drug-protein heterogeneous network. BMC Bioinformatics 2019; 20:645. [PMID: 31818267 PMCID: PMC6902475 DOI: 10.1186/s12859-019-3288-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/01/2019] [Accepted: 11/21/2019] [Indexed: 01/30/2023] Open
Abstract
Background Although targeted drugs have contributed to impressive advances in the treatment of cancer patients, their clinical benefits on tumor therapies are greatly limited due to intrinsic and acquired resistance of cancer cells against such drugs. Drug combinations synergistically interfere with protein networks to inhibit the activity level of carcinogenic genes more effectively, and therefore play an increasingly important role in the treatment of complex disease. Results In this paper, we combined the drug similarity network, protein similarity network and known drug-protein associations into a drug-protein heterogeneous network. Next, we ran random walk with restart (RWR) on the heterogeneous network using the combinatorial drug targets as the initial probability, and obtained the converged probability distribution as the feature vector of each drug combination. Taking these feature vectors as input, we trained a gradient tree boosting (GTB) classifier to predict new drug combinations. We conducted performance evaluation on the widely used drug combination data set derived from the DCDB database. The experimental results show that our method outperforms seven typical classifiers and traditional boosting algorithms. Conclusions The heterogeneous network-derived features introduced in our method are more informative and enriching compared to the primary ontology features, which results in better performance. In addition, from the perspective of network pharmacology, our method effectively exploits the topological attributes and interactions of drug targets in the overall biological network, which proves to be a systematic and reliable approach for drug discovery.
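As a rough illustration of the network construction step, the sketch below assembles two similarity matrices and a drug-protein association matrix into one block adjacency matrix and builds an initial probability vector for a drug pair. The block layout, the uniform mass on the union of the two drugs' targets, the toy numbers, and both function names are assumptions for illustration; the abstract does not specify these details, and the RWR and gradient tree boosting stages are left out.

```python
def build_heterogeneous_adjacency(drug_sim, protein_sim, drug_protein):
    """Stack the three matrices into one symmetric block matrix:

        [ drug_sim        drug_protein ]
        [ drug_protein^T  protein_sim  ]
    """
    n_d, n_p = len(drug_sim), len(protein_sim)
    size = n_d + n_p
    adj = [[0.0] * size for _ in range(size)]
    for i in range(n_d):
        for j in range(n_d):
            adj[i][j] = drug_sim[i][j]
    for i in range(n_p):
        for j in range(n_p):
            adj[n_d + i][n_d + j] = protein_sim[i][j]
    for i in range(n_d):
        for j in range(n_p):
            # Known associations link the drug layer and the protein layer.
            adj[i][n_d + j] = drug_protein[i][j]
            adj[n_d + j][i] = drug_protein[i][j]
    return adj


def initial_probability(drug_pair, drug_protein, n_drugs):
    """Put uniform starting mass on every protein targeted by either drug
    (an assumed reading of "combinatorial drug targets")."""
    n_p = len(drug_protein[0])
    targets = {j for d in drug_pair for j in range(n_p) if drug_protein[d][j]}
    p0 = [0.0] * (n_drugs + n_p)
    for j in targets:
        p0[n_drugs + j] = 1.0 / len(targets)
    return p0


# Toy data: 2 drugs, 3 proteins (all values hypothetical).
drug_sim = [[1.0, 0.4],
            [0.4, 1.0]]
protein_sim = [[1.0, 0.2, 0.0],
               [0.2, 1.0, 0.5],
               [0.0, 0.5, 1.0]]
drug_protein = [[1, 0, 1],
                [0, 1, 0]]

adj = build_heterogeneous_adjacency(drug_sim, protein_sim, drug_protein)
p0 = initial_probability((0, 1), drug_protein, n_drugs=2)
print(p0)
```

A random walk with restart started from `p0` on `adj` would then yield one probability vector per drug combination, which is the kind of feature vector the abstract describes feeding to the classifier.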
Affiliation(s)
- Hui Liu
  - Lab of Information Management, Changzhou University, Jiangsu, China
- Wenhao Zhang
  - Lab of Information Management, Changzhou University, Jiangsu, China
- Lixia Nie
  - School of Information Science and Engineering, Changzhou University, Jiangsu, China
- Xiancheng Ding
  - Information Center, Changzhou University, Jiangsu, 213164, China
- Judong Luo
  - Department of Radiation Oncology, the Affiliated Changzhou No.2 People's Hospital of Nanjing Medical University, Changzhou, China
- Ling Zou
  - School of Information Science and Engineering, Changzhou University, Jiangsu, China
|
32
|
Ögren M, Jha D, Dobberschütz S, Müter D, Carlsson M, Gulliksson M, Stipp SLS, Sørensen HO. Numerical simulations of NMR relaxation in chalk using local Robin boundary conditions. J Magn Reson 2019; 308:106597. [PMID: 31546178 DOI: 10.1016/j.jmr.2019.106597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2019] [Revised: 09/11/2019] [Accepted: 09/12/2019] [Indexed: 06/10/2023]
Abstract
The interpretation of nuclear magnetic resonance (NMR) data is of interest in a number of fields. In Ögren (2014) local boundary conditions for random walk simulations of NMR relaxation in digital domains were presented. Here, we have applied those boundary conditions to large, three-dimensional (3D) porous media samples. We compared the random walk results with known solutions and then applied them to highly structured 3D domains, from images derived using synchrotron radiation CT scanning of North Sea chalk samples. As expected, there were systematic errors caused by digitalization of the pore surfaces so we quantified those errors, and by using linear local boundary conditions, we were able to significantly improve the output. We also present a technique for treating numerical data prior to input into the ESPRIT algorithm for retrieving Laplace components of time series from NMR data (commonly called T-inversion).
Affiliation(s)
- M Ögren
  - Nano-Science Center, Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 København Ø, Denmark
  - School of Science and Technology, Örebro University, 701 82 Örebro, Sweden
- D Jha
  - Nano-Science Center, Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 København Ø, Denmark
- S Dobberschütz
  - Nano-Science Center, Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 København Ø, Denmark
- D Müter
  - Nano-Science Center, Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 København Ø, Denmark
- M Carlsson
  - Center for Mathematical Sciences, Lund University, Box 118, 22100 Lund, Sweden
- M Gulliksson
  - School of Science and Technology, Örebro University, 701 82 Örebro, Sweden
- S L S Stipp
  - Nano-Science Center, Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 København Ø, Denmark
- H O Sørensen
  - Nano-Science Center, Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 København Ø, Denmark
|
33
|
Yokoi H, Tainaka KI, Sato K. Metapopulation model for a prey-predator system: Nonlinear migration due to the finite capacities of patches. J Theor Biol 2019; 477:24-35. [PMID: 31194986 DOI: 10.1016/j.jtbi.2019.05.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/14/2019] [Revised: 05/29/2019] [Accepted: 05/31/2019] [Indexed: 10/26/2022]
Abstract
Many species live in spatially separated patches, and individuals can migrate between patches through paths. In real ecosystems, the capacities of patches are finite. If a patch is already fully occupied by individuals of some species, then migration into the patch is impossible. In the present paper, we deal with a prey-predator system composed of two patches. Each patch contains a limited number of cells, where each cell is either empty or occupied by an individual prey or predator. We introduce "swapping migration", defined by the exchange between occupied and empty cells. An individual can migrate only when there are empty cells in the destination patch. Reaction-migration equations for the prey-predator system are presented, in which the migration term is a nonlinear function of densities. We numerically solve for the equilibrium densities and find that the population dynamics are largely affected by nonlinear migration. Not only the extinction points but also the responses to environmental changes depend crucially on the patch capacities.
Affiliation(s)
- Hiroki Yokoi
  - National Research Institute of Far Seas Fisheries, Fisheries Research Agency, 5-7-1, Orido, Shimizu, Shizuoka 424-8633, Japan
- Kei-Ichi Tainaka
  - Department of Mathematical and Systems Engineering, Shizuoka University, Hamamatsu 432-8561, Japan
- Kazunori Sato
  - Department of Mathematical and Systems Engineering, Shizuoka University, Hamamatsu 432-8561, Japan
|
34
|
Nordam T, Nepstad R, Litzler E, Röhrs J. On the use of random walk schemes in oil spill modelling. Mar Pollut Bull 2019; 146:631-638. [PMID: 31426202 DOI: 10.1016/j.marpolbul.2019.07.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/25/2019] [Revised: 06/25/2019] [Accepted: 07/01/2019] [Indexed: 06/10/2023]
Abstract
In oil spill models, vertical mixing due to turbulence is commonly modelled by random walk. If the eddy diffusivity varies with depth, failing to take the derivative of the diffusivity into account in the random walk scheme will lead to incorrect results. Depending on the diffusivity profile, the result may be either over- or underprediction of the amount of surfaced oil. The importance of using consistent random walk schemes has been known for decades in, e.g., the plankton modelling community. However, it appears not to be common knowledge in the oil spill community, with inconsistent random walk schemes appearing even in recent publications. We demonstrate and quantify the error due to inconsistent random walk, using a simplified oil spill model and two different diffusivity profiles. In the two cases considered, a commonly used inconsistent scheme predicts respectively 54% and 202% of the amount of surface oil predicted by a consistent scheme.
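The difference between an inconsistent and a consistent scheme can be made concrete with a short sketch. The naive step below ignores the diffusivity gradient, while the corrected step adds the drift term K'(z) Δt and evaluates K at the offset position z + K'(z) Δt / 2, which is the standard correction from the plankton modelling literature. The linear diffusivity profile, the reflecting boundaries, and all numbers are hypothetical choices for illustration, not the setup used in the paper.

```python
import math
import random


def naive_step(z, dt, K, rng):
    """Inconsistent scheme: ignores the gradient of the diffusivity."""
    return z + math.sqrt(2.0 * K(z) * dt) * rng.gauss(0.0, 1.0)


def consistent_step(z, dt, K, dKdz, rng):
    """Scheme with the drift term K'(z) dt; the diffusivity is evaluated
    at the offset position z + K'(z) dt / 2."""
    drift = dKdz(z) * dt
    return z + drift + math.sqrt(2.0 * K(z + 0.5 * drift) * dt) * rng.gauss(0.0, 1.0)


def reflect(z, depth):
    """Reflect a particle back into the water column [0, depth]."""
    if z < 0.0:
        z = -z
    if z > depth:
        z = 2.0 * depth - z
    return z


def fraction_in_upper_half(step, n_particles, n_steps, dt, depth, rng):
    """Start particles uniformly, advance them, and report the share with z < depth / 2."""
    zs = [rng.uniform(0.0, depth) for _ in range(n_particles)]
    for _ in range(n_steps):
        zs = [reflect(step(z, dt, rng), depth) for z in zs]
    return sum(z < depth / 2.0 for z in zs) / n_particles


# Hypothetical diffusivity (m^2/s) increasing linearly with depth z (m).
K = lambda z: 1e-3 + 1e-3 * z
dKdz = lambda z: 1e-3

rng = random.Random(0)
naive = fraction_in_upper_half(
    lambda z, dt, r: naive_step(z, dt, K, r), 500, 200, 10.0, 10.0, rng)
consistent = fraction_in_upper_half(
    lambda z, dt, r: consistent_step(z, dt, K, dKdz, r), 500, 200, 10.0, 10.0, rng)
print("naive:", naive, "consistent:", consistent)
```

An initially uniform distribution should stay close to uniform under the consistent scheme, whereas the naive scheme is expected to accumulate particles where the diffusivity is low (small z in this profile). That kind of artificial accumulation is the mechanism behind the over- or underprediction of surfaced oil described in the abstract.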
Affiliation(s)
- Tor Nordam
  - SINTEF Ocean, Trondheim, Norway
  - Norwegian University of Science and Technology, Trondheim, Norway
|
35
|
Kim TR, Jeong HH, Sohn KA. Topological integration of RPPA proteomic data with multi-omics data for survival prediction in breast cancer via pathway activity inference. BMC Med Genomics 2019; 12:94. [PMID: 31296204 PMCID: PMC6624183 DOI: 10.1186/s12920-019-0511-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The analysis of integrated multi-omics data enables the identification of disease-related biomarkers that cannot be identified from a single omics profile. Although protein-level data reflects the cellular status of cancer tissue more directly than gene-level data, past studies have mainly focused on multi-omics integration using gene-level data as opposed to protein-level data. However, the use of protein-level data (such as mass spectrometry) in multi-omics integration has some limitations. For example, the correlation between the characteristics of gene-level data (such as mRNA) and protein-level data is weak, and it is difficult to detect low-abundance signaling proteins that are used to target cancer. The reverse phase protein array (RPPA) is a highly sensitive antibody-based quantification method for signaling proteins. However, the number of protein features in RPPA data is extremely low compared to the number of gene features in gene-level data. In this study, we present a new method for integrating RPPA profiles with RNA-Seq and DNA methylation profiles for survival prediction based on the integrative directed random walk (iDRW) framework proposed in our previous study. In the iDRW framework, each omics profile is merged into a single pathway profile that reflects the topological information of the pathway. In order to address the sparsity of RPPA profiles, we employ the random walk with restart (RWR) approach on the pathway network. RESULTS Our model was validated using survival prediction analysis for a breast cancer dataset from The Cancer Genome Atlas. Our proposed model exhibited improved performance compared with other methods that utilize pathway information and also out-performed models that did not include the RPPA data utilized in our study. The risk pathways identified for breast cancer in this study were closely related to well-known breast cancer risk pathways. 
CONCLUSIONS Our results indicated that RPPA data is useful for survival prediction for breast cancer patients under our framework. We also observed that iDRW effectively integrates RNA-Seq, DNA methylation, and RPPA profiles, while variation in the composition of the omics data can affect both prediction performance and risk pathway identification. These results suggest that omics data composition is a critical parameter for iDRW.
Affiliation(s)
- Tae Rim Kim
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
- Hyun-Hwan Jeong
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
  - Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, 77030, USA
- Kyung-Ah Sohn
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
|
36
|
Song J, Peng W, Wang F. A random walk-based method to identify driver genes by integrating the subcellular localization and variation frequency into bipartite graph. BMC Bioinformatics 2019; 20:238. [PMID: 31088372 PMCID: PMC6518800 DOI: 10.1186/s12859-019-2847-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 04/14/2019] [Accepted: 04/24/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Cancer, a worldwide problem, is driven by genomic alterations. With the advent of high-throughput sequencing technology, a huge amount of genomic data is generated every second, offering much valuable cancer information while posing a big challenge to investigators. A major characteristic of cancer is heterogeneity, and most alterations are thought to be passenger mutations that make no contribution to cancer progression. Hence, digging out the driver genes that confer a selective growth advantage on tumor cells from such vast and noisy data is still an urgent task. RESULTS Because previous network-based methods ignore some important biological properties of driver genes and gene interaction networks have low reliability, we proposed a random walk method named Subdyquency that integrates the information of subcellular localization, variation frequency and interaction with other dysregulated genes to improve the prediction accuracy of driver genes. We applied our model to three different cancers: lung, prostate and breast cancer. The results show that our model can not only identify well-known important driver genes but also prioritize rare unknown driver genes. Besides, compared with other existing methods, our method improves precision, recall and F-score for most cancer types. CONCLUSIONS The final results imply that driver genes tend to have higher variation frequency and to affect more dysregulated genes in the common significant compartment. AVAILABILITY The source code can be obtained at https://github.com/weiba/Subdyquency.
Affiliation(s)
- Junrong Song
  - Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China
- Wei Peng
  - Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China
- Feng Wang
  - Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China
|
37
|
Liang L, Chen V, Zhu K, Fan X, Lu X, Lu S. Integrating data and knowledge to identify functional modules of genes: a multilayer approach. BMC Bioinformatics 2019; 20:225. [PMID: 31046665 PMCID: PMC6498600 DOI: 10.1186/s12859-019-2800-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/11/2018] [Accepted: 04/09/2019] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND Characterizing the modular structure of cellular networks is an important way to identify novel genes for targeted therapeutics. This is made possible by the rise of high-throughput technology. Unfortunately, computational methods to identify functional modules have been limited by the data quality issues of high-throughput techniques. This study aims to integrate knowledge extracted from the literature to further improve the accuracy of functional module identification. RESULTS Our new model and algorithm were applied to both yeast and human interactomes. Predicted functional modules covered over 90% of the proteins in both organisms, while maintaining a comparable overall accuracy. We found that combining mRNA expression information with biomedical knowledge greatly improved the performance of functional module identification, compared with using a protein interaction network weighted with only transcriptomic data, only literature knowledge, or no weights at all. Our new algorithm also achieved better performance when compared with some other well-known methods, especially in terms of the positive predictive value (PPV), which indicates the confidence of novel discovery. CONCLUSION The higher PPV of the multiplex approach suggests that information from both sources has been effectively integrated to reduce false positives. With protein coverage higher than 90%, our algorithm is able to generate more novel biological hypotheses with higher confidence.
Affiliation(s)
- Lifan Liang
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Vicky Chen
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
  - Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc, Frederick, USA
- Kunju Zhu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
  - Clinical Medicine Research Institute, Jinan University, Guangzhou, 51063, Guangdong, China
- Xiaonan Fan
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, Shanxi, China
- Xinghua Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Songjian Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
|
38
|
Kim SY, Jeong HH, Kim J, Moon JH, Sohn KA. Robust pathway-based multi-omics data integration using directed random walks for survival prediction in multiple cancer studies. Biol Direct 2019; 14:8. [PMID: 31036036 PMCID: PMC6489180 DOI: 10.1186/s13062-019-0239-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/16/2018] [Accepted: 04/10/2019] [Indexed: 01/15/2023] Open
Abstract
Background Integrating the rich information from multi-omics data has been a popular approach to survival prediction and bio-marker identification for several cancer studies. To facilitate the integrative analysis of multiple genomic profiles, several studies have suggested utilizing pathway information rather than using individual genomic profiles. Methods We have recently proposed an integrative directed random walk-based method utilizing pathway information (iDRW) for more robust and effective genomic feature extraction. In this study, we applied iDRW to multiple genomic profiles for two different cancers, and designed a directed gene-gene graph which reflects the interaction between gene expression and copy number data. In the experiments, the performances of the iDRW method and four state-of-the-art pathway-based methods were compared using a survival prediction model which classifies samples into two survival groups. Results The results show that the integrative analysis guided by pathway information not only improves prediction performance, but also provides better biological insights into the top pathways and genes prioritized by the model in both the neuroblastoma and the breast cancer datasets. The pathways and genes selected by the iDRW method were shown to be related to the corresponding cancers. Conclusions In this study, we demonstrated the effectiveness of a directed random walk-based multi-omics data integration method applied to gene expression and copy number data for both breast cancer and neuroblastoma datasets. We revamped a directed gene-gene graph considering the impact of copy number variation on gene expression and redefined the weight initialization and gene-scoring method. The benchmark result for iDRW with four pathway-based methods demonstrated that the iDRW method improved survival prediction performance and jointly identified cancer-related pathways and genes for two different cancer datasets. 
Reviewers This article was reviewed by Helena Molina-Abril and Marta Hidalgo.
Affiliation(s)
- So Yeon Kim
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
- Hyun-Hwan Jeong
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
  - Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, 77030, USA
- Jaesik Kim
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
- Jeong-Hyeon Moon
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
- Kyung-Ah Sohn
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
|
39
|
Niu YW, Wang GH, Yan GY, Chen X. Integrating random walk and binary regression to identify novel miRNA-disease association. BMC Bioinformatics 2019; 20:59. [PMID: 30691413 DOI: 10.1186/s12859-019-2640-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/26/2018] [Accepted: 01/18/2019] [Indexed: 02/07/2023] Open
Abstract
Background In the last few decades, accumulating experimental research has verified the important roles of microRNAs (miRNAs) in the development of complex human diseases. Benefitting from the rapid growth in both the availability of miRNA-related data and the development of various analysis methodologies, several computational models have recently been developed to predict human disease-related miRNAs quickly and efficiently. Results In this work, we proposed a computational model of Random Walk and Binary Regression-based MiRNA-Disease Association prediction (RWBRMDA). RWBRMDA extracted features for each miRNA from random walk with restart on the integrated miRNA similarity network and used them in binary logistic regression to predict potential miRNA-disease associations. RWBRMDA obtained an AUC of 0.8076 in leave-one-out cross validation. Additionally, we carried out three different types of case studies on four complex human diseases. Specifically, case studies of esophageal cancer and prostate cancer were conducted based on known miRNA-disease associations in the HMDD v2.0 database; out of the top 50 predicted miRNAs, 94% and 90%, respectively, were confirmed by recent experimental reports. To simulate a new disease without known related miRNAs, the information on known breast cancer-related miRNAs was removed; as a result, 98% of the top 50 predicted miRNAs for breast cancer were confirmed. Lymphoma, for which the verified ratio was 88%, was used to assess the prediction robustness of RWBRMDA based on the association records in the HMDD v1.0 database. Conclusions We anticipate that RWBRMDA could benefit future experimental investigations of the relation between human diseases and miRNAs by generating promising and testable top-ranked miRNAs and significantly reducing the effort and cost of identification work.
Electronic supplementary material The online version of this article (10.1186/s12859-019-2640-9) contains supplementary material, which is available to authorized users.
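The feature-extraction step described in the abstract above (random walk with restart on a miRNA similarity network) can be illustrated with a minimal sketch. This is not the RWBRMDA code: the function name, the restart probability of 0.5, and the toy 4-node similarity matrix are all assumptions chosen for illustration.

```python
import numpy as np

def random_walk_with_restart(similarity, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `seed`.

    `restart` is an assumed value, not one taken from the paper.
    """
    W = np.asarray(similarity, dtype=float)
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # guard against isolated nodes
    W = W / col_sums                   # column-normalise the transition matrix
    p0 = np.zeros(W.shape[0])
    p0[seed] = 1.0                     # all probability mass starts on the seed
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical similarity scores between four miRNAs (symmetric, zero diagonal).
S = np.array([[0.0, 0.8, 0.1, 0.0],
              [0.8, 0.0, 0.3, 0.0],
              [0.1, 0.3, 0.0, 0.5],
              [0.0, 0.0, 0.5, 0.0]])

# Column i holds the walk profile obtained when the walker restarts at miRNA i.
features = np.column_stack([random_walk_with_restart(S, i) for i in range(4)])
```

Each column of `features` could then serve as the feature vector of one miRNA for a binary classifier such as logistic regression, which is the second stage the abstract describes.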
|
40
|
Alsmeyer G, Raschel K. The extinction problem for a distylous plant population with sporophytic self-incompatibility. J Math Biol 2019; 78:1841-1874. [PMID: 30683998 DOI: 10.1007/s00285-019-01328-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2018] [Revised: 01/11/2019] [Indexed: 11/28/2022]
Abstract
In this paper, the extinction problem for a class of distylous plant populations is considered within the framework of certain nonhomogeneous nearest-neighbor random walks in the positive quadrant. For the latter, extinction means absorption at one of the axes. Despite connections with some classical probabilistic models (standard two-type Galton-Watson process, two-urn model), exact formulae for the probabilities of absorption seem to be difficult to come by and one must therefore resort to good approximations. In order to meet this task, we develop potential-theoretic tools and provide various sub- and super-harmonic functions which, for large initial populations, provide bounds which in particular improve those that have appeared earlier in the literature.
Affiliation(s)
- Gerold Alsmeyer
  - Institut für Mathematische Stochastik, Fachbereich Mathematik und Informatik, Universität Münster, Orléans-Ring 10, 48149, Münster, Germany
- Kilian Raschel
  - CNRS, Institut Denis Poisson, Université de Tours, Parc de Grandmont, 37200, Tours, France
|
41
|
Abstract
Computational prediction of the clinical success or failure of a potential drug target for therapeutic use is a challenging problem. Novel network propagation algorithms that integrate heterogeneous biological networks are proving useful for drug target identification and prioritization. These approaches typically utilize a network describing relationships between targets, a method to disseminate the relevant information through the network, and a method to elucidate new associations between targets and diseases. Here, we utilize one such network propagation-based approach, DTINet, which starts with diffusion component analysis of networks of both potential drug targets and diseases. Then an inductive matrix completion algorithm is applied to identify novel disease targets based on their network topological similarities with known disease targets with successfully launched drugs. DTINet performed well as assessed with area under the precision-recall curve (AUPR = 0.88 ± 0.007) and area under the receiver operating characteristic curve (AUROC = 0.86 ± 0.008). These metrics improved when we combined data from multiple networks in the target space but reduced significantly when we used a more conservative method to define negative controls (AUPR = 0.56 ± 0.007, AUROC = 0.57 ± 0.007). We are optimistic that integration of more relevant and cleaner datasets and networks, careful calibration of model parameters, as well as algorithmic improvements will improve prediction accuracy. However, we also recognize that predicting drug targets that are likely to be successful is an extremely challenging problem due to its complex nature and sparsity of known disease targets.
|
42
|
Abstract
BACKGROUND Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes. Machine learning-based approaches have been developed to predict PPIs, but the effectiveness of these approaches is unsatisfactory. One major reason is that they choose non-interacting protein pairs (negative samples) at random or heuristically select low-quality non-interacting pairs. RESULTS To boost the effectiveness of predicting PPIs, we propose two novel approaches (NIP-SS and NIP-RW) to generate high-quality non-interacting pairs based on sequence similarity and random walk, respectively. Specifically, the known PPIs collected from public databases are used to generate the positive samples. NIP-SS then selects the top-m dissimilar protein pairs as negative examples and controls the degree distribution of selected proteins to construct the negative dataset. NIP-RW performs random walk on the PPI network to update the adjacency matrix of the network, and then selects protein pairs not connected in the updated network as negative samples. Next, we use the auto covariance (AC) descriptor to encode the feature information of amino acid sequences. After that, we employ deep neural networks (DNNs) to predict PPIs based on the extracted features and the positive and negative examples. Extensive experiments show that NIP-SS and NIP-RW can generate negative samples of higher quality than existing strategies and thus enable more accurate prediction. CONCLUSIONS The experimental results show that negative datasets constructed by NIP-SS and NIP-RW can reduce bias and have good generalization ability. NIP-SS and NIP-RW can be used as plugins to boost the effectiveness of PPI prediction. Code and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NIP.
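One way to read the NIP-RW idea above is sketched below: propagate a k-step random walk over the PPI network and treat pairs that the walk never connects as candidate negatives. This is an interpretation under stated assumptions, not the released NIP code; the function name, the number of steps, and the toy path graph are hypothetical.

```python
import numpy as np

def negative_pairs_by_random_walk(adj, steps=2):
    """Return node pairs (i, j), i < j, not reachable from each other within `steps` hops."""
    A = np.asarray(adj, dtype=float)
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                # guard against isolated proteins
    P = A / deg                        # row-stochastic transition matrix
    reach = np.zeros_like(P)           # accumulated walk probabilities ("updated" adjacency)
    Pt = np.eye(len(A))
    for _ in range(steps):
        Pt = Pt @ P                    # t-step transition probabilities
        reach += Pt
    n = len(A)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if reach[i, j] == 0 and reach[j, i] == 0]

# Hypothetical PPI network: a path of five proteins 0-1-2-3-4.
adj = np.zeros((5, 5))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[u, v] = adj[v, u] = 1.0

negatives = negative_pairs_by_random_walk(adj, steps=2)
```

With two steps on this path, only pairs at least three hops apart remain unconnected, so the walk length acts as the knob that controls how distant a pair must be before it is accepted as a negative sample.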
Affiliation(s)
- Long Zhang
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
- Guoxian Yu
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
  - Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing, China
- Jun Wang
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
|
43
|
Nagatani T, Tainaka KI, Ichinose G. Metapopulation model of rock-scissors-paper game with subpopulation-specific victory rates stabilized by heterogeneity. J Theor Biol 2018; 458:103-10. [PMID: 30213665 DOI: 10.1016/j.jtbi.2018.09.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/10/2018] [Revised: 09/07/2018] [Accepted: 09/10/2018] [Indexed: 11/20/2022]
Abstract
Recently, metapopulation models for rock-paper-scissors games have been presented. Each subpopulation is represented by a node on a graph. An individual is either rock (R), scissors (S) or paper (P); it randomly migrates among subpopulations. In the present paper, we assume victory rates differ in different subpopulations. To investigate the dynamic state of each subpopulation (node), we numerically obtain the solutions of reaction-diffusion equations on the graphs with two and three nodes. In the case of homogeneous victory rates, we find each subpopulation has a periodic solution with neutral stability. However, when victory rates between subpopulations are heterogeneous, the solution approaches stable focuses. The heterogeneity of victory rates promotes the coexistence of species.
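The kind of reaction-diffusion system on a small graph described above can be sketched numerically. The equations below are a standard zero-sum cyclic Lotka-Volterra form with linear migration between fully connected nodes; they are an assumption for illustration and may differ from the authors' exact model. The rate values, migration rate `m`, and initial densities are hypothetical.

```python
import numpy as np

def rps_rhs(x, rates, m):
    """Time derivative of densities x[node] = (R, S, P) with per-node victory rates (a, b, c).

    a: rate at which R beats S, b: S beats P, c: P beats R (assumed parametrisation).
    """
    R, S, P = x[:, 0], x[:, 1], x[:, 2]
    a, b, c = rates[:, 0], rates[:, 1], rates[:, 2]
    reaction = np.column_stack([R * (a * S - c * P),
                                S * (b * P - a * R),
                                P * (c * R - b * S)])
    # Random-walk migration between all pairs of nodes: m * sum_j (x_j - x_i).
    migration = m * (x.sum(axis=0) - len(x) * x)
    return reaction + migration

def integrate(x0, rates, m, dt=0.01, steps=5000):
    """Classical fourth-order Runge-Kutta integration."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        k1 = rps_rhs(x, rates, m)
        k2 = rps_rhs(x + 0.5 * dt * k1, rates, m)
        k3 = rps_rhs(x + 0.5 * dt * k2, rates, m)
        k4 = rps_rhs(x + dt * k3, rates, m)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

x0 = np.array([[0.5, 0.3, 0.2],        # node 1: initial (R, S, P)
               [0.2, 0.3, 0.5]])       # node 2
heterogeneous = np.array([[1.0, 1.0, 1.0],
                          [2.0, 1.0, 1.0]])  # victory rates differ between nodes
x_final = integrate(x0, heterogeneous, m=0.1)
```

Both the reaction and the migration terms sum to zero, so the total population is conserved, which gives a quick sanity check on any implementation. Comparing trajectories for equal and unequal rate rows is one way to explore the stabilising effect the abstract reports.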
|
44
|
Yuan Y, Chen YW, Dong C, Yu H, Zhu Z. Hybrid method combining superpixel, random walk and active contour model for fast and accurate liver segmentation. Comput Med Imaging Graph 2018; 70:119-134. [PMID: 30359946 DOI: 10.1016/j.compmedimag.2018.08.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/30/2017] [Revised: 04/27/2018] [Accepted: 08/27/2018] [Indexed: 10/28/2022]
Abstract
Organ segmentation is an important pre-processing step in surgery planning and computer-aided diagnosis. In this paper, we propose a fast and accurate liver segmentation framework. Our proposed method combines a knowledge-based slice-by-slice Random Walk (RW) segmentation algorithm (proposed in our previous work) with a superpixel algorithm called the Contrast-enhanced Compact Watershed (CCWS) method to reduce computing time and memory costs. We demonstrate that our CCWS is more appropriate for liver segmentation than the commonly used Simple Linear Iterative Clustering (SLIC). To improve the method's accuracy, we use a modified narrow band active contour model as a refinement after the initial segmentation. The experiments showed that the superpixel-based slice-by-slice RW could segment the entire liver with improved speed, and that the modified active contour model is more precise than the original Chan-Vese model. As a result, the proposed framework is able to quickly and accurately segment the entire liver.
Affiliation(s)
- Ye Yuan
  - Software College of Northeastern University, No. 195 Chuangxin Road, Shenyang, China
- Yen-Wei Chen
  - Graduate School of Information Science and Engineering, Ritsumeikan University, Noji-higashi 1-1-1, Kusatsu, Japan
- Chunhua Dong
  - Department of Mathematics and Computer Science, Fort Valley State University, 1005 State University Drive, Fort Valley, United States
- Hai Yu
  - Software College of Northeastern University, No. 195 Chuangxin Road, Shenyang, China
- Zhiliang Zhu
  - Software College of Northeastern University, No. 195 Chuangxin Road, Shenyang, China
|
45
|
Lorenz-Spreen P, Wolf F, Braun J, Ghoshal G, Djurdjevac Conrad N, Hövel P. Tracking online topics over time: understanding dynamic hashtag communities. Comput Soc Netw 2018; 5:9. [PMID: 30416936 PMCID: PMC6208799 DOI: 10.1186/s40649-018-0058-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/16/2018] [Accepted: 09/28/2018] [Indexed: 11/10/2022]
Abstract
Background Hashtags are widely used for communication in online media. As a condensed version of information, they characterize topics and discussions. For their analysis, we apply methods from network science and propose novel tools for tracing their dynamics in time-dependent data. The observations are characterized by bursty behaviors in the increases and decreases of hashtag usage. These features can be reproduced with a novel model of dynamic rankings. Hashtag communities in time We build temporal and weighted co-occurrence networks from hashtags. On static snapshots, we infer the community structure using customized methods. On temporal networks, we solve the bipartite matching problem of detected communities at subsequent timesteps by taking into account higher-order memory. This results in a matching protocol that is robust toward temporal fluctuations and instabilities of the static community detection. The proposed methodology is broadly applicable and its outcomes reveal the temporal behavior of online topics. Modeling topic-dynamics We consider the size of the communities in time as a proxy for online popularity dynamics. We find that the distributions of gains and losses, as well as the interevent times are fat-tailed indicating occasional, but large and sudden changes in the usage of hashtags. Inspired by typical website designs, we propose a stochastic model that incorporates a ranking with respect to a time-dependent prestige score. This causes occasional cascades of rank shift events and reproduces the observations with good agreement. This offers an explanation for the observed dynamics, based on characteristic elements of online media.
Affiliation(s)
- Philipp Lorenz-Spreen
  - Institute of Theoretical Physics, Technische Universität Berlin, Hardenbergstraße 36, 10623 Berlin, Germany
- Frederik Wolf
  - Potsdam Institute for Climate Impact Research (PIK), Telegraphenberg A 31, 14473 Potsdam, Germany
- Jonas Braun
  - Department of Physics, Humboldt-Universität zu Berlin, Newtonstraße 15, 12489 Berlin, Germany
- Gourab Ghoshal
  - Department of Physics and Astronomy, University of Rochester, Rochester, NY 14627 USA
- Philipp Hövel
  - Institute of Theoretical Physics, Technische Universität Berlin, Hardenbergstraße 36, 10623 Berlin, Germany
  - School of Mathematical Sciences, University College Cork, Western Road, Cork, T12 XF62 Ireland
|
46
|
Boufadel MC, Cui F, Katz J, Nedwed T, Lee K. On the transport and modeling of dispersed oil under ice. Mar Pollut Bull 2018; 135:569-580. [PMID: 30301075 DOI: 10.1016/j.marpolbul.2018.07.046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/07/2018] [Revised: 07/09/2018] [Accepted: 07/17/2018] [Indexed: 06/08/2023]
Abstract
Theoretical arguments and numerical investigations were conducted to understand the transport of oil droplets under ice. It was found that the boundary layer (BL) in the water under ice produces a downward velocity that reaches up to 0.2% of horizontal current speed, and is, in general, larger than the rise velocity of 70 μm oil droplets. The eddy diffusivity was found to increase with depth and to decrease gradually afterward. Neglecting the gradient of eddy diffusivity when conducting Lagrangian transport of oil droplets would result in an unphysical spatial distribution. When the downward velocity of water was neglected, oil accumulated at the water-ice interface regardless of the attachment efficiency. The lift force was found to scrape droplets off the ice, especially droplets ≤ 70 μm. These findings suggest that previous oil spill simulations may have overestimated the number of small droplets (≤ 70 μm) at the water-ice interface.
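The role of the diffusivity gradient noted above can be made explicit with the random-walk update commonly used for Lagrangian particle tracking in a spatially varying diffusivity field. This is a generic sketch, not necessarily the authors' scheme; the symbols are notation assumed here: z is the droplet depth, w the net vertical velocity (droplet rise velocity plus vertical water velocity), K(z) the eddy diffusivity, Δt the time step, and ξ a standard normal random number.

```latex
z_{n+1} = z_n
  + \left[ w(z_n) + \left.\frac{\partial K}{\partial z}\right|_{z_n} \right] \Delta t
  + \xi_n \sqrt{2\, K(z_n)\, \Delta t},
\qquad \xi_n \sim \mathcal{N}(0, 1)
```

If the term with the derivative of K is dropped, particles drift artificially toward regions of low diffusivity, which is consistent with the unphysical distribution the abstract describes.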
Affiliation(s)
- Michel C Boufadel
  - Center for Natural Resources, New Jersey Institute of Technology, Newark, NJ, USA
- Fangda Cui
  - Center for Natural Resources, New Jersey Institute of Technology, Newark, NJ, USA
- Joseph Katz
  - Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Tim Nedwed
  - Upstream Research Company, ExxonMobil, Spring, TX, USA
|
47
|
Kim SY, Kim TR, Jeong HH, Sohn KA. Integrative pathway-based survival prediction utilizing the interaction between gene expression and DNA methylation in breast cancer. BMC Med Genomics 2018; 11:68. [PMID: 30255812 PMCID: PMC6157196 DOI: 10.1186/s12920-018-0389-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/18/2022] Open
Abstract
Background Integrative analysis of multi-omics data has gained much attention recently. To investigate the interactive effect of gene expression and DNA methylation on cancer, we propose a directed random walk (DRW)-based approach on an integrated gene-gene graph that is guided by pathway information. Methods Our approach first extracts a single pathway profile matrix from the gene expression and DNA methylation data by performing the random walk over the integrated graph. We then apply a denoising autoencoder (DA) to the pathway profile to further identify important pathway features and genes. The extracted features are validated in a survival prediction task for breast cancer patients. Results The results show that the proposed method substantially improves survival prediction performance compared to other pathway-based prediction methods, revealing that the combined effect of gene expression and methylation data is well reflected in the integrated gene-gene graph combined with pathway information. Furthermore, we show that our joint analysis of the methylation features and gene expression profile identifies cancer-specific pathways with genes related to breast cancer. Conclusions In this study, we proposed a DRW-based method on an integrated gene-gene graph with expression and methylation profiles in order to utilize the interactions between them. The results showed that the constructed integrated gene-gene graph can successfully reflect the combined effect of methylation features on gene expression profiles. We also found that the features selected by the DA can effectively capture topologically important pathways and genes specifically related to breast cancer.
Affiliation(s)
- So Yeon Kim
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
- Tae Rim Kim
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
- Hyun-Hwan Jeong
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
  - Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, 77030, USA
- Kyung-Ah Sohn
  - Department of Computer Engineering, Ajou University, Suwon, 16499, South Korea
|
48
|
Nagatani T, Ichinose G, Tainaka KI. Metapopulation model for rock-paper-scissors game: Mutation affects paradoxical impacts. J Theor Biol 2018; 450:22-9. [PMID: 29627264 DOI: 10.1016/j.jtbi.2018.04.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/16/2018] [Revised: 04/01/2018] [Accepted: 04/03/2018] [Indexed: 11/20/2022]
Abstract
The rock-paper-scissors (RPS) game is known as one of the simplest cyclic dominance models. This game is key to understanding biodiversity. Three species, rock (R), paper (P) and scissors (S), can coexist in nature. In the present paper, we first present a metapopulation model for the RPS game with mutation. Only mutation from R to S is allowed. The total population consists of spatially separated patches, and the mutation occurs in particular patches. We present reaction-diffusion equations that have two terms: a reaction term and a migration term. The former represents the RPS game with mutation, while the latter corresponds to a random walk. The basic equations are solved analytically and numerically. It is found that the mutation induces one of three phases: the stable coexistence of three species, the stable phase of two species, and a single-species phase. Phase transitions among the three phases occur as the mutation rate is varied. We find that the conditions for coexistence change considerably depending on the metapopulation model. We also find that the mutation induces different paradoxes in different patches.
|
49
|
Mathijsen BWJ, Janssen AJEM, van Leeuwaarden JSH, Zwart B. Robust heavy-traffic approximations for service systems facing overdispersed demand. Queueing Syst 2018; 90:257-289. [PMID: 30956380 PMCID: PMC6413888 DOI: 10.1007/s11134-018-9584-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/16/2017] [Revised: 04/23/2018] [Indexed: 06/09/2023]
Abstract
Arrival processes to service systems often display fluctuations that are larger than anticipated under the Poisson assumption, a phenomenon that is referred to as overdispersion. Motivated by this, we analyze a class of discrete-time stochastic models for which we derive heavy-traffic approximations that are scalable in the system size. Subsequently, we show how this leads to novel capacity sizing rules that acknowledge the presence of overdispersion. This, in turn, leads to robust approximations for performance characteristics of systems that are of moderate size and/or may not operate in heavy traffic.
Affiliation(s)
- Britt W. J. Mathijsen
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
- A. J. E. M. Janssen
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
- Johan S. H. van Leeuwaarden
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
- Bert Zwart
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
  - Centrum Wiskunde and Informatica, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
|
50
|
Monk T. Martingales and the fixation probability of high-dimensional evolutionary graphs. J Theor Biol 2018; 451:10-18. [PMID: 29727631 DOI: 10.1016/j.jtbi.2018.04.039] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 03/09/2018] [Revised: 04/27/2018] [Accepted: 04/30/2018] [Indexed: 11/26/2022]
Abstract
A principal problem of evolutionary graph theory is to find the probability that an initial mutant population will fix on a graph, i.e. that the mutants will eventually replace the indigenous population. This problem is particularly difficult when the dimensionality of a graph is high. Martingales can yield compact and exact expressions for the fixation probability of an evolutionary graph. Crucially, the tractability of martingales does not necessarily depend on the dimensionality of a graph. We will use martingales to obtain the exact fixation probability of graphs with high dimensionality, specifically k-partite graphs (or 'circular flows') and megastars (or 'superstars'). To do so, we require that the edges of the graph permit mutants to reproduce in one direction and indigenous in the other. The resultant expressions for fixation probabilities explicitly show their dependence on the parameters that describe the graph structure, and on the starting position(s) of the initial mutant population. In particular, we will investigate the effect of funneling on the fixation probability of k-partite graphs, as well as the effect of placing an initial mutant in different partitions. These are the first exact and explicit results reported for the fixation probability of evolutionary graphs with dimensionality greater than 2, that are valid over all parameter space. It might be possible to extend these results to obtain fixation probabilities of high-dimensional evolutionary graphs with undirected or directed connections. Martingales are a formidable theoretical tool that can solve fundamental problems in evolutionary graph theory, often within a few lines of straightforward mathematics.
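As a reminder of how a martingale yields a fixation probability in a few lines, consider the textbook one-dimensional case: a Moran process in a well-mixed population of size N in which mutants have relative fitness r ≠ 1. This is not the k-partite or megastar result of the paper; the notation (X_t for the number of mutants, i for the initial number, ρ_i for the fixation probability) is assumed here. Conditional on the mutant count changing, it increases with probability r/(1+r) and decreases with probability 1/(1+r), so r^{-X_t} is a martingale, and the optional stopping theorem applied at absorption (X = 0 or X = N) gives the classical formula.

```latex
\begin{align}
\mathbb{E}\!\left[ r^{-X_{t+1}} \mid X_t = x \right]
  &= \frac{r}{1+r}\, r^{-(x+1)} + \frac{1}{1+r}\, r^{-(x-1)} = r^{-x}, \\
r^{-i} &= \rho_i\, r^{-N} + (1 - \rho_i)\, r^{0}
  \;\Longrightarrow\;
  \rho_i = \frac{1 - r^{-i}}{1 - r^{-N}} .
\end{align}
```

For a single initial mutant (i = 1) this reduces to ρ = (1 − 1/r)/(1 − 1/r^N), the baseline against which fixation probabilities on structured graphs are usually compared.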
Affiliation(s)
- Travis Monk
  - Biomedical Engineering and Neuroscience, The MARCS Institute, Western Sydney University, Locked Bag 1797, Penrith, NSW 2751, Australia
|