Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar


Searched Name: Alexander Schliep

Total Articles: 46 (from Reference Citation Analysis)
Article PDFs: 18
Cited by > 0: 42


Bello L, Wiedenhöft J, Schliep A. Compressed computations using wavelets for hidden Markov models with continuous observations. PLoS One 2023;18:e0286074. [PMID: 37279196 DOI: 10.1371/journal.pone.0286074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 05/09/2023] [Indexed: 06/08/2023] Open

Viet Johansson S, Gummesson Svensson H, Bjerrum E, Schliep A, Haghir Chehreghani M, Tyrchan C, Engkvist O. Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction. Mol Inform 2022;41:e2200043. [PMID: 35732584 DOI: 10.1002/minf.202200043] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/21/2022] [Accepted: 06/22/2022] [Indexed: 01/05/2023]

Gustafsson J, Norberg P, Qvick-Wester JR, Schliep A. Fast parallel construction of variable-length Markov chains. BMC Bioinformatics 2021;22:487. [PMID: 34627154 PMCID: PMC8501649 DOI: 10.1186/s12859-021-04387-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2021] [Accepted: 09/20/2021] [Indexed: 11/10/2022] Open

Dansson HV, Stempfle L, Egilsdóttir H, Schliep A, Portelius E, Blennow K, Zetterberg H, Johansson FD. Predicting progression and cognitive decline in amyloid-positive patients with Alzheimer's disease. Alzheimers Res Ther 2021;13:151. [PMID: 34488882 PMCID: PMC8422748 DOI: 10.1186/s13195-021-00886-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/01/2021] [Accepted: 08/08/2021] [Indexed: 11/10/2022]

Abstract

BACKGROUND

In Alzheimer's disease, amyloid-β (Aβ) peptides aggregate in the brain, lowering CSF amyloid levels, a key pathological hallmark of the disease. However, lowered CSF amyloid levels may also be present in cognitively unimpaired elderly individuals. Therefore, it is of great value to explain the variance in disease progression among patients with Aβ pathology.

METHODS

A cohort of n=2293 participants, of whom n=749 were Aβ-positive, was selected from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database to study heterogeneity in disease progression for individuals with Aβ pathology. The analysis used baseline clinical variables including demographics, genetic markers, and neuropsychological data to predict how the cognitive ability and AD diagnosis of subjects progressed using statistical models and machine learning. Due to the relatively low prevalence of Aβ pathology, models fit only to Aβ-positive subjects were compared to models fit to an extended cohort including subjects without established Aβ pathology, adjusting for covariate differences between the cohorts.

RESULTS

Aβ pathology status was determined based on the Aβ₄₂/Aβ₄₀ ratio. The best predictive model of change in cognitive test scores for Aβ-positive subjects at the 2-year follow-up achieved an R² score of 0.388, while the best model predicting adverse changes in diagnosis achieved a weighted F₁ score of 0.791. Aβ-positive subjects declined faster on average than those without Aβ pathology, but the specific level of CSF Aβ was not predictive of progression rate. When predicting cognitive score change 4 years after baseline, the best model achieved an R² score of 0.325, and it was found that fitting models to the extended cohort improved performance. Moreover, using all clinical variables outperformed the best model based only on a suite of cognitive test scores, which achieved an R² score of 0.228.

CONCLUSION

Our analysis shows that CSF levels of Aβ are not strong predictors of the rate of cognitive decline in Aβ-positive subjects when adjusting for other variables. Baseline assessments of cognitive function account for the majority of variance explained in the prediction of 2-year decline but are insufficient for achieving optimal results in longer-term predictions. Predicting changes both in cognitive test scores and in diagnosis provides multiple perspectives on the progression of potential AD subjects.
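The abstract above reports two evaluation metrics, R² for predicted change in cognitive scores and a weighted F₁ score for predicted change in diagnosis. As a minimal sketch of how such metrics are commonly computed (not the authors' code; the function names and the diagnosis labels in the test values are illustrative assumptions), in plain Python:

```python
from collections import Counter


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)
```

An R² of 0 corresponds to always predicting the mean of the observed values, so the reported scores of 0.388 and 0.325 are read relative to that baseline.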


Tavara S, Schliep A. Effects of network topology on the performance of consensus and distributed learning of SVMs using ADMM. PeerJ Comput Sci 2021;7:e397. [PMID: 33817043 PMCID: PMC7959654 DOI: 10.7717/peerj-cs.397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2020] [Accepted: 01/26/2021] [Indexed: 06/12/2023]

Johansson S, Thakkar A, Kogej T, Bjerrum E, Genheden S, Bastys T, Kannas C, Schliep A, Chen H, Engkvist O. AI-assisted synthesis prediction. Drug Discov Today Technol 2020;32-33:65-72. [PMID: 33386096 DOI: 10.1016/j.ddtec.2020.06.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/16/2020] [Revised: 06/01/2020] [Accepted: 06/10/2020] [Indexed: 11/25/2022]

Judd N, Sauce B, Wiedenhoeft J, Tromp J, Chaarani B, Schliep A, van Noort B, Penttilä J, Grimmer Y, Insensee C, Becker A, Banaschewski T, Bokde ALW, Quinlan EB, Desrivières S, Flor H, Grigis A, Gowland P, Heinz A, Ittermann B, Martinot JL, Paillère Martinot ML, Artiges E, Nees F, Papadopoulos Orfanos D, Paus T, Poustka L, Hohmann S, Millenet S, Fröhner JH, Smolka MN, Walter H, Whelan R, Schumann G, Garavan H, Klingberg T. Cognitive and brain development is independently influenced by socioeconomic status and polygenic scores for educational attainment. Proc Natl Acad Sci U S A 2020;117:12411-12418. [PMID: 32430323 PMCID: PMC7275733 DOI: 10.1073/pnas.2001228117] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 02/07/2023] Open

Affiliation(s)

Nicholas Judd Department of Neuroscience, Karolinska Institute, Stockholm, 17165, Sweden
Bruno Sauce Department of Neuroscience, Karolinska Institute, Stockholm, 17165, Sweden
John Wiedenhoeft Department of Medical Statistics, University of Göttingen, Göttingen, 37073, Germany
Jeshua Tromp Department of Cognitive Psychology, Leiden University, Leiden, 2311, The Netherlands
Bader Chaarani Department of Psychiatry, University of Vermont, Burlington, VT 05405; Department of Psychological Science, University of Vermont, Burlington, VT 05405
Alexander Schliep Department of Computer Science and Engineering, University of Gothenburg, Gothenburg, 41756, Sweden
Betteke van Noort Hochschule für Gesundheit und Medizin, Medical School Berlin, Berlin, 14197, Germany
Jani Penttilä Department of Social and Health Care, Psychosocial Services Adolescent Outpatient Clinic, University of Tampere, Lahti, 33100, Finland
Yvonne Grimmer Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, 69117, Germany
Corinna Insensee Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Center, Göttingen, 37075, Germany
Andreas Becker Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Center, Göttingen, 37075, Germany
Tobias Banaschewski Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, 69117, Germany
Arun L W Bokde Discipline of Psychiatry, School of Medicine, Trinity College Dublin, Dublin, D02 PN40, Ireland; Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, D02 PN40, Ireland
Erin Burke Quinlan Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, United Kingdom
Sylvane Desrivières Centre for Population Neuroscience and Precision Medicine (PONS), Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, United Kingdom
Herta Flor Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, 69117, Germany; Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, 68131, Germany
Antoine Grigis NeuroSpin, French Alternative Energies and Atomic Energy Commission (CEA), Université Paris-Saclay, F-91191 Gif-sur-Yvette, France
Penny Gowland Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
Andreas Heinz Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité, Universitätsmedizin Berlin, Berlin, 10117, Germany
Bernd Ittermann Physikalisch-Technische Bundesanstalt, Berlin, 38116, Germany
Jean-Luc Martinot INSERM Unit 1000 "Neuroimaging & Psychiatry," Institut National de la Santé et de la Recherche Médicale, University Paris Saclay, University Paris Descartes, Paris, 75006, France
Marie-Laure Paillère Martinot INSERM Unit 1000 "Neuroimaging & Psychiatry," Institut National de la Santé et de la Recherche Médicale, University Paris Saclay, University Paris Descartes, Paris, 75006, France; Department of Child and Adolescent Psychiatry, Pitié-Salpêtrière Hospital, Assistance Publique-Hôpitaux de Paris, Sorbonne Université, Paris, 75006, France
Eric Artiges INSERM Unit 1000 "Neuroimaging & Psychiatry," Institut National de la Santé et de la Recherche Médicale, University Paris Saclay, University Paris Descartes, Paris, 75006, France
Frauke Nees Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, 69117, Germany; Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, 68131, Germany
Dimitri Papadopoulos Orfanos NeuroSpin, French Alternative Energies and Atomic Energy Commission (CEA), Université Paris-Saclay, F-91191 Gif-sur-Yvette, France
Tomáš Paus Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, University of Toronto, Toronto, ON M6A 2E1, Canada; Department of Psychology, University of Toronto, Toronto, ON M6A 2E1, Canada; Department of Psychiatry, University of Toronto, Toronto, ON M6A 2E1, Canada
Luise Poustka Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Center, Göttingen, 37075, Germany
Sarah Hohmann Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, 69117, Germany
Sabina Millenet Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, 69117, Germany
Juliane H Fröhner Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, 01087, Germany
Michael N Smolka Department of Psychiatry, Technische Universität Dresden, Dresden, 01062, Germany; Neuroimaging Center, Technische Universität Dresden, Dresden, 01069, Germany
Henrik Walter Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité, Universitätsmedizin Berlin, Berlin, 10117, Germany
Robert Whelan School of Psychology, Trinity College Dublin, Dublin, D02 PN40, Ireland; Global Brain Health Institute, Trinity College Dublin, Dublin, D02 PN40, Ireland
Gunter Schumann Centre for Population Neuroscience and Precision Medicine (PONS), Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, United Kingdom
Hugh Garavan Department of Psychiatry, University of Vermont, Burlington, VT 05405; Department of Psychological Science, University of Vermont, Burlington, VT 05405
Torkel Klingberg Department of Neuroscience, Karolinska Institute, Stockholm, 17165, Sweden


Bakker FT, Antonelli A, Clarke JA, Cook JA, Edwards SV, Ericson PGP, Faurby S, Ferrand N, Gelang M, Gillespie RG, Irestedt M, Lundin K, Larsson E, Matos-Maraví P, Müller J, von Proschwitz T, Roderick GK, Schliep A, Wahlberg N, Wiedenhoeft J, Källersjö M. The Global Museum: natural history collections and the future of evolutionary science and public education. PeerJ 2020;8:e8225. [PMID: 32025365 PMCID: PMC6993751 DOI: 10.7717/peerj.8225] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 04/18/2019] [Accepted: 11/15/2019] [Indexed: 12/27/2022] Open

Abstract

Natural history museums are unique spaces for interdisciplinary research and educational innovation. Through extensive exhibits and public programming and by hosting rich communities of amateurs, students, and researchers at all stages of their careers, they can provide a place-based window to focus on integration of science and discovery, as well as a locus for community engagement. At the same time, like a synthesis radio telescope, when joined together through emerging digital resources, the global community of museums (the ‘Global Museum’) is more than the sum of its parts, allowing insights and answers to diverse biological, environmental, and societal questions at the global scale, across eons of time, and spanning vast diversity across the Tree of Life. We argue that, whereas natural history collections and museums began with a focus on describing the diversity and peculiarities of species on Earth, they are now increasingly leveraged in new ways that significantly expand their impact and relevance. These new directions include the possibility to ask new, often interdisciplinary questions in basic and applied science, such as in biomimetic design, and by contributing to solutions to climate change, global health and food security challenges. As institutions, they have long been incubators for cutting-edge research in biology while simultaneously providing core infrastructure for research on present and future societal needs. Here we explore how the intersection between pressing issues in environmental and human health and rapid technological innovation has reinforced the relevance of museum collections. We do this by providing examples as food for thought for both the broader academic community and museum scientists on the evolving role of museums. We also identify challenges to the realization of the full potential of natural history collections and the Global Museum to science and society and discuss the critical need to grow these collections.
We then focus on mapping and modelling of museum data (including place-based approaches and discovery), and explore the main projects, platforms and databases enabling this growth. Finally, we aim to improve relevant protocols for the long-term storage of specimens and tissues, ensuring proper connection with tomorrow’s technologies and hence further increasing the relevance of natural history museums.


Affiliation(s)

Freek T Bakker Biosystematics Group, Wageningen University & Research, Wageningen, The Netherlands
Alexandre Antonelli Department of Science, Royal Botanic Gardens, Kew, Richmond, United Kingdom
Julia A Clarke Jackson School of Geosciences, University of Texas at Austin, Austin, TX, United States of America
Joseph A Cook Museum of Southwestern Biology, Department of Biology, University of New Mexico, Albuquerque, NM, United States of America
Scott V Edwards Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States of America; Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
Per G P Ericson Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Søren Faurby Department of Biological and Environmental Sciences, Gothenburg Global Biodiversity Centre, University of Gothenburg, Göteborg, Sweden
Nuno Ferrand Museu de História Natural e da Ciência, Universidade do Porto, Porto, Portugal
Magnus Gelang Department of Zoology, Gothenburg Natural History Museum, Göteborg, Sweden; Gothenburg Global Biodiversity Centre, University of Gothenburg, Göteborg, Sweden
Rosemary G Gillespie Essig Museum of Entomology, Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, CA, United States of America
Martin Irestedt Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Kennet Lundin Department of Zoology, Gothenburg Natural History Museum, Göteborg, Sweden; Gothenburg Global Biodiversity Centre, University of Gothenburg, Göteborg, Sweden
Ellen Larsson Department of Biological and Environmental Sciences, Gothenburg Global Biodiversity Centre, University of Gothenburg, Göteborg, Sweden; Gothenburg Global Biodiversity Centre, University of Gothenburg, Göteborg, Sweden
Pável Matos-Maraví Biology Centre of the Czech Academy of Sciences, Institute of Entomology, České Budějovice, Czechia
Johannes Müller Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Museum für Naturkunde, Berlin, Germany
Ted von Proschwitz Department of Zoology, Gothenburg Natural History Museum, Göteborg, Sweden; Gothenburg Global Biodiversity Centre, University of Gothenburg, Göteborg, Sweden
George K Roderick Essig Museum of Entomology, Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, CA, United States of America
Alexander Schliep Department of Computer Science and Engineering, University of Gothenburg, Göteborg, Sweden
Niklas Wahlberg Department of Biology, Lund University, Lund, Sweden
John Wiedenhoeft Department of Computer Science and Engineering, University of Gothenburg, Göteborg, Sweden
Mari Källersjö Gothenburg Global Biodiversity Centre, University of Gothenburg, Göteborg, Sweden; Gothenburg Botanical Garden, Göteborg, Sweden


Martinsson J, Schliep A, Eliasson B, Mogren O. Blood Glucose Prediction with Variance Estimation Using Recurrent Neural Networks. J Healthc Inform Res 2019;4:1-18. [PMID: 35415439 PMCID: PMC8982803 DOI: 10.1007/s41666-019-00059-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 12/12/2018] [Revised: 04/26/2019] [Accepted: 10/18/2019] [Indexed: 11/28/2022]

Abstract

Many factors affect blood glucose levels in type 1 diabetics, several of which vary largely both in magnitude and delay of the effect. Modern rapid-acting insulins generally have a peak time after 60–90 min, while carbohydrate intake can affect blood glucose levels more rapidly for high glycemic index foods, or slower for other carbohydrate sources. It is important to have good estimates of the development of glucose levels in the near future both for diabetic patients managing their insulin distribution manually, as well as for closed-loop systems making decisions about the distribution. Modern continuous glucose monitoring systems provide excellent sources of data to train machine learning models to predict future glucose levels. In this paper, we present an approach for predicting blood glucose levels for diabetics up to 1 h into the future. The approach is based on recurrent neural networks trained in an end-to-end fashion, requiring nothing but the glucose level history for the patient. Our approach obtains results that are comparable to the state of the art on the Ohio T1DM dataset for blood glucose level prediction. In addition to predicting the future glucose value, our model provides an estimate of its certainty, helping users to interpret the predicted levels. This is realized by training the recurrent neural network to parameterize a univariate Gaussian distribution over the output. The approach needs no feature engineering or data preprocessing and is computationally inexpensive. We evaluate our method using the standard root-mean-squared error (RMSE) metric, along with a blood glucose-specific metric called the surveillance error grid (SEG). We further study the properties of the distribution that is learned by the model, using experiments that determine the nature of the certainty estimate that the model is able to capture.
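The abstract above describes a network that outputs the parameters of a univariate Gaussian rather than a single point prediction, evaluated with RMSE. The sketch below illustrates the loss such a model would typically minimize, the Gaussian negative log-likelihood, alongside RMSE. It is an assumption-laden illustration, not the paper's implementation: the function names and the log-variance parameterization are hypothetical choices made here for numerical convenience.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Negative log-likelihood of `target` under N(mean, exp(log_var)).

    Predicting the log-variance (an assumption of this sketch) keeps the
    variance positive without constraining the model output.
    """
    var = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (target - mean) ** 2 / var)


def rmse(y_true, y_pred):
    """Root-mean-squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```

With this loss, a large prediction error is penalized less when the model also reports a large variance, which is how the variance output can act as a certainty estimate.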

Wiedenhoeft J, Cagan A, Kozhemyakina R, Gulevich R, Schliep A. Bayesian localization of CNV candidates in WGS data within minutes. Algorithms Mol Biol 2019;14:20. [PMID: 31572486 PMCID: PMC6757390 DOI: 10.1186/s13015-019-0154-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2018] [Accepted: 08/08/2019] [Indexed: 11/10/2022] Open

Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, Jones G, Knowles LL, Lamichhaney S, Marcussen T, Morlon H, Nakhleh LK, Oxelman B, Pfeil B, Schliep A, Wahlberg N, Werneck FP, Wiedenhoeft J, Willows-Munro S, Edwards SV. Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ 2019;7:e6399. [PMID: 30783571 PMCID: PMC6378093 DOI: 10.7717/peerj.6399] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/05/2018] [Accepted: 01/07/2019] [Indexed: 12/23/2022] Open

Affiliation(s)

Gustavo A. Bravo Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
Alexandre Antonelli Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA; Gothenburg Global Biodiversity Centre, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden; Gothenburg Botanical Garden, Göteborg, Sweden
Christine D. Bacon Gothenburg Global Biodiversity Centre, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
Krzysztof Bartoszek Department of Computer and Information Science, Linköping University, Linköping, Sweden
Mozes P. K. Blom Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Stella Huynh Institut de Biologie, Université de Neuchâtel, Neuchâtel, Switzerland
Graham Jones Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
L. Lacey Knowles Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
Sangeet Lamichhaney Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
Thomas Marcussen Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
Hélène Morlon Institut de Biologie, Ecole Normale Supérieure de Paris, Paris, France
Luay K. Nakhleh Department of Computer Science, Rice University, Houston, TX, USA
Bengt Oxelman Gothenburg Global Biodiversity Centre, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
Bernard Pfeil Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
Alexander Schliep Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
Niklas Wahlberg Department of Biology, Lund University, Lund, Sweden
Fernanda P. Werneck Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisa da Amazônia, Manaus, AM, Brazil
John Wiedenhoeft Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden; Department of Computer Science, Rutgers University, Piscataway, NJ, USA
Sandi Willows-Munro School of Life Sciences, University of Kwazulu-Natal, Pietermaritzburg, South Africa
Scott V. Edwards Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA; Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden


Wiedenhoeft J, Schliep A. Using HaMMLET for Bayesian Segmentation of WGS Read-Depth Data. Methods Mol Biol 2018;1833:83-93. [PMID: 30039365 DOI: 10.1007/978-1-4939-8666-8_6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]

Lopes IDON, Schliep A, de Carvalho ACPDLF. Automatic learning of pre-miRNAs from different species. BMC Bioinformatics 2016;17:224. [PMID: 27233515 PMCID: PMC4884428 DOI: 10.1186/s12859-016-1036-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/30/2015] [Accepted: 04/12/2016] [Indexed: 12/18/2022] Open

Abstract

BACKGROUND

Discovery of microRNAs (miRNAs) relies on predictive models for characteristic features from miRNA precursors (pre-miRNAs). The short length of miRNA genes and the lack of pronounced sequence features complicate this task. To accommodate the peculiarities of plant and animal miRNA systems, tools for both systems have evolved differently. However, these tools are biased towards the species for which they were primarily developed and, consequently, their predictive performance on data sets from other species of the same kingdom might be lower. While these biases are intrinsic to the species, their characterization can lead to computational approaches capable of diminishing their negative effect on the accuracy of pre-miRNA predictive models. We investigate in this study how 45 predictive models induced for data sets from 45 species, distributed in eight subphyla/classes, perform when applied to a species different from the species used in their induction.

RESULTS

Our computational experiments show that the separability of pre-miRNA and pseudo pre-miRNA instances is species-dependent and no feature set performs well for all species, even within the same subphylum/class. Mitigating this species dependency, we show that an ensemble of classifiers reduced the classification errors for all 45 species. As the ensemble members were obtained using meaningful yet computationally viable feature sets, the ensembles also have a lower computational cost than individual classifiers that rely on energy stability parameters, which are of prohibitive computational cost in large-scale applications.

CONCLUSION

In this study, the combination of multiple pre-miRNA feature sets and multiple learning biases enhanced the predictive accuracy of pre-miRNA classifiers for 45 species. This is certainly a promising approach to be incorporated in miRNA discovery tools towards more accurate and less species-dependent tools. The material to reproduce the results from this paper can be downloaded from http://dx.doi.org/10.5281/zenodo.49754.
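The abstract above reports that an ensemble of classifiers reduced classification errors, without stating in this listing how the members' outputs were combined. As one common possibility, shown purely as an illustration and not as the paper's method, a majority vote over per-classifier label predictions can be sketched as follows (the function name and the 0/1 label encoding are assumptions):

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers by majority vote.

    `predictions` holds one list of labels per classifier, all of equal
    length. With an odd number of classifiers and binary labels, ties
    cannot occur.
    """
    combined = []
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined
```

Here each inner list would come from a classifier trained on a different feature set, for example labelling candidates as pre-miRNA (1) or pseudo pre-miRNA (0).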


Lopes IDON, Schliep A, de Carvalho ACPDLF. The discriminant power of RNA features for pre-miRNA recognition. BMC Bioinformatics 2014;15:124. [PMID: 24884650 PMCID: PMC4046174 DOI: 10.1186/1471-2105-15-124] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 10/25/2013] [Accepted: 04/08/2014] [Indexed: 12/26/2022] Open

Abstract

BACKGROUND

Computational discovery of microRNAs (miRNA) is based on pre-determined sets of features from miRNA precursors (pre-miRNA). Some feature sets are composed of sequence-structure patterns commonly found in pre-miRNAs, while others are a combination of more sophisticated RNA features. In this work, we analyze the discriminant power of seven feature sets, which are used in six pre-miRNA prediction tools. The analysis is based on the classification performance achieved with these feature sets for the training algorithms used in these tools. We also evaluate feature discrimination through the F-score and feature importance in the induction of random forests.

RESULTS

Small or non-significant differences were found among the estimated classification performances of classifiers induced using sets with diversification of features, despite the wide differences in their dimension. Inspired by these results, we obtained a lower-dimensional feature set, which achieved a sensitivity of 90% and a specificity of 95%. These estimates are within 0.1% of the maximal values obtained with any feature set (SELECT, Section "Results and discussion") while it is 34 times faster to compute. Even compared to another feature set (FS2, see Section "Results and discussion"), which is the computationally least expensive feature set of those from the literature which perform within 0.1% of the maximal values, it is 34 times faster to compute. The results obtained by the tools used as references in the experiments showed that five out of these six tools have lower sensitivity or specificity.

CONCLUSION

In miRNA discovery the number of putative miRNA loci is in the order of millions. Analysis of putative pre-miRNAs using a computationally expensive feature set would be wasteful or even unfeasible for large genomes. In this work, we propose a relatively inexpensive feature set and explore most of the learning aspects implemented in current ab-initio pre-miRNA prediction tools, which may lead to the development of efficient ab-initio pre-miRNA discovery tools. The material to reproduce the main results from this paper can be downloaded from http://bioinformatics.rutgers.edu/Static/Software/discriminant.tar.gz.
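The results above are stated in terms of sensitivity and specificity. For readers unfamiliar with the two measures, the following minimal sketch computes both from binary labels. It is illustrative only: the function name and the encoding of true pre-miRNAs as 1 and pseudo pre-miRNAs as 0 are assumptions, not taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives rejected
    return sensitivity, specificity
```

Because putative loci number in the millions, even a specificity of 95% leaves many false positives in absolute terms, which is one reason both measures are reported together.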


Roy RS, Bhattacharya D, Schliep A. Turtle: Identifying frequent k-mers with cache-efficient algorithms. Bioinformatics 2014;30:1950-7. [DOI: 10.1093/bioinformatics/btu132] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Indexed: 11/12/2022]

Mahmud MP, Wiedenhoeft J, Schliep A. Indel-tolerant read mapping with trinucleotide frequencies using cache-oblivious kd-trees. Bioinformatics 2013;28:i325-i332. [PMID: 22962448 PMCID: PMC3436807 DOI: 10.1093/bioinformatics/bts380] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/31/2022]

Marschall T, Costa IG, Canzar S, Bauer M, Klau GW, Schliep A, Schönhuth A. CLEVER: clique-enumerating variant finder. Bioinformatics 2012;28:2875-82. [DOI: 10.1093/bioinformatics/bts566] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.3] [Indexed: 11/12/2022]

Roy RS, Chen KC, Sengupta AM, Schliep A. SLIQ: Simple Linear Inequalities for Efficient Contig Scaffolding. J Comput Biol 2012;19:1162-75. [DOI: 10.1089/cmb.2011.0263] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/15/2022]

Mahmud MP, Schliep A. Fast MCMC sampling for hidden Markov Models to determine copy number variations. BMC Bioinformatics 2011;12:428. [PMID: 22047014 PMCID: PMC3371636 DOI: 10.1186/1471-2105-12-428] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 02/17/2011] [Accepted: 11/02/2011] [Indexed: 11/10/2022]

Hafemeister C, Krause R, Schliep A. Selecting oligonucleotide probes for whole-genome tiling arrays with a cross-hybridization potential. IEEE/ACM Trans Comput Biol Bioinform 2011;8:1642-1652. [PMID: 21358006 DOI: 10.1109/tcbb.2011.39] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]

Seifert M, Strickert M, Schliep A, Grosse I. Exploiting prior knowledge and gene distances in the analysis of tumor expression profiles with extended Hidden Markov Models. Bioinformatics 2011;27:1645-52. [PMID: 21511716 DOI: 10.1093/bioinformatics/btr199] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]

Schilling R, Costa IG, Schliep A. pGQL: A probabilistic graphical query language for gene expression time courses. BioData Min 2011;4:9. [PMID: 21501515 PMCID: PMC3096586 DOI: 10.1186/1756-0381-4-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/12/2010] [Accepted: 04/18/2011] [Indexed: 11/24/2022]

Hafemeister C, Costa IG, Schönhuth A, Schliep A. Classifying short gene expression time-courses with Bayesian estimation of piecewise constant functions. Bioinformatics 2011;27:946-52. [PMID: 21266444 DOI: 10.1093/bioinformatics/btr037] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]

Georgi B, Costa IG, Schliep A. PyMix--the python mixture package--a tool for clustering of heterogeneous biological data. BMC Bioinformatics 2010;11:9. [PMID: 20053276 PMCID: PMC2823712 DOI: 10.1186/1471-2105-11-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 08/03/2009] [Accepted: 01/06/2010] [Indexed: 11/10/2022]

Georgi B, Schultz J, Schliep A. Partially-supervised protein subclass discovery with simultaneous annotation of functional residues. BMC Struct Biol 2009;9:68. [PMID: 19857261 PMCID: PMC2777906 DOI: 10.1186/1472-6807-9-68] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/03/2009] [Accepted: 10/26/2009] [Indexed: 03/20/2023]

Costa IG, Schönhuth A, Hafemeister C, Schliep A. Constrained mixture estimation for analysis and robust classification of clinical time series. Bioinformatics 2009;25:i6-14. [PMID: 19478017 PMCID: PMC2687976 DOI: 10.1093/bioinformatics/btp222] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]

Abstract

Motivation: Personalized medicine based on molecular aspects of diseases, such as gene expression profiling, has become increasingly popular. However, one faces multiple challenges when analyzing clinical gene expression data; most of the well-known theoretical issues such as high dimension of feature spaces versus few examples, noise and missing data apply. Special care is needed when designing classification procedures that support personalized diagnosis and choice of treatment. Here, we particularly focus on classification of interferon-β (IFNβ) treatment response in Multiple Sclerosis (MS) patients, which has attracted substantial attention in the recent past. Half of the patients remain unaffected by IFNβ treatment, which is still the standard. For them the treatment should be ceased in a timely manner to mitigate the side effects.

Results: We propose constrained estimation of mixtures of hidden Markov models as a methodology to classify patient response to IFNβ treatment. The advantages of our approach are that it takes the temporal nature of the data into account and that it is robust with respect to noise, missing data and mislabeled samples. Moreover, mixture estimation makes it possible to explore the presence of response sub-groups of patients on the transcriptional level. We clearly outperformed all prior approaches in terms of prediction accuracy, raising it, for the first time, above 90%. Additionally, we were able to identify potentially mislabeled samples and to sub-divide the good responders into two sub-groups that exhibited different transcriptional response programs. This is supported by recent findings on MS pathology and therefore may raise interesting clinical follow-up questions.

Availability: The method is implemented in the GQL framework and is available at http://www.ghmm.org/gql. Datasets are available at http://www.cin.ufpe.br/~igcf/MSConst

Contact: igcf@cin.ufpe.br

Supplementary information: Supplementary data are available at Bioinformatics online.


de Souto MCP, Costa IG, de Araujo DSA, Ludermir TB, Schliep A. Clustering cancer gene expression data: a comparative study. BMC Bioinformatics 2008;9:497. [PMID: 19038021 PMCID: PMC2632677 DOI: 10.1186/1471-2105-9-497] [Citation(s) in RCA: 261] [Impact Index Per Article: 16.3] [Received: 07/07/2008] [Accepted: 11/27/2008] [Indexed: 11/28/2022]

Macula AJ, Schliep A, Bishop MA, Renz TE. New, improved, and practical k-stem sequence similarity measures for probe design. J Comput Biol 2008;15:525-34. [PMID: 18549305 DOI: 10.1089/cmb.2007.0208] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]

Schliep A, Krause R. Efficient algorithms for the computational design of optimal tiling arrays. IEEE/ACM Trans Comput Biol Bioinform 2008;5:557-567. [PMID: 18989043 DOI: 10.1109/tcbb.2008.50] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]

Costa IG, Roepcke S, Hafemeister C, Schliep A. Inferring differentiation pathways from gene expression. Bioinformatics 2008;24:i156-64. [PMID: 18586709 PMCID: PMC2718631 DOI: 10.1093/bioinformatics/btn153] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 02/05/2023]

Rungsarityotin W, Krause R, Schödl A, Schliep A. Identifying protein complexes directly from high-throughput TAP data with Markov random fields. BMC Bioinformatics 2007;8:482. [PMID: 18093306 PMCID: PMC2222659 DOI: 10.1186/1471-2105-8-482] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 05/18/2007] [Accepted: 12/19/2007] [Indexed: 11/10/2022]

Costa IG, Roepcke S, Schliep A. Gene expression trees in lymphoid development. BMC Immunol 2007;8:25. [PMID: 17925013 PMCID: PMC2244641 DOI: 10.1186/1471-2172-8-25] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 05/18/2007] [Accepted: 10/09/2007] [Indexed: 11/10/2022]

Klau GW, Rahmann S, Schliep A, Vingron M, Reinert K. Optimal robust non-unique probe selection using Integer Linear Programming. Bioinformatics 2007;20 Suppl 1:i186-93. [PMID: 15262798 DOI: 10.1093/bioinformatics/bth936] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.2] [Indexed: 11/14/2022]

Georgi B, Schliep A. Context-specific independence mixture modeling for positional weight matrices. Bioinformatics 2006;22:e166-73. [PMID: 16873468 DOI: 10.1093/bioinformatics/btl249] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]

Schliep A, Rahmann S. Decoding non-unique oligonucleotide hybridization experiments of targets related by a phylogenetic tree. Bioinformatics 2006;22:e424-30. [PMID: 16873503 DOI: 10.1093/bioinformatics/btl254] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]

Pipenbacher P, Schliep A, Schneckener S, Schönhuth A, Schomburg D, Schrader R. ProClust: improved clustering of protein sequences with an extended graph-based approach. Bioinformatics 2005;18 Suppl 2:S182-91. [PMID: 12386002 DOI: 10.1093/bioinformatics/18.suppl_2.s182] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.1] [Indexed: 11/14/2022]

Abstract

MOTIVATION

The problem of finding remote homologues of a given protein sequence via alignment methods is not fully solved. In fact, the task seems to become more difficult with more data. As the size of the database increases, so does the noise level; the highest alignment scores due to random similarities increase and can be higher than the alignment score between true homologues. Comparing two sequences with an arbitrary alignment method yields a similarity value which may indicate an evolutionary relationship between them. A threshold value is usually chosen to distinguish between true homologue relationships and random similarities. To compensate for the higher probability of spurious hits in larger databases, this threshold is increased. Increasing specificity however leads to decreased sensitivity as a matter of principle. Sensitivity can be recovered by utilizing refined protocols. A number of approaches to this challenge have made use of the fact that proteins are often members of some larger protein family. This can be exploited by using position-specific substitution matrices or profiles, or by making use of transitivity of homology. Transitivity refers to the concept of concluding homology between proteins A and C based on homology between A and a third protein B and between B and C. It has been demonstrated that transitivity can lead to substantial improvement in recognition of remote homologues particularly in cases where the alignment score of A and C is below the noise level. A natural limit to the use of transitivity is imposed by domains. Domains, compact independent sub-units of proteins, are often shared between otherwise distinct proteins, and can cause substantial problems by incorrectly linking otherwise unrelated proteins.

RESULTS

We extend a graph-based clustering algorithm which uses an asymmetric distance measure, scaling similarity values based on the length of the protein sequences compared. Additionally, the significance of alignment scores is taken into account and used for a filtering step in the algorithm. Post-processing to merge further clusters based on profile HMMs is proposed. SCOP sequences and their super-family level classification are used as a test set for a clustering computed with our method for the joint data set containing both SCOP and SWISS-PROT. Note that the joint data set includes all multi-domain proteins, which contain the SCOP domains that are a potential source of incorrect links. At high specificities, our method compares very favorably with PSI-Blast, which is probably the most widely used tool for finding remote homologues. We demonstrate that using transitivity with as many as twelve intermediate sequences is crucial to achieving this level of performance. Moreover, from analysis of false positives we conclude that our method seems to correctly bound the degree of transitivity used. This analysis also yields explicit guidance in choosing parameters. The heuristics of the asymmetric distance measure used neither solve the multi-domain problem from a theoretical point of view, nor do they avoid all types of problems we have observed in real data. Nevertheless, they do provide a substantial improvement over existing approaches.

AVAILABILITY

The complete software source is freely available to all users under the GNU General Public License (GPL) from http://www.bioinformatik.uni-koeln.de/~proclust/download/
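The bounded use of transitivity described in the abstract above can be illustrated with a breadth-first search that counts intermediate sequences between two proteins. This is a simplified sketch, not the ProClust algorithm: it treats links as symmetric and does not model the asymmetric, length-scaled distance measure or the significance filter. The function name `intermediates_between` and the toy links are hypothetical.

```python
from collections import deque


def intermediates_between(links, source, target, max_intermediates):
    """Return the fewest intermediate sequences linking source to target.

    `links` is an iterable of (a, b) pairs whose alignment score is assumed
    to exceed some threshold. Returns None if no chain with at most
    `max_intermediates` intermediates exists.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    max_edges = max_intermediates + 1  # a chain with k intermediates has k + 1 links
    queue = deque([(source, 0)])       # (sequence, number of links from source)
    seen = {source}
    while queue:
        node, edges_used = queue.popleft()
        if node == target:
            return max(edges_used - 1, 0)
        if edges_used == max_edges:
            continue  # do not extend chains beyond the allowed bound
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, edges_used + 1))
    return None


# Toy chain A - B - C - D, invented for illustration only.
links = [("A", "B"), ("B", "C"), ("C", "D")]
print(intermediates_between(links, "A", "C", max_intermediates=1))
print(intermediates_between(links, "A", "D", max_intermediates=1))
```

Raising `max_intermediates` widens the set of reachable proteins, which mirrors the trade-off the abstract describes between recovering remote homologues and admitting incorrect links through shared domains.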


Schliep A, Costa IG, Steinhoff C, Schönhuth A. Analyzing gene expression time-courses. IEEE/ACM Trans Comput Biol Bioinform 2005;2:179-93. [PMID: 17044182 DOI: 10.1109/tcbb.2005.31] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 05/12/2023]

Costa IG, Schönhuth A, Schliep A. The Graphical Query Language: a tool for analysis of gene expression time-courses. Bioinformatics 2005;21:2544-5. [PMID: 15701683 DOI: 10.1093/bioinformatics/bti311] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]

Schliep A, Schönhuth A, Steinhoff C. Using hidden Markov models to analyze gene expression time course data. Bioinformatics 2004;19 Suppl 1:i255-63. [PMID: 12855468 DOI: 10.1093/bioinformatics/btg1036] [Citation(s) in RCA: 170] [Impact Index Per Article: 8.5] [Indexed: 11/13/2022]

Schliep A, Steinhoff C, Schönhuth A. Robust inference of groups in gene expression time-courses using mixtures of HMMs. Bioinformatics 2004;20 Suppl 1:i283-9. [PMID: 15262810 DOI: 10.1093/bioinformatics/bth937] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022]

Schliep A, Torney DC, Rahmann S. Group testing with DNA chips: generating designs and decoding experiments. Proc IEEE Comput Soc Bioinform Conf 2003;2:84-91. [PMID: 16452782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/06/2023]

Knab B, Schliep A, Steckemetz B, Wichern B. Model-Based Clustering With Hidden Markov Models and its Application to Financial Time-Series Data. Between Data Science and Applied Data Analysis 2003. [DOI: 10.1007/978-3-642-18991-3_64] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 12/23/2022]

Kaderali L, Schliep A. Selecting signature oligonucleotides to identify organisms using DNA arrays. Bioinformatics 2002;18:1340-9. [PMID: 12376378 DOI: 10.1093/bioinformatics/18.10.1340] [Citation(s) in RCA: 79] [Impact Index Per Article: 3.6] [Indexed: 11/13/2022]

Bolten E, Schliep A, Schneckener S, Schomburg D, Schrader R. Clustering protein sequences--structure prediction by transitive homology. Bioinformatics 2001;17:935-41. [PMID: 11673238 DOI: 10.1093/bioinformatics/17.10.935] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022]

Abstract

MOTIVATION

It is widely believed that for two proteins A and B a sequence identity above some threshold implies structural similarity due to a common evolutionary ancestor. Since this is only a sufficient, but not a necessary, condition for structural similarity, the question remains what other criteria can be used to identify remote homologues. Transitivity refers to the concept of deducing a structural similarity between proteins A and C from the existence of a third protein B, such that A and B as well as B and C are homologues, as ascertained if the sequence identity between A and B as well as that between B and C is above the aforementioned threshold. It is not fully understood if transitivity always holds and whether transitivity can be extended ad infinitum.

RESULTS

We developed a graph-based clustering approach, where transitivity plays a crucial role. We determined all pair-wise similarities for the sequences in the SwissProt database using the Smith-Waterman local alignment algorithm. This data was transformed into a directed graph, where protein sequences constitute vertices. A directed edge was drawn from vertex A to vertex B if the sequences A and B showed similarity, scaled with respect to the self-similarity of A, above a fixed threshold. Transitivity was important in the clustering process, as intermediate sequences were used, limited though by the requirement of having directed paths in both directions between proteins linked over such sequences. The length dependency (implied by the self-similarity) of the scaling of the alignment scores appears to be an effective criterion to avoid clustering errors due to multi-domain proteins. To deal with the resulting large graphs we have developed an efficient library. Methods include the novel graph-based clustering algorithm capable of handling multi-domain proteins and cluster comparison algorithms. Structural Classification of Proteins (SCOP) was used as an evaluation data set for our method, yielding a 24% improvement over pair-wise comparisons in terms of detecting remote homologues.

AVAILABILITY

The software is available to academic users on request from the authors.

CONTACT

e.bolten@science-factory.com; schliep@zpr.uni-koeln.de; s.schneckener@science-factory.com; d.schomburg@uni-koeln.de; schrader@zpr.uni-koeln.de.

SUPPLEMENTARY INFORMATION

http://www.zaik.uni-koeln.de/~schliep/ProtClust.html.
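The edge rule in the RESULTS paragraph of the abstract above (an edge from A to B when their similarity, scaled by the self-similarity of A, exceeds a threshold, with clusters requiring directed paths in both directions) can be sketched as follows. This is one possible reading, under the assumption that clusters correspond to strongly connected components; it is not the authors' library. The function names, the scores and the threshold of 0.5 are all invented for illustration.

```python
def build_edges(scores, threshold):
    """Build a directed graph from alignment scores.

    scores[(a, b)] is the alignment score of a against b, and scores[(a, a)]
    is the self-similarity of a. An edge a -> b is drawn when the score
    scaled by the self-similarity of a reaches the threshold.
    """
    edges = {}
    for (a, b), score in scores.items():
        edges.setdefault(a, set())
        if a != b and score / scores[(a, a)] >= threshold:
            edges[a].add(b)
    return edges


def reachable(edges, start):
    """Return all vertices reachable from start along directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in edges.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def clusters(edges):
    """Group vertices that reach each other in both directions."""
    reach = {v: reachable(edges, v) for v in edges}
    assigned = set()
    result = []
    for v in sorted(edges):
        if v in assigned:
            continue
        component = {u for u in reach[v] if u in reach and v in reach[u]}
        result.append(sorted(component))
        assigned |= component
    return result


# Toy scores: S1, S2 and T are short proteins; M is a long multi-domain
# protein that shares one domain with S1 and another with T.
scores = {
    ("S1", "S1"): 100, ("S2", "S2"): 100, ("T", "T"): 100, ("M", "M"): 300,
    ("S1", "S2"): 80, ("S2", "S1"): 80,
    ("S1", "M"): 90, ("M", "S1"): 90,
    ("T", "M"): 90, ("M", "T"): 90,
}
print(clusters(build_edges(scores, threshold=0.5)))
```

In this toy case the long protein M has a large self-similarity, so its scaled scores fall below the threshold and it has no outgoing edges. S1 and T both point to M, but no path leads back, so M does not merge their clusters.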


Knill E, Schliep A, Torney DC. Interpretation of pooling experiments using the Markov chain Monte Carlo method. J Comput Biol 1996;3:395-406. [PMID: 8891957 DOI: 10.1089/cmb.1996.3.395] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.9] [Indexed: 02/02/2023]

Schlue WR, Schliep A, Walz W. Fluorescence marking of neuropile glial cells in the central nervous system of the leech Hirudo medicinalis. Cell Tissue Res 1980;209:257-69. [PMID: 7397768 DOI: 10.1007/bf00237630] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.4] [Indexed: 01/25/2023]