1
Welten S, Weber S, Holt A, Beyan O, Decker S. Will it run?-A proof of concept for smoke testing decentralized data analytics experiments. Front Med (Lausanne) 2024; 10:1305415. [PMID: 38259836 PMCID: PMC10801058 DOI: 10.3389/fmed.2023.1305415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Accepted: 12/14/2023] [Indexed: 01/24/2024] Open
Abstract
The growing interest in data-driven medicine, in conjunction with the formation of initiatives such as the European Health Data Space (EHDS), has demonstrated the need for methodologies that are capable of facilitating privacy-preserving data analysis. Distributed Analytics (DA), as an enabler for privacy-preserving analysis across multiple data sources, has shown its potential to support data-intensive research. However, the application of DA creates new challenges stemming from its distributed nature, such as identifying single points of failure (SPOFs) in DA tasks before their actual execution. Failing to detect such SPOFs can, for example, result in improper termination of the DA code, necessitating additional efforts from multiple stakeholders to resolve the malfunctions. Moreover, these malfunctions disrupt the seamless conduct of DA and entail several crucial consequences, including technical obstacles to resolving the issues, potential delays in research outcomes, and increased costs. In this study, we address this challenge by introducing a concept based on a method called Smoke Testing, an initial and foundational test run to ensure the operability of the analysis code. We review existing DA platforms and systematically extract six specific Smoke Testing criteria for DA applications. With these criteria in mind, we create an interactive environment called Development Environment for AuTomated and Holistic Smoke Testing of Analysis-Runs (DEATHSTAR), which allows researchers to perform Smoke Tests on their DA experiments. We conduct a user study with 29 participants to assess our environment and additionally apply it to three real use cases. The results of our evaluation validate its effectiveness, revealing that 96.6% of the analyses created and (Smoke) tested by participants using our approach successfully terminated without any errors. Thus, by incorporating Smoke Testing as a fundamental method, our approach helps identify potential malfunctions early in the development process, ensuring smoother data-driven research within the scope of DA. Through its flexibility and adaptability to diverse real use cases, our solution enables more robust and efficient development of DA experiments, which contributes to their reliability.
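The basic idea of a smoke test described in this abstract, running the analysis code once on small inputs to check that it terminates, can be sketched as follows. This is a minimal illustration and not the DEATHSTAR implementation; the helper `smoke_test`, the toy analysis `mean_age`, and the synthetic records are all hypothetical, and the six criteria from the paper are not modeled here.

```python
import traceback
from typing import Any, Callable, Iterable, Tuple


def smoke_test(analysis: Callable[[Any], Any], samples: Iterable[Any]) -> Tuple[bool, str]:
    """Run `analysis` once per sample and report the first failure.

    Hypothetical helper: it only checks that the code terminates
    without raising, not that the result is statistically meaningful.
    """
    for index, sample in enumerate(samples):
        try:
            analysis(sample)
        except Exception:
            return False, f"sample {index} failed:\n{traceback.format_exc()}"
    return True, "all samples completed"


def mean_age(records):
    """Toy analysis: average the 'age' field of a list of records."""
    return sum(r["age"] for r in records) / len(records)


# Two tiny synthetic data sources: one well-formed, one with a missing field.
good = [{"age": 40}, {"age": 60}]
bad = [{"age": 40}, {}]

ok, report = smoke_test(mean_age, [good])
broken, broken_report = smoke_test(mean_age, [good, bad])
```

Here the second run stops at the malformed source and returns the traceback, which is the kind of early failure signal a smoke test is meant to surface before the code is sent to real data providers.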
Affiliation(s)
- Sascha Welten
- Chair of Computer Science 5, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University, Aachen, Germany
- Sven Weber
- Chair of Computer Science 5, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University, Aachen, Germany
- Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Adrian Holt
- Chair of Computer Science 5, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University, Aachen, Germany
- Oya Beyan
- Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Fraunhofer Institute for Applied Information Technology FIT, St. Augustin, Germany
- Stefan Decker
- Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Fraunhofer Institute for Applied Information Technology FIT, St. Augustin, Germany
2
Namoun A, Abi Sen AA, Tufail A, Alshanqiti A, Nawaz W, BenRhouma O. A Two-Phase Machine Learning Framework for Context-Aware Service Selection to Empower People with Disabilities. Sensors (Basel) 2022; 22:5142. [PMID: 35890820 PMCID: PMC9324550 DOI: 10.3390/s22145142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/02/2022] [Revised: 06/24/2022] [Accepted: 07/01/2022] [Indexed: 06/15/2023]
Abstract
The use of software and IoT services is increasing significantly among people with special needs, who constitute 15% of the world's population. However, selecting appropriate services to create a composite assistive service based on the evolving needs and context of disabled user groups remains a challenging research endeavor. Our research applies a scenario-based design technique to contribute (1) an inclusive disability ontology for assistive service selection, (2) semi-synthetic generated disability service datasets, and (3) a machine learning (ML) framework to choose services adaptively to suit the dynamic requirements of people with special needs. The ML-based selection framework is applied in two complementary phases. In the first phase, all available atomic tasks are assessed to determine their appropriateness to the user goal and profiles, whereas in the subsequent phase, the list of service providers is narrowed by matching their quality-of-service factors against the context and characteristics of the disabled person. Our methodology is centered around a myriad of user characteristics, including their disability profile, preferences, environment, and available IT resources. To this end, we extended the widely used QWS V2.0 and WS-DREAM web services datasets with a fusion of selected accessibility features. To ascertain the validity of our approach, we compared its performance against common multi-criteria decision making (MCDM) models, namely AHP, SAW, PROMETHEE, and TOPSIS. The findings demonstrate superior service selection accuracy in contrast to the other methods while ensuring accessibility requirements are satisfied.
Affiliation(s)
- Abdallah Namoun
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah 42351, Saudi Arabia; (A.A.A.S.); (A.A.); (W.N.); (O.B.)
- Adnan Ahmed Abi Sen
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah 42351, Saudi Arabia; (A.A.A.S.); (A.A.); (W.N.); (O.B.)
- Ali Tufail
- School of Digital Science, Universiti Brunei Darussalam, Tungku Link, Gadong BE1410, Brunei
- Abdullah Alshanqiti
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah 42351, Saudi Arabia; (A.A.A.S.); (A.A.); (W.N.); (O.B.)
- Waqas Nawaz
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah 42351, Saudi Arabia; (A.A.A.S.); (A.A.); (W.N.); (O.B.)
- Oussama BenRhouma
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah 42351, Saudi Arabia; (A.A.A.S.); (A.A.); (W.N.); (O.B.)
3
Sheeba A, Padmakala S, Subasini CA, Karuppiah SP. MKELM: Mixed Kernel Extreme Learning Machine using BMDA optimization for web services based heart disease prediction in smart healthcare. Comput Methods Biomech Biomed Engin 2022; 25:1180-1194. [PMID: 35174762 DOI: 10.1080/10255842.2022.2034795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
In recent years, cardiovascular disease has become a leading cause of death. Web services connect medical equipment and computers via the internet to exchange and combine data in novel ways. Accurate prediction of heart disease is important for treating cardiac patients before a heart attack occurs. A main difficulty with heart disease is the delay in identifying it at an early stage. Early identification can be supported by applying machine learning to rich healthcare information on heart diseases. In this paper, a smart healthcare method is proposed for the prediction of heart disease that uses a Biogeography optimization algorithm and a Mexican hat wavelet to enhance Dragonfly algorithm optimization, combined with a mixed kernel based extreme learning machine (BMDA-MKELM). Here, data are gathered from two sources: sensor nodes and electronic medical records. An Android-based design is used to gather the patient data, and a reliable cloud-based scheme is used for data storage. For further evaluation of heart disease prediction, data are gathered from cloud computing services. Finally, the BMDA-MKELM based prediction scheme classifies cardiovascular diseases. In addition, the proposed prediction scheme is compared with other methods with respect to accuracy, precision, specificity, and sensitivity. The experimental results show that the proposed approach achieves better results for the prediction of heart disease than the other methods.
Affiliation(s)
- Adlin Sheeba
- Department of Computer Science and Engineering, St. Joseph's Institute of Technology, Chennai, India
- S Padmakala
- Department of Computer Science and Engineering, St. Joseph's Institute of Technology, Chennai, India
- C A Subasini
- Department of Computer Science and Engineering, St. Joseph's Institute of Technology, Chennai, India
- S P Karuppiah
- Department of MBA, St. Joseph's College of Engineering, Chennai, India
4
Ciampittiello M, Manca D, Dresti C, Grisoni S, Lami A, Saidi H. Meteo-Hydrological Sensors within the Lake Maggiore Catchment: System Establishment, Functioning and Data Validation. Sensors (Basel) 2021; 21:8300. [PMID: 34960394 PMCID: PMC8705426 DOI: 10.3390/s21248300] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/28/2021] [Revised: 11/29/2021] [Accepted: 12/07/2021] [Indexed: 12/01/2022]
Abstract
Climate change and human activities have a strong impact on lakes and their catchments, so to understand ongoing processes it is fundamental to monitor environmental variables with a spatially well-distributed, high-frequency network and to share data efficiently. Effective sharing and interoperability of environmental information between technicians and end users fosters in-depth knowledge of the territory and its critical environmental issues. In this paper, we present the approaches and the results obtained during the PITAGORA project (Interoperable Technological Platform for Acquisition, Management and Organization of Environmental data, related to the lake basin). PITAGORA was aimed at developing both instruments and data management, including pre-processing and quality control of raw data to ensure that data are findable, accessible, interoperable, and reusable (FAIR principles). The main results show that the developed instrumentation is low-cost, easily implementable and reliable, and can be applied to the measurement of diverse environmental parameters such as meteorological, hydrological, physico-chemical, and geological. The flexibility of the proposed solutions makes our system adaptable to different monitoring purposes, research, management, and civil protection. Real-time access to environmental information can improve management of a territory and ecosystems, safety of the population, and sustainable socio-economic development.
5
Honorato RV, Koukos PI, Jiménez-García B, Tsaregorodtsev A, Verlato M, Giachetti A, Rosato A, Bonvin AMJJ. Structural Biology in the Clouds: The WeNMR-EOSC Ecosystem. Front Mol Biosci 2021; 8:729513. [PMID: 34395534 PMCID: PMC8356364 DOI: 10.3389/fmolb.2021.729513] [Citation(s) in RCA: 243] [Impact Index Per Article: 81.0] [Received: 06/23/2021] [Accepted: 07/13/2021] [Indexed: 12/05/2022] Open
Abstract
Structural biology aims at characterizing the structural and dynamic properties of biological macromolecules in atomic detail. Gaining insight into the three-dimensional structures of biomolecules and their interactions is critical for understanding the vast majority of cellular processes, with direct applications in health and food sciences. Since 2010, the WeNMR project (www.wenmr.eu) has implemented numerous web-based services to facilitate the use of advanced computational tools by researchers in the field, using the high throughput computing infrastructure provided by EGI. These services have been further developed in subsequent initiatives under H2020 projects and are now operating as Thematic Services in the European Open Science Cloud portal (www.eosc-portal.eu), sending more than 12 million jobs and using around 4,000 CPU-years per year. Here we review 10 years of successful e-infrastructure solutions serving a large worldwide community of over 23,000 users to date, providing them with user-friendly, web-based solutions that run complex workflows in structural biology. The current set of active WeNMR portals is described, together with the complex backend machinery that allows distributed computing resources to be harvested efficiently.
Affiliation(s)
- Rodrigo V Honorato
- Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
- Panagiotis I Koukos
- Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
- Brian Jiménez-García
- Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
- Andrea Giachetti
- Department of Chemistry and Magnetic Resonance Center, University of Florence, and C.I.R.M.M.P, Fiorentino, Italy
- Antonio Rosato
- Department of Chemistry and Magnetic Resonance Center, University of Florence, and C.I.R.M.M.P, Fiorentino, Italy
- Alexandre M J J Bonvin
- Bijvoet Centre for Biomolecular Research, Faculty of Science, Department of Chemistry, Utrecht University, Utrecht, Netherlands
6
Abbasi WA, Abbas SA, Andleeb S. PANDA: Predicting the change in proteins binding affinity upon mutations by finding a signal in primary structures. J Bioinform Comput Biol 2021; 19:2150015. [PMID: 34126874 DOI: 10.1142/s0219720021500153] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
Accurately determining a change in protein binding affinity upon mutations is important to find novel therapeutics and to assist mutagenesis studies. Determination of change in binding affinity upon mutations requires sophisticated, expensive, and time-consuming wet-lab experiments that can be supported with computational methods. Most of the available computational prediction techniques depend upon protein structures, which limits their applicability to protein complexes with known 3D structures. In this work, we explore the sequence-based prediction of change in protein binding affinity upon mutation and question the effectiveness of k-fold cross-validation (CV) across mutations adopted in previous studies to assess the generalization ability of such predictors with no known mutation during training. We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the change in protein binding affinity upon mutation. Our proposed sequence-based novel change in protein binding affinity predictor called PANDA performs comparably to the existing methods gauged through an appropriate CV scheme and an external independent test dataset. On an external test dataset, our proposed method gives a maximum Pearson correlation coefficient of 0.52 in comparison to the state-of-the-art existing protein structure-based method called MutaBind, which gives a maximum Pearson correlation coefficient of 0.59. Our proposed protein sequence-based method, to predict a change in binding affinity upon mutations, has wide applicability and comparable performance in comparison to existing protein structure-based methods. We made PANDA easily accessible through a cloud-based webserver and python code available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/panda, respectively.
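The Pearson correlation coefficient used above to compare predictors measures the linear agreement between measured and predicted values. A minimal sketch of the computation follows; the function `pearson` and the four value pairs are illustrative assumptions, not data or code from PANDA or MutaBind.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centred products and centred squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented example values: measured vs. predicted affinity changes.
measured = [0.5, 1.2, -0.3, 2.0]
predicted = [0.4, 1.0, 0.1, 1.5]
r = pearson(measured, predicted)
```

A value near 1 means the predictions rank and scale with the measurements; the values of 0.52 and 0.59 reported in the abstract therefore indicate moderate agreement for both methods.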
Affiliation(s)
- Wajid Arshad Abbasi
- Computational Biology and Data Analysis Lab., Department of Computer Sciences & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K 13100, Pakistan
- Syed Ali Abbas
- Computational Biology and Data Analysis Lab., Department of Computer Sciences & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K 13100, Pakistan
- Saiqa Andleeb
- Biotechnology Lab., Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K 13100, Pakistan
7
Bremer E, Saltz J, Almeida JS. ImageBox 2 - Efficient and Rapid Access of Image Tiles from Whole-Slide Images Using Serverless HTTP Range Requests. J Pathol Inform 2020; 11:29. [PMID: 33163255 PMCID: PMC7605284 DOI: 10.4103/jpi.jpi_31_20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/09/2020] [Revised: 04/20/2020] [Accepted: 07/03/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Whole-slide images (WSI) are produced by high-resolution scanning of pathology glass slides. There are a large number of whole-slide imaging scanners, and the resulting images are frequently larger than 100,000 × 100,000 pixels, which typically image 100,000 to one million cells, ranging from several hundred megabytes to many gigabytes in size. AIMS AND OBJECTIVES: Provide HTTP access over the web to Whole Slide Image tiles that do not have localized tiling servers but only basic HTTP access. Move all image decode and tiling functions to the calling agent (ImageBox). METHODS: Current software systems require tiling image servers to be installed on systems providing local disk access to these images. ImageBox2 breaks this requirement by accessing tiles from a remote HTTP source via byte-level HTTP range requests. This method does not require changing the client software, as the operation is relegated to the ImageBox2 server, which is local (or remote) to the client and can access tiles from remote images that have no server of their own, such as Amazon S3 hosted images. That is, it provides a data service [on a server that does not need to be managed], the definition of the serverless execution model increasingly favored by cloud computing infrastructure. CONCLUSIONS: The specific methodology described and assessed in this report preserves normal client connection semantics by enabling cloud-friendly tiling, promoting a web of HTTP-connected whole-slide images from a wide-ranging number of sources, and providing tiling where local tiling servers would have been otherwise unavailable.
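A byte-level HTTP range request of the kind mentioned in the METHODS can be sketched with the Python standard library. This is an illustration under assumptions, not ImageBox2 code: the helper `tile_range_request`, the URL, and the offset and length are hypothetical, and how a tile's byte offset is located inside a given image format is not shown.

```python
import urllib.request


def tile_range_request(url: str, offset: int, length: int) -> urllib.request.Request:
    """Build a GET request for `length` bytes starting at byte `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    last_byte = offset + length - 1  # HTTP byte ranges are inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last_byte}"})


# Hypothetical image URL and tile location.
request = tile_range_request("https://example.org/slides/sample.svs", 1024, 512)

# Sending it would look like: urllib.request.urlopen(request).read()
# A server that honours the range answers with status 206 (Partial Content).
```

Because only the requested bytes travel over the network, any plain HTTP file host that supports range requests can serve individual tiles without a dedicated tiling server.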
Affiliation(s)
- Erich Bremer
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Jonas S Almeida
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Maryland, USA
8
Mier P, Andrade-Navarro MA. MAGA: A Supervised Method to Detect Motifs From Annotated Groups in Alignments. Evol Bioinform Online 2020; 16:1176934320916199. [PMID: 32425492 PMCID: PMC7218316 DOI: 10.1177/1176934320916199] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/10/2020] [Accepted: 03/10/2020] [Indexed: 11/17/2022] Open
Abstract
Multiple sequence alignments are usually phylogenetically driven. They are studied in the framework of evolution. But sometimes, it is interesting to study residue conservation at positions unconstrained by evolutionary rules. We present a supervised method to access a layer of information difficult to appreciate visually when many protein sequences are aligned. This new tool (MAGA; http://cbdm-01.zdv.uni-mainz.de/~munoz/maga/) locates positions in multiple sequence alignments differentially conserved in manually defined groups of sequences.
Affiliation(s)
- Pablo Mier
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, Mainz 55128, Germany
- Miguel A Andrade-Navarro
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, Mainz 55128, Germany
9
Nguyen VD, Nguyen TH, Tayeen ASM, Laughinghouse HD, Sánchez-Reyes LL, Wiggins J, Pontelli E, Mozzherin D, O’Meara B, Stoltzfus A. Phylotastic: Improving Access to Tree-of-Life Knowledge With Flexible, on-the-Fly Delivery of Trees. Evol Bioinform Online 2020; 16:1176934319899384. [PMID: 32372858 PMCID: PMC7192527 DOI: 10.1177/1176934319899384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/31/2019] [Accepted: 11/20/2019] [Indexed: 11/15/2022] Open
Abstract
A comprehensive phylogeny of species, i.e., a tree of life, has potential uses in a variety of contexts, including research, education, and public policy. Yet, accessing the tree of life typically requires special knowledge, complex software, or long periods of training. The Phylotastic project aims to make it as easy to get a phylogeny of species as it is to get driving directions from mapping software. In prior work, we presented a design for an open system to validate and manage taxon names, find phylogeny resources, extract subtrees matching a user's taxon list, scale trees to time, and integrate related resources such as species images. Here, we report the implementation of a set of tools that together represent a robust, accessible system for on-the-fly delivery of phylogenetic knowledge. This set of tools includes a web portal to execute several customizable workflows to obtain species phylogenies (scaled by geologic time and decorated with thumbnail images); more than 30 underlying web services (accessible via a common registry); and code toolkits in R and Python (allowing others to develop custom applications using Phylotastic services). The Phylotastic system, accessible via http://www.phylotastic.org, provides a unique resource to access the current state of phylogenetic knowledge, useful for a variety of cases in which a tree extracted quickly from online resources (as distinct from a tree custom-made from character data) is sufficient, as it is for many casual uses of trees identified here.
Affiliation(s)
- Van D Nguyen
- Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- Thanh H Nguyen
- Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- Abu Saleh Md Tayeen
- Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- H Dail Laughinghouse
- Institute for Bioscience and Biotechnology Research, Rockville, MD, USA
- Fort Lauderdale Research and Education Center, University of Florida/IFAS, Davie, FL, USA
- Luna L Sánchez-Reyes
- Department of Ecology and Evolutionary Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
- Jodie Wiggins
- Department of Ecology and Evolutionary Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
- Enrico Pontelli
- Department of Computer Science, New Mexico State University, Las Cruces, NM, USA
- Dmitry Mozzherin
- Illinois Natural History Survey, Species File Group, University of Illinois at Urbana–Champaign, Champaign, IL, USA
- Brian O’Meara
- Department of Ecology and Evolutionary Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
- Arlin Stoltzfus
- Institute for Bioscience and Biotechnology Research, Rockville, MD, USA
- Office of Data and Informatics, Material Measurement Laboratory, NIST, Gaithersburg, MD, USA
10
Kidmose RT, Juhl J, Nissen P, Boesen T, Karlsen JL, Pedersen BP. Namdinator - automatic molecular dynamics flexible fitting of structural models into cryo-EM and crystallography experimental maps. IUCrJ 2019; 6:526-531. [PMID: 31316797 PMCID: PMC6608625 DOI: 10.1107/s2052252519007619] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Received: 03/23/2019] [Accepted: 05/25/2019] [Indexed: 05/20/2023]
Abstract
Model building into experimental maps is a key element of structural biology, but can be both time consuming and error prone for low-resolution maps. Here we present Namdinator, an easy-to-use tool that enables the user to run a molecular dynamics flexible fitting simulation followed by real-space refinement in an automated manner through a pipeline system. Namdinator will modify an atomic model to fit within cryo-EM or crystallography density maps, and can be used advantageously for both the initial fitting of models, and for a geometrical optimization step to correct outliers, clashes and other model problems. We have benchmarked Namdinator against 39 deposited cryo-EM models and maps, and observe model improvements in 34 of these cases (87%). Clashes between atoms were reduced, and the model-to-map fit and overall model geometry were improved, in several cases substantially. We show that Namdinator is able to model large-scale conformational changes compared to the starting model. Namdinator is a fast and easy tool for structural model builders at all skill levels. Namdinator is available as a web service (https://namdinator.au.dk), or it can be run locally as a command-line tool.
Affiliation(s)
- Rune Thomas Kidmose
- Centre for Structural Biology, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus, DK-8000, Denmark
- Jonathan Juhl
- Centre for Structural Biology, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus, DK-8000, Denmark
- Poul Nissen
- Centre for Structural Biology, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus, DK-8000, Denmark
- Thomas Boesen
- Centre for Structural Biology, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus, DK-8000, Denmark
- Jesper Lykkegaard Karlsen
- Centre for Structural Biology, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus, DK-8000, Denmark
- Bjørn Panyella Pedersen
- Centre for Structural Biology, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, Aarhus, DK-8000, Denmark
11
Wegrzyn JL, Staton MA, Street NR, Main D, Grau E, Herndon N, Buehler S, Falk T, Zaman S, Ramnath R, Richter P, Sun L, Condon B, Almsaeed A, Chen M, Mannapperuma C, Jung S, Ficklin S. Cyberinfrastructure to Improve Forest Health and Productivity: The Role of Tree Databases in Connecting Genomes, Phenomes, and the Environment. Front Plant Sci 2019; 10:813. [PMID: 31293610 PMCID: PMC6603172 DOI: 10.3389/fpls.2019.00813] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/16/2018] [Accepted: 06/05/2019] [Indexed: 05/11/2023]
Abstract
Despite tremendous advancements in high throughput sequencing, the vast majority of tree genomes, and in particular those of forest trees, remain elusive. Although primary databases store genetic resources for just over 2,000 forest tree species, these are largely focused on sequence storage, basic genome assemblies, and functional assignment through existing pipelines. The tree databases reviewed here serve as secondary repositories for community data. They vary in their focal species, the data they curate, and the analytics provided, but they are united in moving toward a goal of centralizing both data access and analysis. They provide frameworks to view and update annotations for complex genomes, interrogate systems level expression profiles, curate data for comparative genomics, and perform real-time analysis with genotype and phenotype data. The organism databases of today are no longer simply catalogs or containers of genetic information. These repositories represent integrated cyberinfrastructure that supports cross-site queries and analysis in web-based environments. These resources are striving to integrate across diverse experimental designs, sequence types, and related measures through ontologies, community standards, and web services. Efficient, simple, and robust platforms that enhance the data generated by the research community contribute to improving forest health and productivity.
Affiliation(s)
- Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Margaret A. Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Nathaniel R. Street
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA, United States
| | - Emily Grau
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Nic Herndon
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Sean Buehler
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Taylor Falk
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Risharde Ramnath
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Peter Richter
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Lang Sun
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Bradford Condon
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Abdullah Almsaeed
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Ming Chen
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Chanaka Mannapperuma
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA, United States
| | - Stephen Ficklin
- Department of Horticulture, Washington State University, Pullman, WA, United States
|
12
|
Cai W, Du X, Xu J. A Personalized QoS Prediction Method for Web Services via Blockchain-Based Matrix Factorization. Sensors (Basel) 2019; 19:s19122749. [PMID: 31248105 PMCID: PMC6631161 DOI: 10.3390/s19122749] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/05/2019] [Revised: 06/12/2019] [Accepted: 06/17/2019] [Indexed: 11/16/2022]
Abstract
Personalized quality of service (QoS) prediction plays an important role in helping users build high-quality service-oriented systems. To obtain accurate prediction results, many approaches have been investigated in recent years. However, these approaches do not fully address untrustworthy QoS values submitted by unreliable users, leading to inaccurate predictions. To address this issue, inspired by blockchain with distributed ledger technology, distributed consensus mechanisms, encryption algorithms, etc., we propose a personalized QoS prediction method for web services that we call blockchain-based matrix factorization (BMF). We develop a user verification approach based on homomorphic hash, and use the Byzantine agreement to remove unreliable users. Then, matrix factorization is employed to improve the accuracy of predictions and we evaluate the proposed BMF on a real-world web services dataset. Experimental results show that the proposed method significantly outperforms existing approaches, making it much more effective than traditional techniques.
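The matrix factorization step mentioned in this abstract can be illustrated with a minimal sketch. This is plain matrix factorization trained by stochastic gradient descent on a sparse user-by-service QoS matrix; it is not the authors' BMF method, and it omits the homomorphic-hash verification and Byzantine agreement steps entirely. The function names, hyperparameters, and the small `observed` data set are all hypothetical.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def rmse(observed, U, S):
    """Root-mean-square error over the observed (user, service) entries only."""
    errs = [(r - dot(U[u], S[s])) ** 2 for (u, s), r in observed.items()]
    return (sum(errs) / len(errs)) ** 0.5


def factorize(observed, n_users, n_services, rank=2, lr=0.05, reg=0.01,
              epochs=1000, seed=0):
    """Fit latent factors U (users) and S (services) by SGD.

    observed maps (user index, service index) to a measured QoS value.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    S = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_services)]
    for _ in range(epochs):
        for (u, s), r in observed.items():
            err = r - dot(U[u], S[s])
            for k in range(rank):
                uk, sk = U[u][k], S[s][k]
                # Gradient step on squared error with L2 regularization.
                U[u][k] += lr * (err * sk - reg * uk)
                S[s][k] += lr * (err * uk - reg * sk)
    return U, S


# Hypothetical response times (seconds) for 4 users and 3 services;
# pairs that are absent were never measured.
observed = {
    (0, 0): 0.3, (0, 1): 1.2,
    (1, 0): 0.4, (1, 2): 2.0,
    (2, 1): 1.1, (2, 2): 1.9,
    (3, 0): 0.5, (3, 2): 2.1,
}

if __name__ == "__main__":
    U, S = factorize(observed, n_users=4, n_services=3)
    print("training RMSE:", rmse(observed, U, S))
    # Predict an unobserved pair, e.g. user 0 on service 2.
    print("predicted QoS for user 0, service 2:", dot(U[0], S[2]))
```

An unobserved entry is predicted as the inner product of the corresponding user and service factor vectors. In a scheme like the one described, untrusted users would be filtered out before the `observed` entries are assembled.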
Affiliation(s)
- Weihong Cai
- Department of Computer Science, Shantou University, Shantou 515063, Guangdong, China.
- Key Laboratory of Intelligent Manufacturing Technology (Shantou University), Ministry of Education, Shantou 515063, Guangdong, China.
| | - Xin Du
- Department of Computer Science, Shantou University, Shantou 515063, Guangdong, China.
- Key Laboratory of Intelligent Manufacturing Technology (Shantou University), Ministry of Education, Shantou 515063, Guangdong, China.
| | - Jianlong Xu
- Department of Computer Science, Shantou University, Shantou 515063, Guangdong, China.
- Key Laboratory of Intelligent Manufacturing Technology (Shantou University), Ministry of Education, Shantou 515063, Guangdong, China.
|
13
|
Madeira F, Madhusoodanan N, Lee J, Tivey ARN, Lopez R. Using EMBL-EBI Services via Web Interface and Programmatically via Web Services. Curr Protoc Bioinformatics 2019; 66:e74. [PMID: 31039604 DOI: 10.1002/cpbi.74] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/08/2022]
Abstract
The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of core databases and analysis tools that are of key importance in bioinformatics. As well as providing web interfaces to these resources, web services are available using REST and SOAP protocols that enable programmatic access and allow their integration into other applications and analytical workflows and pipelines. This article describes the various options available to researchers and bioinformaticians who would like to use our resources via the web interface employing RESTful web service clients provided in Perl, Python, and Java, or would like to use Docker containers to integrate the resources into analysis pipelines and workflows. © 2019 by John Wiley & Sons, Inc.
Affiliation(s)
- Fábio Madeira
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Nandana Madhusoodanan
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Joon Lee
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Adrian R N Tivey
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Rodrigo Lopez
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
|
14
|
Gil-de-la-Fuente A, Godzien J, Saugar S, Garcia-Carmona R, Badran H, Wishart DS, Barbas C, Otero A. CEU Mass Mediator 3.0: A Metabolite Annotation Tool. J Proteome Res 2018; 18:797-802. [PMID: 30574788 DOI: 10.1021/acs.jproteome.8b00720] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Indexed: 12/15/2022]
Abstract
CEU Mass Mediator (CMM, http://ceumass.eps.uspceu.es) is an online tool that has evolved from a simple interface to query different metabolomic databases (CMM 1.0) to a tool that unifies the compounds from these databases and, using an expert system with knowledge about the experimental setup and the compounds' properties, filters and scores the query results (CMM 2.0). Since this last major revision, CMM has continued to grow, expanding the knowledge base of its expert system and including new services to support researchers in the metabolite annotation and identification process. The information from external databases has been refreshed, and an in-house library with oxidized lipids not present in other sources has been added. This has increased the number of experimental metabolites to 332,665 and the number of predicted metabolites to 681,198. Furthermore, new taxonomy and ontology metadata have been included. CMM has expanded its functionalities with a service for the annotation of oxidized glycerophosphocholines, a service for spectral comparison from MS2 data, and a spectral quality-assessment service to determine the reliability of a spectrum for compound identification purposes. To facilitate the collaboration and integration of CMM with external tools and metabolomic platforms, a RESTful API has been created, and it has already been integrated into the HMDB (Human Metabolome Database). This paper will present the novel functionalities incorporated into version 3.0 of CMM.
Affiliation(s)
- Alberto Gil-de-la-Fuente
- Department of Information Technology, Escuela Politécnica Superior, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
- Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
| | - Joanna Godzien
- Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
| | - Sergio Saugar
- Department of Information Technology, Escuela Politécnica Superior, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
| | - Rodrigo Garcia-Carmona
- Department of Information Technology, Escuela Politécnica Superior, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
| | - Hasan Badran
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta T6G 2E9, Canada
| | - David S Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta T6G 2E9, Canada
- Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, Alberta T6G 2N8, Canada
| | - Coral Barbas
- Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
| | - Abraham Otero
- Department of Information Technology, Escuela Politécnica Superior, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
- Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Campus Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
|
15
|
Jin W, Kim D. Development of Virtual Resource Based IoT Proxy for Bridging Heterogeneous Web Services in IoT Networks. Sensors (Basel) 2018; 18:E1721. [PMID: 29861453 DOI: 10.3390/s18061721] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/11/2018] [Revised: 05/15/2018] [Accepted: 05/22/2018] [Indexed: 11/16/2022]
Abstract
The Internet of Things is comprised of heterogeneous devices, applications, and platforms using multiple communication technologies to connect to the Internet and provide seamless services ubiquitously. With the requirement of developing Internet of Things products, many protocols, program libraries, frameworks, and standard specifications have been proposed. Therefore, providing a consistent interface to access services from those environments is difficult. Moreover, bridging the existing web services to sensor and actuator networks is also important for providing Internet of Things services in various industry domains. In this paper, an Internet of Things proxy is proposed that is based on virtual resources to bridge heterogeneous web services from the Internet to the Internet of Things network. The proxy enables clients to have transparent access to Internet of Things devices and web services in the network. The proxy is comprised of a server and a client that forward messages between different communication environments using the virtual resources, which include the server for the message sender and the client for the message receiver. We design the proxy for the Open Connectivity Foundation network, where the virtual resources are discovered by the clients as Open Connectivity Foundation resources. The virtual resources represent the resources that expose services on the Internet by web service providers. Although the services are provided by web service providers from the Internet, the client can access services using the consistent communication protocol in the Open Connectivity Foundation network. For discovering the resources to access services, the client also uses the consistent discovery interface to discover the Open Connectivity Foundation devices and virtual resources.
|
16
|
Krissinel E, Uski V, Lebedev A, Winn M, Ballard C. Distributed computing for macromolecular crystallography. Acta Crystallogr D Struct Biol 2018; 74:143-151. [PMID: 29533240 PMCID: PMC5947778 DOI: 10.1107/s2059798317014565] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 07/27/2017] [Accepted: 10/09/2017] [Indexed: 11/21/2022] Open
Abstract
Modern crystallographic computing is characterized by the growing role of automated structure-solution pipelines, which represent complex expert systems utilizing a number of program components, decision makers and databases. They also require considerable computational resources and regular database maintenance, which are increasingly difficult to provide at the level of individual desktop-based CCP4 setups. On the other hand, there is a significant growth in data processed in the field, which brings up the issue of centralized facilities for keeping both the data collected and structure-solution projects. The paradigm of distributed computing and data management offers a convenient approach to tackling these problems, which has become more attractive in recent years owing to the popularity of mobile devices such as tablets and ultra-portable laptops. In this article, an overview is given of developments by CCP4 aimed at bringing distributed crystallographic computations to a wide crystallographic community.
Affiliation(s)
- Evgeny Krissinel
- Scientific Computing Department, STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, England
| | - Ville Uski
- Scientific Computing Department, STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, England
| | - Andrey Lebedev
- Scientific Computing Department, STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, England
| | - Martyn Winn
- Scientific Computing Department, STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, England
| | - Charles Ballard
- Scientific Computing Department, STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, England
|
17
|
Murtazalieva KA, Druzhilovskiy DS, Goel RK, Sastry GN, Poroikov VV. How good are publicly available web services that predict bioactivity profiles for drug repurposing? SAR QSAR Environ Res 2017; 28:843-862. [PMID: 29183230 DOI: 10.1080/1062936x.2017.1399448] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 10/25/2017] [Accepted: 10/29/2017] [Indexed: 06/07/2023]
Abstract
Drug repurposing provides a non-laborious and less expensive way of finding new human medicines. Computational assessment of bioactivity profiles sheds light on the hidden pharmacological potential of launched drugs. Currently, several computational tools that predict multitarget profiles of drug-like compounds are freely available via the Internet. They are based on chemical similarity assessment (ChemProt, SuperPred, SEA, SwissTargetPrediction and TargetHunter) or machine learning methods (ChemProt and PASS). To compare their performance, this study has created two evaluation sets, consisting of (1) 50 well-known repositioned drugs and (2) 12 drugs recently patented for new indications. In the first set, sensitivity values varied from 0.64 (TarPred) to 1.00 (PASS Online) for the initial indications and from 0.64 (TarPred) to 0.98 (PASS Online) for the repurposed indications. In the second set, sensitivity values varied from 0.08 (SuperPred) to 1.00 (PASS Online) for the initial indications and from 0.00 (SuperPred) to 1.00 (PASS Online) for the repurposed indications. Thus, this analysis demonstrated that the performance of machine learning methods surpassed that of chemical similarity assessments, particularly in the case of novel repurposed indications.
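For readers unfamiliar with the metric, sensitivity is the fraction of known positives that a tool recovers: true positives divided by the sum of true positives and false negatives. The sketch below computes it; the counts in the usage lines are hypothetical and chosen only so that the ratios match values quoted in the abstract, which does not report the underlying counts.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Return TP / (TP + FN), the share of known positives that were predicted."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("sensitivity is undefined when there are no known positives")
    return true_positives / total


if __name__ == "__main__":
    # Hypothetical counts: 32 of 50 known indications predicted, 18 missed.
    print(sensitivity(32, 18))
    # Hypothetical counts: all 12 known indications predicted.
    print(sensitivity(12, 0))
```

A sensitivity of 0.00 therefore means that none of the known indications in the evaluation set were predicted.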
Affiliation(s)
- K A Murtazalieva
- Institute of Biomedical Chemistry, Moscow, Russia
- Pirogov Russian National Research Medical University, Moscow, Russia
| | | | - R K Goel
- Punjabi University, Patiala, Punjab, India
| | - G N Sastry
- CSIR-Indian Institute of Chemical Technology, Hyderabad, India
| | - V V Poroikov
- Institute of Biomedical Chemistry, Moscow, Russia
|
18
|
Bernal-Rusiel JL, Rannou N, Gollub RL, Pieper S, Murphy S, Robertson R, Grant PE, Pienaar R. Reusable Client-Side JavaScript Modules for Immersive Web-Based Real-Time Collaborative Neuroimage Visualization. Front Neuroinform 2017; 11:32. [PMID: 28507515 PMCID: PMC5410600 DOI: 10.3389/fninf.2017.00032] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/03/2017] [Accepted: 04/13/2017] [Indexed: 11/27/2022] Open
Abstract
In this paper we present a web-based software solution to the problem of implementing real-time collaborative neuroimage visualization. In both clinical and research settings, simple and powerful access to imaging technologies across multiple devices is becoming increasingly useful. Prior technical solutions have used a server-side rendering and push-to-client model wherein only the server has the full image dataset. We propose a rich client solution in which each client has all the data and uses the Google Drive Realtime API for state synchronization. We have developed a small set of reusable client-side object-oriented JavaScript modules that make use of the XTK toolkit, a popular open-source JavaScript library also developed by our team, for the in-browser rendering and visualization of brain image volumes. Efficient realtime communication among the remote instances is achieved by using just a small JSON object, comprising a representation of the XTK image renderers' state, as the Google Drive Realtime collaborative data model. The developed open-source JavaScript modules have already been instantiated in a web-app called MedView, a distributed collaborative neuroimage visualization application that is delivered to the users over the web without requiring the installation of any extra software or browser plugin. This responsive application allows multiple physically distant physicians or researchers to cooperate in real time to reach a diagnosis or scientific conclusion. It also serves as a proof of concept for the capabilities of the presented technological solution.
Affiliation(s)
- Jorge L Bernal-Rusiel
- Fetal-Neonatal Neuroimaging and Developmental Science Center, Boston Children's Hospital, Boston, MA, USA
| | | | - Randy L Gollub
- Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Steve Pieper
- Isomics Inc., Cambridge, MA, USA
- Surgical Planning Laboratory, Brigham and Women's Hospital, Boston, MA, USA
| | - Shawn Murphy
- Harvard Medical School, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA, USA
| | - Richard Robertson
- Harvard Medical School, Boston, MA, USA
- Department of Radiology, Boston Children's Hospital, Boston, MA, USA
| | - Patricia E Grant
- Fetal-Neonatal Neuroimaging and Developmental Science Center, Boston Children's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Radiology, Boston Children's Hospital, Boston, MA, USA
| | - Rudolph Pienaar
- Fetal-Neonatal Neuroimaging and Developmental Science Center, Boston Children's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Radiology, Boston Children's Hospital, Boston, MA, USA
|
19
|
Abstract
Web services play a key role in bioinformatics enabling the integration of database access and analysis of algorithms. However, Web service repositories do not usually publish information on the changes made to their registered Web services. Dynamism is directly related to the changes in the repositories (services registered or unregistered) and at service level (annotation changes). Thus, users, software clients or workflow based approaches lack enough relevant information to decide when they should review or re-execute a Web service or workflow to get updated or improved results. The dynamism of the repository could be a measure for workflow developers to re-check service availability and annotation changes in the services of interest to them. This paper presents a review on the most well-known Web service repositories in the life sciences including an analysis of their dynamism. Freshness is introduced in this paper, and has been used as the measure for the dynamism of these repositories.
Affiliation(s)
- David Urdidiales‐Nieto
- Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Malaga, Malaga 29071, Spain
| | - Ismael Navas‐Delgado
- Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Malaga, Malaga 29071, Spain
| | - José F. Aldana‐Montes
- Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Malaga, Malaga 29071, Spain
|
20
|
Fahmi F, Nasution TH, Anggreiny A. Smart cloud system with image processing server in diagnosing brain diseases dedicated for hospitals with limited resources. Technol Health Care 2017; 25:607-610. [PMID: 28128774 DOI: 10.3233/thc-171298] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022]
Abstract
The use of medical imaging in diagnosing brain disease is growing. The challenges are related to the large size of the data and the complexity of the image processing. High-standard hardware and software are required, which only large hospitals can provide. Our purpose was to provide a smart cloud system to help diagnose brain diseases for hospitals with limited infrastructure. The expertise of neurologists was first implemented in a cloud server to conduct automatic diagnosis in real time, using an image processing technique built on the ITK library and a web service. Users upload images through a website, and the result, in this case the size of the tumor, is sent back immediately. A specific image compression technique was developed for this purpose. The smart cloud system was able to measure the area and location of tumors, with an average size of 19.91 ± 2.38 cm² and an average response time of 7.0 ± 0.3 s. Server performance degraded when multiple clients accessed the system simultaneously: response times were 14 ± 0 s (5 parallel clients) and 27 ± 0.2 s (10 parallel clients). The cloud system was successfully developed to process and analyze medical images for diagnosing brain diseases, in this case tumors.
Affiliation(s)
- Fahmi Fahmi
- Department of Electrical Engineering, University of Sumatera Utara (USU), Medan, Indonesia
| | - Tigor H Nasution
- Department of Electrical Engineering, University of Sumatera Utara (USU), Medan, Indonesia
| | - Anggreiny Anggreiny
- Department of Radiology, Faculty of Medicine, University of Sumatera Utara (USU), Medan, Indonesia
|
21
|
Krishnakumar V, Contrino S, Cheng CY, Belyaeva I, Ferlanti ES, Miller JR, Vaughn MW, Micklem G, Town CD, Chan AP. ThaleMine: A Warehouse for Arabidopsis Data Integration and Discovery. Plant Cell Physiol 2017; 58:e4. [PMID: 28013278 DOI: 10.1093/pcp/pcw200] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 09/01/2016] [Accepted: 11/11/2016] [Indexed: 05/08/2023]
Abstract
ThaleMine (https://apps.araport.org/thalemine/) is a comprehensive data warehouse that integrates a wide array of genomic information of the model plant Arabidopsis thaliana. The data collection currently includes the latest structural and functional annotation from the Araport11 update, the Col-0 genome sequence, RNA-seq and array expression, co-expression, protein interactions, homologs, pathways, publications, alleles, germplasm and phenotypes. The data are collected from a wide variety of public resources. Users can browse gene-specific data through Gene Report pages, identify and create gene lists based on experiments or indexed keywords, and run GO enrichment analysis to investigate the biological significance of selected gene sets. Developed by the Arabidopsis Information Portal project (Araport, https://www.araport.org/), ThaleMine uses the InterMine software framework, which builds well-structured data, and provides powerful data query and analysis functionality. The warehoused data can be accessed by users via graphical interfaces, as well as programmatically via web-services. Here we describe recent developments in ThaleMine including new features and extensions, and discuss future improvements. InterMine has been broadly adopted by the model organism research community including nematode, rat, mouse, zebrafish, budding yeast, the modENCODE project, as well as being used for human data. ThaleMine is the first InterMine developed for a plant model. As additional new plant InterMines are developed by the legume and other plant research communities, the potential of cross-organism integrative data analysis will be further enabled.
Affiliation(s)
- Vivek Krishnakumar
- Plant Genomics, J. Craig Venter Institute, Medical Center Dr, Rockville, MD, USA
| | - Sergio Contrino
- Department of Genetics, Cambridge Systems Biology Centre, Tennis Court Road, Cambridge, UK
| | - Chia-Yi Cheng
- Plant Genomics, J. Craig Venter Institute, Medical Center Dr, Rockville, MD, USA
| | - Irina Belyaeva
- Plant Genomics, J. Craig Venter Institute, Medical Center Dr, Rockville, MD, USA
| | - Erik S Ferlanti
- Life Sciences Computing, Texas Advanced Computing Center, 10100 Burnet Rd, Austin, TX, USA
| | - Jason R Miller
- Plant Genomics, J. Craig Venter Institute, Medical Center Dr, Rockville, MD, USA
| | - Matthew W Vaughn
- Life Sciences Computing, Texas Advanced Computing Center, 10100 Burnet Rd, Austin, TX, USA
| | - Gos Micklem
- Department of Genetics, Cambridge Systems Biology Centre, Tennis Court Road, Cambridge, UK
| | - Christopher D Town
- Plant Genomics, J. Craig Venter Institute, Medical Center Dr, Rockville, MD, USA
| | - Agnes P Chan
- Plant Genomics, J. Craig Venter Institute, Medical Center Dr, Rockville, MD, USA
|
22
|
Abstract
Phylogenetic trees are pervasively used to depict evolutionary relationships. Increasingly, researchers need to visualize large trees and compare multiple large trees inferred for the same set of taxa (reflecting uncertainty in the tree inference or genuine discordance among the loci analyzed). Existing tree visualization tools are however not well suited to these tasks. In particular, side-by-side comparison of trees can prove challenging beyond a few dozen taxa. Here, we introduce Phylo.io, a web application to visualize and compare phylogenetic trees side-by-side. Its distinctive features are: highlighting of similarities and differences between two trees, automatic identification of the best matching rooting and leaf order, scalability to large trees, high usability, multiplatform support via standard HTML5 implementation, and possibility to store and share visualizations. The tool can be freely accessed at http://phylo.io and can easily be embedded in other web servers. The code for the associated JavaScript library is available at https://github.com/DessimozLab/phylo-io under an MIT open source license.
Affiliation(s)
- Oscar Robinson
- Department of Computer Science, University College London, London, United Kingdom
- Department of Genetics Evolution and Environment, University College London, London, United Kingdom
| | - David Dylus
- Department of Genetics Evolution and Environment, University College London, London, United Kingdom
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - Christophe Dessimoz
- Department of Computer Science, University College London, London, United Kingdom
- Department of Genetics Evolution and Environment, University College London, London, United Kingdom
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
23
|
Leidenberger S, Käck M, Karlsson B, Kindvall O. The Analysis Portal and the Swedish LifeWatch e-infrastructure for biodiversity research. Biodivers Data J 2016:e7644. [PMID: 27099553 PMCID: PMC4822057 DOI: 10.3897/bdj.4.e7644] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/28/2015] [Accepted: 03/22/2016] [Indexed: 11/20/2022] Open
Abstract
Background: In recent years, a growing number of online portals have become available for ecologists to run advanced models with extensive data sets. Some examples are the Biodiversity Virtual e-Laboratory (BioVel) Portal (https://portal.biovel.eu) for ecological niche modelling and the Mobyle SNAP Workbench (https://snap.hpc.ncsu.edu) for evolutionary and population genetics analysis. The main goal of such portals is to facilitate the running of advanced models through access to large-capacity computers or servers. In this study, we present the Analysis Portal (www.analysisportal.se), which is a part of the Swedish LifeWatch e-infrastructure for biodiversity research that combines a variety of Swedish web services to perform different kinds of data processing. New information: For the first time, the Swedish Analysis Portal for integrated analysis of species occurrence data is described in detail. It was launched in 2013 and today, over 60 million Swedish species observation records can be assessed, visualized and analyzed via the portal. Datasets can be assembled using sophisticated filtering tools, and combined with environmental and climatic data from a wide range of providers. Different validation tools, for example the official Swedish taxon concept database Dyntaxa, ensure high data quality. Results can be downloaded in different formats as maps, tables, diagrams and reports.
Affiliation(s)
| | - Martin Käck
- ArtDatabanken, Swedish Species Information Centre, SLU, Uppsala, Sweden
| | - Björn Karlsson
- ArtDatabanken, Swedish Species Information Centre, SLU, Uppsala, Sweden
| | - Oskar Kindvall
- ArtDatabanken, Swedish Species Information Centre, SLU, Uppsala, Sweden
|
24
|
Abstract
BACKGROUND: To cope with the ever-increasing amount of sequence data generated in the field of genomics, the demand for efficient and fast database searches that drive functional and structural annotation in both large- and small-scale genome projects is on the rise. The tools of the BLAST+ suite are the most widely employed bioinformatic method for these database searches. Recent trends in bioinformatics application development show an increasing number of JavaScript apps that are based on modern frameworks such as Node.js. Until now, there has been no way of running database searches with the BLAST+ suite from a Node.js codebase. RESULTS: We developed blastjs, a Node.js library that wraps the search tools of the BLAST+ suite and thus makes it easy to add significant functionality to any Node.js-based application. CONCLUSION: blastjs is a library that allows the incorporation of BLAST+ functionality into bioinformatics applications based on JavaScript and Node.js. The library was designed to be as user-friendly as possible and therefore requires only a minimal amount of code in the client application. The library is freely available under the MIT license at https://github.com/teammaclean/blastjs.
Affiliation(s)
- Martin Page
- Bioinformatics Group, The Sainsbury Laboratory, Norwich Research Park, Norwich, NR4 7UH, UK.
| | - Dan MacLean
- Bioinformatics Group, The Sainsbury Laboratory, Norwich Research Park, Norwich, NR4 7UH, UK.
| | - Christian Schudoma
- Bioinformatics Group, The Sainsbury Laboratory, Norwich Research Park, Norwich, NR4 7UH, UK.
- Triticeae Genomics Group, The Genome Analysis Centre, Norwich Research Park, Norwich, NR4 7UH, UK.
|
25
|
Merlet B, Paulhe N, Vinson F, Frainay C, Chazalviel M, Poupin N, Gloaguen Y, Giacomoni F, Jourdan F. A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks. Front Mol Biosci 2016; 3:2. [PMID: 26909353 PMCID: PMC4754433 DOI: 10.3389/fmolb.2016.00002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/31/2015] [Accepted: 01/25/2016] [Indexed: 11/13/2022] Open
Abstract
This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP) on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities.
Affiliation(s)
- Benjamin Merlet
- TOXALIM (Research Centre in Food Toxicology), Institut National de la Recherche Agronomique, UMR1331, Université de Toulouse Toulouse, France
| | - Nils Paulhe
- Nutrition Humaine, Plateforme d'Exploration du Métabolisme, Institut National de la Recherche Agronomique, Centre Clermont-Ferrand-Theix, UMR 1019 Saint-Genès-Champanelle, France
| | - Florence Vinson
- TOXALIM (Research Centre in Food Toxicology), Institut National de la Recherche Agronomique, UMR1331, Université de Toulouse Toulouse, France
| | - Clément Frainay
- TOXALIM (Research Centre in Food Toxicology), Institut National de la Recherche Agronomique, UMR1331, Université de Toulouse Toulouse, France
| | - Maxime Chazalviel
- TOXALIM (Research Centre in Food Toxicology), Institut National de la Recherche Agronomique, UMR1331, Université de Toulouse Toulouse, France
| | - Nathalie Poupin
- TOXALIM (Research Centre in Food Toxicology), Institut National de la Recherche Agronomique, UMR1331, Université de Toulouse Toulouse, France
| | - Yoann Gloaguen
- Glasgow Polyomics, College of Medical, Veterinary and Life Sciences, University of Glasgow Glasgow, UK
| | - Franck Giacomoni
- Nutrition Humaine, Plateforme d'Exploration du Métabolisme, Institut National de la Recherche Agronomique, Centre Clermont-Ferrand-Theix, UMR 1019 Saint-Genès-Champanelle, France
| | - Fabien Jourdan
- TOXALIM (Research Centre in Food Toxicology), Institut National de la Recherche Agronomique, UMR1331, Université de Toulouse Toulouse, France
|
26
|
Arif MJ, El Emary IMM, Koutsouris DD. A review on the technologies and services used in the self-management of health and independent living of elderly. Technol Health Care 2015; 22:677-87. [PMID: 25134962 DOI: 10.3233/thc-140851] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 11/15/2022]
Abstract
As the number of aged people is rapidly growing, the need for health and living care of aged people living alone becomes imperative. Telecare systems are able to provide flexible services for older people suffering from chronic diseases, but are largely user group oriented. However, it is common for the elderly to show symptoms of a combination of (chronic) diseases. Moreover, the elderly are totally dependent on a third person as they are unable to perform a number of basic functions at home. They also feel cut off from the social fabric. Old people living in remote places typically use a telephone that dials a social alarm control center, or mobile social alarm systems and monitoring systems. This study examines the existing solutions related to elderly assistance and proposes an advanced solution based on web technology for the self-management of health and independent living of the elderly.
Affiliation(s)
- Mohammad Jafar Arif
- Information Science Department, Faculty of Arts & Humanities King Abdulaziz University, Jeddah, Saudi Arabia
| | - Ibrahiem M M El Emary
- Information Science Department, Faculty of Arts & Humanities King Abdulaziz University, Jeddah, Saudi Arabia
| | - Dimitrios-Dionisios Koutsouris
- Biomedical Engineering Laboratory, School of Electrical and Computers Engineering, National Technical University of Athens, Athens, Greece
|
27
|
Velloso H, Vialle RA, Ortega JM. BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services. BMC Res Notes 2015; 8:206. [PMID: 26032494 PMCID: PMC4467627 DOI: 10.1186/s13104-015-1190-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/27/2014] [Accepted: 05/20/2015] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Bioinformaticians face a range of difficulties in getting locally installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. RESULTS Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS mediates access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running on simple computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requests, and automatically creates a web page that displays the registered applications and clients. CONCLUSIONS Applications registered with Bioinformatics open web services can be accessed from virtually any programming language through web services, or using standard Java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run high-processing demand applications directly from their machines.
Affiliation(s)
- Henrique Velloso
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.
| | - Ricardo A Vialle
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.
| | - J Miguel Ortega
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.
|
28
|
Mathew C, Güntsch A, Obst M, Vicario S, Haines R, Williams AR, de Jong Y, Goble C. A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control. Biodivers Data J 2014:e4221. [PMID: 25535486 PMCID: PMC4267104 DOI: 10.3897/bdj.2.e4221] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 11/04/2014] [Accepted: 12/07/2014] [Indexed: 11/16/2022] Open
Abstract
The compilation and cleaning of data needed for analyses and prediction of species distributions is a time-consuming process requiring a solid understanding of data formats and service APIs provided by biodiversity informatics infrastructures. We designed and implemented a Taverna-based Data Refinement Workflow which integrates taxonomic data retrieval, data cleaning, and data selection into a consistent, standards-based, and effective system hiding the complexity of underlying service infrastructures. The workflow can be freely used both locally and through a web portal which does not require additional software installations by users.
Affiliation(s)
- Cherian Mathew
- Freie Universität Berlin, Botanic Garden and Botanical Museum Berlin-Dahlem, Berlin, Germany
| | - Anton Güntsch
- Freie Universität Berlin, Botanic Garden and Botanical Museum Berlin-Dahlem, Berlin, Germany
| | - Matthias Obst
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Saverio Vicario
- Institute of Biomedical Technologies, National Research Council, Bari, Italy
| | - Robert Haines
- School of Computer Science, University of Manchester, Manchester, United Kingdom
| | - Alan R Williams
- School of Computer Science, University of Manchester, Manchester, United Kingdom
| | - Yde de Jong
- University of Eastern Finland, Joensuu, Finland; University of Amsterdam - Faculty of Science, Amsterdam, Netherlands
| | - Carole Goble
- School of Computer Science, University of Manchester, Manchester, United Kingdom
|
29
|
Ieong PU, Sørensen J, Vemu PL, Wong CW, Demir Ö, Williams NP, Wang J, Crawl D, Swift RV, Malmstrom RD, Altintas I, Amaro RE. Progress towards automated Kepler scientific workflows for computer-aided drug discovery and molecular simulations. Procedia Comput Sci 2014; 29:1745-1755. [PMID: 29399238 DOI: 10.1016/j.procs.2014.05.159] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
Abstract
We describe the development of automated workflows that support computer-aided drug discovery (CADD) and molecular dynamics (MD) simulations and are included as part of the National Biomedical Computational Resource (NBCR). The main workflow components include: file-management tasks, ligand force field parameterization, receptor-ligand molecular dynamics (MD) simulations, job submission and monitoring on relevant high-performance computing (HPC) resources, receptor structural clustering, virtual screening (VS), and statistical analyses of the VS results. The workflows aim to standardize simulation and analysis and promote best practices within the molecular simulation and CADD communities. Each component is developed as a stand-alone workflow, which allows easy integration into larger frameworks built to suit user needs, while remaining intuitive and easy to extend.
Affiliation(s)
- Pek U Ieong
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA.,National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Jesper Sørensen
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Prasantha L Vemu
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Celia W Wong
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Özlem Demir
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Nadya P Williams
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA.,National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Jianwu Wang
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Daniel Crawl
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA.,National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Robert V Swift
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Robert D Malmstrom
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA.,National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Ilkay Altintas
- San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA.,National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
| | - Rommie E Amaro
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA.,National Biomedical Computation Resource, University of California San Diego, 9500 Gilman Drive, MC 0340, La Jolla, CA 92093, USA
|
30
|
van Nas A, Pan C, Ingram-Drake LA, Ghazalpour A, Drake TA, Sobel EM, Papp JC, Lusis AJ. The systems genetics resource: a web application to mine global data for complex disease traits. Front Genet 2013; 4:84. [PMID: 23730305 PMCID: PMC3657633 DOI: 10.3389/fgene.2013.00084] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 12/04/2012] [Accepted: 04/25/2013] [Indexed: 11/13/2022] Open
Abstract
The Systems Genetics Resource (SGR) (http://systems.genetics.ucla.edu) is a new open-access web application and database that contains genotypes and clinical and intermediate phenotypes from both human and mouse studies. The mouse data include studies using crosses between specific inbred strains and studies using the Hybrid Mouse Diversity Panel. SGR is designed to assist researchers studying genes and pathways contributing to complex disease traits, including obesity, diabetes, atherosclerosis, heart failure, osteoporosis, and lipoprotein metabolism. Over the next few years, we hope to add data relevant to deafness, addiction, hepatic steatosis, toxin responses, and vascular injury. The intermediate phenotypes include expression array data for a variety of tissues and cultured cells, metabolite levels, and protein levels. Pre-computed tables of genetic loci controlling intermediate and clinical phenotypes, as well as phenotype correlations, are accessed via a user-friendly web interface. The web site includes detailed protocols for all of the studies. Data from published studies are freely available; unpublished studies have restricted access during their embargo period.
Affiliation(s)
- Atila van Nas
- Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA
|
31
|
Stoltzfus A, Lapp H, Matasci N, Deus H, Sidlauskas B, Zmasek CM, Vaidya G, Pontelli E, Cranston K, Vos R, Webb CO, Harmon LJ, Pirrung M, O'Meara B, Pennell MW, Mirarab S, Rosenberg MS, Balhoff JP, Bik HM, Heath TA, Midford PE, Brown JW, McTavish EJ, Sukumaran J, Westneat M, Alfaro ME, Steele A, Jordan G. Phylotastic! Making tree-of-life knowledge accessible, reusable and convenient. BMC Bioinformatics 2013; 14:158. [PMID: 23668630 PMCID: PMC3669619 DOI: 10.1186/1471-2105-14-158] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 01/18/2013] [Accepted: 04/30/2013] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. RESULTS With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (http://www.phylotastic.org), and a server image. 
CONCLUSIONS Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.
Affiliation(s)
- Arlin Stoltzfus
- Institute for Bioscience and Biotechnology Research (IBBR), Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
| | - Hilmar Lapp
- National Evolutionary Synthesis Center, 2024 W. Main St, Durham, NC, 27705, USA
| | - Naim Matasci
- The iPlant Collaborative and EEB Department, University of Arizona, 1657 E Helen St, Tucson, AZ, 85721, USA
| | - Helena Deus
- Digital Enterprise Research Institute, National University of Ireland, University Road, Galway, Ireland
| | - Brian Sidlauskas
- Department of Fisheries and Wildlife, Oregon State University, 104 Nash Hall, Corvallis, OR, 97331-3803, USA
| | - Christian M Zmasek
- Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Gaurav Vaidya
- Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, 80309-0334, USA
| | - Enrico Pontelli
- Department of Computer Science, New Mexico State University, MSC CS, Box 30001, Las Cruces, NM, 88003, USA
| | - Karen Cranston
- National Evolutionary Synthesis Center, 2024 W. Main St, Durham, NC, 27705, USA
| | - Rutger Vos
- NCB Naturalis, Einsteinweg 2, Leiden, 2333 CC, the Netherlands
| | - Campbell O Webb
- Arnold Arboretum of Harvard University, Boston, MA, 02130, USA
| | - Luke J Harmon
- Institute for Bioinformatics and Evolutionary Studies (IBEST), University of Idaho, PO Box 443051, Moscow, ID, 83844-3051, USA
| | - Megan Pirrung
- University of Colorado Denver Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Brian O'Meara
- Department of Ecology & Evolutionary Biology, 569 Dabney Hall, University of Tennessee, Knoxville, TN, 37996, USA
| | - Matthew W Pennell
- Institute for Bioinformatics and Evolutionary Studies (IBEST), University of Idaho, PO Box 443051, Moscow, ID, 83844-3051, USA
| | - Siavash Mirarab
- Department of Computer Science, University of Texas at Austin, Austin, TX, 78701, USA
| | - Michael S Rosenberg
- Center for Evolutionary Medicine and Informatics, The Biodesign Institute, and School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85287-4501, USA
| | - James P Balhoff
- National Evolutionary Synthesis Center, 2024 W. Main St, Durham, NC, 27705, USA
| | - Holly M Bik
- UC Davis Genome Center, One Shields Ave, Davis, CA, 95618, USA
| | - Tracy A Heath
- Department of Integrative Biology, University of California, Berkeley, CA, 94720-3140, USA
| | - Peter E Midford
- National Evolutionary Synthesis Center, 2024 W. Main St, Durham, NC, 27705, USA
| | - Joseph W Brown
- Institute for Bioinformatics and Evolutionary Studies (IBEST), University of Idaho, PO Box 443051, Moscow, ID, 83844-3051, USA
| | | | - Jeet Sukumaran
- Biology Department, Duke University, Biological Sciences Building, 125 Science Drive, Durham, NC, 27708, USA
| | - Mark Westneat
- Biodiversity Synthesis Center, Field Museum of Natural History, 1400 S Lakeshore Dr, Chicago, IL, 60605, USA
| | - Michael E Alfaro
- Department of Ecology and Evolutionary Biology, South University of California Los Angeles, 621 Charles E. Young Dr, Los Angeles, CA, 90095, USA
| | - Aaron Steele
- U.C. Berkeley Museum of Vertebrate Zoology, University of California, 3101 Valley Life Sciences Building, Berkeley, CA, 94720, USA
| | - Greg Jordan
- Paperpile, 34 Houghton Street, Somerville, MA, 02143, USA
|
32
|
Abstract
The proliferation of mobile devices such as smartphones and tablet computers has recently been extended to include a growing ecosystem of increasingly sophisticated chemistry software packages, commonly known as apps. The capabilities that these apps can offer to the practicing chemist are approaching those of conventional desktop-based software, but apps tend to be focused on a relatively small range of tasks. To overcome this, chemistry apps must be able to seamlessly transfer data to other apps, and through the network to other devices, as well as to other platforms, such as desktops and servers, using documented file formats and protocols whenever possible. This article describes the development and state of the art with regard to chemistry-aware apps that make use of facile data interchange, and some of the scenarios in which these apps can be inserted into a chemical information workflow to increase productivity. A selection of contemporary apps is used to demonstrate their relevance to pharmaceutical research. Mobile apps represent a novel approach for delivery of cheminformatics tools to chemists and other scientists, and indications suggest that mobile devices represent a disruptive technology for drug discovery, as they have been to many other industries.
Affiliation(s)
- Alex M Clark
- Molecular Materials Informatics, 1900 St. Jacques #302, Montreal, Quebec, Canada H3J 2S1
| | - Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay Varina, NC 27526, USA
| | - Antony J Williams
- Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, NC-27587, USA
|
33
|
Miori V, Russo D, Concordia C. Meeting people's needs in a fully interoperable domotic environment. Sensors (Basel) 2012; 12:6802-24. [PMID: 22969322 PMCID: PMC3435952 DOI: 10.3390/s120606802] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 04/08/2012] [Revised: 05/11/2012] [Accepted: 05/22/2012] [Indexed: 11/16/2022]
Abstract
The key idea underlying many Ambient Intelligence (AmI) projects and applications is context awareness, which is based mainly on their capacity to identify users and their locations. The actual computing capacity should remain in the background, in the periphery of our awareness, and should only move to the center if and when necessary. Computing thus becomes 'invisible', as it is embedded in the environment and everyday objects. The research project described herein aims to realize an Ambient Intelligence-based environment able to improve users' quality of life by learning their habits and anticipating their needs. This environment is part of an adaptive, context-aware framework designed to make today's incompatible heterogeneous domotic systems fully interoperable, not only for connecting sensors and actuators, but for providing comprehensive connections of devices to users. The solution is a middleware architecture based on open and widely recognized standards capable of abstracting the peculiarities of underlying heterogeneous technologies and enabling them to co-exist and interwork, without however eliminating their differences. At the highest level of this infrastructure, the Ambient Intelligence framework, integrated with the domotic sensors, can enable the system to recognize any unusual or dangerous situations and anticipate health problems or special user needs in a technological living environment, such as a house or a public space.
Affiliation(s)
- Vittorio Miori
- Institute of Information Science and Technologies A. Faedo (ISTI), CNR-National Research Council of Italy, Pisa, Italy.
|
34
|
Baker E, Johnson KG, Young JR. The future of the past in the present: biodiversity informatics and geological time. Zookeys 2011:397-405. [PMID: 22207819 PMCID: PMC3234446 DOI: 10.3897/zookeys.150.2350] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/11/2011] [Accepted: 11/23/2011] [Indexed: 11/12/2022] Open
Abstract
The biological and palaeontological communities have approached the problem of informatics separately, creating a divide between communities that is both technological and sociological in nature. In this paper we describe one new advance towards solving this problem - expanding the Scratchpads platform to deal with geological time. In creating this system we have attempted to make our work open to existing communities by providing a webservice of geological time data via the GBIF Vocabularies site. We have also ensured that our system can adapt to changes in the definition of geological time intervals and is capable of querying datasets independently of the format of geological age data used.
Affiliation(s)
- Edward Baker
- Department of Entomology, Natural History Museum, London, United Kingdom
|
35
|
Hanani F, Kobayashi T, Jo E, Nakajima S, Oyama H. Public health information and statistics dissemination efforts for Indonesia on the Internet. Online J Public Health Inform 2011; 3:ojphi.v3i2.3602. [PMID: 23569612 PMCID: PMC3615789 DOI: 10.5210/ojphi.v3i2.3602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
OBJECTIVES To elucidate current issues related to health statistics dissemination efforts on the Internet in Indonesia and to propose a new dissemination website as a solution. METHODS A cross-sectional survey was conducted. Sources of statistics were identified using link relationships and Google™ search. The menu used to locate statistics, the mode of presentation and means of access to statistics, and the available statistics were assessed for each site. Assessment results were used to derive a design specification; a prototype system was developed and evaluated with a usability test. RESULTS 49 sources were identified on 18 governmental, 8 international and 5 non-government websites. Of 49 menus identified, 33% used non-intuitive titles and led to inefficient search. 69% of them were on government websites. Of 31 websites, only 39% and 23% used graphs/charts and maps for presentation, respectively. Further, only 32%, 39% and 19% provided query, export and print features, respectively. While >50% of sources reported morbidity, risk factor and service provision statistics, <40% of sources reported health resource and mortality statistics. A statistics portal website was developed using the Joomla!™ content management system. The usability test demonstrated its potential to improve data accessibility. DISCUSSION AND CONCLUSION In this study, the government's efforts to disseminate statistics in Indonesia are supported by non-governmental and international organizations, and their existing information may not be very useful because it is: a) not widely distributed, b) difficult to locate, and c) not effectively communicated. Actions are needed to ensure information usability, and one such action is the development of a statistics portal website.
Affiliation(s)
- Febiana Hanani
- Department of Clinical Information Engineering, Health Services Sciences, School of Public Health, Graduate School of Medicine, The University of Tokyo, Japan
- Takashi Kobayashi
- Department of Clinical Information Engineering, Health Services Sciences, School of Public Health, Graduate School of Medicine, The University of Tokyo, Japan
- Eitetsu Jo
- Department of Clinical Information Engineering, Health Services Sciences, School of Public Health, Graduate School of Medicine, The University of Tokyo, Japan
- Sawako Nakajima
- Department of Clinical Information Engineering, Health Services Sciences, School of Public Health, Graduate School of Medicine, The University of Tokyo, Japan
- Hiroshi Oyama
- Department of Clinical Information Engineering, Health Services Sciences, School of Public Health, Graduate School of Medicine, The University of Tokyo, Japan
|
36
|
Graham J, Jarnevich CS, Simpson A, Newman GJ, Stohlgren TJ. Federated or cached searches: Providing expected performance from multiple invasive species databases. Front Earth Sci 2011; 5:111-119. [PMID: 32215222 PMCID: PMC7088668 DOI: 10.1007/s11707-011-0152-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2011] [Accepted: 03/02/2011] [Indexed: 06/10/2023]
Abstract
Invasive species are a global problem, but the information needed to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options for providing this integration. First, federated searches are being proposed to allow users to search "deep" web documents, such as databases, for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods and show that federated searches will not provide the performance and flexibility required by users, and that a central cache of the data is required to improve performance.
Affiliation(s)
- Jim Graham
- Natural Resource Ecology Laboratory, Colorado State University, Fort Collins, CO 80523-1062 USA
- Catherine S. Jarnevich
- United States Geological Survey Fort Collins Science Center, Fort Collins, CO 80523-1062 USA
- Annie Simpson
- United States Geological Survey Headquarters, Reston, VA 11750 USA
- Gregory J. Newman
- Natural Resource Ecology Laboratory, Colorado State University, Fort Collins, CO 80523-1062 USA
- Thomas J. Stohlgren
- United States Geological Survey Fort Collins Science Center, Fort Collins, CO 80523-1062 USA
|
37
|
Savel TG, Bronstein A, Duck W, Rhodes MB, Lee B, Stinn J, Worthen K. Using secure web services to visualize poison center data for nationwide biosurveillance: a case study. Online J Public Health Inform 2010; 2:ojphi. [PMID: 23569581 DOI: 10.5210/ojphi.v2i1.2920] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/24/2022] Open
Abstract
Objectives Real-time surveillance systems are valuable for timely response to public health emergencies. It has been challenging to leverage existing surveillance systems in state and local communities and, using a centralized architecture, to add new data sources and analytical capacity. Because this centralized model has proven difficult to maintain and enhance, the US Centers for Disease Control and Prevention (CDC) has been examining the ability to use a federated model based on a secure web services architecture, with data stewardship remaining with the data provider. Methods As a case study for this approach, the American Association of Poison Control Centers and the CDC extended an existing data warehouse via a secure web service, and shared aggregate clinical effects and case counts data by geographic region and time period. To visualize these data, CDC developed a web browser-based interface, Quicksilver, which leveraged the Google Maps API and Flot, a JavaScript plotting library. Results Two iterations of the NPDS web service were completed in 12 weeks. The visualization client, Quicksilver, was developed in four months. Discussion This implementation of web services combined with a visualization client represents incremental positive progress in transitioning national data sources like BioSense and NPDS to a federated data exchange model. Conclusion Quicksilver effectively demonstrates how the use of secure web services in conjunction with a lightweight, rapidly deployed visualization client can easily integrate isolated data sources for biosurveillance.
|
38
|
Douglas J, Usländer T, Schimak G, Esteban JF, Denzer R. An Open Distributed Architecture for Sensor Networks for Risk Management. Sensors (Basel) 2008; 8:1755-1773. [PMID: 27879791 PMCID: PMC3663022 DOI: 10.3390/s8031755] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 12/10/2007] [Accepted: 03/12/2008] [Indexed: 11/16/2022]
Abstract
Sensors provide some of the basic input data for risk management of natural and man-made hazards. Here the word ‘sensors’ covers everything from remote sensing satellites, providing invaluable images of large regions, through instruments installed on the Earth's surface, to instruments situated in deep boreholes and on the sea floor, providing highly detailed point-based information from single sites. Data from such sensors is used in all stages of risk management, from hazard, vulnerability and risk assessment in the pre-event phase, through information to provide on-site help during the crisis phase, to data to aid in recovery following an event. Because data from sensors plays such an important part in improving understanding of the causes of risk and consequently in its mitigation, considerable investment has been made in the construction and maintenance of highly sophisticated sensor networks. In spite of the ubiquitous need for information from sensor networks, the use of such data is hampered in many ways. Firstly, information about the presence and capabilities of sensor networks operating in a region is difficult to obtain due to a lack of easily available and usable meta-information. Secondly, once sensor networks have been identified, their data is often difficult to access due to a lack of interoperability between dissemination and acquisition systems. Thirdly, the transfer and processing of information from sensors is limited, again by incompatibilities between systems. Therefore, the current situation leads to a lack of efficiency and limited use of the available data that has an important role to play in risk mitigation. In view of this situation, the European Commission (EC) is funding a number of Integrated Projects within the Sixth Framework Programme concerned with improving the accessibility of data and services for risk management.
Two of these projects, ‘Open Architecture and Spatial Data Infrastructure for Risk Management’ (ORCHESTRA, http://www.eu-orchestra.org/) and ‘Sensors Anywhere’ (SANY, http://sany-ip.eu/), are discussed in this article. These projects have developed an open distributed information technology architecture and have implemented web services for accessing and using data emanating, for example, from sensor networks. These developments are based on existing data and service standards proposed by international organizations. The projects seek to develop the ideals of the EC directive INSPIRE (http://inspire.jrc.it), which was launched in 2001 and whose implementation began this year (2007), into the risk management domain. Thanks to the open nature of the architecture and services being developed within these projects, they can be implemented by any interested party and can be accessed by all potential users. The architecture is based around a service-oriented approach that makes use of Internet-based applications (web services) whose inputs and outputs conform to standards. The benefits of this philosophy are that it is expected to favor the emergence of an operational market for risk management services in Europe, it eliminates the need to replace or radically alter the hundreds of already operational IT systems in Europe (drastically lowering costs for users), and it allows users and stakeholders to achieve interoperability while using the system best suited to their needs, budgets, culture, etc. (i.e. it is flexible).
Affiliation(s)
- John Douglas
- BRGM - ARN/RIS, 3 avenue C. Guillemin, BP 36009, 45060 ORLEANS Cedex 2, France.
- Thomas Usländer
- Information Management, Fraunhofer IITB, Fraunhoferstraße 1, 76131 Karlsruhe, Germany.
- Gerald Schimak
- Information Management, Austrian Research Centers GmbH - ARC, A-2444 Seibersdorf.
- Ralf Denzer
- Environmental Informatics Group, Goebenstraße 40, 66117 Saarbrücken, Germany.
|