1
Dall'Alba G, Casa PL, Abreu FPD, Notari DL, de Avila E Silva S. A Survey of Biological Data in a Big Data Perspective. BIG DATA 2022; 10:279-297. [PMID: 35394342 DOI: 10.1089/big.2020.0383] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/14/2023]
Abstract
The amount of available data is continuously growing. This growth has given rise to the concept of big data. The key technologies associated with big data are cloud computing (infrastructure) and Not Only SQL (NoSQL; data storage). In addition, for data analysis, machine learning algorithms such as decision trees, support vector machines, artificial neural networks, and clustering techniques have shown promising results. In a biological context, big data has many applications due to the large number of biological databases available. Some limitations of biological big data are related to the inherent features of these data, such as high degrees of complexity and heterogeneity, since biological systems provide information from the atomic level to interactions between organisms or their environment. Such characteristics make most bioinformatics-based applications difficult to build, configure, and maintain. Although the rise of big data is relatively recent, it has contributed to a better understanding of the underlying mechanisms of life. The main goal of this article is to provide a concise and reliable survey of the application of big data-related technologies in biology. As such, some fundamental concepts of information technology, including storage resources, analysis, and data sharing, are described along with their relation to biological data.
Affiliation(s)
- Gabriel Dall'Alba
  - Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
  - Genome Science and Technology Program, Faculty of Science, The University of British Columbia, Vancouver, Canada
- Pedro Lenz Casa
  - Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
- Fernanda Pessi de Abreu
  - Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
- Daniel Luis Notari
  - Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
- Scheila de Avila E Silva
  - Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
2
Yao D, Zhan X, Zhan X, Kwoh CK, Sun Y. ncRNA2MetS: a manually curated database for non-coding RNAs associated with metabolic syndrome. PeerJ 2019; 7:e7909. [PMID: 31637139 PMCID: PMC6798904 DOI: 10.7717/peerj.7909] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/19/2019] [Accepted: 09/17/2019] [Indexed: 12/19/2022] [Open Access]
Abstract
Metabolic syndrome is a cluster of the most dangerous heart attack risk factors (diabetes and raised fasting plasma glucose, abdominal obesity, high cholesterol and high blood pressure), and has become a major global threat to human health. A number of studies have demonstrated that hundreds of non-coding RNAs, including miRNAs and lncRNAs, are involved in metabolic syndrome-related diseases such as obesity, type 2 diabetes mellitus, and hypertension. However, these research results are scattered across a large body of literature, which hinders their analysis and use. There is an urgent need to integrate the data on relationships between metabolic syndrome and non-coding RNAs into a specialized database. To address this need, we developed a metabolic syndrome-associated non-coding RNA database (ncRNA2MetS) to curate the associations between metabolic syndrome and non-coding RNAs. Currently, ncRNA2MetS contains 1,068 associations between five metabolic syndrome traits and 627 non-coding RNAs (543 miRNAs and 84 lncRNAs) in four species. Each record in the ncRNA2MetS database represents one disease-miRNA (or disease-lncRNA) association and consists of the non-coding RNA category, the miRNA (lncRNA) name, the name of the metabolic syndrome trait, the expression pattern of the non-coding RNA, the validation method, the species involved, a brief description of the association, the referenced article, etc. We also developed a user-friendly website so that users can easily access and download all data. In short, ncRNA2MetS is a complete and high-quality data resource for exploring the role of non-coding RNAs in the pathogenesis of metabolic syndrome and for seeking new treatment options. The website is freely available at http://www.biomed-bigdata.com:50020/index.html
Affiliation(s)
- Dengju Yao
  - School of Software and Microelectronics, Harbin University of Science and Technology, Harbin, Heilongjiang, China
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Xiaojuan Zhan
  - College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, Heilongjiang, China
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang, China
- Xiaorong Zhan
  - Department of Endocrinology and Metabolism, the First Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Yuezhongyi Sun
  - School of Software and Microelectronics, Harbin University of Science and Technology, Harbin, Heilongjiang, China
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang, China
3
Trinh I, Gluscencova OB, Boulianne GL. An in vivo screen for neuronal genes involved in obesity identifies Diacylglycerol kinase as a regulator of insulin secretion. Mol Metab 2018; 19:13-23. [PMID: 30389349 PMCID: PMC6323187 DOI: 10.1016/j.molmet.2018.10.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/20/2018] [Revised: 09/26/2018] [Accepted: 10/15/2018] [Indexed: 12/31/2022] [Open Access]
Abstract
Objective: Obesity is a complex disorder involving many genetic and environmental factors that are required to maintain energy homeostasis. While studies in human populations have led to significant progress in the generation of an obesity gene map and broadened our understanding of the genetic basis of common obesity, there is still a large portion of heritability and etiology that remains unknown. Here, we have used the genetically tractable fruit fly, Drosophila melanogaster, to identify genes/pathways that function in the nervous system to regulate energy balance. Methods: We performed an in vivo RNAi screen in Drosophila neurons and assayed for obese or lean phenotypes by measuring changes in levels of stored fats (in the form of triacylglycerides or TAG). Three rounds of screening were performed to verify the reproducibility and specificity of the adiposity phenotypes. Genes that produced >25% increase in TAG (206 in total) underwent a second round of screening to verify their effect on TAG levels by retesting the same RNAi line to validate the phenotype. All remaining hits were screened a third time by testing the TAG levels of additional RNAi lines against the genes of interest to rule out any off-target effects. Results: We identified 24 genes including 20 genes that have not been previously associated with energy homeostasis. One identified hit, Diacylglycerol kinase (Dgk), has mammalian homologues that have been implicated in genome-wide association studies for metabolic defects. Downregulation of neuronal Dgk levels increases TAG and carbohydrate levels and these phenotypes can be recapitulated by reducing Dgk levels specifically within the insulin-producing cells that secrete Drosophila insulin-like peptides (dILPs). Conversely, overexpression of kinase-dead Dgk, but not wild-type, decreased circulating dILP2 and dILP5 levels resulting in lower insulin signalling activity. Despite having higher circulating dILP levels, Dgk RNAi flies have decreased pathway activity suggesting that they are insulin-resistant. Conclusion: Altogether, we have identified several genes that act within the CNS to regulate energy homeostasis. One of these, Dgk, acts within the insulin-producing cells to regulate the secretion of dILPs and energy homeostasis in Drosophila.
Highlights: RNAi screen in neurons identifies 24 regulators of energy homeostasis. One of the hits, Dgk, affects lipid and carbohydrate homeostasis. Dgk acts within the IPCs to regulate dILP secretion and insulin signalling activity.
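The first-round selection rule described in the abstract above (genes producing a >25% increase in TAG move on to retesting) can be sketched as a simple filter. This is only an illustration under stated assumptions: the gene names, TAG values, control value, and function names below are hypothetical, and the abstract does not say how TAG levels were normalised.

```python
from typing import Dict, List


def percent_change(sample: float, control: float) -> float:
    """Percent change of a sample measurement relative to a control."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return 100.0 * (sample - control) / control


def primary_hits(tag_levels: Dict[str, float],
                 control_tag: float,
                 threshold_pct: float = 25.0) -> List[str]:
    """Return genes whose TAG level exceeds the control by more than threshold_pct.

    The 25% default mirrors the cut-off quoted in the abstract; everything
    else here is an assumption made for illustration.
    """
    return sorted(
        gene for gene, tag in tag_levels.items()
        if percent_change(tag, control_tag) > threshold_pct
    )


# Hypothetical normalised TAG measurements for four RNAi lines.
example_tag = {"geneA": 1.30, "geneB": 1.10, "geneC": 0.70, "geneD": 1.26}
hits = primary_hits(example_tag, control_tag=1.0)
print(hits)
```

In the study, genes passing this kind of cut-off were then retested with the same RNAi line and with additional lines; the sketch covers only the first threshold step.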
Affiliation(s)
- Irene Trinh
  - Department of Molecular Genetics, University of Toronto, Toronto, M5S 1A8, Canada
  - Program in Developmental and Stem Cell Biology, Hospital for Sick Children, Peter Gilgan Center for Research and Learning, 686 Bay Street, Toronto, M5G 0A6, Canada
- Oxana B Gluscencova
  - Program in Developmental and Stem Cell Biology, Hospital for Sick Children, Peter Gilgan Center for Research and Learning, 686 Bay Street, Toronto, M5G 0A6, Canada
- Gabrielle L Boulianne
  - Department of Molecular Genetics, University of Toronto, Toronto, M5S 1A8, Canada
  - Program in Developmental and Stem Cell Biology, Hospital for Sick Children, Peter Gilgan Center for Research and Learning, 686 Bay Street, Toronto, M5G 0A6, Canada
4
Chen Q, Zobel J, Verspoor K. Duplicates, redundancies and inconsistencies in the primary nucleotide databases: a descriptive study. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2017; 2017:baw163. [PMID: 28077566 PMCID: PMC5225397 DOI: 10.1093/database/baw163] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 10/10/2016] [Revised: 11/17/2016] [Accepted: 11/21/2016] [Indexed: 01/22/2023]
Abstract
GenBank, the EMBL European Nucleotide Archive and the DNA DataBank of Japan, known collectively as the International Nucleotide Sequence Database Collaboration or INSDC, are the three most significant nucleotide sequence databases. Their records are derived from laboratory work undertaken by different individuals, by different teams, with a range of technologies and assumptions and over a period of decades. As a consequence, they contain a great many duplicates, redundancies and inconsistencies, but neither the prevalence nor the characteristics of various types of duplicates have been rigorously assessed. Existing duplicate detection methods in bioinformatics only address specific duplicate types, with inconsistent assumptions; and the impact of duplicates in bioinformatics databases has not been carefully assessed, making it difficult to judge the value of such methods. Our goal is to assess the scale, kinds and impact of duplicates in bioinformatics databases, through a retrospective analysis of merged groups in INSDC databases. Our outcomes are threefold: (1) We analyse a benchmark dataset consisting of duplicates manually identified in INSDC—a dataset of 67 888 merged groups with 111 823 duplicate pairs across 21 organisms from INSDC databases – in terms of the prevalence, types and impacts of duplicates. (2) We categorize duplicates at both sequence and annotation level, with supporting quantitative statistics, showing that different organisms have different prevalence of distinct kinds of duplicate. (3) We show that the presence of duplicates has practical impact via a simple case study on duplicates, in terms of GC content and melting temperature. We demonstrate that duplicates not only introduce redundancy, but can lead to inconsistent results for certain tasks. Our findings lead to a better understanding of the problem of duplication in biological databases. 
Database URL: the merged records are available at https://cloudstor.aarnet.edu.au/plus/index.php/s/Xef2fvsebBEAv9w
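To make the case study in the abstract above concrete, here is a minimal sketch of how two near-duplicate records can yield different GC content and melting temperature. The two sequences are hypothetical, and the melting temperature uses the Wallace rule (2 °C per A/T plus 4 °C per G/C), a rough approximation intended for short oligonucleotides; the abstract does not state which formula the authors used.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def wallace_tm(seq: str) -> float:
    """Approximate melting temperature in degrees Celsius (Wallace rule).

    Tm = 2 * (A + T) + 4 * (G + C); only a rough estimate for short oligos.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


# Hypothetical duplicate pair: record_b is record_a with four extra trailing bases.
record_a = "ATGCGCGTATAGC"
record_b = "ATGCGCGTATAGCTTTT"

for name, seq in (("record_a", record_a), ("record_b", record_b)):
    print(f"{name}: GC = {gc_content(seq):.1f}%, Tm = {wallace_tm(seq):.0f} C")
```

Depending on which of the two records an analysis happens to retrieve, the reported values differ, which is the kind of inconsistency the abstract describes.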
Affiliation(s)
- Qingyu Chen
  - Department of Computing and Information Systems, The University of Melbourne, Parkville, VIC, 3010, Australia
- Justin Zobel
  - Department of Computing and Information Systems, The University of Melbourne, Parkville, VIC, 3010, Australia
- Karin Verspoor
  - Department of Computing and Information Systems, The University of Melbourne, Parkville, VIC, 3010, Australia
5
Baumgartner C. The Era of Big Data: From Data-Driven Research to Data-Driven Clinical Care. TRANSLATIONAL BIOINFORMATICS 2016. [DOI: 10.1007/978-94-017-7543-4_1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/14/2022]
6
Pazos Obregón F, Papalardo C, Castro S, Guerberoff G, Cantera R. Putative synaptic genes defined from a Drosophila whole body developmental transcriptome by a machine learning approach. BMC Genomics 2015; 16:694. [PMID: 26370122 PMCID: PMC4570697 DOI: 10.1186/s12864-015-1888-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 02/27/2015] [Accepted: 09/01/2015] [Indexed: 12/02/2022] [Open Access]
Abstract
BACKGROUND: Assembly and function of neuronal synapses require the coordinated expression of an as yet undetermined set of genes. Although roughly a thousand genes are expected to be important for this function in Drosophila melanogaster, only a few hundred of them are known so far. RESULTS: In this work we trained three learning algorithms to predict a "synaptic function" for genes of Drosophila using data from a whole-body developmental transcriptome published by others. Using statistical and biological criteria to analyze and combine the predictions, we obtained a gene catalogue that is highly enriched in genes of relevance for Drosophila synapse assembly and function but still not recognized as such. CONCLUSIONS: The utility of our approach is that it reduces the number of genes to be tested through hypothesis-driven experimentation.
Affiliation(s)
- Flavio Pazos Obregón
  - Departamento de Biología del Neurodesarrollo, Instituto de Investigaciones Biológicas Clemente Estable, Avenida Italia 3318, PC 11600, Montevideo, Uruguay
- Cecilia Papalardo
  - Instituto de Matemática y Estadística "Prof. Ing. Rafael Laguardia", Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
- Sebastián Castro
  - Instituto de Matemática y Estadística "Prof. Ing. Rafael Laguardia", Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
- Gustavo Guerberoff
  - Instituto de Matemática y Estadística "Prof. Ing. Rafael Laguardia", Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
- Rafael Cantera
  - Departamento de Biología del Neurodesarrollo, Instituto de Investigaciones Biológicas Clemente Estable, Avenida Italia 3318, PC 11600, Montevideo, Uruguay
  - Zoology Department, Stockholm University, Stockholm, Sweden
7
Lyne R, Sullivan J, Butano D, Contrino S, Heimbach J, Hu F, Kalderimis A, Lyne M, Smith RN, Štěpán R, Balakrishnan R, Binkley G, Harris T, Karra K, Moxon SAT, Motenko H, Neuhauser S, Ruzicka L, Cherry M, Richardson J, Stein L, Westerfield M, Worthey E, Micklem G. Cross-organism analysis using InterMine. Genesis 2015; 53:547-60. [PMID: 26097192 PMCID: PMC4545681 DOI: 10.1002/dvg.22869] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 04/01/2015] [Revised: 06/17/2015] [Accepted: 06/17/2015] [Indexed: 01/01/2023]
Abstract
InterMine is a data integration warehouse and analysis software system developed for large and complex biological data sets. Designed for integrative analysis, it can be accessed through a user-friendly web interface. For bioinformaticians, extensive web services as well as programming interfaces for most common scripting languages support access to all features. The web interface includes a useful identifier look-up system, and both simple and sophisticated search options. Interactive results tables enable exploration, and data can be filtered, summarized, and browsed. A set of graphical analysis tools provide a rich environment for data exploration including statistical enrichment of sets of genes or other entities. InterMine databases have been developed for the major model organisms, budding yeast, nematode worm, fruit fly, zebrafish, mouse, and rat together with a newly developed human database. Here, we describe how this has facilitated interoperation and development of cross-organism analysis tools and reports. InterMine as a data exploration and analysis tool is also described. All the InterMine-based systems described in this article are resources freely available to the scientific community.
Affiliation(s)
- Rachel Lyne
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Julie Sullivan
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Daniela Butano
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Sergio Contrino
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Josh Heimbach
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Fengyuan Hu
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Alex Kalderimis
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Mike Lyne
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Richard N. Smith
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Radek Štěpán
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Rama Balakrishnan
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Gail Binkley
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Todd Harris
  - Ontario Institute for Cancer Research, Toronto, ON, M5G0A3, Canada
- Kalpana Karra
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Howie Motenko
  - The Jackson Laboratory, Bar Harbor, Maine, 04609, USA
- Mike Cherry
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Lincoln Stein
  - Ontario Institute for Cancer Research, Toronto, ON, M5G0A3, Canada
- Monte Westerfield
  - ZFIN, University of Oregon, Eugene, OR, 97403, USA
  - Institute of Neuroscience, University of Oregon, Eugene, OR, 97403, USA
- Elizabeth Worthey
  - Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, 53226, USA
- Gos Micklem
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
8
Rhee DB, Croken MM, Shieh KR, Sullivan J, Micklem G, Kim K, Golden A. toxoMine: an integrated omics data warehouse for Toxoplasma gondii systems biology research. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav066. [PMID: 26130662 PMCID: PMC4485433 DOI: 10.1093/database/bav066] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 01/13/2015] [Accepted: 06/09/2015] [Indexed: 01/09/2023]
Abstract
Toxoplasma gondii (T. gondii) is an obligate intracellular parasite that must monitor for changes in the host environment and respond accordingly; however, it is still not fully known which genetic or epigenetic factors are involved in regulating virulence traits of T. gondii. There are ongoing efforts to elucidate the mechanisms regulating the stage transition process via the application of high-throughput epigenomics, genomics and proteomics techniques. Given the range of experimental conditions and the typical yield from such high-throughput techniques, a new challenge arises: how to effectively collect, organize and disseminate the generated data for subsequent data analysis. Here, we describe toxoMine, which provides a powerful interface to support sophisticated integrative exploration of high-throughput experimental data and metadata, providing researchers with a more tractable means toward understanding how genetic and/or epigenetic factors play a coordinated role in determining pathogenicity of T. gondii. As a data warehouse, toxoMine allows integration of high-throughput data sets with public T. gondii data. toxoMine is also able to execute complex queries involving multiple data sets with straightforward user interaction. Furthermore, toxoMine allows users to define their own parameters during the search process, which gives users near-limitless search and query capabilities. The interoperability feature also allows users to query and examine data available in other InterMine systems, which effectively extends the search scope beyond what is available in toxoMine. toxoMine complements the major community database ToxoDB by providing a data warehouse that enables more extensive integrative studies for T. gondii. Given all these factors, we believe it will become an indispensable resource to the greater infectious disease research community. Database URL: http://toxomine.org
Affiliation(s)
- David B Rhee
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Matthew McKnight Croken
  - Department of Microbiology & Immunology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Kevin R Shieh
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Julie Sullivan
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Gos Micklem
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Kami Kim
  - Department of Medicine, Department of Pathology, Department of Microbiology & Immunology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Aaron Golden
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
  - Department of Mathematical Sciences, Yeshiva University, New York, NY 10033, USA
9
Thodiyil P. Sleeve gastrectomy is also anti-inflammatory, but why? Surg Obes Relat Dis 2014; 10:1128. [PMID: 25443055 DOI: 10.1016/j.soard.2014.05.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2014] [Accepted: 05/27/2014] [Indexed: 10/25/2022]
Affiliation(s)
- Paul Thodiyil
  - New York Methodist Hospital Department of Surgery, Brooklyn, New York
10
Bean DM, Heimbach J, Ficorella L, Micklem G, Oliver SG, Favrin G. esyN: network building, sharing and publishing. PLoS One 2014; 9:e106035. [PMID: 25181461 PMCID: PMC4152123 DOI: 10.1371/journal.pone.0106035] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 05/30/2014] [Accepted: 07/27/2014] [Indexed: 01/18/2023] [Open Access]
Abstract
The construction and analysis of networks is increasingly widespread in biological research. We have developed esyN ("easy networks") as a free and open source tool to facilitate the exchange of biological network models between researchers. esyN acts as a searchable database of user-created networks from any field. We have developed a simple companion web tool that enables users to view and edit networks using data from publicly available databases. Both normal interaction networks (graphs) and Petri nets can be created. In addition to its basic tools, esyN contains a number of logical templates that can be used to create models more easily. The ability to use previously published models as building blocks makes esyN a powerful tool for the construction of models and network graphs. Users are able to save their own projects online and share them either publicly or with a list of collaborators. The latter can be given the ability to edit the network themselves, allowing online collaboration on network construction. esyN is designed to facilitate unrestricted exchange of this increasingly important type of biological information. Ultimately, the aim of esyN is to bring the advantages of Open Source software development to the construction of biological networks.
Affiliation(s)
- Daniel M. Bean
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Joshua Heimbach
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Lorenzo Ficorella
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
  - Dipartimento di Biochimica, Università degli Studi di Pisa, Pisa, Italy
- Gos Micklem
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Stephen G. Oliver
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Giorgio Favrin
  - Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
11
Kalderimis A, Lyne R, Butano D, Contrino S, Lyne M, Heimbach J, Hu F, Smith R, Štěpán R, Sullivan J, Micklem G. InterMine: extensive web services for modern biology. Nucleic Acids Res 2014; 42:W468-72. [PMID: 24753429 PMCID: PMC4086141 DOI: 10.1093/nar/gku301] [Citation(s) in RCA: 80] [Impact Index Per Article: 7.3] [Indexed: 11/23/2022] [Open Access]
Abstract
InterMine (www.intermine.org) is a biological data warehousing system providing extensive automatically generated and configurable RESTful web services that underpin the web interface and can be re-used in many other applications: to find and filter data; export it in a flexible and structured way; to upload, use, manipulate and analyze lists; to provide services for flexible retrieval of sequence segments, and for other statistical and analysis tools. Here we describe these features and discuss how they can be used separately or in combinations to support integrative and comparative analysis.
Affiliation(s)
- Alex Kalderimis
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Rachel Lyne
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Daniela Butano
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Sergio Contrino
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Mike Lyne
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Joshua Heimbach
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Fengyuan Hu
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Richard Smith
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Radek Štěpán
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Julie Sullivan
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Gos Micklem
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
  - Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK