1. Knaup P, Bendl R, Eisenmann U, Hastenteufel M, Reichenbach A. Managing the Transition from Tradition to Innovation for the Heidelberg/Heilbronn Medical Informatics Master of Science Program. Appl Clin Inform 2025; 16:305-313. [PMID: 39587019 PMCID: PMC12020540 DOI: 10.1055/a-2482-9071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 09/30/2024] [Indexed: 11/27/2024] Open
Abstract
BACKGROUND To keep pace with the developments in the medical informatics field, the curriculum of the Heidelberg/Heilbronn Medical Informatics Master of Science program is continuously updated. In its latest revision we restructured our master's program to allow more flexibility to accommodate updates and include current topics and to enable students' choices. OBJECTIVES This study aimed to present our new concepts for graduate medical informatics education, share our experiences, and provide insights into the perception of these concepts by advanced students and graduates. METHODS Our new curriculum consists of three core components: Areas of concentration that bundle elective courses in an important domain of medical informatics, a large catalog of elective courses, and introductory/alignment courses for students without a bachelor's degree in medical informatics. We conducted an online survey of graduates and students with at least 75 credits to assess their opinion on the program's effectiveness and attractiveness. RESULTS Mandatory courses include clinical medicine, project management, research, and practical training in biomedical informatics. Five areas of concentration bundle elective courses for 30 credits to provide a solid foundation in an important domain in medical informatics. These are bioinformatics, data science, computer-aided diagnosis and therapy systems, information management, and software engineering in medicine. The catalog of electives offers a total of 67 courses. About 75% of the courses are assigned to more than one area of concentration. Our survey demonstrates that the participants highly appreciate the flexibility of the electives and the opportunity to develop an area of expertise. CONCLUSION Offering a high degree of flexibility to our students has motivated them to join our program and resulted in a high level of student satisfaction. 
By designing the curriculum with areas of concentration and providing an infrastructure that permits courses on emerging topics to be added easily to the curriculum, we were able to meet our students' expectations.
Affiliation(s)
- Petra Knaup
  - Heidelberg University, Institute of Medical Informatics, Heidelberg, Germany
- Rolf Bendl
  - Heilbronn University of Applied Sciences, Heilbronn, Germany
- Urs Eisenmann
  - Heidelberg University, Institute of Medical Informatics, Heidelberg, Germany
2. Forero DA, Bonilla DA, González-Giraldo Y, Patrinos GP. An overview of key online resources for human genomics: a powerful and open toolbox for in silico research. Brief Funct Genomics 2024; 23:754-764. [PMID: 38993146 DOI: 10.1093/bfgp/elae029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 06/19/2024] [Accepted: 06/25/2024] [Indexed: 07/13/2024] Open
Abstract
Recent advances in high-throughput molecular methods have led to an extraordinary volume of genomics data. Simultaneously, the progress in the computational implementation of novel algorithms has facilitated the creation of hundreds of freely available online tools for their advanced analyses. However, a general overview of the most commonly used tools for the in silico analysis of genomics data is still missing. In the current article, we present an overview of commonly used online resources for genomics research, including over 50 tools. This selection will be helpful for scientists with basic or intermediate skills in the in silico analyses of genomics data, such as researchers and students from wet labs seeking to strengthen their computational competencies. In addition, we discuss current needs and future perspectives within this field.
Affiliation(s)
- Diego A Forero
  - School of Health and Sport Sciences, Fundación Universitaria del Área Andina, Bogotá, Colombia
- Diego A Bonilla
  - Research Division, Dynamical Business & Science Society - DBSS International SAS, Bogotá, Colombia
  - Hologenomiks Research Group, Department of Genetics, Physical Anthropology and Animal Physiology, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), Leioa, Spain
- Yeimy González-Giraldo
  - Departamento de Nutrición y Bioquímica, Facultad de Ciencias, Pontificia Universidad Javeriana, Bogotá, Colombia
- George P Patrinos
  - Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Science, University of Patras, Patras, Greece
  - Clinical Bioinformatics Unit, Department of Pathology, School of Medicine and Health Sciences, Erasmus University Medical Center, Rotterdam, The Netherlands
  - Department of Genetics and Genomics, College of Medicine and Health Sciences, United Arab Emirates University, Al-Ain, Abu Dhabi, United Arab Emirates
  - Zayed Center for Health Sciences, United Arab Emirates University, Al-Ain, Abu Dhabi, United Arab Emirates
3. Liu H, Wu X, Wang D, Li Q, Zhang X, Xu L. Unveiling the role of miR-137-3p/miR-296-5p/SERPINA3 signaling in colorectal cancer progression: integrative analysis of gene expression profiles and in vitro studies. BMC Med Genomics 2023; 16:327. [PMID: 38087342 PMCID: PMC10714458 DOI: 10.1186/s12920-023-01763-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Accepted: 12/05/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Colorectal cancer (CRC) is a prevalent malignancy worldwide, with increasing incidence and mortality rates. Although treatment options have improved, CRC remains a leading cause of death due to metastasis. Early intervention can significantly improve patient outcomes, making it crucial to understand the molecular mechanisms underlying CRC metastasis. In this study, we performed bioinformatics analysis to identify potential genes associated with CRC metastasis. METHODS We downloaded and integrated gene expression datasets (GSE89393, GSE100243, and GSE144259) from GEO database. Differential expression analysis was conducted, followed by Gene Ontology (GO) functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. The hub gene SERPINA3 was selected for further in vitro functional studies. Additionally, the role of miR-137-3p/miR-296-5p/ Serpin family A member 3 (SERPINA3) in CRC cell function was investigated using in vitro assays. RESULTS Analysis of the gene expression datasets revealed differentially expressed genes (DEGs) associated with CRC metastasis. GO analysis showed enrichment in biological processes such as blood coagulation regulation and wound healing. Cellular component analysis highlighted extracellular matrix components and secretory granules. Molecular function analysis identified activities such as serine-type endopeptidase inhibition and lipoprotein receptor binding. KEGG analysis revealed involvement in pathways related to complement and coagulation cascades, cholesterol metabolism, and immune responses. The common DEGs among the datasets were further investigated. We identified SERPINA3 as a hub gene associated with CRC metastasis. SERPINA3 exerted enhanced effects on migration, proliferation and epithelial-mesenchymal transition (EMT) and inhibitory effects on caspase-3/-9 activities in HT29 and SW620 cells. 
MiR-137-3p overexpression increased activities of caspase-3/-9, decreased migration and proliferation, and also repressed EMT in HT29 cells; these effects were markedly attenuated by enforced SERPINA3 overexpression. Consistently, enforced SERPINA3 overexpression also largely reversed the miR-296-5p mimics-induced increase in activities of caspase-3/-9 and decrease in migration, proliferation and EMT in HT29 cells. CONCLUSION Through bioinformatics analysis, we identified potential genes associated with CRC metastasis. The functional studies focusing on SERPINA3/miR-137-3p/miR-296-5p further consolidated its role in regulating CRC progression. Our findings provide insights into novel mechanisms underlying CRC metastasis and might contribute to the development of effective treatment strategies. However, the role of SERPINA3/miR-137-3p/miR-296-5p signaling in CRC still requires further investigation.
Affiliation(s)
- Huimin Liu
  - Department of General Surgery, The Second People's Hospital of Lianyungang, Lianyungang, China
- Xingxing Wu
  - Department of Pediatric Surgery, The Second People's Hospital of Lianyungang, Lianyungang, China
- Dandan Wang
  - Department of General Surgery, The Second People's Hospital of Lianyungang, Lianyungang, China
- Quanxi Li
  - Department of General Surgery, The Second People's Hospital of Lianyungang, Lianyungang, China
- Xin Zhang
  - Department of General Surgery, The Second People's Hospital of Lianyungang, Lianyungang, China
- Liang Xu
  - Department of General Surgery, The Second People's Hospital of Lianyungang, Lianyungang, China
4. Forero DA. Genomics of psychiatric disorders: Regional challenges and opportunities. Biomedica: Revista del Instituto Nacional de Salud 2023; 43:5-7. [PMID: 37167458 PMCID: PMC10462422 DOI: 10.7705/biomedica.6996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Indexed: 05/13/2023]
Affiliation(s)
- Diego A Forero
  - Facultad de Ciencias de la Salud y del Deporte, Fundación Universitaria del Área Andina, Bogotá, D.C., Colombia
5. Sinha P, Spicer A, Delucchi KL, McAuley DF, Calfee CS, Churpek MM. Comparison of machine learning clustering algorithms for detecting heterogeneity of treatment effect in acute respiratory distress syndrome: A secondary analysis of three randomised controlled trials. EBioMedicine 2021; 74:103697. [PMID: 34861492 PMCID: PMC8645454 DOI: 10.1016/j.ebiom.2021.103697] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 07/14/2021] [Revised: 10/18/2021] [Accepted: 11/01/2021] [Indexed: 12/30/2022] Open
Abstract
Background Heterogeneity in Acute Respiratory Distress Syndrome (ARDS), as a consequence of its non-specific definition, has led to a multitude of negative randomised controlled trials (RCTs). Investigators have sought to identify heterogeneity of treatment effect (HTE) in RCTs using clustering algorithms. We evaluated the proficiency of several commonly-used machine-learning algorithms to identify clusters where HTE may be detected. Methods Five unsupervised: Latent class analysis (LCA), K-means, partition around medoids, hierarchical, and spectral clustering; and four supervised algorithms: model-based recursive partitioning, Causal Forest (CF), and X-learner with Random Forest (XL-RF) and Bayesian Additive Regression Trees were individually applied to three prior ARDS RCTs. Clinical data and research protein biomarkers were used as partitioning variables, with the latter excluded for secondary analyses. For a clustering schema, HTE was evaluated based on the interaction term of treatment group and cluster with day-90 mortality as the dependent variable. Findings No single algorithm identified clusters with significant HTE in all three trials. LCA, XL-RF, and CF identified HTE most frequently (2/3 RCTs). Important partitioning variables in the unsupervised approaches were consistent across algorithms and RCTs. In supervised models, important partitioning variables varied between algorithms and across RCTs. In algorithms where clusters demonstrated HTE in the same trial, patients frequently interchanged clusters from treatment-benefit to treatment-harm clusters across algorithms. LCA aside, results from all other algorithms were subject to significant alteration in cluster composition and HTE with random seed change. Removing research biomarkers as partitioning variables greatly reduced the chances of detecting HTE across all algorithms. Interpretation Machine-learning algorithms were inconsistent in their abilities to identify clusters with significant HTE. 
Protein biomarkers were essential in identifying clusters with HTE. Investigations using machine-learning approaches to identify clusters to seek HTE require cautious interpretation. Funding NIGMS R35 GM142992 (PS), NHLBI R35 HL140026 (CSC); NIGMS R01 GM123193, Department of Defense W81XWH-21-1-0009, NIA R21 AG068720, NIDA R01 DA051464 (MMC)
Affiliation(s)
- Pratik Sinha
  - Division of Clinical and Translational Research, Division of Critical Care, Department of Anesthesia, Washington University School of Medicine, Saint Louis, MO
- Alexandra Spicer
  - Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin
- Kevin L Delucchi
  - Department of Psychiatry and Behavioral Sciences; University of California, San Francisco; San Francisco, CA
- Daniel F McAuley
  - Wellcome-Wolfson Institute for Experimental Medicine, Queen's University Belfast; Regional Intensive Care Unit, Royal Victoria Hospital, Belfast
- Carolyn S Calfee
  - Department of Medicine, Division of Pulmonary, Critical Care, Allergy and Sleep Medicine; University of California, San Francisco; San Francisco, CA; Department of Anesthesia; University of California, San Francisco; San Francisco, CA
- Matthew M Churpek
  - Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin
6. Community development, implementation, and assessment of a NIBLSE bioinformatics sequence similarity learning resource. PLoS One 2021; 16:e0257404. [PMID: 34506617 PMCID: PMC8432852 DOI: 10.1371/journal.pone.0257404] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/10/2021] [Accepted: 08/31/2021] [Indexed: 11/19/2022] Open
Abstract
As powerful computational tools and 'big data' transform the biological sciences, bioinformatics training is becoming necessary to prepare the next generation of life scientists. Furthermore, because the tools and resources employed in bioinformatics are constantly evolving, bioinformatics learning materials must be continuously improved. In addition, these learning materials need to move beyond today's typical step-by-step guides to promote deeper conceptual understanding by students. One of the goals of the Network for Integrating Bioinformatics into Life Sciences Education (NIBLSE) is to create, curate, disseminate, and assess appropriate open-access bioinformatics learning resources. Here we describe the evolution, integration, and assessment of a learning resource that explores essential concepts of biological sequence similarity. Pre/post student assessment data from diverse life science courses show significant learning gains. These results indicate that the learning resource is a beneficial educational product for the integration of bioinformatics across curricula.
7. Chung SS, Ng JCF, Laddach A, Thomas NSB, Fraternali F. Short loop functional commonality identified in leukaemia proteome highlights crucial protein sub-networks. NAR Genom Bioinform 2021; 3:lqab010. [PMID: 33709075 PMCID: PMC7936661 DOI: 10.1093/nargab/lqab010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2020] [Revised: 12/19/2020] [Accepted: 01/26/2021] [Indexed: 11/13/2022] Open
Abstract
Direct drug targeting of mutated proteins in cancer is not always possible and efficacy can be nullified by compensating protein-protein interactions (PPIs). Here, we establish an in silico pipeline to identify specific PPI sub-networks containing mutated proteins as potential targets, which we apply to mutation data of four different leukaemias. Our method is based on extracting cyclic interactions of a small number of proteins topologically and functionally linked in the Protein-Protein Interaction Network (PPIN), which we call short loop network motifs (SLM). We uncover a new property of PPINs named 'short loop commonality' to measure indirect PPIs occurring via common SLM interactions. This detects 'modules' of PPI networks enriched with annotated biological functions of proteins containing mutation hotspots, exemplified by FLT3 and other receptor tyrosine kinase proteins. We further identify functional dependency or mutual exclusivity of short loop commonality pairs in large-scale cellular CRISPR-Cas9 knockout screening data. Our pipeline provides a new strategy for identifying new therapeutic targets for drug discovery.
Affiliation(s)
- Sun Sook Chung
  - Department of Haematological Medicine, King's College London, London, SE5 9NU, UK
- Joseph C F Ng
  - Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
- Anna Laddach
  - Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
- N Shaun B Thomas
  - Department of Haematological Medicine, King's College London, London, SE5 9NU, UK
- Franca Fraternali
  - Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
8. Davies A, Mueller J, Moulton G. Core competencies for clinical informaticians: A systematic review. Int J Med Inform 2020; 141:104237. [PMID: 32771960 DOI: 10.1016/j.ijmedinf.2020.104237] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/27/2020] [Revised: 07/20/2020] [Accepted: 07/21/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND Building on initial work carried out by the Faculty of Clinical Informatics (FCI) in the UK, the creation of a national competency framework for Clinical Informatics is required for the definition of clinical informaticians' professional attributes and skills. We aimed to systematically review the academic literature relating to competencies, skills and existing course curricula in the clinical and health related informatics domains. METHODS Two independent reviewers searched Web of Science, EMBASE, ERIC, PubMed and CINAHL. Publications were included if they reported details of relevant competencies, skills and existing course curricula. We report findings using the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) statement. RESULTS A total of 82 publications were included. The most frequently used method was surveys (30 %) followed by narrative descriptions (28 %). Most of the publications describe curriculum design (23 %) followed by competency definition (18 %) and skills, qualifications & training (18 %). Core skills surrounding data, information systems and information management appear to be cross-cutting across the various informatics disciplines with Bioinformatics and Pharmacy Informatics expressing the most unique competency requirements. CONCLUSION We identified eight key domains that cut across the different sub-disciplines of health informatics, including data, information management, human factors, project management, research skills/knowledge, leadership and management, systems development and evaluation, and health/healthcare. Some informatics disciplines such as Nursing Informatics appear to be further ahead at achieving widespread competency standardisation. Attempts at standardisation for competencies should be tempered with flexibility to allow for local variation and requirements.
Affiliation(s)
- Alan Davies
  - School of Health Sciences, University of Manchester, Manchester, United Kingdom
- Julia Mueller
  - MRC Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom
- Georgina Moulton
  - School of Health Sciences, University of Manchester, Manchester, United Kingdom; Health Data Research United Kingdom (HDRUK), London, United Kingdom
9. Zhang Y, Qiao S, Lu R, Han N, Liu D, Zhou J. How to balance the bioinformatics data: pseudo-negative sampling. BMC Bioinformatics 2019; 20:695. [PMID: 31874622 PMCID: PMC6929457 DOI: 10.1186/s12859-019-3269-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Imbalanced datasets are commonly encountered in bioinformatics classification problems; that is, the number of negative samples is much larger than that of positive samples. In particular, data imbalance leads us to underestimate the performance on the minority class of positive samples. Therefore, how to balance bioinformatics data becomes a very challenging and difficult problem. RESULTS In this study, we propose a new data sampling approach, called pseudo-negative sampling, which can be effectively applied to handle the case in which negative samples greatly dominate positive samples. Specifically, we design a supervised learning method based on a max-relevance min-redundancy criterion beyond Pearson correlation coefficient (MMPCC), which is used to choose pseudo-negative samples from the negative samples and view them as positive samples. In addition, MMPCC uses an incremental searching technique to select optimal pseudo-negative samples to reduce the computation cost. Consequently, the discovered pseudo-negative samples have strong relevance to positive samples and less redundancy to negative ones. CONCLUSIONS To validate the performance of our method, we conduct experiments based on four UCI datasets and three real bioinformatics datasets. According to the experimental results, we clearly observe that the performance of MMPCC is better than that of other sampling methods in terms of Sensitivity, Specificity, Accuracy and the Matthews Correlation Coefficient. This reveals that the pseudo-negative samples are particularly helpful in solving the imbalanced dataset problem. Moreover, the gain in Sensitivity from the minority samples with pseudo-negative samples grows with the improvement of prediction accuracy on all datasets.
Affiliation(s)
- Yongqing Zhang
  - School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Shaojie Qiao
  - School of Software Engineering, Chengdu University of Information Technology, Chengdu, 610225, China
  - Software Automatic Generation and Intelligent Service Key Laboratory of Sichuan Province, Chengdu University of Information Technology, Chengdu, 610225, China
- Rongzhao Lu
  - School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
- Nan Han
  - School of Management, Chengdu University of Information Technology, Chengdu, 610103, China
- Dingxiang Liu
  - School of Cybersecurity, Chengdu University of Information Technology, Chengdu, 610225, China
- Jiliu Zhou
  - School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
10. Cruz A, Arrais JP, Machado P. Interactive and coordinated visualization approaches for biological data analysis. Brief Bioinform 2019; 20:1513-1523. [PMID: 29590305 DOI: 10.1093/bib/bby019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/30/2017] [Revised: 01/24/2018] [Indexed: 12/11/2022] Open
Abstract
The field of computational biology has become largely dependent on data visualization tools to analyze the increasing quantities of data gathered through the use of new and growing technologies. Aside from the volume, which often results in large amounts of noise and complex relationships with no clear structure, the visualization of biological data sets is hindered by their heterogeneity, as data are obtained from different sources and contain a wide variety of attributes, including spatial and temporal information. This requires visualization approaches that are able to not only represent various data structures simultaneously but also provide exploratory methods that allow the identification of meaningful relationships that would not be perceptible through data analysis algorithms alone. In this article, we present a survey of visualization approaches applied to the analysis of biological data. We focus on graph-based visualizations and tools that use coordinated multiple views to represent high-dimensional multivariate data, in particular time series gene expression, protein-protein interaction networks and biological pathways. We then discuss how these methods can be used to help solve the current challenges surrounding the visualization of complex biological data sets.
Affiliation(s)
- António Cruz
  - Universidade de Coimbra Faculdade de Ciencias e Tecnologia, Departamento de Engenharia Informática
- Joel P Arrais
  - Universidade de Coimbra Faculdade de Ciencias e Tecnologia, Departamento de Engenharia Informática
- Penousal Machado
  - Universidade de Coimbra Faculdade de Ciencias e Tecnologia, Departamento de Engenharia Informática
11. Chen Y, Sa Y, Wang G, Pan X, Zhen Y, Cheng X, Zhang K, Fu L, Wang H, Liu B. The protective effects of Citrullus colocynthis on inhibiting oxidative damage and autophagy-associated cell death in Parkinson's disease. J Taiwan Inst Chem Eng 2019. [DOI: 10.1016/j.jtice.2019.04.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
12. Mangul S, Mosqueiro T, Abdill RJ, Duong D, Mitchell K, Sarwal V, Hill B, Brito J, Littman RJ, Statz B, Lam AKM, Dayama G, Grieneisen L, Martin LS, Flint J, Eskin E, Blekhman R. Challenges and recommendations to improve the installability and archival stability of omics computational tools. PLoS Biol 2019; 17:e3000333. [PMID: 31220077 PMCID: PMC6605654 DOI: 10.1371/journal.pbio.3000333] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Revised: 07/02/2019] [Indexed: 01/07/2023] Open
Abstract
Developing new software tools for analysis of large-scale biological data is a key component of advancing modern biomedical research. Scientific reproduction of published findings requires running computational tools on data generated by such studies, yet little attention is presently allocated to the installability and archival stability of computational software tools. Scientific journals require data and code sharing, but none currently require authors to guarantee the continuing functionality of newly published tools. We have estimated the archival stability of computational biology software tools by performing an empirical analysis of the internet presence for 36,702 omics software resources published from 2005 to 2017. We found that almost 28% of all resources are currently not accessible through uniform resource locators (URLs) published in the paper they first appeared in. Among the 98 software tools selected for our installability test, 51% were deemed "easy to install," and 28% of the tools failed to be installed at all because of problems in the implementation. Moreover, for papers introducing new software, we found that the number of citations significantly increased when authors provided an easy installation process. We propose for incorporation into journal policy several practical solutions for increasing the widespread installability and archival stability of published bioinformatics software.
Collapse
Affiliation(s)
- Serghei Mangul
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
- Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, California, United States of America
| | - Thiago Mosqueiro
- Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, California, United States of America
| | - Richard J. Abdill
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota, United States of America
| | - Dat Duong
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
| | - Keith Mitchell
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
| | - Varuni Sarwal
- Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
| | - Brian Hill
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
| | - Jaqueline Brito
- Institute of Mathematics and Computer Science, University of São Paulo, São Paulo, Brazil
| | - Russell Jared Littman
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
| | - Benjamin Statz
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
| | - Angela Ka-Mei Lam
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
| | - Gargi Dayama
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota, United States of America
| | - Laura Grieneisen
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota, United States of America
| | - Lana S. Martin
- Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, California, United States of America
| | - Jonathan Flint
- Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, California, United States of America
| | - Eleazar Eskin
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Human Genetics, University of California Los Angeles, Los Angeles, California, United States of America
| | - Ran Blekhman
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota, United States of America
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Minnesota, United States of America
|
13
|
Zhan YA, Wray CG, Namburi S, Glantz ST, Laubenbacher R, Chuang JH. Fostering bioinformatics education through skill development of professors: Big Genomic Data Skills Training for Professors. PLoS Comput Biol 2019; 15:e1007026. [PMID: 31194735 PMCID: PMC6563947 DOI: 10.1371/journal.pcbi.1007026] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/16/2022] Open
Abstract
Bioinformatics has become an indispensable part of life science over the past 2 decades. However, bioinformatics education is not well integrated at the undergraduate level, especially in liberal arts colleges and regional universities in the United States. One significant obstacle pointed out by the Network for Integrating Bioinformatics into Life Sciences Education is the lack of faculty in the bioinformatics area. Most current life science professors did not acquire bioinformatics analysis skills during their own training. Consequently, a great number of undergraduate and graduate students do not get the chance to learn bioinformatics or computational biology skills within a structured curriculum during their education. To address this gap, we developed a module-based, week-long short course to train small college and regional university professors with essential bioinformatics skills. The bioinformatics modules were built to be adapted by the professor-trainees afterward and used in their own classes. All the course materials can be accessed at https://github.com/TheJacksonLaboratory/JAXBD2K-ShortCourse.
Affiliation(s)
- Yingqian Ada Zhan
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
| | - Charles Gregory Wray
- Genomic Education, The Jackson Laboratory, Bar Harbor, Maine, United States of America
| | - Sandeep Namburi
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
| | - Spencer T. Glantz
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
| | - Reinhard Laubenbacher
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
- Center for Quantitative Medicine, UConn Health, Farmington, Connecticut, United States of America
| | - Jeffrey H. Chuang
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
- Department of Genetics and Genome Sciences, UConn Health, Farmington, Connecticut, United States of America
|
14
|
A Bioinformatics View of Glycan-Virus Interactions. Viruses 2019; 11:v11040374. [PMID: 31018588 PMCID: PMC6521074 DOI: 10.3390/v11040374] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/05/2019] [Revised: 04/05/2019] [Accepted: 04/15/2019] [Indexed: 02/06/2023] Open
Abstract
Evidence of the mediation of glycan molecules in the interaction between viruses and their hosts is accumulating and is now partially reflected in several online databases. Bioinformatics provides convenient and efficient means of searching, visualizing, comparing, and sometimes predicting, interactions in numerous and diverse molecular biology applications related to the -omics fields. As viromics is gaining momentum, bioinformatics support is increasingly needed. We propose a survey of the current resources for searching, visualizing, comparing, and possibly predicting host–virus interactions that integrate the presence and role of glycans. To the best of our knowledge, we have mapped the specialized and general-purpose databases with the appropriate focus. With an illustration of their potential usage, we also discuss the strong and weak points of the current bioinformatics landscape in the context of understanding viral infection and the immune response to it.
|
15
|
Anton Feenstra K, Abeln S, Westerhuis JA, Brancos dos Santos F, Molenaar D, Teusink B, Hoefsloot HCJ, Heringa J. Training for translation between disciplines: a philosophy for life and data sciences curricula. Bioinformatics 2018; 34:i4-i12. [PMID: 29950011 PMCID: PMC6022589 DOI: 10.1093/bioinformatics/bty233] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/01/2022] Open
Abstract
Motivation: Our society has become data-rich to the extent that research in many areas has become impossible without computational approaches. Educational programmes seem to be lagging behind this development. At the same time, there is a growing need not only for strong data science skills, but foremost for the ability to both translate between tools and methods on the one hand, and application and problems on the other. Results: Here we present our experiences with shaping and running a masters' programme in bioinformatics and systems biology in Amsterdam. From this, we have developed a comprehensive philosophy on how translation in training may be achieved in a dynamic and multidisciplinary research area, which is described here. We furthermore describe two requirements that enable translation, which we have found to be crucial: sufficient depth and focus on multidisciplinary topic areas, coupled with a balanced breadth from adjacent disciplines. Finally, we present concrete suggestions on how this may be implemented in practice, which may be relevant for the effectiveness of life science and data science curricula in general, and of particular interest to those who are in the process of setting up such curricula. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- K Anton Feenstra
- Department of Computer Science, IBIVU Centre for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- AIMMS Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Sanne Abeln
- Department of Computer Science, IBIVU Centre for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Amsterdam Data Science, Amsterdam, The Netherlands
| | - Johan A Westerhuis
- Swammerdam Institute for Life Sciences, Universiteit van Amsterdam, Amsterdam, The Netherlands
| | - Douwe Molenaar
- AIMMS Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Bas Teusink
- AIMMS Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Amsterdam Data Science, Amsterdam, The Netherlands
| | - Huub C J Hoefsloot
- Swammerdam Institute for Life Sciences, Universiteit van Amsterdam, Amsterdam, The Netherlands
| | - Jaap Heringa
- Department of Computer Science, IBIVU Centre for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- AIMMS Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Amsterdam Data Science, Amsterdam, The Netherlands
|
16
|
Abstract
The digital world is generating data at a staggering and still increasing rate. While these "big data" have unlocked novel opportunities to understand public health, they hold still greater potential for research and practice. This review explores several key issues that have arisen around big data. First, we propose a taxonomy of sources of big data to clarify terminology and identify threads common across some subtypes of big data. Next, we consider common public health research and practice uses for big data, including surveillance, hypothesis-generating research, and causal inference, while exploring the role that machine learning may play in each use. We then consider the ethical implications of the big data revolution with particular emphasis on maintaining appropriate care for privacy in a world in which technology is rapidly changing social norms regarding the need for (and even the meaning of) privacy. Finally, we make suggestions regarding structuring teams and training to succeed in working with big data in research and practice.
Affiliation(s)
- Stephen J Mooney
- Harborview Injury Prevention and Research Center, University of Washington, Seattle, Washington 98122, USA
| | - Vikas Pejaver
- Department of Biomedical Informatics and Medical Education and the eScience Institute, University of Washington, Seattle, Washington 98109, USA
|
17
|
Genomic Research Data Generation, Analysis and Sharing – Challenges in the African Setting. Data Science Journal 2017. [DOI: 10.5334/dsj-2017-049] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 02/05/2023] Open
|
18
|
Vamathevan J, Birney E. A Review of Recent Advances in Translational Bioinformatics: Bridges from Biology to Medicine. Yearb Med Inform 2017; 26:178-187. [PMID: 29063562 PMCID: PMC6239226 DOI: 10.15265/iy-2017-017] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 08/18/2017] [Indexed: 11/24/2022] Open
Abstract
Objectives: To highlight and provide insights into key developments in translational bioinformatics between 2014 and 2016. Methods: This review describes some of the most influential bioinformatics papers and resources that have been published between 2014 and 2016 as well as the national genome sequencing initiatives that utilize these resources to routinely embed genomic medicine into healthcare. Also discussed are some applications of the secondary use of patient data followed by a comprehensive view of the open challenges and emergent technologies. Results: Although data generation can be performed routinely, analyses and data integration methods still require active research and standardization to improve streamlining of clinical interpretation. The secondary use of patient data has resulted in the development of novel algorithms and has enabled a refined understanding of cellular and phenotypic mechanisms. New data storage and data sharing approaches are required to enable diverse biomedical communities to contribute to genomic discovery. Conclusion: The translation of genomics data into actionable knowledge for use in healthcare is transforming the clinical landscape in an unprecedented way. Exciting and innovative models that bridge the gap between clinical and academic research are set to open up the field of translational bioinformatics for rapid growth in a digital era.
Affiliation(s)
- J. Vamathevan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - E. Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
|
19
|
Canner JE, McEligot AJ, Pérez ME, Qian L, Zhang X. Enhancing Diversity in Biomedical Data Science. Ethn Dis 2017; 27:107-116. [PMID: 28439180 DOI: 10.18865/ed.27.2.107] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022] Open
Abstract
The gap in educational attainment separating underrepresented minorities from Whites and Asians remains wide. Such a gap has significant impact on workforce diversity and inclusion among cross-cutting Biomedical Data Science (BDS) research, which presents great opportunities as well as major challenges for addressing health disparities. This article provides a brief description of the newly established National Institutes of Health Big Data to Knowledge (BD2K) diversity initiatives at four universities: California State University, Monterey Bay; Fisk University; University of Puerto Rico, Río Piedras Campus; and California State University, Fullerton. We emphasize three main barriers to BDS careers (ie, preparation, exposure, and access to resources) experienced among those pioneer programs and recommendations for possible solutions (ie, early and proactive mentoring, enriched research experience, and data science curriculum development). The diversity disparities in BDS demonstrate the need for educators, researchers, and funding agencies to support evidence-based practices that will lead to the diversification of the BDS workforce.
Affiliation(s)
- Xinzhi Zhang
- National Institutes of Health, National Institute on Minority Health and Health Disparities
|
20
|
Garmire LX, Gliske S, Nguyen QC, Chen JH, Nemati S, Van Horn JD, Moore JH, Shreffler C, Dunn M. The Training of Next Generation Data Scientists in Biomedicine. Pacific Symposium on Biocomputing 2016; 22:640-645. [PMID: 27897014 DOI: 10.1142/9789813207813_0059] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
With the boom in new technologies, biomedical science has transformed into a digitalized, data-intensive science. Massive amounts of data need to be analyzed and interpreted, which demands a complete pipeline to train next-generation data scientists. To meet this need, the trans-institutional Big Data to Knowledge (BD2K) Initiative has been in place since 2014, complementing other NIH institutional efforts. In this report, we give an overview of the BD2K K01 mentored scientist career awards, which have demonstrated early success. We address the specific training needed in representative data science areas in order to prepare the next generation of data scientists in biomedicine.
Affiliation(s)
- Lana X Garmire
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI 96813, USA
Funding: Work partially supported by NIH Big Data to Knowledge (BD2K) awards K01ES025434 (to LXG), K01ES026839 (to SG), K01ES025433 (to QCN), K01ES026837 (to JHC), K01ES025445 (to SN), and U24 ES026465 (to JDV), by the National Institute of Environmental Health Sciences through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative. Work is also partially supported by P20 COBRE GM103457 awarded by NIH/NIGMS, NICHD R01 HD084633, NLM R01 LM012373, and Hawaii Community Foundation Medical Research Grant 14ADVC-64566.
|
21
|
The Topology Prediction of Membrane Proteins: A Web-Based Tutorial. Interdiscip Sci 2016; 10:291-296. [PMID: 27718149 DOI: 10.1007/s12539-016-0190-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/10/2016] [Revised: 09/19/2016] [Accepted: 09/22/2016] [Indexed: 01/15/2023]
Abstract
There is a great need for educational materials that transfer current bioinformatics knowledge to undergraduate students in bioscience departments. This study aims to prepare an example in silico laboratory tutorial on the topology prediction of membrane proteins using bioinformatics tools. The tutorial is prepared for biochemistry lessons in bioscience departments (biology, chemistry, biochemistry, molecular biology and genetics, and faculty of medicine). It is intended for students who have not yet taken a bioinformatics course or who have taken only an introductory one. The tutorial is based on step-by-step explanations with illustrations. It can be applied in lessons under the supervision of an instructor, or it can be used by students as a self-study guide. In the tutorial, membrane-spanning regions and α-helices of membrane proteins were predicted with internet-based bioinformatics tools. The results obtained from these tools show that the algorithms and parameters used affected the accuracy of prediction. The importance of this laboratory tutorial lies in the fact that it provides an introduction to bioinformatics and demonstrates an in silico laboratory application to students in the natural sciences. The presented example material is easily applicable in any department with an internet connection. This study offers students in biochemistry laboratories an alternative educational material in addition to classical laboratory experiments.
|