1
Liu M, Zhang F, Lu H, Xue H, Dong X, Li Z, Xu J, Wang W, Wei C. PPanG: a precision pangenome browser enabling nucleotide-level analysis of genomic variations in individual genomes and their graph-based pangenome. BMC Genomics 2024; 25:405. [PMID: 38658835 PMCID: PMC11044437 DOI: 10.1186/s12864-024-10302-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 04/11/2024] [Indexed: 04/26/2024] Open
Abstract
Graph-based pangenomes are gaining popularity over linear pangenomes because they store more comprehensive information about variations. However, traditional linear genome browsers have their own advantages, especially the tremendous resources accumulated historically. With the fast-growing number of individual genomes and their annotations available, demand is rising for a genome browser that visualizes genome annotations for many individuals together with a graph-based pangenome. Here we report PPanG, a precise pangenome browser enabling nucleotide-level comparison of individual genome annotations together with a graph-based pangenome. Nine rice genomes with annotations are provided by default as potential references, and any individual genome can be selected as the reference. Our pangenome browser provides unprecedented insights into genome variations at different levels, from base to gene, and reveals how the structure of a gene can differ among individuals. PPanG can be applied to any species with multiple individual genomes available, and it is available at https://cgm.sjtu.edu.cn/PPanG.
Affiliation(s)
- Mingwei Liu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Fan Zhang
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100081, China
  - College of Agronomy, Anhui Agricultural University, Hefei, 230036, China
- Huimin Lu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Hongzhang Xue
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Xiaorui Dong
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Zhikang Li
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100081, China
  - College of Agronomy, Anhui Agricultural University, Hefei, 230036, China
- Jianlong Xu
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100081, China
- Wensheng Wang
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100081, China
  - College of Agronomy, Anhui Agricultural University, Hefei, 230036, China
  - National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024, China
- Chaochun Wei
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
2
Łapińska N, Pacławski A, Szlęk J, Mendyk A. SerotoninAI: Serotonergic System Focused, Artificial Intelligence-Based Application for Drug Discovery. J Chem Inf Model 2024; 64:2150-2157. [PMID: 38289046 PMCID: PMC11005036 DOI: 10.1021/acs.jcim.3c01517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 01/02/2024] [Accepted: 01/04/2024] [Indexed: 04/09/2024]
Abstract
SerotoninAI is an innovative web application for scientific purposes focused on the serotonergic system. By leveraging SerotoninAI, researchers can assess the affinity (pKi value) of a molecule to all main serotonin receptors and serotonin transporters based on molecule structure introduced as SMILES. Additionally, the application provides essential insights into critical attributes of potential drugs such as blood-brain barrier penetration and human intestinal absorption. The complexity of the serotonergic system demands advanced tools for accurate predictions, which is a fundamental requirement in drug development. SerotoninAI addresses this need by providing an intuitive user interface that generates predictions of pKi values for the main serotonergic targets. The application is freely available on the Internet at https://serotoninai.streamlit.app/, implemented in Streamlit with all major web browsers supported. Currently, to the best of our knowledge, there is no tool that allows users to access affinity predictions for serotonergic targets without registration or financial obligations. SerotoninAI significantly increases the scope of drug development activities worldwide. The source code of the application is available at https://github.com/nczub/SerotoninAI_streamlit.
Affiliation(s)
- Natalia Łapińska
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland
  - Doctoral School of Medicinal and Health Sciences, Jagiellonian University Medical College, 30-688 Kraków, Poland
- Adam Pacławski
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland
- Jakub Szlęk
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland
- Aleksander Mendyk
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland
3
Mohanty C, Prasad A, Cheng L, Arkin LM, Shields BE, Drolet B, Kendziorski C. SpatialView: an interactive web application for visualization of multiple samples in spatial transcriptomics experiments. Bioinformatics 2024; 40:btae117. [PMID: 38444087 PMCID: PMC10957517 DOI: 10.1093/bioinformatics/btae117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 12/06/2023] [Accepted: 03/04/2024] [Indexed: 03/07/2024] Open
Abstract
MOTIVATION Spatial transcriptomics (ST) experiments provide spatially localized measurements of genome-wide gene expression allowing for an unprecedented opportunity to investigate cellular heterogeneity and organization within a tissue. Statistical and computational frameworks exist that implement robust methods for pre-processing and analyzing data in ST experiments. However, the lack of an interactive suite of tools for visualizing ST data and results currently limits the full potential of ST experiments. RESULTS To fill the gap, we developed SpatialView, an open-source web browser-based interactive application for visualizing data and results from multiple 10x Genomics Visium ST experiments. We anticipate SpatialView will be useful to a broad array of clinical and basic science investigators utilizing ST to study disease. AVAILABILITY AND IMPLEMENTATION SpatialView is available at https://github.com/kendziorski-lab/SpatialView (and https://doi.org/10.5281/zenodo.10223907); a demo application is available at https://www.biostat.wisc.edu/~kendzior/spatialviewdemo/.
Affiliation(s)
- Chitrasen Mohanty
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States
- Aman Prasad
  - Department of Dermatology, University of Pennsylvania, Philadelphia, PA 19104, United States
- Lingxin Cheng
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States
- Lisa M Arkin
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI 53715, United States
- Bridget E Shields
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI 53715, United States
- Beth Drolet
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI 53715, United States
- Christina Kendziorski
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States
4
Perkel JM. No installation required: how WebAssembly is changing scientific computing. Nature 2024; 627:455-456. [PMID: 38467881 DOI: 10.1038/d41586-024-00725-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
5
Ji D, Aboukhalil R, Moshiri N. ViralWasm: a client-side user-friendly web application suite for viral genomics. Bioinformatics 2024; 40:btae018. [PMID: 38200583 PMCID: PMC10809900 DOI: 10.1093/bioinformatics/btae018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 01/09/2024] [Indexed: 01/12/2024] Open
Abstract
MOTIVATION The genomic surveillance of viral pathogens such as SARS-CoV-2 and HIV-1 has been critical to modern epidemiology and public health, but the use of sequence analysis pipelines requires computational expertise, and web-based platforms require sending potentially sensitive raw sequence data to remote servers. RESULTS We introduce ViralWasm, a user-friendly graphical web application suite for viral genomics. All ViralWasm tools utilize WebAssembly to execute the original command line tools client-side directly in the web browser without any user setup, with a cost of just 2-3x slowdown with respect to their command line counterparts. AVAILABILITY AND IMPLEMENTATION The ViralWasm tool suite can be accessed at: https://niema-lab.github.io/ViralWasm.
Affiliation(s)
- Daniel Ji
  - Department of Computer Science & Engineering, UC San Diego, La Jolla, CA 92093, United States
- Niema Moshiri
  - Department of Computer Science & Engineering, UC San Diego, La Jolla, CA 92093, United States
6
Tang D, Chen M, Huang X, Zhang G, Zeng L, Zhang G, Wu S, Wang Y. SRplot: A free online platform for data visualization and graphing. PLoS One 2023; 18:e0294236. [PMID: 37943830 PMCID: PMC10635526 DOI: 10.1371/journal.pone.0294236] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Received: 05/08/2023] [Accepted: 10/27/2023] [Indexed: 11/12/2023] Open
Abstract
Graphics are widely used to summarize complex data in scientific publications. Although there are many tools available for drawing graphics, their use is limited by programming skills, costs, and platform specificities. Here, we present a freely accessible, easy-to-use web server named SRplot that integrates more than a hundred commonly used data visualization and graphing functions. It runs in all web browsers and places no strong requirements on the computing power of users' machines. With a user-friendly graphical interface, users can simply paste the contents of the input file into the text box according to the defined file format. Modification operations can be easily performed, and graphs can be generated in real time. The resulting graphs can be easily downloaded in publication quality in bitmap (PNG or TIFF) or vector (PDF or SVG) format. The website is updated promptly and continuously. Functions in SRplot have been improved, optimized and updated based on feedback and suggestions from users. Graphs prepared with SRplot have been featured in more than five hundred peer-reviewed publications. The SRplot web server is freely available at http://www.bioinformatics.com.cn/SRplot.
Affiliation(s)
- Doudou Tang
  - Department of Respiratory and Critical Care Medicine, the Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- Mingjie Chen
  - Shanghai NewCore Biotechnology, Minhang District, Shanghai, China
- Xinhua Huang
  - Shenzhen Ping’an Financial Technology Consulting Co. Ltd, Pudong New District, Shanghai, China
- Guicheng Zhang
  - Shanghai NewCore Biotechnology, Minhang District, Shanghai, China
- Lin Zeng
  - Shanghai NewCore Biotechnology, Minhang District, Shanghai, China
- Guangsen Zhang
  - Department of Hematology, the Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- Shangjie Wu
  - Department of Respiratory and Critical Care Medicine, the Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- Yewei Wang
  - Department of Hematology, the Second Xiangya Hospital, Central South University, Changsha, Hunan, China
7
Ribera-Altimir J, Llorach-Tó G, Sala-Coromina J, Company JB, Galimany E. Fisheries data management systems in the NW Mediterranean: from data collection to web visualization. Database (Oxford) 2023; 2023:baad067. [PMID: 37864836 PMCID: PMC10590195 DOI: 10.1093/database/baad067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 08/01/2023] [Accepted: 09/29/2023] [Indexed: 10/23/2023]
Abstract
The European Union Data Collection Framework (DCF) states that scientific data-driven assessments are essential to achieve sustainable fisheries. To respond to the DCF call, this study introduces the information systems developed and used by Institut Català de Recerca per a la Governança del Mar (ICATMAR), the Catalan Institute of Research for the Governance of the Seas. The information systems cover data from biological monitoring through curation, processing, analysis, publication and web visualization for bottom trawl fisheries. Over the 4 years of collected data (2019-2022), the sampling program developed a dataset of over 1.1 million sampled individuals accounting for 24.6 tons of catch. The sampling data are ingested into a database through a data input website ensuring data management control and quality. The standardized metrics are automatically calculated and the data are published in the web visualizer, combined with fishing landings and Vessel Monitoring System (VMS) records. As the combination of remote sensing data with fisheries monitoring offers new approaches for ecosystem assessment, the collected fisheries data are also visualized in the web browser in combination with georeferenced seabed habitats from the European Marine Observation and Data Network (EMODnet) and climate and sea conditions from the Copernicus Marine Environment Monitoring Service (CMEMS). Three public web-based products have been developed in the visualizer: geolocated bottom trawl samplings, biomass distribution per port or season and length-frequency charts per species. These information systems aim to fill the gaps in access to high-quality data for fisheries management among the scientific community, administration and civil society, following the Findable, Accessible, Interoperable, Reusable (FAIR) principles and enabling scientific knowledge transfer. Database URL: https://icatmar.github.io/VISAP/ (www.icatmar.cat).
Affiliation(s)
- Jordi Ribera-Altimir
  - Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
  - Institut Català de Recerca per a la Governança del Mar (ICATMAR), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
- Gerard Llorach-Tó
  - Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
  - Institut Català de Recerca per a la Governança del Mar (ICATMAR), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
  - Xarxa Marítima de Catalunya (BlueNetCat), Plaça d’Eusebi Güell 6, 08034 Barcelona, Catalonia, Spain
- Joan Sala-Coromina
  - Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
  - Institut Català de Recerca per a la Governança del Mar (ICATMAR), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
- Joan B Company
  - Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
  - Institut Català de Recerca per a la Governança del Mar (ICATMAR), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
- Eve Galimany
  - Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
  - Institut Català de Recerca per a la Governança del Mar (ICATMAR), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain
8
Mondal RK, Sen D, Arya A, Samanta SK. Developing anti-microbial peptide database version 1 to provide comprehensive and exhaustive resource of manually curated AMPs. Sci Rep 2023; 13:17843. [PMID: 37857659 PMCID: PMC10587344 DOI: 10.1038/s41598-023-45016-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/18/2023] [Accepted: 10/14/2023] [Indexed: 10/21/2023] Open
Abstract
Anti-Microbial Peptide Database version 1 (AMPDB v1) is a meticulously curated resource that aims to address the limitations of existing databases in the field of antimicrobial research. We have utilized the latest technology and put our best efforts into adding all relevant tools to cater to the needs of our users. AMPDB v1 is a derived database, built upon information gathered from the available resources and boasts a significant size of 59,122 entries which are classified into 88 classes. All the information in this resource was curated manually. Sequence alignment and protein feature calculation tools were integrated into the database in the form of web applications, to make them easy to use, quick, and responsive in real-time. We have included multiple types of browsing and searching options to enhance the user experience, from simple text search to a completely customizable advanced search page with intuitive options that let the user combine multiple options together to make a powerful search query. The database is accessible by a web browser at https://bblserver.org.in/ampdb/.
Affiliation(s)
- Rajat Kumar Mondal
  - Biochemistry and Bioinformatics Laboratory, Department of Applied Sciences, Indian Institute of Information Technology Allahabad (IIIT-A), Uttar Pradesh, Devghat, Jhalwa, Prayagraj, 211012, India
- Debarup Sen
  - Persistent Systems Ltd., Pune, Maharashtra, India
- Ankish Arya
  - Biochemistry and Bioinformatics Laboratory, Department of Applied Sciences, Indian Institute of Information Technology Allahabad (IIIT-A), Uttar Pradesh, Devghat, Jhalwa, Prayagraj, 211012, India
- Sintu Kumar Samanta
  - Biochemistry and Bioinformatics Laboratory, Department of Applied Sciences, Indian Institute of Information Technology Allahabad (IIIT-A), Uttar Pradesh, Devghat, Jhalwa, Prayagraj, 211012, India
  - Department of Applied Sciences, Indian Institute of Information Technology Allahabad, Allahabad, 211012, India
9
Turzo SMBA, Seffernick JT, Lyskov S, Lindert S. Predicting ion mobility collision cross sections using projection approximation with ROSIE-PARCS webserver. Brief Bioinform 2023; 24:bbad308. [PMID: 37609950 PMCID: PMC10516336 DOI: 10.1093/bib/bbad308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 07/03/2023] [Accepted: 08/08/2023] [Indexed: 08/24/2023] Open
Abstract
Ion mobility coupled to mass spectrometry informs on the shape and size of protein structures in the form of a collision cross section (CCS_IM). Although there are several computational methods for predicting CCS_IM based on protein structures, including our previously developed projection approximation using rough circular shapes (PARCS), the process usually requires prior experience with the command-line interface. To overcome this challenge, here we present a web application on the Rosetta Online Server that Includes Everyone (ROSIE) webserver to predict CCS_IM from protein structure using projection approximation with PARCS. In this web interface, the user is only required to provide one or more PDB files as input. Results from our case studies suggest that CCS_IM predictions (with ROSIE-PARCS) are highly accurate with an average error of 6.12%. Furthermore, the absolute difference between CCS_IM and CCS_PARCS can help in distinguishing accurate from inaccurate AlphaFold2 protein structure predictions. ROSIE-PARCS is designed with a user-friendly interface, is available publicly and is free to use. The ROSIE-PARCS web interface is supported by all major web browsers and can be accessed via this link (https://rosie.graylab.jhu.edu).
Affiliation(s)
- S M Bargeen Alam Turzo
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH 43210, USA
- Justin T Seffernick
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH 43210, USA
- Sergey Lyskov
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Steffen Lindert
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH 43210, USA
10
Diesh C, Stevens GJ, Xie P, De Jesus Martinez T, Hershberg EA, Leung A, Guo E, Dider S, Zhang J, Bridge C, Hogue G, Duncan A, Morgan M, Flores T, Bimber BN, Haw R, Cain S, Buels RM, Stein LD, Holmes IH. JBrowse 2: a modular genome browser with views of synteny and structural variation. Genome Biol 2023; 24:74. [PMID: 37069644 PMCID: PMC10108523 DOI: 10.1186/s13059-023-02914-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 39.0] [Received: 08/24/2022] [Accepted: 03/20/2023] [Indexed: 04/19/2023] Open
Abstract
We present JBrowse 2, a general-purpose genome annotation browser offering enhanced visualization of complex structural variation and evolutionary relationships. It retains core features of JBrowse while adding new views for synteny, dotplots, breakpoints, gene fusions, and whole-genome overviews. It allows users to share sessions, open multiple genomes, and navigate between views. It can be embedded in a web page, used as a standalone application, or run from Jupyter notebooks or R sessions. These improvements are enabled by a ground-up redesign using modern web technology. We describe application functionality, use cases, performance benchmarks, and implementation notes for web administrators and developers.
Affiliation(s)
- Colin Diesh
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Garrett J Stevens
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Peter Xie
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Elliot A. Hershberg
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Angel Leung
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Emma Guo
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Shihab Dider
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Junjun Zhang
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Caroline Bridge
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Gregory Hogue
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Andrew Duncan
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Matthew Morgan
  - Center for Applied Systems and Software, 224 Milne Computer Center, 1800 SW Campus Way, Oregon State University, Corvallis, OR 97331 USA
- Tia Flores
  - Center for Applied Systems and Software, 224 Milne Computer Center, 1800 SW Campus Way, Oregon State University, Corvallis, OR 97331 USA
- Benjamin N. Bimber
  - Oregon National Primate Research Center, Oregon Health and Science University, Beaverton, OR 97006 USA
- Robin Haw
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Scott Cain
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Robert M. Buels
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
- Lincoln D. Stein
  - Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3 Canada
- Ian H. Holmes
  - Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720 USA
11
Goldfarb DS, Modersitzki F, Karafilidis J, Li-McLeod J. Healthcare utilization, quality of life, and work productivity associated with primary hyperoxaluria: a cross-sectional web-based US survey. Urolithiasis 2023; 51:72. [PMID: 37067624 PMCID: PMC10110695 DOI: 10.1007/s00240-023-01436-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 03/25/2023] [Indexed: 04/18/2023]
Abstract
Primary hyperoxaluria (PH) is a family of ultra-rare, autosomal recessive, metabolic disorders associated with frequent kidney stones, chronic kidney disease and kidney failure, and serious complications due to systemic oxalosis, resulting in significant morbidity. We investigated the burden of PH among affected patients and caregivers. This cross-sectional, web-based survey was used to quantify the burden of PH, in terms of healthcare resource utilization, health-related quality of life, and work productivity and activity impairment among adults (≥ 18 years) with PH and caregivers of children (≤ 17 years) with PH in the US. Among the 20 respondents, there were 7 adults with PH and 13 caregivers of children with PH. Adherence to hyperhydration was noted as the most, or one of the most, difficult aspects of PH by 56% of respondents. Most patients (95%) had experienced painful kidney stone events, one-third had visited the emergency room, and 29% were hospitalized for complications due to PH. Of the 24% of patients on dialysis, all found the procedure burdensome. Adult patients' quality of life was negatively affected across several domains. Most respondents (81%) reported that PH had a negative effect on their finances. Employed adult patients and caregivers, and children with PH, had moderate impairment in work productivity, school attendance, and activity. Anxiety about future PH-related sequelae was moderate to high. These findings highlight the need for improvements in PH medical management. A plain language summary is available in the supplementary information.
Affiliation(s)
- David S Goldfarb
  - New York University Grossman School of Medicine, New York, NY, 10016, USA
- Frank Modersitzki
  - New York University Grossman School of Medicine, New York, NY, 10016, USA
- John Karafilidis
  - Dicerna Pharmaceuticals, Inc., a Novo Nordisk Company, Lexington, MA, USA
12
Chen TT, Sun YC, Chu WC, Lien CY. BlueLight: An Open Source DICOM Viewer Using Low-Cost Computation Algorithm Implemented with JavaScript Using Advanced Medical Imaging Visualization. J Digit Imaging 2023; 36:753-763. [PMID: 36538245 PMCID: PMC10039132 DOI: 10.1007/s10278-022-00746-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 11/16/2022] [Accepted: 11/23/2022] [Indexed: 12/24/2022] Open
Abstract
Recently, WebGL has been widely used in numerous web-based medical image viewers to present advanced imaging visualization. However, in medical imaging, computation time and memory consumption limit the use of advanced image renderings, such as volume rendering and multiplanar reformation/reconstruction, on low-cost mobile devices. In this study, we propose a client-side, low-cost computation algorithm for rendering common two- and three-dimensional medical imaging visualizations, implemented in pure JavaScript. In particular, we used cascading style sheet (CSS) transform functions combined with Digital Imaging and Communications in Medicine (DICOM)-related imaging to replace computationally expensive application programming interfaces, reducing computation time and memory consumption when launching medical imaging interpretation in web browsers. The results show that the proposed algorithm significantly reduced the consumption of central and graphics processing units on various web browsers. The proposed algorithm was implemented in the open-source web-based DICOM viewer BlueLight; the results show that it has sufficient rendering performance to display 3D medical images with DICOM-compliant annotations and can also connect to image archives via DICOMweb. Keywords: WebGL, DICOMweb, Multiplanar reconstruction, Volume rendering, DICOM, JavaScript, Zero-footprint.
Affiliation(s)
- Tseng-Tse Chen
- Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan
- Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Ying-Chou Sun
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Deptartment of Radiology, Taipei Veterans General Hospital, Taipei, Taiwan
- Department of Medical Imaging and Radiological Technology, Yuanpei University of Medical Technology, Hsinchu, Taiwan
| | - Woei-Chyn Chu
- Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Chung-Yueh Lien
- Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan.
| |
Collapse
|
13
|
Glen AK, Ma C, Mendoza L, Womack F, Wood EC, Sinha M, Acevedo L, Kvarfordt LG, Peene RC, Liu S, Hoffman AS, Roach JC, Deutsch EW, Ramsey SA, Koslicki D. ARAX: a graph-based modular reasoning tool for translational biomedicine. Bioinformatics 2023; 39:7031241. [PMID: 36752514 PMCID: PMC10027432 DOI: 10.1093/bioinformatics/btad082] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 12/17/2022] [Accepted: 02/07/2023] [Indexed: 04/12/2023] Open
Abstract
MOTIVATION With the rapidly growing volume of knowledge and data in biomedical databases, improved methods for knowledge-graph-based computational reasoning are needed in order to answer translational questions. Previous efforts to solve such challenging computational reasoning problems have contributed tools and approaches, but progress has been hindered by the lack of an expressive analysis workflow language for translational reasoning and by the lack of a reasoning engine, supporting that language, that federates semantically integrated knowledge-bases. RESULTS We introduce ARAX, a new reasoning system for translational biomedicine that provides a web browser user interface and an application programming interface (API). ARAX enables users to encode translational biomedical questions and to integrate knowledge across sources to answer the user's query and facilitate exploration of results. For ARAX, we developed new approaches to query planning, knowledge-gathering, reasoning and result ranking, and we dynamically integrate knowledge providers for answering biomedical questions. To illustrate ARAX's application and utility in specific disease contexts, we present several use-case examples. AVAILABILITY AND IMPLEMENTATION The source code and technical documentation for building the ARAX server-side software and its built-in knowledge database are freely available online (https://github.com/RTXteam/RTX). We provide a hosted ARAX service with a web browser interface at arax.rtx.ai and a web API endpoint at arax.rtx.ai/api/arax/v1.3/ui/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Luis Mendoza
- Institute for Systems Biology, Seattle, WA 98109, USA
- Finn Womack
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- E C Wood
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Meghamala Sinha
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Liliana Acevedo
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Lindsey G Kvarfordt
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Ross C Peene
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Shaopeng Liu
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- Andrew S Hoffman
- Interdisciplinary Hub for Digitalization and Society, Radboud University, Nijmegen 6500GL, The Netherlands
- Jared C Roach
- Institute for Systems Biology, Seattle, WA 98109, USA
14
Ohta T, Shiwa Y. Hybrid Genome Assembly of Short and Long Reads in Galaxy. Methods Mol Biol 2023; 2632:15-30. [PMID: 36781718 DOI: 10.1007/978-1-0716-2996-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Galaxy is a web browser-based data analysis platform that is widely used in biology. Public Galaxy instances allow the analysis of data and interpretation of results without requiring software installation. NanoGalaxy is a public Galaxy instance with tools and workflows for nanopore data analysis. This chapter describes the steps involved in performing genome assembly using short and long reads in NanoGalaxy.
Affiliation(s)
- Tazro Ohta
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Mishima, Shizuoka, Japan
- Yuh Shiwa
- Laboratory of Bioinformatics, Department of Molecular Microbiology, Faculty of Life Sciences, Tokyo University of Agriculture, Setagaya, Tokyo, Japan
15
De Jesus Martinez T, Hershberg EA, Guo E, Stevens GJ, Diesh C, Xie P, Bridge C, Cain S, Haw R, Buels RM, Stein LD, Holmes IH. JBrowse Jupyter: a Python interface to JBrowse 2. Bioinformatics 2023; 39:btad032. [PMID: 36648320 PMCID: PMC9887080 DOI: 10.1093/bioinformatics/btad032] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/17/2022] [Revised: 12/10/2022] [Accepted: 01/16/2023] [Indexed: 01/18/2023] Open
Abstract
MOTIVATION JBrowse Jupyter is a package that aims to close the gap between Python programming and genomic visualization. Web-based genome browsers are routinely used for publishing and inspecting genome annotations. Historically they have been deployed at the end of bioinformatics pipelines, typically decoupled from the analysis itself. However, emerging technologies such as Jupyter notebooks enable a more rapid iterative cycle of development, analysis and visualization. RESULTS We have developed a package that provides a Python interface to JBrowse 2's suite of embeddable components, including the primary Linear Genome View. The package enables users to quickly set up, launch and customize JBrowse views from Jupyter notebooks. In addition, users can share their data via Google's Colab notebooks, providing reproducible interactive views. AVAILABILITY AND IMPLEMENTATION JBrowse Jupyter is released under the Apache License and is available for download on PyPI. Source code and demos are available on GitHub at https://github.com/GMOD/jbrowse-jupyter.
Affiliation(s)
- Elliot A Hershberg
- Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720, USA
- Emma Guo
- Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720, USA
- Garrett J Stevens
- Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720, USA
- Colin Diesh
- Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720, USA
- Peter Xie
- Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720, USA
- Caroline Bridge
- Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3, Canada
- Scott Cain
- Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3, Canada
- Robin Haw
- Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3, Canada
- Robert M Buels
- Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720, USA
- Lincoln D Stein
- Adaptive Oncology, Ontario Institute for Cancer Research, MaRS Centre, 661 University Avenue, Suite 510, Toronto, ON M5G 0A3, Canada
- Ian H Holmes
- Department of Bioengineering, Stanley Hall, University of California, Berkeley, CA 94720, USA
16
Franz M, Lopes CT, Fong D, Kucera M, Cheung M, Siper MC, Huck G, Dong Y, Sumer O, Bader GD. Cytoscape.js 2023 update: a graph theory library for visualization and analysis. Bioinformatics 2023; 39:6988031. [PMID: 36645249 PMCID: PMC9889963 DOI: 10.1093/bioinformatics/btad031] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/02/2022] [Accepted: 01/13/2023] [Indexed: 01/17/2023] Open
Abstract
SUMMARY Cytoscape.js is an open-source JavaScript-based graph library. Its most common use case is as a visualization software component, so it can be used to render interactive graphs in a web browser. It also can be used in a headless manner, useful for graph operations on a server, such as Node.js. This update describes new features and enhancements introduced over many new versions from 2015 to 2022. AVAILABILITY AND IMPLEMENTATION Cytoscape.js is implemented in JavaScript. Documentation, downloads and source code are available at http://js.cytoscape.org. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Max Franz
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Dylan Fong
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Mike Kucera
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Manfred Cheung
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Metin Can Siper
- Department of Molecular and Medical Genetics, School of Medicine, Oregon Health & Science University, Portland, OR, USA
- Gerardo Huck
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Yue Dong
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Onur Sumer
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
17
Wang Q, Chen Z, Wang Y, Qu H. A Survey on ML4VIS: Applying Machine Learning Advances to Data Visualization. IEEE Trans Vis Comput Graph 2022; 28:5134-5153. [PMID: 34437063 DOI: 10.1109/tvcg.2021.3106142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Inspired by the great success of machine learning (ML), researchers have applied ML techniques to visualizations to achieve a better design, development, and evaluation of visualizations. This branch of studies, known as ML4VIS, has gained increasing research attention in recent years. To successfully adapt ML techniques for visualizations, a structured understanding of the integration of ML4VIS is needed. In this article, we systematically survey 88 ML4VIS studies, aiming to answer two motivating questions: "what visualization processes can be assisted by ML?" and "how ML techniques can be used to solve visualization problems?" This survey reveals seven main processes where the employment of ML techniques can benefit visualizations: Data Processing4VIS, Data-VIS Mapping, Insight Communication, Style Imitation, VIS Interaction, VIS Reading, and User Profiling. The seven processes are related to existing visualization theoretical models in an ML4VIS pipeline, aiming to illuminate the role of ML-assisted visualization in general visualizations. Meanwhile, the seven processes are mapped into main learning tasks in ML to align the capabilities of ML with the needs in visualization. Current practices and future opportunities of ML4VIS are discussed in the context of the ML4VIS pipeline and the ML-VIS mapping. While more studies are still needed in the area of ML4VIS, we hope this article can provide a stepping-stone for future exploration. A web-based interactive browser of this survey is available at https://ml4vis.github.io.
18
Nair S, Barrett A, Li D, Raney BJ, Lee BT, Kerpedjiev P, Ramalingam V, Pampari A, Lekschas F, Wang T, Haeussler M, Kundaje A. The dynseq browser track shows context-specific features at nucleotide resolution. Nat Genet 2022; 54:1581-1583. [PMID: 36241719 PMCID: PMC10015500 DOI: 10.1038/s41588-022-01194-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
High-throughput experimental platforms have revolutionized the ability to profile biochemical and functional properties of biological sequences such as DNA, RNA and proteins. By collating several data modalities with customizable tracks rendered using intuitive visualizations, genome browsers enable an interactive and interpretable exploration of diverse types of genome profiling experiments and derived annotations. However, existing genome browser tracks are not well suited for intuitive visualization of high-resolution DNA sequence features such as transcription factor motifs. Typically, motif instances in regulatory DNA sequences are visualized as BED-based annotation tracks, which highlight the genomic coordinates of the motif instances but do not expose their specific sequences. Instead, a genome sequence track needs to be cross-referenced with the BED track to identify sequences of motif hits. Even so, quantitative information about the motif instances such as affinity or conservation as well as differences in base resolution from the consensus motif are not immediately apparent. This makes interpretation slow and challenging. This problem is compounded when analyzing several cellular states and/or molecular readouts (such as ATAC-seq and ChIP–seq) simultaneously, as coordinates of enriched regions (peaks) and the set of active transcription factor motifs vary across cell states.
Affiliation(s)
- Surag Nair
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Daofeng Li
- Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
- Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Anusri Pampari
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Ting Wang
- Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
- Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
19
Wang A, Durrant JD. Open-Source Browser-Based Tools for Structure-Based Computer-Aided Drug Discovery. Molecules 2022; 27:molecules27144623. [PMID: 35889494 PMCID: PMC9319651 DOI: 10.3390/molecules27144623] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/05/2022] [Revised: 07/17/2022] [Accepted: 07/18/2022] [Indexed: 01/27/2023] Open
Abstract
We here outline the importance of open-source, accessible tools for computer-aided drug discovery (CADD). We begin with a discussion of drug discovery in general to provide context for a subsequent discussion of structure-based CADD applied to small-molecule ligand discovery. Next, we identify usability challenges common to many open-source CADD tools. To address these challenges, we propose a browser-based approach to CADD tool deployment in which CADD calculations run in modern web browsers on users’ local computers. The browser app approach eliminates the need for user-initiated download and installation, ensures broad operating system compatibility, enables easy updates, and provides a user-friendly graphical user interface. Unlike server apps—which run calculations “in the cloud” rather than on users’ local computers—browser apps do not require users to upload proprietary information to a third-party (remote) server. They also eliminate the need for the difficult-to-maintain computer infrastructure required to run user-initiated calculations remotely. We conclude by describing some CADD browser apps developed in our lab, which illustrate the utility of this approach. Aside from introducing readers to these specific tools, we are hopeful that this review highlights the need for additional browser-compatible, user-friendly CADD software.
20
Auer F, Mayer S, Kramer F. Data-dependent visualization of biological networks in the web-browser with NDExEdit. PLoS Comput Biol 2022; 18:e1010205. [PMID: 35675360 PMCID: PMC9212158 DOI: 10.1371/journal.pcbi.1010205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Revised: 06/21/2022] [Accepted: 05/15/2022] [Indexed: 12/02/2022] Open
Abstract
Networks are a common methodology used to capture increasingly complex associations between biological entities. They serve as a resource of biological knowledge for bioinformatics analyses, and also comprise the subsequent results. However, the interpretation of biological networks is challenging and requires suitable visualizations dependent on the contained information. The most prominent software in the field for the visualization of biological networks is Cytoscape, a desktop modeling environment also including many features for analysis. A further challenge when working with networks is their distribution. Within a typical collaborative workflow, even slight changes of the network data force one to repeat the visualization step as well. Moreover, even minor adjustments to the visual representation require the networks to be transferred back and forth. Collaboration on the same resources requires specific infrastructure to avoid redundancies, or worse, the corruption of the data. A well-established solution is provided by the NDEx platform where users can upload a network, share it with selected colleagues or make it publicly available. NDExEdit is a web-based application where simple changes can be made to biological networks within the browser, and which does not require installation. With our tool, plain networks can be enhanced easily for further usage in presentations and publications. Since the network data is only stored locally within the web browser, users can edit their private networks without concerns of unintentional publication. The web tool is designed to conform to the Cytoscape Exchange (CX) format as a data model, which is used for the data transmission by both tools, Cytoscape and NDEx. Therefore the modified network can be directly exported to the NDEx platform or saved as a compatible CX file, in addition to standard image formats like PNG and JPEG.
Relations in biological research are often visualized as networks. For instance, if two proteins interact with each other during a certain process, the corresponding network would show two nodes connected by one edge. But the mere fact that the two interact may not be enough. With established software solutions like Cytoscape we can add all the information we have about our nodes and their interaction to our data foundation. Furthermore, we can change the visual appearance of our nodes and their interaction based on this information. For example, if our network contains 20 nodes that all interact with each other, with interaction strengths ranging between 0 and 1, we can illustrate that by making the edges wider for strong interactions and slimmer for weak interactions. Thus, our visualization is enriched with valuable information. As of now these data-dependent modifications can only be made with a desktop client. We introduce NDExEdit, a web-based solution for visualization changes to networks that conform to the CX data format. It allows us to import networks directly from the NDEx platform and apply changes to the visualization, including all types of mappings, one of which was briefly described above.
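The edge-width example in the abstract above (stronger interactions drawn wider) can be sketched as a simple continuous mapping. This is only an illustration in Python, not NDExEdit's or Cytoscape's actual implementation: the function name `edge_width`, the width bounds of 1.0 and 8.0, and the edge dictionaries are all hypothetical.

```python
def edge_width(strength, min_width=1.0, max_width=8.0):
    """Linearly map an interaction strength in [0, 1] to an edge width.

    The width bounds are illustrative assumptions, not values from the paper.
    """
    # Clamp so out-of-range strengths cannot produce widths outside the bounds.
    clamped = max(0.0, min(1.0, strength))
    return min_width + clamped * (max_width - min_width)


# Hypothetical edges carrying an interaction strength between 0 and 1.
edges = [
    {"source": "A", "target": "B", "strength": 0.0},
    {"source": "A", "target": "C", "strength": 0.5},
    {"source": "B", "target": "C", "strength": 1.0},
]

# Attach a data-dependent width to every edge without mutating the input.
styled = [{**edge, "width": edge_width(edge["strength"])} for edge in edges]
```

A linear mapping is the simplest choice here; a tool could equally use discrete bins or a nonlinear scale, and the abstract does not say which of these it applies.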
Affiliation(s)
- Florian Auer
- Department of IT-Infrastructure for Translational Medical Research, Faculty of Applied Computer Science, University of Augsburg, Augsburg, Germany
- Simone Mayer
- Department of IT-Infrastructure for Translational Medical Research, Faculty of Applied Computer Science, University of Augsburg, Augsburg, Germany
- Frank Kramer
- Department of IT-Infrastructure for Translational Medical Research, Faculty of Applied Computer Science, University of Augsburg, Augsburg, Germany
21
Yu D, Yang X, Tang B, Pan YH, Yang J, Duan G, Zhu J, Hao ZQ, Mu H, Dai L, Hu W, Zhang M, Cui Y, Jin T, Li CP, Ma L, Su X, Zhang G, Zhao W, Li H. Coronavirus GenBrowser for monitoring the transmission and evolution of SARS-CoV-2. Brief Bioinform 2022; 23:6511196. [PMID: 35043153 PMCID: PMC8921643 DOI: 10.1093/bib/bbab583] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/09/2021] [Revised: 11/26/2021] [Accepted: 12/20/2021] [Indexed: 12/31/2022] Open
Abstract
Genomic epidemiology is important to study the COVID-19 pandemic, and more than two million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomic sequences were deposited into public databases. However, the exponential increase of sequences invokes unprecedented bioinformatic challenges. Here, we present the Coronavirus GenBrowser (CGB) based on a highly efficient analysis framework and a node-picking rendering strategy. In total, 1,002,739 high-quality genomic sequences with the transmission-related metadata were analyzed and visualized. The size of the core data file is only 12.20 MB, highly efficient for clean data sharing. Quick visualization modules and rich interactive operations are provided to explore the annotated SARS-CoV-2 evolutionary tree. CGB binary nomenclature is proposed to name each internal lineage. The pre-analyzed data can be filtered out according to the user-defined criteria to explore the transmission of SARS-CoV-2. Different evolutionary analyses can also be easily performed, such as the detection of accelerated evolution and ongoing positive selection. Moreover, the 75 genomic spots conserved in SARS-CoV-2 but non-conserved in other coronaviruses were identified, which may indicate the functional elements specifically important for SARS-CoV-2. The CGB was written in Java and JavaScript. It not only enables users who have no programming skills to analyze millions of genomic sequences, but also offers a panoramic vision of the transmission and evolution of SARS-CoV-2.
Affiliation(s)
- Junwei Zhu
- National Genomics Data Center, Beijing Institute of Genomics (China National Center for Bioinformation), Chinese Academy of Sciences, Beijing 100101, China
- Zi-Qian Hao
- National Genomics Data Center, Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100101, China
- Hailong Mu
- National Genomics Data Center, Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China
- Long Dai
- National Genomics Data Center, Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China
- Shanghai Shenyou Biotechnology Co. LTD, Shanghai 201315, China
- Wangjie Hu
- National Genomics Data Center, Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100101, China
- Mochen Zhang
- National Genomics Data Center, Beijing Institute of Genomics (China National Center for Bioinformation), Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100101, China
- Ying Cui
- National Genomics Data Center, Beijing Institute of Genomics (China National Center for Bioinformation), Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100101, China
- Tong Jin
- National Genomics Data Center, Beijing Institute of Genomics (China National Center for Bioinformation), Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100101, China
- Cui-Ping Li
- National Genomics Data Center, Beijing Institute of Genomics (China National Center for Bioinformation), Chinese Academy of Sciences, Beijing 100101, China
- Lina Ma
- National Genomics Data Center, Beijing Institute of Genomics (China National Center for Bioinformation), Chinese Academy of Sciences, Beijing 100101, China
- Xiao Su
- Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai 200031, China
- Guoqing Zhang
- Corresponding authors: Guoqing Zhang, Wenming Zhao and Haipeng Li. Tel: +86-21-54920460
- Wenming Zhao
- Corresponding authors: Guoqing Zhang, Wenming Zhao and Haipeng Li. Tel: +86-21-54920460
- Haipeng Li
- Corresponding authors: Guoqing Zhang, Wenming Zhao and Haipeng Li. Tel: +86-21-54920460
22
Lee BT, Barber GP, Benet-Pagès A, Casper J, Clawson H, Diekhans M, Fischer C, Gonzalez JN, Hinrichs A, Lee C, Muthuraman P, Nassar L, Nguy B, Pereira T, Perez G, Raney B, Rosenbloom K, Schmelter D, Speir M, Wick B, Zweig A, Haussler D, Kuhn R, Haeussler M, Kent W. The UCSC Genome Browser database: 2022 update. Nucleic Acids Res 2022; 50:D1115-D1122. [PMID: 34718705 PMCID: PMC8728131 DOI: 10.1093/nar/gkab959] [Citation(s) in RCA: 125] [Impact Index Per Article: 62.5] [Received: 09/15/2021] [Revised: 09/30/2021] [Accepted: 10/04/2021] [Indexed: 11/25/2022] Open
Abstract
The UCSC Genome Browser, https://genome.ucsc.edu, is a graphical viewer for exploring genome annotations. The website provides integrated tools for visualizing, comparing, analyzing, and sharing both publicly available and user-generated genomic datasets. Data highlights this year include a collection of easily accessible public hub assemblies on new organisms, now featuring BLAT alignment and PCR capabilities, and new and updated clinical tracks (gnomAD, DECIPHER, CADD, REVEL). We introduced a new Track Sets feature and enhanced variant displays to aid in the interpretation of clinical data. We also added a tool to rapidly place new SARS-CoV-2 genomes in a global phylogenetic tree enabling researchers to view the context of emerging mutations in our SARS-CoV-2 Genome Browser. Other new software focuses on usability features, including more informative mouseover displays and new fonts.
Affiliation(s)
- Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Anna Benet-Pagès
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Medical Genetics Center (Medizinisch Genetisches Zentrum), Munich 80335, Germany
- Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Clay Fischer
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Pranav Muthuraman
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Luis R Nassar
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Beagan Nguy
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Tiana Pereira
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Gerardo Perez
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Daniel Schmelter
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Brittney D Wick
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
23
De Silva NH, Bhai J, Chakiachvili M, Contreras-Moreira B, Cummins C, Frankish A, Gall A, Genez T, Howe K, Hunt S, Martin F, Moore B, Ogeh D, Parker A, Parton A, Ruffier M, Sakthivel MP, Sheppard D, Tate J, Thormann A, Thybert D, Trevanion S, Winterbottom A, Zerbino D, Finn R, Flicek P, Yates A. The Ensembl COVID-19 resource: ongoing integration of public SARS-CoV-2 data. Nucleic Acids Res 2022; 50:D765-D770. [PMID: 34634797 PMCID: PMC8524594 DOI: 10.1093/nar/gkab889] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/30/2021] [Revised: 09/09/2021] [Accepted: 09/20/2021] [Indexed: 11/14/2022] Open
Abstract
The COVID-19 pandemic has seen unprecedented use of SARS-CoV-2 genome sequencing for epidemiological tracking and identification of emerging variants. Understanding the potential impact of these variants on the infectivity of the virus and the efficacy of emerging therapeutics and vaccines has become a cornerstone of the fight against the disease. To support the maximal use of genomic information for SARS-CoV-2 research, we launched the Ensembl COVID-19 browser, making SARS-CoV-2 the first virus to be encompassed within the Ensembl platform. This resource incorporates a new Ensembl gene set, multiple variant sets, and annotation from several relevant resources aligned to the reference SARS-CoV-2 assembly. Since the first release in May 2020, the content has been regularly updated using our new rapid release workflow, and tools such as the Ensembl Variant Effect Predictor have been integrated. The Ensembl COVID-19 browser is freely available at https://covid-19.ensembl.org.
Affiliation(s)
- Nishadi H De Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jyothish Bhai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Marc Chakiachvili
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Bruno Contreras-Moreira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Astrid Gall
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thiago Genez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Denye Ogeh
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anne Parker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrew Parton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Magali Ruffier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manoj Pandian Sakthivel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dan Sheppard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- John Tate
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anja Thormann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- David Thybert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrea Winterbottom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel R Zerbino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrew D Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK

24
Johnson O, Fronterre C, Diggle PJ, Amoah B, Giorgi E. MBGapp: A Shiny application for teaching model-based geostatistics to population health scientists. PLoS One 2022; 16:e0262145. [PMID: 34972193 PMCID: PMC8719748 DOI: 10.1371/journal.pone.0262145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Accepted: 12/16/2021] [Indexed: 11/19/2022] Open
Abstract
User-friendly interfaces have been increasingly used to facilitate the learning of advanced statistical methodology, especially for students with only minimal statistical training. In this paper, we illustrate the use of MBGapp for teaching geostatistical analysis to population health scientists. Using a case-study on Loa loa infections, we show how MBGapp can be used to teach the different stages of a geostatistical analysis in a more interactive fashion. For wider accessibility and usability, MBGapp is available as an R package and as a Shiny web-application that can be freely accessed on any web browser. In addition to MBGapp, we also present an auxiliary Shiny app, called VariagramApp, that can be used to aid the teaching of Gaussian processes in one and two dimensions using simulations.
Affiliation(s)
- Olatunji Johnson
- CHICAS, Lancaster Medical School, Lancaster University, Lancaster, United Kingdom
- Department of Mathematics, University of Manchester, Manchester, United Kingdom
- Claudio Fronterre
- CHICAS, Lancaster Medical School, Lancaster University, Lancaster, United Kingdom
- Peter J. Diggle
- CHICAS, Lancaster Medical School, Lancaster University, Lancaster, United Kingdom
- Benjamin Amoah
- CHICAS, Lancaster Medical School, Lancaster University, Lancaster, United Kingdom
- Emanuele Giorgi
- CHICAS, Lancaster Medical School, Lancaster University, Lancaster, United Kingdom

25
Sandnes FE. CANDIDATE: A tool for generating anonymous participant-linking IDs in multi-session studies. PLoS One 2021; 16:e0260569. [PMID: 34910758 PMCID: PMC8673636 DOI: 10.1371/journal.pone.0260569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2021] [Accepted: 11/12/2021] [Indexed: 11/29/2022] Open
Abstract
Background: Ensuring the privacy of participants is an ethical and legal obligation for researchers. Yet, achieving anonymity can be technically difficult. When observing participants over time, one needs mechanisms to link the data from the different sessions. Also, it is often necessary to expand the sample of participants during a project. Objectives: To help researchers simplify the administration of such studies, the CANDIDATE tool is proposed. This tool allows simple, unique, and anonymous participant IDs to be generated on the fly. Method: Simulations were used to validate the uniqueness of the IDs as well as their anonymity. Results: The tool can successfully generate IDs with a low collision rate while maintaining high anonymity. A practical compromise between integrity and anonymity was achieved when the ID space is about ten times the number of participants. Implications: The tool holds potential for making it easier to collect more comprehensive empirical evidence over time, which in turn will provide a more solid basis for drawing reliable conclusions based on research data. An open-source implementation of the tool that runs locally in a web browser is made available.
Affiliation(s)
- Frode Eika Sandnes
- Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, Oslo, Norway
- School of Economics and Information Technology, Kristiania University College, Oslo, Norway
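The uniqueness side of the trade-off described in the CANDIDATE abstract above can be explored with a small simulation. The sketch below is not the tool's ID scheme: it assumes, purely for illustration, that each participant receives an ID drawn uniformly at random from the ID space, and the function names are hypothetical. It estimates the fraction of participants whose ID is shared with someone else and compares it with the closed-form expectation under that same assumption.

```python
import random
from collections import Counter


def collision_fraction(n_participants, id_space, trials=200, seed=0):
    """Monte Carlo estimate of the fraction of participants sharing an ID.

    Assumption (not from the paper): IDs are drawn uniformly at random
    from range(id_space), independently for each participant.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        ids = [rng.randrange(id_space) for _ in range(n_participants)]
        counts = Counter(ids)
        # A participant is "collided" if at least one other person has the same ID.
        collided = sum(c for c in counts.values() if c > 1)
        total += collided / n_participants
    return total / trials


def expected_collision_fraction(n_participants, id_space):
    """Closed-form counterpart under the same uniform-draw assumption.

    A given participant avoids a collision only if none of the other
    n - 1 participants draws the same ID.
    """
    return 1.0 - (1.0 - 1.0 / id_space) ** (n_participants - 1)


if __name__ == "__main__":
    n = 100
    for factor in (1, 10, 100):
        m = factor * n
        print(factor, collision_fraction(n, m), expected_collision_fraction(n, m))
```

Under the uniform assumption, an ID space ten times the number of participants gives an expected shared-ID fraction close to 1 - exp(-0.1), a little under 10%. The sketch says nothing about the anonymity side of the compromise, which the paper evaluates separately.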

26
Sapra D, Kaur H, Dhall A, Raghava GPS. ProCanBio: A Database of Manually Curated Biomarkers for Prostate Cancer. J Comput Biol 2021; 28:1248-1257. [PMID: 34898255 DOI: 10.1089/cmb.2021.0348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Prostate cancer (PCa) is the second lethal malignancy in men worldwide. In the past, numerous research groups investigated the omics profiles of patients and scrutinized biomarkers for the diagnosis and prognosis of PCa. However, information related to the biomarkers is widely scattered across numerous resources in complex textual format, which hinders both understanding of the tumorigenesis of this malignancy and the identification of robust signatures. To create a comprehensive resource, we collected all the relevant literature on PCa biomarkers from PubMed. We scrutinized the extensive information about each biomarker from a total of 412 unique research articles. Each entry of the database incorporates PubMed ID, biomarker name, biomarker type, biomolecule, source, subjects, validation status, and performance measures such as sensitivity, specificity, and hazard ratio (HR). In this study, we present ProCanBio, a manually curated database that maintains detailed data on 2053 entries of potential PCa biomarkers obtained from 412 publications in a user-friendly tabular format. Among them are 766 protein-based, 507 RNA-based, 157 genomic mutations, 260 miRNA-based, and 122 metabolites-based biomarkers. To explore the information in the resource, a web-based interactive platform was developed with searching and browsing facilities. To the best of the authors' knowledge, there is no other resource that consolidates the information contained in all the published literature. Besides this, ProCanBio is freely available and is compatible with most web browsers and devices. We anticipate that this resource will be highly useful for the research community involved in the area of prostate malignancy.
Affiliation(s)
- Dikscha Sapra
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
- Harpreet Kaur
- Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
- Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India

27
Cao Q, Irizarry YB, Yazhuk S, Tran T, Gadkari M, Franco LM. GCgx: transcriptome-wide exploration of the response to glucocorticoids. J Mol Endocrinol 2021; 68:B1-B4. [PMID: 34787097 PMCID: PMC8691098 DOI: 10.1530/jme-21-0107] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/04/2021] [Accepted: 11/12/2021] [Indexed: 11/08/2022]
Abstract
Glucocorticoids are the cornerstone of immunosuppressive and anti-inflammatory therapy in humans, yet the mechanisms of glucocorticoid immunoregulation and toxicity remain unclear. The response to glucocorticoids is highly cell type-dependent, so translating results from different experimental systems into a better understanding of glucocorticoid effects in humans would benefit from rapid access to high-quality data on the response to glucocorticoids by different cell types. We introduce GCgx, a web application that allows investigators to quickly visualize changes in transcript abundance in response to glucocorticoids in a variety of cells and species. The tool is designed to grow by the addition of datasets based on input from the user community. GCgx is implemented in R and HTML and packaged as a Docker image. The tool and its source code are publicly available.
Affiliation(s)
- Qilin Cao
- Functional Immunogenomics Unit, Systemic Autoimmunity Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892
- Yamil Boo Irizarry
- Bioinformatics and Computational Biosciences Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD 20852
- Svetlana Yazhuk
- Operations and Engineering Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD 20852
- Thai Tran
- Functional Immunogenomics Unit, Systemic Autoimmunity Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892
- Manasi Gadkari
- Functional Immunogenomics Unit, Systemic Autoimmunity Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892
- Luis Miguel Franco
- Functional Immunogenomics Unit, Systemic Autoimmunity Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892
- Corresponding Author: Luis M. Franco, MD. Address: 9000 Rockville Pike, Bldg 10, Rm 13C101A, Bethesda, MD 20892. U.S.A. Phone: 301-827-2461, Fax: 301-480-6372

28
Pereira C, Mazein A, Farinha CM, Gray MA, Kunzelmann K, Ostaszewski M, Balaur I, Amaral MD, Falcao AO. CyFi-MAP: an interactive pathway-based resource for cystic fibrosis. Sci Rep 2021; 11:22223. [PMID: 34782688 PMCID: PMC8592983 DOI: 10.1038/s41598-021-01618-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/28/2021] [Accepted: 10/27/2021] [Indexed: 12/11/2022] Open
Abstract
Cystic fibrosis (CF) is a life-threatening autosomal recessive disease caused by more than 2100 mutations in the CF transmembrane conductance regulator (CFTR) gene, generating variability in disease severity among individuals with CF sharing the same CFTR genotype. Systems biology can assist in the collection and visualization of CF data to extract additional biological significance and find novel therapeutic targets. Here, we present CyFi-MAP, a disease map repository of CFTR molecular mechanisms and pathways involved in CF. Specifically, we represented the wild-type (wt-CFTR) and the F508del associated processes (F508del-CFTR) in separate submaps, with pathways related to protein biosynthesis, endoplasmic reticulum retention, export, activation/inactivation of channel function, and recycling/degradation after endocytosis. CyFi-MAP is an open-access resource with specific, curated and continuously updated information on CFTR-related pathways, available online at https://cysticfibrosismap.github.io/. This tool was developed as a reference CF pathway data repository to be continuously updated and used worldwide in CF research.
Affiliation(s)
- Catarina Pereira
- Faculty of Sciences, BioISI-Biosystems Integrative Sciences Institute, University of Lisboa, Campo Grande, 1749-016, Lisbon, Portugal
- LASIGE, Faculty of Sciences, University of Lisboa, Campo Grande, 1749-016, Lisbon, Portugal
- Alexander Mazein
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- CIRI UMR5308, CNRS-ENS-UCBL-INSERM, European Institute for Systems Biology and Medicine, Université de Lyon, 50 Avenue Tony Garnier, 69007, Lyon, France
- Carlos M Farinha
- Faculty of Sciences, BioISI-Biosystems Integrative Sciences Institute, University of Lisboa, Campo Grande, 1749-016, Lisbon, Portugal
- Michael A Gray
- Biosciences Institute, University Medical School, Newcastle University, Framlington Place, Newcastle upon Tyne, NE2 4HH, UK
- Marek Ostaszewski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- Irina Balaur
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- CIRI UMR5308, CNRS-ENS-UCBL-INSERM, European Institute for Systems Biology and Medicine, Université de Lyon, 50 Avenue Tony Garnier, 69007, Lyon, France
- Margarida D Amaral
- Faculty of Sciences, BioISI-Biosystems Integrative Sciences Institute, University of Lisboa, Campo Grande, 1749-016, Lisbon, Portugal
- Andre O Falcao
- Faculty of Sciences, BioISI-Biosystems Integrative Sciences Institute, University of Lisboa, Campo Grande, 1749-016, Lisbon, Portugal
- LASIGE, Faculty of Sciences, University of Lisboa, Campo Grande, 1749-016, Lisbon, Portugal

29
Rønneberg L, Cremaschi A, Hanes R, Enserink JM, Zucknick M. bayesynergy: flexible Bayesian modelling of synergistic interaction effects in in vitro drug combination experiments. Brief Bioinform 2021; 22:bbab251. [PMID: 34308471 PMCID: PMC8575029 DOI: 10.1093/bib/bbab251] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/07/2021] [Revised: 05/26/2021] [Accepted: 06/14/2021] [Indexed: 11/24/2022] Open
Abstract
The effect of cancer therapies is often tested pre-clinically via in vitro experiments, where the post-treatment viability of the cancer cell population is measured through assays estimating the number of viable cells. In this way, large libraries of compounds can be tested, comparing the efficacy of each treatment. Drug interaction studies focus on the quantification of the additional effect encountered when two drugs are combined, as opposed to using the treatments separately. In the bayesynergy R package, we implement a probabilistic approach for the description of the drug combination experiment, where the observed dose response curve is modelled as a sum of the expected response under a zero-interaction model and an additional interaction effect (synergistic or antagonistic). Although the model formulation makes use of the Bliss independence assumption, we note that the posterior estimates of the dose-response surface can also be used to extract synergy scores based on other reference models, which we illustrate for the Highest Single Agent model. The interaction is modelled in a flexible manner, using a Gaussian process formulation. Since the proposed approach is based on a statistical model, it allows the natural inclusion of replicates, handles missing data and uneven concentration grids, and provides uncertainty quantification around the results. The model is implemented in the open-source Stan programming language providing a computationally efficient sampler, a fast approximation of the posterior through variational inference, and features parallel processing for working with large drug combination screens.
Affiliation(s)
- Leiv Rønneberg
- Oslo Centre for Biostatistics and Epidemiology (OCBE), University of Oslo, Norway
- Andrea Cremaschi
- Singapore Institute for Clinical Sciences (SICS), A*STAR, Singapore
- Robert Hanes
- Department of Molecular Cell Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Montebello, Oslo 0379, Norway
- Centre for Cancer Cell Reprogramming, Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Jorrit M Enserink
- Department of Molecular Cell Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Montebello, Oslo 0379, Norway
- Centre for Cancer Cell Reprogramming, Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Department of Biosciences, Faculty of Mathematics and Natural Sciences, University of Oslo, PO Box 1066 Blindern, Oslo 0316, Norway
- Manuela Zucknick
- Oslo Centre for Biostatistics and Epidemiology (OCBE), University of Oslo, Norway
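The decomposition described in the bayesynergy abstract above, where the observed response is the zero-interaction expectation plus an interaction term, can be illustrated on single dose pairs. The sketch below is not the package's API and involves no Gaussian process or Stan model. It assumes responses are viabilities in [0, 1] (fraction of cells surviving relative to an untreated control), for which Bliss independence gives the product of the two monotherapy viabilities and the Highest Single Agent (HSA) reference gives the smaller of the two. All function names are hypothetical.

```python
def bliss_expected_viability(v1, v2):
    """Zero-interaction viability under Bliss independence.

    If the two drugs act independently, the surviving fraction of the
    combination is the product of the monotherapy surviving fractions.
    """
    return v1 * v2


def hsa_expected_viability(v1, v2):
    """Zero-interaction viability under the Highest Single Agent model.

    The combination is expected to do no better than the more effective
    single drug, i.e. the one leaving the lower viability.
    """
    return min(v1, v2)


def interaction_effect(observed, expected):
    """Interaction term: observed viability minus the reference expectation.

    With viability as the response, a negative value means fewer surviving
    cells than the reference model predicts; a positive value means more.
    """
    return observed - expected


if __name__ == "__main__":
    v_drug_a, v_drug_b, v_combo = 0.8, 0.5, 0.3
    for name, ref in (("Bliss", bliss_expected_viability),
                      ("HSA", hsa_expected_viability)):
        expected = ref(v_drug_a, v_drug_b)
        print(name, expected, interaction_effect(v_combo, expected))
```

The same observed combination can score differently against the two reference models, which is why the abstract notes that synergy scores can be extracted for references other than Bliss.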

30
Mosca E, Bersanelli M, Matteuzzi T, Di Nanni N, Castellani G, Milanesi L, Remondini D. Characterization and comparison of gene-centered human interactomes. Brief Bioinform 2021; 22:bbab153. [PMID: 34010955 PMCID: PMC8574298 DOI: 10.1093/bib/bbab153] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/14/2020] [Revised: 03/22/2021] [Accepted: 04/01/2021] [Indexed: 01/04/2023] Open
Abstract
The complex web of macromolecular interactions occurring within cells (the interactome) is the backbone of an increasing number of studies, but a clear consensus on the exact structure of this network is still lacking. Different genome-scale maps of the human interactome have been obtained through several experimental techniques and functional analyses. Moreover, these maps can be enriched through literature-mining approaches, and different combinations of various 'source' databases have been used in the literature. It is therefore unclear to what extent the various interactomes yield similar results when used in the context of interactome-based approaches in network biology. We compared a comprehensive list of human interactomes on the basis of topology, protein complexes, molecular pathways, pathway cross-talk and disease gene prediction. In a general context of relevant heterogeneity, our study provides a series of qualitative and quantitative parameters that describe the state of the art of human interactomes and guidelines for selecting interactomes in future applications.
Affiliation(s)
- Ettore Mosca
- Institute of Biomedical Technologies, National Research Council, Segrate (Milan), 20090, Italy
- Matteo Bersanelli
- Humanitas University, Department of Biomedical Sciences, Pieve Emanuele (Milan), 20090, Italy
- Tommaso Matteuzzi
- Department of Physics and Astronomy, University of Bologna, Bologna, 40127, Italy
- Noemi Di Nanni
- Institute of Biomedical Technologies, National Research Council, Segrate (Milan), 20090, Italy
- Gastone Castellani
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, 40127, Italy
- Luciano Milanesi
- Institute of Biomedical Technologies, National Research Council, Segrate (Milan), 20090, Italy
- Daniel Remondini
- Department of Physics and Astronomy, University of Bologna, Bologna, 40127, Italy

31
Tsukiyama S, Hasan MM, Fujii S, Kurata H. LSTM-PHV: prediction of human-virus protein-protein interactions by LSTM with word2vec. Brief Bioinform 2021; 22:bbab228. [PMID: 34160596 PMCID: PMC8574953 DOI: 10.1093/bib/bbab228] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 03/03/2021] [Revised: 04/27/2021] [Accepted: 05/25/2021] [Indexed: 12/30/2022] Open
Abstract
Viral infection involves a large number of protein-protein interactions (PPIs) between human and virus. The PPIs range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of host transcription machinery. However, few interspecies PPIs have been identified, because experimental methods including mass spectrometry are time-consuming and expensive, and molecular dynamics simulation is limited to proteins whose 3D structures are solved. Sequence-based machine learning methods are expected to overcome these problems. For the first time, we developed an LSTM model with word2vec, named LSTM-PHV, to predict PPIs between human and virus by using amino acid sequences alone. LSTM-PHV effectively learnt the training data with a highly imbalanced ratio of positive to negative samples and achieved AUCs of 0.976 and 0.973 and accuracies of 0.984 and 0.985 on the training and independent datasets, respectively. In predicting PPIs between human and unknown or new viruses, LSTM-PHV greatly outperformed the existing state-of-the-art PPI predictors. Interestingly, learning of only sequence contexts as words is sufficient for PPI prediction. Use of uniform manifold approximation and projection demonstrated that LSTM-PHV clearly distinguished the positive PPI samples from the negative ones. We present the LSTM-PHV online web server and supporting data, which are freely available at http://kurata35.bio.kyutech.ac.jp/LSTM-PHV.
Affiliation(s)
- Sho Tsukiyama
- Department of Interdisciplinary Informatics in the Kyushu Institute of Technology, Japan
- Satoshi Fujii
- Department of Bioscience and Bioinformatics in the Kyushu Institute of Technology, Japan
- Hiroyuki Kurata
- Department of Bioscience and Bioinformatics in the Kyushu Institute of Technology, Japan

32
de Medeiros Oliveira M, Bonadio I, Lie de Melo A, Mendes Souza G, Durham AM. TSSFinder-fast and accurate ab initio prediction of the core promoter in eukaryotic genomes. Brief Bioinform 2021; 22:bbab198. [PMID: 34050351 PMCID: PMC8574697 DOI: 10.1093/bib/bbab198] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Revised: 02/14/2021] [Accepted: 02/23/2021] [Indexed: 12/02/2022] Open
Abstract
Promoter annotation is an important task in the analysis of a genome. One of the main challenges for this task is locating the border between the promoter region and the transcribed region of the gene, the transcription start site (TSS). The TSS is the reference point used to delimit the DNA sequence responsible for the assembly of the transcription complex. Because the same gene can have more than one TSS, delimiting the promoter region requires locating the TSS closest to the translation start site. This paper presents TSSFinder, new software for the prediction of the TSS signal of eukaryotic genes that is significantly more accurate than other available software. TSSFinder is currently the only application to offer pre-trained models for six different eukaryotic organisms: Arabidopsis thaliana, Drosophila melanogaster, Gallus gallus, Homo sapiens, Oryza sativa and Saccharomyces cerevisiae. Additionally, the software can be easily customized for specific organisms using only 125 DNA sequences with a validated TSS signal and corresponding genomic locations as a training set. TSSFinder is a valuable new tool for the annotation of genomes. TSSFinder source code and docker container can be downloaded from http://tssfinder.github.io. Alternatively, TSSFinder is also available as a web service at http://sucest-fun.org/wsapp/tssfinder/.
Affiliation(s)
- Igor Bonadio
- Data Science, Elo7 Research Lab, São Paulo, Brazil

33
Kreis J, Nedić B, Mazur J, Urban M, Schelhorn SE, Grombacher T, Geist F, Brors B, Zühlsdorf M, Staub E. RosettaSX: Reliable gene expression signature scoring of cancer models and patients. Neoplasia 2021; 23:1069-1077. [PMID: 34583245 PMCID: PMC8479477 DOI: 10.1016/j.neo.2021.08.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/11/2021] [Revised: 08/28/2021] [Accepted: 08/30/2021] [Indexed: 11/29/2022]
Abstract
Gene expression signatures have proven their potential to characterize important cancer phenomena like oncogenic signaling pathway activities, cellular origins of tumors, or immune cell infiltration into tumor tissues. Large collections of expression signatures provide the basis for their application to data sets, but the applicability of each signature in a new experimental context must be reassessed. We apply a methodology that utilizes the previously developed concept of coherent expression of genes in signatures to identify translatable signatures before scoring their activity in single tumors. We present a web interface (www.rosettasx.com) that applies our methodology to expression data from the Cancer Cell Line Encyclopaedia and The Cancer Genome Atlas. Configurable heat maps visualize per-cancer signature scores for 293 hand-curated literature-derived gene sets representing a wide range of cancer-relevant transcriptional modules and phenomena. The platform allows users to complement heat maps of signature scores with molecular information on SNVs, CNVs, gene expression, gene dependency, and protein abundance, or to analyze their own signatures. Clustered heat maps and further plots for drilling down into results support users in studying oncological processes in cancer subtypes, thereby providing a rich resource to explore how mechanisms of cancer interact with each other, as demonstrated by exemplary analyses of 2 cancer types.
Collapse
Affiliation(s)
- Julian Kreis
- Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany; Faculty of Bioscience, University of Heidelberg, Heidelberg, Germany
| | - Boro Nedić
- Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
| | - Johanna Mazur
- Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
| | - Miriam Urban
- Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
| | - Sven-Eric Schelhorn
- Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
| | - Thomas Grombacher
- Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
| | - Felix Geist
- Therapeutic Innovation Platform Oncology & Immuno-Oncology, Merck KGaA, Darmstadt, Germany
| | - Benedikt Brors
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany; German Cancer Consortium (DKTK), Core Center, Heidelberg, Germany
| | - Michael Zühlsdorf
- Therapeutic Innovation Platform Oncology & Immuno-Oncology, Merck KGaA, Darmstadt, Germany
| | - Eike Staub
- Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany.
|
34
|
Ross K, Varani AM, Snesrud E, Huang H, Alvarenga DO, Zhang J, Wu C, McGann P, Chandler M. TnCentral: a Prokaryotic Transposable Element Database and Web Portal for Transposon Analysis. mBio 2021; 12:e0206021. [PMID: 34517763 PMCID: PMC8546635 DOI: 10.1128/mbio.02060-21] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 07/14/2021] [Accepted: 08/11/2021] [Indexed: 01/03/2023] Open
Abstract
We describe here the structure and organization of TnCentral (https://tncentral.proteininformationresource.org/ [or the mirror link at https://tncentral.ncc.unesp.br/]), a web resource for prokaryotic transposable elements (TE). TnCentral currently contains ∼400 carefully annotated TE, including transposons from the Tn3, Tn7, Tn402, and Tn554 families; compound transposons; integrons; and associated insertion sequences (IS). These TE carry passenger genes, including genes conferring resistance to over 25 classes of antibiotics and nine types of heavy metal, as well as genes responsible for pathogenesis in plants, toxin/antitoxin gene pairs, transcription factors, and genes involved in metabolism. Each TE has its own entry page, providing details about its transposition genes, passenger genes, and other sequence features required for transposition, as well as a graphical map of all features. TnCentral content can be browsed and queried through text- and sequence-based searches with a graphic output. We describe three use cases, which illustrate how the search interface, results tables, and entry pages can be used to explore and compare TE. TnCentral also includes downloadable software to facilitate user-driven identification, with manual annotation, of certain types of TE in genomic sequences. Through the TnCentral homepage, users can also access TnPedia, which provides comprehensive reviews of the major TE families, including an extensive general section and specialized sections with descriptions of insertion sequence and transposon families. TnCentral and TnPedia are intuitive resources that can be used by clinicians and scientists to assess TE diversity in clinical, veterinary, and environmental samples. 
IMPORTANCE The ability of bacteria to undergo rapid evolution and adapt to changing environmental circumstances drives the public health crisis of multiple antibiotic resistance, as well as outbreaks of disease in economically important agricultural crops and animal husbandry. Prokaryotic transposable elements (TE) play a critical role in this. Many carry "passenger genes" (not required for the transposition process) conferring resistance to antibiotics or heavy metals or causing disease in plants and animals. Passenger genes are spread by normal TE transposition activities and by insertion into plasmids, which then spread via conjugation within and across bacterial populations. Thus, an understanding of TE composition and transposition mechanisms is key to developing strategies to combat bacterial pathogenesis. Toward this end, we have developed TnCentral, a bioinformatics resource dedicated to describing and exploring the structural and functional features of prokaryotic TE whose use is intuitive and accessible to users with or without bioinformatics expertise.
Affiliation(s)
- Karen Ross
- Protein Information Resource, Department of Biochemistry and Molecular and Cellular Biology, Georgetown University Medical Center, Washington, DC, USA
| | - Alessandro M. Varani
- School of Agricultural and Veterinary Sciences, Universidade Estadual Paulista, Jaboticabal, Sao Paulo, Brazil
| | - Erik Snesrud
- Multidrug-Resistant Organism Repository and Surveillance Network, Walter Reed Army Institute of Research, Silver Spring, Maryland, USA
| | - Hongzhan Huang
- Protein Information Resource, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, USA
| | - Danillo Oliveira Alvarenga
- School of Agricultural and Veterinary Sciences, Universidade Estadual Paulista, Jaboticabal, Sao Paulo, Brazil
| | - Jian Zhang
- Protein Information Resource, Department of Biochemistry and Molecular and Cellular Biology, Georgetown University Medical Center, Washington, DC, USA
| | - Cathy Wu
- Protein Information Resource, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, USA
| | - Patrick McGann
- Multidrug-Resistant Organism Repository and Surveillance Network, Walter Reed Army Institute of Research, Silver Spring, Maryland, USA
| | - Mick Chandler
- Department of Biochemistry and Molecular and Cellular Biology, Georgetown University Medical Center, Washington, DC, USA
|
35
|
Yu S, Drton M, Promislow DEL, Shojaie A. CorDiffViz: an R package for visualizing multi-omics differential correlation networks. BMC Bioinformatics 2021; 22:486. [PMID: 34627139 PMCID: PMC8501646 DOI: 10.1186/s12859-021-04383-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2020] [Accepted: 09/20/2021] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND Differential correlation networks are increasingly used to delineate changes in interactions among biomolecules. They characterize differences between omics networks under two different conditions, and can be used to delineate mechanisms of disease initiation and progression. RESULTS We present a new R package, CorDiffViz, that facilitates the estimation and visualization of differential correlation networks using multiple correlation measures and inference methods. The software is implemented in R, HTML and Javascript, and is available at https://github.com/sqyu/CorDiffViz . Visualization has been tested for the Chrome and Firefox web browsers. A demo is available at https://diffcornet.github.io/CorDiffViz/demo.html . CONCLUSIONS Our software offers considerable flexibility by allowing the user to interact with the visualization and choose from different estimation methods and visualizations. It also allows the user to easily toggle between correlation networks for samples under one condition and differential correlations between samples under two conditions. Moreover, the software facilitates integrative analysis of cross-correlation networks between two omics data sets.
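The idea of a differential correlation can be illustrated with a minimal sketch. This is not CorDiffViz code (the package is written in R and supports several correlation measures and inference methods); it assumes plain Pearson correlation, performs no significance testing, and uses hypothetical function names and toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def differential_correlation(x1, y1, x2, y2):
    """Correlation under condition 2 minus correlation under condition 1."""
    return pearson(x2, y2) - pearson(x1, y1)


# Hypothetical toy data: two biomolecules measured under two conditions.
cond1_a, cond1_b = [1, 2, 3, 4], [2, 4, 6, 8]   # perfectly correlated
cond2_a, cond2_b = [1, 2, 3, 4], [8, 6, 4, 2]   # perfectly anti-correlated
diff = differential_correlation(cond1_a, cond1_b, cond2_a, cond2_b)
```

In a network setting, one such difference would be computed for every pair of biomolecules, and an edge would be drawn where the difference is judged significant.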
Affiliation(s)
- Shiqing Yu
- Department of Statistics, University of Washington, NE Stevens Way, Seattle, WA, 98195, USA.
| | - Mathias Drton
- Department of Mathematics, Technical University of Munich, Boltzmannstraße, 85748, Garching bei München, Germany
| | - Daniel E L Promislow
- Departments of Pathology and Biology, University of Washington, NE Pacific St, Seattle, WA, 98195, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, NE Pacific St, Seattle, WA, 98195, USA
|
36
|
Gröschel MI, Owens M, Freschi L, Vargas R, Marin MG, Phelan J, Iqbal Z, Dixit A, Farhat MR. GenTB: A user-friendly genome-based predictor for tuberculosis resistance powered by machine learning. Genome Med 2021; 13:138. [PMID: 34461978 PMCID: PMC8407037 DOI: 10.1186/s13073-021-00953-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/29/2021] [Accepted: 08/12/2021] [Indexed: 01/31/2023] Open
Abstract
BACKGROUND Multidrug-resistant Mycobacterium tuberculosis (Mtb) is a significant global public health threat. Genotypic resistance prediction from Mtb DNA sequences offers an alternative to laboratory-based drug-susceptibility testing. User-friendly and accurate resistance prediction tools are needed to enable public health and clinical practitioners to rapidly diagnose resistance and inform treatment regimens. RESULTS We present Translational Genomics platform for Tuberculosis (GenTB), a free and open web-based application to predict antibiotic resistance from next-generation sequence data. The user can choose between two potential predictors, a Random Forest (RF) classifier and a Wide and Deep Neural Network (WDNN) to predict phenotypic resistance to 13 and 10 anti-tuberculosis drugs, respectively. We benchmark GenTB's predictive performance along with leading TB resistance prediction tools (Mykrobe and TB-Profiler) using a ground truth dataset of 20,408 isolates with laboratory-based drug susceptibility data. All four tools reliably predicted resistance to first-line tuberculosis drugs but had varying performance for second-line drugs. The mean sensitivities for GenTB-RF and GenTB-WDNN across the nine shared drugs were 77.6% (95% CI 76.6-78.5%) and 75.4% (95% CI 74.5-76.4%), respectively, and marginally higher than the sensitivities of TB-Profiler at 74.4% (95% CI 73.4-75.3%) and Mykrobe at 71.9% (95% CI 70.9-72.9%). The higher sensitivities came at the expense of ≤ 1.5% lower specificity: Mykrobe 97.6% (95% CI 97.5-97.7%), TB-Profiler 96.9% (95% CI 96.7 to 97.0%), GenTB-WDNN 96.2% (95% CI 96.0 to 96.4%), and GenTB-RF 96.1% (95% CI 96.0 to 96.3%). Averaged across the four tools, genotypic resistance sensitivity was 11% and 9% lower for isoniazid and rifampicin, respectively, on isolates sequenced at low depth (< 10× across 95% of the genome), emphasizing the need for quality control of input sequence data before prediction.
We discuss differences between tools in reporting results to the user, including variants underlying the resistance calls and any novel or indeterminate variants. CONCLUSIONS GenTB is an easy-to-use online tool to rapidly and accurately predict resistance to anti-tuberculosis drugs. GenTB can be accessed online at https://gentb.hms.harvard.edu , and the source code is available at https://github.com/farhat-lab/gentb-site .
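The depth criterion mentioned in the abstract (< 10× across 95% of the genome) can be sketched as a simple pre-prediction check. This is one possible reading of that criterion and not GenTB's actual pipeline code; the function name, parameters, and toy depth values are hypothetical.

```python
def passes_depth_qc(depths, min_depth=10, min_fraction=0.95):
    """Return True if at least `min_fraction` of positions reach `min_depth`.

    `depths` is a sequence of per-position read depths (an assumed input
    format; real pipelines typically derive these from an alignment file).
    """
    if not depths:
        raise ValueError("no depth values supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths) >= min_fraction


# Hypothetical toy genomes of 100 positions each.
good_isolate = [30] * 95 + [2] * 5   # 95% of positions at >= 10x
low_isolate = [30] * 94 + [2] * 6    # only 94% of positions at >= 10x
```

Under this reading, an isolate that fails the check would be flagged before resistance prediction, since the abstract reports lower sensitivity for such samples.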
Affiliation(s)
- Matthias I Gröschel
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Martin Owens
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Luca Freschi
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Roger Vargas
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Maximilian G Marin
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Jody Phelan
- Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
| | - Zamin Iqbal
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 ISD, UK
| | - Avika Dixit
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Division of Infectious Diseases, Boston Children's Hospital, Boston, MA, USA
| | - Maha R Farhat
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA, USA.
|
37
|
Poirion OB, Jing Z, Chaudhary K, Huang S, Garmire LX. DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data. Genome Med 2021; 13:112. [PMID: 34261540 PMCID: PMC8281595 DOI: 10.1186/s13073-021-00930-x] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Received: 01/11/2021] [Accepted: 06/25/2021] [Indexed: 12/17/2022] Open
Abstract
Multi-omics data are good resources for prognosis and survival prediction; however, these are difficult to integrate computationally. We introduce DeepProg, a novel ensemble framework of deep-learning and machine-learning approaches that robustly predicts patient survival subtypes using multi-omics data. It identifies two optimal survival subtypes in most cancers and yields significantly better risk-stratification than other multi-omics integration methods. DeepProg is highly predictive, exemplified by two liver cancer (C-index 0.73-0.80) and five breast cancer datasets (C-index 0.68-0.73). Pan-cancer analysis associates common genomic signatures in poor survival subtypes with extracellular matrix modeling, immune deregulation, and mitosis processes. DeepProg is freely available at https://github.com/lanagarmire/DeepProg.
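The C-index values quoted above measure how well predicted risk orders patients by survival time. The following is a minimal sketch of Harrell's concordance index, not DeepProg's implementation; it assumes right-censored data, skips pairs with tied survival times for simplicity, and uses hypothetical names and toy values.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       observed follow-up times
    events:      1 if the event (e.g. death) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: risk decreases as survival time increases.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported ranges of 0.68 to 0.80 in context.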
Affiliation(s)
- Olivier B Poirion
- Current address: Computational Sciences, The Jackson Laboratory, 10 Discovery Drive Farmington, Farmington, Connecticut, 06032, USA
- University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
| | - Zheng Jing
- Current address: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48105, USA
| | - Kumardeep Chaudhary
- University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Current address: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY, 10029, USA
| | - Sijia Huang
- University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Current address: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Lana X Garmire
- University of Hawaii Cancer Center, Honolulu, HI, 96813, USA.
- Current address: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48105, USA.
|
38
|
Xiao M, Liu G, Xie J, Dai Z, Wei Z, Ren Z, Yu J, Zhang L. 2019nCoVAS: Developing the Web Service for Epidemic Transmission Prediction, Genome Analysis, and Psychological Stress Assessment for 2019-nCoV. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1250-1261. [PMID: 33406042 PMCID: PMC8769043 DOI: 10.1109/tcbb.2021.3049617] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/30/2020] [Revised: 10/02/2020] [Accepted: 01/03/2021] [Indexed: 06/12/2023]
Abstract
Since the COVID-19 epidemic is still expanding around the world and poses a serious threat to human life and health, it is necessary to carry out epidemic transmission prediction, whole genome sequence analysis, and public psychological stress assessment for 2019-nCoV. However, transmission prediction models are insufficiently accurate, genome sequence characteristics are not clear, and it is difficult to dynamically assess the public psychological stress state under the 2019-nCoV epidemic. Therefore, this study develops the 2019nCoVAS web service (http://www.combio-lezhang.online/2019ncov/home.html), which not only offers online epidemic transmission prediction and lineage-associated underrepresented permutation (LAUP) analysis services to investigate the spreading trends and genome sequence characteristics, but also provides psychological stress assessments based on an emotional dictionary that we built for 2019-nCoV. Finally, we discuss the shortcomings of the 2019nCoVAS web service and directions for further study.
Affiliation(s)
- Ming Xiao
- College of Computer Science, Sichuan University, Chengdu 610065, PR China
| | - Guangdi Liu
- College of Computer and Information Science, Southwest University, Chong-Qing 400715, PR China
| | - Jianghang Xie
- College of Computer Science, Sichuan University, Chengdu 610065, PR China
| | - Zichun Dai
- College of Computer Science, Sichuan University, Chengdu 610065, PR China
| | - Zihao Wei
- College of Computer Science, Sichuan University, Chengdu 610065, PR China
| | - Ziyao Ren
- College of Computer Science, Sichuan University, Chengdu 610065, PR China
| | - Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, PR China
| | - Le Zhang
- College of Computer Science, Sichuan University, Chengdu 610065, PR China
|
39
|
Abstract
Lead optimization, a critical step in early stage drug discovery, involves making chemical modifications to a small-molecule ligand to improve properties such as binding affinity. We recently developed DeepFrag, a deep-learning model capable of recommending such modifications. Though a powerful hypothesis-generating tool, DeepFrag is currently implemented in Python and so requires a certain degree of computational expertise. To encourage broader adoption, we have created the DeepFrag browser app, which provides a user-friendly graphical user interface that runs the DeepFrag model in users' web browsers. The browser app does not require users to upload their molecular structures to a third-party server, nor does it require the separate installation of any third-party software. We are hopeful that the app will be a useful tool for both researchers and students. It can be accessed free of charge, without registration, at http://durrantlab.com/deepfrag. The source code is also available at http://git.durrantlab.com/jdurrant/deepfrag-app, released under the terms of the open-source Apache License, Version 2.0.
Affiliation(s)
- Harrison Green
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
| | - Jacob D. Durrant
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
|
40
|
Abstract
Cancer is driven by distinctive sorts of changes and basic variations in genes. Recognizing cancer driver genes is basic for accurate oncological analysis. Numerous methodologies to distinguish and identify drivers presently exist, but efficient tools to combine and optimize them on huge datasets are few. Most strategies for prioritizing transformations depend basically on frequency-based criteria. Strategies are required to dependably prioritize organically dynamic driver changes over inert passengers in high-throughput sequencing cancer data sets. This study proposes a model, PCDG-Pred, which works as a utility capable of distinguishing cancer driver and passenger attributes of genes based on sequencing data. Various validation techniques are applied at different levels to establish the effectiveness of the model and to obtain metrics such as accuracy, Matthews correlation coefficient, sensitivity, and specificity. The results of the study strongly indicate that the proposed strategy provides a fundamental functional advantage over other existing strategies for cancer driver gene identification. The accuracy obtained for the self-consistency, independent set, and cross-validation tests is 91.08%, 87.26%, and 92.48%, respectively.
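The four evaluation metrics named above all derive from a binary confusion matrix. The sketch below shows the standard definitions; it is not taken from PCDG-Pred, and the function name and counts are hypothetical.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, and Matthews correlation
    coefficient (MCC) from confusion-matrix counts, treating "driver"
    as the positive class. Assumes both classes are present
    (tp + fn > 0 and tn + fp > 0)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when a row or column sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
```

Unlike accuracy, MCC stays near zero for a classifier that guesses at random even when driver and passenger classes are imbalanced, which is one reason it is often reported alongside the other three.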
Affiliation(s)
- Sharaf J Malebary
- Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, P.O. Box 344, Rabigh, 21911, Saudi Arabia
| | - Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan.
|
41
|
Chandramohan R, Kakkar N, Roy A, Parsons DW. reconCNV: interactive visualization of copy number data from high-throughput sequencing. Bioinformatics 2021; 37:1164-1167. [PMID: 32821910 DOI: 10.1093/bioinformatics/btaa746] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/07/2020] [Revised: 07/21/2020] [Accepted: 08/17/2020] [Indexed: 12/30/2022] Open
Abstract
SUMMARY Copy number variation (CNV) is an important category of unbalanced structural rearrangement. While methods for detecting CNV in high-throughput targeted sequencing have become increasingly sophisticated, dedicated tools for interactive and dynamic visualization of CNV from these data are still lacking. We describe reconCNV, a tool that produces an interactive and annotated web-based dashboard for viewing and summarizing CNVs detected in next-generation sequencing (NGS) data. reconCNV is designed to work with delimited result files from most NGS CNV callers with minor adjustments to the configuration file. The reconCNV output is an HTML file that is viewable on any modern web browser, requires no backend server, and can be readily appended to existing analysis pipelines. In addition to a standard CNV track for visualizing relative fold change and absolute copy number, the tool includes an auxiliary variant allele fraction track for visualizing underlying allelic imbalance and loss of heterozygosity. A feature to mask assay-specific technical artifacts and a direct HTML link out to the UCSC Genome Browser are also included to augment the reviewer experience. By providing a light-weight plugin for interactive visualization to existing NGS CNV pipelines, reconCNV can facilitate efficient NGS CNV visualization and interpretation in both research and clinical settings. AVAILABILITY AND IMPLEMENTATION The source code and documentation including a tutorial can be accessed at https://github.com/rghu/reconCNV as well as a Docker image at https://hub.docker.com/repository/docker/raghuc1990/reconcnv. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Nipun Kakkar
- Department of Pediatrics, Texas Children's Cancer Center, Houston, TX 77030, USA
| | - Angshumoy Roy
- Department of Pediatrics, Texas Children's Cancer Center, Houston, TX 77030, USA
- Department of Pathology and Immunology, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Pathology, Texas Children's Hospital, Houston, TX 77030, USA
- Dan L. Duncan Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - D Williams Parsons
- Department of Molecular and Human Genetics, Houston, TX 77030, USA
- Department of Pediatrics, Texas Children's Cancer Center, Houston, TX 77030, USA
- Department of Pathology and Immunology, Baylor College of Medicine, Houston, TX 77030, USA
- Dan L. Duncan Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
|
42
|
Yang J, Zhao S, Wang J, Sheng Q, Liu Q, Shyr Y. Immu-Mela: An open resource for exploring immunotherapy-related multidimensional genomic profiles in melanoma. J Genet Genomics 2021; 48:361-368. [PMID: 34127402 PMCID: PMC8349898 DOI: 10.1016/j.jgg.2021.03.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/15/2020] [Revised: 03/15/2021] [Accepted: 03/17/2021] [Indexed: 11/22/2022]
Abstract
An increasing number of studies aim to reveal genomic hallmarks predictive of immune checkpoint blockade (ICB) treatment response. These studies have generated a large amount of data and provide an unprecedented opportunity to identify response-related features and evaluate their robustness across cohorts. However, those valuable data sets are not easily accessible to the research community. To take full advantage of existing large-scale immuno-genomic profiles, we developed Immu-Mela (http://bioinfo.vanderbilt.edu/database/Immu-Mela/), a multidimensional immuno-genomic portal that provides interactive exploration of associations between ICB responsiveness and multi-omics features in melanoma, including genetic, transcriptomic, immune cell, and single-cell population features. Immu-Mela also enables integrative analysis of any two genomic features. We demonstrated the value of Immu-Mela by identifying known and novel genomic features associated with ICB response. In addition, Immu-Mela allows users to upload their own data sets (not restricted to any cancer type) and co-analyze them with existing data to identify and validate signatures of interest. Immu-Mela reduces barriers between researchers and complex genomic data, facilitating discoveries in cancer immunotherapy.
Affiliation(s)
- Jing Yang
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville TN 37203, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville TN 37203, USA
| | - Shilin Zhao
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville TN 37203, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville TN 37203, USA
| | - Jing Wang
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville TN 37203, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville TN 37203, USA
| | - Quanhu Sheng
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville TN 37203, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville TN 37203, USA
| | - Qi Liu
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville TN 37203, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville TN 37203, USA.
| | - Yu Shyr
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville TN 37203, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville TN 37203, USA.
|
43
|
Kösters LM, Wiechers S, Lyko P, Müller KF, Wicke S. WARPP-web application for the research of parasitic plants. Plant Physiol 2021; 185:1374-1380. [PMID: 33793906 PMCID: PMC8133606 DOI: 10.1093/plphys/kiaa105] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/05/2020] [Accepted: 11/23/2020] [Indexed: 05/18/2023]
Abstract
The lifestyle of parasitic plants is associated with peculiar morphological, genetic, and physiological adaptations that existing online plant-specific resources fail to adequately represent. Here, we introduce the Web Application for the Research of Parasitic Plants (WARPP) as an online resource dedicated to advancing research and development of parasitic plant biology. WARPP is a framework to facilitate international efforts by providing a central hub of curated evolutionary, ecological, and genetic data. The first version of WARPP provides a community hub for researchers to test this web application, for which curated data revolving around the economically important Broomrape family (Orobanchaceae) is readily accessible. The initial set of WARPP online tools includes a genome browser that centralizes genomic information for sequenced parasitic plant genomes, an orthogroup summary detailing the presence and absence of orthologous genes in parasites compared with nonparasitic plants, and an ancestral trait explorer showing the evolution of life-history preferences along phylogenies. WARPP represents a project under active development and relies on the scientific community to populate the web app's database and further the development of new analysis tools. The first version of WARPP can be securely accessed at https://parasiticplants.app. The source code is licensed under GNU GPLv2 and is available at https://github.com/wickeLab/WARPP.
Affiliation(s)
- Lara M Kösters
- Plant Evolutionary Biology, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Plant Systematics and Biodiversity, Institute for Biology, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Sarah Wiechers
- Evolution and Biodiversity of Plants, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
| | - Peter Lyko
- Plant Evolutionary Biology, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Plant Systematics and Biodiversity, Institute for Biology, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Kai F Müller
- Evolution and Biodiversity of Plants, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
| | - Susann Wicke
- Plant Evolutionary Biology, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Plant Systematics and Biodiversity, Institute for Biology, Humboldt-Universität zu Berlin, Berlin, Germany
|
44
|
Dutta D, VandeHaar P, Fritsche LG, Zöllner S, Boehnke M, Scott LJ, Lee S. A powerful subset-based method identifies gene set associations and improves interpretation in UK Biobank. Am J Hum Genet 2021; 108:669-681. [PMID: 33730541 DOI: 10.1016/j.ajhg.2021.02.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/23/2020] [Accepted: 02/19/2021] [Indexed: 02/06/2023] Open
Abstract
Tests of association between a phenotype and a set of genes in a biological pathway can provide insights into the genetic architecture of complex phenotypes beyond those obtained from single-variant or single-gene association analysis. However, most existing gene set tests have limited power to detect gene set-phenotype association when a small fraction of the genes are associated with the phenotype and cannot identify the potentially "active" genes that might drive a gene set-based association. To address these issues, we have developed Gene set analysis Association Using Sparse Signals (GAUSS), a method for gene set association analysis that requires only GWAS summary statistics. For each significantly associated gene set, GAUSS identifies the subset of genes that have the maximal evidence of association and can best account for the gene set association. Using pre-computed correlation structure among test statistics from a reference panel, our p value calculation is substantially faster than other permutation- or simulation-based approaches. In simulations with varying proportions of causal genes, we find that GAUSS effectively controls type 1 error rate and has greater power than several existing methods, particularly when a small proportion of genes account for the gene set signal. Using GAUSS, we analyzed UK Biobank GWAS summary statistics for 10,679 gene sets and 1,403 binary phenotypes. We found that GAUSS is scalable and identified 13,466 phenotype and gene set association pairs. Within these gene sets, we identify an average of 17.2 (max = 405) genes that underlie these gene set associations.
Affiliation(s)
- Diptavo Dutta
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Peter VandeHaar
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Lars G Fritsche
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Sebastian Zöllner
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Michael Boehnke
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Laura J Scott
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Seunggeun Lee
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Graduate School of Data Science, Seoul National University, Seoul 08826, Republic of Korea.
| |
Collapse
|
45
|
Abstract
riboCIRC is a translatome data-oriented circRNA database specifically designed for hosting, exploring, analyzing, and visualizing translatable circRNAs from multiple species. The database provides a comprehensive repository of computationally predicted ribosome-associated circRNAs; a manually curated collection of experimentally verified translated circRNAs; an evaluation of cross-species conservation of translatable circRNAs; a systematic de novo annotation of putative circRNA-encoded peptides, including sequence, structure, and function; and a genome browser to visualize the context-specific occupant footprints of circRNAs. It represents a valuable resource for the circRNA research community and is publicly available at http://www.ribocirc.com .
Affiliation(s)
- Huihui Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Mingzhe Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yan Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Ludong Yang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China.
- Hongwei Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China.
46
Brian L, Warren B, McAtee P, Rodrigues J, Nieuwenhuizen N, Pasha A, David KM, Richardson A, Provart NJ, Allan AC, Varkonyi-Gasic E, Schaffer RJ. A gene expression atlas for kiwifruit (Actinidia chinensis) and network analysis of transcription factors. BMC Plant Biol 2021; 21:121. [PMID: 33639842 PMCID: PMC7913447 DOI: 10.1186/s12870-021-02894-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/13/2020] [Accepted: 02/18/2021] [Indexed: 05/02/2023]
Abstract
BACKGROUND: Transcriptomic studies combined with a well annotated genome have laid the foundations for new understanding of molecular processes. Tools which visualise gene expression patterns have further added to these resources. The manual annotation of the Actinidia chinensis (kiwifruit) genome has resulted in a high quality set of 33,044 genes. Here we investigate gene expression patterns in diverse tissues, visualised in an Electronic Fluorescent Pictograph (eFP) browser, to study the relationship of transcription factor (TF) expression using network analysis.
RESULTS: Sixty-one samples covering diverse tissues at different developmental time points were selected for RNA-seq analysis and an eFP browser was generated to visualise this dataset. 2839 TFs representing 57 different classes were identified and named. Network analysis of the TF expression patterns separated TFs into 14 different modules. Two modules consisting of 237 TFs were correlated with floral bud and flower development, a further two modules containing 160 TFs were associated with fruit development and maturation. A single module of 480 TFs was associated with ethylene-induced fruit ripening. Three "hub" genes correlated with flower and fruit development consisted of a HAF-like gene central to gynoecium development, an ERF and a DOF gene. Maturing and ripening hub genes included a KNOX gene that was associated with seed maturation, and a GRAS-like TF.
CONCLUSIONS: This study provides an insight into the complexity of the transcriptional control of flower and fruit development, as well as providing a new resource to the plant community. The Actinidia eFP browser is provided in an accessible format that allows researchers to download and work internally.
Affiliation(s)
- Lara Brian
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Private Bag 92169, Auckland, 1146, New Zealand
- Ben Warren
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Private Bag 92169, Auckland, 1146, New Zealand
- School of Biological Science, The University of Auckland, Private Bag 92019, Auckland, 1146, New Zealand
- Peter McAtee
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Private Bag 92169, Auckland, 1146, New Zealand
- Jessica Rodrigues
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Private Bag 92169, Auckland, 1146, New Zealand
- Niels Nieuwenhuizen
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Private Bag 92169, Auckland, 1146, New Zealand
- Asher Pasha
- Department of Cell & Systems Biology / Centre for the Analysis of Genome Evolution and Function, University of Toronto, 25 Willcocks St, Toronto, ON, M5S 3B2, Canada
- Karine M David
- School of Biological Science, The University of Auckland, Private Bag 92019, Auckland, 1146, New Zealand
- Annette Richardson
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), 121 Keri Downs Road, Kerikeri, 0294, New Zealand
- Nicholas J Provart
- Department of Cell & Systems Biology / Centre for the Analysis of Genome Evolution and Function, University of Toronto, 25 Willcocks St, Toronto, ON, M5S 3B2, Canada
- Andrew C Allan
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Private Bag 92169, Auckland, 1146, New Zealand
- School of Biological Science, The University of Auckland, Private Bag 92019, Auckland, 1146, New Zealand
- Erika Varkonyi-Gasic
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), Private Bag 92169, Auckland, 1146, New Zealand
- Robert J Schaffer
- School of Biological Science, The University of Auckland, Private Bag 92019, Auckland, 1146, New Zealand.
- The New Zealand Institute for Plant and Food Research Ltd (Plant & Food Research), 55 Old Mill Road, Motueka, 7198, New Zealand.
47
Zhai J, Song J, Zhang T, Xie S, Ma C. deepEA: a containerized web server for interactive analysis of epitranscriptome sequencing data. Plant Physiol 2021; 185:29-33. [PMID: 33631802 PMCID: PMC8133649 DOI: 10.1093/plphys/kiaa008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/16/2020] [Accepted: 10/16/2020] [Indexed: 06/12/2023]
Abstract
The containerized web server deepEA allows interactive, reproducible, and collaborative analysis of epitranscriptome sequencing data.
Affiliation(s)
- Jingjing Zhai
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
- Jie Song
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
- Ting Zhang
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
- Shang Xie
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
- Chuang Ma
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
48
Wojahn JMA, Galla SJ, Melton AE, Buerki S. G2PMineR: A Genome to Phenome Literature Review Approach. Genes (Basel) 2021; 12:genes12020293. [PMID: 33672535 PMCID: PMC7923769 DOI: 10.3390/genes12020293] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/03/2021] [Revised: 02/16/2021] [Accepted: 02/18/2021] [Indexed: 11/21/2022] Open
Abstract
There is a gap in the conceptual framework linking genes to phenotypes (G2P) for non-model organisms, as most non-model organisms do not yet have genomic resources readily available. To address this, researchers often perform literature reviews to understand G2P linkages by curating a list of likely gene candidates, hinging upon other studies already conducted in closely related systems. Sifting through hundreds to thousands of articles is a cumbersome task that slows down the scientific process and may introduce bias into a study. To fill this gap, we created G2PMineR, a free and open source literature mining tool developed specifically for G2P research. This R package uses automation to make the G2P review process efficient and unbiased, while also generating hypothesized associations between genes and phenotypes within a taxonomical framework. We applied the package to a literature review for drought-tolerance in plants. The analysis provides biologically meaningful results within the known framework of drought tolerance in plants. Overall, the package is useful for conducting literature reviews for genome to phenome projects, and also has broad appeal to scientists investigating a wide range of study systems as it can conduct analyses under the auspices of three different kingdoms (Plantae, Animalia, and Fungi).
49
Simon MA, O’Brian CA, Tom L, Wafford QE, Mack S, Mendez SR, Nava M, Dahdouh R, Paul-Brutus R, Carpenter KH, Kern B, Holmes KL. Development of a web tool to increase research literacy in underserved populations through public library partnerships. PLoS One 2021; 16:e0246098. [PMID: 33534794 PMCID: PMC7857632 DOI: 10.1371/journal.pone.0246098] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/03/2020] [Accepted: 01/13/2021] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVE: Inadequate diversity in clinical trials is widely recognized as a significant contributing factor to health disparities experienced by racial/ethnic minorities and other diverse populations in the US. To address this in a scalable way, we sought to develop a web tool that could help enhance underserved minority participation in clinical research.
METHODS: We used our research literacy support flashcard tool as the initial prototype for human-centered design and usability testing of the web tool Health for All in public library settings. After forming partnerships with leadership from Chicago Public Libraries (CPL), local medical libraries, and the Chicago Department of Public Health, we conducted seven iterative design sessions with focus groups of library patrons and library staff from six CPL branches serving underserved communities followed by two rounds of usability testing and website modification.
RESULTS: Based on the qualitative research findings from Design Sessions 1-7, we enacted the design decision of a website that was a hybrid of fact-filled and vignette (personal stories) paper prototypes divided into 4 modules (trust, diversity, healthy volunteers, pros/cons), each with their own outcome metrics. The website was thus constructed, and navigation issues identified in two rounds of usability testing by library patrons were addressed through further website modification, followed by the launch of a beta version of a hybridized single-scrolling and guided module prototype to allow further development with website analytics.
CONCLUSIONS: We report the development of Health for All, a website designed to enhance racial/ethnic minority participation in clinical trials by imparting research literacy, mitigating distrust engendered by longstanding racism and discrimination, and providing connections to clinical trials recruiting participants.
Affiliation(s)
- Melissa A. Simon
- Center for Health Equity Transformation and Department of Obstetrics & Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Catherine A. O’Brian
- Center for Health Equity Transformation and Department of Obstetrics & Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Laura Tom
- Center for Health Equity Transformation and Department of Obstetrics & Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Q. Eileen Wafford
- Galter Health Sciences Library & Learning Center, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Shenita Mack
- Chicago Public Library, Chicago, Illinois, United States of America
- Samuel R. Mendez
- Center for Health Equity Transformation and Department of Obstetrics & Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Magdalena Nava
- Center for Health Equity Transformation and Department of Obstetrics & Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Rabih Dahdouh
- Center for Health Equity Transformation and Department of Obstetrics & Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Rachelle Paul-Brutus
- Chicago Department of Public Health, Chicago, Illinois, United States of America
- Kathryn H. Carpenter
- University Library, University of Illinois-Chicago, Chicago, Illinois, United States of America
- Barbara Kern
- The John Crerar Library, University of Chicago, Chicago, Illinois, United States of America
- Kristi L. Holmes
- Galter Health Sciences Library & Learning Center, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
50
Geissler AS, Anthon C, Alkan F, González-Tortuero E, Poulsen LD, Kallehauge TB, Breüner A, Seemann SE, Vinther J, Gorodkin J. BSGatlas: a unified Bacillus subtilis genome and transcriptome annotation atlas with enhanced information access. Microb Genom 2021; 7:000524. [PMID: 33539279 PMCID: PMC8208703 DOI: 10.1099/mgen.0.000524] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/29/2020] [Accepted: 01/11/2021] [Indexed: 12/26/2022] Open
Abstract
A large part of our current understanding of gene regulation in Gram-positive bacteria is based on Bacillus subtilis, as it is one of the most well studied bacterial model systems. The rapid growth in data concerning its molecular and genomic biology is distributed across multiple annotation resources. Consequently, the interpretation of data from further B. subtilis experiments becomes increasingly challenging in both low- and large-scale analyses. Additionally, B. subtilis annotation of structured RNA and non-coding RNA (ncRNA), as well as the operon structure, is still lagging behind the annotation of the coding sequences. To address these challenges, we created the B. subtilis genome atlas, BSGatlas, which integrates and unifies multiple existing annotation resources. Compared to any of the individual resources, the BSGatlas contains twice as many ncRNAs, while improving the positional annotation for 70 % of the ncRNAs. Furthermore, we combined known transcription start and termination sites with lists of known co-transcribed gene sets to create a comprehensive transcript map. The combination with transcription start/termination site annotations resulted in 717 new sets of co-transcribed genes and 5335 untranslated regions (UTRs). In comparison to existing resources, the number of 5' and 3' UTRs increased nearly fivefold, and the number of internal UTRs doubled. The transcript map is organized in 2266 operons, which provides transcriptional annotation for 92 % of all genes in the genome compared to the at most 82 % by previous resources. We predicted an off-target-aware genome-wide library of CRISPR-Cas9 guide RNAs, which we also linked to polycistronic operons. We provide the BSGatlas in multiple forms: as a website (https://rth.dk/resources/bsgatlas/), an annotation hub for display in the UCSC genome browser, supplementary tables and standardized GFF3 format, which can be used in large scale -omics studies. 
By complementing existing resources, the BSGatlas supports analyses of the B. subtilis genome and its molecular biology with respect to not only non-coding genes but also genome-wide transcriptional relationships of all genes.
Affiliation(s)
- Adrian Sven Geissler
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Christian Anthon
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Ferhat Alkan
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Division of Oncogenomics, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Enrique González-Tortuero
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Present address: School of Science, Engineering and Environment, University of Salford, Salford, UK
- Line Dahl Poulsen
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 1165 Copenhagen, Denmark
- Stefan Ernst Seemann
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Jeppe Vinther
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 1165 Copenhagen, Denmark
- Jan Gorodkin
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark