1
Meisburger SP, Ando N. Scaling and merging macromolecular diffuse scattering with mdx2. Acta Crystallogr D Struct Biol 2024; 80:S2059798324002705. [PMID: 38606664 DOI: 10.1107/s2059798324002705] [Received: 01/22/2024] [Accepted: 03/25/2024] [Indexed: 04/13/2024] Open
Abstract
Diffuse scattering is a promising method to gain additional insight into protein dynamics from macromolecular crystallography experiments. Bragg intensities yield the average electron density, while the diffuse scattering can be processed to obtain a three-dimensional reciprocal-space map that is further analyzed to determine correlated motion. To make diffuse scattering techniques more accessible, software for data processing called mdx2 has been created that is both convenient to use and simple to extend and modify. mdx2 is written in Python, and it interfaces with DIALS to implement self-contained data-reduction workflows. Data are stored in NeXus format for software interchange and convenient visualization. mdx2 can be run on the command line or imported as a package, for instance to encapsulate a complete workflow in a Jupyter notebook for reproducible computing and education. Here, mdx2 version 1.0 is described, a new release incorporating state-of-the-art techniques for data reduction. The implementation of a complete multi-crystal scaling and merging workflow is described, and the methods are tested using a high-redundancy data set from cubic insulin. It is shown that redundancy can be leveraged during scaling to correct systematic errors and obtain accurate and reproducible measurements of weak diffuse signals.
Affiliation(s)
- Steve P Meisburger
  - Cornell High Energy Synchrotron Source, Cornell University, Ithaca, NY 14850, USA
- Nozomi Ando
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14850, USA
2
van Houdt PJ, Ragunathan S, Berks M, Ahmed Z, Kershaw LE, Gurney-Champion OJ, Tadimalla S, Arvidsson J, Sun Y, Kallehauge J, Dickie B, Lévy S, Bell L, Sourbron S, Thrippleton MJ. Contrast-agent-based perfusion MRI code repository and testing framework: ISMRM Open Science Initiative for Perfusion Imaging (OSIPI). Magn Reson Med 2024; 91:1774-1786. [PMID: 37667526 DOI: 10.1002/mrm.29826] [Received: 03/24/2023] [Revised: 06/30/2023] [Accepted: 07/25/2023] [Indexed: 09/06/2023]
Abstract
PURPOSE: Software has a substantial impact on quantitative perfusion MRI values. The lack of generally accepted implementations, code sharing and transparent testing reduces reproducibility, hindering the use of perfusion MRI in clinical trials. To address these issues, the ISMRM Open Science Initiative for Perfusion Imaging (OSIPI) aimed to establish a community-led, centralized repository for sharing open-source code for processing contrast-based perfusion imaging, incorporating an open-source testing framework.
METHODS: A repository was established on the OSIPI GitHub website. Python was chosen as the target software language. Calls for code contributions were made to OSIPI members, the ISMRM Perfusion Study Group, and publicly via OSIPI websites. An automated unit-testing framework was implemented to evaluate the output of code contributions, including visual representation of the results.
RESULTS: The repository hosts 86 implementations of perfusion processing steps contributed by 12 individuals or teams. These cover all core aspects of DCE- and DSC-MRI processing, including multiple implementations of the same functionality. Tests were developed for 52 implementations, covering five analysis steps. For T1 mapping, signal-to-concentration conversion and population AIF functions, different implementations resulted in near-identical output values. For the five pharmacokinetic models tested (Tofts, extended Tofts-Kety, Patlak, two-compartment exchange, and two-compartment uptake), differences in output parameters were observed between contributions.
CONCLUSIONS: The OSIPI DCE-DSC code repository represents a novel community-led model for code sharing and testing. The repository facilitates the re-use of existing code and the benchmarking of new code, promoting enhanced reproducibility in quantitative perfusion imaging.
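To make the idea of testing a pharmacokinetic-model implementation concrete, the following is a minimal sketch, not code from the OSIPI repository. It assumes the textbook form of the standard Tofts model, C_t(t) = Ktrans ∫ C_p(τ) exp(-kep (t - τ)) dτ with kep = Ktrans / ve, and checks a numerical implementation against the closed-form solution for a constant plasma concentration. The function names, parameter values and tolerance are illustrative assumptions.

```python
import math


def tofts_concentration(times, cp, ktrans, ve):
    """Tissue concentration under the standard Tofts model.

    Evaluates Ktrans * integral_0^t Cp(tau) * exp(-kep * (t - tau)) dtau
    with the trapezoidal rule, where kep = Ktrans / ve.
    """
    kep = ktrans / ve
    ct = []
    for i, t in enumerate(times):
        integral = 0.0
        for j in range(1, i + 1):
            dt = times[j] - times[j - 1]
            f_prev = cp[j - 1] * math.exp(-kep * (t - times[j - 1]))
            f_curr = cp[j] * math.exp(-kep * (t - times[j]))
            integral += 0.5 * dt * (f_prev + f_curr)
        ct.append(ktrans * integral)
    return ct


def max_abs_difference(a, b):
    """Largest pointwise difference between two equally sampled curves."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical test case: constant plasma concentration of 1.0 over 5 minutes.
times = [0.01 * k for k in range(501)]
cp = [1.0] * len(times)
ktrans, ve = 0.25, 0.3  # illustrative values (1/min, dimensionless)

numeric = tofts_concentration(times, cp, ktrans, ve)

# For constant Cp = 1 the integral has the closed form ve * (1 - exp(-kep * t)).
kep = ktrans / ve
analytic = [ve * (1.0 - math.exp(-kep * t)) for t in times]

print("max deviation from closed form:", max_abs_difference(numeric, analytic))
```

A comparison of two contributed implementations could follow the same pattern, with one implementation's output taking the place of the closed-form curve and an agreed tolerance deciding whether they match.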
Affiliation(s)
- Petra J van Houdt
  - Department of Radiation Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Michael Berks
  - Quantitative Biomedical Imaging Laboratory, Division of Cancer Sciences, The University of Manchester, Manchester, UK
- Zaki Ahmed
  - Corewell Health William Beaumont University Hospital, Diagnostic Radiology, Royal Oak, USA
- Lucy E Kershaw
  - Edinburgh Imaging and Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
- Oliver J Gurney-Champion
  - Department of Radiology and Nuclear Medicine, Amsterdam UMC location University of Amsterdam, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Sirisha Tadimalla
  - Institute of Medical Physics, The University of Sydney, Sydney, Australia
- Jonathan Arvidsson
  - Department of Medical Radiation Sciences, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Medical Physics and Biomedical Engineering, Sahlgrenska University Hospital, Gothenburg, Sweden
- Yu Sun
  - Institute of Medical Physics, The University of Sydney, Sydney, Australia
- Jesper Kallehauge
  - Aarhus University Hospital, Danish Centre for Particle Therapy, Aarhus, Denmark
  - Aarhus University, Department of Clinical Medicine, Aarhus, Denmark
- Ben Dickie
  - Division of Informatics, Imaging, and Data Science, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
  - Geoffrey Jefferson Brain Research Centre, Manchester Academic Health Science Centre, Northern Care Alliance NHS Group, The University of Manchester, Manchester, UK
- Simon Lévy
  - MR Research Collaborations, Siemens Healthcare Pty Ltd, Melbourne, Australia
- Laura Bell
  - Genentech, Inc, Clinical Imaging Group, South San Francisco, USA
- Steven Sourbron
  - University of Sheffield, Department of Infection, Immunity and Cardiovascular Disease, Sheffield, UK
- Michael J Thrippleton
  - University of Edinburgh, Edinburgh Imaging and Centre for Clinical Brain Sciences, Edinburgh, UK
3
Metz A, Stegmann DP, Panepucci EH, Buehlmann S, Huang CY, McAuley KE, Wang M, Wojdyla JA, Sharpe ME, Smith KML. HEIDI: an experiment-management platform enabling high-throughput fragment and compound screening. Acta Crystallogr D Struct Biol 2024; 80:S2059798324002833. [PMID: 38606665 DOI: 10.1107/s2059798324002833] [Received: 02/13/2024] [Accepted: 03/29/2024] [Indexed: 04/13/2024] Open
Abstract
The Swiss Light Source facilitates fragment-based drug-discovery campaigns for academic and industrial users through the Fast Fragment and Compound Screening (FFCS) software suite. This framework is further enriched by the option to utilize the Smart Digital User (SDU) software for automated data collection across the PXI, PXII and PXIII beamlines. In this work, the newly developed HEIDI webpage (https://heidi.psi.ch) is introduced: a platform crafted using state-of-the-art software architecture and web technologies for sample management of rotational data experiments. The HEIDI webpage features a data-review tab for enhanced result visualization and provides programmatic access through a representational state transfer application programming interface (REST API). The migration of the local FFCS MongoDB instance to the cloud is highlighted and detailed. This transition ensures secure, encrypted and consistently accessible data through a robust and reliable REST API tailored for the FFCS software suite. Collectively, these advancements not only significantly elevate the user experience, but also pave the way for future expansions and improvements in the capabilities of the system.
Affiliation(s)
- A Metz
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- D P Stegmann
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- E H Panepucci
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- S Buehlmann
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- C Y Huang
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- K E McAuley
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- M Wang
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- J A Wojdyla
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- M E Sharpe
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- K M L Smith
  - Swiss Light Source, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
4
Filatov DA. ProSeq4: A user-friendly multiplatform program for preparation and analysis of large-scale DNA polymorphism datasets. Mol Ecol Resour 2024:e13962. [PMID: 38646687 DOI: 10.1111/1755-0998.13962] [Received: 01/16/2024] [Revised: 03/28/2024] [Accepted: 04/08/2024] [Indexed: 04/23/2024]
Abstract
Preparation of DNA polymorphism datasets for analysis is an important step in evolutionary genetic and molecular ecology studies. Ever-growing dataset sizes make this step time consuming, but few convenient software tools are available to facilitate processing of large-scale datasets including thousands of sequence alignments. Here I report "processor of sequences v4" (proSeq4), a user-friendly multiplatform software tool for preparation and evolutionary genetic analyses of genome- or transcriptome-scale sequence polymorphism datasets. The program has an easy-to-use graphical user interface and is designed to process and analyse many thousands of datasets. It supports over two dozen file formats and includes a flexible sequence editor and various tools for data visualization, quality control and the most commonly used evolutionary genetic analyses, such as NJ-phylogeny reconstruction, DNA polymorphism analyses and coalescent simulations. Command line tools (e.g. vcf2fasta) are also provided for easier integration into bioinformatic pipelines. Apart from molecular ecology and evolution research, proSeq4 may be useful for teaching, e.g. for visual illustration of different shapes of phylogenies generated with coalescent simulations in different scenarios. ProSeq4 source code and binaries for Windows, MacOS and Ubuntu are available from https://sourceforge.net/projects/proseq/.
5
White L, Basurra S, Alsewari AA, Saeed F, Addanki SM. Temporal meta-optimiser based sensitivity analysis (TMSA) for agent-based models and applications in children's services. Sci Rep 2024; 14:9105. [PMID: 38643325 PMCID: PMC11032329 DOI: 10.1038/s41598-024-59743-8] [Received: 06/22/2023] [Accepted: 04/15/2024] [Indexed: 04/22/2024] Open
Abstract
With current and predicted economic pressures within English Children's Services in the UK, there is a growing discourse around developing methods of analysis that use existing data to make more effective interventions and policy decisions. Agent-Based modelling shows promise here, but it has limitations that require novel methods to overcome. These include challenges in managing model complexity, transparency, and validation, which may deter analysts from implementing Agent-Based simulations. Children's Services specifically can gain from the expansion of modelling techniques available to them. Sensitivity analysis is a common step when analysing models, but current methods have limitations when applied to Agent-Based Models (ABMs). This paper outlines an improved method of conducting sensitivity analysis to enable better utilisation of ABMs within Children's Services. By using machine-learning-based regression in conjunction with the Nomadic Peoples Optimiser (NPO), a method of conducting sensitivity analysis tailored for ABMs is achieved. This paper demonstrates the effectiveness of the approach by drawing comparisons with common existing methods of sensitivity analysis, followed by a demonstration of an improved ABM design in the target use case.
Affiliation(s)
- Luke White
  - College of Computing and Digital Technology, Birmingham City University, Birmingham, B4 7XG, UK
- Shadi Basurra
  - College of Computing and Digital Technology, Birmingham City University, Birmingham, B4 7XG, UK
- Abdulrahman A Alsewari
  - College of Computing and Digital Technology, Birmingham City University, Birmingham, B4 7XG, UK
- Faisal Saeed
  - College of Computing and Digital Technology, Birmingham City University, Birmingham, B4 7XG, UK
6
Jung J, Tan C, Sugita Y. GENESIS CGDYN: large-scale coarse-grained MD simulation with dynamic load balancing for heterogeneous biomolecular systems. Nat Commun 2024; 15:3370. [PMID: 38643169 PMCID: PMC11032353 DOI: 10.1038/s41467-024-47654-1] [Received: 09/07/2023] [Accepted: 04/08/2024] [Indexed: 04/22/2024] Open
Abstract
Residue-level coarse-grained (CG) molecular dynamics (MD) simulation is widely used to investigate slow biological processes that involve multiple proteins, nucleic acids, and their complexes. Biomolecules in a large simulation system are distributed non-uniformly, limiting computational efficiency with conventional methods. Here, we develop a hierarchical domain decomposition scheme with dynamic load balancing for heterogeneous biomolecular systems to keep computational efficiency even after drastic changes in particle distribution. These schemes are applied to the dynamics of intrinsically disordered protein (IDP) droplets. During the fusion of two droplets, we find that the changes in droplet shape correlate with the mixing of IDP chains. Additionally, we simulate large systems with multiple IDP droplets, achieving simulation sizes comparable to those observed in microscopy. In our MD simulations, we directly observe Ostwald ripening, a phenomenon where small droplets dissolve and their molecules redeposit into larger droplets. These methods have been implemented in CGDYN of the GENESIS software, offering a tool for investigating mesoscopic biological processes using the residue-level CG models.
Affiliation(s)
- Jaewoon Jung
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama, 351-0198, Japan
- Cheng Tan
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
- Yuji Sugita
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama, 351-0198, Japan
  - Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo, 650-0047, Japan
7
Wündisch E, Hufnagl P, Brunecker P, Meier Zu Ummeln S, Träger S, Kopp M, Prasser F, Weber J. Development of a Trusted Third Party at a Large University Hospital: Design and Implementation Study. JMIR Med Inform 2024; 12:e53075. [PMID: 38632712 DOI: 10.2196/53075] [Received: 09/25/2023] [Revised: 02/15/2024] [Accepted: 02/17/2024] [Indexed: 04/19/2024] Open
Abstract
Background: Pseudonymization has become a best practice to securely manage the identities of patients and study participants in medical research projects and data sharing initiatives. This method offers the advantage of not requiring the direct identification of data to support various research processes while still allowing for advanced processing activities, such as data linkage. Often, pseudonymization and related functionalities are bundled in specific technical and organizational units known as trusted third parties (TTPs). However, pseudonymization can significantly increase the complexity of data management and research workflows, necessitating adequate tool support. Common tasks of TTPs include supporting the secure registration and pseudonymization of patient and sample identities as well as managing consent.
Objective: Despite the challenges involved, little has been published about successful architectures and functional tools for implementing TTPs in large university hospitals. The aim of this paper is to fill this research gap by describing the software architecture and tool set developed and deployed as part of a TTP established at Charité - Universitätsmedizin Berlin.
Methods: The infrastructure for the TTP was designed to provide a modular structure while keeping maintenance requirements low. Basic functionalities were realized with the free MOSAIC tools. However, supporting common study processes requires implementing workflows that span different basic services, such as patient registration, followed by pseudonym generation and concluded by consent collection. To achieve this, an integration layer was developed to provide a unified representational state transfer (REST) application programming interface (API) as a basis for more complex workflows. Based on this API, a unified graphical user interface was also implemented, providing an integrated view of information objects and workflows supported by the TTP. The API was implemented using Java and Spring Boot, while the graphical user interface was implemented in PHP and Laravel. Both services use a shared Keycloak instance as a unified management system for roles and rights.
Results: By the end of 2022, the TTP had supported more than 10 research projects since its launch in December 2019. Within these projects, more than 3000 identities were stored, more than 30,000 pseudonyms were generated, and more than 1500 consent forms were submitted. In total, more than 150 people regularly work with the software platform. By implementing the integration layer and the unified user interface, together with comprehensive roles and rights management, the effort for operating the TTP could be significantly reduced, as personnel of the supported research projects can use many functionalities independently.
Conclusions: With the architecture and components described, we created a user-friendly and compliant environment for supporting research projects. We believe that the insights into the design and implementation of our TTP can help other institutions to efficiently and effectively set up corresponding structures.
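As a rough illustration of what pseudonym generation can look like, the sketch below derives a deterministic pseudonym from an identifier with a keyed hash (HMAC-SHA256) from the Python standard library. This is a swapped-in, generic technique chosen for brevity; the abstract does not say how the MOSAIC tools or the Charité TTP generate pseudonyms, and the function name, key and identifier format here are hypothetical.

```python
import hashlib
import hmac


def make_pseudonym(identifier: str, secret_key: bytes, length: int = 16) -> str:
    """Derive a deterministic pseudonym from an identifier with HMAC-SHA256.

    Illustrative only: the same identifier and key always give the same
    pseudonym, and without the key the pseudonym cannot be recomputed.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length].upper()


# Hypothetical key and identifiers; a real deployment would manage the key securely.
key = b"demo-key-not-for-production"
p1 = make_pseudonym("patient-0001", key)
p2 = make_pseudonym("patient-0001", key)
p3 = make_pseudonym("patient-0002", key)

print(p1 == p2)  # same identifier, same pseudonym
print(p1 == p3)  # different identifiers, different pseudonyms
```

Because the mapping is deterministic for a given key, records pseudonymized separately can still be linked, which is the kind of data linkage the Background section mentions.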
Affiliation(s)
- Eric Wündisch
  - Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Peter Hufnagl
  - Digital Pathology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Peter Brunecker
  - Core Unit Research IT, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Sophie Meier Zu Ummeln
  - Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Sarah Träger
  - Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Marcus Kopp
  - Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Fabian Prasser
  - Medical Informatics Group, Center of Health Data Science, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Joachim Weber
  - Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Center for Stroke Research Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - German Centre for Cardiovascular Research (DZHK), Berlin, Germany
8
Ferris Z, Ribeiro E, Nagata T, van Woesik R. ReScape: transforming coral-reefscape images for quantitative analysis. Sci Rep 2024; 14:8915. [PMID: 38632306 PMCID: PMC11024090 DOI: 10.1038/s41598-024-59123-2] [Received: 12/12/2023] [Accepted: 04/08/2024] [Indexed: 04/19/2024] Open
Abstract
Ever since the first image of a coral reef was captured in 1885, people worldwide have been accumulating images of coral reefscapes that document the historic conditions of reefs. However, these innumerable reefscape images suffer from perspective distortion, which reduces the apparent size of distant taxa, rendering the images unusable for quantitative analysis of reef conditions. Here we solve this century-long distortion problem by developing a novel computer-vision algorithm, ReScape, which removes the perspective distortion from reefscape images by transforming them into top-down views, making them usable for quantitative analysis of reef conditions. In doing so, we demonstrate the first-ever ecological application and extension of inverse-perspective mapping, a foundational technique used in the autonomous-driving industry. The ReScape algorithm is composed of seven functions that (1) calibrate the camera lens, (2) remove the inherent lens-induced image distortions, (3) detect the scene's horizon line, (4) remove the camera-roll angle, (5) detect the transformable reef area, (6) detect the scene's perspective geometry, and (7) apply brute-force inverse-perspective mapping. The performance of the ReScape algorithm was evaluated by transforming the perspective of 125 reefscape images. Eighty-five percent of the images had no processing errors and, of those, 95% were successfully transformed into top-down views. ReScape was validated by demonstrating that same-length transects, placed increasingly farther from the camera, became the same length after transformation.
The mission of the ReScape algorithm is to (i) unlock historical information about coral-reef conditions from previously unquantified periods and localities, (ii) enable citizen scientists and recreational photographers to contribute reefscape images to the scientific process, and (iii) provide a new survey technique that can rigorously assess relatively large areas of coral reefs, and other marine and even terrestrial ecosystems, worldwide. To facilitate this mission, we compiled the ReScape algorithm into a free, user-friendly App that does not require any coding experience. Equipped with the ReScape App, scientists can improve the management and prediction of the future of coral reefs by uncovering historical information from reefscape-image archives and by using reefscape images as a new, rapid survey method, opening a new era of coral-reef monitoring.
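The core geometric idea behind inverse-perspective mapping can be sketched with a planar homography. The example below is a generic illustration and not the ReScape implementation, whose details the abstract does not give: it assumes that four image points outlining a flat ground region and their desired top-down positions are already known, solves for the 3x3 homography, and maps points through it. All coordinates and function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_points(src, dst):
    """3x3 homography mapping four src points onto four dst points (h33 fixed to 1)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the eight unknowns.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear_system(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map an image point to top-down coordinates, dividing by the projective scale."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical trapezoid seen in the image (narrow far edge, wide near edge)
# and the rectangle it should become in the top-down view.
src = [(200, 300), (440, 300), (640, 480), (0, 480)]
dst = [(0, 0), (400, 0), (400, 600), (0, 600)]

H = homography_from_points(src, dst)
mapped = [apply_homography(H, p) for p in src]
print(mapped)
```

In this toy case the far edge (240 pixels wide) and the near edge (640 pixels wide) both map to a width of 400 units, which mirrors the validation described above, where same-length transects at different distances become the same length after transformation.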
Affiliation(s)
- Z Ferris
  - Institute for Global Ecology, Florida Institute of Technology, Melbourne, FL, 32901, USA
- E Ribeiro
  - Department of Computer Science, Florida Institute of Technology, Melbourne, FL, 32901, USA
- T Nagata
  - Incorporated Foundation Okinawa Environment Science Center, Urasoe, Okinawa, 901-2111, Japan
- R van Woesik
  - Institute for Global Ecology, Florida Institute of Technology, Melbourne, FL, 32901, USA
9
Alfonsi T, Bernasconi A, Chiara M, Ceri S. Data-driven recombination detection in viral genomes. Nat Commun 2024; 15:3313. [PMID: 38632281 PMCID: PMC11024102 DOI: 10.1038/s41467-024-47464-5] [Received: 07/18/2023] [Accepted: 03/25/2024] [Indexed: 04/19/2024] Open
Abstract
Recombination is a key molecular mechanism for the evolution and adaptation of viruses. The first recombinant SARS-CoV-2 genomes were recognized in 2021; as of today, more than ninety SARS-CoV-2 lineages are designated as recombinant. In the wake of the COVID-19 pandemic, several methods for detecting recombination in SARS-CoV-2 have been proposed; however, none could faithfully confirm manual analyses by experts in the field. We hereby present RecombinHunt, an original data-driven method for the identification of recombinant genomes, capable of recognizing recombinant SARS-CoV-2 genomes (or lineages) with one or two breakpoints with high accuracy and within reduced turn-around times. RecombinHunt shows high specificity and sensitivity, compares favorably with other state-of-the-art methods, and faithfully confirms manual analyses by experts. RecombinHunt identifies recombinant viral genomes from the recent monkeypox epidemic in high concordance with manually curated analyses by experts, suggesting that our approach is robust and can be applied to any epidemic/pandemic virus.
Affiliation(s)
- Tommaso Alfonsi
  - Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Via Ponzio 34/5, 20133, Milan, Italy
- Anna Bernasconi
  - Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Via Ponzio 34/5, 20133, Milan, Italy
- Matteo Chiara
  - Department of Biosciences, Università degli Studi di Milano, Via Celoria 26, 20133, Milan, Italy
- Stefano Ceri
  - Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Via Ponzio 34/5, 20133, Milan, Italy
10
Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA, Hunter LE. An open source knowledge graph ecosystem for the life sciences. Sci Data 2024; 11:363. [PMID: 38605048 PMCID: PMC11009265 DOI: 10.1038/s41597-024-03171-w] [Received: 07/26/2023] [Accepted: 03/21/2024] [Indexed: 04/13/2024] Open
Abstract
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
Affiliation(s)
- Tiffany J Callahan
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
| | - Ignacio J Tripodi
- Computer Science Department, Interdisciplinary Quantitative Biology, University of Colorado Boulder, Boulder, CO, 80301, USA
- Adrianne L Stefanski
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Luca Cappelletti
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Sanya B Taneja
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Jordan M Wyrwa
- Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Elena Casiraghi
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Justin Reese
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Jonathan C Silverstein
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
- Charles Tapley Hoyt
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, 02115, USA
- Richard D Boyce
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
- Scott A Malec
- Division of Translational Informatics, University of New Mexico School of Medicine, Albuquerque, NM, 87131, USA
- Deepak R Unni
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Marcin P Joachimiak
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Peter N Robinson
- Berlin Institute of Health at Charité-Universitätsmedizin, 10117, Berlin, Germany
- Christopher J Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Emanuele Cavalleri
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Tommaso Fontana
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Giorgio Valentini
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- ELLIS, European Laboratory for Learning and Intelligent Systems, Milan Unit, Italy
- Marco Mesiti
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Lucas A Gillenwater
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Brook Santangelo
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Nicole A Vasilevsky
- Data Collaboration Center, Critical Path Institute, 1840 E River Rd. Suite 100, Tucson, AZ, 85718, USA
- Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Tellen D Bennett
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Patrick B Ryan
- Janssen Research and Development, Raritan, NJ, 08869, USA
- George Hripcsak
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Michael G Kahn
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Michael Bada
- Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- William A Baumgartner
- Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
- Lawrence E Hunter
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
11
Bentley-Abbot C, Heslop R, Pirillo C, Chandrasegaran P, McConnell G, Roberts E, Hutchinson E, MacLeod A. An easy to use tool for the analysis of subcellular mRNA transcript colocalisation in smFISH data. Sci Rep 2024; 14:8348. [PMID: 38594373 PMCID: PMC11004122 DOI: 10.1038/s41598-024-58641-3] [Received: 09/14/2023] [Accepted: 04/01/2024] [Indexed: 04/11/2024] Open
Abstract
Single molecule fluorescence in situ hybridisation (smFISH) has become a valuable tool to investigate the mRNA expression of single cells. However, it requires a considerable amount of programming expertise to use currently available open-source analytical software packages to extract and analyse quantitative data about transcript expression. Here, we present FISHtoFigure, a new software tool developed specifically for the analysis of mRNA abundance and co-expression in QuPath-quantified, multi-labelled smFISH data. FISHtoFigure facilitates the automated spatial analysis of transcripts of interest, allowing users to analyse populations of cells positive for specific combinations of mRNA targets without the need for computational image analysis expertise. As a proof of concept and to demonstrate the capabilities of this new research tool, we have validated FISHtoFigure in multiple biological systems. We used FISHtoFigure to identify an upregulation in the expression of Cd4 by T-cells in the spleens of mice infected with influenza A virus, before analysing more complex data showing crosstalk between microglia and regulatory B-cells in the brains of mice infected with Trypanosoma brucei brucei. These analyses demonstrate the ease of analysing cell expression profiles using FISHtoFigure and the value of this new tool in the field of smFISH data analysis.
Affiliation(s)
- Calum Bentley-Abbot
- Wellcome Centre for Integrative Parasitology (WCIP), University of Glasgow, Glasgow, UK.
- School of Biodiversity, One Health, Veterinary Medicine (SBOHVM), College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK.
- MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Glasgow, UK.
- Rhiannon Heslop
- Wellcome Centre for Integrative Parasitology (WCIP), University of Glasgow, Glasgow, UK
- School of Biodiversity, One Health, Veterinary Medicine (SBOHVM), College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
- Praveena Chandrasegaran
- Wellcome Centre for Integrative Parasitology (WCIP), University of Glasgow, Glasgow, UK
- School of Biodiversity, One Health, Veterinary Medicine (SBOHVM), College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
- Gail McConnell
- Department of Physics, University of Strathclyde, Glasgow, UK
- Ed Roberts
- Beatson Institute for Cancer Research, Glasgow, UK
- Edward Hutchinson
- MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Glasgow, UK
- Annette MacLeod
- Wellcome Centre for Integrative Parasitology (WCIP), University of Glasgow, Glasgow, UK
- School of Biodiversity, One Health, Veterinary Medicine (SBOHVM), College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
12
Wang K, Yang Y, Wu F, Song B, Wang X, Wang T. Author Correction: Comparative analysis of dimension reduction methods for cytometry by time-of-flight data. Nat Commun 2024; 15:3006. [PMID: 38589345 PMCID: PMC11001602 DOI: 10.1038/s41467-024-47234-3] [Indexed: 04/10/2024] Open
Affiliation(s)
- Kaiwen Wang
- Department of Statistical Science, Southern Methodist University, Dallas, TX, 75275, USA
- Yuqiu Yang
- Department of Statistical Science, Southern Methodist University, Dallas, TX, 75275, USA
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Fangjiang Wu
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Bing Song
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Xinlei Wang
- Department of Statistical Science, Southern Methodist University, Dallas, TX, 75275, USA.
- Department of Mathematics, University of Texas at Arlington, Arlington, TX, 76019, USA.
- Center for Data Science Research and Education, College of Science, University of Texas at Arlington, Arlington, 76019, USA.
- Tao Wang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.
13
Wright E. Accurately clustering biological sequences in linear time by relatedness sorting. Nat Commun 2024; 15:3047. [PMID: 38589369 PMCID: PMC11001989 DOI: 10.1038/s41467-024-47371-9] [Received: 10/27/2022] [Accepted: 03/28/2024] [Indexed: 04/10/2024] Open
Abstract
Clustering biological sequences into similar groups is an increasingly important task as the number of available sequences continues to grow exponentially. Search-based approaches to clustering scale super-linearly with the number of input sequences, making it impractical to cluster very large sets of sequences. Approaches to clustering sequences in linear time currently lack the accuracy of super-linear approaches. Here, I set out to develop and characterize a strategy for clustering with linear time complexity that retains the accuracy of less scalable approaches. The resulting algorithm, named Clusterize, sorts sequences by relatedness to linearize the clustering problem. Clusterize produces clusters with accuracy rivaling popular programs (CD-HIT, MMseqs2, and UCLUST) but exhibits linear asymptotic scalability. Clusterize generates higher accuracy and oftentimes much larger clusters than Linclust, a fast linear time clustering algorithm. I demonstrate the utility of Clusterize by accurately solving different clustering problems involving millions of nucleotide or protein sequences.
Affiliation(s)
- Erik Wright
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
- Center for Evolutionary Biology and Medicine, Pittsburgh, PA, USA.
14
Andreica T, Musuroi A, Anistoroaei A, Jichici C, Groza B. Blockchain integration for in-vehicle CAN bus intrusion detection systems with ISO/SAE 21434 compliant reporting. Sci Rep 2024; 14:8169. [PMID: 38589628 PMCID: PMC11002018 DOI: 10.1038/s41598-024-58694-4] [Received: 08/24/2022] [Accepted: 03/29/2024] [Indexed: 04/10/2024] Open
Abstract
The development of Intrusion Detection Systems (IDS) for in-vehicle buses has gained a lot of momentum in recent years as the number of reported vulnerabilities and the degree of interconnectivity for modern vehicles are on the rise. Since intrusion detection is resource consuming, it can be performed on computationally capable Android head units that are now present inside vehicles. Moreover, these units are connected to the internet, which enables the use of more complex algorithms that run in cloud environments. In this work we develop one such approach: an IDS that consists of a locally installed copy, running on head units, and a centralized instance of it that runs in the cloud and monitors traffic for groups of similar vehicles. Additionally, the centralized instance is part of a cloud service for intrusion detection which is continuously updated with the most recent types of attacks. The classification results of the cloud-based service are further analyzed by an incident response team which confirms the presence of known attacks, analyzes new types of attacks and assesses their impact. The output of this activity is stored on the Blockchain as ISO/SAE 21434 compliant reports, ensuring the transparency and traceability of the reported incidents.
Affiliation(s)
- Tudor Andreica
- Faculty of Automation and Computers, Politehnica University of Timisoara, 300223, Timisoara, Romania
- Adrian Musuroi
- Faculty of Automation and Computers, Politehnica University of Timisoara, 300223, Timisoara, Romania
- Alfred Anistoroaei
- Faculty of Automation and Computers, Politehnica University of Timisoara, 300223, Timisoara, Romania
- Camil Jichici
- Faculty of Automation and Computers, Politehnica University of Timisoara, 300223, Timisoara, Romania
- Bogdan Groza
- Faculty of Automation and Computers, Politehnica University of Timisoara, 300223, Timisoara, Romania.
15
O'Leary K, Zheng D. Metacell-based differential expression analysis identifies cell type specific temporal gene response programs in COVID-19 patient PBMCs. NPJ Syst Biol Appl 2024; 10:36. [PMID: 38580667 PMCID: PMC10997786 DOI: 10.1038/s41540-024-00364-2] [Received: 12/07/2023] [Accepted: 03/27/2024] [Indexed: 04/07/2024] Open
Abstract
By profiling gene expression in individual cells, single-cell RNA-sequencing (scRNA-seq) can resolve cellular heterogeneity and cell-type gene expression dynamics. Its application to time-series samples can identify temporal gene programs active in different cell types, for example, immune cells' responses to viral infection. However, current scRNA-seq analysis has limitations. One is the low number of genes detected per cell. The second is insufficient replicates (often 1-2) due to high experimental cost. The third lies in the data analysis: treating individual cells as independent measurements leads to inflated statistics. To address these, we explore a new computational framework, specifically whether "metacells" constructed to maintain cellular heterogeneity within individual cell types (or clusters) can be used as "replicates" for increasing statistical rigor. Toward this, we applied SEACells to a time-series scRNA-seq dataset from peripheral blood mononuclear cells (PBMCs) after SARS-CoV-2 infection to construct metacells, and used them in maSigPro for quadratic regression to find significantly differentially expressed genes (DEGs) over time, followed by clustering expression velocity trends. We showed that such metacells retained greater expression variances and produced more biologically meaningful DEGs compared to either metacells generated randomly or from simple pseudobulk methods. More specifically, this approach correctly identified the known ISG15 interferon response program in almost all PBMC cell types and many DEGs enriched in the previously defined SARS-CoV-2 infection response pathway. It also uncovered additional and more cell type-specific temporal gene expression programs. Overall, our results demonstrate that the metacell-pseudoreplicate strategy could potentially overcome the limitation of 1-2 replicates.
Affiliation(s)
- Kevin O'Leary
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Deyou Zheng
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA.
- Department of Neurology, Albert Einstein College of Medicine, Bronx, NY, USA.
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA.
16
Sternberg PW, Van Auken K, Wang Q, Wright A, Yook K, Zarowiecki M, Arnaboldi V, Becerra A, Brown S, Cain S, Chan J, Chen WJ, Cho J, Davis P, Diamantakis S, Dyer S, Grigoriadis D, Grove CA, Harris T, Howe K, Kishore R, Lee R, Longden I, Luypaert M, Muller HM, Nuin P, Quinton-Tulloch M, Raciti D, Schedl T, Schindelman G, Stein L. WormBase 2024: status and transitioning to Alliance infrastructure. Genetics 2024:iyae050. [PMID: 38573366 DOI: 10.1093/genetics/iyae050] [Received: 12/21/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 04/05/2024] Open
Abstract
WormBase has been the major repository and knowledgebase of information about the genome and genetics of C. elegans and other nematodes of experimental interest for over two decades. We have three goals: to keep current with the fast-paced C. elegans research, to provide better integration with other resources, and to be sustainable. Here we discuss the current state of WormBase as well as progress and plans for moving core WormBase infrastructure to the Alliance of Genome Resources (the Alliance). As an Alliance member, WormBase will continue to interact with the C. elegans community, develop new features as needed, and curate key information from the literature and large-scale projects.
Affiliation(s)
- Paul W Sternberg
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Kimberly Van Auken
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Qinghua Wang
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Adam Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Magdalena Zarowiecki
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Valerio Arnaboldi
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephanie Brown
- School of Infection and Immunity, University of Glasgow, G12 8TA, UK
- Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Jaehyoung Cho
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Christian A Grove
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Todd Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Kevin Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ranjana Kishore
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Raymond Lee
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Ian Longden
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Hans-Michael Muller
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniela Raciti
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Tim Schedl
- Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Gary Schindelman
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
17
dos Santos LRA, de Oliveira AM, dos Santos LMAC, Aguilar GJ, Costa WDL, Donato DDCB, Bollela VR. Collaborative Development of an Electronic Portfolio to Support the Assessment and Development of Medical Undergraduates. JMIR Med Educ 2024; 10:e56568. [PMID: 38596841 PMCID: PMC11007380 DOI: 10.2196/56568] [Received: 01/19/2024] [Revised: 02/27/2024] [Accepted: 03/04/2024] [Indexed: 04/11/2024]
Abstract
This study outlines the development of an electronic portfolio (e-portfolio) designed to capture and record the overall academic performance of medical undergraduate students throughout their educational journey. Additionally, it facilitates the capture of narratives on lived experiences and sharing of reflections, fostering collaboration between students and their mentors.
Affiliation(s)
- Alan Maicon de Oliveira
- School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, Brazil
- Guilherme José Aguilar
- Faculty of Philosophy, Sciences and Letters at Ribeirão Preto, University of São Paulo, Ribeirão Preto, Brazil
- Valdes Roberto Bollela
- Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Clinical Hospital of the Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
18
Nurmi J, Paju A, Brumley BB, Insoll T, Ovaska AK, Soloveva V, Vaaranen-Valkonen N, Aaltonen M, Arroyo D. Investigating child sexual abuse material availability, searches, and users on the anonymous Tor network for a public health intervention strategy. Sci Rep 2024; 14:7849. [PMID: 38570603 PMCID: PMC10991312 DOI: 10.1038/s41598-024-58346-7] [Received: 09/12/2023] [Accepted: 03/27/2024] [Indexed: 04/05/2024] Open
Abstract
Tor is widely used for staying anonymous online and accessing onion websites; unfortunately, Tor is popular for distributing and viewing illicit child sexual abuse material (CSAM). From 2018 to 2023, we analyse 176,683 onion domains and find that one-fifth share CSAM. We find that CSAM is easily available using 21 out of the 26 most-used Tor search engines. We analyse 110,133,715 search sessions from the Ahmia.fi search engine and discover that 11.1% seek CSAM. When searching CSAM by age, 40.5% search for 11-year-olds and younger; 11.0% for 12-year-olds; 8.2% for 13-year-olds; 11.6% for 14-year-olds; 10.9% for 15-year-olds; and 12.7% for 16-year-olds. We demonstrate accurate filtering for search engines, introduce intervention, show a questionnaire for CSAM users, and analyse 11,470 responses. 65.3% of CSAM users first saw the material when they were children themselves, and half of the respondents first saw the material accidentally, demonstrating the availability of CSAM. 48.1% want to stop using CSAM. Some seek help through Tor, and self-help websites are popular. Our survey finds commonalities between CSAM use and addiction. Help-seeking correlates with increasing viewing duration and frequency, depression, anxiety, self-harming thoughts, guilt, and shame. Yet, 73.9% of help seekers have not been able to receive it.
Affiliation(s)
- Juha Nurmi
- Tampere University, FI-33720, Tampere, Finland.
- Arttu Paju
- Tampere University, FI-33720, Tampere, Finland
- Tegan Insoll
- Suojellaan Lapsia, Protect Children ry., FI-00580, Helsinki, Finland
- Anna K Ovaska
- Suojellaan Lapsia, Protect Children ry., FI-00580, Helsinki, Finland
- Valeriia Soloveva
- Suojellaan Lapsia, Protect Children ry., FI-00580, Helsinki, Finland
- Mikko Aaltonen
- University of Eastern Finland, FI-80101, Joensuu, Finland
- David Arroyo
- Consejo Superior de Investigaciones Científicas, 28014, Madrid, Spain
19
Lancaster AK, Single RM, Mack SJ, Sochat V, Mariani MP, Webster GD. PyPop: a mature open-source software pipeline for population genomics. Front Immunol 2024; 15:1378512. [PMID: 38629078 PMCID: PMC11019567 DOI: 10.3389/fimmu.2024.1378512] [Received: 01/29/2024] [Accepted: 03/08/2024] [Indexed: 04/19/2024] Open
Abstract
Python for Population Genomics (PyPop) is a software package that processes genotype and allele data and performs large-scale population genetic analyses on highly polymorphic multi-locus genotype data. In particular, PyPop tests data conformity to Hardy-Weinberg equilibrium expectations, performs Ewens-Watterson tests for selection, estimates haplotype frequencies, measures linkage disequilibrium, and tests significance. A standardized means of performing these tests is key for contemporary studies of evolutionary biology and population genetics, and these tests are central to genetic studies of disease association as well. Here, we present PyPop 1.0.0, a new major release of the package, which implements new features using the more robust infrastructure of GitHub, and is distributed via the industry-standard Python Package Index. New features include implementation of the asymmetric linkage disequilibrium measures and, of particular interest to the immunogenetics research communities, support for modern nomenclature, including colon-delimited allele names, and improvements to meta-analysis features for aggregating outputs for multiple populations. Code available at: https://zenodo.org/records/10080668 and https://github.com/alexlancaster/pypop.
Affiliation(s)
- Alexander K. Lancaster
- Amber Biology LLC, Cambridge, MA, United States
- Ronin Institute, Montclair, NJ, United States
- Institute for Globally Distributed Open Research and Education (IGDORE), Cambridge, MA, United States
- Richard M. Single
- Department of Mathematics and Statistics, University of Vermont, Burlington, VT, United States
- Steven J. Mack
- Department of Pediatrics, University of California, San Francisco, Oakland, CA, United States
- Vanessa Sochat
- Livermore Computing, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Michael P. Mariani
- Department of Mathematics and Statistics, University of Vermont, Burlington, VT, United States
- Mariani Systems LLC, Hanover, NH, United States
- Gordon D. Webster
- Amber Biology LLC, Cambridge, MA, United States
- Ronin Institute, Montclair, NJ, United States
20
Mahout M, Carlson RP, Simon L, Peres S. Logic programming-based Minimal Cut Sets reveal consortium-level therapeutic targets for chronic wound infections. NPJ Syst Biol Appl 2024; 10:34. [PMID: 38565568 PMCID: PMC10987626 DOI: 10.1038/s41540-024-00360-6] [Received: 07/29/2023] [Accepted: 03/13/2024] [Indexed: 04/04/2024] Open
Abstract
Minimal Cut Sets (MCSs) identify sets of reactions which, when removed from a metabolic network, disable certain cellular functions. The traditional search for MCSs within genome-scale metabolic models (GSMMs) targets cellular growth, identifies reaction sets resulting in a lethal phenotype if disrupted, and retrieves a list of corresponding gene, mRNA, or enzyme targets. Using the dual link between MCSs and Elementary Flux Modes (EFMs), our logic programming-based tool aspefm was able to compute MCSs of any size from GSMMs in acceptable run times. The tool demonstrated better performance when computing large-sized MCSs than the mixed-integer linear programming methods. We applied the new MCSs methodology to a medically-relevant consortium model of two cross-feeding bacteria, Staphylococcus aureus and Pseudomonas aeruginosa. aspefm constraints were used to bias the computation of MCSs toward exchanged metabolites that could complement lethal phenotypes in individual species. We found that interspecies metabolite exchanges could play an essential role in rescuing single-species growth, for instance inosine could complement lethal reaction knock-outs in the purine synthesis, glycolysis, and pentose phosphate pathways of both bacteria. Finally, MCSs were used to derive a list of promising enzyme targets for consortium-level therapeutic applications that cannot be circumvented via interspecies metabolite exchange.
Affiliation(s)
- Maxime Mahout
- Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, 91405, Orsay, France
- Ross P Carlson
- Department of Chemical and Biological Engineering, Center for Biofilm Engineering, Microbiology and Immunology, Montana State University, Bozeman, MT, 59717, USA
- Laurent Simon
- Bordeaux-INP, Université Bordeaux, LaBRI, 33405, Talence Cedex, France
- Sabine Peres
- UMR CNRS 5558, Laboratoire de Biométrie et de Biologie Évolutive, Université Claude Bernard Lyon 1, 69100, Villeurbanne, France.
- INRIA Lyon Centre, 69100, Villeurbanne, France.
21
Baril T, Galbraith J, Hayward A. Earl Grey: A Fully Automated User-Friendly Transposable Element Annotation and Analysis Pipeline. Mol Biol Evol 2024; 41:msae068. [PMID: 38577785 PMCID: PMC11003543 DOI: 10.1093/molbev/msae068] [Received: 11/24/2023] [Revised: 02/20/2024] [Accepted: 03/22/2024] [Indexed: 04/06/2024] Open
Abstract
Transposable elements (TEs) are major components of eukaryotic genomes and are implicated in a range of evolutionary processes. Yet, TE annotation and characterization remain challenging, particularly for nonspecialists, since existing pipelines are typically complicated to install, run, and extract data from. Current methods of automated TE annotation are also subject to issues that reduce overall quality, particularly (i) fragmented and overlapping TE annotations, leading to erroneous estimates of TE count and coverage, and (ii) repeat models represented by short sections of total TE length, with poor capture of 5' and 3' ends. To address these issues, we present Earl Grey, a fully automated TE annotation pipeline designed for user-friendly curation and annotation of TEs in eukaryotic genome assemblies. Using nine simulated genomes and an annotation of Drosophila melanogaster, we show that Earl Grey outperforms current widely used TE annotation methodologies in ameliorating the issues mentioned above while scoring highly in benchmarking for TE annotation and classification and being robust across genomic contexts. Earl Grey provides a comprehensive and fully automated TE annotation toolkit that provides researchers with paper-ready summary figures and outputs in standard formats compatible with other bioinformatics tools. Earl Grey has a modular format, with great scope for the inclusion of additional modules focused on further quality control and tailored analyses in future releases.
Affiliation(s)
- Tobias Baril
  - Centre for Ecology and Conservation, University of Exeter, Penryn Campus, Cornwall TR10 9FE, UK
  - Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, 2000 Neuchâtel, Switzerland
- James Galbraith
  - Centre for Ecology and Conservation, University of Exeter, Penryn Campus, Cornwall TR10 9FE, UK
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
- Alex Hayward
  - Centre for Ecology and Conservation, University of Exeter, Penryn Campus, Cornwall TR10 9FE, UK

22
Wagner C, Urquiza-Garcia U, Zurbriggen MD, Beyer HM. GMOCU: Digital Documentation, Management, and Biological Risk Assessment of Genetic Parts. Adv Biol (Weinh) 2024; 8:e2300529. [PMID: 38263723 DOI: 10.1002/adbi.202300529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 01/02/2024] [Indexed: 01/25/2024]
Abstract
The continuous evolution of molecular biology and gene synthesis methods paired with an ever-increasing potential of synthetic biology approaches and genome engineering toolkits enables the rapid design of genetic bioparts and genetically modified organisms. Although various software solutions assist with specific design tasks and challenges, lab-internal documentation and ensuring compliance with governmental regulations on biosafety assessment of the generated organisms remain the responsibility of individual academic researchers. This results in inconsistent and redundant documentation regimes and a significant time and labor burden. GMOCU (GMO documentation) is a standardized, semi-automatic, user-oriented software approach, written in Python and freely available, that unifies lab-internal data documentation on genetic parts and genetically modified organisms (GMOs). It automates biological risk evaluations and maintains a shared up-to-date inventory of bioparts for team-wide data navigation and sharing. GMOCU further enables data export into customizable formats suitable for scientific publications, official biosafety documents, and the research community.
Affiliation(s)
- Christoph Wagner
  - Institute of Synthetic Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225, Düsseldorf, Germany
- Uriel Urquiza-Garcia
  - Institute of Synthetic Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225, Düsseldorf, Germany
  - CEPLAS-Cluster of Excellence on Plant Sciences, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225, Düsseldorf, Germany
- Matias D Zurbriggen
  - Institute of Synthetic Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225, Düsseldorf, Germany
  - CEPLAS-Cluster of Excellence on Plant Sciences, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225, Düsseldorf, Germany
- Hannes M Beyer
  - Institute of Synthetic Biology, Heinrich-Heine-University Düsseldorf, Universitätsstrasse 1, D-40225, Düsseldorf, Germany

23
Liu S, Golozar A, Buesgens N, McLeggon JA, Black A, Nagy P. A framework for understanding an open scientific community using automated harvesting of public artifacts. JAMIA Open 2024; 7:ooae017. [PMID: 38425704 PMCID: PMC10903973 DOI: 10.1093/jamiaopen/ooae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 09/14/2023] [Accepted: 02/15/2024] [Indexed: 03/02/2024] Open
Abstract
Background: The Observational Health Data Sciences and Informatics (OHDSI) community has emerged as a leader in observational research on real-world clinical data for promoting evidence for healthcare and decision-making. The community has seen rapid growth in publications, citations, and the number of authors. Components of its successful uptake have been attributed to an open science and collaborative culture for research and development. Investigating the adoption of OHDSI as a field of study provides an opportunity to understand how communities embrace new ideas, onboard new members, and enhance their impact. Objective: To track, study, and evaluate an open scientific community's growth and impact. Method: We present a modern architecture leveraging open application programming interfaces to capture publicly available data (PubMed, YouTube, and EHDEN) on open science activities (publication, teaching, and engagement). Results: Three interactive dashboards were implemented, one for each publicly available artifact (PubMed, YouTube, and EHDEN). Each dashboard provides longitudinal summary analysis and has a searchable table, which differs in the available features related to each public artifact. Conclusion: We discuss the insights enabled by our approach to monitor the growth and impact of the OHDSI community by capturing artifacts of learning, teaching, and creation. We share the implications for different users based on their functional needs. As other scientific networks adopt open-source frameworks, our framework serves as a model for tracking the growth of their community, driving the perception of their development, engaging their members, and attaining higher impact.
Affiliation(s)
- Star Liu
  - Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, MD 21205, United States
- Asieh Golozar
  - OHDSI Center at the Roux Institute, Northeastern University, Boston, MA 04101, United States
  - Odysseus Data Services, Cambridge, MA 02142, United States
- Nathan Buesgens
  - Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, MD 21205, United States
- Jody-Ann McLeggon
  - Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Adam Black
  - Odysseus Data Services, Cambridge, MA 02142, United States
- Paul Nagy
  - Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, MD 21205, United States

24
Mitra S, Malik R, Wong W, Rahman A, Hartemink AJ, Pritykin Y, Dey KK, Leslie CS. Single-cell multi-ome regression models identify functional and disease-associated enhancers and enable chromatin potential analysis. Nat Genet 2024; 56:627-636. [PMID: 38514783 PMCID: PMC11018525 DOI: 10.1038/s41588-024-01689-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 02/14/2024] [Indexed: 03/23/2024]
Abstract
We present a gene-level regulatory model, single-cell ATAC + RNA linking (SCARlink), which predicts single-cell gene expression and links enhancers to target genes using multi-ome (scRNA-seq and scATAC-seq co-assay) sequencing data. The approach uses regularized Poisson regression on tile-level accessibility data to jointly model all regulatory effects at a gene locus, avoiding the limitations of pairwise gene-peak correlations and dependence on peak calling. SCARlink outperformed existing gene scoring methods for imputing gene expression from chromatin accessibility across high-coverage multi-ome datasets while giving comparable to improved performance on low-coverage datasets. Shapley value analysis on trained models identified cell-type-specific gene enhancers that are validated by promoter capture Hi-C and are 11× to 15× and 5× to 12× enriched in fine-mapped eQTLs and fine-mapped genome-wide association study (GWAS) variants, respectively. We further show that SCARlink-predicted and observed gene expression vectors provide a robust way to compute a chromatin potential vector field to enable developmental trajectory analysis.
Affiliation(s)
- Sneha Mitra
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
- Wilfred Wong
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
  - Tri-Institutional Training Program in Computational Biology and Medicine, New York City, NY, USA
- Afsana Rahman
  - Hunter College, City University of New York, New York City, NY, USA
- Alexander J Hartemink
  - Department of Computer Science, Duke University, Durham, NC, USA
  - Program in Computational Biology and Bioinformatics, Duke University, Durham, NC, USA
  - Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Yuri Pritykin
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Kushal K Dey
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
- Christina S Leslie
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York City, NY, USA

25
Dinnage R, Sarre SD, Duncan RP, Dickman CR, Edwards SV, Greenville AC, Wardle GM, Gruber B. slimr: An R package for tailor-made integrations of data in population genomic simulations over space and time. Mol Ecol Resour 2024; 24:e13916. [PMID: 38124500 DOI: 10.1111/1755-0998.13916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 11/20/2023] [Accepted: 11/30/2023] [Indexed: 12/23/2023]
Abstract
Software for realistically simulating complex population genomic processes is revolutionizing our understanding of evolutionary processes, and providing novel opportunities for integrating empirical data with simulations. However, the integration between standalone simulation software and R is currently not well developed. Here, we present slimr, an R package designed to create a seamless link between standalone software SLiM >3.0, one of the most powerful population genomic simulation frameworks, and the R development environment, with its powerful data manipulation and analysis tools. We show how slimr facilitates smooth integration between genetic data, ecological data and simulation in a single environment. The package enables pipelines that begin with data reading, cleaning and manipulation, proceed to constructing empirically based parameters and initial conditions for simulations, then to running numerical simulations and finally to retrieving simulation results in a format suitable for comparisons with empirical data - aided by advanced analysis and visualization tools provided by R. We demonstrate the use of slimr with an example from our own work on the landscape population genomics of desert mammals, highlighting the advantage of having a single integrated tool for both data analysis and simulation. slimr makes the powerful simulation ability of SLiM directly accessible to R users, allowing integrated simulation projects that incorporate empirical data without the need to switch between software environments. This should provide more opportunities for evolutionary biologists and ecologists to use realistic simulations to better understand the interplay between ecological and evolutionary processes.
Affiliation(s)
- Russell Dinnage
  - Institute of Environment, Department of Biological Sciences, Florida International University, Miami, Florida, USA
  - Centre for Conservation Ecology and Genomics, Institute for Applied Ecology, University of Canberra, Canberra, Australian Capital Territory, Australia
- Stephen D Sarre
  - Centre for Conservation Ecology and Genomics, Institute for Applied Ecology, University of Canberra, Canberra, Australian Capital Territory, Australia
- Richard P Duncan
  - Centre for Conservation Ecology and Genomics, Institute for Applied Ecology, University of Canberra, Canberra, Australian Capital Territory, Australia
- Christopher R Dickman
  - Desert Ecology Research Group, School of Life and Environmental Sciences, University of Sydney, Camperdown, New South Wales, Australia
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
  - Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, USA
- Aaron C Greenville
  - Desert Ecology Research Group, School of Life and Environmental Sciences, University of Sydney, Camperdown, New South Wales, Australia
- Glenda M Wardle
  - Desert Ecology Research Group, School of Life and Environmental Sciences, University of Sydney, Camperdown, New South Wales, Australia
- Bernd Gruber
  - Centre for Conservation Ecology and Genomics, Institute for Applied Ecology, University of Canberra, Canberra, Australian Capital Territory, Australia

26
Gomez-Zepeda D, Michna T, Ziesmann T, Distler U, Tenzer S. HowDirty: An R package to evaluate molecular contaminants in LC-MS experiments. Proteomics 2024; 24:e2300134. [PMID: 37679057 DOI: 10.1002/pmic.202300134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 08/22/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023]
Abstract
Contaminants derived from consumables, reagents, and sample handling often negatively affect LC-MS data acquisition. In proteomics experiments, they can markedly reduce identification performance, reproducibility, and quantitative robustness. Here, we introduce a data analysis workflow combining MS1 feature extraction in Skyline with HowDirty, an R-markdown-based tool, that automatically generates an interactive report on the molecular contaminant level in LC-MS data sets. To facilitate the interpretation of the results, the HTML report is self-contained and self-explanatory, including plots that can be easily interpreted. The R package HowDirty is available from https://github.com/DavidGZ1/HowDirty. To demonstrate a showcase scenario for the application of HowDirty, we assessed the impact of ultrafiltration units from different providers on sample purity after filter-assisted sample preparation (FASP) digestion. This allowed us to select the filter units with the lowest contamination risk. Notably, the filter units with the lowest contaminant levels showed higher reproducibility regarding the number of peptides and proteins identified. Overall, HowDirty enables the efficient evaluation of sample quality covering a wide range of common contaminant groups that typically impair LC-MS analyses, facilitating corrective or preventive actions to minimize instrument downtime.
Affiliation(s)
- David Gomez-Zepeda
  - Helmholtz-Institute for Translational Oncology Mainz (HI-TRON), Mainz, Rheinland-Pfalz, Germany
  - German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Rheinland-Pfalz, Germany
- Thomas Michna
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Rheinland-Pfalz, Germany
- Tanja Ziesmann
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Rheinland-Pfalz, Germany
- Ute Distler
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Rheinland-Pfalz, Germany
  - Research Center for Immunotherapy (FZI), University Medical Center of the Johannes-Gutenberg University, Mainz, Rheinland-Pfalz, Germany
- Stefan Tenzer
  - Helmholtz-Institute for Translational Oncology Mainz (HI-TRON), Mainz, Rheinland-Pfalz, Germany
  - German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Rheinland-Pfalz, Germany
  - Research Center for Immunotherapy (FZI), University Medical Center of the Johannes-Gutenberg University, Mainz, Rheinland-Pfalz, Germany

27
Nekorchuk DM, Bharadwaja A, Simonson S, Ortega E, França CMB, Dinh E, Reik R, Burkholder R, Wimberly MC. The Arbovirus Mapping and Prediction (ArboMAP) system for West Nile virus forecasting. JAMIA Open 2024; 7:ooad110. [PMID: 38186743 PMCID: PMC10766066 DOI: 10.1093/jamiaopen/ooad110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 12/04/2023] [Accepted: 12/20/2023] [Indexed: 01/09/2024] Open
Abstract
Objectives: West Nile virus (WNV) is the most common mosquito-borne disease in the United States. Predicting the location and timing of outbreaks would allow targeting of disease prevention and mosquito control activities. Our objective was to develop software (ArboMAP) for routine WNV forecasting using public health surveillance data and meteorological observations. Materials and Methods: ArboMAP was implemented using an R markdown script for data processing, modeling, and report generation. A Google Earth Engine application was developed to summarize and download weather data. Generalized additive models were used to make county-level predictions of WNV cases. Results: ArboMAP minimized the number of manual steps required to make weekly forecasts, generated information that was useful for decision-makers, and has been tested and implemented in multiple public health institutions. Discussion and Conclusion: Routine prediction of mosquito-borne disease risk is feasible and can be implemented by public health departments using ArboMAP.
Affiliation(s)
- Dawn M Nekorchuk
  - Department of Geography and Environmental Sustainability, University of Oklahoma, Norman, OK 73019, United States
- Anita Bharadwaja
  - South Dakota Department of Health, Pierre, SD 57501, United States
- Sean Simonson
  - Louisiana Department of Health, New Orleans, LA 70112, United States
- Emma Ortega
  - Louisiana Department of Health, New Orleans, LA 70112, United States
- Caio M B França
  - Department of Biology, Southern Nazarene University, Bethany, OK 73008, United States
  - Quetzal Education and Research Center, Southern Nazarene University, San Gerardo de Dota, 11911, Costa Rica
- Emily Dinh
  - Michigan Department of Health and Human Services, Lansing, MI 48909, United States
- Rebecca Reik
  - Michigan Department of Health and Human Services, Lansing, MI 48909, United States
- Rachel Burkholder
  - Michigan Department of Health and Human Services, Lansing, MI 48909, United States
- Michael C Wimberly
  - Department of Geography and Environmental Sustainability, University of Oklahoma, Norman, OK 73019, United States

28
Weberpals J, Raman SR, Shaw PA, Lee H, Hammill BG, Toh S, Connolly JG, Dandreo KJ, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ. smdi: an R package to perform structural missing data investigations on partially observed confounders in real-world evidence studies. JAMIA Open 2024; 7:ooae008. [PMID: 38304248 PMCID: PMC10833461 DOI: 10.1093/jamiaopen/ooae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/09/2024] [Accepted: 01/16/2024] [Indexed: 02/03/2024] Open
Abstract
Objectives: Partially observed confounder data pose a major challenge in statistical analyses aimed to inform causal inference using electronic health records (EHRs). While analytic approaches such as imputation are available, assumptions on underlying missingness patterns and mechanisms must be verified. We aimed to develop a toolkit to streamline missing data diagnostics to guide choice of analytic approaches based on meeting necessary assumptions. Materials and methods: We developed the smdi (structural missing data investigations) R package based on results of a previous simulation study which considered structural assumptions of common missing data mechanisms in EHR. Results: smdi enables users to run principled missing data investigations on partially observed confounders and implement functions to visualize, describe, and infer potential missingness patterns and mechanisms based on observed data. Conclusions: The smdi R package is freely available on CRAN and can provide valuable insights into underlying missingness patterns and mechanisms and thereby help improve the robustness of real-world evidence studies.
Affiliation(s)
- Janick Weberpals
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02120, United States
- Sudha R Raman
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, NC 27701, United States
- Pamela A Shaw
  - Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States
- Hana Lee
  - Office of Biostatistics, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Bradley G Hammill
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, NC 27701, United States
- Sengwee Toh
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, United States
- John G Connolly
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, United States
- Kimberly J Dandreo
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, United States
- Fang Tian
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Wei Liu
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Jie Li
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- José J Hernández-Muñoz
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Robert J Glynn
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02120, United States
- Rishi J Desai
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02120, United States

29
King VL, Siegel G, Priesmeyer HR, Siegel LH, Potter JS. Development and Evaluation of a Digital App for Patient Self-Management of Opioid Use Disorder: Usability, Acceptability, and Utility Study. JMIR Form Res 2024; 8:e48068. [PMID: 38557501 PMCID: PMC11019416 DOI: 10.2196/48068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 12/07/2023] [Accepted: 01/11/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND: Self-management of opioid use disorder (OUD) is an important component of treatment. Many patients receiving opioid agonist treatment in methadone maintenance treatment settings benefit from counseling treatments to help them improve their recovery skills but have insufficient access to these treatments between clinic appointments. In addition, many addiction medicine clinicians treating patients with OUD in a general medical clinic setting do not have consistent access to counseling referrals for their patients. This can lead to decreases in both treatment retention and overall progress in the patient's recovery from substance misuse. Digital apps may help to bridge this gap by coaching, supporting, and reinforcing behavioral change that is initiated and directed by their psychosocial and medical providers. OBJECTIVE: This study aimed to conduct an acceptability, usability, and utility pilot study of the KIOS app to address these clinical needs. METHODS: We developed a unique, patient-centered computational software system (KIOS; Biomedical Development Corporation) to assist in managing OUD in an outpatient, methadone maintenance clinic setting. KIOS tracks interacting self-reported symptoms (craving, depressed mood, anxiety, irritability, pain, agitation or restlessness, difficulty sleeping, absenteeism, difficulty with usual activities, and conflicts with others) to determine changes in both the trajectory and severity of symptom patterns over time. KIOS then applies a proprietary algorithm to assess the individual's patterns of symptom interaction in accordance with models previously established by OUD experts. After this analysis, KIOS provides specific behavioral advice addressing the individual's changing trajectory of symptoms to help the person self-manage their symptoms. The KIOS software also provides analytics on the self-reported data that can be used by patients, clinicians, and researchers to track outcomes. RESULTS: In a 4-week acceptability, usability (mean System Usability Scale-Modified score 89.5, SD 9.2, maximum of 10.0), and utility (mean KIOS utility questionnaire score 6.32, SD 0.25, maximum of 7.0) pilot study of 15 methadone-maintained participants with OUD, user experience, usability, and software-generated advice received high and positive assessment scores. The KIOS clinical variables closely correlated with craving self-report measures. Therefore, managing these variables with advice generated by the KIOS software could have an impact on craving and ultimately substance use. CONCLUSIONS: KIOS tracks key clinical variables and generates advice specifically relevant to the patient's current and changing clinical state. Patients in this pilot study assigned high positive values to the KIOS user experience, ease of use, and the appropriateness, relevance, and usefulness of the specific behavioral guidance they received to match their evolving experiences. KIOS may therefore be useful to augment in-person treatment of opioid agonist patients and help fill treatment gaps that currently exist in the continuum of care. A National Institute on Drug Abuse-funded randomized controlled trial of KIOS to augment in-person treatment of patients with OUD is currently being conducted.
Affiliation(s)
- Van Lewis King
  - Department of Psychiatry and Behavioral Sciences, University of Texas Health Science Center San Antonio, San Antonio, TX, United States
- Gregg Siegel
  - Biomedical Development Corporation, San Antonio, TX, United States
- Leslie H Siegel
  - Biomedical Development Corporation, San Antonio, TX, United States
- Jennifer S Potter
  - Department of Psychiatry and Behavioral Sciences, University of Texas Health Science Center San Antonio, San Antonio, TX, United States

30
Pan L, Lu J, Tang X. Spatial-temporal graph neural ODE networks for skeleton-based action recognition. Sci Rep 2024; 14:7629. [PMID: 38561396 PMCID: PMC10984971 DOI: 10.1038/s41598-024-58190-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 03/26/2024] [Indexed: 04/04/2024] Open
Abstract
In the field of skeleton-based action recognition, accurately recognizing human actions is crucial for applications such as virtual reality and motion analysis. However, this task faces challenges such as intraindividual action differences and long-term temporal dependencies. To address these challenges, we propose an innovative model called spatial-temporal graph neural ordinary differential equations (STG-NODE). First, in the data preprocessing stage, the dynamic time warping (DTW) algorithm is used to normalize and calculate 3D skeleton data to facilitate the derivation of customized adjacency matrices that better account for intraindividual action differences. Second, a custom ordinary differential equation (ODE) integrator is applied based on the initial conditions of the temporal features, producing a solution function that simulates the dynamic evolution trend of the events of interest. Finally, an ODE solver is used to numerically solve the time features based on the solution function to increase the influence of long-term dependencies on the recognition accuracy of the model and provide it with a more powerful temporal modeling ability. Through extensive experiments conducted on the NTU RGB+D 60 and Kinetics Skeleton 400 benchmark datasets, we demonstrate the superior performance of STG-NODE in the action recognition domain. The success of the STG-NODE model also provides new ideas and methods for the future development of the action recognition field.
Affiliation(s)
- Longji Pan
  - Guizhou University, State Key Laboratory of Public Big Data, Guiyang, 550025, China
- Jianguang Lu
  - Guizhou University, State Key Laboratory of Public Big Data, Guiyang, 550025, China
- Xianghong Tang
  - Guizhou University, State Key Laboratory of Public Big Data, Guiyang, 550025, China

31
Frommelt F, Fossati A, Uliana F, Wendt F, Xue P, Heusel M, Wollscheid B, Aebersold R, Ciuffa R, Gstaiger M. DIP-MS: ultra-deep interaction proteomics for the deconvolution of protein complexes. Nat Methods 2024; 21:635-647. [PMID: 38532014 PMCID: PMC11009110 DOI: 10.1038/s41592-024-02211-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 02/14/2024] [Indexed: 03/28/2024]
Abstract
Most proteins are organized in macromolecular assemblies, which represent key functional units regulating and catalyzing most cellular processes. Affinity purification of the protein of interest combined with liquid chromatography coupled to tandem mass spectrometry (AP-MS) represents the method of choice to identify interacting proteins. However, the composition of complex isoforms concurrently present in the AP sample cannot be resolved from a single AP-MS experiment; it requires computational inference from multiple time- and resource-intensive reciprocal AP-MS experiments. Here we introduce deep interactome profiling by mass spectrometry (DIP-MS), which combines AP with blue-native-PAGE separation, data-independent acquisition with mass spectrometry and deep-learning-based signal processing to resolve complex isoforms sharing the same bait protein in a single experiment. We applied DIP-MS to probe the organization of the human prefoldin family of complexes, resolving distinct prefoldin holo- and subcomplex variants, complex-complex interactions and complex isoforms with new subunits that were experimentally validated. Our results demonstrate that DIP-MS can reveal proteome modularity at unprecedented depth and resolution.
Affiliation(s)
- Fabian Frommelt
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
| | - Andrea Fossati
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
- J. David Gladstone Institutes, San Francisco, CA, USA
| | - Federico Uliana
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Department of Biology, Institute of Biochemistry, ETH Zurich, Zurich, Switzerland
| | - Fabian Wendt
- Department of Health Sciences and Technology (D-HEST), Institute of Translational Medicine (ITM), ETH Zurich, Zurich, Switzerland
| | - Peng Xue
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Guangzhou National Laboratory, Guang Zhou, China
| | - Moritz Heusel
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Bernd Wollscheid
- Department of Health Sciences and Technology (D-HEST), Institute of Translational Medicine (ITM), ETH Zurich, Zurich, Switzerland
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Rodolfo Ciuffa
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Matthias Gstaiger
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
| |
|
32
|
Liu W, Wang Z, You R, Xie C, Wei H, Xiong Y, Yang J, Zhu S. PLMSearch: Protein language model powers accurate and fast sequence search for remote homology. Nat Commun 2024; 15:2775. [PMID: 38555371 PMCID: PMC10981738 DOI: 10.1038/s41467-024-46808-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 03/08/2024] [Indexed: 04/02/2024] Open
Abstract
Homologous protein search is one of the most commonly used methods for protein annotation and analysis. Compared to structure search, detecting distant evolutionary relationships from sequences alone remains challenging. Here we propose PLMSearch (Protein Language Model), a homologous protein search method with only sequences as input. PLMSearch uses deep representations from a pre-trained protein language model and trains the similarity prediction model on a large number of real structure similarities. This enables PLMSearch to capture the remote homology information concealed behind the sequences. Extensive experimental results show that PLMSearch can search millions of query-target protein pairs in seconds like MMseqs2 while increasing the sensitivity by more than threefold, and is comparable to state-of-the-art structure search methods. In particular, unlike traditional sequence search methods, PLMSearch can recall most remote homology pairs with dissimilar sequences but similar structures. PLMSearch is freely available at https://dmiip.sjtu.edu.cn/PLMSearch.
Affiliation(s)
- Wei Liu
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China
| | - Ziye Wang
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China
| | - Ronghui You
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China
| | - Chenghan Xie
- School of Mathematical Sciences, Fudan University, 200433, Shanghai, China
| | - Hong Wei
- School of Mathematical Sciences, Nankai University, 300071, Tianjin, China
| | - Yi Xiong
- Department of Bioinformatics and Biostatistics, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Jianyi Yang
- Ministry of Education Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Science, Shandong University, 266237, Qingdao, China.
| | - Shanfeng Zhu
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China.
- Shanghai Qi Zhi Institute, Shanghai, China.
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China.
- Shanghai Key Lab of Intelligent Information Processing and Shanghai Institute of Artificial Intelligence Algorithm, Fudan University, Shanghai, China.
- Zhangjiang Fudan International Innovation Center, Shanghai, China.
| |
|
33
|
Liu Z, Lin J, Xie K, Sha J, Chen X, Lei W, Huang L, Yan Y. [Development and Application of Deep Learning-Based Model for Quality Control of Children Pelvic X-Ray Images]. Zhongguo Yi Liao Qi Xie Za Zhi 2024; 48:144-149. [PMID: 38605612 DOI: 10.12455/j.issn.1671-7104.240010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2024]
Abstract
Objective: A deep learning-based method for evaluating the quality of pediatric pelvic X-ray images is proposed to construct a diagnostic model and verify its clinical feasibility. Methods: Anteroposterior pelvic radiographs of 3,247 children are retrospectively collected and randomly divided into training, validation and test datasets. An artificial intelligence model is used to evaluate the reliability of the quality control model. Results: The diagnostic accuracy, area under the ROC curve, sensitivity and specificity of the model are 99.4%, 0.993, 98.6% and 100.0%, respectively. The 95% consistency limits of the pelvic tilt index of the model are -0.052 to 0.072. The 95% consistency limits of the pelvic rotation index are -0.088 to 0.055. Conclusion: This is the first attempt to apply an AI algorithm to the quality assessment of children's pelvic radiographs, and it has significantly improved the diagnosis and treatment of developmental dysplasia of the hip (DDH) in children.
Affiliation(s)
- Zhichen Liu
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| | - Jincong Lin
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| | - Kunjie Xie
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| | - Jia Sha
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| | - Xu Chen
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| | - Wei Lei
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| | - Luyu Huang
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| | - Yabo Yan
- Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi'an, 710032
| |
|
34
|
Aleksander SA, Anagnostopoulos AV, Antonazzo G, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Cherry JM, Cho J, Crosby MA, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, dos Santos G, Dyer S, Ebert D, Engel SR, Fashena D, Fisher M, Foley S, Gibson AC, Gollapally VR, Gramates LS, Grove CA, Hale P, Harris T, Hayman GT, Hu Y, James-Zorn C, Karimi K, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, Markarian N, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nash RS, Nuin P, Paddock H, Pells T, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schindelman G, Shaw DR, Sherlock G, Shrivatsav A, Singer A, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Tomczuk M, Trovisco V, Tutaj MA, Urbano JM, Van Auken K, Van Slyke CE, Vize PD, Wang Q, Weng S, Westerfield M, Wilming LG, Wong ED, Wright A, Yook K, Zhou P, Zorn A, Zytkovicz M. Updates to the Alliance of Genome Resources Central Infrastructure. Genetics 2024:iyae049. [PMID: 38552170 DOI: 10.1093/genetics/iyae049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/28/2024] [Accepted: 02/29/2024] [Indexed: 04/09/2024] Open
Abstract
The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively-studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organisms currently represented in the Alliance are budding yeast, C. elegans, Drosophila, zebrafish, frog, laboratory mouse, laboratory rat, and the Gene Ontology Consortium. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and Application Programming Interfaces (APIs). Here we focus on developments over the last two years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse software. We describe our progress towards a central persistent database to support curation, the data modeling that underpins harmonization, and progress towards a state-of-the-art literature curation system with integrated Artificial Intelligence and Machine Learning (AI/ML).
Affiliation(s)
| | | | | | - Giulia Antonazzo
- Department of Physiology, Development and Neuroscience, University of Cambridge , Downing Street, Cambridge CB2 3DY , UK
| | - Valerio Arnaboldi
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Helen Attrill
- Department of Physiology, Development and Neuroscience, University of Cambridge , Downing Street, Cambridge CB2 3DY , UK
| | - Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus , Hinxton, Cambridge CB10 1SD , UK
| | - Susan M Bello
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Olin Blodgett
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | | | - Carol J Bult
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research , Toronto, ON M5G0A3 , Canada
| | - Brian R Calvi
- Department of Biology, Indiana University , Bloomington, IN 47408 , USA
| | - Seth Carbon
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory , Berkeley, CA
| | - Juancarlos Chan
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Wen J Chen
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - J Michael Cherry
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - Jaehyoung Cho
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Madeline A Crosby
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Jeffrey L De Pons
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | | | - Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus , Hinxton, Cambridge CB10 1SD , UK
| | - Mary E Dolan
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Gilberto dos Santos
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus , Hinxton, Cambridge CB10 1SD , UK
| | - Dustin Ebert
- Department of Population and Public Health Sciences, University of Southern California , Los Angeles, CA 90033 , USA
| | - Stacia R Engel
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - David Fashena
- Institute of Neuroscience, University of Oregon , Eugene, OR 97403
| | - Malcolm Fisher
- Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center , 3333 Burnet Ave, Cincinnati, OH 45229 , USA
| | - Saoirse Foley
- Department of Biological Sciences, Carnegie Mellon University , 5000 Forbes Ave, Pittsburgh, PA 15203
| | - Adam C Gibson
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Varun R Gollapally
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - L Sian Gramates
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Christian A Grove
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Paul Hale
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Todd Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research , Toronto, ON M5G0A3 , Canada
| | - G Thomas Hayman
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Yanhui Hu
- Department of Genetics, Howard Hughes Medical Institute, Harvard Medical School , 77 Avenue Louis Pasteur, Boston, MA 02115 , USA
| | - Christina James-Zorn
- Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center , 3333 Burnet Ave, Cincinnati, OH 45229 , USA
| | - Kamran Karimi
- Department of Biological Sciences, University of Calgary , 507 Campus Dr NW, Calgary, AB T2N 4V8 , Canada
| | - Kalpana Karra
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - Ranjana Kishore
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Anne E Kwitek
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Stanley J F Laulederkind
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Raymond Lee
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Ian Longden
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus , Hinxton, Cambridge CB10 1SD , UK
| | - Nicholas Markarian
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Steven J Marygold
- Department of Physiology, Development and Neuroscience, University of Cambridge , Downing Street, Cambridge CB2 3DY , UK
| | - Beverley Matthews
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Monica S McAndrews
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Gillian Millburn
- Department of Physiology, Development and Neuroscience, University of Cambridge , Downing Street, Cambridge CB2 3DY , UK
| | - Stuart Miyasato
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - Howie Motenko
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Sierra Moxon
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory , Berkeley, CA
| | - Hans-Michael Muller
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Christopher J Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory , Berkeley, CA
| | - Anushya Muruganujan
- Department of Population and Public Health Sciences, University of Southern California , Los Angeles, CA 90033 , USA
| | - Tremayne Mushayahama
- Department of Population and Public Health Sciences, University of Southern California , Los Angeles, CA 90033 , USA
| | - Robert S Nash
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research , Toronto, ON M5G0A3 , Canada
| | - Holly Paddock
- Institute of Neuroscience, University of Oregon , Eugene, OR 97403
| | - Troy Pells
- Department of Biological Sciences, University of Calgary , 507 Campus Dr NW, Calgary, AB T2N 4V8 , Canada
| | - Norbert Perrimon
- Department of Genetics, Howard Hughes Medical Institute, Harvard Medical School , 77 Avenue Louis Pasteur, Boston, MA 02115 , USA
| | - Christian Pich
- Institute of Neuroscience, University of Oregon , Eugene, OR 97403
| | - Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus , Hinxton, Cambridge CB10 1SD , UK
| | - Daniela Raciti
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | | | | | - Susan Russo Gelbart
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Leyla Ruzicka
- Institute of Neuroscience, University of Oregon , Eugene, OR 97403
| | - Gary Schindelman
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - David R Shaw
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Gavin Sherlock
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - Ajay Shrivatsav
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - Amy Singer
- Institute of Neuroscience, University of Oregon , Eugene, OR 97403
| | - Constance M Smith
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Cynthia L Smith
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Jennifer R Smith
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research , Toronto, ON M5G0A3 , Canada
| | - Paul W Sternberg
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Christopher J Tabone
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California , Los Angeles, CA 90033 , USA
| | - Ketaki Thorat
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Jyothi Thota
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Monika Tomczuk
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Vitor Trovisco
- Department of Physiology, Development and Neuroscience, University of Cambridge , Downing Street, Cambridge CB2 3DY , UK
| | - Marek A Tutaj
- Medical College of Wisconsin - Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin , Milwaukee, WI 53226 , USA
| | - Jose-Maria Urbano
- Department of Physiology, Development and Neuroscience, University of Cambridge , Downing Street, Cambridge CB2 3DY , UK
| | - Kimberly Van Auken
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Ceri E Van Slyke
- Institute of Neuroscience, University of Oregon , Eugene, OR 97403
| | - Peter D Vize
- Department of Biological Sciences, University of Calgary , 507 Campus Dr NW, Calgary, AB T2N 4V8 , Canada
| | - Qinghua Wang
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Shuai Weng
- Department of Genetics, Stanford University , Stanford, CA 94305
| | | | - Laurens G Wilming
- The Jackson Laboratory for Mammalian Genomics , Bar Harbor, ME, 04609 , USA
| | - Edith D Wong
- Department of Genetics, Stanford University , Stanford, CA 94305
| | - Adam Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research , Toronto, ON M5G0A3 , Canada
| | - Karen Yook
- Division of Biology and Biological Engineering 140-18, California Institute of Technology , Pasadena, CA 91125 , USA
| | - Pinglei Zhou
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| | - Aaron Zorn
- Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center , 3333 Burnet Ave, Cincinnati, OH 45229 , USA
| | - Mark Zytkovicz
- The Biological Laboratories, Harvard University , 16 Divinity Avenue, Cambridge, MA 02138 , USA
| |
|
35
|
Ono R, Katsumata A, Fujikawa Y, Takahira E, Yamamoto T, Kanamura N. Sex differences and age-related changes in the mandibular alveolar bone mineral density using a computer-aided measurement system for intraoral radiography. Sci Rep 2024; 14:7386. [PMID: 38548856 PMCID: PMC10979020 DOI: 10.1038/s41598-024-57805-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 03/21/2024] [Indexed: 04/01/2024] Open
Abstract
This study aimed to conduct a cross-sectional data analysis of the alveolar bone mineral density (al-BMD) in 225 patients of various ages and different sexes. The al-BMD value in the mandibular incisor region was calculated using a computer-aided measurement system (DentalSCOPE) for intraoral radiography. All participants with intact teeth (101 males and 124 females; age range, 25-89 years) were divided into three age-segregated groups (25-49, 50-74, and > 75 years). Statistical differences were evaluated using the Mann-Whitney U or Kruskal-Wallis test. Males exhibited significantly greater al-BMD than females (p < 0.001). The highest means were observed in the 25-49 age group, regardless of sex (1007.90 mg/cm² in males, 910.90 mg/cm² in females). A 9.8% decrease in al-BMD was observed with the increase in age in males (25-49 to 50-74 years; p = 0.004); however, no further changes were seen thereafter. In females, a decreasing trend was seen throughout the lifespan, with values reaching up to 76.0% of the initial peak value (p < 0.001). Similar to other skeletal sites, the alveolar bone exhibits sex differences and undergoes a reduction in BMD via the normal aging process.
Affiliation(s)
- Ryutaro Ono
- Department of Dental Medicine, Graduate School of Medicine, Kyoto Prefectural University of Medicine, 465 Kajii-cho, Kawaramachi-Hirokoji, Kamigyo-ku, Kyoto, 602-8566, Japan.
- Department of Oral and Maxillofacial Surgery, North Medical Center, Kyoto Prefectural University of Medicine, Kyoto, Japan.
| | - Akitoshi Katsumata
- Department of Oral Radiology, Asahi University School of Dentistry, Gifu, Japan
| | - Yumi Fujikawa
- Department of Dental Medicine, Graduate School of Medicine, Kyoto Prefectural University of Medicine, 465 Kajii-cho, Kawaramachi-Hirokoji, Kamigyo-ku, Kyoto, 602-8566, Japan
- Department of Oral and Maxillofacial Surgery, North Medical Center, Kyoto Prefectural University of Medicine, Kyoto, Japan
| | - Emi Takahira
- Department of Dental Medicine, Graduate School of Medicine, Kyoto Prefectural University of Medicine, 465 Kajii-cho, Kawaramachi-Hirokoji, Kamigyo-ku, Kyoto, 602-8566, Japan
- Department of Oral and Maxillofacial Surgery, North Medical Center, Kyoto Prefectural University of Medicine, Kyoto, Japan
| | - Toshiro Yamamoto
- Department of Dental Medicine, Graduate School of Medicine, Kyoto Prefectural University of Medicine, 465 Kajii-cho, Kawaramachi-Hirokoji, Kamigyo-ku, Kyoto, 602-8566, Japan
| | - Narisato Kanamura
- Department of Dental Medicine, Graduate School of Medicine, Kyoto Prefectural University of Medicine, 465 Kajii-cho, Kawaramachi-Hirokoji, Kamigyo-ku, Kyoto, 602-8566, Japan
| |
|
36
|
Wang C, Østergaard L, Hasselholt S, Sporring J. A semi-automatic method for extracting mitochondrial cristae characteristics from 3D focused ion beam scanning electron microscopy data. Commun Biol 2024; 7:377. [PMID: 38548849 PMCID: PMC10978844 DOI: 10.1038/s42003-024-06045-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 03/11/2024] [Indexed: 04/01/2024] Open
Abstract
Mitochondria are the main suppliers of energy for cells and their bioenergetic function is regulated by mitochondrial dynamics: the constant changes in mitochondria size, shape, and cristae morphology to secure cell homeostasis. Although changes in mitochondrial function are implicated in a wide range of diseases, our understanding is challenged by a lack of reliable ways to extract spatial features from the cristae, the detailed visualization of which requires electron microscopy (EM). Here, we present a semi-automatic method for the segmentation, 3D reconstruction, and shape analysis of mitochondria, cristae, and intracristal spaces based on 2D EM images of the murine hippocampus. We show that our method provides a more accurate characterization of mitochondrial ultrastructure in 3D than common 2D approaches and propose an operational index of mitochondria's internal organization. With an improved consistency of 3D shape analysis and a decrease in the workload needed for large-scale analysis, we speculate that this tool will help increase our understanding of mitochondrial dynamics in health and disease.
Affiliation(s)
- Chenhao Wang
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark.
- Center for Quantification of Imaging Data from MAX IV, Copenhagen, Denmark.
| | - Leif Østergaard
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Center of Functionally Integrative Neuroscience, Aarhus, Denmark
| | - Stine Hasselholt
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Center of Functionally Integrative Neuroscience, Aarhus, Denmark
| | - Jon Sporring
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark.
- Center for Quantification of Imaging Data from MAX IV, Copenhagen, Denmark.
| |
|
37
|
Matson Z, Cooley G, Parameswaran N, Simon A, Bankamp B, Coughlin MM. shinyMBA: a novel R shiny application for quality control of the multiplex bead assay for serosurveillance studies. Sci Rep 2024; 14:7442. [PMID: 38548772 PMCID: PMC10978933 DOI: 10.1038/s41598-024-57652-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 03/20/2024] [Indexed: 04/01/2024] Open
Abstract
The multiplex bead assay (MBA) based on Luminex xMAP technology can be used as a tool to measure seroprevalence as part of population immunity evaluations to multiple antigens in large-scale serosurveys. However, multiplexing several antigens presents challenges for quality control (QC) assessments of the data because multiple parameters must be evaluated for each antigen. MBA QC parameters include monitoring bead counts and median fluorescence intensity (MFI) for each antigen in plate wells, and performance of assay controls included on each plate. Analyzing these large datasets to identify plates failing QC standards presents challenges for many laboratories. We developed a novel R Shiny application, shinyMBA, to expedite the MBA QC processes and reduce the risk of user error. The app allows users to rapidly merge multi-plate assay outputs to evaluate bead count, MFI, and performance of assay controls using statistical process control charts for all antigen targets simultaneously. The utility of the shinyMBA application and its various outputs are demonstrated using data from 32 synthetic xPONENT files with 3 multiplex antigens and two population serosurveillance studies that evaluated 1200 and 3871 samples, respectively, for 20 multiplexed antigens. The shinyMBA open-source code is available for download and modification at https://github.com/CDCgov/shinyMBA. Incorporation of shinyMBA into Luminex serosurveillance workflows can vastly improve the speed and accuracy of QC processes.
Affiliation(s)
- Zachary Matson
- Viral Vaccine Preventable Diseases Branch, Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA.
| | - Gretchen Cooley
- Division of Parasitic Diseases and Malaria, Center for Global Health, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Nishanth Parameswaran
- Division of Parasitic Diseases and Malaria, Center for Global Health, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Ashley Simon
- Division of Parasitic Diseases and Malaria, Center for Global Health, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Bettina Bankamp
- Viral Vaccine Preventable Diseases Branch, Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Melissa M Coughlin
- Laboratory Branch, Coronavirus and Other Respiratory Viruses Division, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
| |
Collapse
|
38
|
Vanea C, Džigurski J, Rukins V, Dodi O, Siigur S, Salumäe L, Meir K, Parks WT, Hochner-Celnikier D, Fraser A, Hochner H, Laisk T, Ernst LM, Lindgren CM, Nellåker C. Mapping cell-to-tissue graphs across human placenta histology whole slide images using deep learning with HAPPY. Nat Commun 2024; 15:2710. [PMID: 38548713 PMCID: PMC10978962 DOI: 10.1038/s41467-024-46986-2] [Received: 11/22/2022] [Accepted: 03/15/2024] [Indexed: 04/01/2024] Open
Abstract
Accurate placenta pathology assessment is essential for managing maternal and newborn health, but the placenta's heterogeneity and temporal variability pose challenges for histology analysis. To address this issue, we developed the 'Histology Analysis Pipeline.PY' (HAPPY), a deep learning hierarchical method for quantifying the variability of cells and micro-anatomical tissue structures across placenta histology whole slide images. HAPPY differs from patch-based features or segmentation approaches by following an interpretable biological hierarchy, representing cells and cellular communities within tissues at a single-cell resolution across whole slide images. We present a set of quantitative metrics from healthy term placentas as a baseline for future assessments of placenta health and we show how these metrics deviate in placentas with clinically significant placental infarction. HAPPY's cell and tissue predictions closely replicate those from independent clinical experts and placental biology literature.
Affiliation(s)
- Claudia Vanea
- Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, UK.
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.
- Omri Dodi
- Faculty of Medicine, Hadassah Hebrew University Medical Center, Jerusalem, Israel
- Siim Siigur
- Department of Pathology, Tartu University Hospital, Tartu, Estonia
- Liis Salumäe
- Department of Pathology, Tartu University Hospital, Tartu, Estonia
- Karen Meir
- Department of Pathology, Hadassah Hebrew University Medical Center, Jerusalem, Israel
- W Tony Parks
- Department of Laboratory Medicine & Pathobiology, University of Toronto, Toronto, Canada
- Abigail Fraser
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, UK
- Hagit Hochner
- Braun School of Public Health, Hebrew University of Jerusalem, Jerusalem, Israel
- Triin Laisk
- Institute of Genomics, University of Tartu, Tartu, Estonia
- Linda M Ernst
- Department of Pathology and Laboratory Medicine, NorthShore University HealthSystem, Chicago, USA
- Department of Pathology, University of Chicago Pritzker School of Medicine, Chicago, USA
- Cecilia M Lindgren
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Centre for Human Genetics, Nuffield Department, University of Oxford, Oxford, UK
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Christoffer Nellåker
- Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, UK.
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.
39
Liu YH, Luo C, Golding SG, Ioffe JB, Zhou XM. Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data. Nat Commun 2024; 15:2447. [PMID: 38503752 PMCID: PMC10951360 DOI: 10.1038/s41467-024-46614-z] [Received: 10/17/2022] [Accepted: 03/04/2024] [Indexed: 03/21/2024] Open
Abstract
Long-read sequencing offers long contiguous DNA fragments, facilitating diploid genome assembly and structural variant (SV) detection. Efficient and robust algorithms for SV identification are crucial with increasing data availability. Alignment-based methods, favored for their computational efficiency and lower coverage requirements, are prominent. Alternative approaches, relying solely on available reads for de novo genome assembly and employing assembly-based tools for SV detection via comparison to a reference genome, demand significantly more computational resources. However, the lack of comprehensive benchmarking constrains our comprehension and hampers further algorithm development. Here we systematically compare 14 read alignment-based SV calling methods (including 4 deep learning-based methods and 1 hybrid method), and 4 assembly-based SV calling methods, alongside 4 upstream aligners and 7 assemblers. Assembly-based tools excel in detecting large SVs, especially insertions, and exhibit robustness to evaluation parameter changes and coverage fluctuations. Conversely, alignment-based tools demonstrate superior genotyping accuracy at low sequencing coverage (5-10×) and excel in detecting complex SVs, like translocations, inversions, and duplications. Our evaluation provides performance insights, highlighting the absence of a universally superior tool. We furnish guidelines across 31 criteria combinations, aiding users in selecting the most suitable tools for diverse scenarios and offering directions for further method development.
Affiliation(s)
- Yichen Henry Liu
- Department of Computer Science, Vanderbilt University, 37235, Nashville, TN, USA
- Can Luo
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, TN, USA
- Staunton G Golding
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, TN, USA
- Jacob B Ioffe
- Department of Computer Science, Vanderbilt University, 37235, Nashville, TN, USA
- Xin Maizie Zhou
- Department of Computer Science, Vanderbilt University, 37235, Nashville, TN, USA.
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, TN, USA.
- Data Science Institute, Vanderbilt University, 37235, Nashville, TN, USA.
40
Cáceres Rivera DI, Rojas LMJ, Rojas LZ, Gomez DC, Castro Ruiz DA, López Romero LA. Using Principles of Digital Development for a Smartphone App to Support Data Collection in Patients With Acute Myocardial Infarction and Physical Activity Intolerance: Case Study. JMIR Form Res 2024; 8:e33868. [PMID: 38498019 PMCID: PMC10985596 DOI: 10.2196/33868] [Received: 02/10/2023] [Revised: 10/18/2023] [Accepted: 01/02/2024] [Indexed: 03/19/2024] Open
Abstract
BACKGROUND Advances in health have highlighted the need to implement technologies as a fundamental part of the diagnosis, treatment, and recovery of patients at risk of or with health alterations. For this purpose, digital platforms have demonstrated their applicability in the identification of care needs. Nursing is a fundamental component in the care of patients with cardiovascular disorders and plays a crucial role in diagnosing human responses to these health conditions. Consequently, the validation of nursing diagnoses through ongoing research processes has become a necessity that can significantly impact both patients and health care professionals.
OBJECTIVE We aimed to describe the process of developing a mobile app to validate the nursing diagnosis "intolerance to physical activity" in patients with acute myocardial infarction.
METHODS We describe the development and pilot-testing of a mobile system to support data collection for validating the nursing diagnosis of activity intolerance. This was a descriptive study conducted with 11 adults (aged ≥18 years) who attended a health institution for highly complex needs with a suspected diagnosis of coronary syndrome between August and September 2019 in Floridablanca, Colombia. An app for the clinical validation of activity intolerance (North American Nursing Diagnosis Association [NANDA] code 00092) in patients with acute coronary syndrome was developed in two steps: (1) operationalization of the nursing diagnosis and (2) the app development process, which included an evaluation of the initial requirements, development and digitization of the forms, and a pilot test. The agreement level between the 2 evaluating nurses was evaluated with the κ index.
RESULTS We developed a form that included sociodemographic data, hospital admission data, medical history, current pharmacological treatment, and thrombolysis in myocardial infarction risk score (TIMI-RS) and GRACE (Global Registry of Acute Coronary Events) scores. To identify the defining characteristics, we included official guidelines, physiological measurements, and scales such as the Piper fatigue scale and Borg scale. Participants in the pilot test (n=11) had an average age of 63.2 (SD 4.0) years and were 82% (9/11) men; 18% (2/11) had incomplete primary schooling. The agreement between the evaluators was approximately 80% for most of the defining characteristics. The most prevalent characteristics were exercise discomfort (10/11, 91%), weakness (7/11, 64%), dyspnea (3/11, 27%), abnormal heart rate in response to exercise (2/10, 20%), electrocardiogram abnormalities (1/10, 9%), and abnormal blood pressure in response to activity (1/10, 10%).
CONCLUSIONS We developed a mobile app for validating the diagnosis of "activity intolerance." Its use will guarantee not only optimal data collection, minimizing errors to perform validation, but will also allow the identification of individual care needs.
Affiliation(s)
- Lyda Z Rojas
- Centro de Investigaciones, Fundación Cardiovascular de Colombia, Floridablanca, Colombia
- Diana Canon Gomez
- Centro de Investigaciones, Fundación Cardiovascular de Colombia, Floridablanca, Colombia
- Luis Alberto López Romero
- Departamento de Pediatría, de Obstetricia y Ginecología y de Medicina Preventiva y Salud Pública, Universidad Autónoma de Barcelona, Barcelona, Spain
41
Giri SJ, Ibtehaz N, Kihara D. GO2Sum: generating human-readable functional summary of proteins from GO terms. NPJ Syst Biol Appl 2024; 10:29. [PMID: 38491038 PMCID: PMC10943200 DOI: 10.1038/s41540-024-00358-0] [Received: 12/14/2023] [Accepted: 03/05/2024] [Indexed: 03/18/2024] Open
Abstract
Understanding the biological functions of proteins is of fundamental importance in modern biology. To represent a function of proteins, Gene Ontology (GO), a controlled vocabulary, is frequently used, because it is easy to handle by computer programs avoiding open-ended text interpretation. Particularly, the majority of current protein function prediction methods rely on GO terms. However, the extensive list of GO terms that describe a protein function can pose challenges for biologists when it comes to interpretation. In response to this issue, we developed GO2Sum (Gene Ontology terms Summarizer), a model that takes a set of GO terms as input and generates a human-readable summary using the T5 large language model. GO2Sum was developed by fine-tuning T5 on GO term assignments and free-text function descriptions for UniProt entries, enabling it to recreate function descriptions by concatenating GO term descriptions. Our results demonstrated that GO2Sum significantly outperforms the original T5 model that was trained on the entire web corpus in generating Function, Subunit Structure, and Pathway paragraphs for UniProt entries.
Affiliation(s)
- Nabil Ibtehaz
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, USA.
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.
42
Calendo G, Kusic D, Madzo J, Gharani N, Scheinfeldt L. ursaPGx: a new R package to annotate pharmacogenetic star alleles using phased whole-genome sequencing data. Front Bioinform 2024; 4:1351620. [PMID: 38533129 PMCID: PMC10963438 DOI: 10.3389/fbinf.2024.1351620] [Received: 12/06/2023] [Accepted: 02/28/2024] [Indexed: 03/28/2024] Open
Abstract
Long-read sequencing technologies offer new opportunities to generate high-confidence phased whole-genome sequencing data for robust pharmacogenetic annotation. Here, we describe a new user-friendly R package, ursaPGx, designed to accept multi-sample phased whole-genome sequencing data VCF input files and output star allele annotations for pharmacogenes annotated in PharmVar.
Affiliation(s)
- Gennaro Calendo
- Coriell Institute for Medical Research, Camden, NJ, United States
- Dara Kusic
- Coriell Institute for Medical Research, Camden, NJ, United States
- Jozef Madzo
- Coriell Institute for Medical Research, Camden, NJ, United States
- Cooper Medical School of Rowan University, Camden, NJ, United States
- Neda Gharani
- Coriell Institute for Medical Research, Camden, NJ, United States
- Gharani Consulting Limited, London, United Kingdom
- Laura Scheinfeldt
- Coriell Institute for Medical Research, Camden, NJ, United States
- Cooper Medical School of Rowan University, Camden, NJ, United States
43
Qiu Z, Yuan L, Lian CA, Lin B, Chen J, Mu R, Qiao X, Zhang L, Xu Z, Fan L, Zhang Y, Wang S, Li J, Cao H, Li B, Chen B, Song C, Liu Y, Shi L, Tian Y, Ni J, Zhang T, Zhou J, Zhuang WQ, Yu K. BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis. Nat Commun 2024; 15:2179. [PMID: 38467684 PMCID: PMC10928208 DOI: 10.1038/s41467-024-46539-7] [Received: 03/10/2023] [Accepted: 03/01/2024] [Indexed: 03/13/2024] Open
Abstract
Metagenomic binning is an essential technique for genome-resolved characterization of uncultured microorganisms in various ecosystems but hampered by the low efficiency of binning tools in adequately recovering metagenome-assembled genomes (MAGs). Here, we introduce BASALT (Binning Across a Series of Assemblies Toolkit) for binning and refinement of short- and long-read sequencing data. BASALT employs multiple binners with multiple thresholds to produce initial bins, then utilizes neural networks to identify core sequences to remove redundant bins and refine non-redundant bins. Using the same assemblies generated from Critical Assessment of Metagenome Interpretation (CAMI) datasets, BASALT produces up to twice as many MAGs as VAMB, DASTool, or metaWRAP. Processing assemblies from a lake sediment dataset, BASALT produces ~30% more MAGs than metaWRAP, including 21 unique class-level prokaryotic lineages. Functional annotations reveal that BASALT can retrieve 47.6% more non-redundant opening-reading frames than metaWRAP. These results highlight the robust handling of metagenomic sequencing data of BASALT.
Affiliation(s)
- Zhiguang Qiu
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- Li Yuan
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
- Chun-Ang Lian
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- Bin Lin
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
- Jie Chen
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
- Rong Mu
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- Xuejiao Qiao
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- Liyu Zhang
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- Zheng Xu
- Southern University of Sciences and Technology Yantian Hospital, Shenzhen, China
- Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Lu Fan
- Department of Ocean Science and Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, China
- Yunzeng Zhang
- Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, China
- Shanquan Wang
- Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Sun Yat-Sen University, Guangzhou, China
- Junyi Li
- School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China
- Huiluo Cao
- Department of Microbiology, University of Hong Kong, Hong Kong, China
- Bing Li
- Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Baowei Chen
- Guangdong Provincial Key Laboratory of Marine Resources and Coastal Engineering, School of Marine Sciences, Sun Yat-sen University, Zhuhai, China
- Chi Song
- Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Wuhan Benagen Technology Co., Ltd, Wuhan, China
- Yongxin Liu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Lili Shi
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- State Key Laboratory of Chemical Oncogenomics, School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, China
- Yonghong Tian
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
- Jinren Ni
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- College of Environmental Sciences and Engineering, Key Laboratory of Water and Sediment Sciences, Ministry of Education, Peking University, Beijing, China
- Tong Zhang
- Department of Civil Engineering, University of Hong Kong, Hong Kong, China
- Jizhong Zhou
- Institute for Environmental Genomics, University of Oklahoma, Norman, OK, USA
- Wei-Qin Zhuang
- Department of Civil and Environmental Engineering, Faculty of Engineering, University of Auckland, Auckland, New Zealand
- Ke Yu
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China.
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China.
44
Hsiao Y, Zhang H, Li GX, Deng Y, Yu F, Kahrood HV, Steele JR, Schittenhelm RB, Nesvizhskii AI. Analysis and visualization of quantitative proteomics data using FragPipe-Analyst. bioRxiv 2024:2024.03.05.583643. [PMID: 38496650 PMCID: PMC10942459 DOI: 10.1101/2024.03.05.583643] [Indexed: 03/19/2024]
Abstract
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use and user-friendly downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
Affiliation(s)
- Yi Hsiao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Haijian Zhang
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Yamei Deng
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Hossein Valipour Kahrood
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Monash Genomics & Bioinformatics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Joel R. Steele
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Ralf B. Schittenhelm
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Alexey I. Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
45
Montepietra D, Tesei G, Martins JM, Kunze MBA, Best RB, Lindorff-Larsen K. FRETpredict: a Python package for FRET efficiency predictions using rotamer libraries. Commun Biol 2024; 7:298. [PMID: 38461354 PMCID: PMC10925062 DOI: 10.1038/s42003-024-05910-6] [Received: 06/14/2023] [Accepted: 02/12/2024] [Indexed: 03/11/2024] Open
Abstract
Förster resonance energy transfer (FRET) is a widely-used and versatile technique for the structural characterization of biomolecules. Here, we introduce FRETpredict, an easy-to-use Python software to predict FRET efficiencies from ensembles of protein conformations. FRETpredict uses a rotamer library approach to describe the FRET probes covalently bound to the protein. The software efficiently and flexibly operates on large conformational ensembles such as those generated by molecular dynamics simulations to facilitate the validation or refinement of molecular models and the interpretation of experimental data. We provide access to rotamer libraries for many commonly used dyes and linkers and describe a general methodology to generate new rotamer libraries for FRET probes. We demonstrate the performance and accuracy of the software for different types of systems: a rigid peptide (polyproline 11), an intrinsically disordered protein (ACTR), and three folded proteins (HiSiaP, SBD2, and MalE). FRETpredict is open source (GPLv3) and is available at github.com/KULL-Centre/FRETpredict and as a Python PyPI package at pypi.org/project/FRETpredict.
Affiliation(s)
- Daniele Montepietra
- Department of Chemical, Life and Environmental Sustainability Sciences, University of Parma, Parma, 43125, Italy
- Istituto Nanoscienze - CNR-NANO, Center S3, via G. Campi 213/A, 41125, Modena, Italy
- Giulio Tesei
- Structural Biology and NMR Laboratory & the Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, DK-2200, Denmark
- João M Martins
- Structural Biology and NMR Laboratory & the Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, DK-2200, Denmark
- Micha B A Kunze
- Structural Biology and NMR Laboratory & the Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, DK-2200, Denmark
- Robert B Best
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, 20892-0520, USA.
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory & the Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, DK-2200, Denmark.
46
Stribling D, Xia Y, Amer MK, Graim KS, Mulligan CJ, Renne R. The model student: GPT-4 performance on graduate biomedical science exams. Sci Rep 2024; 14:5670. [PMID: 38453979 PMCID: PMC10920673 DOI: 10.1038/s41598-024-55568-7] [Received: 09/28/2023] [Accepted: 02/25/2024] [Indexed: 03/09/2024] Open
Abstract
The GPT-4 large language model (LLM) and ChatGPT chatbot have emerged as accessible and capable tools for generating English-language text in a variety of formats. GPT-4 has previously performed well when applied to questions from multiple standardized examinations. However, further evaluation of trustworthiness and accuracy of GPT-4 responses across various knowledge domains is essential before its use as a reference resource. Here, we assess GPT-4 performance on nine graduate-level examinations in the biomedical sciences (seven blinded), finding that GPT-4 scores exceed the student average in seven of nine cases and exceed all student scores for four exams. GPT-4 performed very well on fill-in-the-blank, short-answer, and essay questions, and correctly answered several questions on figures sourced from published manuscripts. Conversely, GPT-4 performed poorly on questions with figures containing simulated data and those requiring a hand-drawn answer. Two GPT-4 answer-sets were flagged as plagiarism based on answer similarity and some model responses included detailed hallucinations. In addition to assessing GPT-4 performance, we discuss patterns and limitations in GPT-4 capabilities with the goal of informing design of future academic examinations in the chatbot era.
Affiliation(s)
- Daniel Stribling
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32610, USA.
- UF Genetics Institute, University of Florida, Gainesville, FL, 32610, USA.
- UF Health Cancer Center, University of Florida, Gainesville, FL, 32610, USA.
- Yuxing Xia
- Department of Neuroscience, Center for Translational Research in Neurodegenerative Disease, College of Medicine, University of Florida, Gainesville, FL, 32610, USA
- Department of Neurology, UCLA, Los Angeles, CA, 90095, USA
- Maha K Amer
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32610, USA
- Kiley S Graim
- Department of Computer and Information Science and Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, 32610, USA
- Connie J Mulligan
- UF Genetics Institute, University of Florida, Gainesville, FL, 32610, USA
- Department of Anthropology, University of Florida, Gainesville, FL, 32610, USA
- Rolf Renne
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32610, USA.
- UF Genetics Institute, University of Florida, Gainesville, FL, 32610, USA.
- UF Health Cancer Center, University of Florida, Gainesville, FL, 32610, USA.
47
Hughes E, Kenwright AM. SimpleNMR: An interactive graph network approach to aid constitutional isomer verification using standard 1D and 2D NMR experiments. Magn Reson Chem 2024. [PMID: 38445574 DOI: 10.1002/mrc.5441] [Received: 12/08/2023] [Revised: 02/16/2024] [Accepted: 02/17/2024] [Indexed: 03/07/2024]
Abstract
Despite progress in computer automated solutions, constitutional isomer verification by NMR using one- and two-dimensional data sets is still, in the main, a manual, user-intensive activity that is challenging for a number of reasons. These include the problem of simultaneously keeping track of the information from a number of separate NMR experiments and the difficulty of another researcher subsequently verifying the assignments made without having to independently repeat the whole analysis. This paper describes a graphical interactive approach that overcomes some of these problems. By using concepts used to visualise graph networks, we have been able to represent the NMR data in a manner that highlights directly the link between the different NMR experiments and the molecule of interest. Furthermore, by making the graph networks interactive, a user can easily validate and correct the assignment and understand the decisions made in arriving at the solution. We have developed a usable proof-of-concept computer program, 'simpleNMR', written in Python to illustrate the ideas and approach.
Affiliation(s)
- Eric Hughes
- Department of Chemistry, University of Durham, Durham, UK
|
48
|
Wang Z, Xia Y, Mills L, Nikolakopoulos AN, Maeser N, Dehm SM, Sheltzer JM, Sun R. Evolving copy number gains promote tumor expansion and bolster mutational diversification. Nat Commun 2024; 15:2025. [PMID: 38448455 PMCID: PMC10918155 DOI: 10.1038/s41467-024-46414-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 02/20/2024] [Indexed: 03/08/2024] Open
Abstract
The timing and fitness effect of somatic copy number alterations (SCNA) in cancer evolution remains poorly understood. Here we present a framework to determine the timing of a clonal SCNA that encompasses multiple gains. This involves calculating the proportion of time from its last gain to the onset of population expansion (lead time) as well as the proportion of time prior to its first gain (initiation time). Our method capitalizes on the observation that a genomic segment, while in a specific copy number (CN) state, accumulates point mutations proportionally to its CN. Analyzing 184 whole genome sequenced samples from 75 patients across five tumor types, we commonly observe late gains following early initiating events, occurring just before the clonal expansion relevant to the sampling. These include gains acquired after genome doubling in more than 60% of cases. Notably, mathematical modeling suggests that late clonal gains may contain final-expansion drivers. Lastly, SCNAs bolster mutational diversification between subpopulations, exacerbating the circle of proliferation and increasing heterogeneity.
Affiliation(s)
- Zicheng Wang
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- School of Data Science, The Chinese University of Hong Kong (CUHK-Shenzhen), Shenzhen, China
- Yunong Xia
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Lauren Mills
- Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA
- Athanasios N Nikolakopoulos
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Nicole Maeser
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Scott M Dehm
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Department of Urology, University of Minnesota, Minneapolis, MN, USA
- Ruping Sun
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA.
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA.
|
49
|
Rodriguez DV, Lawrence K, Gonzalez J, Brandfield-Harvey B, Xu L, Tasneem S, Levine DL, Mann D. Leveraging Generative AI Tools to Support the Development of Digital Solutions in Health Care Research: Case Study. JMIR Hum Factors 2024; 11:e52885. [PMID: 38446539 PMCID: PMC10955400 DOI: 10.2196/52885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/27/2023] [Accepted: 12/15/2023] [Indexed: 03/07/2024] Open
Abstract
BACKGROUND: Generative artificial intelligence has the potential to revolutionize health technology product development by improving coding quality, efficiency, documentation, quality assessment and review, and troubleshooting. OBJECTIVE: This paper explores the application of a commercially available generative artificial intelligence tool (ChatGPT) to the development of a digital health behavior change intervention designed to support patient engagement in a commercial digital diabetes prevention program. METHODS: We examined the capacity, advantages, and limitations of ChatGPT to support digital product idea conceptualization, intervention content development, and the software engineering process, including software requirement generation, software design, and code production. In total, 11 evaluators, each with at least 10 years of experience in fields of study ranging from medicine and implementation science to computer science, participated in the output review process (ChatGPT vs human-generated output). All had familiarity or prior exposure to the original personalized automatic messaging system intervention. The evaluators rated the ChatGPT-produced outputs in terms of understandability, usability, novelty, relevance, completeness, and efficiency. RESULTS: Most metrics received positive scores. We identified that ChatGPT can (1) support developers to achieve high-quality products faster and (2) facilitate nontechnical communication and system understanding between technical and nontechnical team members around the development goal of rapid and easy-to-build computational solutions for medical technologies. CONCLUSIONS: ChatGPT can serve as a usable facilitator for researchers engaging in the software development life cycle, from product conceptualization to feature identification and user story development to code generation. TRIAL REGISTRATION: ClinicalTrials.gov NCT04049500; https://clinicaltrials.gov/ct2/show/NCT04049500.
Affiliation(s)
- Danissa V Rodriguez
- Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States
- Katharine Lawrence
- Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States
- Medical Center Information Technology, Department of Health Informatics, New York University Langone Health, New York, NY, United States
- Javier Gonzalez
- Medical Center Information Technology, Department of Health Informatics, New York University Langone Health, New York, NY, United States
- Beatrix Brandfield-Harvey
- Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States
- Lynn Xu
- Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States
- Sumaiya Tasneem
- Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States
- Defne L Levine
- Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States
- Devin Mann
- Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States
- Medical Center Information Technology, Department of Health Informatics, New York University Langone Health, New York, NY, United States
|
50
|
Boehm D, Strantz C, Christoph J, Busch H, Ganslandt T, Unberath P. Data Visualization Support for Tumor Boards and Clinical Oncology: Protocol for a Scoping Review. JMIR Res Protoc 2024; 13:e53627. [PMID: 38441925 PMCID: PMC10951826 DOI: 10.2196/53627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 12/17/2023] [Accepted: 12/19/2023] [Indexed: 03/07/2024] Open
Abstract
BACKGROUND: Complex and expanding data sets in clinical oncology applications require flexible and interactive visualization of patient data to provide the maximum amount of information to physicians and other medical practitioners. Interdisciplinary tumor conferences in particular profit from customized tools to integrate, link, and visualize relevant data from all professions involved. OBJECTIVE: The scoping review proposed in this protocol aims to identify and present currently available data visualization tools for tumor boards and related areas. The objective of the review will be to provide not only an overview of digital tools currently used in tumor board settings, but also the data included, the respective visualization solutions, and their integration into hospital processes. METHODS: The planned scoping review process is based on the Arksey and O'Malley scoping study framework. The following electronic databases will be searched for articles published in English: PubMed, Web of Knowledge, and SCOPUS. Eligible articles will first undergo a deduplication step, followed by the screening of titles and abstracts. Second, a full-text screening will be used to reach the final decision about article selection. At least 2 reviewers will independently screen titles, abstracts, and full-text reports. Conflicting inclusion decisions will be resolved by a third reviewer. The remaining literature will be analyzed using a data extraction template proposed in this protocol. The template includes a variety of meta information as well as specific questions aiming to answer the research question: "What are the key features of data visualization solutions used in molecular and organ tumor boards, and how are these elements integrated and used within the clinical setting?" The findings will be compiled, charted, and presented as specified in the scoping study framework. Data for included tools may be supplemented with additional manual literature searches. The entire review process will be documented in alignment with the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) flowchart. RESULTS: The results of this scoping review will be reported per the expanded PRISMA-ScR guidelines. A preliminary search using PubMed, Web of Knowledge, and Scopus resulted in 1320 articles after deduplication that will be included in the further review process. We expect the results to be published during the second quarter of 2024. CONCLUSIONS: Visualization is a key process in leveraging a data set's potentially available information and enabling its use in an interdisciplinary setting. The scoping review described in this protocol aims to present the status quo of visualization solutions for tumor board and clinical oncology applications and their integration into hospital processes. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/53627.
Affiliation(s)
- Dominik Boehm
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Bavarian Cancer Research Center (Bayerisches Zentrum für Krebsforschung), Erlangen, Germany
- Cosima Strantz
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Jan Christoph
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Junior Research Group (Bio-)medical Data Science, Faculty of Medicine, Martin-Luther-University Halle-Wittenberg, Halle, Germany
- Hauke Busch
- Group for Medical Systems Biology, Lübeck Institute of Experimental Dermatology, University of Lübeck, Lübeck, Germany
- Thomas Ganslandt
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Philipp Unberath
- Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- SRH Fürth University of Applied Sciences, Fürth, Germany
|