1
Vollmar M, Tirunagari S, Harrus D, Armstrong D, Gáborová R, Gupta D, Afonso MQL, Evans G, Velankar S. Dataset from a human-in-the-loop approach to identify functionally important protein residues from literature. Sci Data 2024; 11:1032. [PMID: 39333508 PMCID: PMC11436914 DOI: 10.1038/s41597-024-03841-9] [Received: 03/11/2024] [Accepted: 08/29/2024] [Indexed: 09/29/2024]
Abstract
We present a novel system that leverages curators in the loop to develop a dataset and model for detecting structural features and functional annotations at the residue level from standard publication text. Our approach integrates data from multiple resources, including PDBe, EuropePMC, PubMedCentral, and PubMed, combined with annotation guidelines from UniProt, and uses LitSuggest and HuggingFace models as tools in the annotation process. A team of seven annotators manually curated ten articles for named entities, which we used to train a starting PubMedBERT model from HuggingFace. Using a human-in-the-loop annotation system, we iteratively developed the best model, which achieved a precision of 0.90, a recall of 0.92, and an F1-measure of 0.91. Our proposed system showcases a successful synergy of machine learning techniques and human expertise in curating a dataset for residue-level functional annotations and protein structure features. The results demonstrate the potential for broader applications in protein research, bridging the gap between advanced machine learning models and the indispensable insights of domain experts.
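The three reported metrics can be checked against each other, since the F1-measure is the harmonic mean of precision and recall. The following is a minimal sketch of that check, not the authors' evaluation code; the function name `f1_score` is hypothetical, and it assumes the reported F1 was computed from the same aggregate precision and recall rather than averaged per entity type.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract above.
precision, recall = 0.90, 0.92
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}")
```

Under that assumption, the value rounds to the 0.91 reported in the abstract.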
Affiliation(s)
- Melanie Vollmar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Santosh Tirunagari
- Literature Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Deborah Harrus
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- David Armstrong
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Romana Gáborová
- CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 62500, Brno, Czech Republic
- Deepti Gupta
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Marcelo Querino Lima Afonso
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Genevieve Evans
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
2
Leite ELL, Sheila de Queiroz Souza A, Riceli Vasconcelos Ribeiro P, de Cássia Alves Pereira R, Florêncio Martins N, Kueirislene Amâncio Ferreira M, Silva Alencar de Menezes JE, Silva Dos Santos H, Deusdênia Loiola Pessoa O, Marques Canuto K. Molecular Docking and GC/MS-Based Approach for Identification of Anxiolytic Alkaloids from Griffinia (Amaryllidaceae) Species in a Zebrafish Model. Chem Biodivers 2024; 21:e202302122. [PMID: 38354224 DOI: 10.1002/cbdv.202302122] [Received: 01/06/2024] [Revised: 02/12/2024] [Accepted: 02/13/2024] [Indexed: 02/16/2024]
Abstract
Griffinia gardneriana Ravenna, Griffinia liboniana Morren and Griffinia nocturna Ravenna (Amaryllidaceae) are bulbous plants found in tropical regions of Brazil. Our work aimed to determine the alkaloid profiles of Griffinia spp. and evaluate their anxiolytic potential through in vivo and in silico assays. The plants were grown in greenhouses and dried, and their ground bulbs were subjected to liquid-liquid partitions, resulting in alkaloid fractions that were analyzed by gas chromatography coupled to mass spectrometry (GC-MS). Anxiolytic activity was evaluated in zebrafish (Danio rerio) through intraperitoneal injection at doses of 40, 100 and 200 mg/kg in the light-dark box test. GC-MS analyses revealed 23 alkaloids belonging to different skeleton types: lycorine, homolycorine, galanthamine, crinine, haemanthamine, montanine and narciclasine. The chemical profiles were relatively similar, with 8 alkaloids common to the three species. The major component of G. gardneriana and G. liboniana was lycorine, while G. nocturna consisted mainly of anhydrolycorine. All three alkaloid fractions demonstrated an anxiolytic effect. Furthermore, pre-treatment with diazepam and pizotifen was able to reverse the anxiolytic action, indicating the involvement of GABAergic and serotonergic receptors. Molecular docking showed that the compounds vittatine, lycorine and 11,12-dehydro-2-methoxyassoanine had high affinity for both receptors, suggesting that they are responsible for the anxiolytic effect.
Affiliation(s)
- Elder Luis Lima Leite
- Embrapa Agroindústria Tropical, Fortaleza, CE, Brazil
- Departamento de Química Orgânica e Inorgânica, Universidade Federal do Ceará, Fortaleza, Ceará, Brazil
- Maria Kueirislene Amâncio Ferreira
- Programa de Pós-graduação em Ciências Naturais, Universidade Estadual do Ceará, Fortaleza, CE, Brazil
- Centro de Ciências Exatas e Tecnologia, Universidade Estadual do Vale do Acaraú, Sobral, CE, Brazil
- Hélcio Silva Dos Santos
- Programa de Pós-graduação em Ciências Naturais, Universidade Estadual do Ceará, Fortaleza, CE, Brazil
- Centro de Ciências Exatas e Tecnologia, Universidade Estadual do Vale do Acaraú, Sobral, CE, Brazil
- Kirley Marques Canuto
- Embrapa Agroindústria Tropical, Fortaleza, CE, Brazil
- Programa de Pós-graduação em Ciências Naturais, Universidade Estadual do Ceará, Fortaleza, CE, Brazil