1
Zhang L, Starr J, Ford B, Reznicek A, Zhou Y, Léveillé-Bourret É, Lacroix-Carignan É, Cayouette J, Smith TW, Sutherland D, Catling P, Saarela JM, Cui H, Macklin J. Helping authors produce FAIR taxonomic data: evaluation of an author-driven phenotype data production prototype. Database (Oxford) 2025; 2025:baae097. [PMID: 39879563 PMCID: PMC11928229 DOI: 10.1093/database/baae097] [Received: 04/19/2024] [Revised: 08/01/2024] [Accepted: 08/28/2024] [Indexed: 01/31/2025]
Abstract
It is well known that the use of vocabulary in phenotype treatments is often inconsistent. An earlier survey of biologists who create or use phenotypic characters revealed that this lack of standardization leads to ambiguities, frustrating both the consumers and producers of phenotypic data. Such ambiguities are challenging for biologists, and more so for Artificial Intelligence, to resolve. That survey also indicated a strong interest in a new authoring workflow supported by ontologies to ensure published phenotype data are FAIR (Findable, Accessible, Interoperable, and Reusable) and suitable for large-scale computational analyses. In this article, we introduce a prototype software system designed for authors to produce computational phenotype data. This platform includes a web-based, ontology-enhanced editor for taxonomic characters (Character Recorder), an Ontology Backend holding standardized vocabulary (the Carex Ontology), and a mobile application for resolving ontological conflicts (Conflict Resolver). We present two formal user evaluations of Character Recorder, the main interface authors would interact with to produce FAIR data. The evaluations were conducted with undergraduate biology students and Carex experts. We evaluated Character Recorder against Microsoft Excel on their effectiveness, efficiency, and the cognitive demands of the users in producing computable taxon-by-character matrices. The evaluations showed that Character Recorder is quickly learnable for both student and professional participants, with its cognitive demand comparable to Excel's. Participants agreed that the quality of the data Character Recorder yielded was superior. Students praised Character Recorder's educational value, while Carex experts were keen to recommend it and help evolve it from a prototype into a comprehensive tool. Feature improvements recommended by expert participants were implemented after the evaluation.
Affiliation(s)
- Limin Zhang
  - School of Information, University of Arizona, 1103 E. 2nd Street, Tucson, AZ 85719, USA
  - School of Fine Arts, Huaiyin Normal University, 71 Jiaotong Road, Huaian, Jiangsu 223001, China
- Julian Starr
  - Department of Biology, University of Ottawa, 30 Marie Curie, Ottawa, ON K1N 6N5, Canada
- Bruce Ford
  - Department of Biological Sciences, University of Manitoba, 50 Sifton Road, Winnipeg, MB R3T 2N2, Canada
- Anton Reznicek
  - University Herbarium, University of Michigan, 3600 Varsity Drive, Ann Arbor, MI 48108, USA
- Yuxuan Zhou
  - School of Information, University of Arizona, 1103 E. 2nd Street, Tucson, AZ 85719, USA
- Étienne Léveillé-Bourret
  - Department of Biological Sciences, Université de Montréal, 1375 Avenue Thérèse-Lavoie-Roux, Montréal, QC H3A 2B3, Canada
- Étienne Lacroix-Carignan
  - Department of Biological Sciences, Université de Montréal, 1375 Avenue Thérèse-Lavoie-Roux, Montréal, QC H3A 2B3, Canada
- Jacques Cayouette
  - Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, ON CA K1A 0C6, Canada
- Tyler W Smith
  - Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, ON CA K1A 0C6, Canada
- Donald Sutherland
  - Natural Heritage Information Centre, Ontario Ministry of Natural Resources, P.O. Box 7000, Peterborough, Ontario K9J 8M5, Canada
- Paul Catling
  - Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, ON CA K1A 0C6, Canada
- Jeffery M Saarela
  - Research and Collections, Canadian Museum of Nature, 240 McLeod St, Ottawa, ON K1P 6P4, Canada
- Hong Cui
  - School of Information, University of Arizona, 1103 E. 2nd Street, Tucson, AZ 85719, USA
- James Macklin
  - Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, ON CA K1A 0C6, Canada
2
Cui H, Ford B, Starr J, Reznicek A, Zhang L, Macklin JA. Authors’ attitude toward adopting a new workflow to improve the computability of phenotype publications. Database (Oxford) 2022; 2022:6519872. [PMID: 35106535 PMCID: PMC9278328 DOI: 10.1093/database/baac001] [Received: 07/29/2021] [Revised: 11/24/2021] [Accepted: 01/10/2022] [Indexed: 11/13/2022]
Abstract
Critical to answering large-scale questions in biology is the integration of knowledge from different disciplines into a coherent, computable whole. Controlled vocabularies such as ontologies represent a clear path toward this goal. Using survey questionnaires, we examined the attitudes of biologists toward adopting controlled vocabularies in phenotype publications. Our questions cover current experience and overall attitude with controlled vocabularies, the awareness of the issues around ambiguity and inconsistency in phenotype descriptions and post-publication professional data curation, the preferred solutions and the effort and desired rewards for adopting a new authoring workflow. Results suggest that although the existence of controlled vocabularies is widespread, their use is not common. A majority of respondents (74%) are frustrated with ambiguity in phenotypic descriptions, and there is a strong agreement (mean agreement score 4.21 out of 5) that author curation would better reflect the original meaning of phenotype data. Moreover, the vast majority (85%) of researchers would try a new authoring workflow if resultant data were more consistent and less ambiguous. Even more respondents (93%) suggested that they would try and possibly adopt a new authoring workflow if it required 5% additional effort as compared to normal, but higher rates resulted in a steep decline in likely adoption rates. Among the four different types of rewards, two types of citations were the most desired incentives for authors to produce computable data. Overall, our results suggest the adoption of a new authoring workflow would be accelerated by a user-friendly and efficient software-authoring tool, an increased awareness of the challenges text ambiguity creates for external curators and an elevated appreciation of the benefits of controlled vocabularies.
Affiliation(s)
- Hong Cui
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Bruce Ford
  - Department of Biological Sciences, University of Manitoba, 50 Sifton Road, Winnipeg, MB R3T 2N2, Canada
- Julian Starr
  - Department of Biology, University of Ottawa, 30 Marie Curie Road, Ottawa, ON K1N 6N5, Canada
- Anton Reznicek
  - SLA Herbarium, University of Michigan, 3600 Varsity Drive #1046, Ann Arbor, MI 48019, USA
- Limin Zhang
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- James A Macklin
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, ON K1A 0C6, Canada
3
Zhang L, Yang X, Cota Z, Cui H, Ford B, Chen HL, Macklin JA, Reznicek A, Starr J. Which methods are the most effective in enabling novice users to participate in ontology creation? A usability study. Database (Oxford) 2021; 2021:baab035. [PMID: 34156445 PMCID: PMC8218699 DOI: 10.1093/database/baab035] [Received: 02/04/2021] [Revised: 04/02/2021] [Accepted: 05/22/2021] [Indexed: 11/14/2022]
Abstract
Producing findable, accessible, interoperable and reusable (FAIR) data cannot be accomplished solely by data curators in all disciplines. In biology, we have shown that phenotypic data curation is not only costly, but it is burdened with inter-curator variation. We intend to propose a software platform that would enable all data producers, including authors of scientific publications, to produce ontologized data at the time of publication. Working toward this goal, we need to identify ontology construction methods that are preferred by end users. Here, we employ two usability studies to evaluate effectiveness, efficiency and user satisfaction with a set of four methods that allow an end user to add terms and their relations to an ontology. Thirty-three participants took part in a controlled experiment where they evaluated the four methods (Quick Form, Wizard, WebProtégé and Wikidata) after watching demonstration videos and completing a hands-on task. Another think-aloud study was conducted with three professional botanists. The efficiency, effectiveness and user confidence in the methods are clearly revealed through statistical and content analyses of participants' comments. Quick Form, Wizard and WebProtégé offer distinct strengths that would benefit our author-driven FAIR data generation system. Features preferred by the participants will guide the design of future iterations.
Affiliation(s)
- Limin Zhang
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Xingyi Yang
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Zuleima Cota
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Hong Cui
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Bruce Ford
  - Department of Biological Sciences, University of Manitoba, 50 Sifton Road, Winnipeg, MB R3T 2N2, Canada
- Hsin-liang Chen
  - Curtis Laws Wilson Library, Missouri University of Science and Technology, 400 W. 14th Street, Rolla, MO 65409, USA
- James A Macklin
  - Information protected by Canada government, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
- Anton Reznicek
  - SLA Herbarium, University of Michigan, 3600 Varsity Drive, Ann Arbor, MI 48019, USA
- Julian Starr
  - Department of Biology, University of Ottawa, 30 Marie Curie Road, Ottawa, ON K1N 6N5, Canada