1
Cui H, Ford B, Starr J, Reznicek A, Zhang L, Macklin JA. Authors’ attitude toward adopting a new workflow to improve the computability of phenotype publications. Database (Oxford) 2022; 2022:6519872. [PMID: 35106535 PMCID: PMC9278328 DOI: 10.1093/database/baac001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/29/2021] [Revised: 11/24/2021] [Accepted: 01/10/2022] [Indexed: 11/13/2022]
Abstract
Critical to answering large-scale questions in biology is the integration of knowledge from different disciplines into a coherent, computable whole. Controlled vocabularies such as ontologies represent a clear path toward this goal. Using survey questionnaires, we examined the attitudes of biologists toward adopting controlled vocabularies in phenotype publications. Our questions cover current experience and overall attitude with controlled vocabularies, the awareness of the issues around ambiguity and inconsistency in phenotype descriptions and post-publication professional data curation, the preferred solutions and the effort and desired rewards for adopting a new authoring workflow. Results suggest that although the existence of controlled vocabularies is widespread, their use is not common. A majority of respondents (74%) are frustrated with ambiguity in phenotypic descriptions, and there is a strong agreement (mean agreement score 4.21 out of 5) that author curation would better reflect the original meaning of phenotype data. Moreover, the vast majority (85%) of researchers would try a new authoring workflow if resultant data were more consistent and less ambiguous. Even more respondents (93%) suggested that they would try and possibly adopt a new authoring workflow if it required 5% additional effort as compared to normal, but higher rates resulted in a steep decline in likely adoption rates. Among the four different types of rewards, two types of citations were the most desired incentives for authors to produce computable data. Overall, our results suggest the adoption of a new authoring workflow would be accelerated by a user-friendly and efficient software-authoring tool, an increased awareness of the challenges text ambiguity creates for external curators and an elevated appreciation of the benefits of controlled vocabularies.
Affiliation(s)
- Hong Cui
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Bruce Ford
  - Department of Biological Sciences, University of Manitoba, 50 Sifton Road, Winnipeg, MB R3T 2N2, Canada
- Julian Starr
  - Department of Biology, University of Ottawa, 30 Marie Curie Road, Ottawa, ON K1N 6N5, Canada
- Anton Reznicek
  - LSA Herbarium, University of Michigan, 3600 Varsity Drive #1046, Ann Arbor, MI 48019, USA
- Limin Zhang
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- James A Macklin
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, ON K1A 0C6, Canada
2
Zhang L, Yang X, Cota Z, Cui H, Ford B, Chen HL, Macklin JA, Reznicek A, Starr J. Which methods are the most effective in enabling novice users to participate in ontology creation? A usability study. Database (Oxford) 2021; 2021:baab035. [PMID: 34156445 PMCID: PMC8218699 DOI: 10.1093/database/baab035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/04/2021] [Revised: 04/02/2021] [Accepted: 05/22/2021] [Indexed: 11/14/2022]
Abstract
Producing findable, accessible, interoperable and reusable (FAIR) data cannot be accomplished solely by data curators in all disciplines. In biology, we have shown that phenotypic data curation is not only costly, but it is burdened with inter-curator variation. We intend to propose a software platform that would enable all data producers, including authors of scientific publications, to produce ontologized data at the time of publication. Working toward this goal, we need to identify ontology construction methods that are preferred by end users. Here, we employ two usability studies to evaluate effectiveness, efficiency and user satisfaction with a set of four methods that allow an end user to add terms and their relations to an ontology. Thirty-three participants took part in a controlled experiment where they evaluated the four methods (Quick Form, Wizard, WebProtégé and Wikidata) after watching demonstration videos and completing a hands-on task. Another think-aloud study was conducted with three professional botanists. The efficiency, effectiveness and user confidence in the methods are clearly revealed through statistical and content analyses of participants' comments. Quick Form, Wizard and WebProtégé offer distinct strengths that would benefit our author-driven FAIR data generation system. Features preferred by the participants will guide the design of future iterations.
Affiliation(s)
- Limin Zhang
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Xingyi Yang
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Zuleima Cota
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Hong Cui
  - School of Information, University of Arizona, 1103 E. Second Street, Tucson, AZ 85705, USA
- Bruce Ford
  - Department of Biological Sciences, University of Manitoba, 50 Sifton Road, Winnipeg, MB R3T 2N2, Canada
- Hsin-liang Chen
  - Curtis Laws Wilson Library, Missouri University of Science and Technology, 400 W. 14th Street, Rolla, MO 65409, USA
- James A Macklin
  - Information protected by Canada government, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
- Anton Reznicek
  - LSA Herbarium, University of Michigan, 3600 Varsity Drive, Ann Arbor, MI 48019, USA
- Julian Starr
  - Department of Biology, University of Ottawa, 30 Marie Curie Road, Ottawa, ON K1N 6N5, Canada
3
Cui H, Zhang L, Ford B, Cheng HL, Macklin JA, Reznicek A, Starr J. Measurement Recorder: developing a useful tool for making species descriptions that produces computable phenotypes. Database (Oxford) 2020; 2020:5995854. [PMID: 33216896 PMCID: PMC7678789 DOI: 10.1093/database/baaa079] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/07/2020] [Revised: 08/24/2020] [Accepted: 08/27/2020] [Indexed: 12/31/2022]
Abstract
To use published phenotype information in computational analyses, there have been efforts to convert descriptions of phenotype characters from human languages to ontologized statements. This postpublication curation process is not only slow and costly, it is also burdened with significant intercurator variation (including curator-author variation), due to different interpretations of a character by various individuals. This problem is inherent in any human-based intellectual activity. To address this problem, making scientific publications semantically clear (i.e. computable) by the authors at the time of publication is a critical step if we are to avoid postpublication curation. To help authors efficiently produce species phenotypes while producing computable data, we are experimenting with an author-driven ontology development approach and developing and evaluating a series of ontology-aware software modules that would create publishable species descriptions that are readily useable in scientific computations. The first software module prototype, called Measurement Recorder, has been developed to assist authors in defining continuous measurements and is reported in this paper. Two usability studies of the software were conducted with 22 undergraduate students majoring in information science and 32 in biology. Results suggest that participants can use Measurement Recorder without training and they find it easy to use after limited practice. Participants also appreciate the semantic enhancement features. Measurement Recorder's character reuse features facilitate character convergence among participants by 48% and have the potential to further reduce user errors in defining characters. A set of software design issues has also been identified and then corrected. Measurement Recorder enables authors to record measurements in a semantically clear manner and enriches phenotype ontology along the way.
Future work includes representing the semantic data as Resource Description Framework (RDF) knowledge graphs and characterizing the division of work between authors as domain knowledge providers and ontology engineers as knowledge formalizers in this new author-driven ontology development approach.
Affiliation(s)
- Hong Cui
  - School of Information, University of Arizona, Tucson, AZ 85705, USA
- Limin Zhang
  - School of Information, University of Arizona, Tucson, AZ 85705, USA
- Bruce Ford
  - Department of Biological Sciences, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
- Hsin-Liang Cheng
  - Curtis Laws Wilson Library, Missouri University of Science and Technology, Rolla, MO 65409, USA
- James A Macklin
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
- Anton Reznicek
  - LSA Herbarium, University of Michigan, Ann Arbor, MI 48019, USA
- Julian Starr
  - Department of Biology, University of Ottawa, Ottawa, ON K1N 6N5, Canada