1
Sternberg PW, Van Auken K, Wang Q, Wright A, Yook K, Zarowiecki M, Arnaboldi V, Becerra A, Brown S, Cain S, Chan J, Chen WJ, Cho J, Davis P, Diamantakis S, Dyer S, Grigoriadis D, Grove CA, Harris T, Howe K, Kishore R, Lee R, Longden I, Luypaert M, Müller HM, Nuin P, Quinton-Tulloch M, Raciti D, Schedl T, Schindelman G, Stein L. WormBase 2024: status and transitioning to Alliance infrastructure. Genetics 2024; 227:iyae050. [PMID: 38573366 PMCID: PMC11075546 DOI: 10.1093/genetics/iyae050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 04/05/2024]
Abstract
WormBase has been the major repository and knowledgebase of information about the genome and genetics of Caenorhabditis elegans and other nematodes of experimental interest for over 2 decades. We have 3 goals: to keep current with the fast-paced C. elegans research, to provide better integration with other resources, and to be sustainable. Here, we discuss the current state of WormBase as well as progress and plans for moving core WormBase infrastructure to the Alliance of Genome Resources (the Alliance). As an Alliance member, WormBase will continue to interact with the C. elegans community, develop new features as needed, and curate key information from the literature and large-scale projects.
Affiliation(s)
- Paul W Sternberg
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Kimberly Van Auken
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Qinghua Wang
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Adam Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Magdalena Zarowiecki
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Valerio Arnaboldi
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Stephanie Brown
- School of Infection and Immunity, University of Glasgow, Glasgow G12 8TA, UK
- Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Jaehyoung Cho
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Christian A Grove
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Todd Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Kevin Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Ranjana Kishore
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Raymond Lee
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Ian Longden
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Hans-Michael Müller
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Daniela Raciti
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Tim Schedl
- Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Gary Schindelman
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
2
Aleksander SA, Anagnostopoulos AV, Antonazzo G, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Cherry JM, Cho J, Crosby MA, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, dos Santos G, Dyer S, Ebert D, Engel SR, Fashena D, Fisher M, Foley S, Gibson AC, Gollapally VR, Gramates LS, Grove CA, Hale P, Harris T, Hayman GT, Hu Y, James-Zorn C, Karimi K, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, Markarian N, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nash RS, Nuin P, Paddock H, Pells T, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schindelman G, Shaw DR, Sherlock G, Shrivatsav A, Singer A, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Tomczuk M, Trovisco V, Tutaj MA, Urbano JM, Van Auken K, Van Slyke CE, Vize PD, Wang Q, Weng S, Westerfield M, Wilming LG, Wong ED, Wright A, Yook K, Zhou P, Zorn A, Zytkovicz M. Updates to the Alliance of Genome Resources central infrastructure. Genetics 2024; 227:iyae049. [PMID: 38552170 DOI: 10.1093/genetics/iyae049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/28/2024] [Accepted: 02/29/2024] [Indexed: 04/09/2024]
Abstract
The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organisms currently represented in the Alliance are budding yeast, Caenorhabditis elegans, Drosophila, zebrafish, frog, laboratory mouse, laboratory rat, and the Gene Ontology Consortium. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and application programming interfaces (APIs). Here, we focus on developments over the last 2 years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse software. We describe our progress toward a central persistent database to support curation, the data modeling that underpins harmonization, and progress toward a state-of-the-art literature curation system with integrated artificial intelligence and machine learning (AI/ML).
Affiliation(s)
- Giulia Antonazzo
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Valerio Arnaboldi
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Helen Attrill
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Susan M Bello
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Olin Blodgett
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Carol J Bult
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Brian R Calvi
- Department of Biology, Indiana University, Bloomington, IN 47408, USA
- Seth Carbon
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA
- Juancarlos Chan
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- J Michael Cherry
- Department of Genetics, Stanford University, Stanford, CA 94305
- Jaehyoung Cho
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Madeline A Crosby
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Jeffrey L De Pons
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mary E Dolan
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Gilberto dos Santos
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dustin Ebert
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Stacia R Engel
- Department of Genetics, Stanford University, Stanford, CA 94305
- David Fashena
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403
- Malcolm Fisher
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Saoirse Foley
- Department of Biological Sciences, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15203
- Adam C Gibson
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Varun R Gollapally
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- L Sian Gramates
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Christian A Grove
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Hale
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Todd Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- G Thomas Hayman
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Yanhui Hu
- Department of Genetics, Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Christina James-Zorn
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Kamran Karimi
- Department of Biological Sciences, University of Calgary, 507 Campus Dr NW, Calgary, AB T2N 4V8, Canada
- Kalpana Karra
- Department of Genetics, Stanford University, Stanford, CA 94305
- Ranjana Kishore
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Anne E Kwitek
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Stanley J F Laulederkind
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Raymond Lee
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Ian Longden
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicholas Markarian
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Steven J Marygold
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Beverley Matthews
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Monica S McAndrews
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Gillian Millburn
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Stuart Miyasato
- Department of Genetics, Stanford University, Stanford, CA 94305
- Howie Motenko
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Sierra Moxon
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA
- Hans-Michael Muller
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Christopher J Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA
- Anushya Muruganujan
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Tremayne Mushayahama
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Robert S Nash
- Department of Genetics, Stanford University, Stanford, CA 94305
- Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Holly Paddock
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403
- Troy Pells
- Department of Biological Sciences, University of Calgary, 507 Campus Dr NW, Calgary, AB T2N 4V8, Canada
- Norbert Perrimon
- Department of Genetics, Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Christian Pich
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403
- Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniela Raciti
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Susan Russo Gelbart
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Leyla Ruzicka
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403
- Gary Schindelman
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- David R Shaw
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Gavin Sherlock
- Department of Genetics, Stanford University, Stanford, CA 94305
- Ajay Shrivatsav
- Department of Genetics, Stanford University, Stanford, CA 94305
- Amy Singer
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403
- Constance M Smith
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Cynthia L Smith
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Jennifer R Smith
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Paul W Sternberg
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Christopher J Tabone
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Ketaki Thorat
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Jyothi Thota
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Monika Tomczuk
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Vitor Trovisco
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Marek A Tutaj
- Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Jose-Maria Urbano
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Kimberly Van Auken
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Ceri E Van Slyke
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403
- Peter D Vize
- Department of Biological Sciences, University of Calgary, 507 Campus Dr NW, Calgary, AB T2N 4V8, Canada
- Qinghua Wang
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Shuai Weng
- Department of Genetics, Stanford University, Stanford, CA 94305
- Laurens G Wilming
- The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
- Edith D Wong
- Department of Genetics, Stanford University, Stanford, CA 94305
- Adam Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Pinglei Zhou
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Aaron Zorn
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Mark Zytkovicz
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
3
Aleksander SA, Anagnostopoulos AV, Antonazzo G, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Cherry JM, Cho J, Crosby MA, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, dos Santos G, Dyer S, Ebert D, Engel SR, Fashena D, Fisher M, Foley S, Gibson AC, Gollapally VR, Gramates LS, Grove CA, Hale P, Harris T, Hayman GT, Hu Y, James-Zorn C, Karimi K, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, Markarian N, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nash RS, Nuin P, Paddock H, Pells T, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schindelman G, Shaw DR, Sherlock G, Shrivatsav A, Singer A, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Tomczuk M, Trovisco V, Tutaj MA, Urbano JM, Van Auken K, Van Slyke CE, Vize PD, Wang Q, Weng S, Westerfield M, Wilming LG, Wong ED, Wright A, Yook K, Zhou P, Zorn A, Zytkovicz M. Updates to the Alliance of Genome Resources Central Infrastructure Alliance of Genome Resources Consortium. bioRxiv 2023:2023.11.20.567935. [PMID: 38045425 PMCID: PMC10690154 DOI: 10.1101/2023.11.20.567935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organisms currently represented in the Alliance are budding yeast, C. elegans, Drosophila, zebrafish, frog, laboratory mouse, laboratory rat, and the Gene Ontology Consortium. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and APIs. Here we focus on developments over the last two years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse. We describe our progress towards a central persistent database to support curation, the data modeling that underpins harmonization, and progress towards a state-of-the-art literature curation system with integrated Artificial Intelligence and Machine Learning (AI/ML).
4
Agapite J, Albou LP, Aleksander SA, Alexander M, Anagnostopoulos AV, Antonazzo G, Argasinska J, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blake JA, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Cherry JM, Cho J, Christie KR, Crosby MA, Davis P, da Veiga Beltrame E, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, dos Santos G, Douglass E, Dunn B, Eagle A, Ebert D, Engel SR, Fashena D, Foley S, Frazer K, Gao S, Gibson AC, Gondwe F, Goodman J, Gramates LS, Grove CA, Hale P, Harris T, Hayman GT, Hill DP, Howe DG, Howe KL, Hu Y, Jha S, Kadin JA, Kaufman TC, Kalita P, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, MacPherson KA, Martin R, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nalabolu HS, Nash RS, Ng P, Nuin P, Paddock H, Paulini M, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schaper K, Schindelman G, Shimoyama M, Simison M, Shaw DR, Shrivatsav A, Singer A, Skrzypek M, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Toro S, Tomczuk M, Trovisco V, Tutaj MA, Tutaj M, Urbano JM, Van Auken K, Van Slyke CE, Wang Q, Wang SJ, Weng S, Westerfield M, Williams G, Wilming LG, Wong ED, Wright A, Yook K, Zarowiecki M, Zhou P, Zytkovicz M. Harmonizing model organism data in the Alliance of Genome Resources. Genetics 2022; 220:iyac022. [PMID: 35380658 PMCID: PMC8982023 DOI: 10.1093/genetics/iyac022] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Received: 12/12/2021] [Accepted: 01/26/2022] [Indexed: 02/06/2023]
Abstract
The Alliance of Genome Resources (the Alliance) is a combined effort of 7 knowledgebase projects: Saccharomyces Genome Database, WormBase, FlyBase, Mouse Genome Database, the Zebrafish Information Network, Rat Genome Database, and the Gene Ontology Resource. The Alliance seeks to provide several benefits: better service to the various communities served by these projects; a harmonized view of data for all biomedical researchers, bioinformaticians, clinicians, and students; and a more sustainable infrastructure. The Alliance has harmonized cross-organism data to provide useful comparative views of gene function, gene expression, and human disease relevance. The basis of the comparative views is shared calls of orthology relationships and the use of common ontologies. The key types of data are alleles and variants, gene function based on gene ontology annotations, phenotypes, association to human disease, gene expression, protein-protein and genetic interactions, and participation in pathways. The information is presented on uniform gene pages that allow facile summarization of information about each gene in each of the 7 organisms covered (budding yeast, roundworm Caenorhabditis elegans, fruit fly, house mouse, zebrafish, brown rat, and human). The harmonized knowledge is freely available on the alliancegenome.org portal, as downloadable files, and by APIs. We expect other existing and emerging knowledge bases to join in the effort to provide the union of useful data and features that each knowledge base currently provides.
5
Davis P, Zarowiecki M, Arnaboldi V, Becerra A, Cain S, Chan J, Chen WJ, Cho J, da Veiga Beltrame E, Diamantakis S, Gao S, Grigoriadis D, Grove CA, Harris TW, Kishore R, Le T, Lee RYN, Luypaert M, Müller HM, Nakamura C, Nuin P, Paulini M, Quinton-Tulloch M, Raciti D, Rodgers FH, Russell M, Schindelman G, Singh A, Stickland T, Van Auken K, Wang Q, Williams G, Wright AJ, Yook K, Berriman M, Howe KL, Schedl T, Stein L, Sternberg PW. WormBase in 2022-data, processes, and tools for analyzing Caenorhabditis elegans. Genetics 2022; 220:6521733. [PMID: 35134929 PMCID: PMC8982018 DOI: 10.1093/genetics/iyac003] [Citation(s) in RCA: 106] [Impact Index Per Article: 53.0] [Received: 10/14/2021] [Accepted: 12/17/2021] [Indexed: 02/06/2023]
Abstract
WormBase (www.wormbase.org) is the central repository for the genetics and genomics of the nematode Caenorhabditis elegans. We provide the research community with data and tools to facilitate the use of C. elegans and related nematodes as model organisms for studying human health, development, and many aspects of fundamental biology. Throughout our 22-year history, we have continued to evolve to reflect progress and innovation in the science and technologies involved in the study of C. elegans. We strive to incorporate new data types and richer data sets, and to provide integrated displays and services that avail the knowledge generated by the published nematode genetics literature. Here, we provide a broad overview of the current state of WormBase in terms of data type, curation workflows, analysis, and tools, including exciting new advances for analysis of single-cell data, text mining and visualization, and the new community collaboration forum. Concurrently, we continue the integration and harmonization of infrastructure, processes, and tools with the Alliance of Genome Resources, of which WormBase is a founding member.
Affiliation(s)
- Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Magdalena Zarowiecki
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Valerio Arnaboldi
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Jaehyoung Cho
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Eduardo da Veiga Beltrame
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sibyl Gao
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Dionysis Grigoriadis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Christian A Grove
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Todd W Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Ranjana Kishore
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Tuan Le
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Raymond Y N Lee
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Hans-Michael Müller
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Cecilia Nakamura
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Michael Paulini
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniela Raciti
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Faye H Rodgers
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Matthew Russell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Gary Schindelman
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Archana Singh
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Tim Stickland
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Kimberly Van Auken
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Qinghua Wang
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Gary Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adam J Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Matt Berriman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tim Schedl
- Department of Genetics, Washington University School of Medicine, St Louis, MO 63110, USA
- Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Paul W Sternberg
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
6
Dahlberg CL, Grove CA, Hulsey-Vincent H, Ismail S. Student Annotations of Published Data as a Collaboration between an Online Laboratory Course and the C. elegans Database, WormBase. J Microbiol Biol Educ 2021; 22:jmbe-22-21. [PMID: 33884078 PMCID: PMC8012049 DOI: 10.1128/jmbe.v22i1.2331] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2020] [Accepted: 11/08/2020] [Indexed: 06/12/2023]
Abstract
Course-based undergraduate research experiences (CUREs) provide the same benefits as individual, mentored faculty research while expanding the availability of research opportunities. One important aspect of CUREs is students' engagement in collaboration. The shift to online learning during the COVID-19 pandemic created an immediate need for meaningful, collaborative experiences in CUREs. We developed a partnership with the Caenorhabditis elegans (C. elegans) database, WormBase, in which students submitted annotations of published manuscripts to the website. Due to the stress on students during this time of crisis, qualitative data were collected in lieu of quantitative pre- and postanalyses. Most students reported on cognitive processes that represent mid-level Bloom's categories. By partnering with WormBase, students gained insight into the scientific community and contributed as community members. We describe possible modifications for future courses, potential expansion of the WormBase collaboration, and future directions for quantitative analysis.
Affiliation(s)
- Christian A. Grove
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125
- Samiya Ismail
- Biology Department, Western Washington University, Bellingham, WA 98225
7
Harris TW, Arnaboldi V, Cain S, Chan J, Chen WJ, Cho J, Davis P, Gao S, Grove CA, Kishore R, Lee RYN, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Rodgers FH, Russell M, Schindelman G, Auken KV, Wang Q, Williams G, Wright AJ, Yook K, Howe KL, Schedl T, Stein L, Sternberg PW. WormBase: a modern Model Organism Information Resource. Nucleic Acids Res 2020; 48:D762-D767. [PMID: 31642470 PMCID: PMC7145598 DOI: 10.1093/nar/gkz920] [Citation(s) in RCA: 120] [Impact Index Per Article: 30.0] [Received: 09/15/2019] [Revised: 10/02/2019] [Accepted: 10/07/2019] [Indexed: 01/16/2023] Open
Abstract
WormBase (https://wormbase.org/) is a mature Model Organism Information Resource supporting researchers using the nematode Caenorhabditis elegans as a model system for studies across a broad range of basic biological processes. Toward this mission, WormBase efforts are arranged in three primary facets: curation, user interface and architecture. In this update, we describe progress in each of these three areas. In particular, we discuss the status of literature curation and recently added data, detail new features of the web interface and options for users wishing to conduct data mining workflows, and discuss our efforts to build a robust and scalable architecture by leveraging commercial cloud offerings. We conclude with a description of WormBase's role as a founding member of the nascent Alliance of Genome Resources.
Affiliation(s)
- Todd W Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Valerio Arnaboldi
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Jaehyoung Cho
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sibyl Gao
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Christian A Grove
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Ranjana Kishore
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Raymond Y N Lee
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Hans-Michael Muller
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Cecilia Nakamura
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Michael Paulini
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniela Raciti
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Faye H Rodgers
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Matthew Russell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Gary Schindelman
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Kimberly V Auken
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Qinghua Wang
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Gary Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adam J Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
- Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tim Schedl
- Department of Genetics, Washington University School of Medicine, St Louis, MO 63110, USA
- Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Paul W Sternberg
- Division of Biology and Biological Engineering 156–29, California Institute of Technology, Pasadena, CA 91125, USA
8
De Masi F, Grove CA, Vedenko A, Alibés A, Gisselbrecht SS, Serrano L, Bulyk ML, Walhout AJM. Using a structural and logics systems approach to infer bHLH-DNA binding specificity determinants. Nucleic Acids Res 2011; 39:4553-63. [PMID: 21335608 PMCID: PMC3113581 DOI: 10.1093/nar/gkr070] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Indexed: 12/22/2022] Open
Abstract
Numerous efforts are underway to determine gene regulatory networks that describe physical relationships between transcription factors (TFs) and their target DNA sequences. Members of paralogous TF families typically recognize similar DNA sequences. Knowledge of the molecular determinants of protein–DNA recognition by paralogous TFs is of central importance for understanding how small differences in DNA specificities can dictate target gene selection. Previously, we determined the in vitro DNA binding specificities of 19 Caenorhabditis elegans basic helix-loop-helix (bHLH) dimers using protein binding microarrays. These TFs bind E-box (CANNTG) and E-box-like sequences. Here, we combine these data with logics, bHLH–DNA co-crystal structures and computational modeling to infer which bHLH monomer can interact with which CAN E-box half-site and we identify a critical residue in the protein that dictates this specificity. Validation experiments using mutant bHLH proteins provide support for our inferences. Our study provides insights into the mechanisms of DNA recognition by bHLH dimers as well as a blueprint for system-level studies of the DNA binding determinants of other TF families in different model organisms and humans.
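To make the E-box consensus (CANNTG) named in the abstract concrete, the following is a minimal sketch that scans a DNA string for matches. It assumes N stands for any of A, C, G, or T; the function name and the example sequence are hypothetical illustrations and are not taken from the study.

```python
import re

# E-box consensus CANNTG, with N assumed to be any of A, C, G, T.
# The lookahead lets overlapping sites be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(seq):
    """Return (0-based start, matched site) for every E-box found in seq."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq.upper())]

# Hypothetical example sequence, for illustration only.
print(find_eboxes("ttCACGTGaaCAGCTGcc"))
```

Each reported site contains two CAN half-sites, one read on each strand, which is the unit the abstract assigns to individual bHLH monomers.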
Affiliation(s)
- Federico De Masi
- Department of Medicine, Division of Genetics, Brigham & Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
9
Grove CA, De Masi F, Barrasa MI, Newburger DE, Alkema MJ, Bulyk ML, Walhout AJM. A multiparameter network reveals extensive divergence between C. elegans bHLH transcription factors. Cell 2009; 138:314-27. [PMID: 19632181 DOI: 10.1016/j.cell.2009.04.058] [Citation(s) in RCA: 196] [Impact Index Per Article: 13.1] [Received: 08/14/2008] [Revised: 03/06/2009] [Accepted: 04/23/2009] [Indexed: 01/20/2023]
Abstract
Differences in expression, protein interactions, and DNA binding of paralogous transcription factors ("TF parameters") are thought to be important determinants of regulatory and biological specificity. However, both the extent of TF divergence and the relative contribution of individual TF parameters remain undetermined. We comprehensively identify dimerization partners, spatiotemporal expression patterns, and DNA-binding specificities for the C. elegans bHLH family of TFs, and model these data into an integrated network. This network displays both specificity and promiscuity, as some bHLH proteins, DNA sequences, and tissues are highly connected, whereas others are not. By comparing all bHLH TFs, we find extensive divergence and that all three parameters contribute equally to bHLH divergence. Our approach provides a framework for examining divergence for other protein families in C. elegans and in other complex multicellular organisms, including humans. Cross-species comparisons of integrated networks may provide further insights into molecular features underlying protein family evolution. For a video summary of this article, see the PaperFlick file available with the online Supplemental Data.
Affiliation(s)
- Christian A Grove
- Program in Gene Function and Expression and Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
10
Abstract
Now that numerous high-quality complete genome sequences are available, many efforts are focusing on the "second genomic code", namely the code that determines how the precise temporal and spatial expression of each gene in the genome is achieved. In this regard, the elucidation of transcription regulatory networks that describe combined transcriptional circuits for an organism of interest has become valuable to our understanding of gene expression at a systems level. Such networks describe physical and regulatory interactions between transcription factors (TFs) and the target genes they regulate under different developmental, physiological, or pathological conditions. The mapping of high-quality transcription regulatory networks depends not only on the accuracy of the experimental or computational method chosen, but also relies on the quality of TF predictions. Moreover, the total repertoire of TFs is not only determined by the protein-coding capacity of the genome, but also by different protein properties, including dimerization, co-factor interactions and post-translational modifications. Here, we discuss the factors that influence TF functionality and, hence, the functionality of the networks in which they operate.
Affiliation(s)
- Christian A Grove
- Program in Gene Function and Expression, University of Massachusetts Medical School, Worcester, MA 01605, USA
11
Vermeirssen V, Deplancke B, Barrasa MI, Reece-Hoyes JS, Arda HE, Grove CA, Martinez NJ, Sequerra R, Doucette-Stamm L, Brent MR, Walhout AJM. Matrix and Steiner-triple-system smart pooling assays for high-performance transcription regulatory network mapping. Nat Methods 2007; 4:659-64. [PMID: 17589517 DOI: 10.1038/nmeth1063] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Received: 03/01/2007] [Accepted: 05/23/2007] [Indexed: 11/09/2022]
Abstract
Yeast one-hybrid (Y1H) assays provide a gene-centered method for the identification of interactions between gene promoters and regulatory transcription factors (TFs). To date, Y1H assays have involved library screens that are relatively expensive and laborious. We present two Y1H strategies that allow immediate prey identification: matrix assays that use an array of 755 individual Caenorhabditis elegans TFs, and smart-pool assays that use TF multiplexing. Both strategies simplify the Y1H pipeline and reduce the cost of protein-DNA interaction identification. We used a Steiner triple system (STS) to create smart pools of 4-25 TFs. Notably, we uniplexed a small number of highly connected TFs to allow efficient assay deconvolution. Both strategies outperform library screens in terms of coverage, confidence and throughput. These versatile strategies can be adapted both to TFs in other systems and, likely, to other biomolecules and assays as well.
Affiliation(s)
- Vanessa Vermeirssen
- Program in Gene Function and Expression and Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
12
Deplancke B, Mukhopadhyay A, Ao W, Elewa AM, Grove CA, Martinez NJ, Sequerra R, Doucette-Stamm L, Reece-Hoyes JS, Hope IA, Tissenbaum HA, Mango SE, Walhout AJM. A gene-centered C. elegans protein-DNA interaction network. Cell 2006; 125:1193-205. [PMID: 16777607 DOI: 10.1016/j.cell.2006.04.038] [Citation(s) in RCA: 192] [Impact Index Per Article: 10.7] [Received: 11/08/2005] [Revised: 02/27/2006] [Accepted: 04/12/2006] [Indexed: 11/16/2022]
Abstract
Transcription regulatory networks consist of physical and functional interactions between transcription factors (TFs) and their target genes. The systematic mapping of TF-target gene interactions has been pioneered in unicellular systems, using "TF-centered" methods (e.g., chromatin immunoprecipitation). However, metazoan systems are less amenable to such methods. Here, we used "gene-centered" high-throughput yeast one-hybrid (Y1H) assays to identify 283 interactions between 72 C. elegans digestive tract gene promoters and 117 proteins. The resulting protein-DNA interaction (PDI) network is highly connected and enriched for TFs that are expressed in the digestive tract. We provide functional annotations for approximately 10% of all worm TFs, many of which were previously uncharacterized, and find ten novel putative TFs, illustrating the power of a gene-centered approach. We provide additional in vivo evidence for multiple PDIs and illustrate how the PDI network provides insights into metazoan differential gene expression at a systems level.
Affiliation(s)
- Bart Deplancke
- Program in Gene Function and Expression and Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
13
Reece-Hoyes JS, Deplancke B, Shingles J, Grove CA, Hope IA, Walhout AJM. A compendium of Caenorhabditis elegans regulatory transcription factors: a resource for mapping transcription regulatory networks. Genome Biol 2005; 6:R110. [PMID: 16420670 PMCID: PMC1414109 DOI: 10.1186/gb-2005-6-13-r110] [Citation(s) in RCA: 143] [Impact Index Per Article: 7.5] [Received: 09/26/2005] [Revised: 11/07/2005] [Accepted: 11/28/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Transcription regulatory networks are composed of interactions between transcription factors and their target genes. Whereas unicellular networks have been studied extensively, metazoan transcription regulatory networks remain largely unexplored. Caenorhabditis elegans provides a powerful model to study such metazoan networks because its genome is completely sequenced and many functional genomic tools are available. While C. elegans gene predictions have undergone continuous refinement, this is not true for the annotation of functional transcription factors. The comprehensive identification of transcription factors is essential for the systematic mapping of transcription regulatory networks because it enables the creation of physical transcription factor resources that can be used in assays to map interactions between transcription factors and their target genes.
RESULTS: By computational searches and extensive manual curation, we have identified a compendium of 934 transcription factor genes (referred to as wTF2.0). We find that manual curation drastically reduces the number of both false positive and false negative transcription factor predictions. We discuss how transcription factor splice variants and dimer formation may affect the total number of functional transcription factors. In contrast to mouse transcription factor genes, we find that C. elegans transcription factor genes do not undergo significantly more splicing than other genes. This difference may contribute to differences in organism complexity. We identify candidate redundant worm transcription factor genes and orthologous worm and human transcription factor pairs. Finally, we discuss how wTF2.0 can be used together with physical transcription factor clone resources to facilitate the systematic mapping of C. elegans transcription regulatory networks.
CONCLUSION: wTF2.0 provides a starting point to decipher the transcription regulatory networks that control metazoan development and function.
Affiliation(s)
- John S Reece-Hoyes
- Institute of Integrative and Comparative Biology, Faculty of Biological Sciences, School of Biology, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, UK
- Bart Deplancke
- Program in Gene Function and Expression and Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, 364 Plantation Street, Lazare Research Building, Room 605, MA 01605, USA
- Jane Shingles
- Institute of Integrative and Comparative Biology, Faculty of Biological Sciences, School of Biology, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, UK
- Christian A Grove
- Program in Gene Function and Expression and Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, 364 Plantation Street, Lazare Research Building, Room 605, MA 01605, USA
- Ian A Hope
- Institute of Integrative and Comparative Biology, Faculty of Biological Sciences, School of Biology, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, UK
- Albertha JM Walhout
- Program in Gene Function and Expression and Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, 364 Plantation Street, Lazare Research Building, Room 605, MA 01605, USA
14
Abstract
Insufficient quantitation limits using ion-trap gas-chromatography mass-spectrometry (GC-MS) prevented the assay of some samples during a preliminary screening of preclinical rat plasma samples (50 µl) containing novel, polar therapeutic agents. Few options were available for improving the lower limit of quantitation. The limited amount of sample available precluded the extraction of additional plasma. Liquid-liquid extraction recoveries were greater than 90% throughout the range of the standard curve (500-2000 ng ml⁻¹). Chromatography was optimized and multiple, equivalent sites for analyte fragmentation were precluded, using MS-MS to improve assay sensitivity. Quantitation limits were, however, decreased 10-fold by using a larger syringe to increase the injection volume from 5 to 50 µl, in combination with a universal programmable injector. These large injection volumes required changes in the injector events program and in column plumbing. Additionally, evaluation of injection liner packing material demonstrated a 2-fold improvement in sensitivity, using carbofrit, relative to silanized glass wool. Converting to inert ion-trap electrodes did not appear to affect the detection limit, perhaps due to over-riding peak broadening during gas chromatography. The changes described produced a 20-fold improvement in the lower limit of quantitation.
Affiliation(s)
- J R Kagel
- Department of Pharmacokinetics and Drug Metabolism, Parke-Davis Pharmaceutical Research Division of Warner-Lambert, Ann Arbor, MI 48105, USA
15
Abstract
Changes in renal function of twenty-two cats treated for hyperthyroidism using radioiodine were evaluated. Serum thyroxine (T4), serum creatinine, blood urea nitrogen (BUN) and urine specific gravity were measured before treatment and 6 and 30 days after treatment. Twenty-two cats had pretreatment and 21 cats had 6 day posttreatment measurement of glomerular filtration rate (GFR) using nuclear medicine imaging techniques. There were significant declines in serum T4 at 6 days following treatment, but the changes in GFR, serum creatinine and BUN were not significant. At 30 days following treatment, there were significant increases in BUN and serum creatinine and further significant declines in serum T4. Nine cats were in renal failure prior to treatment and 13 cats were in renal failure 30 days following treatment. Renal failure was defined as BUN greater than 30 mg/dl and/or serum creatinine greater than 1.8 mg/dl with concurrent urine specific gravity less than 1.035. These 13 cats included eight of 9 cats in renal failure prior to treatment and 5 cats not previously in renal failure. Follow up information beyond 30 days following treatment on 9 of these 13 cats indicated that all remained in renal failure. Based on receiver operating curve analysis of pretreatment glomerular filtration rate (GFR) in predicting posttreatment renal failure, a value of 2.25 ml/kg/min as a point of maximum sensitivity (100%) and specificity (78%) was derived. Fifteen of 22 cats had pretreatment GFR measurements of less than 2.25 ml/kg/min. These 15 cats included all 9 cats in renal failure and 5 cats with normal renal clinicopathologic values prior to treatment. At 30 days following treatment, 13 of these 15 cats were in renal failure. The 2 cats not in renal failure had persistently increased serum T4 values. Seven of 22 cats had pretreatment GFR measurements greater than 2.25 ml/kg/min. 
None of these 7 cats was in renal failure at 30 days following treatment, all cats having normal BUN, serum creatinine, and urine specific gravity values. It was concluded that significant declines in renal function occur after treatment of hyperthyroidism and this decline is clinically important in cats with renal disease. Pretreatment measurement of GFR is valuable in detecting subclinical renal disease and in predicting which cats may have clinically important declines in renal function following treatment.
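The renal failure definition and the reported sensitivity and specificity of the 2.25 ml/kg/min GFR cutoff can be checked with a short calculation. This is a minimal sketch built only from the counts stated in the abstract (13 of the 15 cats below the cutoff and none of the 7 cats above it were in renal failure at 30 days); the function and variable names are hypothetical.

```python
def meets_renal_failure_definition(bun_mg_dl, creatinine_mg_dl, urine_sg):
    """Definition from the abstract: BUN > 30 mg/dl and/or serum creatinine
    > 1.8 mg/dl, with concurrent urine specific gravity < 1.035."""
    azotemic = bun_mg_dl > 30 or creatinine_mg_dl > 1.8
    return azotemic and urine_sg < 1.035

# Counts taken from the abstract, pretreatment GFR cutoff of 2.25 ml/kg/min.
true_pos = 13   # GFR below cutoff, in renal failure at 30 days
false_pos = 2   # GFR below cutoff, not in renal failure
false_neg = 0   # GFR above cutoff, in renal failure
true_neg = 7    # GFR above cutoff, not in renal failure

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.0%}")
```

The result agrees with the 100% sensitivity and 78% specificity quoted in the abstract (7/9 rounds to 78%).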
Affiliation(s)
- W H Adams
- Department of Small Animal Clinical Sciences, University of Tennessee, College of Veterinary Medicine, Knoxville 37901-1071, USA
16
Grove CA, Judd G, Horn R. Examination of firing pin impressions by scanning electron microscopy. J Forensic Sci 1972; 17:645-58. [PMID: 4680765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
17
Abstract
Ion-thinned longitudinal sections of a third molar were observed by bright- and dark-field electron microscopy. In bright field, the hydroxyapatite crystallites in enamel appeared as long rods, but, when observed in dark field, the crystallites were rectangular, with a mean length of 321 Å.
18
Cater DB, Adair HM, Grove CA. Effects of vasomotor drugs and "mediators" of the inflammatory reaction upon the oxygen tension of tumours and tumour blood-flow. Br J Cancer 1966; 20:504-16. [PMID: 4288476 PMCID: PMC2007997 DOI: 10.1038/bjc.1966.62] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.4] [Indexed: 01/09/2023] Open