201
|
Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P, Jensen LJ, Mering C. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 2018. [DOI: 10.1093/nar/gky1131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Affiliation(s)
- Damian Szklarczyk
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Annika L Gable
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- David Lyon
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Alexander Junge
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Stefan Wyder
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Jaime Huerta-Cepas
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), 28223 Madrid, Spain
- Milan Simonovic
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Nadezhda T Doncheva
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
  - Center for non-coding RNA in Technology and Health, University of Copenhagen, 2200 Copenhagen N, Denmark
- John H Morris
  - Resource on Biocomputing, Visualization, and Informatics, University of California, San Francisco, CA 94158-2517, USA
- Peer Bork
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
  - Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69117 Heidelberg, Germany
  - Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
  - Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
- Lars J Jensen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Christian von Mering
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
|
211
|
Webber JT, Kaushik S, Bandyopadhyay S. Integration of Tumor Genomic Data with Cell Lines Using Multi-dimensional Network Modules Improves Cancer Pharmacogenomics. Cell Syst 2018; 7:526-536.e6. [PMID: 30414925 DOI: 10.1016/j.cels.2018.10.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 10/04/2017] [Revised: 08/01/2018] [Accepted: 10/04/2018] [Indexed: 02/08/2023]
Abstract
Leveraging insights from genomic studies of patient tumors is limited by the discordance between these tumors and the cell line models used for functional studies. We integrate omics datasets using functional networks to identify gene modules reflecting variation between tumors and show that the structure of these modules can be evaluated in cell lines to discover clinically relevant biomarkers of therapeutic responses. Applied to breast cancer, we identify 219 gene modules that capture recurrent alterations and subtype patients and quantitate various cell types within the tumor microenvironment. Comparison of modules between tumors and cell lines reveals that many modules composed primarily of gene expression and methylation are poorly preserved. In contrast, preserved modules are highly predictive of drug responses in a manner that is robust and clinically relevant. This work addresses a fundamental challenge in pharmacogenomics that can only be overcome by the joint analysis of patient and cell line data.
Affiliation(s)
- James T Webber
  - Department of Bioengineering and Therapeutic Sciences, Institute for Computational Health Sciences, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA
- Swati Kaushik
  - Department of Bioengineering and Therapeutic Sciences, Institute for Computational Health Sciences, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA
- Sourav Bandyopadhyay
  - Department of Bioengineering and Therapeutic Sciences, Institute for Computational Health Sciences, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA
|
212
|
Lan W, Wang J, Li M, Liu J, Wu FX, Pan Y. Predicting MicroRNA-Disease Associations Based on Improved MicroRNA and Disease Similarities. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1774-1782. [PMID: 27392365 DOI: 10.1109/tcbb.2016.2586190] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Indexed: 06/06/2023]
Abstract
MicroRNAs (miRNAs) are a type of non-coding RNA about 22 nucleotides (nt) in length. Increasing evidence has shown that miRNAs play critical roles in many human diseases. Identifying human disease-related miRNAs helps to explore the underlying pathogenesis of diseases. More and more experimentally validated associations between miRNAs and diseases have been reported in recent studies, which provide useful information for new miRNA-disease association discovery. In this study, we propose a computational framework, KBMF-MDI, to predict the associations between miRNAs and diseases based on their similarities. The sequence and functional information of miRNAs is used to measure similarity among miRNAs, while the semantic and functional information of diseases is used to measure similarity among diseases. In addition, the kernelized Bayesian matrix factorization method is employed to infer potential miRNA-disease associations by integrating these data sources. We applied this method to 6,084 known miRNA-disease associations and used 5-fold cross validation to evaluate the performance. The experimental results demonstrate that our method can effectively predict unknown miRNA-disease associations.
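The 5-fold cross validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `k_fold_splits`, the seed, and the placeholder association pairs are all assumptions made for illustration, and the abstract does not say how the folds were actually drawn.

```python
import random


def k_fold_splits(known_pairs, k=5, seed=0):
    """Yield (train, test) lists for k-fold cross validation.

    known_pairs: any iterable of known associations (hypothetical format).
    Each association lands in exactly one test fold.
    """
    shuffled = list(known_pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatable folds
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
        yield train, test


# Hypothetical placeholder pairs, not real miRNA-disease data.
pairs = [(f"miRNA_{i}", f"disease_{i % 7}") for i in range(20)]
for train, test in k_fold_splits(pairs):
    print(len(train), len(test))
```

In each round, a predictor would be fitted on `train` and scored on how well it recovers the held-out `test` associations.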
|
213
|
Liu J, Li M, Luo XJ, Su B. Systems-level analysis of risk genes reveals the modular nature of schizophrenia. Schizophr Res 2018; 201:261-269. [PMID: 29789256 DOI: 10.1016/j.schres.2018.05.015] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/03/2018] [Revised: 05/10/2018] [Accepted: 05/12/2018] [Indexed: 12/31/2022]
Abstract
Schizophrenia (SCZ) is a complex mental disorder with high heritability. Genetic studies (especially recent genome-wide association studies) have identified many risk genes for schizophrenia. However, the physical interactions among the proteins encoded by schizophrenia risk genes remain elusive and it is not known whether the identified risk genes converge on common molecular networks or pathways. Here we systematically investigated the network characteristics of schizophrenia risk genes using the high-confidence protein-protein interactions (PPI) from the human interactome. We found that schizophrenia risk genes encode a densely interconnected PPI network (P = 4.15 × 10⁻³¹). Compared with the background genes, the schizophrenia risk genes in the interactome have significantly higher degree (P = 5.39 × 10⁻¹¹), closeness centrality (P = 7.56 × 10⁻¹¹), betweenness centrality (P = 1.29 × 10⁻¹¹), clustering coefficient (P = 2.22 × 10⁻²), and shorter average shortest path length (P = 7.56 × 10⁻¹¹). Based on the densely interconnected PPI network, we identified 48 hub genes and 4 modules formed by highly interconnected schizophrenia genes. We showed that the proteins encoded by schizophrenia hub genes have significantly more direct physical interactions. Gene ontology (GO) analysis revealed that cell adhesion, cell cycle, immune system response, and GABR-receptor complex categories were enriched in the modules formed by highly interconnected schizophrenia risk genes. Our study reveals that schizophrenia risk genes encode a densely interconnected molecular network and demonstrates the modular nature of schizophrenia.
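Three of the node-level measures named in the abstract (degree, closeness centrality, clustering coefficient) can be sketched on a toy graph. This uses the standard textbook definitions, which are assumed rather than taken from the paper; the node labels are placeholders, not genes, and the sketch does not reproduce the authors' analysis or any P value.

```python
from collections import deque


def degree(adj, node):
    """Number of direct interaction partners of a node."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = list(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in adj[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


def closeness(adj, node):
    """Reachable-node count divided by the sum of shortest-path lengths."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search over an unweighted graph
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Toy undirected graph as an adjacency map (placeholder labels).
toy = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
for n in sorted(toy):
    print(n, degree(toy, n), clustering_coefficient(toy, n), closeness(toy, n))
```

Comparing the distributions of such values between a gene set and background genes is the kind of test the abstract reports, though the statistical procedure itself is not described there.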
Affiliation(s)
- Jiewei Liu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Ming Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Kunming, Yunnan, China
| | - Xiong-Jian Luo
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Kunming, Yunnan, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China.
| | - Bing Su
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China.
|
214
|
Kim J, Shin M, Kim J, Park C, Lee S, Woo J, Kim H, Seo D, Yu S, Park S. CASS: A distributed network clustering algorithm based on structure similarity for large-scale network. PLoS One 2018; 13:e0203670. [PMID: 30303961 PMCID: PMC6179193 DOI: 10.1371/journal.pone.0203670] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/22/2017] [Accepted: 08/24/2018] [Indexed: 12/21/2022] Open
Abstract
As the size of networks increases, it is becoming important to analyze large-scale network data. A network clustering algorithm is useful for analysis of network data. Conventional network clustering algorithms in a single machine environment rather than a parallel machine environment are actively being researched. However, these algorithms cannot analyze large-scale network data because of memory size issues. As a solution, we propose a network clustering algorithm for large-scale network data analysis using Apache Spark by changing the paradigm of the conventional clustering algorithm to improve its efficiency in the Apache Spark environment. We also apply optimization approaches such as Bloom filter and shuffle selection to reduce memory usage and execution time. By evaluating our proposed algorithm based on an average normalized cut, we confirmed that the algorithm can analyze diverse large-scale network datasets such as biological, co-authorship, internet topology and social networks. Experimental results show that the proposed algorithm can develop more accurate clusters than comparative algorithms with less memory usage. Furthermore, we confirm the proposed optimization approaches and the scalability of the proposed algorithm. In addition, we validate that clusters found from the proposed algorithm can represent biologically meaningful functions.
Affiliation(s)
- Jungrim Kim
- Department of Computer Science, Yonsei University, Seoul, South Korea
| | - Mincheol Shin
- Department of Computer Science, Yonsei University, Seoul, South Korea
| | - Jeongwoo Kim
- Department of Computer Science, Yonsei University, Seoul, South Korea
| | - Chihyun Park
- Department of Computer Science, Yonsei University, Seoul, South Korea
| | - Sujin Lee
- Department of Computer Science, Yonsei University, Seoul, South Korea
| | - Jaemin Woo
- Department of Computer Science, Yonsei University, Seoul, South Korea
| | - Hyerim Kim
- Department of Computer Science, Yonsei University, Seoul, South Korea
| | - Dongmin Seo
- Korea Institute of Science and Technology Information, Daejeon, South Korea
| | - Seokjong Yu
- Korea Institute of Science and Technology Information, Daejeon, South Korea
| | - Sanghyun Park
- Department of Computer Science, Yonsei University, Seoul, South Korea
|
215
|
Skinnider MA, Stacey RG, Foster LJ. Genomic data integration systematically biases interactome mapping. PLoS Comput Biol 2018; 14:e1006474. [PMID: 30332399 PMCID: PMC6192561 DOI: 10.1371/journal.pcbi.1006474] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 01/27/2018] [Accepted: 08/30/2018] [Indexed: 12/15/2022] Open
Abstract
Elucidating the complete network of protein-protein interactions, or interactome, is a fundamental goal of the post-genomic era, yet existing interactome maps are far from complete. To increase the throughput and resolution of interactome mapping, methods for protein-protein interaction discovery by co-migration have been introduced. However, accurate identification of interacting protein pairs within the resulting large-scale proteomic datasets is challenging. Consequently, most computational pipelines for co-migration data analysis incorporate external genomic datasets to distinguish interacting from non-interacting protein pairs. The effect of this procedure on interactome mapping is poorly understood. Here, we conduct a rigorous analysis of genomic data integration for interactome recovery across a large number of co-migration datasets, spanning diverse experimental and computational methods. We find that genomic data integration leads to an increase in the functional coherence of the resulting interactome maps, but this comes at the expense of a decrease in power to discover novel interactions. Importantly, putative novel interactions predicted by genomic data integration are no more likely to later be experimentally discovered than those predicted from co-migration data alone. Our results reveal a widespread and unappreciated limitation in a methodology that has been widely used to map the interactome of humans and model organisms.
Affiliation(s)
| | - R. Greg Stacey
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
| | - Leonard J. Foster
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
- Department of Biochemistry, University of British Columbia, Vancouver, Canada
|
216
|
Choi J, Oh I, Seo S, Ahn J. G2Vec: Distributed gene representations for identification of cancer prognostic genes. Sci Rep 2018; 8:13729. [PMID: 30213980 PMCID: PMC6137174 DOI: 10.1038/s41598-018-32180-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 05/11/2018] [Accepted: 09/03/2018] [Indexed: 12/27/2022] Open
Abstract
Identification of cancer prognostic genes is important in that it can lead to accurate outcome prediction and better therapeutic trials for cancer patients. Many computational approaches have been proposed to achieve this goal; however, there is room for improvement. Recent developments in deep learning techniques can aid in the identification of better prognostic genes and more accurate outcome prediction, but one of the main problems in the adoption of deep learning for this purpose is that data from cancer patients have too many dimensions, while the number of samples is relatively small. In this study, we propose a novel network-based deep learning method to identify prognostic gene signatures via distributed gene representations generated by G2Vec, which is a modified Word2Vec model originally used for natural language processing. We applied the proposed method to five cancer types including liver cancer and showed that G2Vec outperformed extant feature selection methods, especially for a small number of samples. Moreover, biomarkers identified by G2Vec were useful for finding significant prognostic gene modules associated with hepatocellular carcinoma.
Affiliation(s)
- Jonghwan Choi
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea
| | - Ilhwan Oh
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea
| | - Sangmin Seo
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea
| | - Jaegyoon Ahn
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea.
|
217
|
Hou Y, Gao B, Li G, Su Z. MaxMIF: A New Method for Identifying Cancer Driver Genes through Effective Data Integration. Adv Sci (Weinh) 2018; 5:1800640. [PMID: 30250803 PMCID: PMC6145398 DOI: 10.1002/advs.201800640] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 04/26/2018] [Revised: 06/14/2018] [Indexed: 05/05/2023]
Abstract
Identification of a few cancer driver mutation genes from a much larger number of passenger mutation genes in cancer samples remains a highly challenging task. Here, a novel method for distinguishing the driver genes from the passenger genes by effective integration of somatic mutation data and molecular interaction data using a maximal mutational impact function (MaxMIF) is presented. When evaluated on six somatic mutation datasets of Pan-Cancer and 19 datasets of different cancer types from TCGA, MaxMIF almost always significantly outperforms all the existing state-of-the-art methods in terms of predictive accuracy, sensitivity, and specificity. It recovers about 30% more known cancer genes in 500 top-ranked candidate genes than the best among the other tools evaluated. MaxMIF is also highly robust to data perturbation. Intriguingly, MaxMIF is able to identify potential cancer driver genes, with strong experimental data support. Therefore, MaxMIF can be very useful for identifying or prioritizing cancer driver genes in the increasing number of available cancer genomic data.
Affiliation(s)
- Yingnan Hou
- School of Mathematics, Shandong University, Jinan 250100, P. R. China
- State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
| | - Bo Gao
- School of Mathematics, Shandong University, Jinan 250100, P. R. China
- State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
| | - Guojun Li
- School of Mathematics, Shandong University, Jinan 250100, P. R. China
- State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
| | - Zhengchang Su
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
|
218
|
Leveraging multiple gene networks to prioritize GWAS candidate genes via network representation learning. Methods 2018; 145:41-50. [DOI: 10.1016/j.ymeth.2018.06.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/30/2018] [Revised: 04/10/2018] [Accepted: 06/01/2018] [Indexed: 12/20/2022] Open
|
219
|
Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles. PLoS One 2018; 13:e0201056. [PMID: 30048494 PMCID: PMC6062065 DOI: 10.1371/journal.pone.0201056] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 11/29/2017] [Accepted: 07/06/2018] [Indexed: 02/02/2023] Open
Abstract
The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs of genes must be obtained. However, when gene expression data are heterogeneous with high levels of noise for samples assigned to the same condition, it is difficult to accurately determine whether a gene pair represents a significant gene-gene interaction (GGI). In order to solve this problem, we proposed a random forest-based method to classify significant GGIs from gene expression data. To train the model, we defined novel feature sets and utilised various high-confidence interactome datasets to deduce the correct answer set from known disease-specific genes. Using Alzheimer's disease data, the proposed method showed remarkable accuracy, and the GGIs established in the analysis can be used to build a meaningful genetic network that can explain the mechanisms underlying Alzheimer's disease.
|
220
|
Farooqui A, Tazyeen S, Ahmed MM, Alam A, Ali S, Malik MZ, Ali S, Ishrat R. Assessment of the key regulatory genes and their Interologs for Turner Syndrome employing network approach. Sci Rep 2018; 8:10091. [PMID: 29973620 PMCID: PMC6031616 DOI: 10.1038/s41598-018-28375-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 03/13/2018] [Accepted: 06/15/2018] [Indexed: 12/13/2022] Open
Abstract
Turner Syndrome (TS) is a condition where several genes are affected but the molecular mechanism remains unknown. Identifying the genes that regulate the TS network is one of the main challenges in understanding its aetiology. Here, we studied the regulatory network from manually curated genes reported in the literature and identified essential proteins involved in TS. The power-law distribution analysis showed that TS network carries scale-free hierarchical fractal attributes. This organization of the network maintained the self-ruled constitution of nodes at various levels without having centrality-lethality control systems. Out of twenty-seven genes culminating into leading hubs in the network, we identified two key regulators (KRs) i.e. KDM6A and BDNF. These KRs serve as the backbone for all the network activities. Removal of KRs does not cause its breakdown, rather a change in the topological properties was observed. Since essential proteins are evolutionarily conserved, the orthologs of selected interacting proteins in C. elegans, cat and macaque monkey (lower to higher level organisms) were identified. We deciphered three important interologs i.e. KDM6A-WDR5, KDM6A-ASH2L and WDR5-ASH2L that form a triangular motif. In conclusion, these KRs and identified interologs are expected to regulate the TS network signifying their biological importance.
Affiliation(s)
- Anam Farooqui
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Safia Tazyeen
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Mohd Murshad Ahmed
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Aftab Alam
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Shahnawaz Ali
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Md Zubbair Malik
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Sher Ali
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Romana Ishrat
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India.
|
221
|
Sharma A, Halu A, Decano JL, Padi M, Liu YY, Prasad RB, Fadista J, Santolini M, Menche J, Weiss ST, Vidal M, Silverman EK, Aikawa M, Barabási AL, Groop L, Loscalzo J. Controllability in an islet specific regulatory network identifies the transcriptional factor NFATC4, which regulates Type 2 Diabetes associated genes. NPJ Syst Biol Appl 2018; 4:25. [PMID: 29977601 PMCID: PMC6028434 DOI: 10.1038/s41540-018-0057-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/18/2017] [Revised: 04/09/2018] [Accepted: 05/04/2018] [Indexed: 01/14/2023] Open
Abstract
Probing the dynamic control features of biological networks represents a new frontier in capturing the dysregulated pathways in complex diseases. Here, using patient samples obtained from a pancreatic islet transplantation program, we constructed a tissue-specific gene regulatory network and used the control centrality (Cc) concept to identify the high control centrality (HiCc) pathways, which might serve as key pathobiological pathways for Type 2 Diabetes (T2D). We found that HiCc pathway genes were significantly enriched with modest GWAS p-values in the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) study. We identified variants regulating gene expression (expression quantitative trait loci, eQTL) of HiCc pathway genes in islet samples. These eQTL genes showed higher levels of differential expression compared to non-eQTL genes in low, medium, and high glucose concentrations in rat islets. Among genes with highly significant eQTL evidence, NFATC4 belonged to four HiCc pathways. We asked if the expressions of T2D-associated candidate genes from GWAS and literature are regulated by Nfatc4 in rat islets. Extensive in vitro silencing of Nfatc4 in rat islet cells displayed reduced expression of 16, and increased expression of four putative downstream T2D genes. Overall, our approach uncovers the mechanistic connection of NFATC4 with downstream targets including a previously unknown one, TCF7L2, and establishes the HiCc pathways' relationship to T2D.
Affiliation(s)
- Amitabh Sharma
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA; Center for Complex Network Research and Department of Physics, Northeastern University, Boston, MA 02115 USA; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215 USA; Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02215 USA
| | - Arda Halu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA; Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02215 USA
| | - Julius L Decano
- Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02215 USA
| | - Megha Padi
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ 85721 USA
| | - Yang-Yu Liu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
| | - Rashmi B Prasad
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes & Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden
| | - Joao Fadista
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes & Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden
| | - Marc Santolini
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA; Center for Complex Network Research and Department of Physics, Northeastern University, Boston, MA 02115 USA
| | - Jörg Menche
- Center for Complex Network Research and Department of Physics, Northeastern University, Boston, MA 02115 USA; CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, 1090 Austria
| | - Scott T Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215 USA; Department of Genetics, Harvard Medical School, Boston, MA 02115 USA
| | - Edwin K Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
| | - Masanori Aikawa
- Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02215 USA
| | - Albert-László Barabási
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA; Center for Complex Network Research and Department of Physics, Northeastern University, Boston, MA 02115 USA; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215 USA; Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
| | - Leif Groop
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes & Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden; Department of Clinical Sciences, Islet cell physiology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden
| | - Joseph Loscalzo
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
|
222
|
Schäfer M, Klein HU, Schwender H. Integrative analysis of multiple genomic variables using a hierarchical Bayesian model. Bioinformatics 2018; 33:3220-3227. [PMID: 28582573 DOI: 10.1093/bioinformatics/btx356] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/05/2016] [Accepted: 05/31/2017] [Indexed: 12/13/2022] Open
Abstract
Motivation: Genes showing congruent differences in several genomic variables between two biological conditions are crucial to unravel causalities behind phenotypes of interest. Detecting such genes is important in biomedical research, e.g. when identifying genes responsible for cancer development. Small sample sizes common in next-generation sequencing studies are a key challenge, and there are still only very few statistical methods to analyze more than two genomic variables in an integrative, model-based way. Here, we present a novel bioinformatics approach to detect congruent differences between two biological conditions in a larger number of different measurements such as various epigenetic marks or mRNA transcript levels. Results: We propose a coefficient quantifying the degree to which genes present consistent alterations in multiple (more than two) genomic variables when comparing samples presenting a condition of interest (e.g. cancer) to a reference group. A hierarchical Bayesian model is employed to assess uncertainty on a gene level, incorporating information on functional relationships between genes. We demonstrate the approach on different data sets containing RNA-seq gene transcription and up to four ChIP-seq histone modification measurements. Both the coefficient-based ranking and the inference based on the model lead to a plausible prioritizing of candidate genes when analyzing multiple genomic variables. Availability and implementation: BUGS code in the Supplement. Contact: m.schaefer@uni-duesseldorf.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Schäfer
- Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany
| | - Hans-Ulrich Klein
- Program in Translational Neuropsychiatric Genomics, Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
| | - Holger Schwender
- Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany
|
223
|
Requena D, Maffucci P, Bigio B, Shang L, Abhyankar A, Boisson B, Stenson PD, Cooper DN, Cunningham-Rundles C, Casanova JL, Abel L, Itan Y. CDG: An Online Server for Detecting Biologically Closest Disease-Causing Genes and its Application to Primary Immunodeficiency. Front Immunol 2018; 9:1340. [PMID: 29997612 PMCID: PMC6030251 DOI: 10.3389/fimmu.2018.01340] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 02/16/2018] [Accepted: 05/29/2018] [Indexed: 11/17/2022] Open
Abstract
High-throughput genomic technologies yield about 20,000 variants in the protein-coding exome of each individual. A commonly used approach to select candidate disease-causing variants is to test whether the associated gene has been previously reported to be disease-causing. In the absence of known disease-causing genes, it can be challenging to associate candidate genes with specific genetic diseases. To facilitate the discovery of novel gene-disease associations, we determined the putative biologically closest known genes and their associated diseases for 13,005 human genes not currently reported to be disease-associated. We used these data to construct the closest disease-causing genes (CDG) server, which can be used to infer the closest genes with an associated disease for a user-defined list of genes or diseases. We demonstrate the utility of the CDG server in five immunodeficiency patient exomes across different diseases and modes of inheritance, where CDG dramatically reduced the number of candidate genes to be evaluated. This resource will be a considerable asset for ascertaining the potential relevance of genetic variants found in patient exomes to specific diseases of interest. The CDG database and online server are freely available to non-commercial users at: http://lab.rockefeller.edu/casanova/CDG.
Affiliation(s)
- David Requena
- St. Giles Laboratory of Human Genetics of Infectious Diseases (Rockefeller Branch), The Rockefeller University, New York, NY, United States
| | - Patrick Maffucci
- St. Giles Laboratory of Human Genetics of Infectious Diseases (Rockefeller Branch), The Rockefeller University, New York, NY, United States; Graduate School, Icahn School of Medicine at Mount Sinai, New York, NY, United States; Department of Medicine, Division of Clinical Immunology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Benedetta Bigio
- St. Giles Laboratory of Human Genetics of Infectious Diseases (Rockefeller Branch), The Rockefeller University, New York, NY, United States
| | - Lei Shang
- St. Giles Laboratory of Human Genetics of Infectious Diseases (Rockefeller Branch), The Rockefeller University, New York, NY, United States
| | | | - Bertrand Boisson
- St. Giles Laboratory of Human Genetics of Infectious Diseases (Rockefeller Branch), The Rockefeller University, New York, NY, United States; Laboratory of Human Genetics of Infectious Diseases (Necker Branch), INSERM U1163, Paris, France; Paris Descartes University, Imagine Institute, Paris, France
| | - Peter D Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
| | - David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
| | - Charlotte Cunningham-Rundles
- Graduate School, Icahn School of Medicine at Mount Sinai, New York, NY, United States; Department of Medicine, Division of Clinical Immunology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Jean-Laurent Casanova
- St. Giles Laboratory of Human Genetics of Infectious Diseases (Rockefeller Branch), The Rockefeller University, New York, NY, United States; Laboratory of Human Genetics of Infectious Diseases (Necker Branch), INSERM U1163, Paris, France; Paris Descartes University, Imagine Institute, Paris, France; Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom; Howard Hughes Medical Institute, New York, NY, United States; Pediatric Immunology-Hematology Unit, Necker Hospital for Sick Children, Paris, France
| | - Laurent Abel
- St. Giles Laboratory of Human Genetics of Infectious Diseases (Rockefeller Branch), The Rockefeller University, New York, NY, United States; Laboratory of Human Genetics of Infectious Diseases (Necker Branch), INSERM U1163, Paris, France; Paris Descartes University, Imagine Institute, Paris, France
| | - Yuval Itan
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States; Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, United States
|
224
|
Structure Optimization for Large Gene Networks Based on Greedy Strategy. Comput Math Methods Med 2018; 2018:9674108. [PMID: 30013615 PMCID: PMC6022335 DOI: 10.1155/2018/9674108] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/14/2018] [Revised: 04/23/2018] [Accepted: 05/11/2018] [Indexed: 01/04/2023]
Abstract
In the last few years, gene networks have become one of the most important tools to model biological processes. Among other utilities, these networks visually show biological relationships between genes. However, due to the large amount of genetic data currently generated, their size has grown to the point of being unmanageable. To solve this problem, it is possible to use computational approaches, such as heuristics-based methods, to analyze and optimize a gene network's structure by pruning irrelevant relationships. In this paper we present a new method, called GeSOp, to optimize large gene network structures. The method is able to considerably prune the irrelevant relationships in the input network. To do so, it uses a greedy heuristic to obtain the most relevant subnetwork. The performance of our method was tested by means of two experiments on gene networks obtained from different organisms. The first experiment shows how GeSOp is able not only to carry out a significant reduction in the size of the network, but also to maintain the biological information ratio. In the second experiment, the ability to improve the biological indicators of the network is checked. Hence, the results presented show that GeSOp is a reliable method to optimize and improve the structure of large gene networks.
|
225
|
Sun H, Cai X, Zhou H, Li X, Du Z, Zou H, Wu J, Xie L, Cheng Y, Xie W, Lu X, Xu L, Chen L, Li E, Wu B. The protein-protein interaction network and clinical significance of heat-shock proteins in esophageal squamous cell carcinoma. Amino Acids 2018; 50:685-697. [PMID: 29700654 DOI: 10.1007/s00726-018-2569-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/01/2017] [Accepted: 04/09/2018] [Indexed: 02/06/2023]
Abstract
Heat-shock proteins (HSPs), one of the evolutionarily conserved protein families, are widely found in various organisms and perform important physiological functions. Nevertheless, HSPs have not been systematically analyzed in esophageal squamous cell carcinoma (ESCC). In this study, we applied the protein-protein interaction (PPI) network methodology to explore the characteristics of HSPs, and integrate their expression in ESCC. First, differentially expressed HSPs in ESCC were identified from our previous RNA-seq data. By constructing a specific PPI network, we found that differentially expressed HSPs interacted with hundreds of neighboring proteins. Subcellular localization analyses demonstrated that HSPs and their interacting proteins were distributed in multiple layers, from membrane to nucleus. Functional enrichment annotation analyses revealed known and potential functions for HSPs. KEGG pathway analyses identified four significantly enriched pathways. Moreover, three HSPs (DNAJC5B, HSPA1B, and HSPH1) could serve as promising targets for prognostic prediction in ESCC, suggesting these HSPs might play a significant role in the development of ESCC. These multiple bioinformatics analyses have provided a comprehensive view of the roles of heat-shock proteins in esophageal squamous cell carcinoma.
Affiliation(s)
- Hong Sun
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China
| | - Xinyi Cai
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China
| | - Haofeng Zhou
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China
| | - Xiaoqi Li
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China
| | - Zepeng Du
- Department of Pathology, Shantou Central Hospital, Affiliated Shantou Hospital of Sun Yat-sen University, Shantou, 515041, China
| | - Haiying Zou
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China
| | - Jianyi Wu
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China
| | - Lei Xie
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China
| | - Yinwei Cheng
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Institute of Oncologic Pathology, Shantou University Medical College, Shantou, 515041, China
| | - Wenming Xie
- Network and Information Center, Shantou University Medical College, Shantou, 515041, China
| | - Xiaomei Lu
- Tumor Hospital Affiliated to Xinjiang Medical University, Ürümqi, 830054, Xinjiang Uygur Autonomous Region, China
| | - Liyan Xu
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China
- Department of Pathology, Shantou Central Hospital, Affiliated Shantou Hospital of Sun Yat-sen University, Shantou, 515041, China
| | - Longqi Chen
- Department of Thoracic Surgery, West China Hospital of Sichuan University, Sichuan, 610041, China
| | - Enmin Li
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China.
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China.
| | - Bingli Wu
- Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chaoshan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou, 515041, China.
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, 515041, China.
|
226
|
Wang RS, Loscalzo J. Network-Based Disease Module Discovery by a Novel Seed Connector Algorithm with Pathobiological Implications. J Mol Biol 2018; 430:2939-2950. [PMID: 29791871 DOI: 10.1016/j.jmb.2018.05.016] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 03/07/2018] [Revised: 04/30/2018] [Accepted: 05/10/2018] [Indexed: 01/09/2023]
Abstract
Understanding the genetic basis of complex diseases is challenging. Prior work shows that disease-related proteins do not typically function in isolation. Rather, they often interact with each other to form a network module that underlies dysfunctional mechanistic pathways. Identifying such disease modules will provide insights into a systems-level understanding of molecular mechanisms of diseases. Owing to the incompleteness of our knowledge of disease proteins and limited information on the biological mediators of pathobiological processes, the key proteins (seed proteins) for many diseases appear scattered over the human protein-protein interactome and form a few small branches, rather than coherent network modules. In this paper, we develop a network-based algorithm, called the Seed Connector algorithm (SCA), to pinpoint disease modules by adding as few additional linking proteins (seed connectors) to the seed protein pool as possible. Such seed connectors are hidden disease module elements that are critical for interpreting the functional context of disease proteins. The SCA aims to connect seed disease proteins so that disease mechanisms and pathways can be decoded based on predicted coherent network modules. We validate the algorithm using a large corpus of 70 complex diseases and binding targets of over 200 drugs, and demonstrate the biological relevance of the seed connectors. Lastly, as a specific proof of concept, we apply the SCA to a set of seed proteins for coronary artery disease derived from a meta-analysis of large-scale genome-wide association studies and obtain a coronary artery disease module enriched with important disease-related signaling pathways and drug targets not previously recognized.
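The idea of adding as few linking proteins as possible can be illustrated with a small greedy sketch. This is one plausible reading of the abstract, not the authors' published procedure: the function names, the scoring rule (the largest number of seeds found in a single connected component), and the restriction to one connector per step are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def max_seeds_in_component(adj, nodes, seeds):
    """Largest number of seeds found in one connected component
    of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, count = deque([start]), 0
        while queue:
            u = queue.popleft()
            count += u in seeds
            for v in adj.get(u, ()):
                if v in nodes and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, count)
    return best


def greedy_seed_connectors(adj, seeds):
    """Hypothetical greedy loop: repeatedly add the neighboring non-seed
    node that joins the most seeds into one component; stop when no
    single node improves the score."""
    seeds = set(seeds)
    module = set(seeds)
    connectors = []
    current = max_seeds_in_component(adj, module, seeds)
    while True:
        # Candidates are non-module neighbors, sorted for deterministic ties.
        candidates = sorted({v for u in module for v in adj.get(u, ())} - module)
        best_node, best_score = None, current
        for node in candidates:
            score = max_seeds_in_component(adj, module | {node}, seeds)
            if score > best_score:
                best_node, best_score = node, score
        if best_node is None:
            return connectors
        module.add(best_node)
        connectors.append(best_node)
        current = best_score


edges = [("A", "x"), ("x", "B"), ("B", "y"), ("y", "C"), ("A", "z")]
connectors = greedy_seed_connectors(build_adjacency(edges), {"A", "B", "C"})
```

In this toy graph the seeds A, B and C start out disconnected, and the sketch selects `x` and `y` while ignoring `z`, which links no additional seeds. A single-step greedy rule like this cannot bridge seed groups that are more than two hops apart, so it should be read as an illustration of the objective only.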
Affiliation(s)
- Rui-Sheng Wang
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Joseph Loscalzo
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
|
227
|
Li G, Luo J, Xiao Q, Liang C, Ding P. Predicting microRNA-disease associations using label propagation based on linear neighborhood similarity. J Biomed Inform 2018; 82:169-177. [PMID: 29763707 DOI: 10.1016/j.jbi.2018.05.005] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 02/07/2018] [Revised: 04/17/2018] [Accepted: 05/11/2018] [Indexed: 12/11/2022]
Abstract
Interactions between microRNAs (miRNAs) and diseases can yield important information for uncovering novel prognostic markers. Since experimental determination of disease-miRNA associations is time-consuming and costly, attention has been given to designing efficient and robust computational techniques for identifying undiscovered interactions. In this study, we present a label propagation model with linear neighborhood similarity, called LPLNS, to predict unobserved miRNA-disease associations. Additionally, a preprocessing step is performed to derive new interaction likelihood profiles that will contribute to the prediction since new miRNAs and diseases lack known associations. Our results demonstrate that the LPLNS model based on the known disease-miRNA associations could achieve impressive performance with an AUC of 0.9034. Furthermore, we observed that the LPLNS model based on new interaction likelihood profiles could improve the performance to an AUC of 0.9127. This was better than other comparable methods. In addition, case studies also demonstrated our method's outstanding performance for inferring undiscovered interactions between miRNAs and diseases, especially for novel diseases.
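Label propagation in general can be sketched as a simple fixed-point iteration. The code below is a generic illustration, not the LPLNS implementation: it does not reproduce the linear neighborhood similarity step, and the function name `propagate_labels`, the damping value `alpha`, the iteration count, and the toy similarity matrix are all assumptions.

```python
def propagate_labels(weights, initial, alpha=0.5, iterations=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y.

    weights : row-normalized similarity matrix W (list of lists), assumed given
    initial : known association labels Y for one disease (1 = known, 0 = unknown)
    alpha   : assumed trade-off between neighbor information and initial labels
    """
    n = len(weights)
    scores = list(initial)
    for _ in range(iterations):
        scores = [
            alpha * sum(weights[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * initial[i]
            for i in range(n)
        ]
    return scores


# Toy chain of three miRNAs: 0 - 1 - 2, with only miRNA 0 known to be associated.
W = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
scores = propagate_labels(W, [1.0, 0.0, 0.0])
```

Here the unlabeled miRNA 1, which is directly similar to the known one, ends up with a higher score than miRNA 2, which is only indirectly connected. Ranking unlabeled items by such scores is the usual way candidate associations are prioritized with this kind of model.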
Affiliation(s)
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China.
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Qiu Xiao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Cheng Liang
- College of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Pingjian Ding
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
228
|
Lancour D, Naj A, Mayeux R, Haines JL, Pericak-Vance MA, Schellenberg GD, Crovella M, Farrer LA, Kasif S. One for all and all for One: Improving replication of genetic studies through network diffusion. PLoS Genet 2018; 14:e1007306. [PMID: 29684019 PMCID: PMC5933817 DOI: 10.1371/journal.pgen.1007306] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 10/20/2017] [Revised: 05/03/2018] [Accepted: 03/11/2018] [Indexed: 12/31/2022] Open
Abstract
Improving accuracy in genetic studies would greatly accelerate understanding the genetic basis of complex diseases. One approach to achieve such an improvement for risk variants identified by the genome wide association study (GWAS) approach is to incorporate previously known biology when screening variants across the genome. We developed a simple approach for improving the prioritization of candidate disease genes that incorporates a network diffusion of scores from known disease genes using a protein network and a novel integration with GWAS risk scores, and tested this approach on a large Alzheimer disease (AD) GWAS dataset. Using a statistical bootstrap approach, we cross-validated the method and for the first time showed that a network approach improves the expected replication rates in GWAS studies. Several novel AD genes were predicted including CR2, SHARPIN, and PTPN2. Our re-prioritized results are enriched for established known AD-associated biological pathways including inflammation, immune response, and metabolism, whereas standard non-prioritized results were not. Our findings support a strategy of considering network information when investigating genetic risk factors. Integrating multiple types of -omics data is a rapidly growing research area due in part to the increasing amount of diverse and publicly accessible data. In this study, we demonstrated that integration of genetic association and protein interaction data using a network diffusion approach measurably improves reproducibility of top candidate genes. Application of this approach to Alzheimer disease (AD) using a large dataset assembled by the Alzheimer’s Disease Genetics Consortium identified several novel candidate AD genes that are supported by pre-existing knowledge of AD pathobiology. Our findings support a strategy of considering network information when investigating genetic risk factors. 
Finally, we developed a transparent and easy-to-use R package that can facilitate the extension of our methodology to other phenotypes for which genetic data are available.
Affiliation(s)
- Daniel Lancour
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts, United States of America
| | - Adam Naj
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Richard Mayeux
- Department of Neurology and Sergievsky Center, Columbia University, New York, New York, United States of America
| | - Jonathan L. Haines
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Margaret A. Pericak-Vance
- Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
| | - Gerard D. Schellenberg
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Mark Crovella
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Computer Science, Boston University, Boston, Massachusetts, United States of America
| | - Lindsay A. Farrer
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts, United States of America
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Department of Ophthalmology, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
- Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts, United States of America
| | - Simon Kasif
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
|
229
|
Woldesemayat AA, Modise DM, Gemeildien J, Ndimba BK, Christoffels A. Cross-species multiple environmental stress responses: An integrated approach to identify candidate genes for multiple stress tolerance in sorghum (Sorghum bicolor (L.) Moench) and related model species. PLoS One 2018; 13:e0192678. [PMID: 29590108 PMCID: PMC5873934 DOI: 10.1371/journal.pone.0192678] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/24/2017] [Accepted: 01/29/2018] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Crop response to the changing climate and unpredictable effects of global warming with adverse conditions such as drought stress has brought concerns about food security to the fore; crop yield loss is a major cause of concern in this regard. Identification of genes with multiple responses across environmental stresses is the genetic foundation that leads to crop adaptation to environmental perturbations.
METHODS: In this paper, we introduce an integrated approach to assess candidate genes for multiple stress responses across species. The approach combines ontology-based semantic data integration with expression profiling, comparative genomics, phylogenomics, functional gene enrichment and gene enrichment network analysis to identify genes associated with plant stress phenotypes. Five different ontologies, viz., Gene Ontology (GO), Trait Ontology (TO), Plant Ontology (PO), Growth Ontology (GRO) and Environment Ontology (EO) were used to semantically integrate drought-related information.
RESULTS: Target genes linked to Quantitative Trait Loci (QTLs) controlling yield and stress tolerance in sorghum (Sorghum bicolor (L.) Moench) and closely related species were identified. Based on the enriched GO terms of the biological processes, 1116 sorghum genes with potential responses to 5 different stresses, namely drought (18%), salt (32%), cold (20%), heat (8%) and oxidative stress (25%), were identified as over-expressed. Out of 169 sorghum drought-responsive QTL-associated genes that were identified based on expression datasets, 56% were shown to have multiple stress responses. On the other hand, out of 168 additional genes that have been evaluated for orthologous pairs, 90% were conserved across species for drought tolerance. Over 50% of identified maize and rice genes were responsive to drought and salt stresses and were co-located within multifunctional QTLs. Among the total identified multi-stress responsive genes, 272 targets were shown to be co-localized within QTLs associated with different traits that are responsive to multiple stresses. Ontology mapping was used to validate the identified genes, while reconstruction of the phylogenetic tree was instrumental in inferring the evolutionary relationship of the sorghum orthologs. The results also show specific genes responsible for various interrelated components of the drought response mechanism such as drought tolerance, drought avoidance and drought escape.
CONCLUSIONS: We submit that this approach is novel and, to our knowledge, has not been used previously in any other research; it enables us to perform cross-species queries for genes that are likely to be associated with multiple stress tolerance, as a means to identify novel targets for engineering stress resistance in sorghum and possibly in other crop species.
Affiliation(s)
- Adugna Abdi Woldesemayat
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Belleville, South Africa
- Department of Life and Consumer Sciences, College of Agriculture and Environmental Sciences, University of South Africa, Science Campus, Florida, Johannesburg, South Africa
| | - David M. Modise
- Department of Agriculture and Animal Health, College of Agriculture and Environmental Sciences, University of South Africa, Science Campus, Florida, Johannesburg, South Africa
| | - Junaid Gemeildien
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Belleville, South Africa
| | - Bongani K. Ndimba
- Department of Biotechnology, University of the Western Cape, Cape Town, Western Cape, South Africa
- Agricultural Research Council, Infruitech-Nietvoorbij, Stellenbosch, South Africa
| | - Alan Christoffels
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Belleville, South Africa
|
230
|
Systematic Evaluation of Molecular Networks for Discovery of Disease Genes. Cell Syst 2018; 6:484-495.e5. [PMID: 29605183 DOI: 10.1016/j.cels.2018.03.001] [Citation(s) in RCA: 181] [Impact Index Per Article: 25.9] [Received: 08/24/2017] [Revised: 12/19/2017] [Accepted: 02/28/2018] [Indexed: 12/27/2022]
Abstract
Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall. A general tendency is that performance scales with network size, suggesting that new interaction discovery currently outweighs the detrimental effects of false positives. Correcting for size, we find that the DIP network provides the highest efficiency (value per interaction). Based on these results, we create a parsimonious composite network with both high efficiency and performance. This work provides a benchmark for selection of molecular networks in human disease research.
|
231
|
Hall AW, Battenhouse AM, Shivram H, Morris AR, Cowperthwaite MC, Shpak M, Iyer VR. Bivalent Chromatin Domains in Glioblastoma Reveal a Subtype-Specific Signature of Glioma Stem Cells. Cancer Res 2018; 78:2463-2474. [PMID: 29549165 DOI: 10.1158/0008-5472.can-17-1724] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 06/16/2017] [Revised: 10/10/2017] [Accepted: 03/13/2018] [Indexed: 12/12/2022]
Abstract
Glioblastoma multiforme (GBM) can be clustered by gene expression into four main subtypes associated with prognosis and survival, but enhancers and other gene-regulatory elements have not yet been identified in primary tumors. Here, we profiled six histone modifications and CTCF binding as well as gene expression in primary gliomas and identified chromatin states that define distinct regulatory elements across the tumor genome. Enhancers in mesenchymal and classical tumor subtypes drove gene expression associated with cell migration and invasion, whereas enhancers in proneural tumors controlled genes associated with a less aggressive phenotype in GBM. We identified bivalent domains marked by activating and repressive chromatin modifications. Interestingly, the gene interaction network from common (subtype-independent) bivalent domains was highly enriched for homeobox genes and transcription factors and dominated by SHH and Wnt signaling pathways. This subtype-independent signature of early neural development may be indicative of poised dedifferentiation capacity in glioblastoma and could provide potential targets for therapy. Significance: Enhancers and bivalent domains in glioblastoma are regulated in a subtype-specific manner that resembles gene regulation in glioma stem cells. Cancer Res; 78(10); 2463-74. ©2018 AACR.
Affiliation(s)
- Amelia Weber Hall
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas
| | - Anna M Battenhouse
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas
| | - Haridha Shivram
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas
| | - Adam R Morris
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas
| | - Max Shpak
- St David's Medical Center, Austin, Texas
- Sarah Cannon Research Institute, Nashville, Tennessee
| | - Vishwanath R Iyer
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas
- Livestrong Cancer Institutes, Dell Medical School, University of Texas at Austin, Austin, Texas
|
232
|
Abstract
Genome-wide association studies (GWAS) have identified more than 100 loci that show robust association with schizophrenia risk. However, due to the complexity of linkage disequilibrium and gene regulation, it is challenging to pinpoint the causal genes at the risk loci and translate the genetic findings from GWAS into disease mechanisms and clinical treatment. Here we systematically predicted the plausible candidate causal genes for schizophrenia at the genome-wide level. We utilized different approaches and strategies to predict causal genes for schizophrenia, including Sherlock, SMR, DAPPLE, Prix Fixe, NetWAS, and DEPICT. By integrating the results from different prediction approaches, we identified six top candidates that represent promising causal genes for schizophrenia, including CNTN4, GATAD2A, GPM6A, MMP16, PSMA4, and TCF4. In addition, we identified 35 additional high-confidence causal genes for schizophrenia. The identified causal genes showed distinct spatio-temporal expression patterns in developing and adult human brain. Cell-type-specific expression analysis indicated that the expression level of the predicted causal genes was significantly higher in neurons compared with oligodendrocytes and microglia (P < 0.05). We found that synaptic transmission-related genes were significantly enriched among the identified causal genes (P < 0.05), providing further support for the dysregulation of synaptic transmission in schizophrenia. Finally, we showed that the top six causal genes are dysregulated in schizophrenia cases compared with controls and that knockdown of these genes impaired the proliferation of neuronal cells. Our study depicts the landscape of plausible schizophrenia causal genes for the first time. Further genetic and functional validation of these genes will provide mechanistic insights into schizophrenia pathogenesis and may help provide potential targets for future therapeutics and diagnostics.
Affiliation(s)
- Changguo Ma
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223 China
| | - Chunjie Gu
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223 China
| | - Yongxia Huo
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223 China
| | - Xiaoyan Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223 China
| | - Xiong-Jian Luo
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, 650223, China.
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan, 650223, China.
|
233
|
Abstract
Network propagation is a powerful tool for genetic analysis which is widely used to identify genes and genetic modules that underlie a process of interest. Here we provide a graphical, web-based platform (http://anat.cs.tau.ac.il/WebPropagate/) in which researchers can easily apply variants of this method to data sets of interest using up-to-date networks of protein-protein interactions in several organisms.
Affiliation(s)
- Hadas Biran
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
| | - Tovi Almozlino
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
| | - Martin Kupiec
- Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv 69978, Israel
| | - Roded Sharan
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel.
|
234
|
Leveraging human genetic and adverse outcome pathway (AOP) data to inform susceptibility in human health risk assessment. Mamm Genome 2018; 29:190-204. [DOI: 10.1007/s00335-018-9738-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 10/04/2017] [Accepted: 01/31/2018] [Indexed: 12/19/2022]
|
235
|
Gao L, Uzun Y, Gao P, He B, Ma X, Wang J, Han S, Tan K. Identifying noncoding risk variants using disease-relevant gene regulatory networks. Nat Commun 2018; 9:702. [PMID: 29453388 PMCID: PMC5816022 DOI: 10.1038/s41467-018-03133-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 08/22/2017] [Accepted: 01/22/2018] [Indexed: 02/01/2023] Open
Abstract
Identifying noncoding risk variants remains a challenging task. Because noncoding variants exert their effects in the context of a gene regulatory network (GRN), we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. We describe Annotation of Regulatory Variants using Integrated Networks (ARVIN), a general computational framework for predicting causal noncoding variants. It employs a set of novel regulatory network-based features, combined with sequence-based features to infer noncoding risk variants. Using known causal variants in gene promoters and enhancers in a number of diseases, we show ARVIN outperforms state-of-the-art methods that use sequence-based features alone. Additional experimental validation using reporter assay further demonstrates the accuracy of ARVIN. Application of ARVIN to seven autoimmune diseases provides a holistic view of the gene subnetwork perturbed by the combinatorial action of the entire set of risk noncoding mutations.
Affiliation(s)
- Long Gao
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Yasin Uzun
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Peng Gao
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Bing He
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Xiaoke Ma
- School of Computer Science and Technology, Xidian University, Xi'an, 710126, Shaanxi, China
| | - Jiahui Wang
- The Jackson Laboratory, Farmington, CT, 06032, USA
| | - Shizhong Han
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Kai Tan
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Department of Cell & Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
|
236
|
Lee T, Lee I. araGWAB: Network-based boosting of genome-wide association studies in Arabidopsis thaliana. Sci Rep 2018; 8:2925. [PMID: 29440686 PMCID: PMC5811503 DOI: 10.1038/s41598-018-21301-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/10/2017] [Accepted: 02/01/2018] [Indexed: 11/23/2022] Open
Abstract
Genome-wide association studies (GWAS) have been applied for the genetic dissection of complex phenotypes in Arabidopsis thaliana. However, the significantly associated single-nucleotide polymorphisms (SNPs) could not explain all the phenotypic variations. A major reason for missing true phenotype-associated loci is the strict P-value threshold after adjustment for multiple hypothesis tests to reduce false positives. This statistical limitation can be partly overcome by increasing the sample size, but at a much higher cost. Alternatively, weak phenotype-association signals can be boosted by integrating other types of data. Here, we present a web application for network-based Arabidopsis genome-wide association boosting (araGWAB), which augments the likelihood of association with the given phenotype by integrating GWAS summary statistics (SNP P-values) and co-functional gene network information. The integration utilized the inherent values of SNPs with subthreshold significance, thus substantially increasing the information usage of GWAS data. We found that araGWAB could more effectively retrieve genes known to be associated with various phenotypes relevant to defense against bacterial pathogens, flowering time regulation, and organ development in A. thaliana. We also found that many of the network-boosted candidate genes for the phenotypes were supported by previous publications. The araGWAB is freely available at http://www.inetbio.org/aragwab/.
Affiliation(s)
- Tak Lee
- Department of Biotechnology, College of Life Sciences and Biotechnology, Yonsei University, Seoul, 03722, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Sciences and Biotechnology, Yonsei University, Seoul, 03722, Korea.
| |
|
237
|
Constructing Bayesian networks by integrating gene expression and copy number data identifies NLGN4Y as a novel regulator of prostate cancer progression. Oncotarget 2018; 7:68688-68707. [PMID: 27626693 PMCID: PMC5356583 DOI: 10.18632/oncotarget.11925] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/15/2016] [Accepted: 08/24/2016] [Indexed: 12/27/2022] Open
Abstract
To understand the heterogeneity of prostate cancer (PCa) and identify novel underlying drivers, we constructed integrative molecular Bayesian networks (IMBNs) for PCa by integrating gene expression and copy number alteration data from published datasets. After demonstrating such IMBNs with superior network accuracy, we identified multiple sub-networks within IMBNs related to biochemical recurrence (BCR) of PCa and inferred the corresponding key drivers. The key drivers regulated a set of common effectors including genes preferentially expressed in neuronal cells. NLGN4Y—a protein involved in synaptic adhesion in neurons—was ranked as the top gene closely linked to key drivers of myogenesis subnetworks. Lower expression of NLGN4Y was associated with higher grade PCa and an increased risk of BCR. We show that restoration of the protein expression of NLGN4Y in PC-3 cells leads to decreased cell proliferation, migration and inflammatory cytokine expression. Our results suggest that NLGN4Y is an important negative regulator in prostate cancer progression. More importantly, it highlights the value of IMBNs in generating biologically and clinically relevant hypotheses about prostate cancer that can be validated by independent studies.
|
238
|
Picart-Armada S, Thompson WK, Buil A, Perera-Lluna A. diffuStats: an R package to compute diffusion-based scores on biological networks. Bioinformatics 2018; 34:533-534. [PMID: 29029016 PMCID: PMC5860365 DOI: 10.1093/bioinformatics/btx632] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 06/06/2017] [Revised: 09/28/2017] [Accepted: 10/04/2017] [Indexed: 12/31/2022] Open
Abstract
Summary: Label propagation and diffusion over biological networks are a common mathematical formalism in computational biology for giving context to molecular entities and prioritizing novel candidates in the area of study. There are several choices in conceiving the diffusion process (involving the graph kernel, the score definitions and the presence of a posterior statistical normalization) which have an impact on the results. This manuscript describes diffuStats, an R package that provides a collection of graph kernels and diffusion scores, as well as a parallel permutation analysis for the normalized scores, that eases the computation of the scores and their benchmarking for an optimal choice. Availability and implementation: The R package diffuStats is publicly available in Bioconductor, https://bioconductor.org, under the GPL-3 license. Contact: sergi.picart@upc.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
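As an illustration of what a diffusion score over a network looks like, the sketch below computes unnormalised scores f = (I + aL)^-1 y on a toy graph, where L is the graph Laplacian and y the input labels. This uses the regularised Laplacian kernel, one standard graph kernel. It is written in plain Python rather than R, the function names and the toy graph are assumptions made for this example, and it does not reproduce the diffuStats API.

```python
def laplacian(adjacency):
    """Unnormalised graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def diffusion_scores(adjacency, labels, a=1.0):
    """Raw diffusion scores f = (I + a*L)^-1 y (regularised Laplacian kernel)."""
    lap = laplacian(adjacency)
    n = len(labels)
    system = [
        [(1.0 if i == j else 0.0) + a * lap[i][j] for j in range(n)]
        for i in range(n)
    ]
    return solve(system, labels)


# Toy path graph 0 - 1 - 2 - 3; only node 0 carries a positive label.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = diffusion_scores(adjacency, [1.0, 0.0, 0.0, 0.0])
print([round(s, 4) for s in scores])
```

The score decays with distance from the labelled node while the total label mass is conserved. The statistical normalisation the abstract mentions would be applied on top of raw scores like these and is not shown here.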
Affiliation(s)
- Sergio Picart-Armada
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Department of Biomedical Engineering, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
| | - Wesley K Thompson
- Institute of Biological Psychiatry, Mental Health Center Sct. Hans, Roskilde, Denmark
- Department of Family Medicine and Public Health, University of California, San Diego, La Jolla, CA, USA
| | - Alfonso Buil
- Institute of Biological Psychiatry, Mental Health Center Sct. Hans, Roskilde, Denmark
| | - Alexandre Perera-Lluna
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Department of Biomedical Engineering, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
| |
|
239
|
Zhao J, Cheng F, Jia P, Cox N, Denny JC, Zhao Z. An integrative functional genomics framework for effective identification of novel regulatory variants in genome-phenome studies. Genome Med 2018; 10:7. [PMID: 29378629 PMCID: PMC5789733 DOI: 10.1186/s13073-018-0513-x] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 06/19/2017] [Accepted: 01/04/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Genome-phenome studies have identified thousands of variants that are statistically associated with disease or traits; however, their functional roles are largely unclear. A comprehensive investigation of regulatory mechanisms and the gene regulatory networks between phenome-wide association study (PheWAS) and genome-wide association study (GWAS) is needed to identify novel regulatory variants contributing to risk for human diseases. METHODS: In this study, we developed an integrative functional genomics framework that maps 215,107 significant single nucleotide polymorphism (SNP) traits generated from the PheWAS Catalog and 28,870 genome-wide significant SNP traits collected from the GWAS Catalog into a global human genome regulatory map via incorporating various functional annotation data, including transcription factor (TF)-based motifs, promoters, enhancers, and expression quantitative trait loci (eQTLs) generated from four major functional genomics databases: FANTOM5, ENCODE, NIH Roadmap, and Genotype-Tissue Expression (GTEx). In addition, we performed a tissue-specific regulatory circuit analysis through the integration of the identified regulatory variants and tissue-specific gene expression profiles in 7051 samples across 32 tissues from GTEx. RESULTS: We found that the disease-associated loci in both the PheWAS and GWAS Catalogs were significantly enriched with functional SNPs. The integration of functional annotations significantly improved the power of detecting novel associations in PheWAS, through which we found a number of functional associations with strong regulatory evidence in the PheWAS Catalog. Finally, we constructed tissue-specific regulatory circuits for several complex traits: mental diseases, autoimmune diseases, and cancer, via exploring tissue-specific TF-promoter/enhancer-target gene interaction networks. We uncovered several promising tissue-specific regulatory TFs or genes for Alzheimer's disease (e.g. ZIC1 and STX1B) and asthma (e.g. CSF3 and IL1RL1). CONCLUSIONS: This study offers powerful tools for exploring the functional consequences of variants generated from genome-phenome association studies in terms of their mechanisms on affecting multiple complex diseases and traits.
Affiliation(s)
- Junfei Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
| | - Feixiong Cheng
- Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, 02215, USA
- Center for Complex Networks Research, Northeastern University, Boston, MA, 02215, USA
| | - Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
| | - Nancy Cox
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
| | - Joshua C Denny
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA.
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
| |
|
240
|
Lages J, Shepelyansky DL, Zinovyev A. Inferring hidden causal relations between pathway members using reduced Google matrix of directed biological networks. PLoS One 2018; 13:e0190812. [PMID: 29370181 PMCID: PMC5784915 DOI: 10.1371/journal.pone.0190812] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 04/14/2017] [Accepted: 12/20/2017] [Indexed: 12/18/2022] Open
Abstract
Signaling pathways represent parts of the global biological molecular network which connects them into a seamless whole through complex direct and indirect (hidden) crosstalk whose structure can change during development or in pathological conditions. We suggest a novel methodology, called Googlomics, for the structural analysis of directed biological networks using spectral analysis of their Google matrices, drawing on parallels with quantum scattering theory, developed for nuclear and mesoscopic physics and quantum chaos. We introduce an analytical "reduced Google matrix" method for the analysis of biological network structure. The method allows inferring hidden causal relations between the members of a signaling pathway or a functionally related group of genes. We investigate how the structure of hidden causal relations can be reprogrammed as a result of changes in the transcriptional network layer during cancerogenesis. The suggested Googlomics approach rigorously characterizes complex systemic changes in the wiring of large causal biological networks in a computationally efficient way.
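For readers unfamiliar with the object this abstract builds on, the sketch below constructs the ordinary Google matrix G = alpha*S + (1 - alpha)/N of a small directed network and finds its stationary (PageRank) vector by power iteration. The toy edge list, the damping factor 0.85, and the function names are assumptions made for illustration. The paper's reduced Google matrix, which restricts G to a subset of nodes while accounting for indirect paths, is not implemented here.

```python
def google_matrix(edges, n_nodes, alpha=0.85):
    """Column-stochastic Google matrix G = alpha*S + (1 - alpha)/N.

    `edges` holds directed (source, target) pairs; column j of S describes
    where a random surfer at node j moves next.
    """
    out_links = [[] for _ in range(n_nodes)]
    for source, target in edges:
        out_links[source].append(target)
    g = [[(1.0 - alpha) / n_nodes] * n_nodes for _ in range(n_nodes)]
    for j, targets in enumerate(out_links):
        if targets:
            for i in targets:
                g[i][j] += alpha / len(targets)
        else:
            # Dangling node: the surfer jumps to any node uniformly.
            for i in range(n_nodes):
                g[i][j] += alpha / n_nodes
    return g


def pagerank(g, n_iter=200):
    """Stationary vector of G obtained by power iteration."""
    n = len(g)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        p = [sum(g[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p


# Hypothetical directed toy network; node 3 has no incoming links.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
G = google_matrix(edges, n_nodes=4)
ranks = pagerank(G)
print([round(r, 4) for r in ranks])
```

Every column of G sums to one, so the iteration preserves total probability. Node 2, which collects links from three other nodes, ends up with the largest rank, and node 3 receives only the uniform teleportation share.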
Affiliation(s)
- José Lages
- Institut UTINAM, Observatoire des Sciences de l'Univers THETA, CNRS, Université de Franche-Comté, 25030 Besançon, France
| | - Dima L Shepelyansky
- Laboratoire de Physique Théorique du CNRS, IRSAMC, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
| | - Andrei Zinovyev
- Institut Curie, PSL Research University, Mines Paris Tech, Inserm, U900, F-75005, Paris, France
- Laboratory of advanced methods for high-dimensional data analysis, Lobachevsky University, Nizhni Novgorod, Russia
| |
|
241
|
Cheng L, Jiang Y, Ju H, Sun J, Peng J, Zhou M, Hu Y. InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk. BMC Genomics 2018; 19:919. [PMID: 29363423 PMCID: PMC5780854 DOI: 10.1186/s12864-017-4338-6] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Indexed: 02/08/2023] Open
Abstract
Background: Since the establishment of the first biomedical ontology Gene Ontology (GO), the number of biomedical ontology has increased dramatically. Nowadays over 300 ontologies have been built including extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because of the advantage of identifying novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Though similarities between terms within each ontology have been studied with in silico methods, term similarities across different ontologies were not investigated as deeply. The latest method took advantage of gene functional interaction network (GFIN) to explore such inter-ontology similarities of terms. However, it only used gene interactions and failed to make full use of the connectivity among gene nodes of the network. In addition, all existent methods are particularly designed for GO and their performances on the extended ontology community remain unknown. Results: We proposed a method InfAcrOnt to infer similarities between terms across ontologies utilizing the entire GFIN. InfAcrOnt builds a term-gene-gene network which comprised ontology annotations and GFIN, and acquires similarities between terms across ontologies through modeling the information flow within the network by random walk. In our benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves a high average area under the receiver operating characteristic curve (AUC) (0.9322 and 0.9309) and low standard deviations (1.8746e-6 and 3.0977e-6) in both human and yeast benchmark datasets exhibiting superior performance. Meanwhile, comparisons of InfAcrOnt results and prior knowledge on pair-wise DO-HPO terms and pair-wise DO-GO terms show high correlations. Conclusions: The experiment results show that InfAcrOnt significantly improves the performance of inferring similarities between terms across ontologies in benchmark set. Electronic supplementary material: The online version of this article (10.1186/s12864-017-4338-6) contains supplementary material, which is available to authorized users.
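The idea of modeling information flow through a term-gene-gene network by random walk can be sketched as a random walk with restart from one ontology term, reading off how much probability reaches terms of another ontology. The network below, the term and gene identifiers, and the restart probability of 0.3 are all hypothetical. The sketch only illustrates the general technique and should not be read as InfAcrOnt's actual similarity definition.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `seed`.

    `adjacency` maps each node to its neighbours in an undirected network;
    every node is assumed to have at least one neighbour.
    """
    nodes = sorted(adjacency)
    p = {node: 1.0 if node == seed else 0.0 for node in nodes}
    for _ in range(max_iter):
        nxt = {node: (restart if node == seed else 0.0) for node in nodes}
        for node in nodes:
            # Each node passes its non-restarting mass equally to its neighbours.
            share = (1.0 - restart) * p[node] / len(adjacency[node])
            for neighbour in adjacency[node]:
                nxt[neighbour] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical term-gene-gene network: term-gene edges are annotations,
# gene-gene edges are functional interactions.
network = {
    "DO:term1": ["g1", "g2"],
    "HPO:term2": ["g2", "g3"],
    "HPO:term3": ["g4"],
    "g1": ["DO:term1", "g3"],
    "g2": ["DO:term1", "HPO:term2"],
    "g3": ["HPO:term2", "g1", "g4"],
    "g4": ["HPO:term3", "g3"],
}
flow = random_walk_with_restart(network, seed="DO:term1")
print({k: round(v, 4) for k, v in flow.items() if k.startswith("HPO")})
```

In this toy network HPO:term2 shares an annotated gene with the seed term and is also reachable through a gene-gene interaction, so it receives more flow than HPO:term3, which is connected only through a longer chain of gene interactions.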
Affiliation(s)
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China
| | - Yue Jiang
- Hospital for Sick Children, Toronto, M5G 1X8, Canada
| | - Hong Ju
- Department of Information Engineering, Heilongjiang Biological Science and Technology Career Academy, Harbin, 150081, People's Republic of China
| | - Jie Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China
| | - Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, Xian, 710072, People's Republic of China
| | - Meng Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China.
| | - Yang Hu
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150088, People's Republic of China.
| |
|
242
|
Voigt EA, Grill DE, Zimmermann MT, Simon WL, Ovsyannikova IG, Kennedy RB, Poland GA. Transcriptomic signatures of cellular and humoral immune responses in older adults after seasonal influenza vaccination identified by data-driven clustering. Sci Rep 2018; 8:739. [PMID: 29335477 PMCID: PMC5768803 DOI: 10.1038/s41598-017-17735-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 03/22/2017] [Accepted: 11/30/2017] [Indexed: 12/13/2022] Open
Abstract
PBMC transcriptomes after influenza vaccination contain valuable information about factors affecting vaccine responses. However, distilling meaningful knowledge out of these complex datasets is often difficult and requires advanced data mining algorithms. We investigated the use of the data-driven Weighted Gene Correlation Network Analysis (WGCNA) gene clustering method to identify vaccine response-related genes in PBMC transcriptomic datasets collected from 138 healthy older adults (ages 50-74) before and after 2010-2011 seasonal trivalent influenza vaccination. WGCNA separated the 14,197 gene dataset into 15 gene clusters based on observed gene expression patterns across subjects. Eight clusters were strongly enriched for genes involved in specific immune cell types and processes, including B cells, T cells, monocytes, platelets, NK cells, cytotoxic T cells, and antiviral signaling. Examination of gene cluster membership identified signatures of cellular and humoral responses to seasonal influenza vaccination, as well as pre-existing cellular immunity. The results of this study illustrate the utility of this publicly available analysis methodology and highlight genes previously associated with influenza vaccine responses (e.g., CAMK4, CD19), genes with functions not previously identified in vaccine responses (e.g., SPON2, MATK, CST7), and previously uncharacterized genes (e.g., CORO1C, C8orf83) likely related to influenza vaccine-induced immunity due to their expression patterns.
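The first step of a weighted correlation network analysis, turning expression patterns across subjects into gene-gene connection strengths, can be sketched as follows. The example computes an unsigned soft-thresholded adjacency a_ij = |cor(gene_i, gene_j)|^beta. The gene names, expression values, and the power beta = 6 are assumptions for illustration, and the subsequent clustering of this adjacency into gene modules is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(gene_i, gene_j)| ** beta."""
    genes = sorted(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g < h
    }


# Hypothetical expression profiles of three genes across five subjects.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
weights = soft_threshold_adjacency(expression)
for pair, weight in sorted(weights.items()):
    print(pair, round(weight, 6))
```

Raising the correlation to a power keeps strongly co-expressed pairs such as geneA and geneB close to one while pushing weak correlations towards zero, which is what lets tightly co-expressed groups stand out as clusters.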
Affiliation(s)
- Emily A Voigt
- Mayo Clinic Vaccine Research Group, Mayo Clinic, Rochester, MN 55905, USA
| | - Diane E Grill
- Division of Biomedical Statistics and Informatics Mayo Clinic, Rochester, MN 55905, USA
| | - Michael T Zimmermann
- Division of Biomedical Statistics and Informatics Mayo Clinic, Rochester, MN 55905, USA
| | - Whitney L Simon
- Mayo Clinic Vaccine Research Group, Mayo Clinic, Rochester, MN 55905, USA
| | - Richard B Kennedy
- Mayo Clinic Vaccine Research Group, Mayo Clinic, Rochester, MN 55905, USA
| | - Gregory A Poland
- Mayo Clinic Vaccine Research Group, Mayo Clinic, Rochester, MN 55905, USA.
| |
|
243
|
Han H, Cho JW, Lee S, Yun A, Kim H, Bae D, Yang S, Kim CY, Lee M, Kim E, Lee S, Kang B, Jeong D, Kim Y, Jeon HN, Jung H, Nam S, Chung M, Kim JH, Lee I. TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res 2018; 46:D380-D386. [PMID: 29087512 PMCID: PMC5753191 DOI: 10.1093/nar/gkx1013] [Citation(s) in RCA: 1216] [Impact Index Per Article: 173.7] [Received: 08/04/2017] [Revised: 10/02/2017] [Accepted: 10/13/2017] [Indexed: 02/07/2023] Open
Abstract
Transcription factors (TFs) are major trans-acting factors in transcriptional regulation. Therefore, elucidating TF-target interactions is a key step toward understanding the regulatory circuitry underlying complex traits such as human diseases. We previously published a reference TF-target interaction database for humans, TRRUST (Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining), which was constructed using sentence-based text mining, followed by manual curation. Here, we present TRRUST v2 (www.grnpedia.org/trrust) with a significant improvement from the previous version, including a significantly increased size of the database consisting of 8444 regulatory interactions for 800 TFs in humans. More importantly, TRRUST v2 also contains a database for TF-target interactions in mice, including 6552 TF-target interactions for 828 mouse TFs. TRRUST v2 is also substantially more comprehensive and less biased than other TF-target interaction databases. We also improved the web interface, which now enables prioritization of key TFs for a physiological condition depicted by a set of user-input transcriptional responsive genes. With the significant expansion in the database size and inclusion of the new web tool for TF prioritization, we believe that TRRUST v2 will be a versatile database for the study of the transcriptional regulation involved in human diseases.
Affiliation(s)
- Heonjong Han
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Jae-Won Cho
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Sangyoung Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Ayoung Yun
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Hyojin Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Dasom Bae
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Sunmo Yang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Chan Yeong Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Muyoung Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Eunbeen Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Sungho Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Byunghee Kang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Dabin Jeong
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Yaeji Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Hyeon-Nae Jeon
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Haein Jung
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Sunhwee Nam
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Michael Chung
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Jong-Hoon Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Korea University, Seoul 02841, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| |
|
244
|
Abstract
Single cell RNA-seq (scRNA-seq) experiments suffer from a range of characteristic technical biases, such as dropouts (zero or near zero counts) and high variance. Current analysis methods rely on imputing missing values by various means of local averaging or regression, often amplifying biases inherent in the data. We present netSmooth, a network-diffusion based method that uses priors for the covariance structure of gene expression profiles on scRNA-seq experiments in order to smooth expression values. We demonstrate that netSmooth improves clustering results of scRNA-seq experiments from distinct cell populations, time-course experiments, and cancer genomics. We provide an R package for our method, available at: https://github.com/BIMSBbioinfo/netSmooth.
Affiliation(s)
- Jonathan Ronen
- Scientific Bioinformatics Platform, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, 13125, Germany
| | - Altuna Akalin
- Scientific Bioinformatics Platform, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, 13125, Germany
| |
|
245
|
|
246
|
|
247
|
Abstract
In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., genes, noncoding RNAs, proteins, metabolites) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules: the vertices of the network denote biological molecules and the edges represent connections between them (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). By interpreting GRNs, biologists can understand not only the function of biological molecules but also the organization of the components of living cells, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review inference methods for GRN reconstruction and approaches for analyzing network structure. We also introduce applications of the network method, a powerful tool for studying complex diseases and biological processes, in pathway analysis and disease gene identification.
|
248
|
|
249
|
Crosara KTB, Moffa EB, Xiao Y, Siqueira WL. Merging in-silico and in vitro salivary protein complex partners using the STRING database: A tutorial. J Proteomics 2018; 171:87-94. [DOI: 10.1016/j.jprot.2017.08.002] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 06/24/2017] [Revised: 07/28/2017] [Accepted: 08/01/2017] [Indexed: 12/20/2022]
|
250
|
Goebels F, Hu L, Bader G, Emili A. Automated Computational Inference of Multi-protein Assemblies from Biochemical Co-purification Data. Methods Mol Biol 2018; 1764:391-399. [PMID: 29605929 DOI: 10.1007/978-1-4939-7759-8_25] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022]
Abstract
Biology has amassed a wealth of information about the function of a multitude of protein-coding genes across species. The challenge now is to understand how all these proteins work together to form a living organism, and a crucial step for gaining this knowledge is a complete description of the molecular "wiring circuits" that underlie cellular processes. In this chapter, we describe a general computational framework for predicting multi-protein assemblies from biochemical co-fractionation data.
Affiliation(s)
- Florian Goebels
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
| | - Lucas Hu
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
| | - Gary Bader
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
| | - Andrew Emili
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada.
| |
|