1
Yokoyama TT, Okada M, Taniguchi T. Panacea: Visual exploration system for analyzing trends in annual recruitment using time-varying graphs. PLoS One 2021; 16:e0247587. [PMID: 33647012 PMCID: PMC7920367 DOI: 10.1371/journal.pone.0247587] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/18/2020] [Accepted: 02/10/2021] [Indexed: 11/19/2022] [Open Access]
Abstract
Annual recruitment data of new graduates are manually analyzed by human resources (HR) specialists in industries, which signifies the need to evaluate the recruitment strategy of HR specialists. Different job seekers send applications to companies every year. The relationships between applicants' attributes (e.g., English skill or academic credentials) can be used to analyze the changes in recruitment trends across multiple years. However, most attributes are unnormalized and thus require thorough preprocessing. Such unnormalized data hinder effective comparison of the relationship between applicants in the early stage of data analysis. Thus, a visual exploration system is highly needed to gain insight from the overview of the relationship among applicant qualifications across multiple years. In this study, we propose the Polarizing Attributes for Network Analysis of Correlation on Entities Association (Panacea) visualization system. The proposed system integrates a time-varying graph model and dynamic graph visualization for heterogeneous tabular data. Using this system, HR specialists can interactively inspect the relationships between two attributes of prospective employees across multiple years. Further, we demonstrate the usability of Panacea with representative examples for finding hidden trends in real-world datasets, and we discuss feedback from HR specialists obtained throughout Panacea's development. The proposed Panacea system enables HR specialists to visually explore the annual recruitment of new graduates.
Affiliation(s)
- Tadahiro Taniguchi
- Panasonic Corporation, Osaka, Japan
- Ritsumeikan University, Kusatsu, Shiga, Japan
2
Sakamoto Y, Xu L, Seki M, Yokoyama TT, Kasahara M, Kashima Y, Ohashi A, Shimada Y, Motoi N, Tsuchihara K, Kobayashi SS, Kohno T, Shiraishi Y, Suzuki A, Suzuki Y. Long-read sequencing for non-small-cell lung cancer genomes. Genome Res 2020; 30:1243-1257. [PMID: 32887687 PMCID: PMC7545141 DOI: 10.1101/gr.261941.120] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 02/02/2020] [Accepted: 04/09/2020] [Indexed: 12/23/2022]
Abstract
Here, we report the application of a long-read sequencer, PromethION, for analyzing human cancer genomes. We first conducted whole-genome sequencing on lung cancer cell lines. We found that it is possible to genotype known cancerous mutations, such as point mutations. We also found that long-read sequencing is particularly useful for precisely identifying and characterizing structural aberrations, such as large deletions, gene fusions, and other chromosomal rearrangements. In addition, we identified several medium-sized structural aberrations consisting of complex combinations of local duplications, inversions, and microdeletions. These complex mutations occurred even in key cancer-related genes, such as STK11, NF1, SMARCA4, and PTEN. The biological relevance of those mutations was further revealed by epigenome, transcriptome, and protein analyses of the affected signaling pathways. Such structural aberrations were also found in clinical lung adenocarcinoma specimens. Those structural aberrations were unlikely to be reliably detected by conventional short-read sequencing. Therefore, long-read sequencing may contribute to understanding the molecular etiology of patients for whom causative cancerous mutations remain unknown and therapeutic strategies are elusive.
Affiliation(s)
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Liu Xu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Toshiyuki T Yokoyama
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Masahiro Kasahara
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Yukie Kashima
- Division of Translational Informatics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan; Division of Translational Genomics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
- Akihiro Ohashi
- Division of Translational Genomics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
- Yoko Shimada
- Division of Genome Biology, National Cancer Center Research Institute, Tokyo 104-0045, Japan
- Noriko Motoi
- Department of Pathology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Katsuya Tsuchihara
- Division of Translational Informatics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
- Susumu S Kobayashi
- Division of Translational Genomics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
- Takashi Kohno
- Division of Genome Biology, National Cancer Center Research Institute, Tokyo 104-0045, Japan
- Yuichi Shiraishi
- Division of Cellular Signaling, National Cancer Center Research Institute, Tokyo 104-0045, Japan
- Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan; Division of Translational Informatics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
- Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
3
Yokoyama TT, Kasahara M. Visualization tools for human structural variations identified by whole-genome sequencing. J Hum Genet 2020; 65:49-60. [PMID: 31666648 PMCID: PMC8075883 DOI: 10.1038/s10038-019-0687-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/05/2019] [Revised: 09/27/2019] [Accepted: 10/02/2019] [Indexed: 01/02/2023]
Abstract
Visualizing structural variations (SVs) is a critical step for finding associations between SVs and human traits or diseases. Given that there are many sequencing platforms used for SV identification and given that how best to visualize SVs together with other data, such as read alignments and annotations, depends on research goals, there are dozens of SV visualization tools designed for different research goals and sequencing platforms. Here, we provide a comprehensive survey of over 30 SV visualization tools to help users choose which tools to use. This review targets users who wish to visualize a set of SVs identified from the massively parallel sequencing reads of an individual human genome. We first categorize the ways in which SV visualization tools display SVs into ten major categories, which we denote as view modules. View modules allow readers to understand the features of each SV visualization tool quickly. Next, we introduce the features of individual SV visualization tools from several aspects, including whether SV views are integrated with annotations, whether long-read alignment is displayed, whether underlying data structures are graph-based, the type of SVs shown, whether auditing is possible, whether bird's eye view is available, sequencing platforms, and the number of samples. We hope that this review will serve as a guide for readers on the currently available SV visualization tools and lead to the development of new SV visualization tools in the near future.
Affiliation(s)
- Toshiyuki T Yokoyama
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Masahiro Kasahara
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
4
Yokoyama TT, Sakamoto Y, Seki M, Suzuki Y, Kasahara M. MoMI-G: modular multi-scale integrated genome graph browser. BMC Bioinformatics 2019; 20:548. [PMID: 31690272 PMCID: PMC6833150 DOI: 10.1186/s12859-019-3145-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/11/2019] [Accepted: 10/09/2019] [Indexed: 01/30/2023] [Open Access]
Abstract
Background: Genome graph is an emerging approach for representing structural variants on genomes with branches. For example, representing structural variants of cancer genomes as a genome graph is more natural than representing such genomes as differences from the linear reference genome. While more and more structural variants are being identified by long-read sequencing, many of them are difficult to visualize using existing structural variants visualization tools. To this end, visualization method for large genome graphs such as human cancer genome graphs is demanded.
Results: We developed MOdular Multi-scale Integrated Genome graph browser, MoMI-G, a web-based genome graph browser that can visualize genome graphs with structural variants and supporting evidences such as read alignments, read depth, and annotations. This browser allows more intuitive recognition of large, nested, and potentially more complex structural variations. MoMI-G has view modules for different scales, which allow users to view the whole genome down to nucleotide-level alignments of long reads. Alignments spanning reference alleles and those spanning alternative alleles are shown in the same view. Users can customize the view, if they are not satisfied with the preset views. In addition, MoMI-G has Interval Card Deck, a feature for rapid manual inspection of hundreds of structural variants. Herein, we describe the utility of MoMI-G by using representative examples of large and nested structural variations found in two cell lines, LC-2/ad and CHM1.
Conclusions: Users can inspect complex and large structural variations found by long-read analysis in large genomes such as human genomes more smoothly and more intuitively. In addition, users can easily filter out false positives by manually inspecting hundreds of identified structural variants with supporting long-read alignments and annotations in a short time.
Software availability: MoMI-G is freely available at https://github.com/MoMI-G/MoMI-G under the MIT license.
Affiliation(s)
- Toshiyuki T Yokoyama
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Masahiro Kasahara
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
5
Llamas B, Narzisi G, Schneider V, Audano PA, Biederstedt E, Blauvelt L, Bradbury P, Chang X, Chin CS, Fungtammasan A, Clarke WE, Cleary A, Ebler J, Eizenga J, Sibbesen JA, Markello CJ, Garrison E, Garg S, Hickey G, Lazo GR, Lin MF, Mahmoud M, Marschall T, Minkin I, Monlong J, Musunuri RL, Sagayaradj S, Novak AM, Rautiainen M, Regier A, Sedlazeck FJ, Siren J, Souilmi Y, Wagner J, Wrightsman T, Yokoyama TT, Zeng Q, Zook JM, Paten B, Busby B. A strategy for building and using a human reference pangenome. F1000Res 2019; 8:1751. [PMID: 34386196 PMCID: PMC8350888 DOI: 10.12688/f1000research.19630.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Accepted: 07/23/2021] [Indexed: 01/27/2024] [Open Access]
Abstract
In March 2019, 45 scientists and software engineers from around the world converged at the University of California, Santa Cruz for the first pangenomics codeathon. The purpose of the meeting was to propose technical specifications and standards for a usable human pangenome as well as to build relevant tools for genome graph infrastructures. During the meeting, the group held several intense and productive discussions covering a diverse set of topics, including advantages of graph genomes over a linear reference representation, design of new methods that can leverage graph-based data structures, and novel visualization and annotation approaches for pangenomes. Additionally, the participants self-organized themselves into teams that worked intensely over a three-day period to build a set of pipelines and tools for specific pangenomic applications. A summary of the questions raised and the tools developed are reported in this manuscript.
Affiliation(s)
- Bastien Llamas
- Australian Centre for Ancient DNA, School of Biological Sciences, Environment Institute, The University of Adelaide, Adelaide, South Australia, 5005, Australia
- Valerie Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Peter A. Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Evan Biederstedt
- Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02215, USA
- Lon Blauvelt
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Peter Bradbury
- Robert W. Holley Center, USDA-ARS, Ithaca, NY, 14853, USA
- Xian Chang
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Alan Cleary
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Jana Ebler
- Max Planck Institute for Informatics, Saarbrücken, Germany
- Jordan Eizenga
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Jonas A. Sibbesen
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Charles J. Markello
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Erik Garrison
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Shilpa Garg
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Glenn Hickey
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Gerard R. Lazo
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710-1105, USA
- Medhat Mahmoud
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Ilia Minkin
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Jean Monlong
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Sagayamary Sagayaradj
- Genome Center, University of California, Davis, Davis, CA, USA
- BASF, West Sacramento, CA, USA
- Adam M. Novak
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Allison Regier
- McDonnell Genome Institute, Washington University in St Louis, St Louis, MO, 63108, USA
- Fritz J. Sedlazeck
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jouni Siren
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Yassine Souilmi
- Australian Centre for Ancient DNA, School of Biological Sciences, Environment Institute, The University of Adelaide, Adelaide, South Australia, 5005, Australia
- Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Travis Wrightsman
- Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, 14853, USA
- Toshiyuki T. Yokoyama
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Qiandong Zeng
- Laboratory Corporation of America Holdings, Westborough, MA, 01581, USA
- Justin M. Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Benedict Paten
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Ben Busby
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
6
Llamas B, Narzisi G, Schneider V, Audano PA, Biederstedt E, Blauvelt L, Bradbury P, Chang X, Chin CS, Fungtammasan A, Clarke WE, Cleary A, Ebler J, Eizenga J, Sibbesen JA, Markello CJ, Garrison E, Garg S, Hickey G, Lazo GR, Lin MF, Mahmoud M, Marschall T, Minkin I, Monlong J, Musunuri RL, Sagayaradj S, Novak AM, Rautiainen M, Regier A, Sedlazeck FJ, Siren J, Souilmi Y, Wagner J, Wrightsman T, Yokoyama TT, Zeng Q, Zook JM, Paten B, Busby B. A strategy for building and using a human reference pangenome. F1000Res 2019; 8:1751. [PMID: 34386196 PMCID: PMC8350888 DOI: 10.12688/f1000research.19630.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Accepted: 07/23/2021] [Indexed: 11/20/2022] [Open Access]
Abstract
In March 2019, 45 scientists and software engineers from around the world converged at the University of California, Santa Cruz for the first pangenomics codeathon. The purpose of the meeting was to propose technical specifications and standards for a usable human pangenome as well as to build relevant tools for genome graph infrastructures. During the meeting, the group held several intense and productive discussions covering a diverse set of topics, including advantages of graph genomes over a linear reference representation, design of new methods that can leverage graph-based data structures, and novel visualization and annotation approaches for pangenomes. Additionally, the participants self-organized themselves into teams that worked intensely over a three-day period to build a set of pipelines and tools for specific pangenomic applications. A summary of the questions raised and the tools developed are reported in this manuscript.
Affiliation(s)
- Bastien Llamas
- Australian Centre for Ancient DNA, School of Biological Sciences, Environment Institute, The University of Adelaide, Adelaide, South Australia, 5005, Australia
- Valerie Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Evan Biederstedt
- Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02215, USA
- Lon Blauvelt
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Peter Bradbury
- Robert W. Holley Center, USDA-ARS, Ithaca, NY, 14853, USA
- Xian Chang
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Alan Cleary
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Jana Ebler
- Max Planck Institute for Informatics, Saarbrücken, Germany
- Jordan Eizenga
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Jonas A Sibbesen
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Charles J Markello
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Erik Garrison
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Shilpa Garg
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Glenn Hickey
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Gerard R Lazo
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710-1105, USA
- Medhat Mahmoud
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Ilia Minkin
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Jean Monlong
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Sagayamary Sagayaradj
- Genome Center, University of California, Davis, Davis, CA, USA; BASF, West Sacramento, CA, USA
- Adam M Novak
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Allison Regier
- McDonnell Genome Institute, Washington University in St Louis, St Louis, MO, 63108, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jouni Siren
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Yassine Souilmi
- Australian Centre for Ancient DNA, School of Biological Sciences, Environment Institute, The University of Adelaide, Adelaide, South Australia, 5005, Australia
- Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Travis Wrightsman
- Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, 14853, USA
- Toshiyuki T Yokoyama
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Qiandong Zeng
- Laboratory Corporation of America Holdings, Westborough, MA, 01581, USA
- Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Benedict Paten
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Ben Busby
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA