1
Szymczak P, Szczurek E. Artificial intelligence-driven antimicrobial peptide discovery. Curr Opin Struct Biol 2023; 83:102733. [PMID: 37992451 DOI: 10.1016/j.sbi.2023.102733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 10/06/2023] [Accepted: 10/30/2023] [Indexed: 11/24/2023]
Abstract
Antimicrobial peptides (AMPs) are emerging as promising agents against antimicrobial resistance, providing an alternative to conventional antibiotics. Artificial intelligence (AI) has revolutionized AMP discovery through both discrimination and generation approaches. The discriminators aid in the identification of promising candidates by predicting key peptide properties such as activity and toxicity, while the generators learn the distribution of peptides and enable sampling of novel AMP candidates, either de novo or as analogs of a prototype peptide. Moreover, the controlled generation of AMPs with desired properties is achieved by discriminator-guided filtering, positive-only learning, latent space sampling, as well as conditional and optimized generation. Here we review recent achievements in AI-driven AMP discovery, highlighting the most exciting directions.
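Of the controlled-generation strategies listed in this abstract, discriminator-guided filtering is the simplest to sketch: candidates are generated first, then kept only if discriminators predict the desired properties. The Python sketch below is illustrative only. The candidate sequences, the two scoring functions, and the thresholds are hypothetical stand-ins (crude composition heuristics, not trained activity or toxicity predictors) and are not taken from the review.

```python
# Hypothetical stand-ins for trained discriminators. Real discriminators would be
# learned models; these composition heuristics exist only to make the sketch run.
CATIONIC = set("KR")
ANIONIC = set("DE")
HYDROPHOBIC = set("AILMFWV")


def toy_activity_score(peptide):
    """Assumed proxy for activity: net charge per residue."""
    positive = sum(residue in CATIONIC for residue in peptide)
    negative = sum(residue in ANIONIC for residue in peptide)
    return (positive - negative) / len(peptide)


def toy_toxicity_score(peptide):
    """Assumed proxy for toxicity: fraction of hydrophobic residues."""
    return sum(residue in HYDROPHOBIC for residue in peptide) / len(peptide)


def discriminator_guided_filter(candidates, min_activity, max_toxicity):
    """Keep candidates that every discriminator accepts."""
    return [
        peptide
        for peptide in candidates
        if toy_activity_score(peptide) >= min_activity
        and toy_toxicity_score(peptide) <= max_toxicity
    ]


# Hypothetical output of a generator.
candidates = ["KKRRKK", "DDEEDD", "KRWWLL"]
selected = discriminator_guided_filter(candidates, min_activity=0.3, max_toxicity=0.5)
print(selected)
```

With these toy thresholds only the first candidate passes: the second fails the activity proxy and the third fails the toxicity proxy.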
Affiliation(s)
- Paulina Szymczak
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097, Warsaw, Poland.
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097, Warsaw, Poland.
2
Szymczak P, Możejko M, Grzegorzek T, Jurczak R, Bauer M, Neubauer D, Sikora K, Michalski M, Sroka J, Setny P, Kamysz W, Szczurek E. Author Correction: Discovering highly potent antimicrobial peptides with deep generative model HydrAMP. Nat Commun 2023; 14:5129. [PMID: 37612279 PMCID: PMC10447423 DOI: 10.1038/s41467-023-40879-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2023]
Affiliation(s)
- Paulina Szymczak
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Marcin Możejko
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Tomasz Grzegorzek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- NVIDIA, 2788 San Tomas Expressway, Santa Clara, CA, 95051, USA
- Radosław Jurczak
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Marta Bauer
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Damian Neubauer
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Karol Sikora
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Michał Michalski
- The Centre of New Technologies, University of Warsaw, Stefana Banacha 2c, 02-097, Warsaw, Poland
- Jacek Sroka
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Piotr Setny
- The Centre of New Technologies, University of Warsaw, Stefana Banacha 2c, 02-097, Warsaw, Poland
- Wojciech Kamysz
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland.
3
Geras A, Darvish Shafighi S, Domżał K, Filipiuk I, Rączkowska A, Szymczak P, Toosi H, Kaczmarek L, Koperski Ł, Lagergren J, Nowis D, Szczurek E. Celloscope: a probabilistic model for marker-gene-driven cell type deconvolution in spatial transcriptomics data. Genome Biol 2023; 24:120. [PMID: 37198601 PMCID: PMC10190053 DOI: 10.1186/s13059-023-02951-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 04/21/2023] [Indexed: 05/19/2023]
Abstract
Spatial transcriptomics maps gene expression across tissues, posing the challenge of determining the spatial arrangement of different cell types. However, spatial transcriptomics spots contain multiple cells. Therefore, the observed signal comes from mixtures of cells of different types. Here, we propose an innovative probabilistic model, Celloscope, that utilizes established prior knowledge on marker genes for cell type deconvolution from spatial transcriptomics data. Celloscope outperforms other methods on simulated data, successfully indicates known brain structures and spatially distinguishes between inhibitory and excitatory neuron types in mouse brain tissue, and dissects large heterogeneity of immune infiltrate composition in prostate gland tissue.
Affiliation(s)
- Agnieszka Geras
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Shadi Darvish Shafighi
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative - UMR, Paris, France
- Kacper Domżał
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Igor Filipiuk
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Alicja Rączkowska
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Paulina Szymczak
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Hosein Toosi
- KTH Royal Institute of Technology, Stockholm, Sweden
- Leszek Kaczmarek
- BRAINCITY, Nencki Institute of Experimental Biology of the Polish Academy of Sciences, Warsaw, Poland
- Łukasz Koperski
- Department of Pathology, Medical University of Warsaw, Warsaw, Poland
- Dominika Nowis
- Laboratory of Experimental Medicine, Medical University of Warsaw, Warsaw, Poland
- Ewa Szczurek
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland.
4
Markowska M, Budzinska MA, Coenen-Stass A, Kang S, Kizling E, Kolmus K, Koras K, Staub E, Szczurek E. Synthetic lethality prediction in DNA damage repair, chromatin remodeling and the cell cycle using multi-omics data from cell lines and patients. Sci Rep 2023; 13:7049. [PMID: 37120674 PMCID: PMC10148866 DOI: 10.1038/s41598-023-34161-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 04/25/2023] [Indexed: 05/01/2023]
Abstract
Discovering synthetic lethal (SL) gene partners of cancer genes is an important step in developing cancer therapies. However, identification of SL interactions is challenging, due to a large number of possible gene pairs, inherent noise and confounding factors in the observed signal. To discover robust SL interactions, we devised SLIDE-VIP, a novel framework combining eight statistical tests, including a new patient data-based test, iSurvLRT. SLIDE-VIP leverages multi-omics data from four different sources: gene inactivation cell line screens, cancer patient data, drug screens and gene pathways. We applied SLIDE-VIP to discover SL interactions between genes involved in DNA damage repair, chromatin remodeling and cell cycle, and their potentially druggable partners. The top 883 ranking SL candidates had strong evidence in cell line and patient data, reducing the initial space of 200K pairs 250-fold. Drug screen and pathway tests provided additional corroboration and insights into these interactions. We rediscovered well-known SL pairs such as RB1 and E2F3 or PRKDC and ATM, and in addition, proposed strong novel SL candidates such as PTEN and PIK3CB. In summary, SLIDE-VIP opens the door to the discovery of SL interactions with clinical potential. All analyses and visualizations are available via the online SLIDE-VIP WebApp.
Affiliation(s)
- Magda Markowska
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Postgraduate School of Molecular Medicine, Medical University of Warsaw, Zwirki i Wigury 61, 02-091, Warsaw, Poland
- Magdalena A Budzinska
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Ardigen S.A., Podole 76, 30-394, Cracow, Poland
- Anna Coenen-Stass
- Translational Medicine, Oncology Bioinformatics, Merck Healthcare KGaA, Frankfurt Strasse 250, 64293, Darmstadt, Germany
- Senbai Kang
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Ewa Kizling
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Krzysztof Koras
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Eike Staub
- Translational Medicine, Oncology Bioinformatics, Merck Healthcare KGaA, Frankfurt Strasse 250, 64293, Darmstadt, Germany
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland.
5
Sherratt K, Gruson H, Grah R, Johnson H, Niehus R, Prasse B, Sandmann F, Deuschel J, Wolffram D, Abbott S, Ullrich A, Gibson G, Ray EL, Reich NG, Sheldon D, Wang Y, Wattanachit N, Wang L, Trnka J, Obozinski G, Sun T, Thanou D, Pottier L, Krymova E, Meinke JH, Barbarossa MV, Leithäuser N, Mohring J, Schneider J, Włazło J, Fuhrmann J, Lange B, Rodiah I, Baccam P, Gurung H, Stage S, Suchoski B, Budzinski J, Walraven R, Villanueva I, Tucek V, Smid M, Zajíček M, Pérez Álvarez C, Reina B, Bosse NI, Meakin SR, Castro L, Fairchild G, Michaud I, Osthus D, Alaimo Di Loro P, Maruotti A, Eclerová V, Kraus A, Kraus D, Pribylova L, Dimitris B, Li ML, Saksham S, Dehning J, Mohr S, Priesemann V, Redlarski G, Bejar B, Ardenghi G, Parolini N, Ziarelli G, Bock W, Heyder S, Hotz T, Singh DE, Guzman-Merino M, Aznarte JL, Moriña D, Alonso S, Álvarez E, López D, Prats C, Burgard JP, Rodloff A, Zimmermann T, Kuhlmann A, Zibert J, Pennoni F, Divino F, Català M, Lovison G, Giudici P, Tarantino B, Bartolucci F, Jona Lasinio G, Mingione M, Farcomeni A, Srivastava A, Montero-Manso P, Adiga A, Hurt B, Lewis B, Marathe M, Porebski P, Venkatramanan S, Bartczuk RP, Dreger F, Gambin A, Gogolewski K, Gruziel-Słomka M, Krupa B, Moszyński A, Niedzielewski K, Nowosielski J, Radwan M, Rakowski F, Semeniuk M, Szczurek E, Zieliński J, Kisielewski J, Pabjan B, Kirsten H, Kheifetz Y, Scholz M, Biecek P, Bodych M, Filinski M, Idzikowski R, Krueger T, Ozanski T, Bracher J, Funk S. Predictive performance of multi-model ensemble forecasts of COVID-19 across European nations. eLife 2023; 12:81916. [PMID: 37083521 DOI: 10.7554/elife.81916] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/18/2022] [Accepted: 02/20/2023] [Indexed: 04/22/2023]
Abstract
Background: Short-term forecasts of infectious disease contribute to situational awareness and capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise forecasts' predictive performance by combining independent models into an ensemble. Here we report the performance of ensemble predictions of COVID-19 cases and deaths across Europe from March 2021 to March 2022. Methods: We created the European COVID-19 Forecast Hub, an online open-access platform where modellers upload weekly forecasts for 32 countries with results publicly visualised and evaluated. We created a weekly ensemble forecast from the equally-weighted average across individual models' predictive quantiles. We measured forecast accuracy using a baseline and relative Weighted Interval Score (rWIS). We retrospectively explored ensemble methods, including weighting by past performance. Results: We collected weekly forecasts from 48 models, of which we evaluated 29 models alongside the ensemble model. The ensemble had a consistently strong performance across countries over time, performing better on rWIS than 91% of forecasts for deaths (N=763 predictions from 20 models), and 83% of forecasts for cases (N=886 predictions from 23 models). Performance remained stable over a 4-week horizon for death forecasts but declined with longer horizons for cases. Among ensemble methods, the most influential choice came from using a median average instead of the mean, regardless of weighting component models. Conclusions: Our results support combining independent models into an ensemble forecast to improve epidemiological predictions, and suggest that median averages yield better performance than methods based on means. We highlight that forecast consumers should place more weight on incident death forecasts than case forecasts at horizons greater than two weeks.
Funding: European Commission, Ministerio de Ciencia, Innovación y Universidades, FEDER; Agència de Qualitat i Avaluació Sanitàries de Catalunya; Netzwerk Universitätsmedizin; Health Protection Research Unit; Wellcome Trust; European Centre for Disease Prevention and Control; Ministry of Science and Higher Education of Poland; Federal Ministry of Education and Research; Los Alamos National Laboratory; German Free State of Saxony; NCBiR; FISR 2020 Covid-19 I Fase; Spanish Ministry of Health / REACT-UE (FEDER); National Institutes of General Medical Sciences; Ministerio de Sanidad/ISCIII; PERISCOPE European H2020; PERISCOPE European H2021; InPresa; National Institutes of Health, NSF, US Centers for Disease Control and Prevention, Google, University of Virginia, Defense Threat Reduction Agency.
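The ensemble construction and scoring described in the abstract above can be illustrated with a minimal Python sketch. It is not the Forecast Hub's code: the three toy models, their quantile values, the observed count, and all function names are hypothetical. The weighted interval score follows the standard definition for quantile forecasts; the relative score against a baseline used in the paper is not reproduced here.

```python
from statistics import mean, median


def quantile_ensemble(forecasts, combine):
    """Combine models quantile by quantile (all models share the same levels)."""
    levels = sorted(forecasts[0])
    return {level: combine([f[level] for f in forecasts]) for level in levels}


def interval_score(lower, upper, observed, alpha):
    """Score of a central (1 - alpha) prediction interval; lower is better."""
    penalty_below = (2 / alpha) * max(lower - observed, 0)
    penalty_above = (2 / alpha) * max(observed - upper, 0)
    return (upper - lower) + penalty_below + penalty_above


def weighted_interval_score(quantiles, observed, alphas):
    """WIS = (0.5 * |y - median| + sum_k (alpha_k / 2) * IS_k) / (K + 0.5)."""
    total = 0.5 * abs(observed - quantiles[0.5])
    for alpha in alphas:
        lower = quantiles[round(alpha / 2, 6)]
        upper = quantiles[round(1 - alpha / 2, 6)]
        total += (alpha / 2) * interval_score(lower, upper, observed, alpha)
    return total / (len(alphas) + 0.5)


# Hypothetical forecasts: quantile level -> predicted weekly count.
models = [
    {0.25: 80, 0.5: 100, 0.75: 120},
    {0.25: 90, 0.5: 110, 0.75: 140},
    {0.25: 100, 0.5: 300, 0.75: 500},  # an outlying model
]
observed = 105

median_ensemble = quantile_ensemble(models, median)
mean_ensemble = quantile_ensemble(models, mean)

wis_median = weighted_interval_score(median_ensemble, observed, alphas=[0.5])
wis_mean = weighted_interval_score(mean_ensemble, observed, alphas=[0.5])
print(wis_median, wis_mean)
```

In this toy case the median ensemble is pulled less by the outlying model and so scores lower (better) than the mean ensemble. The example only illustrates the mechanism and makes no claim about the magnitudes reported in the study.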
Affiliation(s)
- Katharine Sherratt
- Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Hugo Gruson
- Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Rok Grah
- European Centre for Disease Prevention and Control, Stockholm, Sweden
- Helen Johnson
- European Centre for Disease Prevention and Control, Stockholm, Sweden
- Rene Niehus
- European Centre for Disease Prevention and Control, Stockholm, Sweden
- Bastian Prasse
- European Centre for Disease Prevention and Control, Stockholm, Sweden
- Frank Sandmann
- European Centre for Disease Prevention and Control, Stockholm, Sweden
- Sam Abbott
- Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Graham Gibson
- University of Massachusetts Amherst, Amherst, United States
- Evan L Ray
- University of Massachusetts Amherst, Amherst, United States
- Daniel Sheldon
- University of Massachusetts Amherst, Amherst, United States
- Yijin Wang
- University of Massachusetts Amherst, Amherst, United States
- Lijing Wang
- Boston Children's Hospital, Boston, United States
- Jan Trnka
- Department of Biochemistry, Cell and Molecular Biology, Charles University, Prague, Czech Republic
- Tao Sun
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Dorina Thanou
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Neele Leithäuser
- Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany
- Jan Mohring
- Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany
- Johanna Schneider
- Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany
- Jaroslaw Włazło
- Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany
- Berit Lange
- Helmholtz Centre for Infection Research, Braunschweig, Germany
- Isti Rodiah
- Helmholtz Centre for Infection Research, Braunschweig, Germany
- Inmaculada Villanueva
- Institut d'Investigacions Biomediques August Pi i Sunyer, Universitat Pompeu Fabra, Barcelona, Spain
- Vit Tucek
- Institute of Computer Science, Prague, Czech Republic
- Martin Smid
- Institute of Information Theory and Automation, Prague, Czech Republic
- Milan Zajíček
- Institute of Information Theory and Automation, Prague, Czech Republic
- Nikos I Bosse
- Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Sophie R Meakin
- Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Lauren Castro
- Los Alamos National Laboratory, Los Alamos, United States
- Isaac Michaud
- Los Alamos National Laboratory, Los Alamos, United States
- Dave Osthus
- Los Alamos National Laboratory, Los Alamos, United States
- Soni Saksham
- Massachusetts Institute of Technology, Cambridge, United States
- Jonas Dehning
- Max-Planck-Institut für Dynamik und Selbstorganisation, Göttingen, Germany
- Sebastian Mohr
- Max-Planck-Institut für Dynamik und Selbstorganisation, Göttingen, Germany
- Viola Priesemann
- MPRG Priesemann, Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Wolfgang Bock
- Technical University of Kaiserslautern, Kaiserslautern, Germany
- Thomas Hotz
- Technische Universität Ilmenau, Ilmenau, Germany
- Jose L Aznarte
- Universidad Nacional de Educacion a Distancia, Madrid, Spain
- Sergio Alonso
- Universitat Politecnica de Catalunya, Barcelona, Spain
- Enric Álvarez
- Universitat Politecnica de Catalunya, Barcelona, Spain
- Daniel López
- Universitat Politecnica de Catalunya, Barcelona, Spain
- Clara Prats
- Universitat Politecnica de Catalunya, Barcelona, Spain
- Benjamin Hurt
- University of Virginia, Charlottesville, United States
- Bryan Lewis
- University of Virginia, Charlottesville, United States
- Marcin Bodych
- Wroclaw University of Science and Technology, Wroclaw, Poland
- Maciej Filinski
- Wroclaw University of Science and Technology, Wroclaw, Poland
- Tyll Krueger
- Wroclaw University of Science and Technology, Wroclaw, Poland
- Tomasz Ozanski
- Wroclaw University of Science and Technology, Wroclaw, Poland
- Sebastian Funk
- Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
6
Szymczak P, Możejko M, Grzegorzek T, Jurczak R, Bauer M, Neubauer D, Sikora K, Michalski M, Sroka J, Setny P, Kamysz W, Szczurek E. Discovering highly potent antimicrobial peptides with deep generative model HydrAMP. Nat Commun 2023; 14:1453. [PMID: 36922490 PMCID: PMC10017685 DOI: 10.1038/s41467-023-36994-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 01/27/2022] [Accepted: 02/28/2023] [Indexed: 03/17/2023]
Abstract
Antimicrobial peptides emerge as compounds that can alleviate the global health hazard of antimicrobial resistance, prompting a need for novel computational approaches to peptide generation. Here, we propose HydrAMP, a conditional variational autoencoder that learns a lower-dimensional, continuous representation of peptides and captures their antimicrobial properties. The model disentangles the learnt representation of a peptide from its antimicrobial conditions and leverages parameter-controlled creativity. HydrAMP is the first model that is directly optimized for diverse tasks, including unconstrained and analogue generation, and outperforms other approaches in these tasks. An additional preselection procedure based on ranking of generated peptides and molecular dynamics simulations increases the experimental validation rate. Wet-lab experiments on five bacterial strains confirm high activity of nine peptides generated as analogues of clinically relevant prototypes, as well as six analogues of an inactive peptide. HydrAMP enables generation of diverse and potent peptides, making a step towards resolving the antimicrobial resistance crisis.
Affiliation(s)
- Paulina Szymczak
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Marcin Możejko
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Tomasz Grzegorzek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- NVIDIA, 2788 San Tomas Expressway, Santa Clara, CA, 95051, USA
- Radosław Jurczak
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Marta Bauer
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Damian Neubauer
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Karol Sikora
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Michał Michalski
- The Centre of New Technologies, University of Warsaw, Stefana Banacha 2c, 02-097, Warsaw, Poland
- Jacek Sroka
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Piotr Setny
- The Centre of New Technologies, University of Warsaw, Stefana Banacha 2c, 02-097, Warsaw, Poland
- Wojciech Kamysz
- Department of Inorganic Chemistry, Faculty of Pharmacy, Medical University of Gdańsk, Al. Gen. J. Hallera 107, 80-416, Gdańsk, Poland
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland.
7
Kang S, Borgsmüller N, Valecha M, Kuipers J, Alves JM, Prado-López S, Chantada D, Beerenwinkel N, Posada D, Szczurek E. SIEVE: joint inference of single-nucleotide variants and cell phylogeny from single-cell DNA sequencing data. Genome Biol 2022; 23:248. [PMID: 36451239 PMCID: PMC9714196 DOI: 10.1186/s13059-022-02813-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Accepted: 11/08/2022] [Indexed: 12/02/2022]
Abstract
We present SIEVE, a statistical method for the joint inference of somatic variants and cell phylogeny under the finite-sites assumption from single-cell DNA sequencing. SIEVE leverages raw read counts for all nucleotides and corrects the acquisition bias of branch lengths. In our simulations, SIEVE outperforms other methods in phylogenetic reconstruction and variant calling accuracy, especially in the inference of homozygous variants. Applying SIEVE to three datasets, one for triple-negative breast cancer (TNBC) and two for colorectal cancer (CRC), we find that double mutant genotypes are rare in CRC but unexpectedly frequent in the TNBC samples.
Affiliation(s)
- Senbai Kang
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Nico Borgsmüller
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland
- Monica Valecha
- CINBIO, Universidade de Vigo, 36310 Vigo, Spain
- Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- Jack Kuipers
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland
- Joao M. Alves
- CINBIO, Universidade de Vigo, 36310 Vigo, Spain
- Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- Sonia Prado-López
- CINBIO, Universidade de Vigo, 36310 Vigo, Spain
- Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- Institute of Solid State Electronics E362, Technische Universität Wien, Vienna, Austria
- Débora Chantada
- Department of Pathology, Hospital Álvaro Cunqueiro, Vigo, Spain
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland
- David Posada
- CINBIO, Universidade de Vigo, 36310 Vigo, Spain
- Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- Department of Biochemistry, Genetics, and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
8
Rączkowska A, Paśnik I, Kukiełka M, Nicoś M, Budzinska MA, Kucharczyk T, Szumiło J, Krawczyk P, Crosetto N, Szczurek E. Deep learning-based tumor microenvironment segmentation is predictive of tumor mutations and patient survival in non-small-cell lung cancer. BMC Cancer 2022; 22:1001. [PMID: 36131239 PMCID: PMC9490924 DOI: 10.1186/s12885-022-10081-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Accepted: 09/07/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND Despite the fact that tumor microenvironment (TME) and gene mutations are the main determinants of progression of the deadliest cancer in the world - lung cancer, their interrelations are not well understood. Digital pathology data provides a unique insight into the spatial composition of the TME. Various spatial metrics and machine learning approaches were proposed for prediction of either patient survival or gene mutations from this data. Still, these approaches are limited in the scope of analyzed features and in their explainability, and as such fail to transfer to clinical practice. METHODS Here, we generated 23,199 image patches from 26 hematoxylin-and-eosin (H&E)-stained lung cancer tissue sections and annotated them into 9 different tissue classes. Using this dataset, we trained a deep neural network ARA-CNN. Next, we applied the trained network to segment 467 lung cancer H&E images from The Cancer Genome Atlas (TCGA) database. We used the segmented images to compute human-interpretable features reflecting the heterogeneous composition of the TME, and successfully utilized them to predict patient survival and cancer gene mutations. RESULTS We achieved per-class AUC ranging from 0.72 to 0.99 for classifying tissue types in lung cancer with ARA-CNN. Machine learning models trained on the proposed human-interpretable features achieved a c-index of 0.723 in the task of survival prediction and AUC up to 73.5% for PDGFRB in the task of mutation classification. CONCLUSIONS We presented a framework that accurately predicted survival and gene mutations in lung adenocarcinoma patients based on human-interpretable features extracted from H&E slides. Our approach can provide important insights for designing novel cancer treatments, by linking the spatial structure of the TME in lung adenocarcinoma to gene mutations and patient survival. It can also expand our understanding of the effects that the TME has on tumor evolutionary processes. Our approach can be generalized to different cancer types to inform precision medicine strategies.
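The c-index reported in this abstract measures how often a model ranks patients' risks in the same order as their observed survival. As a minimal sketch, assuming the common Harrell's concordance index with right-censored data (the abstract does not say which variant or implementation the authors used), it can be computed as below. The survival times, event flags, and risk scores are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event and a
    shorter survival time than patient j. It is concordant when the model
    gave patient i the higher risk; ties in risk count as half.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical cohort: survival time, event observed (1) or censored (0), predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risks)
print(c_index)
```

Here five pairs are comparable and four are ordered correctly, giving 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.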
Affiliation(s)
- Alicja Rączkowska
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
- Iwona Paśnik
- Department of Clinical Pathomorphology, Medical University of Lublin, Jaczewskiego 8b, 20-090 Lublin, Poland
- Michał Kukiełka
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
- Marcin Nicoś
- Department of Pneumology, Oncology and Allergology, Medical University of Lublin, Jaczewskiego 8, 20-090 Lublin, Poland
- Tomasz Kucharczyk
- Department of Pneumology, Oncology and Allergology, Medical University of Lublin, Jaczewskiego 8, 20-090 Lublin, Poland
- Justyna Szumiło
- Department of Clinical Pathomorphology, Medical University of Lublin, Jaczewskiego 8b, 20-090 Lublin, Poland
- Paweł Krawczyk
- Department of Pneumology, Oncology and Allergology, Medical University of Lublin, Jaczewskiego 8, 20-090 Lublin, Poland
- Nicola Crosetto
- Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Tomtebodavägen 23a, 17165 Solna, Sweden
- Science for Life Laboratory, Tomtebodavägen 23a, 17165 Solna, Sweden
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
9
Coenen-Stass AM, Markowska M, Budzinska-Zaniewska M, Kolmus K, Szczurek E, Staub E. Abstract 1918: A novel computational framework predicts synthetic lethal interactions between key regulators of the DNA damage response and chromatin modifiers. Cancer Res 2022. [DOI: 10.1158/1538-7445.am2022-1918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The concept of synthetic lethality (SL) has had a pivotal impact on the development of the anti-cancer drug olaparib, the first approved agent targeting the DNA Damage Response (DDR). Typically, SL is described as the interaction of two genes, whereby simultaneous inactivation of both genes results in cell death whereas loss of one gene can be tolerated. Given the importance of SL for developing highly selective anti-cancer therapeutics, large efforts have been undertaken to identify these interactions, both experimentally and computationally.
Here, we present a novel computational framework harnessing large-scale cell line gene inactivation screens (DepMap, Project Score), as well as patient data (TCGA), to discover known and novel SL gene pairs. Overall, we implemented six statistical tests considering gene dependency scores, genomic profiles, gene expression and patient survival as parameters. We further utilized data from public drug screening consortia to validate our top-ranking pairs.
We applied our framework to a defined target space covering genes relating to DDR, chromatin binding, cell cycle and druggable genes (overall, more than 2.5 million pairs were tested). When focusing on SL partners for three DDR genes with promising inhibitors in early clinical development (ATM, ATR and DNA-PK), we noticed that chromatin modifiers were enriched in the top-ranking pairs. In particular, SL interactions were predicted with histone (de)methylases, histone (de)acetyltransferases and members of the SWI/SNF family.
For instance, we observed that loss-of-function mutations in several members of the KMT2 family, also known as the MLL family, would render cancer cells more dependent on ATR or ATM. Similarly, drug sensitivity was increased for selected DDR inhibitors in cell lines with KMT2 mutations. Members of the KMT2 family play essential roles in transcription, but have recently also been shown to be recruited to DNA damage sites and may be mediators of PARP inhibitor sensitivity. The KMT2 family is frequently mutated in several cancer types, such as endometrial and bladder cancer, thus underpinning their suitability as potential selection biomarkers as well as their relevance to cancer.
Furthermore, we observed that patients with mutations in EP300, a histone lysine acetyltransferase, exhibited an increase in ATR expression, which may constitute a compensatory mechanism. In addition, patients with ATR and EP300 double mutations tend to have a better probability of survival, suggesting decreased tumor fitness.
In summary, we present a modular framework to predict SL gene pairs to a defined target space. Here, we focused on discovering novel SL relationships for the key DDR regulators ATM, ATR and DNA-PK. Our results provide not only new biomarker hypotheses for further validation, but also suggest that cancers with a high mutation rate in chromatin modifying genes may be efficiently targeted by DDRi.
Citation Format: Anna M. Coenen-Stass, Magda Markowska, Magdalena Budzinska-Zaniewska, Krzysztof Kolmus, Ewa Szczurek, Eike Staub. A novel computational framework predicts synthetic lethal interactions between key regulators of the DNA damage response and chromatin modifiers [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1918.
Affiliation(s)
- Anna M. Coenen-Stass
- Oncology Bioinformatics, Translational Medicine, the healthcare business of Merck KGaA, Darmstadt, Germany
| | - Magda Markowska
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | | | | | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Eike Staub
- Oncology Bioinformatics, Translational Medicine, the healthcare business of Merck KGaA, Darmstadt, Germany
| |
|
10
|
Markowska M, Cąkała T, Miasojedow B, Aybey B, Juraeva D, Mazur J, Ross E, Staub E, Szczurek E. CONET: copy number event tree model of evolutionary tumor history for single-cell data. Genome Biol 2022; 23:128. [PMID: 35681161 PMCID: PMC9185904 DOI: 10.1186/s13059-022-02693-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Accepted: 05/23/2022] [Indexed: 11/10/2022] Open
Abstract
Copy number alterations constitute important phenomena in tumor evolution. Whole genome single-cell sequencing gives insight into copy number profiles of individual cells, but is highly noisy. Here, we propose CONET, a probabilistic model for joint inference of the evolutionary tree on copy number events and copy number calling. CONET employs an efficient, regularized MCMC procedure to search the space of possible model structures and parameters. We introduce a range of model priors and penalties for efficient regularization. CONET reveals copy number evolution in two breast cancer samples, and outperforms other methods in tree reconstruction, breakpoint identification and copy number calling.
Affiliation(s)
- Magda Markowska
- University of Warsaw, Faculty of Mathematics, Informatics and Mechanics, Banacha 2, Warsaw, Poland; Medical University of Warsaw, Postgraduate School of Molecular Medicine, Ks. Trojdena 2a Street, Warsaw, Poland
| | - Tomasz Cąkała
- University of Warsaw, Faculty of Mathematics, Informatics and Mechanics, Banacha 2, Warsaw, Poland
| | - Błażej Miasojedow
- University of Warsaw, Faculty of Mathematics, Informatics and Mechanics, Banacha 2, Warsaw, Poland
| | - Bogac Aybey
- Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Frankfurter Str. 250, Darmstadt, 64293, Germany
| | - Dilafruz Juraeva
- Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Frankfurter Str. 250, Darmstadt, 64293, Germany
| | - Johanna Mazur
- Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Frankfurter Str. 250, Darmstadt, 64293, Germany
| | - Edith Ross
- Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Frankfurter Str. 250, Darmstadt, 64293, Germany
| | - Eike Staub
- Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Frankfurter Str. 250, Darmstadt, 64293, Germany
| | - Ewa Szczurek
- University of Warsaw, Faculty of Mathematics, Informatics and Mechanics, Banacha 2, Warsaw, Poland.
| |
|
11
|
Krueger T, Gogolewski K, Bodych M, Gambin A, Giordano G, Cuschieri S, Czypionka T, Perc M, Petelos E, Rosińska M, Szczurek E. Risk assessment of COVID-19 epidemic resurgence in relation to SARS-CoV-2 variants and vaccination passes. Commun Med 2022; 2:23. [PMID: 35603303 PMCID: PMC9053266 DOI: 10.1038/s43856-022-00084-w] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 08/17/2021] [Accepted: 02/03/2022] [Indexed: 12/18/2022] Open
Abstract
The introduction of COVID-19 vaccination passes (VPs) by many countries coincided with the Delta variant fast becoming dominant across Europe. A thorough assessment of their impact on epidemic dynamics is still lacking. Here, we propose the VAP-SIRS model that considers possibly lower restrictions for the VP holders than for the rest of the population, imperfect vaccination effectiveness against infection, rates of (re-)vaccination and waning immunity, fraction of never-vaccinated, and the increased transmissibility of the Delta variant. Some predicted epidemic scenarios for realistic parameter values yield new COVID-19 infection waves within two years, and high daily case numbers in the endemic state, even without introducing VPs and granting more freedom to their holders. Still, suitable adaptive policies can avoid unfavorable outcomes. While VP holders could initially be allowed more freedom, the lack of full vaccine effectiveness and increased transmissibility will require accelerated (re-)vaccination, wide-spread immunity surveillance, and/or minimal long-term common restrictions.
Assessing the impact of vaccines, other public health measures, and declining immunity on SARS-CoV-2 control is challenging. This is particularly true in the context of vaccination passes, whereby vaccinated individuals have more freedom of making contacts than unvaccinated ones. Here, we use a mathematical model to simulate various scenarios and investigate the likelihood of containing COVID-19 outbreaks in example European countries. We demonstrate that both Alpha and Delta SARS-CoV-2 variants inevitably lead to recurring outbreaks when measures are lifted for vaccination pass holders. High re-vaccination rates and a lowered fraction of the unvaccinated population increase the benefit of vaccination passes. These observations are important for policy making, highlighting the need for continued vigilance, even where the epidemic is under control, especially when new variants of concern emerge.
Krueger, Gogolewski, and Bodych et al. assess the risk of COVID-19 epidemic resurgence in relation to SARS-CoV-2 variants and vaccination passes. Their model predicts that new COVID-19 infection waves within two years from the onset of the vaccination program are possible but that suitable adaptive policies can help to avoid unfavorable outcomes.
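For orientation only, the kind of compartmental dynamics the abstract refers to can be illustrated with a much simpler textbook SIRS model extended by a vaccination rate. This sketch is not the VAP-SIRS model: it has no vaccination-pass compartments, assumes perfect vaccine protection until immunity wanes, and the function name, parameter names and values are all hypothetical.

```python
def simulate_sirs(beta, gamma, omega, nu, i0=0.001, days=365, dt=0.1):
    """Forward-Euler integration of a basic SIRS model with vaccination.

    beta  : transmission rate per day
    gamma : recovery rate per day
    omega : rate at which immunity wanes (immune -> susceptible)
    nu    : vaccination rate (susceptible -> immune), an illustrative assumption
    State variables are population fractions, so s + i + r stays equal to 1.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i
        recoveries = gamma * i
        waning = omega * r
        vaccinations = nu * s
        s += dt * (waning - new_infections - vaccinations)
        i += dt * (new_infections - recoveries)
        r += dt * (recoveries + vaccinations - waning)
        trajectory.append((k * dt, s, i, r))
    return trajectory


# Hypothetical parameter values chosen only to make the example run.
trajectory = simulate_sirs(beta=0.3, gamma=0.1, omega=1 / 180, nu=0.002, days=100)
peak_infected = max(i for _, _, i, _ in trajectory)
```

Because immunity wanes at rate `omega`, susceptibles are replenished over time, which is the basic mechanism behind recurring waves in SIRS-type models.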
|
12
|
Czypionka T, Iftekhar EN, Prainsack B, Priesemann V, Bauer S, Calero Valdez A, Cuschieri S, Glaab E, Grill E, Krutzinna J, Lionis C, Machado H, Martins C, Pavlakis GN, Perc M, Petelos E, Pickersgill M, Skupin A, Schernhammer E, Szczurek E, Tsiodras S, Willeit P, Wilmes P. The benefits, costs and feasibility of a low incidence COVID-19 strategy. Lancet Reg Health Eur 2022; 13:100294. [PMID: 35005678 PMCID: PMC8720492 DOI: 10.1016/j.lanepe.2021.100294] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/04/2022]
Abstract
In the summer of 2021, European governments removed most NPIs after experiencing prolonged second and third waves of the COVID-19 pandemic. Most countries failed to achieve immunization rates high enough to avoid resurgence of the virus. Public health strategies for autumn and winter 2021 have ranged from countries aiming at low incidence by re-introducing NPIs to accepting high incidence levels. However, such high incidence strategies almost certainly lead to the very consequences that they seek to avoid: restrictions that harm people and economies. At high incidence, the important pandemic containment measure 'test-trace-isolate-support' becomes inefficient. At that point, the spread of SARS-CoV-2 and its numerous harmful consequences can likely only be controlled through restrictions. We argue that all European countries need to pursue a low incidence strategy in a coordinated manner. Such an endeavour can only be successful if it is built on open communication and trust.
Affiliation(s)
- Thomas Czypionka
- Institute for Advanced Studies, Vienna, Austria, and London School of Economics and Political Science, London, UK
| | - Emil N. Iftekhar
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | | | - Viola Priesemann
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | - Simon Bauer
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | | | - Sarah Cuschieri
- Faculty of Medicine and Surgery, University of Malta, Msida, Malta
| | - Enrico Glaab
- University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Eva Grill
- Ludwig-Maximilians University, Munich, Germany
| | | | - Christos Lionis
- Clinic of Social and Family Medicine, Faculty of Medicine, University of Crete, Heraklion, Greece and Institute of Health and Medicine, University of Linkoping, Linkoping, Sweden
| | | | - Carlos Martins
- Department of Community Medicine, Health Information and Decision Sciences of the Faculty of Medicine of the University of Porto, Porto, Portugal
| | | | - Matjaž Perc
- University of Maribor, Maribor, Slovenia, and Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
| | - Elena Petelos
- Clinic of Social and Family Medicine, Faculty of Medicine, University of Crete, Heraklion, Greece and Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
| | | | | | | | | | - Sotirios Tsiodras
- National and Kapodistrian University of Athens Medical School, Athens, Greece
| | - Peter Willeit
- Medical University of Innsbruck, Innsbruck, Austria, and University of Cambridge, Cambridge, UK
| | - Paul Wilmes
- University of Luxembourg, Esch-sur-Alzette, Luxembourg
| |
|
13
|
Gogolewski K, Miasojedow B, Sadkowska-Todys M, Stepień M, Demkow U, Lech A, Szczurek E, Rabczenko D, Rosińska M, Gambin A. Data-driven case fatality rate estimation for the primary lineage of SARS-CoV-2 in Poland. Methods 2022; 203:584-593. [PMID: 35085741 PMCID: PMC8785264 DOI: 10.1016/j.ymeth.2022.01.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/14/2021] [Revised: 12/05/2021] [Accepted: 01/18/2022] [Indexed: 12/28/2022] Open
Abstract
More than one and a half years after the outbreak of the COVID-19 pandemic, the scientific world is still trying to understand its dynamics. In this paper on the case fatality rate (CFR) of COVID-19, we study historical data on mortality in Poland during the first six months of the pandemic, when no SARS-CoV-2 variants of concern were present among the infected. To this end, we apply competing risk models to perform both uni- and multivariate analyses on specific subpopulations selected by different factors, including the key indicators: age, sex and hospitalization. The study explores the case fatality rate and finds that it decreases over time. Furthermore, we describe the differences in mortality between hospitalized and other cases, indicating a sudden increase in mortality among hospitalized cases at the end of the 2020 spring season. Exploratory and multivariate analysis revealed the real impact of each variable, and besides the expected factors indicating increased mortality (age, comorbidities) we track less obvious indicators. Recent medical care as well as the identification of the source contact, independently of comorbidities, significantly impact an individual's mortality risk. As a result, the study provides a twofold insight into COVID-19 mortality in Poland. On the one hand, we explore mortality in different groups with respect to different variables; on the other, we indicate novel factors that may be crucial in reducing mortality. The latter can be addressed, e.g., by more efficient contact tracing and by proper organization and management of the health care system to support those who need medical care independently of comorbidities or COVID-19 infection.
Affiliation(s)
- Krzysztof Gogolewski
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Błażej Miasojedow
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Małgorzata Sadkowska-Todys
- Department of Infectious Disease Epidemiology and Surveillance, National Institute of Public Health NIH - National Research Institute, Warsaw, Poland
| | - Małgorzata Stepień
- Department of Infectious Disease Epidemiology and Surveillance, National Institute of Public Health NIH - National Research Institute, Warsaw, Poland
| | - Urszula Demkow
- Department of Laboratory Diagnostics and Clinical Immunology of Developmental Age, Medical University of Warsaw, Warsaw, Poland
| | - Agnieszka Lech
- Department of Medical Microbiology, Medical University of Warsaw, Warsaw, Poland
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Daniel Rabczenko
- Department for Monitoring and Analysis of Population Health Status, National Institute of Public Health NIH - National Research Institute, Warsaw, Poland
| | - Magdalena Rosińska
- Department of Infectious Disease Epidemiology and Surveillance, National Institute of Public Health NIH - National Research Institute, Warsaw, Poland
| | - Anna Gambin
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| |
|
14
|
Priesemann V, Balling R, Bauer S, Beutels P, Valdez AC, Cuschieri S, Czypionka T, Dumpis U, Glaab E, Grill E, Hotulainen P, Iftekhar EN, Krutzinna J, Lionis C, Machado H, Martins C, McKee M, Pavlakis GN, Perc M, Petelos E, Pickersgill M, Prainsack B, Rocklöv J, Schernhammer E, Szczurek E, Tsiodras S, Van Gucht S, Willeit P. Towards a European strategy to address the COVID-19 pandemic. Lancet 2021; 398:838-839. [PMID: 34384539 PMCID: PMC8352491 DOI: 10.1016/s0140-6736(21)01808-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 08/01/2021] [Accepted: 08/03/2021] [Indexed: 12/12/2022]
Affiliation(s)
- Viola Priesemann
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany.
| | - Rudi Balling
- University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Simon Bauer
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | - Philippe Beutels
- Vaccine & Infectious Disease Institute, University of Antwerp, Belgium
| | | | - Sarah Cuschieri
- Faculty of Medicine and Surgery, University of Malta, Msida, Malta
| | | | - Uga Dumpis
- Pauls Stradins Clinical University Hospital, University of Latvia, Riga, Latvia
| | - Enrico Glaab
- University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Eva Grill
- Ludwig-Maximilians University, Munich, Germany
| | - Pirta Hotulainen
- Minerva Foundation Institute for Medical Research, Helsinki, Finland
| | - Emil N Iftekhar
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | | | - Christos Lionis
- Clinic of Social and Family Medicine, Faculty of Medicine, University of Crete, Crete, Greece; Institute of Health and Medicine, University of Linköping, Linköping, Sweden
| | | | - Carlos Martins
- Department of Community Medicine, Health Information and Decision Sciences of the Faculty of Medicine of the University of Porto, Porto, Portugal
| | - Martin McKee
- London School of Hygiene & Tropical Medicine, London, UK
| | | | - Matjaž Perc
- University of Maribor, Maribor, Slovenia; Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
| | - Elena Petelos
- Clinic of Social and Family Medicine, Faculty of Medicine, University of Crete, Heraklion, Greece; Faculty of Health, Medicine and Life Sciences, Maastricht University Maastricht, Maastricht, Netherlands
| | | | | | - Joacim Rocklöv
- Department of Public Health and Clinical Medicine, Section of Sustainable Health, Umeå University, Umeå, Sweden
| | | | | | - Sotirios Tsiodras
- National and Kapodistrian University of Athens Medical School, Athens, Greece
| | | | - Peter Willeit
- Medical University of Innsbruck, Innsbruck, Austria; University of Cambridge, Cambridge, UK
| |
|
15
|
Iftekhar EN, Priesemann V, Balling R, Bauer S, Beutels P, Calero Valdez A, Cuschieri S, Czypionka T, Dumpis U, Glaab E, Grill E, Hanson C, Hotulainen P, Klimek P, Kretzschmar M, Krüger T, Krutzinna J, Low N, Machado H, Martins C, McKee M, Mohr SB, Nassehi A, Perc M, Petelos E, Pickersgill M, Prainsack B, Rocklöv J, Schernhammer E, Staines A, Szczurek E, Tsiodras S, Van Gucht S, Willeit P. A look into the future of the COVID-19 pandemic in Europe: an expert consultation. Lancet Reg Health Eur 2021; 8:100185. [PMID: 34345876 PMCID: PMC8321710 DOI: 10.1016/j.lanepe.2021.100185] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Indexed: 02/07/2023]
Abstract
How will the coronavirus disease 2019 (COVID-19) pandemic develop in the coming months and years? Based on an expert survey, we examine key aspects that are likely to influence the COVID-19 pandemic in Europe. The challenges and developments will strongly depend on the progress of national and global vaccination programs, the emergence and spread of variants of concern (VOCs), and public responses to non-pharmaceutical interventions (NPIs). In the short term, many people remain unvaccinated, VOCs continue to emerge and spread, and mobility and population mixing are expected to increase. Therefore, lifting restrictions too much and too early risks another damaging wave. This challenge remains despite the reduced opportunities for transmission given vaccination progress and reduced indoor mixing in summer 2021. In autumn 2021, increased indoor activity might accelerate the spread again, whilst a necessary reintroduction of NPIs might be too slow. The incidence may strongly rise again, possibly filling intensive care units, if vaccination levels are not high enough. A moderate, adaptive level of NPIs will thus remain necessary. These epidemiological aspects combined with economic, social, and health-related consequences provide a more holistic perspective on the future of the COVID-19 pandemic.
Affiliation(s)
| | - Viola Priesemann
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | - Rudi Balling
- University of Luxembourg, Luxembourg, Luxembourg
| | - Simon Bauer
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
| | | | | | | | - Thomas Czypionka
- Institute for Advanced Studies, Vienna, Austria, and London School of Economics, London, UK
| | - Uga Dumpis
- Pauls Stradins Clinical University Hospital, University of Latvia, Riga, Latvia
| | - Enrico Glaab
- University of Luxembourg, Luxembourg, Luxembourg
| | - Eva Grill
- Ludwig-Maximilians-University München, München, Germany
| | - Claudia Hanson
- Karolinska Institute, Stockholm, Sweden, and London School of Hygiene & Tropical Medicine, London, UK
| | - Pirta Hotulainen
- Minerva Foundation Institute for Medical Research, Helsinki, Finland
| | - Peter Klimek
- Medical University of Vienna, Vienna, Austria, and Complexity Science Hub Vienna, Vienna, Austria
| | | | - Tyll Krüger
- Wroclaw University of Science and Technology, Wroclaw, Poland
| | | | | | - Helena Machado
- Institute for Social Sciences, University of Minho, Braga, Portugal
| | - Carlos Martins
- Department of Community Medicine, Health Information and Decision Sciences of the Faculty of Medicine, University of Porto, Porto, Portugal
| | - Martin McKee
- London School of Hygiene & Tropical Medicine, London, UK
| | | | - Armin Nassehi
- Ludwig-Maximilians-University München, München, Germany
| | - Matjaž Perc
- University of Maribor, Maribor, Slovenia, and Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
| | - Elena Petelos
- University of Crete, Crete, Greece, and Maastricht University, Maastricht, The Netherlands
| | | | - Barbara Prainsack
- Department of Political Science, University of Vienna, Vienna, Austria
| | - Joacim Rocklöv
- Department of Public Health and Clinical Medicine, Section of Sustainable Health, Umeå University, Umeå, Sweden
| | | | | | | | | | | | - Peter Willeit
- Medical University of Innsbruck, Innsbruck, Austria, and University of Cambridge, Cambridge, UK
| |
|
16
|
Koras K, Kizling E, Juraeva D, Staub E, Szczurek E. Interpretable deep recommender system model for prediction of kinase inhibitor efficacy across cancer cell lines. Sci Rep 2021; 11:15993. [PMID: 34362938 PMCID: PMC8346627 DOI: 10.1038/s41598-021-94564-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/22/2021] [Accepted: 07/06/2021] [Indexed: 01/02/2023] Open
Abstract
Computational models for drug sensitivity prediction have the potential to significantly improve personalized cancer medicine. Drug sensitivity assays, combined with profiling of cancer cell lines and drugs, are becoming increasingly available for training such models. Multiple methods have been proposed for predicting drug sensitivity from cancer cell line features, some in a multi-task fashion. So far, no such model has leveraged drug inhibition profiles. Importantly, multi-task models require a tailored approach to model interpretability. In this work, we develop DEERS, a neural network recommender system for kinase inhibitor sensitivity prediction. The model utilizes molecular features of the cancer cell lines and kinase inhibition profiles of the drugs. DEERS incorporates two autoencoders to project cell line and drug features into 10-dimensional hidden representations and a feed-forward neural network to combine them into response prediction. We propose a novel interpretability approach, which, in addition to the set of modeled features, also considers the genes and processes outside of this set. Our approach outperforms simpler matrix factorization models, achieving a correlation of R = 0.82 between true and predicted response for the unseen cell lines. The interpretability analysis identifies 67 biological processes that drive the cell line sensitivity to particular compounds. Detailed case studies are shown for PHA-793887, XMD14-99 and Dabrafenib.
Affiliation(s)
- Krzysztof Koras
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Ewa Kizling
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Dilafruz Juraeva
- Oncology Bioinformatics, Translational Medicine, Merck Healthcare KGaA, Darmstadt, Germany
| | - Eike Staub
- Oncology Bioinformatics, Translational Medicine, Merck Healthcare KGaA, Darmstadt, Germany
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland.
| |
|
17
|
Elmes K, Schmich F, Szczurek E, Jenkins J, Beerenwinkel N, Gavryushkin A. Learning epistatic gene interactions from perturbation screens. PLoS One 2021; 16:e0254491. [PMID: 34255784 PMCID: PMC8277066 DOI: 10.1371/journal.pone.0254491] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/28/2020] [Accepted: 06/28/2021] [Indexed: 11/21/2022] Open
Abstract
The treatment of complex diseases often relies on combinatorial therapy, a strategy where drugs are used to target multiple genes simultaneously. Promising candidate genes for combinatorial perturbation often constitute epistatic genes, i.e., genes which contribute to a phenotype in a non-linear fashion. Experimental identification of the full landscape of genetic interactions by perturbing all gene combinations is prohibitive due to the exponential growth of testable hypotheses. Here we present a model for the inference of pairwise epistatic, including synthetic lethal, gene interactions from siRNA-based perturbation screens. The model exploits the combinatorial nature of siRNA-based screens resulting from the high numbers of sequence-dependent off-target effects, where each siRNA apart from its intended target knocks down hundreds of additional genes. We show that conditional and marginal epistasis can be estimated as interaction coefficients of regression models on perturbation data. We compare two methods, namely glinternet and xyz, for selecting non-zero effects in high dimensions as components of the model, and make recommendations for the appropriate use of each. For data simulated from real RNAi screening libraries, we show that glinternet successfully identifies epistatic gene pairs with high accuracy across a wide range of relevant parameters for the signal-to-noise ratio of observed phenotypes, the effect size of epistasis and the number of observations per double knockdown. xyz is also able to identify interactions from lower dimensional data sets (fewer genes), but is less accurate for many dimensions. Higher accuracy of glinternet, however, comes at the cost of longer running time compared to xyz. The general model is widely applicable and allows mining the wealth of publicly available RNAi screening data for the estimation of epistatic interactions between genes. 
As a proof of concept, we apply the model to search for interactions, and potential targets for treatment, among previously published sets of siRNA perturbation screens on various pathogens. The identified interactions include both known epistatic interactions as well as novel findings.
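The idea of reading epistasis off an interaction coefficient can be illustrated on a toy case. The sketch below assumes a single gene pair with binary knockdown indicators and the model y = b0 + b1*x1 + b2*x2 + b12*x1*x2. It does not reproduce the paper's setting, where siRNA off-target effects perturb many genes at once and glinternet or xyz select non-zero effects in high dimensions. The function name and the data are hypothetical.

```python
from statistics import mean


def interaction_coefficient(observations):
    """Estimate b12 in y = b0 + b1*x1 + b2*x2 + b12*x1*x2.

    observations: iterable of (x1, x2, y) with x1, x2 in {0, 1}, where
    x1 and x2 flag whether gene 1 and gene 2 were knocked down.
    For this saturated two-factor model, the least-squares estimate of
    b12 equals the difference-in-differences of the four cell means.
    """
    cells = {(0, 0): [], (1, 0): [], (0, 1): [], (1, 1): []}
    for x1, x2, y in observations:
        cells[(x1, x2)].append(y)
    if any(len(values) == 0 for values in cells.values()):
        raise ValueError("each knockdown combination needs at least one observation")
    m = {key: mean(values) for key, values in cells.items()}
    return m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]


# Made-up phenotype readouts for illustration (e.g. relative viability).
toy_screen = [
    (0, 0, 1.0), (0, 0, 1.0),
    (1, 0, 0.8), (1, 0, 0.8),
    (0, 1, 0.9), (0, 1, 0.9),
    (1, 1, 0.2), (1, 1, 0.2),
]
b12 = interaction_coefficient(toy_screen)
```

In this toy data the double knockdown lowers the phenotype far more than the two single knockdowns would predict additively, so the estimated interaction coefficient is negative, the pattern expected for a synthetic lethal pair.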
Affiliation(s)
- Kieran Elmes
- Department of Computer Science, University of Otago, Dunedin, New Zealand
| | - Fabian Schmich
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Ewa Szczurek
- Institute of Informatics, University of Warsaw, Warsaw, Poland
| | - Jeremy Jenkins
- Novartis Institutes for BioMedical Research, Cambridge, Massachusetts, United States of America
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Alex Gavryushkin
- Department of Computer Science, University of Otago, Dunedin, New Zealand
- School of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand
| |
|
18
|
Priesemann V, Brinkmann MM, Ciesek S, Cuschieri S, Czypionka T, Giordano G, Hanson C, Hens N, Iftekhar E, Klimek P, Kretzschmar M, Peichl A, Perc M, Sannino F, Schernhammer E, Schmidt A, Staines A, Szczurek E. Call for a pan-European COVID-19 response must be comprehensive - Authors' reply. Lancet 2021; 397:1541. [PMID: 33894827 PMCID: PMC9754104 DOI: 10.1016/s0140-6736(21)00462-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2021] [Accepted: 02/16/2021] [Indexed: 11/28/2022]
Affiliation(s)
- Viola Priesemann
- Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany.
| | - Melanie M Brinkmann
- Technische Universität Braunschweig, Helmholtz Zentrum für Infektionsforschung, Braunschweig, Germany
| | - Sandra Ciesek
- University Hospital, Goethe-University Frankfurt, Frankfurt, Germany
| | - Sarah Cuschieri
- Faculty of Medicine & Surgery, University of Malta, Msida, Malta
| | - Thomas Czypionka
- Institute for Advanced Studies, Vienna, Austria; London School of Economics and Political Science, London, UK
| | | | - Claudia Hanson
- London School of Hygiene & Tropical Medicine, London, UK; Karolinska Institute, Stockholm, Sweden
| | - Niel Hens
- I-BioStat, Data Science Institute, Hasselt University, Hasselt, Belgium; Centre for Health Economic Research and Modelling Infectious Diseases, Vaccine and Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
| | - Emil Iftekhar
- Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany; University Medical Center Utrecht, Utrecht, Netherlands
| | - Peter Klimek
- Medical University of Vienna, Vienna, Austria; Complexity Science Hub Vienna, Vienna, Austria
| | | | - Andreas Peichl
- ifo Institute, Leibniz Institute for Economic Research, University of Munich, Munich, Germany
| | | | - Francesco Sannino
- Federico II University of Napoli, Napoli, Italy; Centre of Excellence for Particle Physics and Cosmology and Danish Institute for Advanced Study, University of Southern Denmark, Aarhus, Denmark
| | - Eva Schernhammer
- Department of Epidemiology, Center for Public Health, Medical University of Vienna, Vienna, Austria
| | - Alexander Schmidt
- Campus Institute for Dynamics of Biological Networks, Göttingen, Germany; Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany
| | - Anthony Staines
- School of Nursing, Psychotherapy and Community Health, Dublin City University, Dublin, Ireland
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
|
19
|
Shafighi SD, Kiełbasa SM, Sepúlveda-Yáñez J, Monajemi R, Cats D, Mei H, Menafra R, Kloet S, Veelken H, van Bergen CAM, Szczurek E. CACTUS: integrating clonal architecture with genomic clustering and transcriptome profiling of single tumor cells. Genome Med 2021; 13:45. [PMID: 33761980 PMCID: PMC7988935 DOI: 10.1186/s13073-021-00842-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/15/2020] [Accepted: 02/03/2021] [Indexed: 01/13/2023] Open
Abstract
Background: Drawing genotype-to-phenotype maps in tumors is of paramount importance for understanding tumor heterogeneity. Assignment of single cells to their tumor clones of origin can be approached by matching the genotypes of the clones to the mutations found in RNA sequencing of the cells. The confidence of the cell-to-clone mapping can be increased by accounting for additional measurements. Follicular lymphoma, a malignancy of mature B cells that continuously acquire mutations in parallel in the exome and in B cell receptor loci, presents a unique opportunity to join exome-derived mutations with B cell receptor sequences as independent sources of evidence for clonal evolution. Methods: Here, we propose CACTUS, a probabilistic model that leverages the information from an independent genomic clustering of cells and exploits the scarce single-cell RNA sequencing data to map single cells to given imperfect genotypes of tumor clones. Results: We apply CACTUS to two follicular lymphoma patient samples, integrating three measurements: whole exome, single-cell RNA, and B cell receptor sequencing. CACTUS outperforms a predecessor model by confidently assigning cells and B cell receptor-based clusters to the tumor clones. Conclusions: The integration of independent measurements increases model certainty and is the key to improving model performance in the challenging task of charting the genotype-to-phenotype maps in tumors. CACTUS opens the avenue to study the functional implications of tumor heterogeneity and the origins of resistance to targeted therapies. CACTUS is written in R, and the source code, along with all supporting files, is available on GitHub (https://github.com/LUMC/CACTUS). Supplementary Information: The online version contains supplementary material available at 10.1186/s13073-021-00842-w.
Affiliation(s)
- Shadi Darvish Shafighi
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Stefana Banacha 2, Warsaw, 02-097, Poland
| | - Szymon M Kiełbasa
- Department of Biomedical Data Sciences, Leiden University Medical Center, Einthovenweg 20, Leiden, 2333 ZC, The Netherlands
| | - Julieta Sepúlveda-Yáñez
- Department of Hematology, Leiden University Medical Center, Albinusdreef 2, Leiden, 2333 ZA, The Netherlands
| | - Ramin Monajemi
- Department of Biomedical Data Sciences, Leiden University Medical Center, Einthovenweg 20, Leiden, 2333 ZC, The Netherlands
| | - Davy Cats
- Department of Biomedical Data Sciences, Leiden University Medical Center, Einthovenweg 20, Leiden, 2333 ZC, The Netherlands
| | - Hailiang Mei
- Department of Biomedical Data Sciences, Leiden University Medical Center, Einthovenweg 20, Leiden, 2333 ZC, The Netherlands
| | - Roberta Menafra
- Leiden Genome Technology Center, Leiden University Medical Center, Einthovenweg 20, Leiden, 2333 ZC, The Netherlands
| | - Susan Kloet
- Leiden Genome Technology Center, Leiden University Medical Center, Einthovenweg 20, Leiden, 2333 ZC, The Netherlands
| | - Hendrik Veelken
- Department of Hematology, Leiden University Medical Center, Albinusdreef 2, Leiden, 2333 ZA, The Netherlands
| | - Cornelis A M van Bergen
- Department of Hematology, Leiden University Medical Center, Albinusdreef 2, Leiden, 2333 ZA, The Netherlands
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Stefana Banacha 2, Warsaw, 02-097, Poland
|
20
|
Priesemann V, Balling R, Brinkmann MM, Ciesek S, Czypionka T, Eckerle I, Giordano G, Hanson C, Hel Z, Hotulainen P, Klimek P, Nassehi A, Peichl A, Perc M, Petelos E, Prainsack B, Szczurek E. An action plan for pan-European defence against new SARS-CoV-2 variants. Lancet 2021; 397:469-470. [PMID: 33485462 PMCID: PMC7825950 DOI: 10.1016/s0140-6736(21)00150-1] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 01/14/2021] [Accepted: 01/19/2021] [Indexed: 12/14/2022]
Affiliation(s)
- Viola Priesemann
- Max-Planck-Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany.
| | - Rudi Balling
- University of Luxembourg, Luxembourg, Luxembourg
| | - Melanie M Brinkmann
- Technische Universität Braunschweig, Helmholtz Zentrum für Infektionsforschung, Braunschweig, Germany
| | - Sandra Ciesek
- University Hospital, Goethe-University Frankfurt, Frankfurt, Germany
| | - Thomas Czypionka
- Institute for Advanced Studies, Vienna, Austria; London School of Economics and Political Science, London, UK
| | | | | | - Claudia Hanson
- London School of Hygiene & Tropical Medicine, London, UK; Karolinska Institute, Stockholm, Sweden
| | - Zdenek Hel
- University of Alabama at Birmingham, Birmingham, AL, USA
| | - Pirta Hotulainen
- Minerva Foundation Institute for Medical Research, Helsinki, Finland
| | - Peter Klimek
- Medical University of Vienna, Vienna, Austria; Complexity Science Hub Vienna, Vienna, Austria
| | - Armin Nassehi
- Ludwig-Maximilian-Universität München, Munich, Germany
| | - Andreas Peichl
- ifo Institute, Leibniz Institute for Economic Research, University of Munich, Munich, Germany
| | - Matjaz Perc
- University of Maribor, Maribor, Slovenia; Alma Mater Europaea, Maribor, Slovenia
| | | | - Barbara Prainsack
- Department of Political Science, University of Vienna, Vienna, Austria
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
|
21
|
Priesemann V, Brinkmann MM, Ciesek S, Cuschieri S, Czypionka T, Giordano G, Gurdasani D, Hanson C, Hens N, Iftekhar E, Kelly-Irving M, Klimek P, Kretzschmar M, Peichl A, Perc M, Sannino F, Schernhammer E, Schmidt A, Staines A, Szczurek E. Calling for pan-European commitment for rapid and sustained reduction in SARS-CoV-2 infections. Lancet 2021; 397:92-93. [PMID: 33347811 PMCID: PMC7833270 DOI: 10.1016/s0140-6736(20)32625-8] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 11/29/2020] [Accepted: 11/30/2020] [Indexed: 01/19/2023]
Affiliation(s)
- Viola Priesemann
- Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany.
| | - Melanie M Brinkmann
- Technische Universität Braunschweig, Helmholtz Zentrum für Infektionsforschung, Braunschweig, Germany
| | - Sandra Ciesek
- University Hospital, Goethe-University Frankfurt, Frankfurt, Germany
| | - Sarah Cuschieri
- Faculty of Medicine & Surgery, University of Malta, Msida, Malta
| | - Thomas Czypionka
- Institute for Advanced Studies, Vienna, Austria; London School of Economics, London, UK
| | | | | | - Claudia Hanson
- London School of Hygiene & Tropical Medicine, London, UK; Karolinska Institute, Stockholm, Sweden
| | - Niel Hens
- I-BioStat, Data Science Institute, Hasselt University, Hasselt, Belgium; Centre for Health Economic Research and Modelling Infectious Diseases, Vaccine and Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
| | - Emil Iftekhar
- Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany; University Medical Center Utrecht, Utrecht, Netherlands
| | | | - Peter Klimek
- Medical University of Vienna, Vienna, Austria; Complexity Science Hub Vienna, Vienna, Austria
| | | | - Andreas Peichl
- ifo Institute, Leibniz Institute for Economic Research at the University of Munich, Munich, Germany
| | | | - Francesco Sannino
- Federico II University of Napoli, Napoli, Italy; Centre of Excellence for Particle Physics and Cosmology and Danish Institute for Advanced Study, University of Southern Denmark, Aarhus, Denmark
| | - Eva Schernhammer
- Department of Epidemiology, Center for Public Health, Medical University of Vienna, Austria
| | - Alexander Schmidt
- Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany; Campus Institute for Dynamics of Biological Networks, Göttingen, Germany
| | - Anthony Staines
- School of Nursing, Psychotherapy and Community Health, Dublin City University, Dublin, Ireland
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
|
22
|
Szczurek E, Krüger T, Klink B, Beerenwinkel N. A mathematical model of the metastatic bottleneck predicts patient outcome and response to cancer treatment. PLoS Comput Biol 2020; 16:e1008056. [PMID: 33006977 PMCID: PMC7591057 DOI: 10.1371/journal.pcbi.1008056] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/27/2020] [Revised: 10/27/2020] [Accepted: 06/15/2020] [Indexed: 12/20/2022] Open
Abstract
Metastases are the main reason for cancer-related deaths. Initiation of metastases, where newly seeded tumor cells expand into colonies, presents a tremendous bottleneck to metastasis formation. Despite its importance, a quantitative description of metastasis initiation and its clinical implications is lacking. Here, we set theoretical grounds for the metastatic bottleneck with a simple stochastic model. The model assumes that the proliferation-to-death rate ratio for the initiating metastatic cells increases when they are surrounded by more of their kind. For a total of 159,191 patients across 13 cancer types, we found that a single cell has an extremely low median probability of successful seeding of the order of 10⁻⁸. With increasing colony size, a sharp transition from very unlikely to very likely successful metastasis initiation occurs. The median metastatic bottleneck, defined as the critical colony size that marks this transition, was between 10 and 21 cells. We derived the probability of metastasis occurrence and patient outcome based on primary tumor size at diagnosis and tumor type. The model predicts that the efficacy of patient treatment depends on the primary tumor size but even more so on the severity of the metastatic bottleneck, which is estimated to largely vary between patients. We find that medical interventions aiming at tightening the bottleneck, such as immunotherapy, can be much more efficient than therapies that decrease overall tumor burden, such as chemotherapy.
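The idea of a size-dependent bottleneck can be illustrated with a minimal sketch. This is not the authors' model or code: it assumes a plain birth-death chain on colony size, uses the standard gambler's-ruin formula for reaching a target size before extinction, and the saturating form of `hypothetical_ratio` with its parameters `r_max` and `half_size` is invented purely for illustration.

```python
def seeding_success_probability(start_size, target_size, ratio):
    """Probability that a birth-death chain started at `start_size` cells
    reaches `target_size` before going extinct (size 0).

    `ratio(n)` is the proliferation-to-death rate ratio of a colony of n cells.
    """
    # terms[k] = prod_{j=1..k} (death_j / birth_j) = prod_{j=1..k} 1 / ratio(j)
    terms = [1.0]
    for n in range(1, target_size):
        terms.append(terms[-1] / ratio(n))
    # Standard result: P(start) = sum_{k<start} terms[k] / sum_{k<target} terms[k]
    return sum(terms[:start_size]) / sum(terms)


def hypothetical_ratio(n, r_max=2.0, half_size=10.0):
    """Illustrative assumption: the ratio grows with colony size and saturates.
    It is below 1 for small colonies and crosses 1 at n = half_size."""
    return r_max * n / (n + half_size)


def bottleneck_size(target_size, ratio, threshold=0.5):
    """Smallest colony size whose success probability reaches `threshold`."""
    for size in range(1, target_size):
        if seeding_success_probability(size, target_size, ratio) >= threshold:
            return size
    return target_size


if __name__ == "__main__":
    for size in (1, 5, 10, 20, 40):
        p = seeding_success_probability(size, 200, hypothetical_ratio)
        print(f"colony size {size:3d}: success probability {p:.3g}")
    print("critical size:", bottleneck_size(200, hypothetical_ratio))
```

Under these assumptions the success probability rises steeply once the colony passes the size at which proliferation starts to outpace death, which is the qualitative transition the abstract describes.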
Affiliation(s)
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Tyll Krüger
- Faculty of Electronics, Wrocław University of Science and Technology, Wrocław, Poland
| | - Barbara Klink
- Institute for Clinical Genetics, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- National Center of Genetics, Laboratoire national de santé, Dudelange, Luxembourg
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
23
|
Abstract
Motivation: Perturbation experiments constitute the central means to study cellular networks. Several confounding factors complicate computational modeling of signaling networks from these data. First, the technique of RNA interference (RNAi), designed and commonly used to knock down specific genes, suffers from off-target effects. As a result, each experiment is a combinatorial perturbation of multiple genes. Second, the perturbations propagate along unknown connections in the signaling network. Once the signal is blocked by perturbation, proteins downstream of the targeted proteins also become inactivated. Finally, all perturbed network members, either directly targeted by the experiment or affected by propagation in the network, contribute to the observed effect, either in a positive or negative manner. One of the key questions of computational inference of signaling networks from such data is: how many and what combinations of perturbations are required to uniquely and accurately infer the model? Results: Here, we introduce an enhanced version of linear effects models (LEMs), which extends the original by accounting for both negative and positive contributions of the perturbed network proteins to the observed phenotype. We prove that the enhanced LEMs are identified from data measured under perturbations of all single proteins, pairs and triplets of network proteins. For small networks of up to five nodes, only perturbations of single proteins and pairs of proteins are required for identifiability. Extensive simulations demonstrate that enhanced LEMs achieve excellent accuracy of parameter estimation and network structure learning, outperforming the previous version on realistic data. LEMs applied to Bartonella henselae infection RNAi screening data identified known interactions between eight nodes of the infection network, confirming the high specificity of our model, and suggested one new interaction.
Availability and implementation: https://github.com/EwaSzczurek/LEM Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jerzy Tiuryn
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
|
24
|
Lähnemann D, Köster J, Szczurek E, McCarthy DJ, Hicks SC, Robinson MD, Vallejos CA, Campbell KR, Beerenwinkel N, Mahfouz A, Pinello L, Skums P, Stamatakis A, Attolini CSO, Aparicio S, Baaijens J, Balvert M, Barbanson BD, Cappuccio A, Corleone G, Dutilh BE, Florescu M, Guryev V, Holmer R, Jahn K, Lobo TJ, Keizer EM, Khatri I, Kielbasa SM, Korbel JO, Kozlov AM, Kuo TH, Lelieveldt BP, Mandoiu II, Marioni JC, Marschall T, Mölder F, Niknejad A, Rączkowska A, Reinders M, Ridder JD, Saliba AE, Somarakis A, Stegle O, Theis FJ, Yang H, Zelikovsky A, McHardy AC, Raphael BJ, Shah SP, Schönhuth A. Eleven grand challenges in single-cell data science. Genome Biol 2020; 21:31. [PMID: 32033589 PMCID: PMC7007675 DOI: 10.1186/s13059-020-1926-6] [Citation(s) in RCA: 534] [Impact Index Per Article: 133.5] [Received: 08/02/2019] [Accepted: 01/02/2020] [Indexed: 02/08/2023] Open
Abstract
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
Affiliation(s)
- David Lähnemann
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Department of Paediatric Oncology, Haematology and Immunology, Medical Faculty, Heinrich Heine University, University Hospital, Düsseldorf, Germany
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Johannes Köster
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
| | - Ewa Szczurek
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
| | - Davis J. McCarthy
- Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia
- Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia
| | - Stephanie C. Hicks
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD USA
| | - Mark D. Robinson
- Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
| | - Catalina A. Vallejos
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, UK
- The Alan Turing Institute, British Library, London, UK
| | - Kieran R. Campbell
- Department of Statistics, University of British Columbia, Vancouver, Canada
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Data Science Institute, University of British Columbia, Vancouver, Canada
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Ahmed Mahfouz
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Luca Pinello
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, USA
- Department of Pathology, Harvard Medical School, Boston, USA
- Broad Institute of Harvard and MIT, Cambridge, MA USA
| | - Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, USA
| | - Alexandros Stamatakis
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | | | - Samuel Aparicio
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
| | - Jasmijn Baaijens
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
| | - Marleen Balvert
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
| | - Buys de Barbanson
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
- Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
| | - Antonio Cappuccio
- Institute for Advanced Study, University of Amsterdam, Amsterdam, The Netherlands
| | - Giacomo Corleone
- Department of Surgery and Cancer, The Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK
| | - Bas E. Dutilh
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
- Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Maria Florescu
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
- Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
| | - Victor Guryev
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Rens Holmer
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Katharina Jahn
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Thamar Jessurun Lobo
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Emma M. Keizer
- Biometris, Wageningen University & Research, Wageningen, The Netherlands
| | - Indu Khatri
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Center, Leiden, The Netherlands
| | - Szymon M. Kielbasa
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Jan O. Korbel
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Alexey M. Kozlov
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
| | - Tzu-Hao Kuo
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Boudewijn P.F. Lelieveldt
- PRB lab, Delft University of Technology, Delft, The Netherlands
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Ion I. Mandoiu
- Computer Science & Engineering Department, University of Connecticut, Storrs, USA
| | - John C. Marioni
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Tobias Marschall
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- Max Planck Institute for Informatics, Saarbrücken, Germany
| | - Felix Mölder
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Institute of Pathology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| | - Amir Niknejad
- Computation molecular design, Zuse Institute Berlin, Berlin, Germany
- Mathematics Department, Mount Saint Vincent, New York, USA
| | - Alicja Rączkowska
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
| | - Marcel Reinders
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Jeroen de Ridder
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
| | - Antoine-Emmanuel Saliba
- Helmholtz Institute for RNA-based Infection Research, Helmholtz-Center for Infection Research, Würzburg, Germany
| | - Antonios Somarakis
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Oliver Stegle
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center–DKFZ, Heidelberg, Germany
| | - Fabian J. Theis
- Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
| | - Huan Yang
- Division of Drug Discovery and Safety, Leiden Academic Center for Drug Research–LACDR–Leiden University, Leiden, The Netherlands
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
| | - Alice C. McHardy
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | | | - Sohrab P. Shah
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA
| | - Alexander Schönhuth
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
|
25
|
Rączkowska A, Możejko M, Zambonelli J, Szczurek E. ARA: accurate, reliable and active histopathological image classification framework with Bayesian deep learning. Sci Rep 2019; 9:14347. [PMID: 31586139 PMCID: PMC6778075 DOI: 10.1038/s41598-019-50587-1] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 04/23/2019] [Accepted: 09/16/2019] [Indexed: 02/07/2023] Open
Abstract
Machine learning algorithms hold the promise to effectively automate the analysis of histopathological images that are routinely generated in clinical practice. Any machine learning method used in the clinical diagnostic process has to be extremely accurate and, ideally, provide a measure of uncertainty for its predictions. Such accurate and reliable classifiers need enough labelled data for training, which requires time-consuming and costly manual annotation by pathologists. Thus, it is critical to minimise the amount of data needed to reach the desired accuracy by maximising the efficiency of training. We propose an accurate, reliable and active (ARA) image classification framework and introduce a new Bayesian Convolutional Neural Network (ARA-CNN) for classifying histopathological images of colorectal cancer. The model achieves exceptional classification accuracy, outperforming other models trained on the same dataset. The network outputs an uncertainty measurement for each tested image. We show that uncertainty measures can be used to detect mislabelled training samples and can be employed in an efficient active learning workflow. Using a variational dropout-based entropy measure of uncertainty in the workflow speeds up the learning process by roughly 45%. Finally, we utilise our model to segment whole-slide images of colorectal tissue and compute segmentation-based spatial statistics.
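An entropy-based uncertainty measure of the kind mentioned above can be sketched as follows. This is an illustrative assumption rather than the ARA-CNN implementation: it takes class-probability vectors from several stochastic (dropout-enabled) forward passes as plain Python lists, averages them, and computes the Shannon entropy of the mean. The function names `predictive_entropy` and `most_uncertain` are hypothetical.

```python
import math


def predictive_entropy(mc_probs):
    """Entropy of the mean class distribution over stochastic forward passes.

    `mc_probs` is a list of T probability vectors (one per pass), each a list
    of class probabilities summing to 1.
    """
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(q * math.log(q) for q in mean if q > 0.0)


def most_uncertain(pool, k):
    """Indices of the k pool items with the highest predictive entropy,
    i.e. candidates to send for annotation in an active learning round."""
    scores = [predictive_entropy(samples) for samples in pool]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    pool = [
        [[0.9, 0.1], [0.95, 0.05]],  # passes agree: low uncertainty
        [[0.9, 0.1], [0.1, 0.9]],    # passes disagree: high uncertainty
    ]
    print(most_uncertain(pool, 1))
```

Ranking unlabelled images by such a score and labelling the top of the list first is one common way to build an active learning loop; the same score can flag training samples whose labels deserve a second look.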
Affiliation(s)
- Alicja Rączkowska
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Marcin Możejko
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Joanna Zambonelli
- Department of Pathology, Medical University of Warsaw, Warsaw, Poland
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland.
|
26
|
Matlak D, Szczurek E. Correction: Epistasis in genomic and survival data of cancer patients. PLoS Comput Biol 2019; 15:e1006887. [PMID: 30811379 PMCID: PMC6392309 DOI: 10.1371/journal.pcbi.1006887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
27
|
Abstract
Background: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically have poor survival. Here, we evaluate the potential and limitations of utilising gene alteration data from tumour DNA to identify cancer types. Methods: Using sequenced tumour DNA downloaded via the cBioPortal for Cancer Genomics, we collected the presence or absence of calls for gene alterations for 6640 tumour samples spanning 28 cancer types, as predictive features. We employed three machine-learning techniques, namely linear support vector machines with recursive feature selection, L1-regularised logistic regression and random forest, to select a small subset of gene alterations that are most informative for cancer-type prediction. We then evaluated the predictive performance of the models in a comparative manner. Results: We found the linear support vector machine to be the most predictive model of cancer type from gene alterations. Using only 100 somatic point-mutated genes for prediction, we achieved an overall accuracy of 49.4±0.4 % (95 % confidence interval). We observed a marked increase in the accuracy when copy number alterations are included as predictors. With a combination of somatic point mutations and copy number alterations, a mere 50 genes are enough to yield an overall accuracy of 77.7±0.3 %. Conclusions: A general cancer diagnostic tool that utilises either only somatic point mutations or only copy number alterations is not sufficient for distinguishing a broad range of cancer types. The combination of both gene alteration types can dramatically improve the performance. Electronic supplementary material: The online version of this article (doi:10.1186/s13073-017-0493-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Kee Pang Soh
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, Basel, 4058, Switzerland; Saw Swee Hock School of Public Health, National University of Singapore, Tahir Foundation Building, 12 Science Drive 2 MD1, Singapore, 117549, Singapore
| | - Ewa Szczurek
- Institute of Informatics, University of Warsaw, Banacha 2, Warsaw, 02-097, Poland
| | - Thomas Sakoparnig
- Biozentrum, University of Basel, Klingelbergstrasse 50/70, Basel, 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, 4058, Switzerland
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, Basel, 4058, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, 4058, Switzerland
|
28
|
Abstract
Motivation: Perturbations constitute the central means to study signaling pathways. Interrupting components of the pathway and analyzing observed effects of those interruptions can give insight into unknown connections within the signaling pathway itself, as well as the link from the pathway to the effects. Different pathway components may have different individual contributions to the measured perturbation effects, such as gene expression changes. Those effects will be observed in combination when the pathway components are perturbed. Extant approaches focus either on the reconstruction of pathway structure or on resolving how the pathway components control the downstream effects. Results: Here, we propose a linear effects model, which can be applied to solve both these problems from combinatorial perturbation data. We use simulated data to demonstrate the accuracy of learning the pathway structure as well as estimation of the individual contributions of pathway components to the perturbation effects. The practical utility of our approach is illustrated by an application to perturbations of the mitogen-activated protein kinase pathway in Saccharomyces cerevisiae. Availability and Implementation: lem is available as an R package at http://www.mimuw.edu.pl/~szczurek/lem. Contact: szczurek@mimuw.edu.pl; niko.beerenwinkel@bsse.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics
| |
|
29
|
Abstract
Cancer aggressiveness and its effect on patient survival depend on mutations in the tumor genome. Epistatic interactions between the mutated genes may guide the choice of anticancer therapy and set predictive factors of its success. Inhibitors targeting synthetic lethal partners of genes mutated in tumors are already utilized for efficient and specific treatment in the clinic. The space of possible epistatic interactions, however, is overwhelming, and computational methods are needed to limit the experimental effort of validating the interactions for therapy and characterizing their biomarkers. Here, we introduce SurvLRT, a statistical likelihood ratio test for identifying epistatic gene pairs and triplets from cancer patient genomic and survival data. Compared to established approaches, SurvLRT performed favorably in predicting known, experimentally verified synthetic lethal partners of PARP1 from TCGA data. Our approach is the first to test for epistasis between triplets of genes to identify biomarkers of synthetic lethality-based therapy. SurvLRT proved successful in identifying the known gene TP53BP1 as the biomarker of success of the therapy targeting PARP in BRCA1-deficient tumors. A search for other biomarkers for the same interaction revealed a region whose deletion was a more significant biomarker than deletion of TP53BP1. With the ability to detect not only pairwise but also twelve different types of triple epistasis, the applicability of SurvLRT goes beyond cancer therapy, to the level of characterization of shapes of fitness landscapes.
Genomic alterations in tumors affect the fitness of tumor cells, controlling how well they replicate and survive compared to other cells. The landscape of tumor fitness is shaped by epistasis. Epistasis occurs when the contribution of gene alterations to the total fitness is non-linear. The type of epistatic genetic interaction with great potential for cancer therapy is synthetic lethality. Inhibitors targeting synthetic lethal partners of genes mutated in tumors can selectively kill tumor cells but not normal cells. Therapy based on synthetic lethality is, however, context dependent, and it is crucial to identify its biomarkers. Unfortunately, the space of possible interactions and their biomarkers is overwhelming for experimental validation. Computational pre-selection methods are required to limit the experimental effort. Here, we introduce a statistical approach called SurvLRT, for the identification of epistatic gene pairs and triplets based on patient genomic and survival data. First, we show that using SurvLRT, we can deliver synthetic lethal interactions of pairs of genes that are specific to cancer. Second, we demonstrate the applicability of SurvLRT to identify biomarkers for synthetic lethality, such as mutational status of other genes that can alleviate the synthetic effect.
Affiliation(s)
- Dariusz Matlak
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| |
|
30
|
Schmich F, Szczurek E, Kreibich S, Dilling S, Andritschke D, Casanova A, Low SH, Eicher S, Muntwiler S, Emmenlauer M, Rämö P, Conde-Alvarez R, von Mering C, Hardt WD, Dehio C, Beerenwinkel N. Erratum to: gespeR: a statistical model for deconvoluting off-target-confounded RNA interference screens. Genome Biol 2015; 16:233. [PMID: 26490954 PMCID: PMC4618351 DOI: 10.1186/s13059-015-0807-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/16/2015] [Accepted: 10/16/2015] [Indexed: 11/18/2022] Open
Affiliation(s)
- Fabian Schmich
- Department of Biosystems Science and Engineering, ETH, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Ewa Szczurek
- Department of Biosystems Science and Engineering, ETH, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | | | | | | | | | | | | | | | | | - Pauli Rämö
- Biozentrum, University of Basel, Basel, Switzerland
| | - Raquel Conde-Alvarez
- Institute for Tropical Health and Departamento de Microbiología y Parasitología, Universidad de Navarra, Pamplona, Spain
| | - Christian von Mering
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
| | | | | | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| |
|
31
|
Schmich F, Szczurek E, Kreibich S, Dilling S, Andritschke D, Casanova A, Low SH, Eicher S, Muntwiler S, Emmenlauer M, Rämö P, Conde-Alvarez R, von Mering C, Hardt WD, Dehio C, Beerenwinkel N. gespeR: a statistical model for deconvoluting off-target-confounded RNA interference screens. Genome Biol 2015; 16:220. [PMID: 26445817 PMCID: PMC4597449 DOI: 10.1186/s13059-015-0783-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 07/23/2015] [Accepted: 09/16/2015] [Indexed: 12/31/2022] Open
Abstract
Small interfering RNAs (siRNAs) exhibit strong off-target effects, which confound the gene-level interpretation of RNA interference screens and thus limit their utility for functional genomics studies. Here, we present gespeR, a statistical model for reconstructing individual, gene-specific phenotypes. Using 115,878 siRNAs, single and pooled, from three companies in three pathogen infection screens, we demonstrate that deconvolution of image-based phenotypes substantially improves the reproducibility between independent siRNA sets targeting the same genes. Genes selected and prioritized by gespeR are validated and shown to constitute biologically relevant components of pathogen entry mechanisms and TGF-β signaling. gespeR is available as a Bioconductor R-package.
Affiliation(s)
- Fabian Schmich
- Department of Biosystems Science and Engineering, ETH, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.
| | - Ewa Szczurek
- Department of Biosystems Science and Engineering, ETH, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.
| | | | | | | | | | | | - Simone Eicher
- Biozentrum, University of Basel, Basel, Switzerland.
| | | | | | - Pauli Rämö
- Biozentrum, University of Basel, Basel, Switzerland.
| | - Raquel Conde-Alvarez
- Institute for Tropical Health and Departamento de Microbiología y Parasitología, Universidad de Navarra, Pamplona, Spain.
| | - Christian von Mering
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.
| | | | | | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.
| |
|
32
|
Constantinescu S, Szczurek E, Mohammadi P, Rahnenführer J, Beerenwinkel N. TiMEx: a waiting time model for mutually exclusive cancer alterations. Bioinformatics 2015; 32:968-75. [PMID: 26163509 DOI: 10.1093/bioinformatics/btv400] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 03/19/2015] [Accepted: 06/26/2015] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Despite recent technological advances in genomic sciences, our understanding of cancer progression and its driving genetic alterations remains incomplete. RESULTS: We introduce TiMEx, a generative probabilistic model for detecting patterns of various degrees of mutual exclusivity across genetic alterations, which can indicate pathways involved in cancer progression. TiMEx explicitly accounts for the temporal interplay between the waiting times to alterations and the observation time. In simulation studies, we show that our model outperforms previous methods for detecting mutual exclusivity. On large-scale biological datasets, TiMEx identifies gene groups with strong functional biological relevance, while also proposing new candidates for biological validation. TiMEx possesses several advantages over previous methods, including a novel generative probabilistic model of tumorigenesis, direct estimation of the probability of mutual exclusivity interaction, computational efficiency and high sensitivity in detecting gene groups involving low-frequency alterations. AVAILABILITY AND IMPLEMENTATION: TiMEx is available as a Bioconductor R package at www.bsse.ethz.ch/cbg/software/TiMEx. CONTACT: niko.beerenwinkel@bsse.ethz.ch. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Simona Constantinescu
- Department of Biosystems Science and Engineering, ETH Zürich, Swiss Institute of Bioinformatics, Basel 4058, Switzerland
| | - Ewa Szczurek
- Department of Biosystems Science and Engineering, ETH Zürich, Swiss Institute of Bioinformatics, Basel 4058, Switzerland
| | - Pejman Mohammadi
- Department of Biosystems Science and Engineering, ETH Zürich, Swiss Institute of Bioinformatics, Basel 4058, Switzerland
| | - Jörg Rahnenführer
- Faculty of Statistics, Technische Universität Dortmund, Dortmund 44221, Germany
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zürich, Swiss Institute of Bioinformatics, Basel 4058, Switzerland
| |
|
33
|
Rämö P, Drewek A, Arrieumerlou C, Beerenwinkel N, Ben-Tekaya H, Cardel B, Casanova A, Conde-Alvarez R, Cossart P, Csúcs G, Eicher S, Emmenlauer M, Greber U, Hardt WD, Helenius A, Kasper C, Kaufmann A, Kreibich S, Kühbacher A, Kunszt P, Low SH, Mercer J, Mudrak D, Muntwiler S, Pelkmans L, Pizarro-Cerdá J, Podvinec M, Pujadas E, Rinn B, Rouilly V, Schmich F, Siebourg-Polster J, Snijder B, Stebler M, Studer G, Szczurek E, Truttmann M, von Mering C, Vonderheit A, Yakimovich A, Bühlmann P, Dehio C. Simultaneous analysis of large-scale RNAi screens for pathogen entry. BMC Genomics 2014; 15:1162. [PMID: 25534632 PMCID: PMC4326433 DOI: 10.1186/1471-2164-15-1162] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 07/11/2014] [Accepted: 12/12/2014] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: Large-scale RNAi screening has become an important technology for identifying genes involved in biological processes of interest. However, the quality of large-scale RNAi screening is often degraded by off-target effects. In order to find statistically significant effector genes for pathogen entry, we systematically analyzed entry pathways in human host cells for eight pathogens using image-based kinome-wide siRNA screens with siRNAs from three vendors. We propose a Parallel Mixed Model (PMM) approach that simultaneously analyzes several non-identical screens performed with the same RNAi libraries. RESULTS: We show that PMM gains statistical power for hit detection due to parallel screening. PMM allows incorporating siRNA weights that can be assigned according to available information on RNAi quality. Moreover, PMM is able to estimate a sharedness score that can be used to focus follow-up efforts on generic or specific gene regulators. By fitting a PMM model to our data, we found several novel hit genes for most of the pathogens studied. CONCLUSIONS: Our results show that parallel RNAi screening can improve the results of individual screens. This is particularly relevant now that large-scale parallel datasets are becoming increasingly publicly available. Our comprehensive siRNA dataset provides a public, freely available resource for further statistical and biological analyses in the high-content, high-throughput siRNA screening field.
Affiliation(s)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Christoph Dehio
- Focal Area Infection Biology, Biozentrum, University of Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland.
| |
|
34
|
Abstract
MOTIVATION: Cancer cell genomes acquire several genetic alterations during somatic evolution from a normal cell type. The relative order in which these mutations accumulate and contribute to cell fitness is affected by epistatic interactions. Inferring their evolutionary history is challenging because of the large number of mutations acquired by cancer cells as well as the presence of unknown epistatic interactions. RESULTS: We developed Bayesian Mutation Landscape (BML), a probabilistic approach for reconstructing ancestral genotypes from tumor samples for much larger sets of genes than previously feasible. BML infers the likely sequence of mutation accumulation for any set of genes that is recurrently mutated in tumor samples. When applied to tumor samples from colorectal, glioblastoma, lung and ovarian cancer patients, BML identifies the diverse evolutionary scenarios involved in tumor initiation and progression in greater detail, but broadly in agreement with prior results. AVAILABILITY AND IMPLEMENTATION: Source code and all datasets are freely available at bml.molgen.mpg.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Navodit Misra
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, D-14195 Berlin, Germany and Department of Biosystems Science and Engineering, ETH Zurich and Swiss Institute of Bioinformatics, CH-4058 Basel, Switzerland
| | - Ewa Szczurek
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, D-14195 Berlin, Germany and Department of Biosystems Science and Engineering, ETH Zurich and Swiss Institute of Bioinformatics, CH-4058 Basel, Switzerland
| | - Martin Vingron
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, D-14195 Berlin, Germany and Department of Biosystems Science and Engineering, ETH Zurich and Swiss Institute of Bioinformatics, CH-4058 Basel, Switzerland
| |
|
35
|
Abstract
In large collections of tumor samples, it has been observed that sets of genes that are commonly involved in the same cancer pathways tend not to occur mutated together in the same patient. Such gene sets form mutually exclusive patterns of gene alterations in cancer genomic data. Computational approaches that detect mutually exclusive gene sets rank and test candidate alteration patterns by rewarding the number of samples the pattern covers and by penalizing its impurity, i.e., additional alterations that violate strict mutual exclusivity. However, the extant approaches do not account for possible observation errors. In practice, false negatives and especially false positives can severely bias evaluation and ranking of alteration patterns. To address these limitations, we develop a fully probabilistic, generative model of mutual exclusivity, explicitly taking coverage, impurity, as well as error rates into account, and devise efficient algorithms for parameter estimation and pattern ranking. Based on this model, we derive a statistical test of mutual exclusivity by comparing its likelihood to the null model that assumes independent gene alterations. Using extensive simulations, the new test is shown to be more powerful than a permutation test applied previously. When applied to detect mutual exclusivity patterns in glioblastoma and in pan-cancer data from twelve tumor types, we identify several significant patterns that are biologically relevant, most of which would not be detected by previous approaches. Our statistical modeling framework of mutual exclusivity provides increased flexibility and power to detect cancer pathways from genomic alteration data in the presence of noise. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2-5.
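The two quantities the abstract names, coverage and impurity, can be illustrated with simple empirical counts. The sketch below is not the probabilistic model from the paper (it ignores error rates and performs no likelihood test). The function name `coverage_and_impurity`, the sample labels, and the gene labels `A`, `B`, `C` are hypothetical, and the toy data are made up purely for illustration.

```python
def coverage_and_impurity(alterations, gene_set):
    """Empirical coverage and impurity of a candidate gene set.

    alterations: dict mapping sample id -> set of altered genes (toy input).
    coverage:    fraction of samples with at least one altered gene in the set.
    impurity:    fraction of covered samples with two or more altered genes
                 in the set, i.e. samples violating strict mutual exclusivity.
    """
    genes = set(gene_set)
    # Number of genes from the candidate set that are altered in each sample.
    hits = [len(genes & altered) for altered in alterations.values()]
    covered = sum(1 for h in hits if h >= 1)
    impure = sum(1 for h in hits if h >= 2)
    coverage = covered / len(hits) if hits else 0.0
    impurity = impure / covered if covered else 0.0
    return coverage, impurity


# Made-up toy cohort: five samples, three genes.
toy = {
    "s1": {"A"},
    "s2": {"B"},
    "s3": {"A", "B"},  # both genes altered: counts toward impurity
    "s4": {"C"},
    "s5": set(),
}
cov, imp = coverage_and_impurity(toy, ["A", "B"])
print(f"coverage={cov:.2f}, impurity={imp:.2f}")
```

A pattern with high coverage and low impurity would rank well under the scoring idea described above; the abstract's point is that such raw counts are sensitive to false positives and false negatives, which is what the generative model is meant to address.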
Affiliation(s)
- Ewa Szczurek
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| |
|
36
|
Szczurek E, Misra N, Vingron M. Synthetic sickness or lethality points at candidate combination therapy targets in glioblastoma. Int J Cancer 2013; 133:2123-32. [PMID: 23629686 DOI: 10.1002/ijc.28235] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 11/13/2012] [Accepted: 04/11/2013] [Indexed: 12/30/2022]
Abstract
Synthetic lethal interactions in cancer hold the potential for successful combined therapies, which would avoid the difficulties of single molecule-targeted treatment. Identification of interactions that are specific for human tumors is an open problem in cancer research. This work aims at deciphering synthetic sick or lethal interactions directly from somatic alteration, expression and survival data of cancer patients. To this end, we look for pairs of genes and their alterations or expression levels that are "avoided" by tumors and "beneficial" for patients. Thus, candidates for synthetic sickness or lethality (SSL) interaction are identified as gene pairs whose combination of states is under-represented in the data. Our main methodological contribution is a quantitative score that allows ranking of the candidate SSL interactions according to evidence found in patient survival. Applying this analysis to glioblastoma data, we collect 1,956 synthetic sick or lethal partners for 85 abundantly altered genes, most of which show extensive copy number variation across the patient cohort. We rediscover and interpret the known interaction between TP53 and PLK1, and provide insight into the mechanism behind EGFR interacting with AKT2, but not with AKT1 or AKT3. Cox model analysis determines 274 of the identified interactions as having significant impact on overall survival in glioblastoma, which is more informative than a standard survival predictor based on patient's age.
Affiliation(s)
- Ewa Szczurek
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195, Berlin, Germany.
| | | | | |
|
37
|
Jankowski A, Szczurek E, Jauch R, Tiuryn J, Prabhakar S. Comprehensive prediction in 78 human cell lines reveals rigidity and compactness of transcription factor dimers. Genome Res 2013; 23:1307-18. [PMID: 23554463 PMCID: PMC3730104 DOI: 10.1101/gr.154922.113] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 01/22/2023]
Abstract
The binding of transcription factors (TFs) to their specific motifs in genomic regulatory regions is commonly studied in isolation. However, in order to elucidate the mechanisms of transcriptional regulation, it is essential to determine which TFs bind DNA cooperatively as dimers and to infer the precise nature of these interactions. So far, only a small number of such dimeric complexes are known. Here, we present an algorithm for predicting cell-type–specific TF–TF dimerization on DNA on a large scale, using DNase I hypersensitivity data from 78 human cell lines. We represented the universe of possible TF complexes by their corresponding motif complexes, and analyzed their occurrence at cell-type–specific DNase I hypersensitive sites. Based on ∼1.4 billion tests for motif complex enrichment, we predicted 603 highly significant cell-type–specific TF dimers, the vast majority of which are novel. Our predictions included 76% (19/25) of the known dimeric complexes and showed significant overlap with an experimental database of protein–protein interactions. They were also independently supported by evolutionary conservation, as well as quantitative variation in DNase I digestion patterns. Notably, the known and predicted TF dimers were almost always highly compact and rigidly spaced, suggesting that TFs dimerize in close proximity to their partners, which results in strict constraints on the structure of the DNA-bound complex. Overall, our results indicate that chromatin openness profiles are highly predictive of cell-type–specific TF–TF interactions. Moreover, cooperative TF dimerization seems to be a widespread phenomenon, with multiple TF complexes predicted in most cell types.
Affiliation(s)
- Aleksander Jankowski
- Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
| | | | | | | | | |
|
38
|
|
39
|
Szczurek E, Markowetz F, Gat-Viks I, Biecek P, Tiuryn J, Vingron M. Deregulation upon DNA damage revealed by joint analysis of context-specific perturbation data. BMC Bioinformatics 2011; 12:249. [PMID: 21693013 PMCID: PMC3236061 DOI: 10.1186/1471-2105-12-249] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/12/2010] [Accepted: 06/21/2011] [Indexed: 12/17/2022] Open
Abstract
Background: Deregulation between two different cell populations manifests itself in changing gene expression patterns and changing regulatory interactions. Accumulating knowledge about biological networks creates an opportunity to study these changes in their cellular context. Results: We analyze re-wiring of regulatory networks based on cell population-specific perturbation data and knowledge about signaling pathways and their target genes. We quantify deregulation by merging the regulatory signal from the two cell populations into one score. This joint approach, called JODA, proves advantageous over separate analysis of the cell populations and analysis without incorporation of knowledge. JODA is implemented and freely available in a Bioconductor package 'joda'. Conclusions: Using JODA, we show wide-spread re-wiring of gene regulatory networks upon neocarzinostatin-induced DNA damage in human cells. We recover 645 deregulated genes in thirteen functional clusters performing the rich program of response to damage. We find that the clusters contain many previously characterized neocarzinostatin target genes. We investigate connectivity between those genes, explaining their cooperation in performing the common functions. We review genes with the most extreme deregulation scores, reporting their involvement in response to DNA damage. Finally, we investigate the indirect impact of the ATM pathway on the deregulated genes, and build a hypothetical hierarchy of direct regulation. These results prove that JODA is a step forward to a systems level, mechanistic understanding of changes in gene regulation between different cell populations.
Affiliation(s)
- Ewa Szczurek
- Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany.
| | | | | | | | | | | |
|
40
|
Abstract
Gene expression measurements allow determining sets of up- or down-regulated, or unchanged genes in a particular experimental condition. Additional biological knowledge can suggest examples of genes from one of these sets. For instance, known target genes of a transcriptional activator are expected, but are not certain, to go down after this activator is knocked out. Available differential expression analysis tools do not take such imprecise examples into account. Here we put forward a novel partially supervised mixture modeling methodology for differential expression analysis. Our approach, guided by imprecise examples, clusters expression data into differentially expressed and unchanged genes. The partially supervised methodology is implemented by two methods: a newly introduced belief-based mixture modeling, and soft-label mixture modeling, a method proven efficient in other applications. Using synthetic data, we investigate which input example settings are favorable for each method. In our tests, both belief-based and soft-label methods prove their advantage over semi-supervised mixture modeling in correcting for erroneous examples. We also compare them to alternative differential expression analysis approaches, showing that incorporation of knowledge yields better performance. We present a broad range of knowledge sources and data to which our partially supervised methodology can be applied. First, we determine targets of Ste12 based on yeast knockout data, guided by a Ste12 DNA-binding experiment. Second, we distinguish miR-1 from miR-124 targets in human by clustering expression data under transfection experiments of both microRNAs, using their computationally predicted targets as examples. Finally, we utilize literature knowledge to improve clustering of time-course expression profiles.
Affiliation(s)
- Ewa Szczurek
- Max Planck Institute for Molecular Genetics, Berlin, Germany.
| | | | | | | |
|
41
|
Roytberg M, Gambin A, Noé L, Lasota S, Furletova E, Szczurek E, Kucherov G. On subset seeds for protein alignment. IEEE/ACM Trans Comput Biol Bioinform 2009; 6:483-494. [PMID: 19644175 DOI: 10.1109/tcbb.2009.4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 05/28/2023]
Abstract
We apply the concept of subset seeds to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then perform a comparative analysis of seeds built over those alphabets and compare them with the standard Blastp seeding method, as well as with the family of vector seeds. While the formalism of subset seeds is less expressive (but less costly to implement) than the cumulative principle used in Blastp and vector seeds, our seeds show a similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix. Finally, we perform a large-scale benchmarking of our seeds against several main databases of protein alignments. Here again, the results show a comparable or better performance of our seeds versus Blastp.
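As a rough illustration of the seed idea in this abstract, the sketch below tests whether a seed hits a pair of aligned protein windows. It assumes a simplified reading in which each seed letter is a grouping of amino acids and two residues match at a position when they are identical or fall in the same group. The residue groups, the names `make_letter` and `seed_hits`, and the example seed are hypothetical and are not the alphabets designed in the paper.

```python
def make_letter(groups):
    """Build a seed letter: map each residue to the index of its group.

    Residues not listed in any group match only themselves.
    """
    return {res: i for i, group in enumerate(groups) for res in group}


def seed_hits(seed, a, b):
    """Return True if the seed accepts the aligned windows a and b.

    seed: list of seed letters; None is a joker accepting any residue pair.
    a, b: equal-length strings with the same length as the seed.
    """
    if not (len(a) == len(b) == len(seed)):
        raise ValueError("windows and seed must have equal length")
    for letter, x, y in zip(seed, a, b):
        if letter is None or x == y:
            continue  # joker position, or identical residues
        cx, cy = letter.get(x), letter.get(y)
        if cx is None or cx != cy:
            return False  # residues fall in different groups
    return True


# Hypothetical seed letters (illustrative groupings only).
EXACT = make_letter([])  # only identical residues match
GROUPED = make_letter(["ILVM", "FWY", "KRH", "DE", "ST", "NQ"])
JOKER = None

seed = [EXACT, GROUPED, JOKER, GROUPED]
print(seed_hits(seed, "AIGK", "AVWR"))  # I~V and K~R share groups
print(seed_hits(seed, "AIGK", "AFGK"))  # I and F are in different groups
```

Because each position reduces to a lookup of a group index, a window can be hashed position by position, which is one way to read the abstract's remark that subset seeds are less costly to implement than cumulative-score seeding.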
Affiliation(s)
- Mikhail Roytberg
- Institute of Mathematical Problems in Biology, Pushchino, Moscow Region 142290, Russia.
| | | | | | | | | | | | | |
|
42
|
Gambin A, Szczurek E, Dutkowski J, Bakun M, Dadlez M. Classification of peptide mass fingerprint data by novel no-regret boosting method. Comput Biol Med 2009; 39:460-73. [PMID: 19386298 DOI: 10.1016/j.compbiomed.2009.03.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 02/12/2007] [Revised: 07/31/2008] [Accepted: 03/05/2009] [Indexed: 11/30/2022]
Abstract
We have developed an integrated tool for statistical analysis of large-scale LC-MS profiles of complex protein mixtures, comprising a set of procedures for data processing, selection of biomarkers used in early diagnostics, and classification of patients based on their peptide mass fingerprints. Here, a novel boosting technique is proposed, which is embedded in our framework for MS data analysis. Our boosting scheme is based on Hannan-consistent game playing strategies. We analyze boosting from a game-theoretic perspective and define a new class of boosting algorithms called H-boosting methods. In the experimental part of this work we apply the new classifier together with classical and state-of-the-art algorithms to classify ovarian cancer and cystic fibrosis patients based on peptide mass spectra. The methods developed here provide automatic, general, and efficient means for processing of large-scale LC-MS datasets. Good classification results suggest that our approach is able to uncover valuable information to support medical diagnosis.
Affiliation(s)
- Anna Gambin
- Institute of Informatics, Warsaw University, Banacha 2, 02-097 Warsaw, Poland.
| | | | | | | | | |
|
43
|
Seibt ST, Solf A, Steinhoff C, Kielbasa S, Szczurek E, Vingron M, Sers C. A role for Fra1 in the control of transcriptional network reorganization following ras transformation. Cell Commun Signal 2009. [PMCID: PMC4291840 DOI: 10.1186/1478-811x-7-s1-a8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
44
|
Grynberg M, Li Z, Szczurek E, Godzik A. Putative type IV secretion genes in Bacillus anthracis. Trends Microbiol 2007; 15:191-5. [PMID: 17387016 DOI: 10.1016/j.tim.2007.03.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 11/08/2006] [Revised: 02/15/2007] [Accepted: 03/13/2007] [Indexed: 11/22/2022]
Abstract
Although the physiology of Bacillus anthracis, the causative agent of anthrax, has been studied extensively, we still do not know how toxins are dispatched from the bacterial cell. Here, by means of distant homology and genome context analyses, we identify genes encoding putative type IV secretion system-related elements on the B. anthracis plasmids pXO1 and pXO2 and in the chromosome. We argue that this type IV secretion system-like system could be responsible for anthrax toxin secretion, although we also discuss the possibilities of its involvement in the processes of sporulation, germination or conjugation.
Affiliation(s)
- Marcin Grynberg
- Department of Genetics, Institute of Biochemistry and Biophysics PAS, 02-106 Warsaw, Poland.
| | | | | | | |
|