2
Pathak GA, Karjalainen J, Stevens C, Neale BM, Daly M, Ganna A, Andrews SJ, Kanai M, Cordioli M, Polimanti R, Harerimana N, Pirinen M, Liao RG, Chwialkowska K, Trankiem A, Balaconis MK, Nguyen H, Solomonson M, Veerapen K, Wolford B, Roberts G, Park D, Ball CA, Coignet M, McCurdy S, Knight S, Partha R, Rhead B, Zhang M, Berkowitz N, Gaddis M, Noto K, Ruiz L, Pavlovic M, Hong EL, Rand K, Girshick A, Guturu H, Baltzell AH, Niemi MEK, Rahmouni S, Guntz J, Beguin Y, Cordioli M, Pigazzini S, Nkambule L, Georges M, Moutschen M, Misset B, Darcis G, Guiot J, Azarzar S, Gofflot S, Claassen S, Malaise O, Huynen P, Meuris C, Thys M, Jacques J, Léonard P, Frippiat F, Giot JB, Sauvage AS, Frenckell CV, Belhaj Y, Lambermont B, Nakanishi T, Morrison DR, Mooser V, Richards JB, Butler-Laporte G, Forgetta V, Li R, Ghosh B, Laurent L, Belisle A, Henry D, Abdullah T, Adeleye O, Mamlouk N, Kimchi N, Afrasiabi Z, Rezk N, Vulesevic B, Bouab M, Guzman C, Petitjean L, Tselios C, Xue X, Afilalo J, Afilalo M, Oliveira M, Brenner B, Brassard N, Durand M, Schurr E, Lepage P, Ragoussis J, Auld D, Chassé M, Kaufmann DE, Lathrop GM, Adra D, Hayward C, Glessner JT, Shaw DM, Campbell A, Morris M, Hakonarson H, Porteous DJ, Below J, Richmond A, Chang X, Polikowski H, Lauren PE, Chen HH, Wanying Z, Fawns-Ritchie C, North K, McCormick JB, Chang X, Glessner JR, Hakonarson H, Gignoux CR, Wicks SJ, Crooks K, Barnes KC, Daya M, Shortt J, Rafaels N, Chavan S, Timmers PRHJ, Wilson JF, Tenesa A, Kerr SM, D’Mellow K, Shahin D, El-Sherbiny YM, von Hohenstaufen KA, Sobh A, Eltoukhy MM, Nkambul L, Elhadidy TA, Abd Elghafar MS, El-Jawhari JJ, Mohamed AAS, Elnagdy MH, Samir A, Abdel-Aziz M, Khafaga WT, El-Lawaty WM, Torky MS, El-shanshory MR, Yassen AM, Hegazy MAF, Okasha K, Eid MA, Moahmed HS, Medina-Gomez C, Ikram MA, Uitterlinden AG, Mägi R, Milani L, Metspalu A, Laisk T, Läll K, Lepamets M, Esko T, Reimann E, Naaber P, Laane E, Pesukova J, Peterson P, Kisand K, Tabri J, Allos R, Hensen K, Starkopf J, Ringmets 
I, Tamm A, Kallaste A, Alavere H, Metsalu K, Puusepp M, Batini C, Tobin MD, Venn LD, Lee PH, Shrine N, Williams AT, Guyatt AL, John C, Packer RJ, Ali A, Free RC, Wang X, Wain LV, Hollox EJ, Bee CE, Adams EL, Palotie A, Ripatti S, Ruotsalainen S, Kristiansson K, Koskelainen S, Perola M, Donner K, Kivinen K, Palotie A, Kaunisto M, Rivolta C, Bochud PY, Bibert S, Boillat N, Nussle SG, Albrich W, Quinodoz M, Kamdar D, Suh N, Neofytos D, Erard V, Voide C, Bochud PY, Rivolta C, Bibert S, Quinodoz M, Kamdar D, Neofytos D, Erard V, Voide C, Friolet R, Vollenweider P, Pagani JL, Oddo M, zu Bentrup FM, Conen A, Clerc O, Marchetti O, Guillet A, Guyat-Jacques C, Foucras S, Rime M, Chassot J, Jaquet M, Viollet RM, Lannepoudenx Y, Portopena L, Bochud PY, Vollenweider P, Pagani JL, Desgranges F, Filippidis P, Guéry B, Haefliger D, Kampouri EE, Manuel O, Munting A, Papadimitriou-Olivgeris M, Regina J, Rochat-Stettler L, Suttels V, Tadini E, Tschopp J, Van Singer M, Viala B, Boillat-Blanco N, Brahier T, Hügli O, Meuwly JY, Pantet O, Gonseth Nussle S, Bochud M, D’Acremont V, Estoppey Younes S, Albrich WC, Suh N, Cerny A, O’Mahony L, von Mering C, Bochud PY, Frischknecht M, Kleger GR, Filipovic M, Kahlert CR, Wozniak H, Negro TR, Pugin J, Bouras K, Knapp C, Egger T, Perret A, Montillier P, di Bartolomeo C, Barda B, de Cid R, Carreras A, Moreno V, Kogevinas M, Galván-Femenía I, Blay N, Farré X, Sumoy L, Cortés B, Mercader JM, Guindo-Martinez M, Torrents D, Garcia-Aymerich J, Castaño-Vinyals G, Dobaño C, Gori M, Renieri A, Mari F, Mondelli MU, Castelli F, Vaghi M, Rusconi S, Montagnani F, Bargagli E, Franchi F, Mazzei MA, Cantarini L, Tacconi D, Feri M, Scala R, Spargi G, Nencioni C, Bandini M, Caldarelli GP, Canaccini A, Ognibene A, D’Arminio Monforte A, Girardis M, Antinori A, Francisci D, Schiaroli E, Scotton PG, Panese S, Scaggiante R, Monica MD, Capasso M, Fiorentino G, Castori M, Aucella F, Biagio AD, Masucci L, Valente S, Mandalà M, Zucchi P, Giannattasio F, Coviello DA, Mussini 
C, Tavecchia L, Crotti L, Rizzi M, Rovere MTL, Sarzi-Braga S, Bussotti M, Ravaglia S, Artuso R, Perrella A, Romani D, Bergomi P, Catena E, Vincenti A, Ferri C, Grassi D, Pessina G, Tumbarello M, Pietro MD, Sabrina R, Luchi S, Furini S, Dei S, Benetti E, Picchiotti N, Sanarico M, Ceri S, Pinoli P, Raimondi F, Biscarini F, Stella A, Zguro K, Capitani K, Nkambule L, Tanfoni M, Fallerini C, Daga S, Baldassarri M, Fava F, Frullanti E, Valentino F, Doddato G, Giliberti A, Tita R, Amitrano S, Bruttini M, Croci S, Meloni I, Mencarelli MA, Rizzo CL, Pinto AM, Beligni G, Tommasi A, Sarno LD, Palmieri M, Carriero ML, Alaverdian D, Busani S, Bruno R, Vecchia M, Belli MA, Mantovani S, Ludovisi S, Quiros-Roldan E, Antoni MD, Zanella I, Siano M, Emiliozzi A, Fabbiani M, Rossetti B, Bergantini L, D’Alessandro M, Cameli P, Bennett D, Anedda F, Marcantonio S, Scolletta S, Guerrini S, Conticini E, Frediani B, Spertilli C, Donati A, Guidelli L, Corridi M, Croci L, Piacentini P, Desanctis E, Cappelli S, Verzuri A, Anemoli V, Pancrazzi A, Lorubbio M, Miraglia FG, Venturelli S, Cossarizza A, Vergori A, Gabrieli A, Riva A, Paciosi F, Andretta F, Gatti F, Parisi SG, Baratti S, Piscopo C, Russo R, Andolfo I, Iolascon A, Carella M, Merla G, Squeo GM, Raggi P, Marciano C, Perna R, Bassetti M, Sanguinetti M, Giorli A, Salerni L, Parravicini P, Menatti E, Trotta T, Coiro G, Lena F, Martinelli E, Mancarella S, Gabbi C, Maggiolo F, Ripamonti D, Bachetti T, Suardi C, Parati G, Bottà G, Domenico PD, Rancan I, Bianchi F, Colombo R, Barbieri C, Acquilini D, Andreucci E, Segala FV, Tiseo G, Falcone M, Lista M, Poscente M, Vivo OD, Petrocelli P, Guarnaccia A, Baroni S, Hayward C, Porteous DJ, Fawns-Ritchie C, Richmond A, Campbell A, van Heel DA, Hunt KA, Trembath RC, Huang QQ, Martin HC, Mason D, Trivedi B, Wright J, Finer S, Akhtar S, Anwar M, Arciero E, Ashraf S, Breen G, Chung R, Curtis CJ, Chowdhury M, Colligan G, Deloukas P, Durham C, Finer S, Griffiths C, Huang QQ, Hurles M, Hunt KA, Hussain S, 
Islam K, Khan A, Khan A, Lavery C, Lee SH, Lerner R, MacArthur D, MacLaughlin B, Martin H, Mason D, Miah S, Newman B, Safa N, Tahmasebi F, Trembath RC, Trivedi B, van Heel DA, Wright J, Griffiths CJ, Smith AV, Boughton AP, Li KW, LeFaive J, Annis A, Niavarani A, Aliannejad R, Sharififard B, Amirsavadkouhi A, Naderpour Z, Tadi HA, Aleagha AE, Ahmadi S, Moghaddam SBM, Adamsara A, Saeedi M, Abdollahi H, Hosseini A, Chariyavilaskul P, Jantarabenjakul W, Hirankarn N, Chamnanphon M, Suttichet TB, Shotelersuk V, Pongpanich M, Phokaew C, Chetruengchai W, Putchareon O, Torvorapanit P, Puthanakit T, Suchartlikitwong P, Nilaratanakul V, Sodsai P, Brumpton BM, Hveem K, Willer C, Wolford B, Zhou W, Rogne T, Solligard E, Åsvold BO, Franke L, Boezen M, Deelen P, Claringbould A, Lopera E, Warmerdam R, Vonk JM, van Blokland I, Lanting P, Ori APS, Feng YCA, Mercader J, Weiss ST, Karlson EW, Smoller JW, Murphy SN, Meigs JB, Woolley AE, Green RC, Perez EF, Wolford B, Zöllner S, Wang J, Beck A, Sloofman LG, Ascolillo S, Sebra RP, Collins BL, Levy T, Buxbaum JD, Sealfon SC, Jordan DM, Thompson RC, Gettler K, Chaudhary K, Belbin GM, Preuss M, Hoggart C, Choi S, Underwood SJ, Salib I, Britvan B, Keller K, Tang L, Peruggia M, Hiester LL, Niblo K, Aksentijevich A, Labkowsky A, Karp A, Zlatopolsky M, Zyndorf M, Charney AW, Beckmann ND, Schadt EE, Abul-Husn NS, Cho JH, Itan Y, Kenny EE, Loos RJF, Nadkarni GN, Do R, O’Reilly P, Huckins LM, Ferreira MAR, Abecasis GR, Leader JB, Cantor MN, Justice AE, Carey DJ, Chittoor G, Josyula NS, Kosmicki JA, Horowitz JE, Baras A, Gass MC, Yadav A, Mirshahi T, Hottenga JJ, Bartels M, de geus EEJC, Nivard MMG, Verma A, Ritchie MD, Rader D, Li B, Verma SS, Lucas A, Bradford Y, Abedalthagafi M, Alaamery M, Alshareef A, Sawaji M, Massadeh S, AlMalik A, Alqahtani S, Baraka D, Harthi FA, Alsolm E, Safieh LA, Alowayn AM, Alqubaishi F, Mutairi AA, Mangul S, Almutairi M, Aljawini N, Albesher N, Arabi YM, Mahmoud ES, Khattab AK, Halawani RT, Alahmadey ZZ, Albakri JK, 
Felemban WA, Suliman BA, Hasanato R, Al-Awdah L, Alghamdi J, AlZahrani D, AlJohani S, Al-Afghani H, AlDhawi N, AlBardis H, Alkwai S, Alswailm M, Almalki F, Albeladi M, Almohammed I, Barhoush E, Albader A, Alotaibi S, Alghamdi B, Jung J, fawzy MS, Alrashed M, Zeberg H, Nkambul L, Frithiof R, Hultström M, Lipcsey M, Tardif N, Rooyackers O, Grip J, Maricic T, Helgeland Ø, Magnus P, Trogstad LIS, Lee Y, Harris JR, Mangino M, Spector TD, Emma D, Moutsianas L, Caulfield MJ, Scott RH, Kousathanas A, Pasko D, Walker S, Stuckey A, Odhams CA, Rhodes D, Fowler T, Rendon A, Chan G, Arumugam P, Karczewski KJ, Martin AR, Wilson DJ, Spencer CCA, Crook DW, Wyllie DH, O’Connell AM, Atkinson EG, Kanai M, Tsuo K, Baya N, Turley P, Gupta R, Walters RK, Palmer DS, Sarma G, Solomonson M, Cheng N, Lu W, Churchhouse C, Goldstein JI, King D, Zhou W, Seed C, Daly MJ, Neale BM, Finucane H, Bryant S, Satterstrom FK, Band G, Earle SG, Lin SK, Arning N, Koelling N, Armstrong J, Rudkin JK, Callier S, Bryant S, Cusick C, Soranzo N, Zhao JH, Danesh J, Angelantonio ED, Butterworth AS, Sun YV, Huffman JE, Cho K, O’Donnell CJ, Tsao P, Gaziano JM, Peloso G, Ho YL, Smieszek SP, Polymeropoulos C, Polymeropoulos V, Polymeropoulos MH, Przychodzen BP, Fernandez-Cadenas I, Planas AM, Perez-Tur J, Llucià-Carol L, Cullell N, Muiño E, Cárcel-Márquez J, DeDiego ML, Iglesias LL, Soriano A, Rico V, Agüero D, Bedini JL, Lozano F, Domingo C, Robles V, Ruiz-Jaén F, Márquez L, Gomez J, Coto E, Albaiceta GM, García-Clemente M, Dalmau D, Arranz MJ, Dietl B, Serra-Llovich A, Soler P, Colobrán R, Martín-Nalda A, Martínez AP, Bernardo D, Rojo S, Fiz-López A, Arribas E, de la Cal-Sabater P, Segura T, González-Villa E, Serrano-Heras G, Martí-Fàbregas J, Jiménez-Xarrié E, de Felipe Mimbrera A, Masjuan J, García-Madrona S, Domínguez-Mayoral A, Villalonga JM, Menéndez-Valladares P, Chasman DI, Sesso HD, Manson JE, Buring JE, Ridker PM, Franco G, Davis L, Lee S, Priest J, Sankaran VG, van Heel D, Biesecker L, Kerchberger VE, 
Baillie JK. A first update on mapping the human genetic architecture of COVID-19. Nature 2022; 608:E1-E10. [PMID: 35922517 PMCID: PMC9352569 DOI: 10.1038/s41586-022-04826-7] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 11/04/2021] [Accepted: 04/29/2022] [Indexed: 01/04/2023]
15
Krallinger M, Vazquez M, Leitner F, Salgado D, Chatr-aryamontri A, Winter A, Perfetto L, Briganti L, Licata L, Iannuccelli M, Castagnoli L, Cesareni G, Tyers M, Schneider G, Rinaldi F, Leaman R, Gonzalez G, Matos S, Kim S, Wilbur WJ, Rocha L, Shatkay H, Tendulkar AV, Agarwal S, Liu F, Wang X, Rak R, Noto K, Elkan C, Lu Z, Dogan RI, Fontaine JF, Andrade-Navarro MA, Valencia A. The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text. BMC Bioinformatics 2011; 12 Suppl 8:S3. [PMID: 22151929 PMCID: PMC3269938 DOI: 10.1186/1471-2105-12-s8-s3] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.2] [Indexed: 11/30/2022]
Abstract
BACKGROUND Determining the usefulness of biomedical text mining systems requires realistic task definitions and data selection criteria without artificial constraints, as well as measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations and tried to address how the end user would oversee the generated output, for instance by providing ranked results and textual evidence for human interpretation, or by measuring the time saved by using automated systems. Detecting articles that describe complex biological events such as PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. For this purpose the BCIII-ACT corpus was provided, which includes training, development and test sets of over 12,000 PPI-relevant and non-relevant PubMed abstracts, labeled manually by domain experts, together with the recorded human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full-text articles and the interaction detection method ontology concepts that had been applied to detect the PPIs reported in them.
RESULTS A total of 11 teams participated in at least one of the two PPI tasks (10 in the ACT and 8 in the IMT), and a total of 62 persons were involved either as participants or in preparing data sets and evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. Of the 52 runs submitted for the ACT, the highest Matthews correlation coefficient (MCC) score measured was 0.55 at an accuracy of 89%, and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods; some also used lexical resources such as MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, and some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manual annotations produced by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, and the best MCC score was 0.55. For competitive systems with an acceptable recall (above 35%), the macro-averaged precision ranged between 50% and 80%, with a maximum F-score of 55%.
CONCLUSIONS The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time needed with unranked results. Detecting associations between full-text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and differing granularity of the method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.
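The abstract reports accuracy, MCC and F-score side by side for an unbalanced collection. As a minimal sketch of how these measures relate, the Python below computes all three from confusion-matrix counts using their standard definitions. The counts are hypothetical, chosen only to mimic a class imbalance; they are not taken from the BCIII-ACT evaluation, and the function name `confusion_metrics` is an illustrative assumption rather than anything used by the task organizers.

```python
import math


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, recall, F-score and MCC from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "mcc": mcc,
    }


# Hypothetical collection: 1,000 abstracts, 150 of them relevant.
classifier = confusion_metrics(tp=90, fp=50, tn=800, fn=60)
# Trivial baseline that labels every abstract as non-relevant.
all_negative = confusion_metrics(tp=0, fp=0, tn=850, fn=150)

for name, m in (("classifier", classifier), ("all-negative", all_negative)):
    print(f"{name}: accuracy={m['accuracy']:.2f} "
          f"f_score={m['f_score']:.2f} mcc={m['mcc']:.2f}")
```

With these made-up counts the classifier reaches an accuracy of 0.89 with an MCC of roughly 0.56, while the all-negative baseline still scores 0.85 accuracy but an MCC of 0. This gap is one reason a correlation-based measure is informative when the relevant class is rare.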
Affiliation(s)
- Martin Krallinger
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Miguel Vazquez
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Florian Leitner
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- David Salgado
  - Australian Regenerative Medicine Institute, Monash University, Australia
- Andrew Winter
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Livia Perfetto
  - Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Luana Licata
  - Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Luisa Castagnoli
  - Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Gianni Cesareni
  - Department of Biology, University of Rome Tor Vergata, Rome, Italy
  - IRCCS, Fondazione Santa Lucia, Rome, Italy
- Mike Tyers
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Gerold Schneider
  - Institute of Computational Linguistics, University of Zurich, Zurich, Switzerland
- Fabio Rinaldi
  - Institute of Computational Linguistics, University of Zurich, Zurich, Switzerland
- Robert Leaman
  - School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, Arizona, USA
- Graciela Gonzalez
  - Department of Biomedical Informatics, Arizona State University, Tempe, Arizona, USA
- Sergio Matos
  - Institute of Electronics and Telematics Engineering of Aveiro, University of Aveiro, Campus Universitario de Santiago, 3810-193 Aveiro, Portugal
- Sun Kim
  - National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, Maryland, 20894, USA
- W John Wilbur
  - National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, Maryland, 20894, USA
- Luis Rocha
  - School of Informatics and Computing, Indiana University, 919 E. 10th St, Bloomington, IN 47408, USA
- Hagit Shatkay
  - Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
- Ashish V Tendulkar
  - Department of Computer Science and Engineering, IIT Madras, Chennai-600 036, India
- Shashank Agarwal
  - Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
- Feifan Liu
  - Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
- Xinglong Wang
  - National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester, UK
- Rafal Rak
  - National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester, UK
- Keith Noto
  - Department of Computer Science, Tufts University, 161 College Ave, Medford, MA 02155, USA
- Charles Elkan
  - Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Zhiyong Lu
  - National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, Maryland, 20894, USA
- Rezarta Islamaj Dogan
  - National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, Maryland, 20894, USA
- Jean-Fred Fontaine
  - Computational Biology and Data Mining Group, Max-Delbrück-Centrum für Molekulare Medizin, Robert-Rössle-Str. 10, 13125 Berlin, Germany
- Miguel A Andrade-Navarro
  - Computational Biology and Data Mining Group, Max-Delbrück-Centrum für Molekulare Medizin, Robert-Rössle-Str. 10, 13125 Berlin, Germany
- Alfonso Valencia
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain