Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ji Y, Zhou Z, Liu H, Davuluri RV. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 2021;37:2112-2120. [PMID: 33538820 PMCID: PMC11025658 DOI: 10.1093/bioinformatics/btab083] [Citation(s) in RCA: 340] [Impact Index Per Article: 85.0] [Received: 09/10/2020] [Revised: 12/31/2020] [Accepted: 02/01/2021] [Indexed: 12/19/2022] Open

Number

Cited by Other Article(s)

251

Zhang J, Liu B, Wu J, Wang Z, Li J. DeepCAC: a deep learning approach on DNA transcription factors classification based on multi-head self-attention and concatenate convolutional neural network. BMC Bioinformatics 2023;24:345. [PMID: 37723425 PMCID: PMC10506269 DOI: 10.1186/s12859-023-05469-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/20/2023] [Accepted: 09/06/2023] [Indexed: 09/20/2023] Open

252

Chatterjee S, Bhattacharya M, Lee SS, Chakraborty C. Can artificial intelligence-strengthened ChatGPT or other large language models transform nucleic acid research? Mol Ther Nucleic Acids 2023;33:205-207. [PMID: 37727444 PMCID: PMC10505907 DOI: 10.1016/j.omtn.2023.06.019] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 09/21/2023]

253

Gündüz HA, Binder M, To XY, Mreches R, Bischl B, McHardy AC, Münch PC, Rezaei M. A self-supervised deep learning method for data-efficient training in genomics. Commun Biol 2023;6:928. [PMID: 37696966 PMCID: PMC10495322 DOI: 10.1038/s42003-023-05310-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/08/2023] [Accepted: 09/01/2023] [Indexed: 09/13/2023] Open

254

Tan W, Shen Y. Multimodal learning of noncoding variant effects using genome sequence and chromatin structure. Bioinformatics 2023;39:btad541. [PMID: 37669132 PMCID: PMC10502240 DOI: 10.1093/bioinformatics/btad541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 08/28/2023] [Accepted: 09/04/2023] [Indexed: 09/07/2023] Open

255

Lentzen M, Linden T, Veeranki S, Madan S, Kramer D, Leodolter W, Frohlich H. A Transformer-Based Model Trained on Large Scale Claims Data for Prediction of Severe COVID-19 Disease Progression. IEEE J Biomed Health Inform 2023;27:4548-4558. [PMID: 37347632 DOI: 10.1109/jbhi.2023.3288768] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/24/2023]

256

Li Z, Jin J, Long W, Wei L. PLPMpro: Enhancing promoter sequence prediction with prompt-learning based pre-trained language model. Comput Biol Med 2023;164:107260. [PMID: 37557052 DOI: 10.1016/j.compbiomed.2023.107260] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/01/2023] [Revised: 06/27/2023] [Accepted: 07/16/2023] [Indexed: 08/11/2023]

257

Zhang Y, Ge F, Li F, Yang X, Song J, Yu DJ. Prediction of Multiple Types of RNA Modifications via Biological Language Model. IEEE/ACM Trans Comput Biol Bioinform 2023;20:3205-3214. [PMID: 37289599 DOI: 10.1109/tcbb.2023.3283985] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/10/2023]

258

Liang S, Zhao Y, Jin J, Qiao J, Wang D, Wang Y, Wei L. Rm-LR: A long-range-based deep learning model for predicting multiple types of RNA modifications. Comput Biol Med 2023;164:107238. [PMID: 37515874 DOI: 10.1016/j.compbiomed.2023.107238] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/24/2023] [Revised: 06/16/2023] [Accepted: 07/07/2023] [Indexed: 07/31/2023]

259

Wichmann A, Buschong E, Müller A, Jünger D, Hildebrandt A, Hankeln T, Schmidt B. MetaTransformer: deep metagenomic sequencing read classification using self-attention models. NAR Genom Bioinform 2023;5:lqad082. [PMID: 37705831 PMCID: PMC10495543 DOI: 10.1093/nargab/lqad082] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/02/2023] [Revised: 07/14/2023] [Accepted: 08/30/2023] [Indexed: 09/15/2023] Open

260

Strutt JPB, Natarajan M, Lee E, Teo DBL, Sin WX, Cheung KW, Chew M, Thazin K, Barone PW, Wolfrum JM, Williams RBH, Rice SA, Springs SL. Machine learning-based detection of adventitious microbes in T-cell therapy cultures using long-read sequencing. Microbiol Spectr 2023;11:e0135023. [PMID: 37646508 PMCID: PMC10580871 DOI: 10.1128/spectrum.01350-23] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/29/2023] [Accepted: 07/03/2023] [Indexed: 09/01/2023] Open

Abstract

Assuring that cell therapy products are safe before releasing them for use in patients is critical. Currently, compendial sterility testing for bacteria and fungi can take 7-14 days. The goal of this work was to develop a rapid untargeted approach for the sensitive detection of microbial contaminants at low abundance from low volume samples during the manufacturing process of cell therapies. We developed a long-read sequencing methodology using Oxford Nanopore Technologies MinION platform with 16S and 18S amplicon sequencing to detect USP <71> organisms and other microbial species. Reads are classified metagenomically to predict the microbial species. We used an extreme gradient boosting machine learning algorithm (XGBoost) to first assess if a sample is contaminated, and second, determine whether the predicted contaminant is correctly classified or misclassified. The model was used to make a final decision on the sterility status of the input sample. An optimized experimental and bioinformatics pipeline starting from spiked species through to sequenced reads allowed for the detection of microbial samples at 10 colony-forming units (CFU)/mL using metagenomic classification. Machine learning can be coupled with long-read sequencing to detect and identify sample sterility status and microbial species present in T-cell cultures, including the USP <71> organisms to 10 CFU/mL.

IMPORTANCE

This research presents a novel method for rapidly and accurately detecting microbial contaminants in cell therapy products, which is essential for ensuring patient safety. Traditional testing methods are time-consuming, taking 7-14 days, while our approach can significantly reduce this time. By combining advanced long-read nanopore sequencing techniques and machine learning, we can effectively identify the presence and types of microbial contaminants at low abundance levels. This breakthrough has the potential to improve the safety and efficiency of cell therapy manufacturing, leading to better patient outcomes and a more streamlined production process.

261

Tang X, Shang J, Ji Y, Sun Y. PLASMe: a tool to identify PLASMid contigs from short-read assemblies using transformer. Nucleic Acids Res 2023;51:e83. [PMID: 37427782 PMCID: PMC10450166 DOI: 10.1093/nar/gkad578] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 12/16/2022] [Revised: 06/19/2023] [Accepted: 06/26/2023] [Indexed: 07/11/2023] Open

262

Ju H, Bai J, Jiang J, Che Y, Chen X. Comparative evaluation and analysis of DNA N4-methylcytosine methylation sites using deep learning. Front Genet 2023;14:1254827. [PMID: 37671040 PMCID: PMC10476523 DOI: 10.3389/fgene.2023.1254827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 07/31/2023] [Indexed: 09/07/2023] Open

263

Yelmen B, Jay F. An Overview of Deep Generative Models in Functional and Evolutionary Genomics. Annu Rev Biomed Data Sci 2023;6:173-189. [PMID: 37137168 DOI: 10.1146/annurev-biodatasci-020722-115651] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 05/05/2023]

264

Aspromonte MC, Conte AD, Zhu S, Tan W, Shen Y, Zhang Y, Li Q, Wang MH, Babbi G, Bovo S, Martelli PL, Casadio R, Althagafi A, Toonsi S, Kulmanov M, Hoehndorf R, Katsonis P, Williams A, Lichtarge O, Xian S, Surento W, Pejaver V, Mooney SD, Sunderam U, Srinivasan R, Murgia A, Piovesan D, Tosatto SCE, Leonardi E. CAGI6 ID-Challenge: Assessment of phenotype and variant predictions in 415 children with Neurodevelopmental Disorders (NDDs). Res Sq 2023:rs.3.rs-3209168. [PMID: 37577579 PMCID: PMC10418555 DOI: 10.21203/rs.3.rs-3209168/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]

Affiliation(s)

Maria Cristina Aspromonte Department of Biomedical Sciences, University of Padova
Alessio Del Conte Department of Biomedical Sciences, University of Padova
Shaowen Zhu Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
Wuwei Tan Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
Yang Shen Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
Yexian Zhang CUHK Shenzhen Research Institute, Shenzhen
Qi Li CUHK Shenzhen Research Institute, Shenzhen
Maggie Haitian Wang CUHK Shenzhen Research Institute, Shenzhen
Giulia Babbi Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna
Samuele Bovo Department of Agricultural and Food Sciences, University of Bologna
Pier Luigi Martelli Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna
Rita Casadio Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna
Azza Althagafi Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
Sumyyah Toonsi Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
Maxat Kulmanov Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
Robert Hoehndorf Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
Panagiotis Katsonis Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
Amanda Williams Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
Olivier Lichtarge Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
Su Xian Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195
Wesley Surento Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195
Vikas Pejaver Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029
Sean D Mooney Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195
Uma Sunderam Innovation Labs, Tata Consultancy Services, Hyderabad
Rajgopal Srinivasan Innovation Labs, Tata Consultancy Services, Hyderabad
Alessandra Murgia Department of Women's and Children's Health, University of Padova
Damiano Piovesan Department of Biomedical Sciences, University of Padova
Silvio C E Tosatto Department of Biomedical Sciences, University of Padova
Emanuela Leonardi Department of Biomedical Sciences, University of Padova

265

Myronov A, Mazzocco G, Król P, Plewczynski D. BERTrand-peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing. Bioinformatics 2023;39:btad468. [PMID: 37535685 PMCID: PMC10444968 DOI: 10.1093/bioinformatics/btad468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 06/28/2023] [Accepted: 08/01/2023] [Indexed: 08/05/2023] Open

266

Li L, Xue Z, Du X. ASCRB: Multi-view based attentional feature selection for CircRNA-binding site prediction. Comput Biol Med 2023;162:107077. [PMID: 37290390 DOI: 10.1016/j.compbiomed.2023.107077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/10/2023]

267

Chao KH, Mao A, Salzberg SL, Pertea M. Splam: a deep-learning-based splice site predictor that improves spliced alignments. bioRxiv 2023:2023.07.27.550754. [PMID: 37546880 PMCID: PMC10402160 DOI: 10.1101/2023.07.27.550754] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/08/2023]

268

Zhu C, Baumgarten N, Wu M, Wang Y, Das AP, Kaur J, Ardakani FB, Duong TT, Pham MD, Duda M, Dimmeler S, Yuan T, Schulz MH, Krishnan J. CVD-associated SNPs with regulatory potential reveal novel non-coding disease genes. Hum Genomics 2023;17:69. [PMID: 37491351 PMCID: PMC10369730 DOI: 10.1186/s40246-023-00513-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Accepted: 07/12/2023] [Indexed: 07/27/2023] Open

Abstract

BACKGROUND

Cardiovascular diseases (CVDs) are the leading cause of death worldwide. Genome-wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) appearing in non-coding genomic regions in CVDs. The SNPs may alter gene expression by modifying transcription factor (TF) binding sites and lead to functional consequences in cardiovascular traits or diseases. To understand the underlying molecular mechanisms, it is crucial to identify which variations are involved and how they affect TF binding.

METHODS

The SNEEP (SNP exploration and analysis using epigenomics data) pipeline was used to identify regulatory SNPs, which alter the binding behavior of TFs and link GWAS SNPs to their potential target genes for six CVDs. The human-induced pluripotent stem cells derived cardiomyocytes (hiPSC-CMs), monoculture cardiac organoids (MCOs) and self-organized cardiac organoids (SCOs) were used in the study. Gene expression, cardiomyocyte size and cardiac contractility were assessed.

RESULTS

By using our integrative computational pipeline, we identified 1905 regulatory SNPs in CVD GWAS data. These were associated with hundreds of genes, half of them non-coding RNAs (ncRNAs), suggesting novel CVD genes. We experimentally tested 40 CVD-associated non-coding RNAs, among them RP11-98F14.11, RPL23AP92, IGBP1P1, and CTD-2383I20.1, which were upregulated in hiPSC-CMs, MCOs and SCOs under hypoxic conditions. Further experiments showed that IGBP1P1 depletion rescued expression of hypertrophic marker genes, reduced hypoxia-induced cardiomyocyte size and improved hypoxia-reduced cardiac contractility in hiPSC-CMs and MCOs.

CONCLUSIONS

IGBP1P1 is a novel ncRNA with key regulatory functions in modulating cardiomyocyte size and cardiac function in our disease models. Our data suggest ncRNA IGBP1P1 as a potential therapeutic target to improve cardiac function in CVDs.

Affiliation(s)

Chaonan Zhu Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany
Nina Baumgarten Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany
Meiqian Wu Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany
Yue Wang Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany
Arka Provo Das Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany
Jaskiran Kaur Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany
Fatemeh Behjati Ardakani Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany
Thanh Thuy Duong Genome Biologics, Theodor-Stern-Kai 7, 60590, Frankfurt Am Main, Germany
Minh Duc Pham Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany; Department of Medicine III, Cardiology/Angiology/Nephrology, Goethe University Hospital, Frankfurt, Germany; Genome Biologics, Theodor-Stern-Kai 7, 60590, Frankfurt Am Main, Germany
Maria Duda Genome Biologics, Theodor-Stern-Kai 7, 60590, Frankfurt Am Main, Germany
Stefanie Dimmeler Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany
Ting Yuan Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany; Department of Medicine III, Cardiology/Angiology/Nephrology, Goethe University Hospital, Frankfurt, Germany
Marcel H Schulz Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany
Jaya Krishnan Institute for Cardiovascular Regeneration, Goethe University, 60590, Frankfurt Am Main, Germany; German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590, Frankfurt Am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, 60590, Frankfurt Am Main, Germany; Department of Medicine III, Cardiology/Angiology/Nephrology, Goethe University Hospital, Frankfurt, Germany

269

Li H, He X, Kurowski L, Zhang R, Zhao D, Zeng J. Improving comparative analyses of Hi-C data via contrastive self-supervised learning. Brief Bioinform 2023;24:bbad193. [PMID: 37287135 DOI: 10.1093/bib/bbad193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 04/12/2023] [Accepted: 04/27/2023] [Indexed: 06/09/2023] Open

270

Prakash A, Banerjee M. An interpretable block-attention network for identifying regulatory feature interactions. Brief Bioinform 2023;24:bbad250. [PMID: 37401370 DOI: 10.1093/bib/bbad250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 05/15/2023] [Accepted: 06/16/2023] [Indexed: 07/05/2023] Open

271

Wu KE, Zou JY, Chang H. Machine learning modeling of RNA structures: methods, challenges and future perspectives. Brief Bioinform 2023;24:bbad210. [PMID: 37280185 DOI: 10.1093/bib/bbad210] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/13/2023] [Revised: 05/12/2023] [Accepted: 05/17/2023] [Indexed: 06/08/2023] Open

272

Liu R, Hu YF, Huang JD, Fan X. A Bayesian approach to estimate MHC-peptide binding threshold. Brief Bioinform 2023;24:bbad208. [PMID: 37279464 DOI: 10.1093/bib/bbad208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 05/08/2023] [Accepted: 05/16/2023] [Indexed: 06/08/2023] Open

273

Zhang Z, Feng F, Qiu Y, Liu J. A generalizable framework to comprehensively predict epigenome, chromatin organization, and transcriptome. Nucleic Acids Res 2023;51:5931-5947. [PMID: 37224527 PMCID: PMC10325920 DOI: 10.1093/nar/gkad436] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/20/2022] [Revised: 03/31/2023] [Accepted: 05/09/2023] [Indexed: 05/26/2023] Open

274

Umerenkov D, Herbert A, Konovalov D, Danilova A, Beknazarov N, Kokh V, Fedorov A, Poptsova M. Z-flipon variants reveal the many roles of Z-DNA and Z-RNA in health and disease. Life Sci Alliance 2023;6:e202301962. [PMID: 37164635 PMCID: PMC10172764 DOI: 10.26508/lsa.202301962] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/31/2023] [Revised: 04/25/2023] [Accepted: 04/28/2023] [Indexed: 05/12/2023] Open

275

Biharie K, Michielsen L, Reinders MJT, Mahfouz A. Cell type matching across species using protein embeddings and transfer learning. Bioinformatics 2023;39:i404-i412. [PMID: 37387141 PMCID: PMC10311290 DOI: 10.1093/bioinformatics/btad248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/01/2023] Open

276

Valeri JA, Soenksen LR, Collins KM, Ramesh P, Cai G, Powers R, Angenent-Mari NM, Camacho DM, Wong F, Lu TK, Collins JJ. BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences. Cell Syst 2023;14:525-542.e9. [PMID: 37348466 PMCID: PMC10700034 DOI: 10.1016/j.cels.2023.05.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/30/2022] [Revised: 02/17/2023] [Accepted: 05/22/2023] [Indexed: 06/24/2023]

Affiliation(s)

Jacqueline A Valeri Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Luis R Soenksen Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA
Katherine M Collins Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Department of Engineering, University of Cambridge, Trumpington St, Cambridge CB2 1PZ, UK
Pradeep Ramesh Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
George Cai Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
Rani Powers Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Pluto Biosciences, Golden, CO 80402, USA
Nicolaas M Angenent-Mari Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
Diogo M Camacho Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
Felix Wong Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Timothy K Lu Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
James J Collins Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02139, USA; Abdul Latif Jameel Clinic for Machine Learning in Health, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

277

Luo H, Li Y, Liu H, Ding P, Yu Y, Luo L. SENet: A deep learning framework for discriminating super- and typical enhancers by sequence information. Comput Biol Chem 2023;105:107905. [PMID: 37348298 DOI: 10.1016/j.compbiolchem.2023.107905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/12/2023] [Revised: 05/08/2023] [Accepted: 06/09/2023] [Indexed: 06/24/2023]

278

Zhu H, Liu T, Wang Z. scHiMe: predicting single-cell DNA methylation levels based on single-cell Hi-C data. Brief Bioinform 2023:7193585. [PMID: 37302805 PMCID: PMC10359091 DOI: 10.1093/bib/bbad223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 05/10/2023] [Accepted: 05/23/2023] [Indexed: 06/13/2023] Open

279

Yang R, Das A, Gao VR, Karbalayghareh A, Noble WS, Bilmes JA, Leslie CS. Epiphany: predicting Hi-C contact maps from 1D epigenomic signals. Genome Biol 2023;24:134. [PMID: 37280678 PMCID: PMC10242996 DOI: 10.1186/s13059-023-02934-9] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 12/20/2021] [Accepted: 04/06/2023] [Indexed: 06/08/2023] Open

280

Zhou M, Zhang H, Baii Z, Mann-Krzisnik D, Wang F, Li Y. Single-cell multi-omic topic embedding reveals cell-type-specific and COVID-19 severity-related immune signatures. bioRxiv 2023:2023.01.31.526312. [PMID: 36778483 PMCID: PMC9915637 DOI: 10.1101/2023.01.31.526312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]

281

Cao C, Yang S, Li M, Li C. CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization. BMC Bioinformatics 2023;24:220. [PMID: 37254080 DOI: 10.1186/s12859-023-05352-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/24/2023] [Accepted: 05/25/2023] [Indexed: 06/01/2023] Open

Abstract

BACKGROUND

Circular RNAs (circRNAs) play a significant role in some diseases by acting as transcription templates. Therefore, analyzing the interaction mechanism between circRNA and RNA-binding proteins (RBPs) has far-reaching implications for the prevention and treatment of diseases. Existing models for circRNA-RBP identification usually adopt convolution neural network (CNN), recurrent neural network (RNN), or their variants as feature extractors. Most of them have drawbacks such as poor parallelism, insufficient stability, and inability to capture long-term dependencies.

METHODS

In this paper, we propose a new method completely using the self-attention mechanism to capture deep semantic features of RNA sequences. On this basis, we construct a CircSSNN model for the circRNA-RBP identification. The proposed model constructs a feature scheme by fusing circRNA sequence representations with statistical distributions, static local contexts, and dynamic global contexts. With a stable and efficient network architecture, the distance between any two positions in a sequence is reduced to a constant, so CircSSNN can quickly capture the long-term dependencies and extract the deep semantic features.

RESULTS

Experiments on 37 circRNA datasets show that the proposed model has overall advantages in stability, parallelism, and prediction performance. Keeping the network structure and hyperparameters unchanged, we directly apply the CircSSNN to linear RNA (linRNA) datasets. The favorable results show that CircSSNN can be transformed simply and efficiently without task-oriented tuning.

CONCLUSIONS

In conclusion, CircSSNN can serve as an appealing circRNA-RBP identification tool with good identification performance, excellent scalability, and wide application scope without the need for task-oriented fine-tuning of parameters, which is expected to reduce the professional threshold required for hyperparameter tuning in bioinformatics analysis.


282

Shen L, Feng H, Qiu Y, Wei GW. SVSBI: sequence-based virtual screening of biomolecular interactions. Commun Biol 2023;6:536. [PMID: 37202415 PMCID: PMC10195826 DOI: 10.1038/s42003-023-04866-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2023] [Accepted: 04/24/2023] [Indexed: 05/20/2023] Open

283

Smith GD, Ching WH, Cornejo-Páramo P, Wong ES. Decoding enhancer complexity with machine learning and high-throughput discovery. Genome Biol 2023;24:116. [PMID: 37173718 PMCID: PMC10176946 DOI: 10.1186/s13059-023-02955-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 04/28/2023] [Indexed: 05/15/2023] Open

284

Mourad R. Semi-supervised learning improves regulatory sequence prediction with unlabeled sequences. BMC Bioinformatics 2023;24:186. [PMID: 37147561 PMCID: PMC10163727 DOI: 10.1186/s12859-023-05303-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2022] [Accepted: 04/25/2023] [Indexed: 05/07/2023] Open

285

Grešová K, Martinek V, Čechák D, Šimeček P, Alexiou P. Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genom Data 2023;24:25. [PMID: 37127596 PMCID: PMC10150520 DOI: 10.1186/s12863-023-01123-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Accepted: 03/31/2023] [Indexed: 05/03/2023] Open

Abstract

BACKGROUND

Recently, deep neural networks have been successfully applied in many biological fields. In 2020, a deep learning model AlphaFold won the protein folding competition with predicted structures within the error tolerance of experimental methods. However, this solution to the most prominent bioinformatic challenge of the past 50 years has been possible only thanks to a carefully curated benchmark of experimentally predicted protein structures. In Genomics, we have similar challenges (annotation of genomes and identification of functional elements) but currently, we lack benchmarks similar to protein folding competition.

RESULTS

Here we present a collection of curated and easily accessible sequence classification datasets in the field of genomics. The proposed collection is based on a combination of novel datasets constructed from the mining of publicly available databases and existing datasets obtained from published articles. The collection currently contains nine datasets that focus on regulatory elements (promoters, enhancers, open chromatin region) from three model organisms: human, mouse, and roundworm. A simple convolution neural network is also included in a repository and can be used as a baseline model. Benchmarks and the baseline model are distributed as the Python package 'genomic-benchmarks', and the code is available at https://github.com/ML-Bioinfo-CEITEC/genomic_benchmarks .

CONCLUSIONS

Deep learning techniques revolutionized many biological fields but mainly thanks to the carefully curated benchmarks. For the field of Genomics, we propose a collection of benchmark datasets for the classification of genomic sequences with an interface for the most commonly used deep learning libraries, implementation of the simple neural network and a training framework that can be used as a starting point for future research. The main aim of this effort is to create a repository for shared datasets that will make machine learning for genomics more comparable and reproducible while reducing the overhead of researchers who want to enter the field, leading to healthy competition and new discoveries.


286

Rios-Martinez C, Bhattacharya N, Amini AP, Crawford L, Yang KK. Deep self-supervised learning for biosynthetic gene cluster detection and product classification. PLoS Comput Biol 2023;19:e1011162. [PMID: 37220151 PMCID: PMC10241353 DOI: 10.1371/journal.pcbi.1011162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2022] [Revised: 06/05/2023] [Accepted: 05/07/2023] [Indexed: 05/25/2023] Open

287

Soylu NN, Sefer E. BERT2OME: Prediction of 2'-O-Methylation Modifications From RNA Sequence by Transformer Architecture Based on BERT. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:2177-2189. [PMID: 37819796 DOI: 10.1109/tcbb.2023.3237769] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/13/2023]

288

Nikolados EM, Oyarzún DA. Deep learning for optimization of protein expression. Curr Opin Biotechnol 2023;81:102941. [PMID: 37087839 DOI: 10.1016/j.copbio.2023.102941] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2022] [Revised: 02/02/2023] [Accepted: 03/17/2023] [Indexed: 04/25/2023]

289

Kravchuk EV, Ashniev GA, Gladkova MG, Orlov AV, Vasileva AV, Boldyreva AV, Burenin AG, Skirda AM, Nikitin PI, Orlova NN. Experimental Validation and Prediction of Super-Enhancers: Advances and Challenges. Cells 2023;12:cells12081191. [PMID: 37190100 DOI: 10.3390/cells12081191] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2023] [Revised: 04/07/2023] [Accepted: 04/14/2023] [Indexed: 05/17/2023] Open

290

Barbero-Aparicio JA, Olivares-Gil A, Díez-Pastor JF, García-Osorio C. Deep learning and support vector machines for transcription start site identification. PeerJ Comput Sci 2023;9:e1340. [PMID: 37346545 PMCID: PMC10280436 DOI: 10.7717/peerj-cs.1340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2022] [Accepted: 03/21/2023] [Indexed: 06/23/2023]

Abstract

Recognizing transcription start sites is key to gene identification. Several approaches have been employed in related problems such as detecting translation initiation sites or promoters, many of the most recent ones based on machine learning. Deep learning methods have been proven to be exceptionally effective for this task, but their use in transcription start site identification has not yet been explored in depth. Also, the very few existing works do not compare their methods to support vector machines (SVMs), the most established technique in this area of study, nor provide the curated dataset used in the study. The reduced amount of published papers in this specific problem could be explained by this lack of datasets. Given that both support vector machines and deep neural networks have been applied in related problems with remarkable results, we compared their performance in transcription start site predictions, concluding that SVMs are computationally much slower, and deep learning methods, especially long short-term memory neural networks (LSTMs), are better suited to work with sequences than SVMs. For such a purpose, we used the reference human genome GRCh38. Additionally, we studied two different aspects related to data processing: the proper way to generate training examples and the imbalanced nature of the data. Furthermore, the generalization performance of the models studied was also tested using the mouse genome, where the LSTM neural network stood out from the rest of the algorithms. To sum up, this article provides an analysis of the best architecture choices in transcription start site identification, as well as a method to generate transcription start site datasets including negative instances on any species available in Ensembl. We found that deep learning methods are better suited than SVMs to solve this problem, being more efficient and better adapted to long sequences and large amounts of data. We also create a transcription start site (TSS) dataset large enough to be used in deep learning experiments.


291

Joiret M, Leclercq M, Lambrechts G, Rapino F, Close P, Louppe G, Geris L. Cracking the genetic code with neural networks. Front Artif Intell 2023;6:1128153. [PMID: 37091301 PMCID: PMC10117997 DOI: 10.3389/frai.2023.1128153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2022] [Accepted: 03/21/2023] [Indexed: 04/09/2023] Open

292

Rozowsky J, Gao J, Borsari B, Yang YT, Galeev T, Gürsoy G, Epstein CB, Xiong K, Xu J, Li T, Liu J, Yu K, Berthel A, Chen Z, Navarro F, Sun MS, Wright J, Chang J, Cameron CJF, Shoresh N, Gaskell E, Drenkow J, Adrian J, Aganezov S, Aguet F, Balderrama-Gutierrez G, Banskota S, Corona GB, Chee S, Chhetri SB, Cortez Martins GC, Danyko C, Davis CA, Farid D, Farrell NP, Gabdank I, Gofin Y, Gorkin DU, Gu M, Hecht V, Hitz BC, Issner R, Jiang Y, Kirsche M, Kong X, Lam BR, Li S, Li B, Li X, Lin KZ, Luo R, Mackiewicz M, Meng R, Moore JE, Mudge J, Nelson N, Nusbaum C, Popov I, Pratt HE, Qiu Y, Ramakrishnan S, Raymond J, Salichos L, Scavelli A, Schreiber JM, Sedlazeck FJ, See LH, Sherman RM, Shi X, Shi M, Sloan CA, Strattan JS, Tan Z, Tanaka FY, Vlasova A, Wang J, Werner J, Williams B, Xu M, Yan C, Yu L, Zaleski C, Zhang J, Ardlie K, Cherry JM, Mendenhall EM, Noble WS, Weng Z, Levine ME, Dobin A, Wold B, Mortazavi A, Ren B, Gillis J, Myers RM, Snyder MP, Choudhary J, Milosavljevic A, Schatz MC, Bernstein BE, Guigó R, Gingeras TR, Gerstein M. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell 2023;186:1493-1511.e40. [PMID: 37001506 PMCID: PMC10074325 DOI: 10.1016/j.cell.2023.02.018] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2022] [Revised: 10/16/2022] [Accepted: 02/10/2023] [Indexed: 04/03/2023]

Affiliation(s)

Joel Rozowsky Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Jiahao Gao Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Beatrice Borsari Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
Yucheng T Yang Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Timur Galeev Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Gamze Gürsoy Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Charles B Epstein Broad Institute of MIT and Harvard, Cambridge, MA, USA
Kun Xiong Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Jinrui Xu Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Tianxiao Li Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Jason Liu Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Keyang Yu Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
Ana Berthel Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Zhanlin Chen Department of Statistics and Data Science, Yale University, New Haven, CT, USA
Fabio Navarro Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Maxwell S Sun Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
James Wright Institute of Cancer Research, London, UK
Justin Chang Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Christopher J F Cameron Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Noam Shoresh Broad Institute of MIT and Harvard, Cambridge, MA, USA
Elizabeth Gaskell Broad Institute of MIT and Harvard, Cambridge, MA, USA
Jorg Drenkow Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Jessika Adrian Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Sergey Aganezov Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
François Aguet Broad Institute of MIT and Harvard, Cambridge, MA, USA
Gabriela Balderrama-Gutierrez Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA
Samridhi Banskota Broad Institute of MIT and Harvard, Cambridge, MA, USA
Guillermo Barreto Corona Broad Institute of MIT and Harvard, Cambridge, MA, USA
Sora Chee Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
Surya B Chhetri HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Gabriel Conte Cortez Martins Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Cassidy Danyko Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Carrie A Davis Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Daniel Farid Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Nina P Farrell Broad Institute of MIT and Harvard, Cambridge, MA, USA
Idan Gabdank Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Yoel Gofin Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
David U Gorkin Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
Mengting Gu Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Vivian Hecht Broad Institute of MIT and Harvard, Cambridge, MA, USA
Benjamin C Hitz Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Robbyn Issner Broad Institute of MIT and Harvard, Cambridge, MA, USA
Yunzhe Jiang Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Melanie Kirsche Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
Xiangmeng Kong Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Bonita R Lam Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Shantao Li Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Bian Li Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Xiqi Li Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
Khine Zin Lin Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Ruibang Luo Department of Computer Science, The University of Hong Kong, Hong Kong, CHN
Mark Mackiewicz HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Ran Meng Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Jill E Moore Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
Jonathan Mudge European Bioinformatics Institute, Cambridge, Cambridgeshire, GB
Nicholas Nelson Broad Institute of MIT and Harvard, Cambridge, MA, USA
Chad Nusbaum Broad Institute of MIT and Harvard, Cambridge, MA, USA
Ioann Popov Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Henry E Pratt Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
Yunjiang Qiu Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
Srividya Ramakrishnan Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
Joe Raymond Broad Institute of MIT and Harvard, Cambridge, MA, USA
Leonidas Salichos Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Biological and Chemical Sciences, New York Institute of Technology, Old Westbury, NY, USA
Alexandra Scavelli Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Jacob M Schreiber Department of Genome Sciences, University of Washington, Seattle, WA, USA
Fritz J Sedlazeck Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
Lei Hoon See Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Rachel M Sherman Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
Xu Shi Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Minyi Shi Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Cricket Alicia Sloan Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
J Seth Strattan Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Zhen Tan Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Forrest Y Tanaka Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Anna Vlasova Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Comparative Genomics Group, Life Science Programme, Barcelona Supercomputing Centre, Barcelona, Spain; Institute of Research in Biomedicine, Barcelona, Spain
Jun Wang Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Jonathan Werner Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Brian Williams Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
Min Xu Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Chengfei Yan Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Lu Yu Institute of Cancer Research, London, UK
Christopher Zaleski Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Jing Zhang Department of Computer Science, University of California, Irvine, Irvine, CA, USA
Kristin Ardlie Broad Institute of MIT and Harvard, Cambridge, MA, USA
J Michael Cherry Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Eric M Mendenhall HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
William S Noble Department of Genome Sciences, University of Washington, Seattle, WA, USA
Zhiping Weng Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
Morgan E Levine Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
Alexander Dobin Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Barbara Wold Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
Ali Mortazavi Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA
Bing Ren Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
Jesse Gillis Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Department of Physiology, University of Toronto, Toronto, ON, Canada
Richard M Myers HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Michael P Snyder Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
Jyoti Choudhary Institute of Cancer Research, London, UK
Aleksandar Milosavljevic Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
Michael C Schatz Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
Bradley E Bernstein Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
Roderic Guigó Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.
Thomas R Gingeras Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
Mark Gerstein Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Statistics and Data Science, Yale University, New Haven, CT, USA; Department of Computer Science, Yale University, New Haven, CT, USA.


293

Wang L, Sun J, Ma S, Xia J, Li X. PredDSMC: A predictor for driver synonymous mutations in human cancers. Front Genet 2023;14:1164593. [PMID: 37051593 PMCID: PMC10083435 DOI: 10.3389/fgene.2023.1164593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2023] [Accepted: 03/09/2023] [Indexed: 03/29/2023] Open

294

Huang Z, Wang J, Lu X, Mohd Zain A, Yu G. scGGAN: single-cell RNA-seq imputation by graph-based generative adversarial network. Brief Bioinform 2023;24:7024714. [PMID: 36733262 DOI: 10.1093/bib/bbad040] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2022] [Revised: 12/21/2022] [Accepted: 01/18/2023] [Indexed: 02/04/2023] Open

295

M6A-BERT-Stacking: A Tissue-Specific Predictor for Identifying RNA N6-Methyladenosine Sites Based on BERT and Stacking Strategy. Symmetry (Basel) 2023. [DOI: 10.3390/sym15030731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/17/2023] Open

296

Luo H, Shan W, Chen C, Ding P, Luo L. Improving language model of human genome for DNA-protein binding prediction based on task-specific pre-training. Interdiscip Sci 2023;15:32-43. [PMID: 36136096 DOI: 10.1007/s12539-022-00537-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 08/30/2022] [Accepted: 09/07/2022] [Indexed: 11/27/2022]

297

Wang X, Zhang M, Long C, Yao L, Zhu M. Self-Attention Based Neural Network for Predicting RNA-Protein Binding Sites. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:1469-1479. [PMID: 36067103 DOI: 10.1109/tcbb.2022.3204661] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]

298

Hwang H, Chang HR, Baek D. Determinants of Functional MicroRNA Targeting. Mol Cells 2023;46:21-32. [PMID: 36697234 PMCID: PMC9880601 DOI: 10.14348/molcells.2023.2157] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2022] [Revised: 11/09/2022] [Accepted: 11/15/2022] [Indexed: 01/27/2023] Open

299

Jonnakuti VS, Wagner EJ, Maletić-Savatić M, Liu Z, Yalamanchili HK. PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.23.523471. [PMID: 36747700 PMCID: PMC9900750 DOI: 10.1101/2023.01.23.523471] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]

300

Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. CELL REPORTS METHODS 2023;3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]

Affiliation(s)

Zhongxiao Li Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Elva Gao The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Juexiao Zhou Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Wenkai Han Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Xiaopeng Xu Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Xin Gao Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
