1
Frederick MJ, Perez-Bello D, Yadollahi P, Castro P, Frederick A, Frederick A, Osman RA, Essien F, Yebra I, Hamlin A, Ow TJ, Skinner HD, Sandulache VC. Reliable RNA-seq analysis from FFPE specimens as a means to accelerate cancer-related health disparities research. PLoS One 2025; 20:e0321631. [PMID: 40258023 PMCID: PMC12011225 DOI: 10.1371/journal.pone.0321631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2024] [Accepted: 03/10/2025] [Indexed: 04/23/2025]
Abstract
Whole transcriptome sequencing (WTS/RNA-Seq) is a ubiquitous tool for investigating cancer biology. Reliance on RNA isolated from frozen sources limits studies of associations with phenotypes or clinical variables that require long-term follow-up. Although good correlations are reported in RNA-Seq data from paired frozen and formalin-fixed paraffin-embedded (FFPE) samples, uncertainties regarding RNA quality, methods of extraction, and data reliability are hurdles to utilization of archival samples. We compared three different platforms for performing RNA-seq using archival FFPE oropharyngeal squamous carcinoma (OPSCC) specimens stored up to 20 years, as part of an investigation of transcriptional profiles related to health disparities. We developed guidelines to purify DNA and RNA from FFPE tissue and perform downstream RNA-seq and DNA SNP arrays. RNA was extracted from 150 specimens, with an average yield of 401.8 ng/cm² of tissue. Most samples yielded sufficient RNA reads for >13,000 protein-coding genes, which could be used to differentiate HPV-associated from HPV-independent OPSCCs. Co-isolated DNA was used to reliably define patient ancestry, which correlated well with patient-reported race. Utilizing the methods described in this study provides a robust, reliable, and standardized means of DNA & RNA extraction from FFPE as well as a means by which to assure the quality of the data generated. Optimized RNA extraction techniques, combined with robust bioinformatic approaches designed to optimize data homogenization, analysis, and biological validation, can revolutionize our ability to transcriptomically profile large solid tumor sets derived from ancestrally varied patient populations.
Affiliation(s)
- Mitchell J. Frederick
  - Bobby R. Alford Department of Otolaryngology Head and Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
- Dannelys Perez-Bello
  - Bobby R. Alford Department of Otolaryngology Head and Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
- Pedram Yadollahi
  - Bobby R. Alford Department of Otolaryngology Head and Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
- Patricia Castro
  - Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, United States
- Rashid A. Osman
  - Department of Biological Sciences, Vanderbilt University College of Arts and Science, Nashville, Tennessee, United States of America
- Fonma Essien
  - Bobby R. Alford Department of Otolaryngology Head and Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
- Imelda Yebra
  - Bobby R. Alford Department of Otolaryngology Head and Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
- Ashley Hamlin
  - Bobby R. Alford Department of Otolaryngology Head and Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
- Thomas J. Ow
  - Department of Otorhinolaryngology-Head and Neck Surgery, Montefiore Medical Center, Bronx, New York, United States of America
  - Department of Pathology, Montefiore Medical Center, Bronx, New York, United States of America
- Heath D. Skinner
  - Department of Radiation Oncology, UPMC Hillman Cancer Center, Pittsburgh, Pennsylvania, United States of America
- Vlad C. Sandulache
  - Bobby R. Alford Department of Otolaryngology Head and Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
  - ENT Section, Operative CareLine, Michael E. DeBakey VAMC, Houston, Texas, United States of America
  - Center for Translational Research on Inflammatory Diseases, Michael E. DeBakey Veterans Affairs Medical Center, Houston, Texas, United States of America
2
Frederick MJ, Perez-Bello D, Yadollahi P, Castro P, Frederick A, Frederick A, Osman RA, Essien F, Yebra I, Hamlin A, Ow TJ, Skinner HD, Sandulache VC. Reliable RNA-seq analysis from FFPE specimens as a means to accelerate cancer-related health disparities research. bioRxiv 2024:2024.10.10.617597. [PMID: 39416147 PMCID: PMC11482925 DOI: 10.1101/2024.10.10.617597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2024]
Abstract
Whole transcriptome sequencing (WTS/RNA-Seq) is a ubiquitous tool for investigating cancer biology. Reliance on RNA isolated from frozen sources limits studies of associations with phenotypes or clinical variables that require long-term follow-up. Although good correlations are reported in RNA-Seq data from paired frozen and formalin-fixed paraffin-embedded (FFPE) samples, uncertainties regarding RNA quality, methods of extraction, and data reliability are hurdles to utilization of archival samples. We compared three different platforms for performing RNA-seq using archival FFPE oropharyngeal squamous carcinoma (OPSCC) specimens stored up to 20 years, as part of an investigation of transcriptional profiles related to health disparities. We developed guidelines to purify DNA and RNA from FFPE tissue and perform downstream RNA-seq and DNA SNP arrays. RNA was extracted from 150 specimens, with an average yield of 401.8 ng/cm² of tissue. Most samples yielded sufficient RNA reads for >13,000 protein-coding genes, which could be used to differentiate HPV-associated from HPV-independent OPSCCs. Co-isolated DNA was used to identify patient ancestry. Utilizing the methods described in this study provides a robust, reliable, and standardized means of DNA & RNA extraction from FFPE as well as a means by which to assure the quality of the data generated.
3
Yeh CH, Chou YJ, Tsai TH, Hsu PWC, Li CH, Chan YH, Tsai SF, Ng SC, Chou KM, Lin YC, Juan YH, Fu TC, Lai CC, Sytwu HK, Tsai TF. Artificial-Intelligence-Assisted Discovery of Genetic Factors for Precision Medicine of Antiplatelet Therapy in Diabetic Peripheral Artery Disease. Biomedicines 2022; 10:biomedicines10010116. [PMID: 35052795 PMCID: PMC8773099 DOI: 10.3390/biomedicines10010116] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/05/2021] [Revised: 12/30/2021] [Accepted: 01/04/2022] [Indexed: 12/15/2022]
Abstract
An increased risk of cardiovascular events was identified in patients with peripheral artery disease (PAD). Clopidogrel is one of the most widely used antiplatelet medications. However, there are heterogeneous outcomes when clopidogrel is used to prevent cardiovascular events in PAD patients. Here, we use an artificial intelligence (AI)-assisted methodology to identify genetic factors potentially involved in the clopidogrel-resistant mechanism, which is currently unclear. Several discoveries can be pinpointed. Firstly, a high proportion (>50%) of clopidogrel resistance was found among diabetic PAD patients in Taiwan. Interestingly, our result suggests that platelet function test-guided antiplatelet therapy appears to reduce the post-interventional occurrence of major adverse cerebrovascular and cardiac events in diabetic PAD patients. Secondly, AI-assisted genome-wide association study of a single-nucleotide polymorphism (SNP) database identified a SNP signature composed of 20 SNPs, which are mapped into 9 protein-coding genes (SLC37A2, IQSEC1, WASHC3, PSD3, BTBD7, GLIS3, PRDM11, LRBA1, and CNR1). Finally, analysis of the protein connectivity map revealed that LRBA, GLIS3, BTBD7, IQSEC1, and PSD3 appear to form a protein interaction network. Intriguingly, the genetic factors seem to pinpoint a pathway related to endocytosis and recycling of P2Y12 receptor, which is the drug target of clopidogrel. Our findings reveal that a combination of AI-assisted discovery of SNP signatures and clinical parameters has the potential to develop an ethnic-specific precision medicine for antiplatelet therapy in diabetic PAD patients.
Affiliation(s)
- Chi-Hsiao Yeh
  - Department of Thoracic and Cardiovascular Surgery, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
  - College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
  - Community Medicine Research Center, Chang Gung Memorial Hospital, Keelung 204, Taiwan
- Yi-Ju Chou
  - Institute of Molecular and Genomic Medicine, National Health Research Institutes, Miaoli 350, Taiwan
- Tsung-Hsien Tsai
  - Advanced Tech BU, Acer Inc., New Taipei City 221, Taiwan
- Paul Wei-Che Hsu
  - Institute of Molecular and Genomic Medicine, National Health Research Institutes, Miaoli 350, Taiwan
- Chun-Hsien Li
  - Advanced Tech BU, Acer Inc., New Taipei City 221, Taiwan
- Yun-Hsuan Chan
  - Advanced Tech BU, Acer Inc., New Taipei City 221, Taiwan
- Shih-Feng Tsai
  - Institute of Molecular and Genomic Medicine, National Health Research Institutes, Miaoli 350, Taiwan
- Soh-Ching Ng
  - Department of Internal Medicine, Division of Endocrinology and Metabolism, Chang Gung Memorial Hospital, Keelung 204, Taiwan
- Kuei-Mei Chou
  - Department of Internal Medicine, Division of Endocrinology and Metabolism, Chang Gung Memorial Hospital, Keelung 204, Taiwan
- Yu-Ching Lin
  - College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
  - Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital, Keelung 204, Taiwan
- Yu-Hsiang Juan
  - College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
  - Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital, Keelung 204, Taiwan
- Tieh-Cheng Fu
  - College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
  - Department of Physical Medicine and Rehabilitation, Chang Gung Memorial Hospital, Keelung 204, Taiwan
- Chi-Chun Lai
  - College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
  - Community Medicine Research Center, Chang Gung Memorial Hospital, Keelung 204, Taiwan
  - Department of Ophthalmology, Chang Gung Memorial Hospital, Keelung 204, Taiwan
  - Correspondence: (C.-C.L.); (H.-K.S.); (T.-F.T.); Tel.: +886-2-24313131 (ext. 6101) (C.-C.L.); +886-37-206166 (ext. 31010) (H.-K.S.); +886-2-28267293 (T.-F.T.)
- Huey-Kang Sytwu
  - National Institute of Infectious Diseases and Vaccinology, National Health Research Institutes, Miaoli 350, Taiwan
  - National Defense Medical Center, Department & Graduate Institute of Microbiology and Immunology, Taipei 114, Taiwan
- Ting-Fen Tsai
  - Institute of Molecular and Genomic Medicine, National Health Research Institutes, Miaoli 350, Taiwan
  - Departments of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, Taipei 112, Taiwan
  - Center for Healthy Longevity and Aging Sciences, National Yang Ming Chiao Tung University, Taipei 112, Taiwan
4
Mármol-Sánchez E, Luigi-Sierra MG, Castelló A, Guan D, Quintanilla R, Tonda R, Amills M. Variability in porcine microRNA genes and its association with mRNA expression and lipid phenotypes. Genet Sel Evol 2021; 53:43. [PMID: 33947333 PMCID: PMC8097994 DOI: 10.1186/s12711-021-00632-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2020] [Accepted: 04/15/2021] [Indexed: 01/02/2023]
Abstract
BACKGROUND Mature microRNAs (miRNAs) play an important role in repressing the expression of a wide range of mRNAs. The presence of polymorphic sites in miRNA genes and their corresponding 3'UTR binding sites can disrupt canonical conserved miRNA-mRNA pairings, and thus modify gene expression patterns. However, to date such polymorphic sites in miRNA genes and their association with gene expression phenotypes and complex traits are poorly characterized in pigs. RESULTS By analyzing whole-genome sequences from 120 pigs and wild boars from Europe and Asia, we identified 285 single nucleotide polymorphisms (SNPs) that map to miRNA loci, and 109,724 SNPs that are located in predicted 7mer-m8 miRNA binding sites within porcine 3'UTR. In porcine miRNA genes, SNP density is reduced compared with their flanking non-miRNA regions. By sequencing the genomes of five Duroc boars, we identified 12 miRNA SNPs that were subsequently genotyped in their offspring (N = 345, Lipgen population). Association analyses of miRNA SNPs with 38 lipid-related traits and hepatic and muscle microarray expression phenotypes recorded in the Lipgen population were performed. The most relevant detected association was between the genotype of the rs319154814 (G/A) SNP located in the apical loop of the ssc-miR-326 hairpin precursor and PPP1CC mRNA levels in the liver (q-value = 0.058). This result was subsequently confirmed by qPCR (P-value = 0.027). The rs319154814 (G/A) genotype was also associated with several fatty acid composition traits. CONCLUSIONS Our findings show a reduced variability of porcine miRNA genes, which is consistent with strong purifying selection, particularly in the seed region that plays a critical role in miRNA binding. 
Although it is generally assumed that SNPs mapping to the seed region are those with the most pronounced consequences on mRNA expression, we show that a SNP mapping to the apical region of ssc-miR-326 is significantly associated with hepatic mRNA levels of the PPP1CC gene, one of its predicted targets. Although experimental confirmation of such an interaction is reported in humans but not in pigs, this result highlights the need to further investigate the functional effects of miRNA polymorphisms that are located outside the seed region on gene expression in pigs.
Affiliation(s)
- Emilio Mármol-Sánchez
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain
- María Gracia Luigi-Sierra
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain
- Anna Castelló
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain
  - Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain
- Dailu Guan
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain
- Raquel Quintanilla
  - Animal Breeding and Genetics Program, Institute for Research and Technology in Food and Agriculture (IRTA), Torre Marimon, 08140, Caldes de Montbui, Spain
- Raul Tonda
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Marcel Amills
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain
  - Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain
5
Ooi BNS, Raechell, Ying AF, Koh YZ, Jin Y, Yee SWL, Lee JHS, Chong SS, Tan JWC, Liu J, Lee CG, Drum CL. Robust Performance of Potentially Functional SNPs in Machine Learning Models for the Prediction of Atorvastatin-Induced Myalgia. Front Pharmacol 2021; 12:605764. [PMID: 33967749 PMCID: PMC8100589 DOI: 10.3389/fphar.2021.605764] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/13/2020] [Accepted: 03/08/2021] [Indexed: 12/20/2022]
Abstract
Statins can cause muscle symptoms resulting in poor adherence to therapy and increased cardiovascular risk. We hypothesize that combinations of potentially functional SNPs (pfSNPs), rather than individual SNPs, better predict myalgia in patients on atorvastatin. This study assesses the value of potentially functional single nucleotide polymorphisms (pfSNPs) and employs six machine learning algorithms to identify the combination of SNPs that best predict myalgia. Methods: Whole genome sequencing of 183 Chinese, Malay and Indian patients from Singapore was conducted to identify genetic variants associated with atorvastatin induced myalgia. To adjust for confounding factors, demographic and clinical characteristics were also examined for their association with myalgia. The top factor, sex, was then used as a covariate in the whole genome association analyses. Variants that were highly associated with myalgia from this and previous studies were extracted, assessed for potential functionality (pfSNPs) and incorporated into six machine learning models. Predictive performance of a combination of different models and inputs were compared using the average cross validation area under ROC curve (AUC). The minimum combination of SNPs to achieve maximum sensitivity and specificity as determined by AUC, that predict atorvastatin-induced myalgia in most, if not all the six machine learning models was determined. Results: Through whole genome association analyses using sex as a covariate, a larger proportion of pfSNPs compared to non-pf SNPs were found to be highly associated with myalgia. Although none of the individual SNPs achieved genome wide significance in univariate analyses, machine learning models identified a combination of 15 SNPs that predict myalgia with good predictive performance (AUC >0.9). SNPs within genes identified in this study significantly outperformed SNPs within genes previously reported to be associated with myalgia. 
pfSNPs were found to be more robust in predicting myalgia, outperforming non-pf SNPs in the majority of machine learning models tested. Conclusion: Combinations of pfSNPs that were consistently identified by different machine learning models to have high predictive performance have good potential to be clinically useful for predicting atorvastatin-induced myalgia once validated against an independent cohort of patients.
Affiliation(s)
- Brandon N S Ooi
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Dundee, Singapore
- Raechell
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Dundee, Singapore
- Yong Zher Koh
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Dundee, Singapore
- Yu Jin
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, Singapore, Singapore
- Sherman W L Yee
  - Department of Medicine, Yong Loo Lin School of Medicine, Cardiovascular Research Institute, National University of Singapore, Singapore, Singapore
- Samuel S Chong
  - Department of Pediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Jack W C Tan
  - Department of Cardiology, National Heart Centre Singapore, Singapore, Singapore
- Jianjun Liu
  - Genome Institute of Singapore, Singapore, Singapore
- Caroline G Lee
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Dundee, Singapore
  - Duke-NUS Graduate School, Singapore, Singapore
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, Singapore, Singapore
  - NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore, Singapore
- Chester L Drum
  - Department of Medicine, Yong Loo Lin School of Medicine, Cardiovascular Research Institute, National University of Singapore, Singapore, Singapore
  - Translational Laboratory in Genetic Medicine, Singapore, Singapore