1
Jonas F, Navon Y, Barkai N. Intrinsically disordered regions as facilitators of the transcription factor target search. Nat Rev Genet 2025; 26:424-435. [PMID: 39984675 DOI: 10.1038/s41576-025-00816-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/14/2025] [Indexed: 02/23/2025]
Abstract
Transcription factors (TFs) contribute to organismal development and function by regulating gene expression. Despite decades of research, the factors determining the specificity and speed at which eukaryotic TFs detect their target binding sites remain poorly understood. Recent studies have pointed to intrinsically disordered regions (IDRs) within TFs as key regulators of the process by which TFs find their target sites on DNA (the TF target search). However, IDRs are challenging to study because they can confer specificity despite low sequence complexity and can be functionally conserved despite rapid sequence divergence. Nevertheless, emerging computational and experimental approaches are beginning to elucidate the sequence-function relationship within the IDRs of TFs. Additional insights are informing potential mechanisms underlying the IDR-directed search for the DNA targets of TFs, including incorporation into biomolecular condensates, facilitating TF co-localization, and the hypothesis that IDRs recognize and directly interact with specific genomic regions.
Affiliation(s)
- Felix Jonas
  - School of Science, Constructor University, Bremen, Germany
- Yoav Navon
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
2
Dhibar S, Jana B. Optimized Collective Variable for Collapse Transition in Linear Hydrophobic Polymers: Importance of Hydration Water and End-to-End Distance. J Chem Theory Comput 2024; 20:7404-7415. [PMID: 39252562 DOI: 10.1021/acs.jctc.4c00753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Choosing an appropriate collective variable (CV) for any biomolecular process is a challenging task. Researchers are developing a variety of methods to address this issue, most recently ones based on machine learning (ML). In this work, we investigate the mechanism of the collapse transition across polymer systems of various lengths through adaptively sampled multiple short trajectories, using the time-lagged independent component analysis (TICA) framework. TICA reveals that the radius of gyration (Rg) and the end-to-end distance serve as good order parameters (OPs) for these systems, describing the overall energy landscapes. Markov state model (MSM) and mean first passage time (MFPT) analyses suggest that hydration water (Nw) plays a determining role in dictating the time scale and barrier of the collapse transition for the C40 system. P-fold (committor) analysis used to identify the transition state ensemble (TSE) further supports the role of Nw in this transition. TICA, MSM, and committor analyses of the collapse transition for C45 reveal similarities with the C40 system in several respects. Furthermore, we propose a pipeline integrating XGBoost regression with an interpretable ML model, SHapley Additive exPlanations (SHAP), to precisely elucidate the contribution of each OP locally at the TSE. Through this approach, we observe that the collapse transition is primarily driven by Nw for both polymer systems. A carefully designed protocol for the collapse transition of the C60 system indirectly corroborates this result. Overall, our results suggest that while the end-to-end distance should be considered for better resolution of metastable states in the landscape, Nw is the crucial coordinate to use in enhanced sampling when exploring actual collapse transitions of linear hydrophobic polymer systems.
The Python code for analyzing the contribution of different OPs in the TSE using an ML-aided protocol is available on GitHub (https://github.com/saikat-ai/linear_polymer_project).
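The two geometric order parameters named in this abstract, the radius of gyration (Rg) and the end-to-end distance, can be illustrated with a minimal sketch. This is not the authors' code (their analysis lives in the GitHub repository cited above); it assumes equal-mass beads given as 3D coordinates, and the function names and the toy chain are hypothetical.

```python
import math

def radius_of_gyration(coords):
    """Rg of a chain of equal-mass beads (assumption: all masses equal)."""
    n = len(coords)
    # Geometric centre of the chain, one component per axis.
    center = [sum(p[d] for p in coords) / n for d in range(3)]
    # Mean squared distance of the beads from that centre.
    sq_sum = sum(
        sum((p[d] - center[d]) ** 2 for d in range(3)) for p in coords
    )
    return math.sqrt(sq_sum / n)

def end_to_end_distance(coords):
    """Euclidean distance between the first and last bead."""
    return math.dist(coords[0], coords[-1])

if __name__ == "__main__":
    # Hypothetical toy chain: three beads on a straight line.
    chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(radius_of_gyration(chain))
    print(end_to_end_distance(chain))
```

For an extended chain both values are large and they shrink together on collapse, which is why they are natural candidate coordinates; the hydration-water count Nw discussed in the abstract needs solvent coordinates and is not covered by this sketch.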
Affiliation(s)
- Saikat Dhibar
  - School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India
- Biman Jana
  - School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India
3
Lee SH, Lee J, Kim DW, Kim DH, Ahn SJ, Choi MG, Jo S, Suh CH, Chung SJ. Factors to predict recurrence after epidural blood patch in patients with spontaneous intracranial hypotension. Headache 2024; 64:380-389. [PMID: 38634709 DOI: 10.1111/head.14703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 01/10/2024] [Accepted: 01/16/2024] [Indexed: 04/19/2024]
Abstract
OBJECTIVES This study aimed to identify predictors for the recurrence of spontaneous intracranial hypotension (SIH) after epidural blood patch (EBP).
BACKGROUND Epidural blood patch is the main treatment option for SIH; however, the characteristics of patients who experience relapse after successful EBP treatment for SIH remain understudied.
METHODS In this exploratory, retrospective, case-control study, we included 19 patients with SIH recurrence after EBP and 36 age- and sex-matched patients without recurrence from a single tertiary medical institution. We analyzed clinical characteristics, neuroimaging findings, and volume changes in intracranial structures after EBP treatment. Machine learning methods were utilized to predict the recurrence of SIH after EBP treatment.
RESULTS There were no significant differences in clinical features between the recurrence and no-recurrence groups. Among brain magnetic resonance imaging signs, diffuse pachymeningeal enhancement and cerebral venous dilatation were more prominent in the recurrence group than in the no-recurrence group after EBP (14/19 [73%] vs. 8/36 [22%] patients, p = 0.001; 11/19 [57%] vs. 7/36 [19%] patients, p = 0.010, respectively). The midbrain-pons angle decreased in the recurrence group compared to the no-recurrence group after EBP, at a mean (standard deviation [SD]) of -12.0 [16.7]° vs. +1.8 [18.3]° (p = 0.048). In volumetric analysis, volume changes after EBP were smaller in the recurrence group than in the no-recurrence group in intracranial cerebrospinal fluid (mean [SD] -11.6 [15.3] vs. +4.8 [17.1] mL, p = 0.001) and ventricles (mean [SD] +1.0 [2.0] vs. +2.0 [2.5] mL, p = 0.003). Notably, the random forest classifier indicated that the model constructed with brain volumetry was more accurate in discriminating SIH recurrence (area under the curve = 0.80 vs. 0.52).
CONCLUSION Our study suggests that volumetric analysis of intracranial structures may aid in predicting recurrence after EBP treatment in patients with SIH.
Affiliation(s)
- Seung Hyun Lee
  - Department of Neurology, Seoul Medical Center, Seoul, South Korea
- Jooyoung Lee
  - Department of Applied Statistics, Chung-Ang University, Seoul, South Korea
- Da-Woon Kim
  - Department of Applied Statistics, Chung-Ang University, Seoul, South Korea
- Dong Hyun Kim
  - Department of Neurology, Seoul Medical Center, Seoul, South Korea
- Sung Jae Ahn
  - Department of Neurology, Seoul Medical Center, Seoul, South Korea
- Moon Gwan Choi
  - Department of Neurology, Seoul Medical Center, Seoul, South Korea
- Sungyang Jo
  - Department of Neurology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, South Korea
- Chong Hyun Suh
  - Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, South Korea
- Sun J Chung
  - Department of Neurology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, South Korea
4
Dhibar S, Jana B. Accurate Prediction of Antifreeze Protein from Sequences through Natural Language Text Processing and Interpretable Machine Learning Approaches. J Phys Chem Lett 2023; 14:10727-10735. [PMID: 38009833 DOI: 10.1021/acs.jpclett.3c02817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Antifreeze proteins (AFPs) bind to growing ice planes owing to their structural complementarity, thereby inhibiting ice-crystal growth through thermal hysteresis. Classifying AFPs from sequence is difficult because of their low sequence similarity; consequently, the usual sequence-similarity algorithms, such as BLAST and PSI-BLAST, are not effective. Here, a method combining n-gram feature vectors with machine learning models is proposed to accelerate the identification of potential AFPs from sequences. All n-gram features are extracted by k-mer counting. A comparative analysis reveals that XGBoost outperforms the other machine learning models in predicting AFPs from sequence when pentamers are used as the feature vector. When tested on an independent dataset, our method outperformed existing ones, with a sensitivity of 97.50%, recall of 98.30%, and F1 score of 99.10%. Further, we used the SHAP method, which provides important insight into the functional activity of AFPs.
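The k-mer counting step behind the n-gram features can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function names, the overlapping-window convention, and the toy sequence are assumptions, and the classifier itself is omitted.

```python
from collections import Counter

def kmer_counts(sequence, k=5):
    """Count overlapping k-mers (n-grams of residues) in a protein sequence."""
    seq = sequence.upper()
    # A sequence shorter than k yields an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def feature_vector(sequence, vocabulary, k=5):
    """Map a sequence onto a fixed, ordered k-mer vocabulary (hypothetical helper)."""
    counts = kmer_counts(sequence, k)
    return [counts.get(kmer, 0) for kmer in vocabulary]

if __name__ == "__main__":
    toy = "ACDEFACDEF"  # hypothetical toy sequence, not a real AFP
    vocab = sorted(kmer_counts(toy, k=5))
    print(vocab)
    print(feature_vector(toy, vocab, k=5))
```

In a full pipeline the vocabulary would be built from the training sequences only, and the resulting count vectors would be passed to a classifier such as the XGBoost model mentioned in the abstract.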
Affiliation(s)
- Saikat Dhibar
  - School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India
- Biman Jana
  - School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India
5
Qayyum M, Zhang Y, Wang M, Yu Y, Li S, Ahmad W, Maodaa SN, Sayed SRM, Gan J. Advancements in technology and innovation for sustainable agriculture: Understanding and mitigating greenhouse gas emissions from agricultural soils. J Environ Manage 2023; 347:119147. [PMID: 37776793 DOI: 10.1016/j.jenvman.2023.119147] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/29/2023] [Revised: 09/03/2023] [Accepted: 09/22/2023] [Indexed: 10/02/2023]
Abstract
In recent decades, Technology and Innovation (TI) have shown tremendous potential for improving agricultural productivity and environmental sustainability. However, the adoption and implementation of TI in the agricultural sector and its impact on the environment remain limited. To gain deeper insights into the significance of TI in enhancing agricultural productivity while maintaining environmental balance, this study investigates 21 agriculture-dependent Asian countries. Two machine learning techniques, LASSO (Least Absolute Shrinkage and Selection Operator) and Elastic-Net, are employed to analyze the data, which is categorized into three regional groups: ASEAN (Association of Southeast Asian Nations), SAARC (South Asian Association for Regional Cooperation), and GCC (Gulf Cooperation Council). The findings of this study highlight the heterogeneous nature of technology adoption and its environmental implications across the three country groups. ASEAN countries emerge as proactive adopters of relevant technologies, effectively enhancing agricultural production while simultaneously upholding environmental quality. Conversely, SAARC countries exhibit weaker technology adoption, leading to significant fluctuations in environmental quality, which in turn impact agricultural productivity. Notably, agricultural emissions of N2O (nitrous oxide) and CO2 (carbon dioxide) in SAARC countries show a positive association with agricultural production, while CH4 (methane) emissions have an adverse effect. In contrast, the study reveals a lack of evidence regarding technological adoption in agriculture among GCC countries. Surprisingly, higher agricultural productivity in these countries is correlated with increased N2O emissions. Moreover, the results indicate that deforestation and expansion of cropland contribute to increased agricultural production; however, this expansion is accompanied by higher emissions related to agricultural activities. 
This research represents a pioneering empirical analysis of the impact of TI and environmental emission gases on agricultural productivity in the three aforementioned country groups. It underscores the imperative of embracing relevant technologies to enhance agricultural output while concurrently ensuring environmental sustainability. The findings of this study provide valuable insights for policymakers and stakeholders in formulating strategies to promote sustainable agriculture and technological advancement in the context of diverse regional dynamics.
Affiliation(s)
- Muhammad Qayyum
  - School of Economics and Statistics, Guangzhou University, Guangzhou, China
- Yanping Zhang
  - School of Management, Guangzhou University, Guangzhou, China
- Mansi Wang
  - School of Innovation and Entrepreneurship, Guangzhou University, Guangzhou, China
- Yuyuan Yu
  - Department of Economics and Finance, City University of Hong Kong, China
- Shijie Li
  - School of Economics, Nankai University, Tianjin City, China
- Wasim Ahmad
  - School of Economics and Statistics, Guangzhou University, Guangzhou, China
- Saleh N Maodaa
  - Department of Botany and Microbiology, College of Science, King Saud University, P.O. Box 2455, Riyadh, 11451, Saudi Arabia
- Shaban R M Sayed
  - Department of Botany and Microbiology, College of Science, King Saud University, P.O. Box 2455, Riyadh, 11451, Saudi Arabia
- Jiawei Gan
  - School of Management, Guangzhou University, Guangzhou, China
6
Smirnov E, Molínová P, Chmúrčiaková N, Vacík T, Cmarko D. Non-canonical DNA structures in the human ribosomal DNA. Histochem Cell Biol 2023; 160:499-515. [PMID: 37750997 DOI: 10.1007/s00418-023-02233-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Accepted: 08/15/2023] [Indexed: 09/27/2023]
Abstract
Non-canonical structures (NCS) refer to the various forms of DNA that differ from the B-conformation described by Watson and Crick. It has been found that these structures are usual components of the genome, actively participating in its essential functions. The present review is focused on the nine kinds of NCS appearing or likely to appear in human ribosomal DNA (rDNA): supercoiling structures, R-loops, G-quadruplexes, i-motifs, DNA triplexes, cruciform structures, DNA bubbles, and A and Z DNA conformations. We discuss the conditions of their generation, including their sequence specificity, distribution within the locus, dynamics, and beneficial and detrimental role in the cell.
Affiliation(s)
- Evgeny Smirnov
  - Laboratory of Cell Biology, Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, 128 00, Prague, Czech Republic
- Pavla Molínová
  - Laboratory of Cell Biology, Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, 128 00, Prague, Czech Republic
- Nikola Chmúrčiaková
  - Laboratory of Cell Biology, Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, 128 00, Prague, Czech Republic
- Tomáš Vacík
  - Laboratory of Cell Biology, Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, 128 00, Prague, Czech Republic
- Dušan Cmarko
  - Laboratory of Cell Biology, Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, 128 00, Prague, Czech Republic
7
Jonas F, Carmi M, Krupkin B, Steinberger J, Brodsky S, Jana T, Barkai N. The molecular grammar of protein disorder guiding genome-binding locations. Nucleic Acids Res 2023; 51:4831-4844. [PMID: 36938874 PMCID: PMC10250222 DOI: 10.1093/nar/gkad184] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 09/15/2022] [Revised: 01/25/2023] [Accepted: 03/15/2023] [Indexed: 03/21/2023] Open
Abstract
Intrinsically disordered regions (IDRs) direct transcription factors (TFs) towards selected genomic occurrences of their binding motif, as exemplified by budding yeast's Msn2. However, the sequence basis of IDR-directed TF binding selectivity remains unknown. To reveal this sequence grammar, we analyze the genomic localizations of >100 designed IDR mutants, each carrying up to 122 mutations within this 567-AA region. Our data point to multivalent interactions, carried by hydrophobic (mostly aliphatic) residues dispersed within a disordered environment and independent of linear sequence motifs, as the key determinants of Msn2 genomic localization. The implications of our results for the mechanistic basis of IDR-based TF binding preferences are discussed.
Affiliation(s)
- Felix Jonas
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Miri Carmi
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Beniamin Krupkin
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Joseph Steinberger
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Sagie Brodsky
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Tamar Jana
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
8
Nakajima K, Yuno M. Elevated All-Cause Mortality among Overweight Older People: AI Predicts a High Normal Weight Is Optimal. Geriatrics (Basel) 2022; 7:68. [PMID: 35735773 PMCID: PMC9222635 DOI: 10.3390/geriatrics7030068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2022] [Revised: 06/12/2022] [Accepted: 06/14/2022] [Indexed: 11/19/2022] Open
Abstract
It has been proposed that being overweight may provide an advantage with respect to mortality in older people, although this has not been investigated fully. Therefore, to confirm this and elucidate the underlying mechanism, we investigated mortality in older people using explainable artificial intelligence (AI) with the gradient-boosting algorithm XGBoost. Baseline body mass indexes (BMIs) of 5699 people (79.3 ± 3.9 years) were evaluated to determine the relationship with all-cause mortality over eight years. In the unadjusted model, the first negative (protective) BMI range for mortality was 25.9–28.4 kg/m². However, in the adjusted cross-validation model, this range was 22.7–23.6 kg/m²; the second and third negative BMI ranges were then 25.8–28.2 and 24.6–25.8 kg/m², respectively. Conversely, the first advancing BMI range was 12.8–18.7 kg/m², which did not vary across conditions with high feature importance. Actual and predicted mortality rates in participants aged <90 years showed a negative-linear or L-shaped relationship with BMI, whereas predicted mortality rates in men aged ≥90 years showed a blunt U-shaped relationship. In conclusion, AI predicted that being overweight may not be an optimal condition with regard to all-cause mortality in older adults. Instead, a high normal weight may be optimal, though this may vary according to age and sex.
Affiliation(s)
- Kei Nakajima
  - School of Nutrition and Dietetics, Faculty of Health and Social Services, Kanagawa University of Human Services, 1-10-1 Heisei-cho, Yokosuka 238-8522, Japan
  - Department of Endocrinology and Diabetes, Saitama Medical Center, Saitama Medical University, 1981 Kamoda, Kawagoe 350-8550, Japan
  - Department of Food and Nutrition, Faculty of Human Sciences and Design, Japan Women’s University, 2-8-1 Mejiro-dai, Bunkyo-ku, Tokyo 112-8681, Japan
- Mariko Yuno
  - School of Nutrition and Dietetics, Faculty of Health and Social Services, Kanagawa University of Human Services, 1-10-1 Heisei-cho, Yokosuka 238-8522, Japan
9
Developing Community Resources for Nucleic Acid Structures. Life (Basel) 2022; 12:540. [PMID: 35455031 PMCID: PMC9031032 DOI: 10.3390/life12040540] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/14/2022] [Revised: 03/28/2022] [Accepted: 03/31/2022] [Indexed: 01/14/2023] Open
Abstract
In this review, we describe the creation of the Nucleic Acid Database (NDB) at Rutgers University and how it became a testbed for the current infrastructure of the RCSB Protein Data Bank. We describe some of the special features of the NDB and how it has been used to enable research. Plans for the next phase as the Nucleic Acid Knowledgebase (NAKB) are summarized.