1
Yang L, Wang C, Zhou P, Xie N, Tian M, Wang K. Change point detection in brucellosis time series from 2010 to 2023 in Xinjiang China using the BEAST algorithm. Sci Rep 2025; 15:3830. [PMID: 39885345 PMCID: PMC11782483 DOI: 10.1038/s41598-025-88508-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 01/28/2025] [Indexed: 02/01/2025]
Abstract
Brucellosis is a significant global challenge, but epidemiological studies of brucellosis in Xinjiang from a change point perspective have been lacking. This study aims to bridge this gap by decomposing the time series and identifying significant change points, using datasets from the Xinjiang Disease Prevention and Control Information System. The BEAST algorithm was used to decompose the brucellosis time series in Xinjiang from 2010 to 2023 while simultaneously identifying change points in the decomposed seasonal and trend components. The probability of four change points occurring within the seasonal component is 0.8950. These four change points, each with its associated probability, are located at August 2013 ([Formula: see text]), August 2017 ([Formula: see text]), February 2022 ([Formula: see text]), and May 2023 ([Formula: see text]). For the trend component, five change points is the most probable number ([Formula: see text]). These five change points, with the probability of change at each, are located at March 2013 ([Formula: see text]), August 2015 ([Formula: see text]), July 2017 ([Formula: see text]), February 2020 ([Formula: see text]), and May 2023 ([Formula: see text]). Change point analysis has significant utility in epidemiology. These findings provide important insights for epidemiological investigations and for the development of early warning systems for brucellosis.
Affiliation(s)
- Liping Yang
  - College of Public Health, Xinjiang Medical University, Urumqi, 830017, China
- Chunxia Wang
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, China
- Pan Zhou
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, China
- Na Xie
  - Department of Immunization Programme, Xinjiang Center for Disease Control and Prevention, Urumqi, 830054, China
- Maozai Tian
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, China
- Kai Wang
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, China
  - Institute of Medical Engineering Interdisciplinary Research, Xinjiang Medical University, Urumqi, 830017, China
2
Hazra A, Bose S. Estimating changepoints in extremal dependence, applied to aviation stock prices during COVID-19 pandemic. J Appl Stat 2024; 52:525-554. [PMID: 39950015 PMCID: PMC11816642 DOI: 10.1080/02664763.2024.2373939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 06/13/2024] [Indexed: 02/16/2025]
Abstract
The dependence in the tails of the joint distribution of two random variables is generally assessed using χ-measure, the limiting conditional probability of one variable being extremely high given the other variable is also extremely high. This work is motivated by the structural changes in χ-measure between the daily rate of return (RoR) of the two Indian airlines, IndiGo and SpiceJet, during the COVID-19 pandemic. We model the daily maximum and minimum RoR vectors (potentially transformed) using the bivariate Hüsler-Reiss (BHR) distribution. To estimate the changepoint in the χ-measure of the BHR distribution, we explore two changepoint detection procedures based on the Likelihood Ratio Test (LRT) and Modified Information Criterion (MIC). We obtain critical values and power curves of the LRT and MIC test statistics for low through high values of χ-measure. We also explore the consistency of the estimators of the changepoint based on LRT and MIC numerically. In our data application, for RoR maxima and minima, the most prominent changepoints detected by LRT and MIC are close to the announcement of the first phases of lockdown and unlock, respectively, which are realistic; thus, our study would be beneficial for portfolio optimization in the case of future pandemic situations.
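As a rough illustration of the χ-measure described in this abstract, the sketch below estimates the finite-threshold conditional probability χ(u) = P(F_Y(Y) > u | F_X(X) > u) from paired observations, using rank-based pseudo-uniform margins. The function name `empirical_chi`, the threshold u = 0.9, and the toy data are assumptions made for this example. It is not the bivariate Hüsler-Reiss fit or the LRT/MIC changepoint procedure studied in the paper.

```python
import numpy as np

def empirical_chi(x, y, u=0.9):
    """Empirical chi(u): P(F_Y(Y) > u | F_X(X) > u) using rank-based margins."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Rank transform to a pseudo-uniform scale (ranks 1..n divided by n + 1).
    # argsort of argsort yields 0-based ranks; ties are broken by position.
    ux = (np.argsort(np.argsort(x)) + 1) / (n + 1)
    uy = (np.argsort(np.argsort(y)) + 1) / (n + 1)
    exceed_x = ux > u
    if not exceed_x.any():
        raise ValueError("no observations exceed the threshold u")
    # Fraction of x-exceedances that are also y-exceedances.
    return float(np.mean(uy[exceed_x] > u))

# Toy data (hypothetical): y is an increasing function of x, so both
# variables always exceed a high quantile together.
x = np.arange(100.0)
chi_dependent = empirical_chi(x, 2.0 * x + 1.0, u=0.9)

# Reversing y makes large x coincide with small y.
chi_opposed = empirical_chi(x, -x, u=0.9)
```

On real return series one would typically evaluate χ(u) over a range of high thresholds, since the χ-measure itself is the limit as u approaches 1.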
Affiliation(s)
- Arnab Hazra
  - Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur, India
- Shiladitya Bose
  - Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur, India
3
Lindborg SR, Goyal NA, Katz J, Burford M, Li J, Kaspi H, Abramov N, Boulanger B, Berry JD, Nicholson K, Mozaffar T, Miller R, Jenkins L, Baloh RH, Lewis R, Staff NP, Owegi MA, Dagher B, Blondheim-Shraga NR, Gothelf Y, Levy YS, Kern R, Aricha R, Windebank AJ, Bowser R, Brown RH, Cudkowicz ME. Debamestrocel multimodal effects on biomarker pathways in amyotrophic lateral sclerosis are linked to clinical outcomes. Muscle Nerve 2024; 69:719-729. [PMID: 38593477 DOI: 10.1002/mus.28093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 03/16/2024] [Accepted: 03/19/2024] [Indexed: 04/11/2024]
Abstract
INTRODUCTION/AIMS Biomarkers have shown promise in amyotrophic lateral sclerosis (ALS) research, but the quest for reliable biomarkers remains active. This study evaluates the effect of debamestrocel on cerebrospinal fluid (CSF) biomarkers, an exploratory endpoint. METHODS A total of 196 participants randomly received debamestrocel or placebo. Seven CSF samples were to be collected from all participants. Forty-five biomarkers were analyzed in the overall study and by two subgroups characterized by the ALS Functional Rating Scale-Revised (ALSFRS-R). A prespecified model was employed to predict clinical outcomes leveraging biomarkers and disease characteristics. Causal inference was used to analyze relationships between neurofilament light chain (NfL) and ALSFRS-R. RESULTS We observed significant changes with debamestrocel in 64% of the biomarkers studied, spanning pathways implicated in ALS pathology (63% neuroinflammation, 50% neurodegeneration, and 89% neuroprotection). Biomarker changes with debamestrocel show biological activity in trial participants, including those with advanced ALS. CSF biomarkers were predictive of clinical outcomes in debamestrocel-treated participants (baseline NfL, baseline latency-associated peptide/transforming growth factor beta1 [LAP/TGFβ1], change galectin-1, all p < .01), with baseline NfL and LAP/TGFβ1 remaining (p < .05) when disease characteristics (p < .005) were incorporated. Change from baseline to the last measurement showed debamestrocel-driven reductions in NfL were associated with less decline in ALSFRS-R. Debamestrocel significantly reduced NfL from baseline compared with placebo (11% vs. 1.6%, p = .037). DISCUSSION Following debamestrocel treatment, many biomarkers showed increases (anti-inflammatory/neuroprotective) or decreases (inflammatory/neurodegenerative) suggesting a possible treatment effect. 
Neuroinflammatory and neuroprotective biomarkers were predictive of clinical response, suggesting a potential multimodal mechanism of action. These results offer preliminary insights that need to be confirmed.
Affiliation(s)
- Namita A Goyal
  - UCI Health ALS & Neuromuscular Center, University of California, Irvine, California, USA
- Jonathan Katz
  - Sutter Pacific Medical Foundation, California Pacific Medical Center, San Francisco, California, USA
- Matthew Burford
  - Department of Neurology, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Jenny Li
  - Brainstorm Cell Therapeutics, Boston, Massachusetts, USA
- Bruno Boulanger
  - Department of Statistics and Data Science, PharmaLex, Mont-Saint-Guibert, Belgium
- James D Berry
  - Healey & AMG Center, Mass General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Katharine Nicholson
  - Healey & AMG Center, Mass General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Tahseen Mozaffar
  - UCI Health ALS & Neuromuscular Center, University of California, Irvine, California, USA
- Robert Miller
  - Sutter Pacific Medical Foundation, California Pacific Medical Center, San Francisco, California, USA
- Liberty Jenkins
  - Sutter Pacific Medical Foundation, California Pacific Medical Center, San Francisco, California, USA
- Robert H Baloh
  - Department of Neurology, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Richard Lewis
  - Department of Neurology, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Nathan P Staff
  - Department of Neurology, Mayo Clinic College of Medicine, Rochester, Minnesota, USA
- Margaret Ayo Owegi
  - Neurology Department, University of Massachusetts Medical School, Worcester, Massachusetts, USA
- Bob Dagher
  - Brainstorm Cell Therapeutics, Boston, Massachusetts, USA
- Yossef S Levy
  - Manufacturing, Brainstorm Cell Therapeutics, Tel Aviv, Israel
- Ralph Kern
  - Brainstorm Cell Therapeutics, Boston, Massachusetts, USA
- Anthony J Windebank
  - Department of Neurology, Mayo Clinic College of Medicine, Rochester, Minnesota, USA
- Robert Bowser
  - Department of Neurology, Barrow Neurological Institute, Phoenix, Arizona, USA
- Robert H Brown
  - Neurology Department, University of Massachusetts Medical School, Worcester, Massachusetts, USA
- Merit E Cudkowicz
  - Healey & AMG Center, Mass General Hospital, Harvard Medical School, Boston, Massachusetts, USA
4
Yang L, Xie N, Yao Y, Wang C, Tian M, Wang K. Hepatitis B time series in Xinjiang, China (2006-2021): change point detection based on the Mann-Kendall-Sneyers test. Math Biosci Eng 2024; 21:2458-2469. [PMID: 38454691 DOI: 10.3934/mbe.2024108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
Hepatitis B is a major global challenge, but there is a lack of epidemiological research on hepatitis B incidence from a change point perspective. This study aimed to fill this gap by identifying significant change points and trends in hepatitis time series in Xinjiang, China. The datasets were obtained from the Xinjiang Information System for Disease Control and Prevention. The Mann-Kendall-Sneyers (MKS) test was used to detect change points and trend changes in the hepatitis B time series of 14 regions in Xinjiang, and the effectiveness of this method was validated by comparing it with the binary segmentation (BS) and segmented regression (SR) methods. Based on the results of the change point analysis, the prevention and control policies and measures for hepatitis in Xinjiang were discussed. By the MKS test, 8 of the 14 regions (57.1%) had at least one change point within the 95% confidence interval (CI): one change point was identified in five regions (Turpan (TP), Hami (HM), Bayingolin (BG), Kyzylsu Kirgiz (KK), Altai (AT)), two change points in two regions (Aksu (AK), Hotan (HT)), and three change points in one region (Bortala (BT)). Most of the change points occurred at both ends of the sequence. In the first half of the sequence, more change points indicated an upward trend, while in the latter half, many change points indicated a pronounced downward trend. Finally, in comparing the results of the three change point tests, the MKS test showed 61.5% agreement (8/13) with BS and SR.
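To make the MKS test mentioned in this abstract more concrete, the sketch below computes the forward sequential Mann-Kendall statistic UF(k), which is one ingredient of the test. It assumes a common textbook formulation (running count of earlier smaller values, standardized by mean k(k-1)/4 and variance k(k-1)(2k+5)/72) and may differ from the paper's exact implementation. The backward series UB(k) and the search for crossings of UF and UB inside the confidence bounds are not shown. The function name `sequential_mk_forward` and the toy series are hypothetical.

```python
import math

def sequential_mk_forward(x):
    """Forward sequential Mann-Kendall statistic UF(k) for k = 1..n."""
    n = len(x)
    uf = [0.0] * n  # UF(1) is set to 0 by convention (variance is zero)
    t = 0  # running count of pairs (j < i) with x[j] < x[i]
    for i in range(1, n):
        t += sum(1 for j in range(i) if x[j] < x[i])
        k = i + 1  # number of observations used so far
        mean = k * (k - 1) / 4.0
        var = k * (k - 1) * (2 * k + 5) / 72.0
        uf[i] = (t - mean) / math.sqrt(var)
    return uf

# Hypothetical toy series: strictly increasing, so UF grows steadily.
series = list(range(1, 21))
uf = sequential_mk_forward(series)
```

Values of UF beyond roughly ±1.96 would usually be read as a significant trend at the 5% level under this formulation.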
Affiliation(s)
- Liping Yang
  - College of Public Health, Xinjiang Medical University, Urumqi 830017, China
- Na Xie
  - Department of Immunization Programme, Xinjiang Center for Disease Control and Prevention, Urumqi 830054, China
- Yanru Yao
  - College of Science, Shihezi University, Shihezi 832000, China
- Chunxia Wang
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, China
- Maozai Tian
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, China
- Kai Wang
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, China
5
Gao Z, Xiao X, Fang YP, Rao J, Mo H. A Selective Review on Information Criteria in Multiple Change Point Detection. Entropy (Basel) 2024; 26:50. [PMID: 38248176 PMCID: PMC10813938 DOI: 10.3390/e26010050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 01/23/2024]
Abstract
Change points indicate significant shifts in the statistical properties of data streams at some time points. Detecting change points efficiently and effectively is essential for understanding the underlying data-generating mechanism in modern data streams with versatile parameter-varying patterns. However, locating multiple change points in noisy data is a highly challenging problem. Although the Bayesian information criterion has been proven to be an effective way of selecting multiple change points in an asymptotic sense, its finite sample performance could be deficient. In this article, we review a list of information criterion-based methods for multiple change point detection, including the Akaike information criterion, Bayesian information criterion, minimum description length, and their variants, with an emphasis on their practical applications. Simulation studies are conducted to investigate, for practitioners, the actual performance of different information criteria in detecting multiple change points under possible model mis-specification. A case study on the SCADA signals of wind turbines is conducted to demonstrate the actual change point detection power of different information criteria. Finally, some key challenges in the development and application of multiple change point detection are presented for future research work.
Affiliation(s)
- Zhanzhongyu Gao
  - School of Systems and Computing, University of New South Wales, Canberra, ACT 2612, Australia
- Xun Xiao
  - Department of Mathematics and Statistics, University of Otago, Dunedin 9016, New Zealand
- Yi-Ping Fang
  - Chair Risk and Resilience of Complex Systems, Laboratoire Génie Industriel, CentraleSupélec, Université Paris-Saclay, 91190 Bures-sur-Yvette, France
- Jing Rao
  - Key Laboratory of Precision Opto-Mechatronics Technology, School of Instrumentation and Opto-Electronic Engineering, Beihang University, Beijing 100191, China
- Huadong Mo
  - School of Systems and Computing, University of New South Wales, Canberra, ACT 2612, Australia
6
Yang L, Xie N, Yao Y, Wang C, RiFhat R, Tian M, Wang K. Multiple change point analysis of hepatitis B reports in Xinjiang, China from 2006 to 2021. Front Public Health 2023; 11:1223176. [PMID: 38035295 PMCID: PMC10682783 DOI: 10.3389/fpubh.2023.1223176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 10/23/2023] [Indexed: 12/02/2023]
Abstract
Objective Hepatitis B (HB) is a major global challenge, but there has been a lack of epidemiological studies on HB incidence in Xinjiang from a change-point perspective. This study aims to bridge this gap by identifying significant change points and trends. Method The datasets were obtained from the Xinjiang Information System for Disease Control and Prevention. Change points were identified using binary segmentation for full datasets and a segmented regression model for five age groups. Results The results showed four change points for the quarterly HB time series, with the period between the first change point (March 2007) and the second change point (March 2010) having the highest mean number of HB reports. In the subsequent segments, there was a clear downward trend in reported cases. The segmented regression model showed different numbers of change points for each age group, with the 30-50, 51-80, and 15-29 age groups having higher growth rates. Conclusion Change point analysis has valuable applications in epidemiology. These findings provide important information for future epidemiological studies and early warning systems for HB.
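The binary segmentation idea named in this abstract can be sketched as follows: find the single split that most reduces a segment cost, accept it if the reduction exceeds a penalty, then recurse on both halves. The abstract does not state the cost function or stopping rule, so the squared-error cost around the segment mean, the `penalty` threshold, the `min_size` parameter, and the function names below are all assumptions for illustration.

```python
import numpy as np

def _sse(seg):
    """Sum of squared deviations from the segment mean."""
    return float(np.sum((seg - seg.mean()) ** 2))

def binary_segmentation(x, penalty, min_size=2, offset=0):
    """Recursively split x at mean shifts; return sorted change point indices."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n < 2 * min_size:
        return []
    total = _sse(x)
    best_k, best_gain = None, 0.0
    for k in range(min_size, n - min_size + 1):
        gain = total - _sse(x[:k]) - _sse(x[k:])
        if gain > best_gain:
            best_k, best_gain = k, gain
    # Stop when the best split does not reduce the cost by more than the penalty.
    if best_k is None or best_gain <= penalty:
        return []
    left = binary_segmentation(x[:best_k], penalty, min_size, offset)
    right = binary_segmentation(x[best_k:], penalty, min_size, offset + best_k)
    return left + [offset + best_k] + right

# Hypothetical noise-free signal with mean shifts at indices 20 and 40.
signal = np.concatenate([np.zeros(20), np.full(20, 5.0), np.full(20, 1.0)])
change_points = binary_segmentation(signal, penalty=1.0)
```

Each returned index is the first observation of a new segment. On noisy data the penalty would need to scale with the noise variance, which is where information criteria such as BIC are commonly used.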
Affiliation(s)
- Liping Yang
  - College of Public Health, Xinjiang Medical University, Ürümqi, China
  - College of Medical Engineering and Technology, Xinjiang Medical University, Ürümqi, China
- Na Xie
  - Department of Immunization Programme, Xinjiang Center for Disease Control and Prevention, Ürümqi, China
- Yanru Yao
  - College of Science, Shihezi University, Shihezi, China
- Chunxia Wang
  - College of Medical Engineering and Technology, Xinjiang Medical University, Ürümqi, China
- Ramziya RiFhat
  - College of Medical Engineering and Technology, Xinjiang Medical University, Ürümqi, China
- Maozai Tian
  - College of Medical Engineering and Technology, Xinjiang Medical University, Ürümqi, China
- Kai Wang
  - College of Medical Engineering and Technology, Xinjiang Medical University, Ürümqi, China
7
Liu J, Bellows B, Hu XJ, Wu J, Zhou Z, Soteros C, Wang L. A new time-varying coefficient regression approach for analyzing infectious disease data. Sci Rep 2023; 13:14687. [PMID: 37673956 PMCID: PMC10482960 DOI: 10.1038/s41598-023-41551-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 08/28/2023] [Indexed: 09/08/2023]
Abstract
Since the beginning of the global pandemic of coronavirus (SARS-CoV-2), there have been many studies devoted to predicting COVID-19-related deaths and hospitalizations. The aim of our work is to (1) explore the lagged dependence between the time series of case counts and the time series of death counts; and (2) utilize such a relationship for prediction. The proposed approach can also be applied to other infectious diseases or wherever dynamics in lagged dependence are of primary interest. Unlike previous studies, we focus on time-varying coefficient models to account for the evolution of the coronavirus. Using two different types of time-varying coefficient models, local polynomial regression models and piecewise linear regression models, we analyze the province-level data in Canada as well as country-level data using cumulative counts. We use out-of-sample prediction to evaluate the model performance. Based on our data analyses, both time-varying coefficient modeling strategies work well. Local polynomial regression models generally work better than piecewise linear regression models, especially when the pattern of the relationship between the two time series of counts gets more complicated (e.g., more segments are needed to portray the pattern). Our proposed methods can be easily and quickly implemented via existing R packages.
Affiliation(s)
- Juxin Liu
  - Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, S7N 5E6, Canada
- Brandon Bellows
  - Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, S7N 5E6, Canada
- X Joan Hu
  - Department of Statistics and Actuarial Science, Simon Fraser University, Vancouver, V5A 1S6, Canada
- Jianhong Wu
  - Department of Mathematics and Statistics, York University, Toronto, M3J 1P3, Canada
- Zhou Zhou
  - Department of Statistical Sciences, University of Toronto, Toronto, M5G 1X6, Canada
- Chris Soteros
  - Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, S7N 5E6, Canada
- Lin Wang
  - Department of Mathematics and Statistics, University of New Brunswick, Fredericton, E3B 5A3, Canada
8
D’Angelo N, Adelfio G, Chiodi M, D’Alessandro A. Statistical Picking of Multivariate Waveforms. Sensors (Basel) 2022; 22:9636. [PMID: 36560007 PMCID: PMC9788455 DOI: 10.3390/s22249636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 11/14/2022] [Accepted: 12/06/2022] [Indexed: 06/17/2023]
Abstract
In this paper, we propose a new approach based on the fitting of a generalized linear regression model in order to detect points of change in the variance of a multivariate-covariance Gaussian variable, where the variance function is piecewise constant. By applying this new approach to multivariate waveforms, our method provides simultaneous detection of change points in functional time series. The proposed approach can be used as a new picking algorithm in order to automatically identify the arrival times of P- and S-waves in different seismograms that are recording the same seismic event. A seismogram is a record of ground motion at a measuring station as a function of time, and it typically records motions along three orthogonal axes (X, Y, and Z), with the Z-axis being perpendicular to the Earth's surface and the X- and Y-axes being parallel to the surface and generally oriented in North-South and East-West directions, respectively. The proposed method was tested on a dataset of simulated waveforms in order to capture changes in the performance according to the waveform characteristics. In an application to real seismic data, our results demonstrated the ability of the multivariate algorithm to pick the arrival times in quite noisy waveforms coming from seismic events with low magnitudes.
Affiliation(s)
- Nicoletta D’Angelo
  - Dipartimento di Scienze Economiche, Aziendali e Statistiche, Università degli Studi di Palermo, 90128 Palermo, Italy
- Giada Adelfio
  - Dipartimento di Scienze Economiche, Aziendali e Statistiche, Università degli Studi di Palermo, 90128 Palermo, Italy
  - Osservatorio Nazionale Terremoti, Istituto Nazionale di Geofisica e Vulcanologia (INGV), 90146 Palermo, Italy
- Marcello Chiodi
  - Dipartimento di Scienze Economiche, Aziendali e Statistiche, Università degli Studi di Palermo, 90128 Palermo, Italy
  - Osservatorio Nazionale Terremoti, Istituto Nazionale di Geofisica e Vulcanologia (INGV), 90146 Palermo, Italy
- Antonino D’Alessandro
  - Osservatorio Nazionale Terremoti, Istituto Nazionale di Geofisica e Vulcanologia (INGV), 90146 Palermo, Italy
9
Hu J, Wang L. A weighted U-statistic based change point test for multivariate time series. Stat Pap (Berl) 2022. [DOI: 10.1007/s00362-022-01341-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
10
S T PK, Lahiri B, Alvarado R. Multiple change point estimation of trends in Covid-19 infections and deaths in India as compared with WHO regions. Spat Stat 2022; 49:100538. [PMID: 34493970 PMCID: PMC8413104 DOI: 10.1016/j.spasta.2021.100538] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Revised: 08/25/2021] [Accepted: 08/26/2021] [Indexed: 05/08/2023]
Abstract
The present study aims to estimate multiple change points (MCP) in the time series of COVID-19 confirmed cases and deaths, and the trends within the estimated change points, in India as compared with WHO regions. The data were described using descriptive statistical measures, and the E-divisive procedure was employed to estimate the change points. Further, the trend within the estimated change points was tested using Sen's slope and the Mann-Kendall test. India, along with the African Region, American Region, and South-East Asia Region, experienced a significant surge in fresh cases up to the 5th change point. Among the WHO regions, the American Region was the worst hit by the pandemic in terms of fresh cases and deaths. The European Region experienced an early negative trend in fresh cases during the 3rd and 4th change points, but the situation later reversed by the 5th (7th July 2020) and 6th (6th August 2020) change points. The trend of deaths in India and the South-East Asia Region was similar, and global deaths had a negative trend from the 4th (17th May 2020) change point onwards. The change points were estimated with a prefixed significance level α < 0.002. Infections and deaths were positively significant for India and the SEARO region across change points. Infection was significant at every 30-day interval across other WHO regions, and any delay in the infections was due to the interventions. The European Region is expected to have a second wave of positive infections during the 5th and 6th change points, though the first two change points were negatively significant. The study highlights the efficacy of change point analysis in understanding the dynamics of COVID-19 cases in India and across the world. It further helps to develop effective public health strategies.
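Sen's slope, used in this abstract to test the trend within each segment, is the median of all pairwise slopes between observations. The sketch below is a minimal version under that standard definition. The function name `sens_slope` and the toy data are hypothetical, and it does not reproduce the E-divisive change point step or the Mann-Kendall significance test.

```python
import numpy as np

def sens_slope(y, t=None):
    """Sen's slope: median of all pairwise slopes (y[j] - y[i]) / (t[j] - t[i])."""
    y = np.asarray(y, dtype=float)
    # Default to equally spaced time stamps 0, 1, 2, ...
    t = np.arange(len(y), dtype=float) if t is None else np.asarray(t, dtype=float)
    i, j = np.triu_indices(len(y), k=1)  # all index pairs with i < j
    dt = t[j] - t[i]
    keep = dt != 0  # skip pairs with identical time stamps
    return float(np.median((y[j] - y[i])[keep] / dt[keep]))

# Hypothetical daily counts rising by exactly 3 per day.
days = np.arange(10.0)
slope_linear = sens_slope(3.0 * days + 2.0)

# A single outlier leaves the median of the pairwise slopes unchanged here.
y_outlier = 3.0 * days + 2.0
y_outlier[4] = 100.0
slope_outlier = sens_slope(y_outlier)
```

The second call shows why the estimator is popular for case-count series: one aberrant report does not move the slope estimate in this example.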
Affiliation(s)
- Pavan Kumar S T
  - College of Community Science, Central Agricultural University, Tura, Meghalaya 794005, India
- Biswajit Lahiri
  - College of Fisheries, Central Agricultural University, Lembucherra, Tripura, India
- Rafael Alvarado
  - Carrera de Economía and Centro de Investigaciones Sociales y Económicas, Universidad Nacional de Loja, Loja 110150, Ecuador
11
Jia S, Shi L. Efficient change-points detection for genomic sequences via cumulative segmented regression. Bioinformatics 2022; 38:311-317. [PMID: 34601562 DOI: 10.1093/bioinformatics/btab685] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2020] [Revised: 07/08/2021] [Accepted: 09/28/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION Knowing the number and the exact locations of multiple change points in genomic sequences serves several biological needs. The cumulative-segmented algorithm (cumSeg) has been recently proposed as a computationally efficient approach for multiple change-points detection, which is based on a simple transformation of data and provides results quite robust to model mis-specifications. However, the errors are also accumulated in the transformed model so that heteroscedasticity and serial correlation will show up, and thus the variations of the estimated change points will be quite different, while the locations of the change points should be of the same importance in the original genomic sequences. RESULTS In this study, we develop two new change-points detection procedures in the framework of cumulative segmented regression. Simulations reveal that the proposed methods not only improve the efficiency of each change point estimator substantially but also provide the estimators with similar variations for all the change points. By applying these proposed algorithms to Coriel and SNP genotyping data, we illustrate their performance on detecting copy number variations. AVAILABILITY AND IMPLEMENTATION The proposed algorithms are implemented in R program and the codes are provided in the online supplementary material. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shengji Jia
  - School of Statistics and Mathematics; Interdisciplinary Research Institute of Data Science, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
- Lei Shi
  - Statistics and Mathematics School, Yunnan University of Finance and Economics, Kunming 650221, China
12
Li D, Wang L, Zhao W. Estimation and inference for multikink expectile regression with longitudinal data. Stat Med 2021; 41:1296-1313. [PMID: 34883531 DOI: 10.1002/sim.9277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2021] [Revised: 11/14/2021] [Accepted: 11/16/2021] [Indexed: 11/07/2022]
Abstract
In this article, we investigate parameter estimation, kink points testing and statistical inference for a longitudinal multikink expectile regression model. The estimators for the kink locations and regression coefficients are obtained by using a bootstrap restarting iterative algorithm to avoid local minima. A backward selection procedure based on a modified BIC is applied to estimate the number of kink points. We theoretically demonstrate the number selection consistency of kink points and the asymptotic normality of all estimators. In particular, the estimators of kink locations are shown to achieve root-n consistency. A weighted cumulative sum type statistic is proposed to test the existence of kink effects at a given expectile, and its limiting distributions are derived under both the null and the local alternative hypotheses. The traditional Wald-type and cluster bootstrap confidence intervals for kink locations are also constructed. Simulation studies show that the proposed estimators and test have desirable finite sample performance in both homoscedastic and heteroscedastic errors. Two applications to the Nation Growth, Lung and Health Study and Capital Bike sharing dataset in Washington D.C. are also presented. The R codes for simulation studies and the real data are available at https://github.com/wangleink/MKER.
Affiliation(s)
- Dongyu Li
  - School of Statistics and Data Science & LPMC, Nankai University, Tianjin, China
- Lei Wang
  - School of Statistics and Data Science & LPMC, Nankai University, Tianjin, China
- Weihua Zhao
  - School of Science, Nantong University, Nantong, China
13
Gierz K, Park K. Detection of multiple change points in a Weibull accelerated failure time model using sequential testing. Biom J 2021; 64:617-634. [PMID: 34873728 DOI: 10.1002/bimj.202000262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2020] [Revised: 06/21/2021] [Accepted: 07/19/2021] [Indexed: 11/07/2022]
Abstract
With improvements to cancer diagnoses and treatments, incidences and mortality rates have changed. However, the most commonly used analysis methods do not account for such distributional changes. In survival analysis, change point problems can concern a shift in a distribution for a set of time-ordered observations, potentially under censoring or truncation. We propose a sequential testing approach for detecting multiple change points in the Weibull accelerated failure time model, since this is sufficiently flexible to accommodate increasing, decreasing, or constant hazard rates and is also the only continuous distribution for which the accelerated failure time model can be reparameterized as a proportional hazards model. Our sequential testing procedure does not require the number of change points to be known; this information is instead inferred from the data. We conduct a simulation study to show that the method accurately detects change points and estimates the model. The numerical results along with real data applications demonstrate that our proposed method can detect change points in the hazard rate.
Affiliation(s)
- Kayoung Park
- Department of Mathematics and Statistics, Old Dominion University, Norfolk, VA, USA
|
14
|
Gierz K, Park K, Qiu P. Non-parametric treatment time-lag effect estimation. Stat Methods Med Res 2021; 31:62-75. [PMID: 34784808 DOI: 10.1177/09622802211032693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
In general, the change point problem considers inference of a change in distribution for a set of time-ordered observations. This has applications in a large variety of fields, and can also apply to survival data. In survival analysis, most existing methods compare two treatment groups for the entirety of the study period. Some treatments may take a length of time to show effects in subjects. This has been called the time-lag effect in the literature, and in cases where time-lag effect is considerable, such methods may not be appropriate to detect significant differences between two groups. In this paper, we propose a novel non-parametric approach for estimating the point of treatment time-lag effect by using an empirical divergence measure. Theoretical properties of the estimator are studied. The results from the simulated data and the applications to real data examples support our proposed method.
Affiliation(s)
- Kristine Gierz
- Head Quarters Air Force Studies, Analysis, and Assessments, The Pentagon, Washington, D.C., USA
- Kayoung Park
- Department of Mathematics and Statistics, Old Dominion University, Norfolk, VA, USA
- Peihua Qiu
- Department of Biostatistics, University of Florida, Gainesville, FL, USA
|
15
|
Jin H, Yin G, Yuan B, Jiang F. Bayesian Hierarchical Model for Change Point Detection in Multivariate Sequences. Technometrics 2021. [DOI: 10.1080/00401706.2021.1927848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Huaqing Jin
- Department of Statistics and Actuarial Science, the University of Hong Kong, Hong Kong, Hong Kong
- Guosheng Yin
- Department of Statistics and Actuarial Science, the University of Hong Kong, Hong Kong, Hong Kong
- Binhang Yuan
- Computer Science Department, Rice University, Houston, TX
- Fei Jiang
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
|
16
|
Anastasiou A, Fryzlewicz P. Detecting multiple generalized change-points by isolating single ones. METRIKA 2021; 85:141-174. [PMID: 34054146 PMCID: PMC8142888 DOI: 10.1007/s00184-021-00821-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/02/2020] [Accepted: 04/28/2021] [Indexed: 11/12/2022]
Abstract
We introduce a new approach, called Isolate-Detect (ID), for the consistent estimation of the number and location of multiple generalized change-points in noisy data sequences. Examples of signal changes that ID can deal with are changes in the mean of a piecewise-constant signal and changes, continuous or not, in the linear trend. The number of change-points can increase with the sample size. Our method is based on an isolation technique, which prevents the consideration of intervals that contain more than one change-point. This isolation enhances ID's accuracy as it allows for detection in the presence of frequent changes of possibly small magnitudes. In ID, model selection is carried out via thresholding, or an information criterion, or SDLL, or a hybrid involving the former two. The hybrid model selection leads to a general method with very good practical performance and minimal parameter choice. In the scenarios tested, ID is at least as accurate as the state-of-the-art methods; most of the time it outperforms them. ID is implemented in the R packages IDetect and breakfast, available from CRAN. Supplementary information: The online version contains supplementary material available at 10.1007/s00184-021-00821-6.
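For the simplest case the abstract names, a change in the mean of a piecewise-constant signal, a common building block is the CUSUM contrast evaluated at every candidate split. The sketch below illustrates only that building block for a single change-point; it is not the Isolate-Detect procedure, and the function name `cusum_change_in_mean` is hypothetical.

```python
import numpy as np


def cusum_change_in_mean(x):
    """Locate a single change in mean with the CUSUM contrast.

    Returns (b_hat, stats), where stats[b - 1] is the absolute CUSUM
    value for a split that puts the first b points in the left segment,
    and b_hat is the split size that maximises it.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")

    b = np.arange(1, n)                      # left-segment sizes 1..n-1
    csum = np.cumsum(x)
    left_mean = csum[:-1] / b                # mean of x[:b]
    right_mean = (csum[-1] - csum[:-1]) / (n - b)  # mean of x[b:]

    # Scaled difference of segment means; large values suggest a change.
    stats = np.sqrt(b * (n - b) / n) * np.abs(left_mean - right_mean)
    b_hat = int(np.argmax(stats)) + 1
    return b_hat, stats


if __name__ == "__main__":
    # Noise-free step signal: 50 zeros followed by 50 threes.
    signal = [0.0] * 50 + [3.0] * 50
    b_hat, stats = cusum_change_in_mean(signal)
    print(b_hat)  # 50 for this noise-free signal
```

With noisy data the maximiser only estimates the change location, and a threshold or information criterion is still needed to decide whether a change is present at all.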
Affiliation(s)
- Andreas Anastasiou
- Department of Mathematics and Statistics, University of Cyprus, P.O. Box 20537, 1678 Nicosia, Cyprus
- Piotr Fryzlewicz
- Department of Statistics, The London School of Economics and Political Science, Columbia House, Houghton Street, London, WC2A 2AE UK
|
17
|
Charlesworth D, Zhang Y, Bergero R, Graham C, Gardner J, Yong L. Using GC Content to Compare Recombination Patterns on the Sex Chromosomes and Autosomes of the Guppy, Poecilia reticulata, and Its Close Outgroup Species. Mol Biol Evol 2021; 37:3550-3562. [PMID: 32697821 DOI: 10.1093/molbev/msaa187] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/26/2022] Open
Abstract
Genetic and physical mapping of the guppy (Poecilia reticulata) have shown that recombination patterns differ greatly between males and females. Crossover events occur evenly across the chromosomes in females, but in male meiosis they are restricted to the tip furthest from the centromere of each chromosome, creating very high recombination rates per megabase, as in pseudoautosomal regions of mammalian sex chromosomes. We used GC content to indirectly infer recombination patterns on guppy chromosomes, based on evidence that recombination is associated with GC-biased gene conversion, so that genome regions with high recombination rates should be detectable by high GC content. We used intron sequences and third positions of codons to make comparisons between sequences that are matched, as far as possible, and are all probably under weak selection. Almost all guppy chromosomes, including the sex chromosome (LG12), have very high GC values near their assembly ends, suggesting high recombination rates due to strong crossover localization in male meiosis. Our test does not suggest that the guppy XY pair has stronger crossover localization than the autosomes, or than the homologous chromosome in the close relative, the platyfish (Xiphophorus maculatus). We therefore conclude that the guppy XY pair has not recently undergone an evolutionary change to a different recombination pattern, or reduced its crossover rate, but that the guppy evolved Y-linkage due to acquiring a male-determining factor that also conferred the male crossover pattern. We also identify the centromere ends of guppy chromosomes, which were not determined in the genome assembly.
Affiliation(s)
- Deborah Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Yexin Zhang
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Roberta Bergero
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Chay Graham
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Jim Gardner
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Lengxob Yong
- Centre for Ecology and Conservation, University of Exeter, Falmouth, Cornwall, United Kingdom
|
18
|
Classification of abrupt changes along viewing profiles of scientific articles. J Informetr 2021. [DOI: 10.1016/j.joi.2021.101158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
19
|
Li Y, Hu Z, Liu J, Deng J. A note on regression kink model. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2021.1890780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Yi Li
- College of Finance and Statistics, Hunan University, Changsha, China
- Zongyi Hu
- College of Finance and Statistics, Hunan University, Changsha, China
- Jiaqi Liu
- College of Finance and Statistics, Hunan University, Changsha, China
- Jingjing Deng
- School of Business, Hunan International Economics University, Changsha, China
|
20
|
Bayesian Multiple Change-Points Detection in a Normal Model with Heterogeneous Variances. Comput Stat 2021. [DOI: 10.1007/s00180-020-01054-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
21
|
Belcaid A, Douimi M. A Novel Online Change Point Detection Using an Approximate Random Blanket and the Line Process Energy. INT J ARTIF INTELL T 2020. [DOI: 10.1142/s0218213020500189] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/21/2023]
Abstract
In this paper, we focus on the problem of change point detection in piecewise constant signals. This problem is central to several applications such as human activity analysis, speech or image analysis and anomaly detection in genetics. We present a novel window-sliding algorithm for an online change point detection. The proposed approach considers a local blanket of a global Markov Random Field (MRF) representing the signal and its noisy observation. For each window, we define and solve the local energy minimization problem to deduce the gradient on each edge of the MRF graph. The gradient is then processed by an activation function to filter the weak features and produce the final jumps. We demonstrate the effectiveness of our method by comparing its running time and several detection metrics with state of the art algorithms.
Affiliation(s)
- A. Belcaid
- Euromed University of Fes, Route Nationale Fès-Meknès, Morocco
- M. Douimi
- Mathematics Department, National School of Arts and Crafts, Meknes, 50010, Morocco
|
22
|
Wang Y, Wang Z, Zi X. Rank-based multiple change-point detection. COMMUN STAT-THEOR M 2020. [DOI: 10.1080/03610926.2019.1589515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Yunlong Wang
- Institute of Statistics and LPMC Nankai University, Tianjin, China
- Zhaojun Wang
- Institute of Statistics and LPMC Nankai University, Tianjin, China
- Xuemin Zi
- School of Science, Tianjin University of Technology and Education, Tianjin, China
|
23
|
Fearnhead P, Rigaill G. Relating and comparing methods for detecting changes in mean. Stat (Int Stat Inst) 2020. [DOI: 10.1002/sta4.291] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Affiliation(s)
- Paul Fearnhead
- Department of Mathematics and Statistics Lancaster University Lancaster LA1 4YF UK
- Guillem Rigaill
- Université Paris‐Saclay, CNRS, INRAE, Univ Evry, Institute of Plant Sciences Paris‐Saclay (IPS2) Orsay 91405 France
- Université de Paris, CNRS, INRAE, Institute of Plant Sciences Paris‐Saclay (IPS2) Orsay 91405 France
- Université Paris‐Saclay, CNRS, Univ Evry, Laboratoire de Mathématiques et Modélisation d'Evry Evry 91037 France
|
24
|
Wang W, He X, Zhu Z. Statistical inference for multiple change‐point models. Scand Stat Theory Appl 2020. [DOI: 10.1111/sjos.12456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Wu Wang
- Statistics Program King Abdullah University of Science and Technology Saudi Arabia
- Xuming He
- Department of Statistics University of Michigan USA
- Zhongyi Zhu
- Department of Statistics Fudan University China
|
25
|
Chronister WD, Burbulis IE, Wierman MB, Wolpert MJ, Haakenson MF, Smith ACB, Kleinman JE, Hyde TM, Weinberger DR, Bekiranov S, McConnell MJ. Neurons with Complex Karyotypes Are Rare in Aged Human Neocortex. Cell Rep 2020; 26:825-835.e7. [PMID: 30673605 DOI: 10.1016/j.celrep.2018.12.107] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 04/16/2018] [Revised: 09/04/2018] [Accepted: 12/26/2018] [Indexed: 11/26/2022] Open
Abstract
A subset of human neocortical neurons harbors complex karyotypes wherein megabase-scale copy-number variants (CNVs) alter allelic diversity. Divergent levels of neurons with complex karyotypes (CNV neurons) are reported in different individuals, yet genome-wide and familial studies implicitly assume a single brain genome when assessing the genetic risk architecture of neurological disease. We assembled a brain CNV atlas using a robust computational approach applied to a new dataset (>800 neurons from 5 neurotypical individuals) and to published data from 10 additional neurotypical individuals. The atlas reveals that the frequency of neocortical neurons with complex karyotypes varies widely among individuals, but this variability is not readily accounted for by tissue quality or CNV detection approach. Rather, the age of the individual is anti-correlated with CNV neuron frequency. Fewer CNV neurons are observed in aged individuals than in young individuals.
Affiliation(s)
- William D Chronister
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
- Ian E Burbulis
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA; Universidad San Sebastian, Escuela de Medicina, Sede de la Patagonia, Puerto Montt, Chile
- Margaret B Wierman
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
- Matthew J Wolpert
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
- Mark F Haakenson
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
- Aiden C B Smith
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
- Joel E Kleinman
- Lieber Institute for Brain Development, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Thomas M Hyde
- Lieber Institute for Brain Development, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Daniel R Weinberger
- Lieber Institute for Brain Development, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Stefan Bekiranov
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA; Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
- Michael J McConnell
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA; Department of Neuroscience, University of Virginia School of Medicine, Charlottesville, VA 22908, USA; Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA; Center for Brain Immunology and Glia, University of Virginia School of Medicine, Charlottesville, VA 22908, USA; Child Health Research Center, University of Virginia School of Medicine, Charlottesville, VA 22908, USA.
|
26
|
Fryzlewicz P. Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection. J Korean Stat Soc 2020. [DOI: 10.1007/s42952-020-00060-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
|
27
|
Kachouie NN, Shutaywi M, Christiani DC. Discriminant Analysis of Lung Cancer Using Nonlinear Clustering of Copy Numbers. Cancer Invest 2020; 38:102-112. [PMID: 31977287 PMCID: PMC10283398 DOI: 10.1080/07357907.2020.1719501] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/17/2019] [Accepted: 01/18/2020] [Indexed: 01/14/2023]
Abstract
Background: Patient survival is not optimal for non-small cell lung cancer (NSCLC) patients and the recurrence rate is high; hence, early detection is crucial to increase patient survival. Gene-cancer mapping aims to discover genes associated with cancers and, owing to advances in high-throughput genotyping, screening for disease loci on a genome-wide scale is now possible. DNA copy numbers can potentially be used to distinguish cancer from normal cells in early detection of cancer. Methods: We use a nonlinear clustering method, kernel K-means, to separate cancer from normal samples. Kernel K-means is applied to the copy numbers obtained for each chromosome to cluster 63 paired cancer-blood samples (total of 126 samples) into two groups. Clustering performance is evaluated using true and false-positive rates, true and false-negative rates, and a nonlinear criterion, normalized mutual information (NMI). Results: Copy numbers of paired cancer-blood samples for 63 NSCLC patients are used in this study. Kernel K-means was applied to cluster 126 samples into two groups using copy numbers on each chromosome separately. The clustering results for 22 chromosomes are evaluated and their discriminant power in identifying cancer is computed. We identified the top five and bottom five chromosomes based on their discriminant power. Conclusions: The results reveal high discriminant power of chromosomes 8, 5, 1, 3, and 19 for identifying cancer, with the highest sensitivity of 75% yielded by chromosome 5. The bottom five chromosomes, 9, 6, 4, 13, and 21, show low discriminant power, with accuracy below 54%, where true cancer and normal samples are grouped into substantially overlapping groups using copy numbers. This indicates the similarity of the copy numbers obtained for cancer and normal samples on these chromosomes.
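To make the NMI criterion concrete, here is a minimal sketch of how normalized mutual information between true labels and cluster assignments can be computed. The abstract does not state which normalization the authors used; the geometric-mean form below is an assumption, and the function name `normalized_mutual_information` is hypothetical.

```python
import numpy as np


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two labelings, normalised by sqrt(H(A) * H(B)).

    The geometric-mean normalisation is one common choice among several;
    it is assumed here, not taken from the cited study.
    """
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    if a.ndim != 1 or a.shape != b.shape or a.size == 0:
        raise ValueError("labelings must be non-empty 1-D arrays of equal length")

    # Map arbitrary labels to 0..k-1 and build the joint distribution.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    contingency = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(contingency, (a_idx, b_idx), 1)
    p_ab = contingency / a.size
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)

    # Mutual information over the non-empty cells only.
    nz = p_ab > 0
    mi = np.sum(p_ab[nz] * np.log(p_ab[nz] / np.outer(p_a, p_b)[nz]))

    # Marginal entropies; every marginal is positive by construction.
    h_a = -np.sum(p_a * np.log(p_a))
    h_b = -np.sum(p_b * np.log(p_b))
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one labeling has a single cluster.
        return 1.0 if (h_a == 0 and h_b == 0) else 0.0
    return float(mi / np.sqrt(h_a * h_b))


if __name__ == "__main__":
    truth = ["cancer", "cancer", "blood", "blood"]
    clusters = [1, 1, 0, 0]          # same partition, different names
    print(normalized_mutual_information(truth, clusters))
```

Because NMI compares partitions rather than label names, a clustering that swaps the two cluster identifiers still scores as a perfect match, which is why it suits unsupervised evaluation.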
Affiliation(s)
- Meshal Shutaywi
- Department of Mathematical Sciences, Florida Institute of Technology
- David C. Christiani
- Department of Environmental Health, Harvard School of Public Health
- Department of Epidemiology, Harvard School of Public Health
|
28
|
Cheng D, He Z, Schwartzman A. Multiple testing of local extrema for detection of change points. Electron J Stat 2020. [DOI: 10.1214/20-ejs1751] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
|
29
|
Kachouie NN, Deebani W, Christiani DC. Identifying Similarities and Disparities Between DNA Copy Number Changes in Cancer and Matched Blood Samples. Cancer Invest 2019; 37:535-545. [PMID: 31584296 DOI: 10.1080/07357907.2019.1667368] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/16/2023]
Abstract
Background: Non-small cell lung cancer (NSCLC) is the leading cause of cancer-related mortality for men and women in the United States. In spite of curative resection in early stages, patient survival is not optimal and the recurrence rate is high. Consequently, early detection and staging are essential to increase patient survival. Methods: Copy number (CN) changes in cancer populations have been broadly investigated to identify CN gains and deletions associated with cancer. In contrast, in this research, we quantify the similarities and disparities between cancer and paired peripheral blood samples using the maximal information coefficient (MIC). We then detect the spatial locations with substantially high MICs and those with very low MICs in each chromosome. These locations can potentially help with early diagnosis, treatment, and prevention of cancer by identifying the similarities and disparities between cancer and healthy tissues. Results: The lung cancer data used in this project contain CN pairs for cancer and blood (non-involved) samples for 63 subjects. MIC was obtained to quantify the relation (linear or nonlinear) between cancer-blood pair samples for 63 subjects at each location for each chromosome. MIC values above a high threshold and MIC values below a low threshold were located. Among them, the top five (with the lowest MICs and with the highest MICs) were identified for each chromosome. For these identified locations, a high MIC score indicates high similarity between blood (non-involved) and cancer samples, while a low MIC score shows a lack of similarity between the two samples. Conclusions: The results showed that a few chromosomes have a large number of MICs exceeding a high threshold. These locations can potentially be used to identify early indicators of NSCLC. In contrast, a second group of chromosomes has several locations with small MICs, which are potential candidates for developing biomarkers to discriminate cancer from the matched blood sample. Moreover, there is a third group of chromosomes with a large number of MICs exceeding a high threshold and a large set of MICs below a low threshold. These locations can help with both finding early indicators of cancer and developing biomarkers for discriminating cancer from non-involved tissue.
Affiliation(s)
- Nezamoddin N Kachouie
- Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, Florida, USA
- Wejdan Deebani
- Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, Florida, USA
- David C Christiani
- Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts, USA; Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA
|
30
|
Vlisidou I, Hapeshi A, Healey JR, Smart K, Yang G, Waterfield NR. The Photorhabdus asymbiotica virulence cassettes deliver protein effectors directly into target eukaryotic cells. eLife 2019; 8:46259. [PMID: 31526474 PMCID: PMC6748792 DOI: 10.7554/elife.46259] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 02/20/2019] [Accepted: 06/12/2019] [Indexed: 01/19/2023] Open
Abstract
Photorhabdus is a highly effective insect pathogen and symbiont of insecticidal nematodes. To exert its potent insecticidal effects, it elaborates a myriad of toxins and small molecule effectors. Among these, the Photorhabdus Virulence Cassettes (PVCs) represent an elegant self-contained delivery mechanism for diverse protein toxins. Importantly, these self-contained nanosyringes overcome host cell membrane barriers, and act independently, at a distance from the bacteria itself. In this study, we demonstrate that Pnf, a PVC needle complex associated toxin, is a Rho-GTPase, which acts via deamidation and transglutamination to disrupt the cytoskeleton. TEM and Western blots have shown a physical association between Pnf and its cognate PVC delivery mechanism. We demonstrate that for Pnf to exert its effect, translocation across the cell membrane is absolutely essential.
Affiliation(s)
- Isabella Vlisidou
- All Wales Genetics Laboratory, Institute of Medical Genetics, University Hospital of Wales, Cardiff, United Kingdom
- Alexia Hapeshi
- Warwick Medical School, Warwick University, Coventry, United Kingdom
- Joseph Rj Healey
- Warwick Medical School, Warwick University, Coventry, United Kingdom
- Katie Smart
- Warwick Medical School, Warwick University, Coventry, United Kingdom
- Guowei Yang
- Beijing Friendship Hospital, Capital Medical University, Beijing, China
|
31
|
Nenova Z, Hotchkiss J. Appointment utilization as a trigger for palliative care introduction: A retrospective cohort study. Palliat Med 2019; 33:457-461. [PMID: 30747040 DOI: 10.1177/0269216319828602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
BACKGROUND Chronic kidney disease palliative care guidelines would benefit from more diverse and objectively defined health status measures. AIM The aim is to identify high-risk patients from administrative data and facilitate timely and uniform palliative care involvement. DESIGN It is a retrospective cohort study. SETTING/PARTICIPANTS In total, 45,368 Veterans, with chronic kidney disease Stage 3, 4, or 5, were monitored for up to 6 years and categorized into three groups, based on whether they died, started dialysis, or avoided both outcomes. RESULTS Patient's appointment utilization was a significant predictor for both outcomes. It separated individuals into low, medium, and high appointment utilizers. Among the low appointment utilizers, the risk of death did not change significantly, while the risk of dialysis increased. Medium appointment utilizers had a stable risk of death and a decreasing risk of dialysis. Significant appointment utilization (above 31 visits during the baseline year) helped high-risk patients avoid both outcomes of interest-death and dialysis. CONCLUSION Our model could justify the creation of a novel palliative care introduction trigger, as patients with medium demand for care may benefit from additional palliative care evaluation. The trigger could facilitate the uniformization of conservative treatment preparations. It could prompt messages to a managing physician when a patient crosses the threshold between low and medium appointment utilization. It may also aid in system-level policy development. Furthermore, our results highlight the benefit of significant appointment utilization among high-risk patients.
Affiliation(s)
- Zlatana Nenova
- Daniels College of Business, University of Denver, Denver, CO, USA
- John Hotchkiss
- Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA
|
32
|
Xu M, Wu Y, Jin B. Detection of a change-point in variance by a weighted sum of powers of variances test. J Appl Stat 2019. [DOI: 10.1080/02664763.2018.1510475] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
Affiliation(s)
- M. Xu
- Global Risk Management, Scotiabank, Toronto, Canada
- Y. Wu
- Department of Mathematics and Statistics, York University, Toronto, Canada
- B. Jin
- Department of Statistics and Finance, University of Science and Technology of China, Hefei, People's Republic of China
33
|
|
34
|
Sundararajan RR, Pourahmadi M. Nonparametric change point detection in multivariate piecewise stationary time series. J Nonparametr Stat 2018. [DOI: 10.1080/10485252.2018.1504943] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/28/2022]
Affiliation(s)
- Mohsen Pourahmadi
- Department of Statistics, Texas A&M University, College Station, TX, USA
35
|
Messer M, Albert S, Schneider G. The multiple filter test for change point detection in time series. METRIKA 2018. [DOI: 10.1007/s00184-018-0672-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/28/2022]
|
36
|
Measuring Recovery to Build up Metrics of Flood Resilience Based on Pollutant Discharge Data: A Case Study in East China. WATER 2017. [DOI: 10.3390/w9080619] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
|
37
|
Fasola S, Muggeo VMR, Küchenhoff H. A heuristic, iterative algorithm for change-point detection in abrupt change models. Comput Stat 2017. [DOI: 10.1007/s00180-017-0740-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/19/2022]
|
38
|
Hardman SI, Zollinger SA, Koselj K, Leitner S, Marshall RC, Brumm H. Lombard effect onset times reveal the speed of vocal plasticity in a songbird. J Exp Biol 2017; 220:1065-1071. [DOI: 10.1242/jeb.148734] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/24/2016] [Accepted: 12/29/2016] [Indexed: 11/20/2022]
Abstract
Animals that use vocal signals to communicate often compensate for interference and masking from background noise by raising the amplitude of their vocalisations. This response has been termed the Lombard effect. However, despite more than a century of research, little is known about how quickly animals can adjust the amplitude of their vocalisations after the onset of noise. The ability to respond quickly to increases in noise levels would allow animals to avoid signal masking and ensure their calls continue to be heard, even if they are interrupted by sudden bursts of high amplitude noise. We tested how quickly singing male canaries (Serinus canaria) exhibit the Lombard effect by exposing them to short playbacks of white noise and measuring the speed of their responses. We show that canaries exhibit the Lombard effect in as little as 300 ms after the onset of noise and are also able to increase the amplitude of their songs mid-song and mid-phrase without pausing. Our results demonstrate high vocal plasticity in this species and suggest that birds are able to adjust the amplitude of their vocalisations very rapidly to ensure they can still be heard even during sudden changes in background noise levels.
Affiliation(s)
- Samuel I. Hardman
- The Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- Communication and Social Behaviour Group, Seewiesen, 82319, Germany
- Klemen Koselj
- Acoustic and Functional Ecology Group, Seewiesen, 82319, Germany
- Stefan Leitner
- Department of Behavioural Neurobiology, Max Planck Institute for Ornithology, Seewiesen, 82319, Germany
- Rupert C. Marshall
- The Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- Henrik Brumm
- Communication and Social Behaviour Group, Seewiesen, 82319, Germany
39
|
Szlęk J, Pacławski A, Lau R, Jachowicz R, Kazemi P, Mendyk A. Empirical search for factors affecting mean particle size of PLGA microspheres containing macromolecular drugs. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2016; 134:137-147. [PMID: 27480738 DOI: 10.1016/j.cmpb.2016.07.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/09/2016] [Revised: 05/16/2016] [Accepted: 07/01/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND AND OBJECTIVES Poly(lactic-co-glycolic acid) (PLGA) has become one of the most promising polymers in the design, development, and optimization of medical applications. PLGA-based multiparticulate dosage forms are usually prepared as microspheres ranging in size from 5 to 100 µm, depending on the route of administration. The main objectives of the study were to develop a predictive model of mean volumetric particle size and, on its basis, to extract knowledge about the forming behaviour of protein-containing PLGA microspheres. METHODS In the present study, a model for the prediction of mean volumetric particle size, developed with the rgp package of the R environment, is presented. Other tools such as fscaret, monmlp, fugeR, MARS, SVM, kNNreg, Cubist, randomForest and piecewise linear regression were also applied during the data mining procedure. RESULTS The feature selection provided by the fscaret package reduced the original input vector from a total of 295 input variables to 10, 16 and 19. The developed models had good predictive ability, which was confirmed by a normalized root-mean-square error (NRMSE) of 6.8 to 11.1% in a 10-fold cross-validation training procedure. Moreover, the best models were validated using external experimental data. The most predictive model was obtained by rgp in the form of a classical equation, with an NRMSE of 6.1%. CONCLUSIONS A new approach is proposed for computational modelling of the mean particle size of PLGA microspheres and for rule extraction from tree-based models. The feature selection reveals chemical descriptor variables that are important in predicting the size of PLGA microspheres. To achieve a better understanding of the relationships between particle size and formulation characteristics, the surface analysis method and rule extraction procedures were applied.
Affiliation(s)
- Jakub Szlęk
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, Medyczna 9 St., 30-688 Cracow, Poland
- Adam Pacławski
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, Medyczna 9 St., 30-688 Cracow, Poland
- Raymond Lau
  - School of Chemical and Biomedical Engineering, Nanyang Technological University, 62 Nanyang Drive, Singapore 637459, Singapore
- Renata Jachowicz
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, Medyczna 9 St., 30-688 Cracow, Poland
- Pezhman Kazemi
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, Medyczna 9 St., 30-688 Cracow, Poland
- Aleksander Mendyk
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, Medyczna 9 St., 30-688 Cracow, Poland
|
40
|
van den Broek E, van Lieshout S, Rausch C, Ylstra B, van de Wiel MA, Meijer GA, Fijneman RJ, Abeln S. GeneBreak: detection of recurrent DNA copy number aberration-associated chromosomal breakpoints within genes. F1000Res 2016; 5:2340. [PMID: 28713543 PMCID: PMC5500957 DOI: 10.12688/f1000research.9259.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Accepted: 07/05/2017] [Indexed: 01/23/2023] Open
Abstract
Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. 'GeneBreak' is developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, 'GeneBreak' collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated, with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, 'GeneBreak', is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).
Affiliation(s)
- Evert van den Broek
  - Department of Pathology, VU University Medical Center, Amsterdam, 1081 HZ, Netherlands
  - Department of Pathology, Netherlands Cancer Institute, Amsterdam, 1066CX, Netherlands
- Stef van Lieshout
  - Department of Pathology, VU University Medical Center, Amsterdam, 1081 HZ, Netherlands
- Christian Rausch
  - Department of Pathology, VU University Medical Center, Amsterdam, 1081 HZ, Netherlands
  - Department of Pathology, Netherlands Cancer Institute, Amsterdam, 1066CX, Netherlands
- Bauke Ylstra
  - Department of Pathology, VU University Medical Center, Amsterdam, 1081 HZ, Netherlands
- Mark A. van de Wiel
  - Department of Epidemiology & Biostatistics, VU University Medical Center, Amsterdam, 1081 HZ, Netherlands
  - Department of Mathematics, VU University Medical Center, Amsterdam, Amsterdam, 1081 HV, Netherlands
- Gerrit A. Meijer
  - Department of Pathology, VU University Medical Center, Amsterdam, 1081 HZ, Netherlands
  - Department of Pathology, Netherlands Cancer Institute, Amsterdam, 1066CX, Netherlands
- Remond J.A. Fijneman
  - Department of Pathology, VU University Medical Center, Amsterdam, 1081 HZ, Netherlands
  - Department of Pathology, Netherlands Cancer Institute, Amsterdam, 1066CX, Netherlands
- Sanne Abeln
  - Department of Computer Science, VU University Medical Center, Amsterdam, 1081 HV, Netherlands
|
41
|
van den Broek E, van Lieshout S, Rausch C, Ylstra B, van de Wiel MA, Meijer GA, Fijneman RJ, Abeln S. GeneBreak: detection of recurrent DNA copy number aberration-associated chromosomal breakpoints within genes. F1000Res 2016; 5:2340. [PMID: 28713543 PMCID: PMC5500957 DOI: 10.12688/f1000research.9259.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Accepted: 09/14/2016] [Indexed: 06/02/2024] Open
|
42
|
Pein F, Sieling H, Munk A. Heterogeneous change point inference. J R Stat Soc Series B Stat Methodol 2016. [DOI: 10.1111/rssb.12202] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Affiliation(s)
- Axel Munk
  - Georg-August-Universität Göttingen and Max Planck Institute for Biophysical Chemistry; Göttingen, Germany
|
43
|
Kachouie NN, Lin X, Christiani DC, Schwartzman A. Detection of Local DNA Copy Number Changes in Lung Cancer Population Analyses Using A Multi-Scale Approach. COMMUNICATIONS IN STATISTICS. CASE STUDIES, DATA ANALYSIS AND APPLICATIONS 2016; 1:206-216. [PMID: 31489360 PMCID: PMC6727850 DOI: 10.1080/23737484.2016.1197079] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/13/2022]
Abstract
Emerging advances in genomic sequencing have prompted the development of new computational methods for studying the genomic sources of human diseases. This paper presents a recent statistical approach for detecting local regions with significant copy number alterations (CNAs) in a lung cancer population. Mapping such regions is of interest as they are potentially associated with lung cancer. Conventional application of multiple testing methods corresponds to testing for CNAs at each probe separately and thresholding the t-statistics as test statistics. Due to the large number of probes, this approach often fails to detect CNA regions. In contrast, the proposed method uses the heights of located peaks and improves the detection power. This is achieved by taking advantage of the spatial structure in the data as well as reducing the number of tests in the multiple comparisons problem. In copy number analysis, it is common to apply segmentation or change detection tools to each individual genomic sample. However, since segmentation results vary among subjects, it becomes difficult to find the common genomic regions in population analyses. Our approach solves this problem by performing the analysis on summary statistics, so that CNAs are studied directly at the population level. Hence, the region detection is performed on the summary t-statistic map. The proposed method is applied to lung cancer data and shows promise for detection of local regions with significant CNAs.
Affiliation(s)
- Xihong Lin
  - Department of Statistics, Harvard School of Public Health
- David C Christiani
  - Department of Environmental Health, Harvard School of Public Health
  - Department of Epidemiology, Harvard School of Public Health
|
44
|
Rueda C, Fernández MA, Barragán S, Mardia KV, Peddada SD. Circular piecewise regression with applications to cell-cycle data. Biometrics 2016; 72:1266-1274. [PMID: 26991351 DOI: 10.1111/biom.12512] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/01/2015] [Revised: 02/01/2016] [Accepted: 02/01/2016] [Indexed: 01/13/2023]
Abstract
Applications of circular regression models appear in many different fields such as evolutionary psychology, motor behavior, biology, and, in particular, in the analysis of gene expressions in oscillatory systems. Specifically, for the gene expression problem, a researcher may be interested in modeling the relationship among the phases of cell-cycle genes in two species with differing periods. This challenging problem reduces to the problem of constructing a piecewise circular regression model and, with this objective in mind, we propose a flexible circular regression model which allows different parameter values depending on sectors along the circle. We give a detailed interpretation of the parameters in the model and provide maximum likelihood estimators. We also provide a model selection procedure based on the concept of generalized degrees of freedom. The model is then applied to the analysis of two different cell-cycle data sets and through these examples we highlight the power of our new methodology.
Affiliation(s)
- Cristina Rueda
  - Departamento de Estadística e I.O., Universidad de Valladolid, 47011 Valladolid, Spain
- Miguel A Fernández
  - Departamento de Estadística e I.O., Universidad de Valladolid, 47011 Valladolid, Spain
- Sandra Barragán
  - Departamento de Estadística e I.O., Universidad de Valladolid, 47011 Valladolid, Spain
- Kanti V Mardia
  - Department of Statistics, University of Oxford, Oxford, U.K., and Department of Statistics, University of Leeds, Leeds, U.K.
- Shyamal D Peddada
  - Biostatistics and Computational Biology Branch NIEHS (NIH), Research Triangle Park, NC, U.S.A.
|
45
|
Hitt NP, Floyd M, Compton M, McDonald K. Threshold Responses of Blackside Dace (Chrosomus cumberlandensis) and Kentucky Arrow Darter (Etheostoma spilotum) to Stream Conductivity. SOUTHEAST NAT 2016. [DOI: 10.1656/058.015.0104] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022]
|
46
|
Exploring change of internal nutrients cycling in a shallow lake: A dynamic nutrient driven phytoplankton model. Ecol Modell 2015. [DOI: 10.1016/j.ecolmodel.2015.06.025] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 11/21/2022]
|
47
|
Nguyen TD, Schmidt B, Zheng Z, Kwoh CK. Efficient and Accurate OTU Clustering with GPU-Based Sequence Alignment and Dynamic Dendrogram Cutting. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:1060-1073. [PMID: 26451819 DOI: 10.1109/tcbb.2015.2407574] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
De novo clustering is a popular technique to perform taxonomic profiling of a microbial community by grouping 16S rRNA amplicon reads into operational taxonomic units (OTUs). In this work, we introduce a new dendrogram-based OTU clustering pipeline called CRiSPy. The key idea used in CRiSPy to improve clustering accuracy is the application of an anomaly detection technique to obtain a dynamic distance cutoff instead of using the de facto value of 97 percent sequence similarity as in most existing OTU clustering pipelines. This technique works by detecting an abrupt change in the merging heights of a dendrogram. To produce the output dendrograms, CRiSPy employs the OTU hierarchical clustering approach that is computed on a genetic distance matrix derived from an all-against-all read comparison by pairwise sequence alignment. However, most existing dendrogram-based tools have difficulty processing datasets larger than 10,000 unique reads due to high computational complexity. We address this difficulty by developing two efficient algorithms for CRiSPy: a compute-efficient GPU-accelerated parallel algorithm for pairwise distance matrix computation and a memory-efficient hierarchical clustering algorithm. Our experiments on various datasets with distinct attributes show that CRiSPy is able to produce more accurate OTU groupings than most OTU clustering applications.
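The idea of deriving a distance cutoff from an abrupt change in dendrogram merging heights can be illustrated with a minimal sketch. It is not CRiSPy's anomaly detection method, which the abstract does not specify; it assumes a simple largest-gap heuristic, and the function name `dynamic_cutoff` and the sample heights are hypothetical.

```python
def dynamic_cutoff(merge_heights):
    """Return a cut height placed in the largest gap between
    consecutive merge heights (a simple stand-in heuristic)."""
    heights = sorted(merge_heights)
    if len(heights) < 2:
        raise ValueError("need at least two merge heights")
    # Difference between each merge height and the next one up.
    gaps = [b - a for a, b in zip(heights, heights[1:])]
    # Index of the largest jump; ties resolve to the lowest index.
    i = max(range(len(gaps)), key=gaps.__getitem__)
    # Cut halfway across the jump.
    return (heights[i] + heights[i + 1]) / 2


# Hypothetical merge heights: tight within-cluster merges, then a jump.
heights = [0.01, 0.02, 0.03, 0.04, 0.20, 0.25]
cut = dynamic_cutoff(heights)
```

With these illustrative heights the largest jump lies between 0.04 and 0.20, so the cut falls between them rather than at a fixed 3 percent distance.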
|
48
|
Guo P, Zeng F, Hu X, Zhang D, Zhu S, Deng Y, Hao Y. Improved Variable Selection Algorithm Using a LASSO-Type Penalty, with an Application to Assessing Hepatitis B Infection Relevant Factors in Community Residents. PLoS One 2015. [PMID: 26214802 PMCID: PMC4516242 DOI: 10.1371/journal.pone.0134151] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 11/18/2022] Open
Abstract
OBJECTIVES In epidemiological studies, it is important to identify independent associations between collective exposures and a health outcome. The current stepwise selection technique ignores stochastic errors and suffers from a lack of stability. The alternative LASSO-penalized regression model can be applied to detect significant predictors from a pool of candidate variables. However, this technique is prone to false positives and tends to create excessive biases. It remains challenging to develop robust variable selection methods and enhance predictability. MATERIAL AND METHODS Two improved algorithms, denoted the two-stage hybrid and bootstrap ranking procedures, both using a LASSO-type penalty, were developed for epidemiological association analysis. The performance of the proposed procedures and of other methods, including conventional LASSO, Bolasso, stepwise and stability selection models, was evaluated using intensive simulation. In addition, the methods were compared in an empirical analysis based on large-scale survey data of hepatitis B infection-relevant factors among Guangdong residents. RESULTS The proposed procedures produced comparable or less biased selection results when compared to conventional variable selection models. Overall, the two newly proposed procedures were stable across the simulation scenarios, demonstrating a higher power and a lower false positive rate during variable selection than the compared methods. In the empirical analysis, the proposed procedures yielded a sparse set of hepatitis B infection-relevant factors, gave the best predictive performance, and were able to select a more stringent set of factors. According to the proposed procedures, the individual history of hepatitis B vaccination and the family and individual history of hepatitis B infection were associated with hepatitis B infection in the studied residents. CONCLUSIONS The newly proposed procedures improve the identification of significant variables and enable us to derive a new insight into epidemiological association analysis.
Affiliation(s)
- Pi Guo
  - Department of Medical Statistics and Epidemiology and Health Information Research Center, School of Public Health, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
  - Laboratory of Health Informatics, Guangdong Key Laboratory of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
- Fangfang Zeng
  - Department of Medical Statistics and Epidemiology and Health Information Research Center, School of Public Health, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
  - Laboratory of Health Informatics, Guangdong Key Laboratory of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
- Xiaomin Hu
  - Department of Medical Statistics and Epidemiology and Health Information Research Center, School of Public Health, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
  - Laboratory of Health Informatics, Guangdong Key Laboratory of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
- Dingmei Zhang
  - Department of Medical Statistics and Epidemiology and Health Information Research Center, School of Public Health, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
  - Laboratory of Health Informatics, Guangdong Key Laboratory of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
- Shuming Zhu
  - Department of Medical Statistics and Epidemiology and Health Information Research Center, School of Public Health, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
  - Laboratory of Health Informatics, Guangdong Key Laboratory of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
- Yu Deng
  - Department of Medical Statistics and Epidemiology and Health Information Research Center, School of Public Health, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
  - Laboratory of Health Informatics, Guangdong Key Laboratory of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
- Yuantao Hao
  - Department of Medical Statistics and Epidemiology and Health Information Research Center, School of Public Health, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
  - Laboratory of Health Informatics, Guangdong Key Laboratory of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, 510080, China
|
49
|
Hybrid algorithms for multiple change-point detection in biological sequences. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2015; 823:41-61. [PMID: 25381101 DOI: 10.1007/978-3-319-10984-8_3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/10/2023]
Abstract
Array comparative genomic hybridization (aCGH) is one of the techniques that can be used to detect copy number variations in DNA sequences in high resolution. It has been identified that abrupt changes in the human genome play a vital role in the progression and development of many complex diseases. In this study we propose two distinct hybrid algorithms that combine efficient sequential change-point detection procedures (the Shiryaev-Roberts procedure and the cumulative sum control chart (CUSUM) procedure) with the Cross-Entropy method, which is an evolutionary stochastic optimization technique to estimate both the number of change-points and their corresponding locations in aCGH data. The proposed hybrid algorithms are applied to both artificially generated data and real aCGH experimental data to illustrate their usefulness. Our results show that the proposed methodologies are effective in detecting multiple change-points in biological sequences of continuous measurements.
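For readers unfamiliar with the CUSUM procedure named above, the following is a minimal sketch of a one-sided sequential CUSUM detector for an upward mean shift. It covers only the basic CUSUM recursion, not the hybrid Cross-Entropy algorithms of the paper; the function name `cusum_alarm`, the parameters `mu0`, `slack` and `threshold`, and the sample data are all assumptions chosen for illustration.

```python
def cusum_alarm(xs, mu0, slack, threshold):
    """One-sided CUSUM: return the index of the first sample at which
    the statistic exceeds `threshold`, or None if it never does.

    mu0       -- assumed pre-change mean
    slack     -- allowance subtracted at each step (often half the
                 shift size one wants to detect)
    threshold -- decision limit for raising an alarm
    """
    g = 0.0
    for t, x in enumerate(xs):
        # Accumulate evidence of an upward shift; reset at zero.
        g = max(0.0, g + (x - mu0 - slack))
        if g > threshold:
            return t
    return None


# Hypothetical signal: mean 0 for five samples, then mean 1.
signal = [0.0] * 5 + [1.0] * 5
alarm_index = cusum_alarm(signal, mu0=0.0, slack=0.25, threshold=2.0)
```

In this toy signal the shift occurs at index 5 and the alarm is raised a few samples later, which reflects the usual trade-off between detection delay and false alarms set by the slack and threshold.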
|
50
|
Adelfio G, Boscaino G. Degree course change and student performance: a mixed-effect model approach. J Appl Stat 2015. [DOI: 10.1080/02664763.2015.1018673] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/23/2022]
|