1
Abstract
A two-level group-specific curve model is such that the mean response of each member of a group is a separate smooth function of a predictor of interest. The three-level extension is such that one grouping variable is nested within another one, and higher level extensions are analogous. Streamlined variational inference for higher level group-specific curve models is a challenging problem. We confront it by systematically working through two-level and then three-level cases and making use of the higher level sparse matrix infrastructure laid down in Nolan and Wand (2019). A motivation is analysis of data from ultrasound technology for which three-level group-specific curve models are appropriate. Whilst extension to the number of levels exceeding three is not covered explicitly, the pattern established by our systematic approach sheds light on what is required for even higher level group-specific curve models.
Affiliation(s)
- M Menictas
- School of Mathematical and Physical Sciences, University of Technology Sydney, Australia
- T H Nolan
- School of Mathematical and Physical Sciences, University of Technology Sydney, Australia; Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers
- D G Simpson
- Department of Statistics, University of Illinois at Urbana-Champaign, United States of America
- M P Wand
- School of Mathematical and Physical Sciences, University of Technology Sydney, Australia; Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers
3
Affiliation(s)
- M. P. Wand
- University of New South Wales, Kensington, Australia
4
Affiliation(s)
- M. P. Wand
- School of Mathematical and Physical Sciences, University of Technology Sydney, Sydney, Australia, and Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology (QUT), Brisbane, Australia
5
Affiliation(s)
- M. P. Wand
- School of Mathematical and Physical Sciences, University of Technology Sydney, Sydney, Australia, and Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology (QUT), Brisbane, Australia
8
Neville SE, Ormerod JT, Wand MP. Mean field variational Bayes for continuous sparse signal shrinkage: Pitfalls and remedies. Electron J Stat 2014. [DOI: 10.1214/14-ejs910]
16
Abstract
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Affiliation(s)
- M Al Kadiri
- Centre for Statistical and Survey Methodology, School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, New South Wales, Australia
18
Wand MP, Ormerod JT. On Semiparametric Regression with O'Sullivan Penalised Splines. Aust N Z J Stat 2010. [DOI: 10.1111/j.1467-842x.2010.00578.x]
20
Abstract
The t-distribution allows the incorporation of outlier robustness into statistical models while retaining the elegance of likelihood-based inference. In this paper, we develop and implement a linear mixed model of general design using the univariate t-distribution. This general design allows a considerably richer class of models to be fit than is possible with existing methods. Included in this class are semiparametric regression, smoothing, and spatial models.
Affiliation(s)
- J Staudenmayer
- Department of Mathematics and Statistics, University of Massachusetts, USA
- E E Lake
- Eigenstat Inc., Newton, Massachusetts, USA
- M P Wand
- School of Mathematics and Applied Statistics, University of Wollongong, Australia
21
Abstract
Motivated by the needs of scientists using flow cytometry, we study the problem of estimating the region where two multivariate samples differ in density. We call this problem highest density difference region estimation and recognise it as a two-sample analogue of highest density region or excess set estimation. Flow cytometry samples typically have sizes of the order of 10,000 to 100,000 and dimension ranging from about 3 to 20. The industry standard for the problem being studied is called Frequency Difference Gating, due to Roederer and Hardy (2001). After couching the problem in a formal statistical framework, we devise an alternative estimator that draws upon recent statistical developments such as patient rule induction methods. Improved performance is illustrated in simulations. While motivated by flow cytometry, the methodology is suitable for general multivariate random samples where density difference regions are of interest.
Affiliation(s)
- Tarn Duong
- Institut Pasteur, Groupe Imagerie et Modélisation; CNRS, URA 2582, F-75015 Paris, France
24
Abstract
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology - thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
Affiliation(s)
- David Ruppert
- School of Operations Research and Information Engineering, Cornell University, 1170 Comstock Hall, Ithaca, NY 14853, U.S.A
26
Abstract
Semiparametric mixed model analysis benefits from variability estimates such as standard errors of effect estimates and variability bars to accompany curve estimates. We show how the underlying variance calculations can be done extremely efficiently compared with the direct naïve approach. These streamlined calculations are linear in the number of subjects, representing a two-orders-of-magnitude improvement.
Affiliation(s)
- Andrew D A C Smith
- School of Mathematics and Statistics, University of New South Wales, Sydney 2052, NSW, Australia.
27
Kuo FY, Dunsmuir WTM, Sloan IH, Wand MP, Womersley RS. Quasi-Monte Carlo for Highly Structured Generalised Response Models. Methodol Comput Appl Probab 2007. [DOI: 10.1007/s11009-007-9045-3]
28
Abstract
Asthma researchers have found some evidence that geographical variations in susceptibility to asthma could reflect the effect of community level factors such as exposure to violence. Our methodology was motivated by a study of age at onset of asthma among children of inner-city neighbourhoods in East Boston. Cox's proportional hazards model was not appropriate since there was not enough information about the nature of geographical variations to impose a parametric relationship. In addition, some of the known risk factors were believed to have non-linear log-hazard ratios. We extend the geoadditive models of Kammann and Wand to the case where the outcome measure is a possibly censored time to event. We reduce the problem to one of fitting a Poisson mixed model by using Poisson approximations in conjunction with a mixed model formulation of generalized additive modelling. Our method allows for low-rank additive modelling, provides likelihood-based estimation of all parameters including the amount of smoothing, and can be implemented using standard software. We illustrate our method on the East Boston data.
Affiliation(s)
- B Ganguli
- Department of Statistics, University of Calcutta, Kolkata, India.
29
Oakes SR, Robertson FG, Kench JG, Gardiner-Garden M, Wand MP, Green JE, Ormandy CJ. Loss of mammary epithelial prolactin receptor delays tumor formation by reducing cell proliferation in low-grade preinvasive lesions. Oncogene 2006; 26:543-53. [PMID: 16862169 DOI: 10.1038/sj.onc.1209838]
Abstract
Top quartile serum prolactin levels confer a twofold increase in the relative risk of developing breast cancer. Prolactin exerts this effect at an ill-defined point in the carcinogenic process, via mechanisms involving direct action via prolactin receptors within mammary epithelium and/or indirect action through regulation of other hormones such as estrogen and progesterone. We have addressed these questions by examining mammary carcinogenesis in transplants of mouse mammary epithelium expressing the SV40T oncogene, with or without the prolactin receptor, using host animals with a normal endocrine system. In prolactin receptor knockout transplants the area of neoplasia was significantly smaller (7 versus 17%; P < 0.001 at 22 weeks and 7 versus 14%; P = 0.009 at 32 weeks). Low-grade neoplastic lesions displayed reduced BrdU incorporation rate (11.3 versus 17%; P = 0.003) but no change in apoptosis rate. Tumor latency increased (289 days versus 236 days, P < 0.001). Tumor frequency, growth rate, morphology, cell proliferation and apoptosis were not altered. Thus, prolactin acts directly on the mammary epithelial cells to increase cell proliferation in preinvasive lesions, resulting in more neoplasia and acceleration of the transition to invasive carcinoma. Targeting of mammary prolactin signaling thus provides a strategy to prevent the early progression of neoplasia to invasive carcinoma.
Affiliation(s)
- S R Oakes
- Cancer Research Program, Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW, Australia
31
Salganik MP, Hardie DL, Swart B, Dandie GW, Zola H, Shaw S, Shapiro H, Tinckam K, Milford EL, Wand MP. Detecting antibodies with similar reactivity patterns in the HLDA8 blind panel of flow cytometry data. J Immunol Methods 2005; 305:67-74. [PMID: 16129446 DOI: 10.1016/j.jim.2005.07.002] [Accepted: 06/20/2005]
Abstract
The blind panel collected for the 8th Human Leucocyte Differentiation Antigens Workshop (HLDA8; ) included 49 antibodies of known CD specificities and 76 antibodies of unknown specificity. We have identified groups of antibodies showing similar patterns of reactivity that need to be investigated by biochemical methods to evaluate whether the antibodies within these groups are reacting with the same molecule. Our approach to data analysis was based on the work of Salganik et al. (in press) [Salganik, M.P., Milford E.L., Hardie D.L., Shaw, S., Wand, M.P., in press. Classifying antibodies using flow cytometry data: class prediction and class discovery. Biometrical Journal].
Affiliation(s)
- M P Salganik
- Department of Biostatistics, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115, USA.
32
Abstract
We present a simple semiparametric model for fitting subject-specific curves for longitudinal data. Individual curves are modelled as penalized splines with random coefficients. This model has a mixed model representation, and it is easily implemented in standard statistical software. We conduct an analysis of the long-term effect of radiation therapy on the height of children suffering from acute lymphoblastic leukaemia using penalized splines in the framework of semiparametric mixed effects models. The analysis revealed significant differences between therapies and showed that the growth rate of girls in the study cannot be fully explained by the group-average curve and that individual curves are necessary to reflect the individual response to treatment. We also show how to implement these models in S-PLUS and R in the appendix.
Affiliation(s)
- M Durbán
- Department of Statistics, Universidad Carlos III de Madrid, Leganés, 28911 Madrid, Spain.
36
Abstract
We discuss the use of local likelihood methods to fit proportional hazards regression models to right and interval censored data. The assumed model allows for an arbitrary, smoothed baseline hazard on which a vector of covariates operates in a proportional manner, and thus produces an interpretable baseline hazard function along with estimates of global covariate effects. For estimation, we extend the modified EM algorithm suggested by Betensky, Lindsey, Ryan and Wand. We illustrate the method with data on times to deterioration of breast cosmesis and HIV-1 infection rates among haemophiliacs.
Affiliation(s)
- Rebecca A Betensky
- Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, Massachusetts 02115, USA.
37
Abstract
We conduct a reanalysis of data from the Utah Valley respiratory health/air pollution study of Pope and co-workers (Pope et al., 1991) using additive mixed models. A relatively recent statistical development (e.g. Wang, 1998; Verbyla et al., 1999; Lin and Zhang, 1999), the methods allow for smooth functional relationships, subject-specific effects and time series error structure. All three of these are apparent in the Utah Valley data.
Affiliation(s)
- B A Coull
- Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, Massachusetts 02115, USA.
39
Abstract
Often, the functional form of covariate effects in an additive model varies across groups defined by levels of a categorical variable. This structure represents a factor-by-curve interaction. This article presents penalized spline models that incorporate factor-by-curve interactions into additive models. A mixed model formulation for penalized splines allows for straightforward model fitting and smoothing parameter selection. We illustrate the proposed model by applying it to pollen ragweed data in which seasonal trends vary by year.
Affiliation(s)
- B A Coull
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA.
41
Moore PE, Laporte JD, Abraham JH, Schwartzman IN, Yandava CN, Silverman ES, Drazen JM, Wand MP, Panettieri RA, Shore SA. Polymorphism of the beta(2)-adrenergic receptor gene and desensitization in human airway smooth muscle. Am J Respir Crit Care Med 2000; 162:2117-24. [PMID: 11112125 DOI: 10.1164/ajrccm.162.6.9909046]
Abstract
We examined the influence of two common polymorphic forms of the beta(2)-adrenergic receptor (beta(2)AR): the Gly16 and Glu27 alleles, on acute and long-term beta(2)AR desensitization in human airway smooth muscle (HASM) cells. In cells from 15 individuals, considered without respect to genotype, pretreatment with Isoproterenol (ISO) at 10(-7) M for 1 h or 24 h caused approximately 25% and 64% decreases in the ability of subsequent ISO (10(-6) M) stimulation to reduce HASM cell stiffness as measured by magnetic twisting cytometry. Similar results were obtained with ISO-induced cyclic adenosine monophosphate (cAMP) as the outcome indicator. Data were then stratified post hoc by genotype. Cells containing at least one Glu27 allele (equivalent to presence of the Gly16Glu27 haplotype) showed significantly greater acute desensitization than did cells with no Glu27 allele, whether ISO-induced cell stiffness (34% versus 19%, p < 0.03) or cAMP formation (58% versus 11%, p < 0.02) was measured. Likewise, cells with any Glu27 allele showed greater long-term desensitization of cell stiffness and cAMP formation responses than did cells without the Glu27 allele. The distribution of genotypes limited direct conclusions about the influence of the Gly16 allele. However, presence of the Gly16Gln27 haplotype was associated with less acute and long-term desensitization of ISO-induced cAMP formation than was seen in cells without the Gly16Gln27 haplotype (14% versus 47%, p < 0.09 for short-term desensitization; 32% versus 84%, p < 0.01 for long-term desensitization), suggesting that the influence of Glu27 is not through its association with Gly16. The Glu27 allele was in strong linkage disequilibrium with the Arg19 allele, a polymorphic form of the beta(2)AR upstream peptide of the 5'-leader cistron of the beta(2)AR, and this polymorphism in the beta(2)AR 5'-flanking region may explain the effects of the Glu27 allele. 
Cells with any Arg19 allele showed significantly greater acute and long-term desensitization of ISO-induced cAMP formation than did cells without the Arg19 allele (54% versus 2%, p < 0.01 for short-term desensitization; 73% versus 35%, p < 0.05 for long-term desensitization). Similar results were obtained for ISO-induced changes in cell stiffness. Thus, the presence of the Glu27 allele is associated with increased acute and long-term desensitization in HASM.
MESH Headings
- Adrenergic beta-Agonists/pharmacology
- Alleles
- Analysis of Variance
- Base Sequence
- Bucladesine/pharmacology
- Cells, Cultured
- Dose-Response Relationship, Drug
- Genotype
- Humans
- Indomethacin/pharmacology
- Isoproterenol/pharmacology
- Molecular Sequence Data
- Muscle, Smooth/cytology
- Muscle, Smooth/drug effects
- Muscle, Smooth/physiology
- Polymorphism, Genetic/drug effects
- Polymorphism, Genetic/genetics
- Polymorphism, Genetic/physiology
- Receptors, Adrenergic, beta-2/drug effects
- Receptors, Adrenergic, beta-2/genetics
- Receptors, Adrenergic, beta-2/physiology
- Time Factors
- Trachea/cytology
Affiliation(s)
- P E Moore
- Department of Environmental Physiology, Harvard School of Public Health, Boston, Massachusetts 02115, USA
42
Abstract
There are a number of applied settings where a response is measured repeatedly over time, and the impact of a stimulus at one time is distributed over several subsequent response measures. In the motivating application the stimulus is an air pollutant such as airborne particulate matter and the response is mortality. However, several other variables (e.g. daily temperature) impact the response in a possibly non-linear fashion. To quantify the effect of the stimulus in the presence of covariate data we combine two established regression techniques: generalized additive models and distributed lag models. Generalized additive models extend multiple linear regression by allowing continuous covariates to be modeled as smooth, but otherwise unspecified, functions. Distributed lag models aim to relate the outcome variable to lagged values of a time-dependent predictor in a parsimonious fashion. The resulting models, which we call generalized additive distributed lag models, are seen to effectively quantify the so-called 'mortality displacement effect' in environmental epidemiology, as illustrated through air pollution/mortality data from Milan, Italy.
Affiliation(s)
- A Zanobetti
- Department of Environmental Health, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115, USA
43
Abstract
The generalized additive model is extended to handle negative binomial responses. The extension is complicated by the fact that the negative binomial distribution has two parameters and is not in the exponential family. The methodology is applied to data involving DNA adduct counts and smoking variables among ex-smokers with lung cancer. A more detailed investigation is made of the parametric relationship between the number of adducts and years since quitting while retaining a smooth relationship between adducts and the other covariates.
Affiliation(s)
- S W Thurston
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA.
44
Abstract
A method for fitting regression models to data that exhibit spatial correlation and heteroskedasticity is proposed. It is well known that ignoring a nonconstant variance does not bias least-squares estimates of regression parameters; thus, data analysts are easily led to the false belief that moderate heteroskedasticity can generally be ignored. Unfortunately, ignoring nonconstant variance when fitting variograms can seriously bias estimated correlation functions. By modeling heteroskedasticity and standardizing by estimated standard deviations, our approach eliminates this bias in the correlations. A combination of parametric and nonparametric regression techniques is used to iteratively estimate the various components of the model. The approach is demonstrated on a large data set of predicted nitrogen runoff from agricultural lands in the Midwest and Northern Plains regions of the U.S.A. For this data set, the model comprises three main components: (1) the mean function, which includes farming practice variables, local soil and climate characteristics, and the nitrogen application treatment, is assumed to be linear in the parameters and is fitted by generalized least squares; (2) the variance function, which contains a local and a spatial component whose shapes are left unspecified, is estimated by local linear regression; and (3) the spatial correlation function is estimated by fitting a parametric variogram model to the standardized residuals, with the standardization adjusting the variogram for the presence of heteroskedasticity. The fitting of these three components is iterated until convergence. The model provides an improved fit to the data compared with a previous model that ignored the heteroskedasticity and the spatial correlation.
Affiliation(s)
- J D Opsomer
- Department of Statistics, Iowa State University, Ames, USA.
46
Brumback BA, Ruppert D, Wand MP. Variable Selection and Function Estimation in Additive Nonparametric Regression Using a Data-Based Prior: Comment. J Am Stat Assoc 1999. [DOI: 10.2307/2669991]
48
Abstract
We propose a smooth hazard estimator for interval-censored survival data using the method of local likelihood. The model is fit using a local EM algorithm. The estimator is more descriptive than traditional empirical estimates in regions of concentrated information and takes on a parametric flavor in regions of sparse information. We derive two different standard error estimates for the smooth curve, one based on asymptotic theory and the other on the bootstrap. We illustrate the local EM method for times to breast cosmesis deterioration (Finkelstein, 1986, Biometrics 42, 845-854) and for times to HIV-1 infection for individuals with hemophilia (Kroner et al., 1994, Journal of AIDS 7, 279-286). Our hazard estimates for each of these data sets show interesting structures that would not be found using a standard parametric hazard model or empirical survivorship estimates.
Affiliation(s)
- R A Betensky
- Harvard School of Public Health, Boston, Massachusetts 02115, USA.