Department of Statistics Seminar, Spring 2006
All the seminars are held in the meeting room of the Department of Statistics.
Please contact Hongmei Jiang at hongmei@northwestern.edu if you would like to suggest a speaker for this seminar.
Spring 2006 Schedule
Tuesday, March 7, 2006, 11am
Ms. Cindy Xin Wang, Department of Statistics, Northwestern University
Title: Gatekeeping Procedures Based on Weighted Bonferroni Tests for Multiple Endpoints in Dose Finding Studies
Abstract: In many dose finding studies there are hierarchically ordered endpoints (e.g., primary, secondary, etc.), and a given dose is compared with a control on any endpoint conditional on the tests on the higher-ordered endpoints being significant (serial gatekeeping). It is required to control the familywise error rate at a designated level, taking into account the multiplicity of tests. We give a closed procedure (Marcus, Peritz and Gabriel 1976) for this problem by applying the general and flexible tree-structured testing approach to gatekeeping problems developed in Dmitrienko, Wiens, Tamhane and Wang (2006). The proposed procedure uses weighted Bonferroni tests for testing intersection hypotheses. For easier implementation of this closed procedure, we give an equivalent stepwise procedure that uses penalized Bonferroni tests for all endpoints except the last, for which it uses a penalized Holm test. The penalty charged at each step of testing is inversely proportional to a so-called rejection gain factor, which depends on the number of rejections at earlier steps and the weights assigned to those rejected hypotheses. The method is applied to data from a diabetes drug trial with three endpoints. Extensions in which the Bonferroni test is replaced with the Simes, resampling, or Dunnett test are indicated.
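For readers unfamiliar with the building blocks named in this abstract, the following is a minimal sketch of the textbook weighted Bonferroni test and the textbook Holm step-down test. It is not the penalized procedure of the talk: the rejection gain factor is not specified here, so no penalty is applied. The function names, weights, and p-values are illustrative assumptions.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Reject H_i when p_i <= w_i * alpha; the weights are assumed to sum to 1."""
    return [p <= w * alpha for p, w in zip(p_values, weights)]


def holm(p_values, alpha=0.05):
    """Textbook (unweighted, unpenalized) Holm step-down test."""
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, i in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha/m, alpha/(m-1), ...
        if p_values[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


# Hypothetical p-values for three endpoints.
p = [0.01, 0.04, 0.02]
print(weighted_bonferroni(p, [0.5, 0.25, 0.25]))  # [True, False, False]
print(holm(p))                                    # [True, True, True]
```

Both tests control the familywise error rate at level alpha; on these illustrative p-values the step-down test rejects more hypotheses than the single-step one.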
Tuesday, April 4, 2006, 11am
Ms. Yang Ge, Department of Statistics, Northwestern University
Place: Meeting Room of Department of Statistics
Title: On Consistency of Bayesian Inference with Mixtures of Logistic Regression Models
Abstract: This is a theoretical study of the consistency properties of Bayesian inference using mixtures of logistic regression models. When standard logistic regression models are combined in a ‘mixtures of experts’ set-up, a flexible model is formed to model the relationship between a binary (yes-no) response y and a vector of predictors x. Bayesian inference conditional on the observed data can then be used for regression and classification. This study gives conditions on choosing the number of experts (i.e., number of mixing components) k, or choosing a prior distribution for k, so that Bayesian inference is ‘consistent’, in the sense of ‘often approximating’ the underlying true relationship between y and x. The resulting classification rule is also ‘consistent’, in the sense of having near-optimal performance in classification. We show these desirable consistency properties with a nonstochastic k growing slowly with the sample size n of the observed data, or with a random k that takes large values with nonzero but small probabilities.
Monday, April 24, 2006, 3:30pm (Note: unusual time)
Professor Mohsen Pourahmadi, Division of Statistics, Northern
Title: Generalized Linear Models for the Covariance Matrix of Longitudinal Data
Abstract: We survey the progress made in modelling covariance matrices from the perspective of generalized linear models (GLM) and show how one can move beyond the use of the identity and logarithmic link functions, and prespecified structures. Observing that most time-domain models (ARMA, state-space, ...) in time series analysis are means to diagonalize a Toeplitz covariance matrix via a unit lower triangular matrix (Cholesky decomposition), we discuss the distinguished role of the Cholesky decomposition in providing a systematic and data-based procedure for formulating and fitting parsimonious models for general covariance matrices guaranteeing the positive-definiteness of the estimates. Pulling together some techniques from regression and time series analyses provides the necessary tools for the procedure, which reduces the unintuitive task of modelling covariance matrices to that of a sequence of regression models. The procedure is illustrated using a real longitudinal dataset. Once a bona fide GLM framework for modelling covariances is found, its Bayesian, nonparametric, generalized additive and other extensions can be developed in direct analogy with the respective extensions of the traditional GLM.
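As a small illustration of the diagonalization mentioned in this abstract, the sketch below factors a covariance matrix as Sigma = L D L^T with L unit lower triangular and D diagonal, so that the inverse of L is a unit lower triangular matrix that diagonalizes Sigma. The function name and the AR(1)-type Toeplitz example are illustrative assumptions, not material from the talk.

```python
def modified_cholesky(sigma):
    """Return (L, d) such that sigma = L * diag(d) * L^T, with L unit lower triangular."""
    n = len(sigma)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # d[j] plays the role of the variance left after regressing on earlier measurements.
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] <= 0:
            raise ValueError("matrix is not positive definite")
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - s) / d[j]
    return L, d


# Hypothetical example: Toeplitz covariance of a unit-variance AR(1) process, rho = 0.5.
rho, n = 0.5, 4
sigma = [[rho ** abs(i - j) for j in range(n)] for i in range(n)]
L, d = modified_cholesky(sigma)
print(d)  # innovation variances: 1, then 1 - rho**2 for every later time point
```

Any positive values in d together with any unit lower triangular L give back a positive-definite matrix, which is why modelling these unconstrained quantities keeps the covariance estimate positive definite.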
Tuesday, May 2, 2006, 11am
Professor Hakan Demirtas, Division of Epidemiology and Biostatistics,
Title: Multiple imputation under Bayesianly smoothed random-coefficient hierarchical pattern-mixture models for nonignorably missing longitudinal data
Abstract: Conventional pattern-mixture models can be highly sensitive to model misspecification. In many longitudinal studies, where the nature of the drop-out and the form of the population model are unknown, interval estimates from any single pattern-mixture model may suffer from undercoverage, because uncertainty about model misspecification is not taken into account. In this talk, I will introduce a new class of Bayesian random coefficient pattern-mixture models to address potentially non-ignorable drop-out. Instead of imposing hard equality constraints to overcome inherent inestimability problems in pattern-mixture models, I propose to smooth the polynomial coefficient estimates across patterns using a hierarchical Bayesian model that allows random variation across groups. Using real and simulated data, I show that multiple imputation under a three-level linear mixed-effects model which accommodates a random level due to drop-out groups can be an effective method to deal with non-ignorable drop-out by allowing model uncertainty to be incorporated into the imputation process.
Papers that are relevant to this talk:
Demirtas, H. & Schafer, J.L. (2003). On the performance of random-coefficient pattern-mixture models for non-ignorable drop-out. Statistics in Medicine, 22, 2553-2575.
Demirtas, H. (2004). Modeling incomplete longitudinal data. Journal of Modern Applied Statistical Methods, 3(2), 305-321.
Demirtas, H. (2005). Multiple imputation under Bayesianly smoothed pattern-mixture models for non-ignorable drop-out. Statistics in Medicine, 24, 2345-2363.
Demirtas, H. (2005). Bayesian analysis of hierarchical pattern-mixture models for clinical trials data with attrition and comparisons to commonly used ad-hoc and model-based approaches. Journal of Biopharmaceutical Statistics, 15(3), 383-402.
Tuesday, May 9, 2006, 11am
Professor Peter Song, Department of Statistics and Actuarial Science,
Title: Maximization by Parts in Likelihood Inference
Abstract: In this talk I will present a new algorithm for solving a score equation for the maximum likelihood estimate in certain problems of practical interest. The method circumvents the need to compute second order derivatives of the full likelihood function. It exploits the structure of certain models that yield a natural decomposition of a very complicated likelihood function. In this decomposition, the first part is a log likelihood from a simply analyzed model and the second part is used to update estimates from the first. Convergence properties of this iterative (fixed point) algorithm are examined and asymptotics are derived for estimators obtained by using only a finite number of iterations. I will present several examples, including multivariate Gaussian copula models, nonnormal random effects models, generalized linear mixed models, and state space models. Properties of the algorithm and of estimators are discussed in detail via simulation studies on a bivariate copula model and a nonnormal linear random effects model.
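The fixed-point idea described in this abstract can be sketched on a toy problem. The sketch assumes a scalar parameter, a log likelihood split as l = l_w + l_e, and a working score l_w' that can be solved in closed form; each step solves l_w'(theta_new) = -l_e'(theta_old), so no second derivatives of the full likelihood are needed. The function names and the example likelihood are hypothetical and are not taken from the talk.

```python
def maximization_by_parts(solve_working, score_rest, theta0, tol=1e-10, max_iter=100):
    """Iterate theta_new = solve_working(-score_rest(theta_old)) to a fixed point.

    solve_working(c) must return the theta solving working_score(theta) = c.
    """
    theta = theta0
    for _ in range(max_iter):
        new_theta = solve_working(-score_rest(theta))
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    raise RuntimeError("fixed-point iteration did not converge")


# Toy decomposition (hypothetical):
#   working part   l_w(theta) = -(theta - a)**2 / 2,  score a - theta
#   remaining part l_e(theta) = -b * theta**4 / 4,    score -b * theta**3
a, b = 1.0, 0.1
theta_hat = maximization_by_parts(
    solve_working=lambda c: a - c,           # solves a - theta = c
    score_rest=lambda t: -b * t ** 3,
    theta0=a,                                # start from the working-model estimate
)
full_score = (a - theta_hat) - b * theta_hat ** 3
print(theta_hat, full_score)
```

In this toy case the iteration converges because the remaining part is small relative to the working part; when that fails, the iteration need not converge.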
Tuesday, May 16, 2006, 11am
Professor Edward C Malthouse, Department of Integrated Marketing Communications,
Title: Conceptualizing and Measuring Media Engagement and its Effects
Abstract: We propose measuring the latent construct “media engagement” with a third-order confirmatory factor analysis model. The approach is tested using five large reader surveys of 100 and 50 newspapers, 100 magazines, and 39 and 8 media web sites. Over 400 qualitative interviews generated samples of items (questions) from the construct domain for the three media platforms. Consumer surveys measured the items on samples of readers. We used exploratory factor analysis (EFA) to develop scales measuring different dimensions of engagement and confirmatory factor analysis (CFA) to purify the scales further. Additional EFA was used to identify higher-order factors, which were then tested with confirmatory models. We contrast the higher-order factor structure for the three media platforms. Predictive validity is assessed by relating the engagement factors to outcome measures such as usage with random coefficient models and ridge regression. Three quasi-experiments evaluate the effect of engagement on advertising effectiveness.
(This is joint research with Bobby Calder, Marketing Department, Kellogg.)
Tuesday, May 30, 2006, 11am
Place: Meeting room of Department of Statistics
Title: Continuous-Time Models, Realized Volatilities, and Testable Distributional Implications for Daily Stock Returns
Abstract: We provide a framework for analyzing and understanding daily return distributions within the context of traditional continuous-time asset price processes. We develop a sequence of simple-to-implement distributional tests from transformed inter-daily returns. They hinge on the availability of intraday data for construction of nonparametric realized variation measures and jump detection statistics. Each step speaks to key features of the process underlying the discretely observed prices and should help in developing empirically more realistic models. For thirty large stocks, we find that time-varying diffusive volatility, jumps and leverage effects are all critical in order to describe the dynamic dependencies in the observed prices.
Coauthors:
Tim Bollerslev, Dept. of Economics and
Per H. Frederiksen, Jyske Bank,
Morten Ø. Nielsen, Dept. of Economics,
Fall 2005 Schedule
Monday, October 17, 2005, 11am
Professor Denise Scholtens, Department of Preventive Medicine, Northwestern University
Title: Local modeling of global interactome networks
Abstract: Accurate systems biology modeling requires a complete catalog of protein complexes and their constituent proteins. We discuss a graph theoretic/statistical algorithm for local dynamic modeling of protein complexes using data from affinity purification-mass spectrometry experiments. The algorithm readily accommodates multicomplex membership by individual proteins and dynamic complex composition, two biological realities not accounted for in existing topological descriptions of the overall protein network. A penalized likelihood approach guides the protein complex modeling algorithm. With an accurate complex membership catalog in place, systems biology can proceed with greater precision.
Monday, October 31, 2005, 11am
Professor Hua Yun Chen, Department of Epidemiology & Biostatistics,
Title: Approximation to locally semiparametric efficient scores in missing data problems through likelihood robustification
Abstract: In parametric/semiparametric models with missing data, the efficient estimator often cannot be obtained without additional model assumptions even if the efficient estimator has a simple form when no missing data are involved. Robins et al. proposed to find the locally efficient estimator as a compromise and showed that the locally efficient estimator has the doubly robust property when the missing data are missing at random (MAR) in Rubin's sense. In practice, the approach proposed by Robins et al. to finding a locally efficient estimator can be very challenging to implement. We propose an alternative representation of the efficient score through likelihood robustification. The proposed representation is straightforward to obtain, can be applied to missing data with arbitrary missing patterns, and is amenable to computing the locally efficient score. The estimator based on the proposed representation has the doubly robust property when missing data are MAR, and only requires correct specification of the missing data mechanism model for consistency when missing data are nonignorable. Estimation and inferences on the parameters are proposed. Applications of the proposed method are illustrated by examples. The performance of the approach is examined by a simulation study.
Monday, November 14, 2005, 11am
Dr. Guei-Feng (Cindy) Tsai, Department of Statistics, Northwestern University
Title: Semi-nonparametric Models and Inference for High Dimensional Microarray Data
Abstract: We develop a new approach to analyze high dimensional cell-cycle microarray data with no replicates. There are two kinds of correlations for cell-cycle microarray data. Measurements are correlated within a gene, and measurements are also correlated between genes since some genes may be biologically related. The proposed procedure combines a classification method, the quadratic inference function method and nonparametric techniques for complex high dimensional data. We first perform a gene classifying analysis to classify genes into classes with similar cell-cycle patterns, including a class with no cell-cycle phenomena at all. We use genes within the same class as pseudo-replicates to build nonparametric models and inference functions. In order to incorporate correlation of longitudinal measurements, the quadratic inference function method is also applied. This approach allows us to perform chi-squared tests for testing whether the coefficients are time varying or not. This also allows us to determine whether certain genes regulate cell cycles. The approach is illustrated with a real cell-cycle microarray data example as well as simulations.
Friday, December 9, 2005, 11am (Note: Unusual Date)
Dr. Alex Dmitrienko, Eli Lilly
Title: Branching tests in clinical trials with multiple objectives
Abstract: This talk discusses branching multiple tests with clinical trial applications. Branching tests arise in clinical trials with hierarchically ordered multiple objectives, for example, in the context of multiple dose-control tests with logical restrictions or analysis of multiple endpoints. The proposed branching approach is based on the principle of closed testing and generalizes the serial and parallel gatekeeping approaches. The branching testing methodology will be illustrated using a clinical trial with multiple endpoints (primary, secondary and tertiary) and multiple objectives (superiority and non-inferiority testing) as well as a dose-finding trial with multiple endpoints.