Department of Statistics Seminar, Spring 2006

 

All the seminars are held in the meeting room of the Department of Statistics (2006 Sheridan Road, Evanston, IL).

 

Please contact Hongmei Jiang at hongmei@northwestern.edu if you would like to suggest a speaker for this seminar.

 

Spring 2006 Schedule

Tuesday, March 7, 2006, 11am

 

Ms. Cindy Xin Wang, Department of Statistics, Northwestern University

Title: Gatekeeping Procedures Based on Weighted Bonferroni Tests for Multiple Endpoints in Dose Finding Studies

Abstract: In many dose finding studies there are hierarchically ordered endpoints (e.g., primary, secondary, etc.) and a given dose is compared with a control on any endpoint conditional on the tests on the higher-ordered endpoints being significant (serial gatekeeping). It is required to control the familywise error rate at a designated level α, taking into account the multiplicity of tests. We give a closed procedure (Marcus, Peritz and Gabriel 1976) for this problem by applying the general and flexible tree-structured testing approach to gatekeeping problems developed in Dmitrienko, Wiens, Tamhane and Wang (2006). The proposed procedure uses weighted Bonferroni tests for testing intersection hypotheses. For an easier implementation of this closed procedure, we give an equivalent stepwise procedure that uses penalized Bonferroni tests for all endpoints except the last, for which it uses a penalized Holm test. The penalty charged at each step of testing is inversely proportional to a so-called rejection gain factor, which depends on the number of rejections at earlier steps and the weights assigned to those rejected hypotheses. The method is applied to data from a diabetes drug trial with three endpoints. Extensions in which the Bonferroni test is replaced with the Simes, resampling, or Dunnett test are indicated.
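
Illustration (not part of the talk): the two standard building blocks named in the abstract can be sketched in a few lines of Python. This is only a minimal sketch of the ordinary weighted Bonferroni test and the ordinary (unpenalized) Holm procedure; it does not implement the penalized gatekeeping procedure of the talk. The function names and the p-values are hypothetical and chosen only for illustration.

```python
def weighted_bonferroni(pvalues, weights, alpha=0.05):
    """Reject the intersection hypothesis if some p_i <= w_i * alpha.

    The weights are assumed to be nonnegative and to sum to 1.
    """
    return any(p <= w * alpha for p, w in zip(pvalues, weights))


def holm(pvalues, alpha=0.05):
    """Ordinary Holm step-down test; returns one rejection flag per hypothesis."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # smallest p-value first
    reject = [False] * m
    for step, i in enumerate(order):
        # Compare the (step+1)-th smallest p-value with alpha / (m - step).
        if pvalues[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


# Made-up p-values for three hypotheses (illustration only).
pvals = [0.010, 0.040, 0.030]
decisions = holm(pvals)
intersection_rejected = weighted_bonferroni([0.02, 0.20], [0.5, 0.5])
```

With these made-up inputs, Holm rejects only the first hypothesis, because the second-smallest p-value (0.030) exceeds 0.05/2.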

 

Tuesday, April 4, 2006, 11am

 

Ms. Yang Ge, Department of Statistics, Northwestern University

 

Place: Meeting Room of Department of Statistics

Title: On Consistency of Bayesian Inference with Mixtures of Logistic Regression Models

Abstract: This is a theoretical study of the consistency properties of Bayesian inference using mixtures of logistic regression models.  When standard logistic regression models are combined in a ‘mixtures of experts’ set-up, a flexible model is formed to model the relationship between a binary (yes-no) response y and a vector of predictors x.  Bayesian inference conditional on the observed data can then be used for regression and classification.  This study gives conditions on choosing the number of experts (i.e., number of mixing components) k, or choosing a prior distribution for k, so that Bayesian inference is ‘consistent’, in the sense of ‘often approximating’ the underlying true relationship between y and x. The resulting classification rule is also ‘consistent’, in the sense of having near-optimal performance in classification.  We show these desirable consistency properties with a nonstochastic k growing slowly with the sample size n of the observed data, or with a random k that takes large values with nonzero but small probabilities.

 

Monday, April 24, 2006, 3:30pm (Note: unusual time)

 

Professor Mohsen Pourahmadi, Division of Statistics, Northern Illinois Univ.

 

Title: Generalized Linear Models for the Covariance Matrix of Longitudinal Data

 

Abstract: We survey the progress made in modelling covariance matrices from the perspective of generalized linear models (GLM) and show how one can move beyond the use of the identity and logarithmic link functions, and prespecified structures. Observing that most time-domain models (ARMA, state-space, ...) in time series analysis are means to diagonalize a Toeplitz covariance matrix via a unit lower triangular matrix (Cholesky decomposition), we discuss the distinguished role of the Cholesky decomposition in providing a systematic and data-based procedure for formulating and fitting parsimonious models for general covariance matrices guaranteeing the positive-definiteness of the estimates. Pulling together some techniques from regression and time series analyses provides the necessary tools for the procedure, which reduces the unintuitive task of modelling covariance matrices to that of a sequence of regression models. The procedure is illustrated using a real longitudinal dataset. Once a bona fide GLM framework for modelling covariances is found, its Bayesian, nonparametric, generalized additive and other extensions can be developed in direct analogy with the respective extensions of the traditional GLM.
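
Illustration (not part of the talk): the diagonalization mentioned in the abstract can be checked numerically. The following is a minimal Python sketch, assuming NumPy and an AR(1)-type Toeplitz covariance matrix chosen only for illustration; the function name `modified_cholesky` is hypothetical. It finds a unit lower triangular matrix T and a diagonal matrix D such that T Σ T' = D.

```python
import numpy as np


def modified_cholesky(sigma):
    """Return (T, d) with T unit lower triangular and T @ sigma @ T.T = diag(d)."""
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T, with L lower triangular
    d_sqrt = np.diag(L)             # square roots of the diagonal of D
    C = L / d_sqrt                  # divide column j by d_sqrt[j]: unit lower triangular
    T = np.linalg.inv(C)            # inverse of C is again unit lower triangular
    return T, d_sqrt ** 2


# Illustrative AR(1)-type Toeplitz covariance with entries rho**|i - j|.
rho, n = 0.6, 4
idx = np.arange(n)
sigma = rho ** np.abs(np.subtract.outer(idx, idx))

T, d = modified_cholesky(sigma)
```

For this AR(1) example the rows of T encode the regression of each measurement on its predecessors: the subdiagonal entries equal -rho, the remaining entries below the diagonal vanish, and the entries of d after the first equal 1 - rho**2.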

 

Tuesday, May 2, 2006, 11am

 

Professor Hakan Demirtas, Division of Epidemiology and Biostatistics, University of Illinois at Chicago

 

Title: Multiple imputation under Bayesianly smoothed random-coefficient
hierarchical pattern-mixture models for nonignorably missing longitudinal
data

Abstract: Conventional pattern-mixture models can be highly sensitive to
model misspecification. In many longitudinal studies, where the nature of
the drop-out and the form of the population model are unknown, interval
estimates from any single pattern-mixture model may suffer from
undercoverage, because uncertainty about model misspecification is not taken
into account. In this talk, I will introduce a new class of Bayesian random
coefficient pattern-mixture models to address potentially non-ignorable
drop-out. Instead of imposing hard equality constraints to overcome inherent
inestimability problems in pattern-mixture models, I propose to smooth the
polynomial coefficient estimates across patterns using a hierarchical
Bayesian model that allows random variation across groups. Using real and
simulated data, I show that multiple imputation under a three-level linear
mixed-effects model which accommodates a random level due to drop-out groups
can be an effective method to deal with non-ignorable drop-out by allowing
model uncertainty to be incorporated into the imputation process.

Papers that are relevant to this talk:

Demirtas, H. & Schafer, J.L. (2003). On the performance of
random-coefficient pattern-mixture models for non-ignorable drop-out.
Statistics in Medicine, 22, 2553-2575.

Demirtas, H. (2004). Modeling incomplete longitudinal data. Journal of
Modern Applied Statistical Methods, Volume 3, No 2, 305-321.

Demirtas, H. (2005). Multiple imputation under Bayesianly smoothed
pattern-mixture models for non-ignorable drop-out. Statistics in Medicine,
24, 2345-2363.

Demirtas, H. (2005). Bayesian analysis of hierarchical pattern-mixture
models for clinical trials data with attrition and comparisons to commonly
used ad-hoc and model-based approaches. Journal of Biopharmaceutical
Statistics, Volume 15, Issue 3, 383-402.

 

Tuesday, May 9, 2006, 11am

 

Professor Peter Song, Department of Statistics and Actuarial Science, University of Waterloo

 

Title: Maximization by Parts in Likelihood Inference

Abstract: In this talk I will present a new algorithm for solving a score equation for the maximum likelihood estimate in certain problems of practical interest. The method circumvents the need to compute second order derivatives of the full likelihood function. It exploits the structure of certain models that yield a natural decomposition of a very complicated likelihood function. In this decomposition, the first part is a log likelihood from a simply analyzed model and the second part is used to update estimates from the first. Convergence properties of this iterative (fixed point) algorithm are examined and asymptotics are derived for estimators obtained by using only a finite number of iterations. I will illustrate the method with several examples, including multivariate Gaussian copula models, nonnormal random effects models, generalized linear mixed models, and state space models. Properties of the algorithm and of estimators are discussed in detail via simulation studies on a bivariate copula model and a nonnormal linear random effects model.

 

Tuesday, May 16, 2006, 11am

 

Professor Edward C Malthouse, Department of Integrated Marketing Communications, Medill School, Northwestern University

Title: Conceptualizing and Measuring Media Engagement and its Effects          

Abstract: We propose measuring the latent construct “media engagement” with a third-order confirmatory factor analysis model.  The approach is tested using five large reader surveys of 100 and 50 newspapers, 100 magazines, and 39 and 8 media web sites.  Over 400 qualitative interviews generated samples of items (questions) from the construct domain for the three media platforms.  Consumer surveys measured the items on samples of readers.  We used exploratory factor analysis (EFA) to develop scales measuring different dimensions of engagement and confirmatory factor analysis (CFA) to purify the scales further.  Additional EFA was used to identify higher-order factors, which were then tested with confirmatory models.  We contrast the higher-order factor structure for the three media platforms.  Predictive validity is assessed by relating the engagement factors to outcome measures such as usage with random coefficient models and ridge regression.  Three quasi-experiments evaluate the effect of engagement on advertising effectiveness.

(This is joint research with Bobby Calder, Marketing Department, Kellogg.)

 

Tuesday, May 30, 2006, 11am

 

Professor Torben G. Andersen, Kellogg School of Management, Northwestern University and NBER

 

Place: Meeting room of Department of Statistics

 

Title: Continuous-Time Models, Realized Volatilities, and Testable Distributional Implications for Daily Stock Returns

 

Abstract: We provide a framework for analyzing and understanding daily return distributions within the context of traditional continuous-time asset price processes. We develop a sequence of simple-to-implement distributional tests from transformed inter-daily returns. They hinge on the availability of intraday data for construction of nonparametric realized variation measures and jump detection statistics. Each step speaks to key features of the process underlying the discretely observed prices and should help in developing empirically more realistic models. For thirty large stocks, we find that time-varying diffusive volatility, jumps and leverage effects are all critical in order to describe the dynamic dependencies in the observed prices.

 

Coauthors:

Tim Bollerslev, Dept. of Economics and Fuqua School, Duke University and NBER

Per H. Frederiksen, Jyske Bank, Denmark

Morten Ø. Nielsen, Dept. of Economics, Cornell University

 

 

Fall 2005 Schedule

Monday, October 17, 2005, 11am

Professor Denise Scholtens, Department of Preventive Medicine, Northwestern University

 
Title: Local modeling of global interactome networks

 

Abstract: Accurate systems biology modeling requires a complete catalog of protein complexes and their constituent proteins. We discuss a graph theoretic/statistical algorithm for local dynamic modeling of protein complexes using data from affinity purification-mass spectrometry experiments. The algorithm readily accommodates multicomplex membership by individual proteins and dynamic complex composition, two biological realities not accounted for in existing topological descriptions of the overall protein network. A penalized likelihood approach guides the protein complex modeling algorithm. With an accurate complex membership catalog in place, systems biology can proceed with greater precision.

 

Monday, October 31, 2005, 11am

Professor Hua Yun Chen, Department of Epidemiology & Biostatistics, University of Illinois at Chicago

 

Title: Approximation to locally semiparametric efficient scores in missing data problems through likelihood robustification

 

Abstract: In parametric/semiparametric models with missing data, the efficient estimator often cannot be obtained without additional model assumptions even if the efficient estimator has a simple form when no missing data are involved. Robins et al. proposed to find the locally efficient estimator as a compromise and showed that the locally efficient estimator has the doubly robust property when the missing data are missing at random (MAR) in Rubin's sense. In practice, the approach proposed by Robins et al. to finding a locally efficient estimator can be very challenging to implement. We propose an alternative representation of the efficient score through likelihood robustification. The proposed representation is straightforward to obtain, can be applied to missing data with arbitrary missing patterns, and is amenable to computing the locally efficient score. The estimator based on the proposed representation has the doubly robust property when missing data are MAR, and only requires correct specification of the missing data mechanism model for consistency when missing data are nonignorable. Estimation and inferences on the parameters are proposed. Applications of the proposed method are illustrated by examples. The performance of the approach is examined by a simulation study.

 

Monday, November 14, 2005, 11am

Dr. Guei-Feng (Cindy) Tsai, Department of Statistics, Northwestern University

 

Title: Semi-nonparametric Models and Inference for High Dimensional Microarray Data

 

Abstract: We develop a new approach to analyze high dimensional cell-cycle microarray data with no replicates. There are two kinds of correlations for cell-cycle microarray data. Measurements are correlated within a gene, and measurements are also correlated between genes since some genes may be biologically related. The proposed procedure combines a classification method, the quadratic inference function method and nonparametric techniques for complex high dimensional data. We first perform a gene classifying analysis to classify genes into classes with similar cell-cycle patterns, including a class with no cell-cycle phenomena at all. We use genes within the same class as pseudo-replicates to build nonparametric models and inference functions. In order to incorporate correlation of longitudinal measurements, the quadratic inference function method is also applied. This approach allows us to perform chi-squared tests for testing whether the coefficients are time varying or not. This also allows us to determine whether certain genes regulate cell cycles. The approach is illustrated with a real cell-cycle microarray data set and with simulations.

 

Friday, December 9, 2005, 11am (Note: unusual date)

Dr. Alex Dmitrienko, Eli Lilly

Title: Branching tests in clinical trials with multiple objectives    

Abstract: This talk discusses branching multiple tests with clinical trial applications. Branching tests arise in clinical trials with hierarchically ordered multiple objectives, for example, in the context of multiple dose-control tests with logical restrictions or analysis of multiple endpoints. The proposed branching approach is based on the principle of closed testing and generalizes the serial and parallel gatekeeping approaches. The branching testing methodology will be illustrated using a clinical trial with multiple endpoints (primary, secondary and tertiary) and multiple objectives (superiority and non-inferiority testing) as well as a dose-finding trial with multiple endpoints.