
Chapter 1: Analysis of Over- and Underdispersed Data

1.1 Introduction

1.2 Overdispersed Binomial and Count Models

1.3 Other Approaches to Account for Overdispersion

1.4 Underdispersion

1.5 Software Notes


Chapter 2: Analysis of Variance (ANOVA)

2.1 Introduction

2.2 Factors, Levels, Effects, and Cells

2.3 Cell Means Model

2.4 One-Way Classification

2.5 Parameter Estimation

2.6 The R(.) Notation—Partitioning Sum of Squares

2.7 ANOVA—Hypothesis of Equal Means

2.8 Multiple Comparisons

2.9 Two-Way Crossed Classification

2.10 Balanced and Unbalanced Data

2.11 Interaction Between Rows and Columns

2.12 Analysis of Variance Table


Chapter 3: Assessment of Health-Related Quality of Life

3.1 Introduction

3.2 Choice of HRQOL Instruments

3.3 Establishment of Clear Objectives in HRQOL Assessments

3.4 Methods for HRQOL Assessment

3.5 HRQOL as the Primary End Point

3.6 Interpretation of HRQOL Results

3.7 Examples

3.8 Conclusion


Further Reading

Chapter 4: Bandit Processes and Response-Adaptive Clinical Trials: The Art of Exploration Versus Exploitation

4.1 Introduction

4.2 Exploration Versus Exploitation with Complete Observations

4.3 Exploration Versus Exploitation with Censored Observations

4.4 Conclusion


Chapter 5: Bayesian Dose-Finding Designs in Healthy Volunteers

5.1 Introduction

5.2 A Bayesian Decision-Theoretic Design

5.3 An Example of Dose Escalation in Healthy Volunteer Studies

5.4 Discussion


Chapter 6: Bootstrap

6.1 Introduction

6.2 Plug-In Principle

6.3 Monte Carlo Sampling—The “Second Bootstrap Principle”

6.4 Bias and Standard Error

6.5 Examples

6.6 Model Stability

6.7 Accuracy of Bootstrap Distributions

6.8 Bootstrap Confidence Intervals

6.9 Hypothesis Testing

6.10 Planning Clinical Trials

6.11 How Many Bootstrap Samples Are Needed

6.12 Additional References


Chapter 7: Conditional Power in Clinical Trial Monitoring

7.1 Introduction

7.2 Conditional Power

7.3 Weight-Averaged Conditional Power or Bayesian Predictive Power

7.4 Conditional Power of a Different Kind: Discordance Probability

7.5 Analysis of a Randomized Trial

7.6 Conditional Power: Pros and Cons


Chapter 8: Cost-Effectiveness Analysis

8.1 Introduction

8.2 Definitions and Design Issues

8.3 Cost and Effectiveness Data

8.4 The Analysis of Costs and Outcomes

8.5 Robustness and Generalizability in Cost-Effectiveness Analysis


Further Reading

Chapter 9: Cox-Type Proportional Hazards Models

9.1 Introduction

9.2 Cox Model for Univariate Failure Time Data Analysis

9.3 Marginal Models for Multivariate Failure Time Data Analysis

9.4 Practical Issues in Using the Cox Model

9.5 Examples

9.6 Extensions

9.7 Software and Code


Further Reading

Chapter 10: Empirical Likelihood Methods in Clinical Experiments

10.1 Introduction

10.2 Classical EL: Several Ingredients for Theoretical Evaluations

10.3 The Relationship Between Empirical Likelihood and Bootstrap Methodologies

10.4 Bayes Methods Based on Empirical Likelihoods

10.5 Mixtures of Likelihoods

10.6 An Example: ROC Curve Analyses Based on Empirical Likelihoods

10.7 Applications of Empirical Likelihood Methodology in Clinical Trials or Other Data Analyses

10.8 Concluding Remarks



Chapter 11: Frailty Models

11.1 Introduction

11.2 Univariate Frailty Models

11.3 Multivariate Frailty Models

11.4 Software


Chapter 12: Futility Analysis

12.1 Introduction

12.2 Common Statistical Approaches to Futility Monitoring

12.3 Examples

12.4 Discussion


Further Reading

Chapter 13: Imaging Science in Medicine I: Overview

13.1 Introduction

13.2 Advances in Medical Imaging

13.3 Evolutionary Developments in Imaging

13.4 Conclusion


Chapter 14: Imaging Science in Medicine, II: Basics of X-Ray Imaging

14.1 Introduction to Medical Imaging: Different Ways of Creating Visible Contrast Among Tissues

14.2 What the Body Does to the X-Ray Beam: Subject Contrast From Differential Attenuation of the X-Ray Beam by Various Tissues

14.3 What the X-Ray Beam Does to the Body: Known Medical Benefits Versus Possible Radiogenic Risks

14.4 Capturing the Visual Image: Analog (20th Century) X-Ray Image Receptors

Chapter 15: Imaging Science in Medicine, III: Digital (21st Century) X-Ray Imaging

15.1 The Computer in Medical Imaging

15.2 The Digital Planar X-Ray Modalities: Computed Radiography and Digital Radiography and Fluoroscopy

15.3 Digital Fluoroscopy and Digital Subtraction Angiography

15.4 Digital Tomosynthesis: Planar Imaging in Three Dimensions

15.5 Computed Tomography: Superior Contrast in Three-Dimensional X-Ray Attenuation Maps

Chapter 16: Intention-to-Treat Analysis

16.1 Introduction

16.2 Missing Information

16.3 The Intention-to-Treat Design

16.4 Efficiency of the Intent-to-Treat Analysis

16.5 Compliance-Adjusted Analyses

16.6 Conclusion


Further Reading

Chapter 17: Interim Analyses

17.1 Introduction

17.2 Opportunities and Dangers of Interim Analyses

17.3 The Development of Techniques for Conducting Interim Analyses

17.4 Methodology for Interim Analyses

17.5 An Example: Statistics for Lamivudine

17.6 Interim Analyses in Practice

17.7 Conclusions


Chapter 18: Interrater Reliability

18.1 Definition

18.2 The Importance of Reliability in Clinical Trials

18.3 How Large a Reliability Coefficient Is Large Enough?

18.4 Design and Analysis of Reliability Studies

18.5 Estimation of the Reliability Coefficient—Parametric

18.6 Estimation of the Reliability Coefficient—Nonparametric

18.7 Estimation of the Reliability Coefficient—Binary

18.8 Estimation of the Reliability Coefficient—Categorical

18.9 Strategies to Increase Reliability (Spearman–Brown Projection)

18.10 Other Types of Reliabilities


Chapter 19: Intrarater Reliability

19.1 Introduction

19.2 Intrarater Reliability for Continuous Scores

19.3 Nominal Scale Score Data

19.4 Ordinal and Interval Score Data

19.5 Concluding Remarks


Further Reading

Chapter 20: Kaplan–Meier Plot

20.1 Introduction

20.2 Estimation of Survival Function

20.3 Additional Topics


Chapter 21: Logistic Regression

21.1 Introduction

21.2 Fitting the Logistic Regression Model

21.3 The Multiple Logistic Regression Model

21.4 Fitting the Multiple Logistic Regression Model

21.5 Example

21.6 Testing for the Significance of the Model

21.7 Interpretation of the Coefficients of the Logistic Regression Model

21.8 Dichotomous Independent Variable

21.9 Polytomous Independent Variable

21.10 Continuous Independent Variable

21.11 Multivariate Case


Chapter 22: Metadata

22.1 Introduction

22.2 History/Background

22.3 Data Set Metadata

22.4 Analysis Results Metadata

22.5 Regulatory Submission Metadata


Chapter 23: Microarray

23.1 Introduction

23.2 What Is a Microarray?

23.3 Other Array Technologies

23.4 Define Objectives of the Study

23.5 Experimental Design for Microarray

23.6 Data Extraction

23.7 Microarray Informatics

23.8 Statistical Analysis

23.9 Annotation

23.10 Pathway, GO, and Class-Level Analysis Tools

23.11 Validation of Microarray Experiments

23.12 Conclusions


Chapter 24: Multi-Armed Bandits, Gittins Index, and Its Calculation

24.1 Introduction

24.2 Mathematical Formulation of Multi-Armed Bandits

24.3 Off-Line Algorithms for Computing Gittins Index

24.4 On-Line Algorithms for Computing Gittins Index

24.5 Computing Gittins Index for the Bernoulli Sampling Process

24.6 Conclusion


Chapter 25: Multiple Comparisons

25.1 Introduction

25.2 Strong and Weak Control of the FWE

25.3 Criteria for Deciding Whether Adjustment Is Necessary

25.4 Implicit Multiplicity: Two-Tailed Testing

25.5 Specific Multiple Comparison Procedures


Chapter 26: Multiple Evaluators

26.1 Introduction

26.2 Agreement for Continuous Data

26.3 Agreement for Categorical Data

26.4 Summary and Discussion


Chapter 27: Noncompartmental Analysis

27.1 Introduction

27.2 Terminology

27.3 Objectives and Features of Noncompartmental Analysis

27.4 Comparison of Noncompartmental and Compartmental Models

27.5 Assumptions of NCA and Its Reported Descriptive Statistics

27.6 Calculation Formulas for NCA

27.7 Guidelines for Performance of NCA Based on Numerical Integration

27.8 Conclusions and Perspectives


Further Reading

Chapter 28: Nonparametric ROC Analysis for Diagnostic Trials

28.1 Introduction

28.2 Different Aspects of Study Design

28.3 Nonparametric Models and Hypotheses

28.4 Point Estimator

28.5 Asymptotic Distribution and Variance Estimator

28.6 Derivation of the Confidence Interval

28.7 Statistical Tests

28.8 Adaptations for Cluster Data

28.9 Results of a Diagnostic Study

28.10 Summary and Final Remarks


Chapter 29: Optimal Biological Dose for Molecularly Targeted Therapies

29.1 Introduction

29.2 Phase I Dose-Finding Designs for Cytotoxic Agents

29.3 Phase I Dose-Finding Designs for Molecularly Targeted Agents

29.4 Discussion


Further Reading

Chapter 30: Over- and Underdispersion Models

30.1 Introduction

30.2 Count Dispersion Models

30.3 Count Explanatory Models

30.4 Summary and Final Remarks


Chapter 31: Permutation Tests in Clinical Trials

31.1 Randomization Inference—Introduction

31.2 Permutation Tests—How They Work

31.3 Normal Approximation to Permutation Tests

31.4 Analyze as You Randomize

31.5 Interpretation of Permutation Analysis Results

31.6 Summary


Chapter 32: Pharmacoepidemiology, Overview

32.1 Introduction

32.2 The Case-Crossover Design

32.3 Confounding Bias

32.4 Risk Functions Over Time

32.5 Probabilistic Approach for Causality Assessment

32.6 Methods Based on Prescription Data


Chapter 33: Population Pharmacokinetic and Pharmacodynamic Methods

33.1 Introduction

33.2 Terminology

33.3 Fixed Effects Models

33.4 Random Effects Models

33.5 Model Building and Parameter Estimation

33.6 Software

33.7 Model Evaluation

33.8 Stochastic Simulation

33.9 Experimental Design

33.10 Applications


Further Reading

Chapter 34: Proportions: Inferences and Comparisons

34.1 Introduction

34.2 One-Sample Case

34.3 Two Independent Samples

34.4 Note on Software


Chapter 35: Publication Bias

35.1 Publication Bias and the Validity of Research Reviews

35.2 Research on Publication Bias

35.3 Data Suppression Mechanisms Related to Publication Bias

35.4 Prevention of Publication Bias

35.5 Assessment of Publication Bias

35.6 Impact of Publication Bias


Further Reading

Chapter 36: Quality of Life

36.1 Background

36.2 Measuring Health-Related Quality of Life

36.3 Development and Validation of HRQoL Measures

36.4 Use in Research Studies

36.5 Interpretation/Clinical Significance

36.6 Conclusions


Chapter 37: Relative Risk Modeling

37.1 Introduction

37.2 Why Model Relative Risks?

37.3 Data Structures and Likelihoods

37.4 Approaches to Model Specification

37.5 Mechanistic Models


Chapter 38: Sample Size Considerations for Morbidity/Mortality Trials

38.1 Introduction

38.2 General Framework for Sample Size Calculation

38.3 Choice of Test Statistics

38.4 Adjustment of Treatment Effect

38.5 Informative Noncompliance


Chapter 39: Sample Size for Comparing Means

39.1 Introduction

39.2 One-Sample Design

39.3 Two-Sample Parallel Design

39.4 Two-Sample Crossover Design

39.5 Multiple-Sample One-Way ANOVA

39.6 Multiple-Sample Williams Design

39.7 Discussion


Chapter 40: Sample Size for Comparing Proportions

40.1 Introduction

40.2 One-Sample Design

40.3 Two-Sample Parallel Design

40.4 Two-Sample Crossover Design

40.5 Relative Risk—Parallel Design

40.6 Relative Risk—Crossover Design

40.7 Discussion


Chapter 41: Sample Size for Comparing Time-to-Event Data

41.1 Introduction

41.2 Exponential Model

41.3 Cox’s Proportional Hazards Model

41.4 Log-Rank Test

41.5 Discussion


Chapter 42: Sample Size for Comparing Variabilities

42.1 Introduction

42.2 Comparing Intrasubject Variabilities

42.3 Comparing Intersubject Variabilities

42.4 Comparing Total Variabilities

42.5 Discussion


Chapter 43: Screening, Models of

43.1 Introduction

43.2 What Is Screening?

43.3 Why Use Modeling?

43.4 Characteristics of Screening Models

43.5 A Simple Disease and Screening Model

43.6 Analytic Models for Cancer

43.7 Simulation Models for Cancer

43.8 Model Fitting and Validation

43.9 Models for Other Diseases

43.10 Current State and Future Directions


Chapter 44: Screening Trials

44.1 Introduction

44.2 Design Issues

44.3 Sample Size

44.4 Analysis

44.5 Trial Monitoring


Chapter 45: Secondary Efficacy End Points

45.1 Introduction

45.2 Literature Review

45.3 Review of Methodology for Multiplicity Adjustment and Gatekeeping Strategies for Secondary End Points

45.4 Summary


Further Reading

Chapter 46: Sensitivity, Specificity, and Receiver Operator Characteristic (ROC) Methods

46.1 Evaluating a Single Binary Test Against a Binary Criterion

46.2 Evaluation of a Single Binary Test: ROC Methods

46.3 Evaluation of a Test Response Measured on an Ordinal Scale: ROC Methods

46.4 Evaluation of Multiple Different Tests

46.5 The Optimal Sequence of Tests

46.6 Sampling and Measurement Issues

46.7 Summary


Chapter 47: Software for Genetics/Genomics

47.1 Introduction

47.2 Data Management

47.3 Genetic Analysis

47.4 Genomic Analysis

47.5 Other


Further Reading

Chapter 48: Stability Study Designs

48.1 Introduction

48.2 Stability Study Designs

48.3 Criteria for Design Comparison

48.4 Stability Protocol

48.5 Basic Design Considerations

48.6 Conclusions


Chapter 49: Subgroup Analysis

49.1 Introduction

49.2 The Dilemma of Subgroup Analysis

49.3 Planned Versus Unplanned Subgroup Analysis

49.4 Frequentist Methods

49.5 Testing Treatment by Subgroup Interactions

49.6 Subgroup Analyses in Positive Clinical Trials

49.7 Confidence Intervals for Treatment Effects within Subgroups

49.8 Bayesian Methods


Chapter 50: Survival Analysis, Overview

50.1 Introduction

50.2 History

50.3 Survival Analysis Concepts

50.4 Nonparametric Estimation and Testing

50.5 Parametric Inference

50.6 Comparison with Expected Survival

50.7 The Cox Regression Model

50.8 Other Regression Models for Survival Data

50.9 Multistate Models

50.10 Other Kinds of Incomplete Observation

50.11 Multivariate Survival Analysis

50.12 Concluding Remarks


Chapter 51: The FDA and Regulatory Issues

51.1 Caveat

51.2 Introduction

51.3 Chronology of Drug Regulation in the United States

51.4 FDA Basic Structure

51.5 IND Application Process

51.6 Drug Development and Approval Time Frame

51.7 NDA Process

51.8 U.S. Pharmacopeia and FDA

51.9 CDER Freedom of Information Electronic Reading Room

51.10 Conclusion

Chapter 52: The Kappa Index

52.1 Introduction

52.2 The Kappa Index

52.3 Inference for Kappa via Generalized Estimating Equations

52.4 The Dependence of Kappa on Marginal Rates

52.5 General Remarks


Chapter 53: Treatment Interruption

53.1 Introduction

53.2 Therapeutic TI Studies in HIV/AIDS

53.3 Management of Chronic Disease

53.4 Analytic Treatment Interruption in Therapeutic Vaccine Trials

53.5 Randomized Discontinuation Designs

53.6 Final Comments


Chapter 54: Trial Reports: Improving Reporting, Minimizing Bias, and Producing Better Evidence-Based Practice

54.1 Introduction

54.2 Reporting Issues in Clinical Trials

54.3 Moral Obligation to Improve the Reporting of Trials

54.4 Consequences of Poor Reporting of Trials

54.5 Distinguishing Between Methodological and Reporting Issues

54.6 One Solution to Poor Reporting: CONSORT 2010 and CONSORT Extensions

54.7 Impact of CONSORT

54.8 Guidance for Reporting Randomized Trial Protocols: SPIRIT

54.9 Trial Registration

54.10 Final Thoughts


Chapter 55: U.S. Department of Veterans Affairs Cooperative Studies Program

55.1 Introduction

55.2 History of the Cooperative Studies Program (CSP)

55.3 Organization and Functioning of the CSP

55.4 Roles of the Biostatistician and Pharmacist in the CSP

55.5 Ongoing and Completed Cooperative Studies (1972–2000)

55.6 Current Challenges and Opportunities

55.7 Concluding Remarks


Chapter 56: Women’s Health Initiative: Statistical Aspects and Selected Early Results

56.1 Introduction

56.2 WHI Clinical Trial and Observational Study

56.3 Study Organization

56.4 Principal Clinical Trial Comparisons, Power Calculations, and Safety and Data Monitoring

56.5 Biomarkers and Intermediate Outcomes

56.6 Data Management and Computing Infrastructure

56.7 Quality Assurance Program Overview

56.8 Early Results from the WHI Clinical Trial

56.9 Summary and Discussion


Chapter 57: World Health Organization (WHO): Global Health Situation

57.1 Introduction

57.2 Program Activities to the End of the Twentieth Century

57.3 Vision for the Use and Generation of Data in the First Quarter of the Twenty-First Century


Further Reading


Methods and Applications of Statistics in Clinical Trials


Advisory Editor

N. Balakrishnan

McMaster University, Canada

The Wiley Series in Methods and Applications of Statistics is a unique grouping of research that features classic contributions from Wiley’s Encyclopedia of Statistical Sciences, Second Edition (ESS, 2e) alongside newly written articles that explore various problems of interest and their intrinsic connection to statistics. The goal of this collection is to encompass an encyclopedic scope of coverage within individual books that unify the most important and interesting applications of statistics within a specific field of study. Each book in the series successfully upholds the goals of ESS, 2e by combining established literature and newly developed contributions written by leading academics, researchers, and practitioners in a comprehensive and accessible format. The result is a succinct reference that unveils modern, cutting-edge approaches to acquiring, analyzing, and presenting data across diverse subject areas.


Balakrishnan • Methods and Applications of Statistics in the Life and Health Sciences

Balakrishnan • Methods and Applications of Statistics in Business, Finance, and Management Science

Balakrishnan • Methods and Applications of Statistics in Engineering, Quality Control, and the Physical Sciences

Balakrishnan • Methods and Applications of Statistics in the Social and Behavioral Sciences

Balakrishnan • Methods and Applications of Statistics in the Atmospheric and Earth Sciences

Balakrishnan • Methods and Applications of Statistics in Clinical Trials, Volume 1: Concepts, Principles, Trials, and Designs

Balakrishnan • Methods and Applications of Statistics in Clinical Trials, Volume 2: Planning, Analysis, and Inferential Methods

Contributors


Per Kragh Andersen, University of Copenhagen, Copenhagen, Denmark,

Garnet L. Anderson, Fred Hutchinson Cancer Research Center, Seattle, WA,

Chul Ahn, University of Texas Southwestern Medical Center, Dallas, TX,

Edgar Brunner, Professor Emeritus of Biostatistics, University Medical Center, Göttingen, Germany,

Jürgen B. Bulitta, State University of New York at Buffalo, Buffalo, NY,

Jianwen Cai, University of North Carolina, Chapel Hill, NC,

Patrizio Capasso, University of Kentucky, Lexington, KY,

Robert C. Capen, Merck Research Laboratories, West Point, PA

Jhelum Chakravorty, McGill University, Montreal, QC, Canada,

Chi Wan Chen, Pfizer Inc., New York, NY

David H. Christiansen, Christiansen Consulting, Boise, ID

Shein-Chung Chow, Duke University, Durham, NC,

Joseph F. Collins

Jason T. Connor, Berry Consultants, Orlando, FL,

Richard J. Cook, University of Waterloo, Waterloo, ON, Canada,

Xiangqin Cui, University of Alabama at Birmingham, Birmingham, AL.

C. B. Dean, Western University, Western Science Centre, London, ON, Canada,

Yu Deng, University of North Carolina, Chapel Hill, NC,

Diane L. Fairclough, University of Colorado Health Sciences Center, Denver, CO,

John R. Feussner, Medical University of South Carolina, Charleston, SC

Boris Freidlin, National Cancer Institute, Bethesda, MD,

Patricia A. Ganz, UCLA Jonsson Comprehensive Cancer Center, Los Angeles, CA,

Courtney Gray-McGuire, Case Western Reserve University, Cleveland, OH,

Birgit Grund, University of Minnesota, Minneapolis, MN,

Kilem L. Gwet, Advanced Analytics, LLC, Gaithersburg, MD,

H. R. Hapsara, World Health Organization, Geneva, Switzerland,

William R. Hendee, Medical College of Wisconsin, Milwaukee, WI,

William G. Henderson

Tim Hesterberg, Insightful Corporation, Seattle, WA

Nicholas H. G. Holford, University of Auckland, Auckland, New Zealand,

Norbert Holländer, University Hospital of Freiburg, Freiburg, Germany,

David W. Hosmer, University of Massachusetts, Amherst, MA,

Alan D. Hutson, University at Buffalo, Buffalo, NY,

Peter B. Imrey, Cleveland Clinic, Cleveland, OH,

Elizabeth Juarez-Colunga, University of Colorado Denver, Aurora, CO,

Seung-Ho Kang, Ewha Woman’s University, Seoul, South Korea

Jörg Kaufmann, Schering AG, SBU Diagnostics & Radiopharmaceuticals, Berlin, Germany

Niels Keiding, University of Copenhagen, Copenhagen, Denmark,

Célestin C. Kokonendji, University of Franche-Comté, Besançon, France,

Helena Chmura Kraemer, Stanford University, Palo Alto, CA,

John M. Lachin, George Washington University, Washington, DC,

Philip W. Lavori, Stanford University School of Medicine, Stanford, CA,

Morven Leese, Institute of Psychiatry—Health Services and Population Research Department, London, UK

Stanley Lemeshow, Ohio State University, Columbus, OH,

Jason J. Z. Liao, Merck Research Laboratories, West Point, PA,

Tsae-Yun Daphne Lin, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Rockville, MD,

Qing Lu, Michigan State University, East Lansing, MI,

Aditya Mahajan, McGill University, Montreal, QC, Canada,

Michael A. McIsaac, University of Waterloo, Waterloo, ON, Canada,

David Moher, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada and Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, ON, Canada,

Grier P. Page, RTI International, Research Triangle Park, NC,

Peter Peduzzi, Yale School of Public Health, New Haven, CT,

Ross L. Prentice, Fred Hutchinson Cancer Research Center, Seattle, WA,

Philip C. Prorok, National Institutes of Health, Bethesda, MD,

Michael A. Proschan, National Institute of Allergy and Infectious Diseases, Bethesda, MD,

Frank Rockhold, GlaxoSmithKline R&D, King of Prussia, PA,

Hannah R. Rothstein, City University of New York, NY,

W. Janusz Rzeszotarski, U.S. Food and Drug Administration, Rockville, MD

Mike R. Sather, Department of Veterans Affairs, Albuquerque, NM,

Tony Segreti, Research Triangle Institute, Research Triangle Park, NC

Larissa Shamseer, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada and Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, ON, Canada,

Joanna H. Shih, National Cancer Institute, Bethesda, MD

Richard M. Simon, National Cancer Institute, Bethesda, MD,

Yeunjoo Song, Case Western Reserve University, Cleveland, OH

Chris Stevenson, Monash University, Victoria, Australia,

Samy Suissa, McGill University, Montreal, QC, Canada,

Ming T. Tan, Georgetown University, Washington, DC,

Duncan C. Thomas, University of Southern California, Los Angeles, CA,

Susan Todd, University of Reading, Reading, Berkshire, UK,

Lucy Turner, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada,

Albert Vexler, University at Buffalo, Buffalo, NY,

Hansheng Wang, Peking University, Beijing, P. R. China,

Xikui Wang, University of Manitoba, Winnipeg, MB, Canada,

C. S. Wayne Weng, Chung Yuan Christian University, Chungli, Taiwan

Andreas Wienke, University Halle-Wittenberg, Halle, Germany,

Anthony B. Wolbarst, University of Kentucky, Lexington, KY,

Andrew R. Wyant, University of Kentucky, Lexington, KY,

Yang Xie, University of Texas Southwestern Medical Center, Dallas, TX

Jihnhee Yu, University at Buffalo, Buffalo, NY,

Antonia Zapf, University Medical Center, Göttingen, Germany,

Donglin Zeng, University of North Carolina, Chapel Hill, NC,

Yinghui Zhou, The University of Reading, Reading, Berkshire, UK

David M. Zucker, Hebrew University of Jerusalem, Jerusalem, Israel


Planning, developing, and implementing clinical trials have become an important and integral part of modern life. Ever more effort and care go into conducting clinical trials, as they have been responsible for key advances in medicine and in treatments for many illnesses. Today, clinical trials are mandatory in the development and evaluation of modern drugs and in identifying the association of risk factors with diseases. Because of the complexity of the issues surrounding clinical trials, regulatory agencies oversee their approval and also ensure impartial review. The main purpose of this two-volume handbook is to provide a detailed exposition of historical developments and also to highlight modern advances in methods and analysis for clinical trials.

It is important to mention that the four-volume Wiley Encyclopedia of Clinical Trials served as a basis for this handbook. While many pertinent entries from this Encyclopedia have been included here, a number of them have been updated to reflect recent developments on their topics. Some new articles detailing modern advances in statistical methods in clinical trials and their applications have also been included.

A volume of this size and nature cannot be successfully completed without the cooperation and support of the contributing authors, and my sincere thanks and gratitude go to all of them. Thanks are also due to Mr. Steve Quigley and Ms. Sari Friedman (of John Wiley & Sons, Inc.) for their keen interest in this project from day one, as well as for their support and constant encouragement (and, of course, occasional nudges, too) throughout the course of this project. The careful and diligent work of Mrs. Debbie Iscoe in the typesetting of this volume, and of Angioline Loredo at the production stage, is gratefully acknowledged. Partial financial support from the Natural Sciences and Engineering Research Council of Canada also assisted in the preparation of this handbook, and this support is much appreciated.

This is the seventh in a series of handbooks on methods and applications of statistics. The first handbook focused on life and health sciences; the second on business, finance, and management science; the third on engineering, quality control, and the physical sciences; the fourth on behavioral and social sciences; the fifth on atmospheric and earth sciences; and the sixth on methods and applications of statistics in clinical trials. This is the second of two volumes describing in detail statistical developments concerning clinical trials, focusing specifically on planning, analysis, and inferential methods.

It is my sincere hope that this handbook and the others in the series will become basic reference resources for those involved in these fields of research!


Hamilton, Canada
February 2014

Chapter 1

Analysis of Over- and Underdispersed Data

Elizabeth Juarez-Colunga and C. B. Dean

1.1 Introduction

In the analysis of discrete data (for example, count data analyzed under a Poisson model, or binary data analyzed under a binomial model), the empirical variance quite often exceeds the theoretical variance under the presumed model. This phenomenon is called overdispersion. If overdispersion is ignored, standard errors of parameter estimates will be underestimated, and therefore p-values for tests of hypotheses will be too small, leading to incorrectly declaring a predictor significant when in fact it may not be.

The Poisson and binomial distributions are simple models but have strict assumptions. In particular, they assume a special mean-variance relationship since each of these distributions is determined by a single parameter. On the other hand, the normal distribution is determined by two parameters, the mean μ and variance σ2, which characterize the location and the spread of the data around the mean. In both the Poisson and binomial distributions, the variance is fixed once the mean or the probability of success has been defined.
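This mean–variance constraint can be made concrete with a small simulation (a Python sketch with illustrative parameters, not taken from the chapter): Poisson counts have variance equal to the mean, while mixing two Poisson subpopulations, i.e., introducing unobserved heterogeneity, pushes the variance above the mean.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 20000

# Poisson: one parameter fixes both moments, so variance = mean.
poisson_counts = rng.poisson(4.0, size=n)

# Mixing two Poisson subpopulations (rates 2 and 6, overall mean still 4)
# inflates the variance above the mean: var = 4 (within) + 4 (between) = 8.
rates = rng.choice([2.0, 6.0], size=n)
mixed_counts = rng.poisson(rates)

print(poisson_counts.var() / poisson_counts.mean())  # near 1
print(mixed_counts.var() / mixed_counts.mean())      # near 2
```

The variance-to-mean ratio (often called the dispersion index) equals 1 for Poisson data; values well above 1 are the signature of overdispersion discussed throughout this chapter.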

Hilbe [25] provides a very comprehensive discussion of what he calls apparent overdispersion, which refers to scenarios in which the data exhibit variation beyond what can be explained by the model and this lack of fit is due to several “fixable” causes. These include omitting important predictors from the model, the presence of outliers, omitting important interactions among predictors, the need to transform a predictor, and misspecifying the link function relating the mean response to the predictors. Hilbe [25] also discusses how to recognize overdispersion and how to adjust for it when it is present beyond apparent cases, and provides an excellent overall review of the topic.

It is important to note that if apparent overdispersion has been ruled out, then in log-linear or logistic analyses the point estimates of the covariate effects will be quite similar whether or not overdispersion is accounted for. Hence, estimated treatment and other effects will not look aberrant or hint at the presence of overdispersion. This also suggests that adjusting for overdispersion can be handled through adjustments of variance estimates [35].

Evidence of apparent or real overdispersion exists when the Pearson or deviance residuals are too large [6], that is, when the corresponding Pearson and deviance goodness-of-fit statistics indicate a poor fit. Several tests for overdispersion have been developed in the context of Poisson or binomial analyses [11, 12, 54], as well as in the context of zero-heavy data [30, 51–53].
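A quick informal check, distinct from the formal tests cited above, is to compare the Pearson goodness-of-fit statistic to its degrees of freedom. The sketch below (Python, intercept-only Poisson model fitted to simulated data; all parameter values are illustrative) computes the Pearson statistic X² = Σ(yi − μ̂)²/μ̂ and its ratio to the degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Simulated counts that are more variable than Poisson
# (negative binomial: mean 3, variance 7.5).
y = rng.negative_binomial(n=2, p=0.4, size=500)

# Fit the simplest Poisson model (intercept only): mu_hat is the sample mean.
mu_hat = y.mean()

# Pearson goodness-of-fit statistic: sum of squared Pearson residuals.
pearson_x2 = np.sum((y - mu_hat) ** 2 / mu_hat)

# Under a correctly specified Poisson model, X^2 / df should be near 1;
# values well above 1 signal (apparent or real) overdispersion.
df = len(y) - 1
dispersion = pearson_x2 / df
print(dispersion)
```

In quasi-likelihood adjustments, this dispersion estimate is the factor by which variance estimates are scaled, consistent with handling overdispersion through adjustments of variance estimates [35].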

1.2 Overdispersed Binomial and Count Models

1.2.1 Overdispersed Binomial Model

In the binomial context, overdispersion typically arises because the independence assumption is violated. This is commonly caused by clustering of responses; for instance, clinics or hospitals may induce a clustering effect due to differences in patient care strategies across institutions.

Let Yi denote a binomial response for cluster i, i = 1, …, M, formed as the sum of mi binary outcomes Yij, that is, Yi = Yi1 + ··· + Yimi, where j indexes individuals, j = 1, …, mi. If the Yij are independent binary variables taking values 0 or 1 with probabilities (1 − pi) and pi, respectively, then E(Yi) = mipi and var(Yi) = mipi(1 − pi). If there is correlation between any two responses in a given cluster, with corr(Yij, Yik) = ψ > 0, then

(1) var(Yi) = mipi(1 − pi)[1 + (mi − 1)ψ],

leading to overdispersion. Note that ψ < 0 leads to underdispersion. If we consider the pi as random variables with E(pi) = π and var(pi) = ψπ(1 − π), then the unconditional mean and variance also have the form of (1), with pi replaced by π. If we further assume that the pi follow a Beta(α, β) distribution, the distribution of Yi is the so-called beta-binomial distribution, which has been studied extensively (see, for example, Hinde and Demétrio [26] and Molenberghs et al. [37]).
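The beta-binomial construction can be checked by simulation. In the sketch below (Python; the values of α, β, and m are illustrative choices, not from the chapter), cluster probabilities are drawn from a Beta(α, β) distribution, counts from the conditional binomial, and the empirical variance is compared with form (1) evaluated at π = α/(α + β) and ψ = 1/(α + β + 1).

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Beta-binomial setup: cluster-level success probabilities
# p_i ~ Beta(alpha, beta), then Y_i | p_i ~ Binomial(m, p_i).
alpha, beta, m, n_clusters = 2.0, 6.0, 30, 50000
p = rng.beta(alpha, beta, size=n_clusters)
y = rng.binomial(m, p)

pi_ = alpha / (alpha + beta)          # E(p_i)
psi = 1.0 / (alpha + beta + 1.0)      # intra-cluster correlation
binomial_var = m * pi_ * (1 - pi_)    # variance if clustering is ignored
overdispersed_var = binomial_var * (1 + (m - 1) * psi)

# The empirical variance tracks the inflated form, not the binomial one.
print(y.var(), binomial_var, overdispersed_var)
```

Note how strongly the inflation grows with cluster size m: even a small intra-cluster correlation ψ produces a large multiplier (mi − 1)ψ when clusters are big.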

1.2.2 Overdispersed Poisson Model

Poisson and overdispersed Poisson data are examples of data from counting processes that arise when individuals experience repeated occurrences of events over time. Such data are known as recurrent event data (see, for example, Cook and Lawless [10] and Juarez-Colunga [29]). Consider M individuals, each monitored for occurrence of events from a start time 0 through time τi, called the termination time, i = 1, …, M. Let {Ni(t), t ≥ 0} be the right-continuous counting process that records the number of events for individual i over the interval [0, t]. The termination time is here assumed to be independent of the counting process {Ni(t), t ≥ 0}. Let the intensity of the counting process be λi(t|Hi(t)) = limΔt→0 P{Ni(t + Δt) − Ni(t) = 1 | Hi(t)}/Δt, where Hi(t) = {Ni(s) : 0 ≤ s < t} represents the history of the process up to time t. This intensity represents the instantaneous probability of occurrence of an event at time t. If the counting process is Poisson, then by the memoryless property of the Poisson process the intensity depends on the history only through t, λi(t|Hi(t)) = λi(t), and the expected number of events over the entire follow-up can be written as μi+ = ∫0τi λi(t)dt. Let the total number of events in the entire follow-up be ni+ for individual i; then ni+ follows a Poisson distribution with mean μi+ = E(ni+) = var(ni+).
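A homogeneous Poisson process with constant intensity λ can be simulated by accumulating independent exponential gap times, and the resulting counts ni+ over [0, τ] then have mean and variance both equal to λτ. A minimal sketch (Python; the values of λ and τ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=4)

def simulate_poisson_process(rate, tau, rng):
    """Event times of a homogeneous Poisson process on [0, tau],
    generated via i.i.d. exponential gap times (memoryless property)."""
    times, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / rate)
        if t > tau:
            return np.array(times)
        times.append(t)

rate, tau = 2.0, 5.0
counts = np.array([len(simulate_poisson_process(rate, tau, rng))
                   for _ in range(5000)])

# n_i+ ~ Poisson(rate * tau): mean and variance both near 10.
print(counts.mean(), counts.var())
```

The exponential gap-time construction is exactly the memoryless property invoked above; any excess of the empirical variance of counts over their mean would indicate a departure from the Poisson assumption.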

Two types of data are common in counting processes, and we will consider both here in the context of overdispersion: (1) individual i gives rise to ni+ event times recorded as ti1 < ti2 < ··· < tini+ < τi; and (2) only counts within specified follow-up times 0 = Ti,0 < Ti,1 < ··· < Ti,ei = τi are available; these are called panel counts and are denoted nip = Ni(Ti,p) − Ni(Ti,p−1), p = 1, 2, …, ei, with the total aggregated count for individual i denoted ni+ = ni1 + ··· + niei.

A simple way to incorporate overdispersion is through the use of an individual-specific random effect vi. Given vi, and the covariate vector xi corresponding to the ith individual, the counting process Ni(t) may be modeled as a Poisson process with intensity function

(2) λi(t|vi; xi) = vi ρ(t; α) exp(xi′β),

where ρ is a twice-differentiable baseline intensity function depending on the parameter α, and β contains the regression effects. We may take E(vi) = 1 without loss of generality, and let var(vi) = ϕ. The function λ(t; x) = ρ(t; α) exp(x′β) is then interpreted as a population-average rate function among subjects with covariate vector x, since E(dN(t)|x) = λ(t; x)dt. In addition to representing unaccounted-for covariates, vi may also be a cluster effect, taking the same value for all individuals within the same cluster. This can be used to account for unknown clinic effects, for example, where individuals are patients clustered within clinics. When vi follows a gamma distribution, the marginal distribution of ni+ is negative binomial, and the variance of the total aggregated count ni+ has the form var(ni+) = μi+ + ϕμi+².
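The overdispersed variance induced by the frailty can also be checked by simulation. This sketch (numpy only; μi+ and ϕ are illustrative) draws vi from a gamma distribution with E(vi) = 1 and var(vi) = ϕ and verifies that the resulting counts have variance close to μi+ + ϕμi+².

```python
import numpy as np

rng = np.random.default_rng(3)

mu_plus, phi, M = 3.0, 0.8, 500_000    # mean total count, frailty variance, individuals

# v_i ~ Gamma with E(v)=1 and var(v)=phi  ->  shape 1/phi, scale phi
v = rng.gamma(shape=1.0 / phi, scale=phi, size=M)
n = rng.poisson(v * mu_plus)           # n_i+ | v_i ~ Poisson(v_i * mu_i+)

print(f"empirical var : {n.var():.3f}")
print(f"mu + phi*mu^2 : {mu_plus + phi * mu_plus**2:.3f}")   # negative binomial variance
```

The empirical variance is close to 3 + 0.8 × 9 = 10.2, far above the Poisson value of 3.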

Let the expected number of events over the entire follow-up [0, τi] be μi+ = Ri exp(xi′β), where Ri = ∫[0,τi] ρ(t; α)dt is called the cumulative baseline intensity function. Similarly, defining the cumulative baseline intensity in panel period p as Rip = ∫[Ti,p−1, Ti,p] ρ(t; α)dt, we have μip = E(nip) = Rip exp(xi′β).
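For the Weibull baseline ρ(t; α) = αtα−1 used later in the example, the cumulative intensity has the closed form Ri = τiα. The sketch below (scipy; all parameter and covariate values are illustrative) checks this against numerical integration and computes the panel quantities Rip for a two-panel design like the artificial one in Section 1.2.3.

```python
import numpy as np
from scipy.integrate import quad

alpha = 1.3                       # Weibull shape; rho(t) = alpha * t**(alpha - 1)
beta = np.array([0.2, -0.5])      # regression effects (illustrative)
x = np.array([1.0, 1.0])          # covariate vector for one individual (illustrative)
tau = 64.0                        # termination time, in months

rho = lambda t: alpha * t ** (alpha - 1.0)

R_numeric, _ = quad(rho, 0.0, tau)      # cumulative baseline intensity R_i
R_closed = tau ** alpha                 # closed form for the Weibull baseline

# Panel version: R_ip over two equal panels [0, 32] and (32, 64]
T = np.array([0.0, 32.0, 64.0])
R_panel = T[1:] ** alpha - T[:-1] ** alpha

mu_plus = R_closed * np.exp(x @ beta)   # expected total events, mu_i+ = R_i exp(x'beta)
print(f"R_i numeric {R_numeric:.4f}  closed {R_closed:.4f}  mu_i+ {mu_plus:.4f}")
print("R_ip per panel:", R_panel)
```

The two panel integrals sum to Ri, which is what makes the factorization of the likelihood below possible.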

The likelihood function based on continuous or panel follow-up can be expressed in the same framework as follows. Let θ = (β′, α′, ϕ)′, and let ωipl be the time of the lth event, measured from the start of the study, for the ith individual in panel period p, i = 1, …, M, p = 1, …, ei, l = 1, …, nip. The likelihood based on either the full data, consisting of event times (subscripted by d = f), or the panel data (subscripted by d = p) factorizes as

(3) Ld(θ) = Cd(θ)L(θ), d ∈ {f, p},

where the factor conditional on the total counts is

(4) Cf(θ) = ∏i { ni+! ∏p ∏l ρ(ωipl; α)/Ri }

for the full data,

(5) Cp(θ) = ∏i { [ni+!/(ni1! ··· niei!)] ∏p (Rip/Ri)nip }

for the panel data, and

(6) L(θ) = ∏i ∫[0,∞) (viμi+)ni+ exp(−viμi+)/ni+! dG(vi)

is the likelihood for a mixed Poisson model based on the total counts observed for individual i. The likelihood L(θ) becomes the negative binomial likelihood if vi is gamma distributed (i.e., G(vi; ·) is a gamma distribution function). If there is a single panel, Lp(θ) [see Equation (3)] reduces to the simple mixed Poisson kernel L(θ), where the response is the total count of events over the entire follow-up time.
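The claim that a gamma mixing distribution turns the mixed Poisson kernel into a negative binomial can be verified numerically. The sketch below (scipy; μi+ and ϕ are illustrative) integrates the Poisson kernel against a gamma density with mean 1 and variance ϕ and compares the result with the negative binomial pmf with size 1/ϕ and success probability 1/(1 + ϕμi+).

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

mu, phi = 3.0, 0.8                 # mu_i+, frailty variance (illustrative)
r = 1.0 / phi                      # gamma shape so that E(v) = 1 and var(v) = phi

def marginal_pmf(k):
    # integrate the Poisson kernel (v*mu)^k e^{-v*mu}/k! against the gamma density G(v)
    integrand = lambda v: stats.poisson.pmf(k, v * mu) * stats.gamma.pdf(v, a=r, scale=phi)
    val, _ = quad(integrand, 0.0, np.inf)
    return val

p = 1.0 / (1.0 + phi * mu)         # negative binomial success probability
for k in range(5):
    print(k, round(marginal_pmf(k), 6), round(stats.nbinom.pmf(k, r, p), 6))
```

The two columns agree to numerical precision; this negative binomial parametrization has mean μi+ and variance μi+ + ϕμi+², matching the variance form given above.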

Overdispersed recurrent event counts are often encountered in trials where the main interest is to test whether certain treatments are effective in reducing the recurrence of events, as illustrated in the example of Section 1.2.3. In this case, the βs are parametrized such that the treatment effects are measured relative to treatment 1, so that β1 reflects the overall mean, and α describes the shape of the intensity function ρ(t; α); common forms of ρ(t; α) are the exponential, exp(αt), and the Weibull, αtα−1.

1.2.3 Example

Consider a clinical trial, conducted by the Veterans Administration Co-operative Urological Research Group, that studied the effects of placebo pills, pyridoxine pills, and periodic instillation of thiotepa into the bladder on the frequency of recurrence of bladder cancer [8]. The data appear in Andrews and Herzberg [2]. All 116 patients had bladder cancer when they entered the study; the tumors were removed, and the patients were randomly assigned to one of the three treatments. Here we consider estimation of the treatment effect under both a design with continuous follow-up, as in the study, and, for illustrative purposes, an artificial design with two equally spaced scheduled follow-up visits over 64 months; for the panel design, we record information on event recurrences at the scheduled follow-up times and at termination times.

Table 1 reports parameter estimates and their standard errors for a Weibull baseline model under both Poisson and negative binomial analyses, for the two-panel design as well as for an analysis of the full data based on continuous follow-up. Both the two-panel and full-data analyses indicate substantial overdispersion, with ϕ = 1.351 in the analysis based on continuous follow-up. The estimate of the Weibull shape parameter α is quite close to unity, and the standard errors of the regression parameter estimates from the overdispersed model are substantially larger than those from the simple Poisson analyses. As a result, the Poisson analysis suggests a significant protective effect of thiotepa treatment (β3), but the overdispersed model does not.

Table 1: Parameter estimates (Est) and their standard errors (SE), resulting from the Poisson and negative binomial (NB) likelihood fit to the bladder cancer data. The regression parameters β1, β2, β3 correspond to the three treatment groups, parametrized with respect to the placebo, and α parametrizes the baseline intensity function.

1.3 Other Approaches to Account for Overdispersion

1.3.1 Generalized Linear Mixed Model

A general class of models that encompasses the incorporation of several random effects, not necessarily independent, is generalized linear mixed models. This may include an individual-specific random effect, as discussed above and also more complex structures that can accommodate dependencies in outcome variables as well as in random effects. A generalized linear mixed model specifies that

(7) g(μi) = xi′β + zi′γ,

where μi and xi are the mean of the response and the vector of covariates corresponding to the ith individual, respectively; zi is a vector of covariates determining the random effects structure; the vector of random effects γ is distributed with mean zero and finite variance matrix; and g is the link function. Conditional on γ, the responses are assumed to have a distribution in the exponential family, for example, Poisson or binomial.

Maximum likelihood estimation involves q-dimensional integration, where q is the dimension of γ; often random effects are assumed to be Gaussian. Tuerlinckx et al. [47] provide a review of methods used for estimation of generalized linear mixed models, discussing methods used to approximate the integral when integrating over the random effects distribution and methods that approximate the integrand of the marginal likelihood. Within the first set of methods, quadrature, Monte Carlo-based numerical methods, and expectation-maximization algorithms are reviewed; within the second, which approximate the integrand, Laplace’s and quasi-likelihood methods are considered.

With overdispersion present, the Poisson or binomial maximum likelihood equations remain valid for estimating the regression parameters in the mean: the usual likelihood equations obtained under a generalized linear model are unbiased estimating equations regardless of any misspecification of the variance structure. Hence, an alternative to generalized linear mixed models is to fit the corresponding generalized linear model and adjust the variance estimates. In this case, often as a final step, the variance is estimated by the sandwich formula, an empirical estimator; this approach has become very popular in the last few decades [31, 46].
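The following sketch illustrates this on simulated overdispersed counts (numpy only; the design, frailty variance, and sample size are illustrative): the Poisson score equations still recover β, but the model-based standard errors are too small, while the sandwich estimator A⁻¹BA⁻¹ reflects the true variability.

```python
import numpy as np

rng = np.random.default_rng(7)

# Overdispersed counts: Poisson mean multiplied by a gamma frailty with variance phi
M, phi = 5000, 1.0
X = np.column_stack([np.ones(M), rng.binomial(1, 0.5, M)])   # intercept + treatment arm
beta_true = np.array([0.5, -0.3])
v = rng.gamma(1.0 / phi, phi, M)
y = rng.poisson(v * np.exp(X @ beta_true))

# Poisson maximum likelihood via Newton-Raphson; the score equations
# X'(y - mu) = 0 remain unbiased for beta even though var(y) > mu here
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    A = X.T @ (mu[:, None] * X)               # Fisher information X'WX
    beta += np.linalg.solve(A, X.T @ (y - mu))

mu = np.exp(X @ beta)
A = X.T @ (mu[:, None] * X)                   # "bread"
B = X.T @ (((y - mu) ** 2)[:, None] * X)      # empirical "meat"
naive = np.linalg.inv(A)                      # model-based covariance
sandwich = np.linalg.inv(A) @ B @ np.linalg.inv(A)

print("estimate    :", beta)
print("naive SE    :", np.sqrt(np.diag(naive)))
print("sandwich SE :", np.sqrt(np.diag(sandwich)))
```

The sandwich standard errors come out noticeably larger than the model-based ones, mirroring the Poisson versus negative binomial contrast in Table 1.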

Nonparametric approaches for handling random effects have also been developed. Lindsay [32] provides a classic comprehensive source on the topic. More recently, Böhning and Seidel [3] provide a review of advances in estimation in mixture models, including nonparametric estimation, the EM algorithm, likelihood ratio tests for testing the number of components in the mixture, special mixtures such as zero-inflated Poisson models, multivariate mixtures, and testing and adjusting for heterogeneity. Groeneboom et al. [20] propose an algorithm, called the support reduction algorithm, to estimate M-estimators in mixture models through iterative unconstrained optimization. Wang [50] proposes three algorithms based on the constrained Newton method [49] to estimate semiparametric mixture models. In these, the mixture distribution G is left unspecified and a finite-dimensional parameter β is common to all mixture components. The three methods are based on (1) alternating estimation of parameters G and β, (2) profiling the likelihood, and (3) modifying the support set; they all use the constrained Newton method and an additional optimization algorithm for unconstrained problems.

There have been some efforts in combining models that account both for overdispersion and clustering effects, the latter perhaps arising from longitudinal measurements. Booth et al. [4] propose a negative binomial model to account for overdispersion, which incorporates random effects, in the linear predictor of the mean, to account for such clustering effects; numerical methods or the EM algorithm is proposed for estimation. Along the same lines, Molenberghs et al. [36] discuss a similar model with gamma and normal random effects to account for overdispersion and clustering effects and Molenberghs et al. [37] generalize the model to a family of generalized linear models for repeated measures with normal and conjugate random effects. Iddi and Molenberghs [27] discuss a marginalized model to account for overdispersion and longitudinal correlation.

Serial correlation may also be accommodated, in addition to overdispersion, through Gaussian time series [23]. Jowaheer and Sutradhar [28] use generalized estimating equations to account for autocorrelation structures as well as overdispersion in longitudinal counts; parameters are estimated via a two-stage iterative procedure. Henderson and Shimakura [24] and Fiocco et al. [18] discuss a model in which, conditional on a frailty, counts of events follow a Poisson distribution, and a serially correlated gamma process models the dependency between observations arising from the same individual. In this generalization of the individual frailty model, the random effects are first-order autocorrelated. Henderson and Shimakura [24] estimate the parameters of the model using a composite likelihood method based on pairs of time points, while Fiocco et al. [18] discuss an alternative two-stage procedure: all parameters except the frailty correlation are estimated at the first stage, and in the second stage the correlation of the frailties is estimated based on pairs of observations.

1.3.2 Zero-Inflated Models

Sometimes apparent overdispersion is induced by the presence of another mode in the data, often at 0. In these cases, the remedy is to fit a model that handles the extra zeros that cannot be accounted for by the Poisson distribution [7, 40]. It may also occur that there is overdispersion beyond zero-inflation, in which case models accounting for both extra zeros and overdispersion have been developed, for example, the zero-inflated negative binomial [19]. There has been great interest in the last decade in accounting as well for correlation structures such as longitudinal, cluster, or spatial components. Ainsworth [1] provides a review of zero-inflated models, pointing to several references, mainly in the field of environmental statistics, that address such challenges in zero-heavy models. Hall [22] considers the challenges of simultaneously modeling within- and between-subject heterogeneity, while Dobbie and Welsh [14] consider serial correlation; both are framed in the context of zero-heavy count data models. Along the same lines, Wan and Chan [48] discuss a modeling approach based on a geometric process that accounts for overdispersion in zero-heavy models and, additionally, can handle serial correlation.
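A small simulation (numpy only; the zero-inflation probability and Poisson mean are illustrative) shows how structural zeros inflate both the zero frequency and the variance relative to a Poisson model with the same mean.

```python
import numpy as np

rng = np.random.default_rng(11)

pi0, lam, M = 0.3, 2.5, 200_000        # extra-zero probability, Poisson mean, sample size

# Zero-inflated Poisson: with probability pi0 emit a structural zero,
# otherwise draw from Poisson(lam)
structural = rng.binomial(1, pi0, M).astype(bool)
y = np.where(structural, 0, rng.poisson(lam, M))

mean = y.mean()                         # (1 - pi0) * lam
zero_frac = (y == 0).mean()             # pi0 + (1 - pi0) * exp(-lam)
poisson_zero = np.exp(-mean)            # zero probability a plain Poisson fit would imply

print(f"mean {mean:.3f}  var {y.var():.3f}")      # var > mean: apparent overdispersion
print(f"observed zeros {zero_frac:.3f}  Poisson({mean:.2f}) predicts {poisson_zero:.3f}")
```

Roughly 36% of the simulated counts are zero where a Poisson with the same mean predicts about 17%, and the variance exceeds the mean, so a naive analysis would read the extra mode at zero as overdispersion.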

1.4 Underdispersion

Underdispersion is less common but also found in count and binary data. Ridout and Besbeas [41] review methods for dealing with underdispersed counts, including (1) weighted Poisson models, in which weights are assigned to each probability density value [9, 13]; (2) double Poisson models, in which the distribution has one more parameter θ than the Poisson, with E(X) ≈ λ and var(X) ≈ λ/θ [15]; (3) birth processes, which are generalizations of Poisson processes in which the birth rate at any time is a function of the number of events that have already occurred [16, 17] [for example, Bosch and Ryan [5] propose a class of birth rates λk = η(k + 1)δ, where δ < 0 corresponds to underdispersion, δ > 0 to overdispersion, and δ = 0 reduces to the Poisson distribution]; (4vk1−v