**Contents**

**Cover**

**Half Title page**

**Title page**

**Copyright page**

**Contributors**

**Preface**

**Chapter 1: Analysis of Over- and Underdispersed Data**

1.1 Introduction

1.2 Overdispersed Binomial and Count Models

1.3 Other Approaches to Account for Overdispersion

1.4 Underdispersion

1.5 Software Notes

References

**Chapter 2: Analysis of Variance (ANOVA)**

2.1 Introduction

2.2 Factors, Levels, Effects, and Cells

2.3 Cell Means Model

2.4 One-Way Classification

2.5 Parameter Estimation

2.6 The *R*(.) Notation—Partitioning Sum of Squares

2.7 ANOVA—Hypothesis of Equal Means

2.8 Multiple Comparisons

2.9 Two-Way Crossed Classification

2.10 Balanced and Unbalanced Data

2.11 Interaction Between Rows and Columns

2.12 Analysis of Variance Table

References

**Chapter 3: Assessment of Health-Related Quality of Life**

3.1 Introduction

3.2 Choice of HRQOL Instruments

3.3 Establishment of Clear Objectives in HRQOL Assessments

3.4 Methods for HRQOL Assessment

3.5 HRQOL as the Primary End Point

3.6 Interpretation of HRQOL Results

3.7 Examples

3.8 Conclusion

References

Further Reading

**Chapter 4: Bandit Processes and Response-Adaptive Clinical Trials: The Art of Exploration Versus Exploitation**

4.1 Introduction

4.2 Exploration Versus Exploitation with Complete Observations

4.3 Exploration Versus Exploitation with Censored Observations

4.4 Conclusion

References

**Chapter 5: Bayesian Dose-Finding Designs in Healthy Volunteers**

5.1 Introduction

5.2 A Bayesian Decision-Theoretic Design

5.3 An Example of Dose Escalation in Healthy Volunteer Studies

5.4 Discussion

References

**Chapter 6: Bootstrap**

6.1 Introduction

6.2 Plug-In Principle

6.3 Monte Carlo Sampling—The “Second Bootstrap Principle”

6.4 Bias and Standard Error

6.5 Examples

6.6 Model Stability

6.7 Accuracy of Bootstrap Distributions

6.8 Bootstrap Confidence Intervals

6.9 Hypothesis Testing

6.10 Planning Clinical Trials

6.11 How Many Bootstrap Samples Are Needed

6.12 Additional References

References

**Chapter 7: Conditional Power in Clinical Trial Monitoring**

7.1 Introduction

7.2 Conditional Power

7.3 Weight-Averaged Conditional Power or Bayesian Predictive Power

7.4 Conditional Power of a Different Kind: Discordance Probability

7.5 Analysis of a Randomized Trial

7.6 Conditional Power: Pros and Cons

References

**Chapter 8: Cost-Effectiveness Analysis**

8.1 Introduction

8.2 Definitions and Design Issues

8.3 Cost and Effectiveness Data

8.4 The Analysis of Costs and Outcomes

8.5 Robustness and Generalizability in Cost-Effectiveness Analysis

References

Further Reading

**Chapter 9: Cox-Type Proportional Hazards Models**

9.1 Introduction

9.2 Cox Model for Univariate Failure Time Data Analysis

9.3 Marginal Models for Multivariate Failure Time Data Analysis

9.4 Practical Issues in Using the Cox Model

9.5 Examples

9.6 Extensions

9.7 Softwares and Codes

References

Further Reading

**Chapter 10: Empirical Likelihood Methods in Clinical Experiments**

10.1 Introduction

10.2 Classical EL: Several Ingredients for Theoretical Evaluations

10.3 The Relationship Between Empirical Likelihood and Bootstrap Methodologies

10.4 Bayes Methods Based on Empirical Likelihoods

10.5 Mixtures of Likelihoods

10.6 An Example: ROC Curve Analyses Based on Empirical Likelihoods

10.7 Applications of Empirical Likelihood Methodology in Clinical Trials or Other Data Analyses

10.8 Concluding Remarks

Appendix

References

**Chapter 11: Frailty Models**

11.1 Introduction

11.2 Univariate Frailty Models

11.3 Multivariate Frailty Models

11.4 Software

References

**Chapter 12: Futility Analysis**

12.1 Introduction

12.2 Common Statistical Approaches to Futility Monitoring

12.3 Examples

12.4 Discussion

References

Further Reading

**Chapter 13: Imaging Science in Medicine I: Overview**

13.1 Introduction

13.2 Advances in Medical Imaging

13.3 Evolutionary Developments in Imaging

13.4 Conclusion

References

**Chapter 14: Imaging Science in Medicine, II: Basics of X-Ray Imaging**

14.1 Introduction to Medical Imaging: Different Ways of Creating Visible Contrast Among Tissues

14.2 What the Body Does to the X-Ray Beam: Subject Contrast From Differential Attenuation of the X-Ray Beam by Various Tissues

14.3 What the X-Ray Beam Does to the Body: Known Medical Benefits Versus Possible Radiogenic Risks

14.4 Capturing the Visual Image: Analog (20th Century) X-Ray Image Receptors

**Chapter 15: Imaging Science in Medicine, III: Digital (21st Century) X-Ray Imaging**

15.1 The Computer in Medical Imaging

15.2 The Digital Planar X-Ray Modalities: Computed Radiography and Digital Radiography and Fluoroscopy

15.3 Digital Fluoroscopy and Digital Subtraction Angiography

15.4 Digital Tomosynthesis: Planar Imaging in Three Dimensions

15.5 Computed Tomography: Superior Contrast in Three-Dimensional X-Ray Attenuation Maps

**Chapter 16: Intention-to-Treat Analysis**

16.1 Introduction

16.2 Missing Information

16.3 The Intention-to-Treat Design

16.4 Efficiency of the Intent-to-Treat Analysis

16.5 Compliance-Adjusted Analyses

16.6 Conclusion

References

Further Reading

**Chapter 17: Interim Analyses**

17.1 Introduction

17.2 Opportunities and Dangers of Interim Analyses

17.3 The Development of Techniques for Conducting Interim Analyses

17.4 Methodology for Interim Analyses

17.5 An Example: Statistics for Lamivudine

17.6 Interim Analyses in Practice

17.7 Conclusions

References

**Chapter 18: Interrater Reliability**

18.1 Definition

18.2 The Importance of Reliability in Clinical Trials

18.3 How Large a Reliability Coefficient Is Large Enough?

18.4 Design and Analysis of Reliability Studies

18.5 Estimate of the Reliability Coefficient—Parametric

18.6 Estimation of the Reliability Coefficient—Nonparametric

18.7 Estimation of the Reliability Coefficient—Binary

18.8 Estimation of the Reliability Coefficient—Categorical

18.9 Strategies to Increase Reliability (Spearman–Brown Projection)

18.10 Other Types of Reliabilities

References

**Chapter 19: Intrarater Reliability**

19.1 Introduction

19.2 Intrarater Reliability for Continuous Scores

19.3 Nominal Scale Score Data

19.4 Ordinal and Interval Score Data

19.5 Concluding Remarks

References

Further Reading

**Chapter 20: Kaplan–Meier Plot**

20.1 Introduction

20.2 Estimation of Survival Function

20.3 Additional Topics

References

**Chapter 21: Logistic Regression**

21.1 Introduction

21.2 Fitting the Logistic Regression Model

21.3 The Multiple Logistic Regression Model

21.4 Fitting the Multiple Logistic Regression Model

21.5 Example

21.6 Testing for the Significance of the Model

21.7 Interpretation of the Coefficients of the Logistic Regression Model

21.8 Dichotomous Independent Variable

21.9 Polytomous Independent Variable

21.10 Continuous Independent Variable

21.11 Multivariate Case

References

**Chapter 22: Metadata**

22.1 Introduction

22.2 History/Background

22.3 Data Set Metadata

22.4 Analysis Results Metadata

22.5 Regulatory Submission Metadata

References

**Chapter 23: Microarray**

23.1 Introduction

23.2 What is a Microarray?

23.3 Other Array Technologies

23.4 Define Objectives of the Study

23.5 Experimental Design for Microarray

23.6 Data Extraction

23.7 Microarray Informatics

23.8 Statistical Analysis

23.9 Annotation

23.10 Pathway, GO, and Class-Level Analysis Tools

23.11 Validation of Microarray Experiments

23.12 Conclusions

References

**Chapter 24: Multi-Armed Bandits, Gittins Index, and Its Calculation**

24.1 Introduction

24.2 Mathematical Formulation of Multi-Armed Bandits

24.3 Off-Line Algorithms for Computing Gittins Index

24.4 On-Line Algorithms for Computing Gittins Index

24.5 Computing Gittins Index for the Bernoulli Sampling Process

24.6 Conclusion

References

**Chapter 25: Multiple Comparisons**

25.1 Introduction

25.2 Strong and Weak Control of the FWE

25.3 Criteria for Deciding Whether Adjustment is Necessary

25.4 Implicit Multiplicity: Two-Tailed Testing

25.5 Specific Multiple Comparison Procedures

References

**Chapter 26: Multiple Evaluators**

26.1 Introduction

26.2 Agreement for Continuous Data

26.3 Agreement for Categorical Data

26.4 Summary and Discussion

References

**Chapter 27: Noncompartmental Analysis**

27.1 Introduction

27.2 Terminology

27.3 Objectives and Features of Noncompartmental Analysis

27.4 Comparison of Noncompartmental and Compartmental Models

27.5 Assumptions of NCA and Its Reported Descriptive Statistics

27.6 Calculation Formulas for NCA

27.7 Guidelines for Performance of NCA Based on Numerical Integration

27.8 Conclusions and Perspectives

References

Further Reading

**Chapter 28: Nonparametric ROC Analysis for Diagnostic Trials**

28.1 Introduction

28.2 Different Aspects of Study Design

28.3 Nonparametric Models and Hypotheses

28.4 Point Estimator

28.5 Asymptotic Distribution and Variance Estimator

28.6 Derivation of the Confidence Interval

28.7 Statistical Tests

28.8 Adaptations for Cluster Data

28.9 Results of a Diagnostic Study

28.10 Summary and Final Remarks

References

**Chapter 29: Optimal Biological Dose for Molecularly Targeted Therapies**

29.1 Introduction

29.2 Phase I Dose-Finding Designs for Cytotoxic Agents

29.3 Phase I Dose-Finding Designs for Molecularly Targeted Agents

29.4 Discussion

References

Further Reading

**Chapter 30: Over- and Underdispersion Models**

30.1 Introduction

30.2 Count Dispersion Models

30.3 Count Explanatory Models

30.4 Summary and Final Remarks

References

**Chapter 31: Permutation Tests in Clinical Trials**

31.1 Randomization Inference—Introduction

31.2 Permutation Tests—How They Work

31.3 Normal Approximation to Permutation Tests

31.4 Analyze as You Randomize

31.5 Interpretation of Permutation Analysis Results

31.6 Summary

References

**Chapter 32: Pharmacoepidemiology, Overview**

32.1 Introduction

32.2 The Case-Crossover Design

32.3 Confounding Bias

32.4 Risk Functions Over Time

32.5 Probabilistic Approach for Causality Assessment

32.6 Methods Based on Prescription Data

References

**Chapter 33: Population Pharmacokinetic and Pharmacodynamic Methods**

33.1 Introduction

33.2 Terminology

33.3 Fixed Effects Models

33.4 Random Effects Models

33.5 Model Building and Parameter Estimation

33.6 Software

33.7 Model Evaluation

33.8 Stochastic Simulation

33.9 Experimental Design

33.10 Applications

References

Further Reading

**Chapter 34: Proportions: Inferences and Comparisons**

34.1 Introduction

34.2 One-Sample Case

34.3 Two Independent Samples

34.4 Note on Software

References

**Chapter 35: Publication Bias**

35.1 Publication Bias and the Validity of Research Reviews

35.2 Research on Publication Bias

35.3 Data Suppression Mechanisms Related to Publication Bias

35.4 Prevention of Publication Bias

35.5 Assessment of Publication Bias

35.6 Impact of Publication Bias

References

Further Reading

**Chapter 36: Quality of Life**

36.1 Background

36.2 Measuring Health-Related Quality of Life

36.3 Development and Validation of HRQoL Measures

36.4 Use in Research Studies

36.5 Interpretation/Clinical Significance

36.6 Conclusions

References

**Chapter 37: Relative Risk Modeling**

37.1 Introduction

37.2 Why Model Relative Risks?

37.3 Data Structures and Likelihoods

37.4 Approaches to Model Specification

37.5 Mechanistic Models

References

**Chapter 38: Sample Size Considerations for Morbidity/Mortality Trials**

38.1 Introduction

38.2 General Framework for Sample Size Calculation

38.3 Choice of Test Statistics

38.4 Adjustment of Treatment Effect

38.5 Informative Noncompliance

References

**Chapter 39: Sample Size for Comparing Means**

39.1 Introduction

39.2 One-Sample Design

39.3 Two-Sample Parallel Design

39.4 Two-Sample Crossover Design

39.5 Multiple-Sample One-Way ANOVA

39.6 Multiple-Sample Williams Design

39.7 Discussion

References

**Chapter 40: Sample Size for Comparing Proportions**

40.1 Introduction

40.2 One-Sample Design

40.3 Two-Sample Parallel Design

40.4 Two-Sample Crossover Design

40.5 Relative Risk—Parallel Design

40.6 Relative Risk—Crossover Design

40.7 Discussion

References

**Chapter 41: Sample Size for Comparing Time-to-Event Data**

41.1 Introduction

41.2 Exponential Model

41.3 Cox’s Proportional Hazards Model

41.4 Log-Rank Test

41.5 Discussion

References

**Chapter 42: Sample Size for Comparing Variabilities**

42.1 Introduction

42.2 Comparing Intrasubject Variabilities

42.3 Comparing Intersubject Variabilities

42.4 Comparing Total Variabilities

42.5 Discussion

References

**Chapter 43: Screening, Models of**

43.1 Introduction

43.2 What is Screening?

43.3 Why Use Modeling?

43.4 Characteristics of Screening Models

43.5 A Simple Disease and Screening Model

43.6 Analytic Models for Cancer

43.7 Simulation Models for Cancer

43.8 Model Fitting and Validation

43.9 Models for Other Diseases

43.10 Current State and Future Directions

References

**Chapter 44: Screening Trials**

44.1 Introduction

44.2 Design Issues

44.3 Sample Size

44.5 Analysis

44.6 Trial Monitoring

References

**Chapter 45: Secondary Efficacy End Points**

45.1 Introduction

45.2 Literature Review

45.3 Review of Methodology for Multiplicity Adjustment and Gatekeeping Strategies for Secondary End Points

45.4 Summary

References

Further Reading

**Chapter 46: Sensitivity, Specificity, and Receiver Operator Characteristic (ROC) Methods**

46.1 Evaluating a Single Binary Test Against a Binary Criterion

46.2 Evaluation of a Single Binary Test: ROC Methods

46.3 Evaluation of a Test Response Measured on an Ordinal Scale: ROC Methods

46.4 Evaluation of Multiple Different Tests

46.5 The Optimal Sequence of Tests

46.6 Sampling and Measurement Issues

46.7 Summary

References

**Chapter 47: Software for Genetics/Genomics**

47.1 Introduction

47.2 Data Management

47.3 Genetic Analysis

47.4 Genomic Analysis

47.5 Other

References

Further Reading

**Chapter 48: Stability Study Designs**

48.1 Introduction

48.2 Stability Study Designs

48.3 Criteria for Design Comparison

48.4 Stability Protocol

48.5 Basic Design Considerations

48.6 Conclusions

References

**Chapter 49: Subgroup Analysis**

49.1 Introduction

49.2 The Dilemma of Subgroup Analysis

49.3 Planned Versus Unplanned Subgroup Analysis

49.4 Frequentist Methods

49.5 Testing Treatment by Subgroup Interactions

49.6 Subgroup Analyses in Positive Clinical Trials

49.7 Confidence Intervals for Treatment Effects within Subgroups

49.8 Bayesian Methods

References

**Chapter 50: Survival Analysis, Overview**

50.1 Introduction

50.2 History

50.3 Survival Analysis Concepts

50.4 Nonparametric Estimation and Testing

50.5 Parametric Inference

50.6 Comparison with Expected Survival

50.7 The Cox Regression Model

50.8 Other Regression Models for Survival Data

50.9 Multistate Models

50.10 Other Kinds of Incomplete Observation

50.11 Multivariate Survival Analysis

50.12 Concluding Remarks

References

**Chapter 51: The FDA and Regulatory Issues**

51.1 Caveat

51.2 Introduction

51.3 Chronology of Drug Regulation in the United States

51.4 FDA Basic Structure

51.5 IND Application Process

51.6 Drug Development and Approval Time Frame

51.7 NDA Process

51.8 U.S. Pharmacopeia and FDA

51.9 CDER Freedom of Information Electronic Reading Room

51.10 Conclusion

**Chapter 52: The Kappa Index**

52.1 Introduction

52.2 The Kappa Index

52.3 Inference for Kappa via Generalized Estimating Equations

52.4 The Dependence of Kappa on Marginal Rates

52.5 General Remarks

References

**Chapter 53: Treatment Interruption**

53.1 Introduction

53.2 Therapeutic TI Studies in HIV/AIDS

53.3 Management of Chronic Disease

53.4 Analytic Treatment Interruption in Therapeutic Vaccine Trials

53.5 Randomized Discontinuation Designs

53.6 Final Comments

References

**Chapter 54: Trial Reports: Improving Reporting, Minimizing Bias, and Producing Better Evidence-Based Practice**

54.1 Introduction

54.2 Reporting Issues in Clinical Trials

54.3 Moral Obligation to Improve the Reporting of Trials

54.4 Consequences of Poor Reporting of Trials

54.5 Distinguishing Between Methodological and Reporting Issues

54.6 One Solution to Poor Reporting: CONSORT 2010 and CONSORT Extensions

54.7 Impact of CONSORT

54.8 Guidance for Reporting Randomized Trial Protocols: SPIRIT

54.9 Trial Registration

54.10 Final Thoughts

References

**Chapter 55: U.S. Department of Veterans Affairs Cooperative Studies Program**

55.1 Introduction

55.2 History of the Cooperative Studies Program (CSP)

55.3 Organization and Functioning of the CSP

55.4 Roles of the Biostatistician and Pharmacist in the CSP

55.5 Ongoing and Completed Cooperative Studies (1972–2000)

55.6 Current Challenges and Opportunities

55.7 Concluding Remarks

References

**Chapter 56: Women’s Health Initiative: Statistical Aspects and Selected Early Results**

56.1 Introduction

56.2 WHI Clinical Trial and Observational Study

56.3 Study Organization

56.4 Principal Clinical Trial Comparisons, Power Calculations, and Safety and Data Monitoring

56.5 Biomarkers and Intermediate Outcomes

56.6 Data Management and Computing Infrastructure

56.7 Quality Assurance Program Overview

56.8 Early Results from the WHI Clinical Trial

56.9 Summary and Discussion

References

**Chapter 57: World Health Organization (WHO): Global Health Situation**

57.1 Introduction

57.2 Program Activities to the End of the Twentieth Century

57.3 Vision for the Use and Generation of Data in the First Quarter of the Twenty-First Century

Reference

Further Reading

**Index**

**Methods and Applications of Statistics in Clinical Trials**

**WILEY SERIES IN METHODS AND APPLICATIONS OF STATISTICS**

The **Wiley Series in Methods and Applications of Statistics** is a unique grouping of research that features classic contributions from Wiley’s

Balakrishnan • *Methods and Applications of Statistics in the Life and Health Sciences*

Balakrishnan • *Methods and Applications of Statistics in Business, Finance, and Management Science*

Balakrishnan • *Methods and Applications of Statistics in Engineering, Quality Control, and the Physical Sciences*

Balakrishnan • *Methods and Applications of Statistics in the Social and Behavioral Sciences*

Balakrishnan • *Methods and Applications of Statistics in the Atmospheric and Earth Sciences*

Balakrishnan • *Methods and Applications of Statistics in Clinical Trials, Volume 1: Concepts, Principles, Trials, and Designs*

Balakrishnan • *Methods and Applications of Statistics in Clinical Trials, Volume 2: Planning, Analysis, and Inferential Methods*

Copyright © 2014 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. All rights reserved.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic format. For information about Wiley products, visit our web site at www.wiley.com.

*Library of Congress Cataloging-in-Publication Data:*

Methods and applications of statistics in clinical trials vol 2/ [edited by] N. Balakrishnan.

p.; cm. — (Methods and applications of statistics)

Includes bibliographical references and index.

ISBN 978-1-118-30476-1 (cloth)

I. Balakrishnan, N., 1956– editor of compilation. II. Series: Wiley series in methods and applications of statistics.

[DNLM: 1. Clinical Trials as Topic. 2. Statistics as Topic. QV 771.4]

R853.C55

610.72’4—dc23

2013034342

**Contributors**

**Per Kragh Andersen**, University of Copenhagen, Copenhagen, Denmark, *pka@biostat.ku.dk*

**Garnet L. Anderson**, Fred Hutchinson Cancer Research Center, Seattle, WA, *garnet@whi.org*

**Chul Ahn**, University of Texas Southwestern Medical Center, Dallas, TX, *chul.ahn@utsouthwestern.edu*

**Edgar Brunner**, Professor Emeritus of Biostatistics, University Medical Center, Göttingen, Germany, *Edgar.Brunner@ams.med.uni-goettingen.de*

**Jürgen B. Bulitta**, State University of New York at Buffalo, Buffalo, NY, *Jurgen.Bulitta@monash.edu*

**Jianwen Cai**, University of North Carolina, Chapel Hill, NC, *cai@bios.unc.edu*

**Patrizio Capasso**, University of Kentucky, Lexington, KY, *patriziocapasso@aol.com*

**Robert C. Capen**, Merck Research Laboratories, West Point, PA

**Jhelum Chakravorty**, McGill University, Montreal, QC, Canada, *jhelum.chakravorty@mail.mcgill.ca*

**Chi Wan Chen**, Pfizer Inc., New York, NY

**David H. Christiansen**, Christiansen Consulting, Boise, ID

**Shein-Chung Chow**, Duke University, Durham, NC, *sheinchung.chow@duke.edu*

**Joseph F. Collins**

**Jason T. Connor**, Berry Consultants, Orlando, FL, *jason@berryconsultants.com*

**Richard J. Cook**, University of Waterloo, Waterloo, ON, Canada, *rjcook@uwaterloo.ca*

**Xiangqin Cui**, University of Alabama at Birmingham, Birmingham, AL, *xcui@uab.edu*

**C. B. Dean**, Western University, Western Science Centre, London, ON, Canada, *dean@stats.uwo.ca*

**Yu Deng**, University of North Carolina, Chapel Hill, NC, *ydeng@bios.unc.edu*

**Diane L. Fairclough**, University of Colorado Health Sciences Center, Denver, CO, *dianefairclough@earthlink.net*

**John R. Feussner**, Medical University of South Carolina, Charleston, SC

**Boris Freidlin**, National Cancer Institute, Bethesda, MD, *freidlinb@ctep.nci.nih.gov*

**Patricia A. Ganz**, UCLA Jonsson Comprehensive Cancer Center, Los Angeles, CA, *pganz@ucla.edu*

**Courtney Gray-McGuire**, Case Western Reserve University, Cleveland, OH, *courtney.gray-mcguire@case.edu*

**Birgit Grund**, University of Minnesota, Minneapolis, MN, *birgit@umn.edu*

**Kilem L. Gwet**, Advanced Analytics, LLC, Gaithersburg, MD, *gwet62@gmail.com*

**H. R. Hapsara**, World Health Organization, Geneva, Switzerland, *hapsarah@who.org*

**William R. Hendee**, Medical College of Wisconsin, Milwaukee, WI, *whendee@mcw.edu*

**William G. Henderson**

**Tim Hesterberg**, Insightful Corporation, Seattle, WA

**Nicholas H. G. Holford**, University of Auckland, Auckland, New Zealand, *n.Holford@auckland.ac.nz*

**Norbert Holländer**, University Hospital of Freiburg, Freiburg, Germany, *norbert.hollaender@novartis.com*

**David W. Hosmer**, University of Massachusetts, Amherst, MA, *hosmer@schoolph.umass.edu*

**Alan D. Hutson**, University at Buffalo, Buffalo, NY, *ahutson@buffalo.edu*

**Peter B. Imrey**, Cleveland Clinic, Cleveland, OH, *imreyp@ccf.org*

**Elizabeth Juarez-Colunga**, University of Colorado Denver, Aurora, CO, *elizabeth.juarez-colunga@ucdenver.edu*

**Seung-Ho Kang**, Ewha Womans University, Seoul, South Korea

**Jörg Kaufmann**, AG Schering SBU Diagnostics & Radiopharmaceuticals, Berlin, Germany

**Niels Keiding**, University of Copenhagen, Copenhagen, Denmark, *nike@sund.ku.dk*

**Célestin C. Kokonendji**, University of Franche-Comté, Besançon, France, *celestin.kokonendji@univ-fcomte.fr*

**Helena Chmura Kraemer**, Stanford University, Palo Alto, CA, *hckhome@pacbell.net*

**John M. Lachin**, George Washington University, Washington, DC, *jml@bsc.gwu.edu*

**Philip W. Lavori**, Stanford University School of Medicine, Stanford, CA, *lavori@stanford.edu*

**Morven Leese**, Institute of Psychiatry—Health Services and Population Research Department, London, UK

**Stanley Lemeshow**, Ohio State University, Columbus, OH, *lemeshow.1@osu.edu*

**Jason J. Z. Liao**, Merck Research Laboratories, West Point, PA, *Jason.Liao@tevausa.com*

**Tsae-Yun Daphne Lin**, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Rockville, MD, *daphne.lin@fda.hhs.gov*

**Qing Lu**, Michigan State University, East Lansing, MI, *qlu@msu.edu*

**Aditya Mahajan**, McGill University, Montreal, QC, Canada, *aditya.mahajan@mcgill.ca*

**Michael A. McIsaac**, University of Waterloo, Waterloo, ON, Canada, *mamcisaa@uwaterloo.ca*

**David Moher**, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada and Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, ON, Canada, *dmoher@ohri.ca*

**Grier P. Page**, RTI International, Research Triangle Park, NC, *gpage@rti.org*

**Peter Peduzzi**, Yale School of Public Health, New Haven, CT, *peter.peduzzi@yale.edu*

**Ross L. Prentice**, Fred Hutchinson Cancer Research Center, Seattle, WA, *rprentic@fhcrc.org*

**Philip C. Prorok**, National Institutes of Health, Bethesda, MD, *Philip.Prorok@nih.hhs.gov*

**Michael A. Proschan**, National Institute of Allergy and Infectious Diseases, Bethesda, MD, *ProschaM@mail.nih.gov*

**Frank Rockhold**, GlaxoSmithKline R&D, King of Prussia, PA, *frank.w.rockhold@gsk.com*

**Hannah R. Rothstein**, City University of New York, NY, *Hannah.Rothstein@baruch.cuny.edu*

**W. Janusz Rzeszotarski**, U.S. Food and Drug Administration, Rockville, MD

**Mike R. Sather**, Department of Veterans Affairs, Albuquerque, NM, *mike.sather@va.gov*

**Tony Segreti**, Research Triangle Institute, Research Triangle, NC

**Larissa Shamseer**, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada and Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, ON, Canada, *lshamseer@ohri.ca*

**Joanna H. Shih**, National Cancer Institute, Bethesda, MD

**Richard M. Simon**, National Cancer Institute, Bethesda, MD, *rsimon@mail.nih.gov*

**Yeunjoo Song**, Case Western Reserve University, Cleveland, OH

**Chris Stevenson**, Monash University, Victoria, Australia, *Christopher.Stevenson@monash.edu*

**Samy Suissa**, McGill University, Montreal, QC, Canada, *samy.suissa@clinepi.mcgill.ca*

**Ming T. Tan**, Georgetown University, Washington, DC, *mtt34@georgetown.edu*

**Duncan C. Thomas**, University of Southern California, Los Angeles, CA, *dthomas@usc.edu*

**Susan Todd**, University of Reading, Reading, Berkshire, UK, *s.c.todd@reading.ac.uk*

**Lucy Turner**, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada, *lturner@ohri.ca*

**Albert Vexler**, University at Buffalo, Buffalo, NY, *avexler@buffalo.edu*

**Hansheng Wang**, Peking University, Beijing, P. R. China, *hansheng@gsm.pku.edu.cn*

**Xikui Wang**, University of Manitoba, Winnipeg, MB, Canada, *xikui.wang@umanitoba.ca*

**C. S. Wayne Weng**, Chung Yuan Christian University, Chungli, Taiwan

**Andreas Wienke**, University Halle-Wittenberg, Halle, Germany, *andreas.wienke@uk-halle.de*

**Anthony B. Wolbarst**, University of Kentucky, Lexington, KY, *awolbarst2@outlook.com*

**Andrew R. Wyant**, University of Kentucky, Lexington, KY, *andrew.wyant@uky.edu*

**Yang Xie**, University of Texas Southwestern Medical Center, Dallas, TX

**Jihnhee Yu**, University at Buffalo, Buffalo, NY, *jinheeyu@buffalo.edu*

**Antonia Zapf**, University Medical Center, Göttingen, Germany, *Antonia.Zapf@med.uni-goettingen.de*

**Donglin Zeng**, University of North Carolina, Chapel Hill, NC, *dzeng@email.unc.edu*

**Yinghui Zhou**, The University of Reading, Reading, Berkshire, UK

**David M. Zucker**, Hebrew University of Jerusalem, Jerusalem, Israel, *mszucker@mscc.huji.ac.il*

*Preface*

Planning, developing, and implementing clinical trials have become an important and integral part of life. More and more effort and care go into conducting clinical trials, as they have been responsible for key advances in medicine and in treatments for different illnesses. Today, clinical trials have become mandatory in the development and evaluation of modern drugs and in identifying the association of risk factors with diseases. Because of the complexity of the issues surrounding clinical trials, regulatory agencies oversee their approval and also ensure impartial review. The main purpose of this two-volume handbook is to provide a detailed exposition of historical developments and also to highlight modern advances in methods and analysis for clinical trials.

It is important to mention that the four-volume *Wiley Encyclopedia of Clinical Trials* served as a basis for this handbook. While many pertinent entries from this *Encyclopedia* have been included here, a number of them have been updated to reflect recent developments on their topics. Some new articles detailing modern advances in statistical methods in clinical trials and their applications have also been included.

A volume of this size and nature cannot be successfully completed without the cooperation and support of the contributing authors, and my sincere thanks and gratitude go to all of them. Thanks are also due to Mr. Steve Quigley and Ms. Sari Friedman (of John Wiley & Sons, Inc.) for their keen interest in this project from day one, as well as for their support and constant encouragement (and, of course, occasional nudges, too) throughout the course of this project. The careful and diligent work of Mrs. Debbie Iscoe in the typesetting of this volume and of Angioline Loredo at the production stage is gratefully acknowledged. Partial financial support of the Natural Sciences and Engineering Research Council of Canada also assisted in the preparation of this handbook, and this support is much appreciated.

This is the seventh in a series of handbooks on *methods and applications of statistics*. While the first handbook has focused on life and health sciences, the second handbook has focused on business, finance, and management sciences, the third has focused on engineering, quality control, and physical sciences, the fourth has focused on behavioral and social sciences, the fifth has focused on atmospheric and earth sciences, and the sixth handbook has concentrated on methods and applications of statistics to clinical trials. This is the second of two volumes describing in detail statistical developments concerning clinical trials, focusing specifically on planning, analysis, and inferential methods.

It is my sincere hope that this handbook and the others in the series will become basic reference resources for those involved in these fields of research!

**PROF. N. BALAKRISHNAN**

McMASTER UNIVERSITY

Hamilton, Canada

February 2014

In the analysis of discrete data, for example, count data analyzed under a Poisson model or binary data analyzed under a binomial model, the empirical variance quite often exceeds the theoretical variance under the presumed model. This phenomenon is called *overdispersion*. If overdispersion is ignored, standard errors of parameter estimates will be underestimated, and therefore p-values for hypothesis tests will be too small, which can lead to incorrectly declaring a predictor significant when in fact it is not.

The Poisson and binomial distributions are simple models but have strict assumptions. In particular, they assume a special mean-variance relationship since each of these distributions is determined by a single parameter. On the other hand, the normal distribution is determined by two parameters, the mean μ and variance σ^{2}, which characterize the location and the spread of the data around the mean. In both the Poisson and binomial distributions, the variance is fixed once the mean or the probability of success has been defined.

Hilbe [25] provides a very comprehensive discussion of what he calls *apparent overdispersion*, which refers to scenarios in which the data exhibit variation beyond what can be explained by the model and this lack of fit is due to several “fixable” reasons. These reasons may be omitting important predictors in the model, the presence of outliers, omitting important interactions as predictors, the need of a transformation for a predictor, and misspecifying the link function for relating the mean response to the predictors. Hilbe [25] also discusses how to recognize overdispersion, and how to adjust for it when it is present beyond apparent cases, and provides an excellent overall review of the topic.

It is important to note that if apparent overdispersion has been ruled out, in log-linear or logistic analyses, the point estimates of the covariate effects will be quite similar regardless of whether overdispersion is accounted for or not. Hence, treatment and other effects will not be aberrant or give a hint of the presence of overdispersion. As well, this suggests that adjusting for overdispersion can be handled through adjustments of variance estimates [35].

Evidence of apparent or real overdispersion exists when the Pearson or deviance residuals are too large [6]; the corresponding Pearson and deviance goodness-of-fit statistics then indicate a poor fit. Several tests have been developed for overdispersion in the context of Poisson or binomial analyses [11, 12, 54], as well as in the context of zero-heavy data [30, 51, 52, 53].
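As a rough illustration of the Pearson-based check described above, the following sketch computes the Pearson goodness-of-fit statistic divided by its degrees of freedom for an intercept-only Poisson fit. The function name `pearson_dispersion` and the toy counts are assumptions made for illustration only; ratios well above 1 hint at overdispersion, although apparent causes should be ruled out first.

```python
def pearson_dispersion(counts):
    """Pearson chi-square / df for an intercept-only Poisson model."""
    n = len(counts)
    mu = sum(counts) / n  # Poisson MLE of the common mean
    pearson = sum((y - mu) ** 2 / mu for y in counts)
    return pearson / (n - 1)  # one parameter (the mean) was estimated


# Hypothetical counts with many zeros and a few large values
toy_counts = [0, 0, 1, 0, 7, 2, 0, 9, 1, 0]
ratio = pearson_dispersion(toy_counts)
print(f"Pearson statistic / df = {ratio:.3f}")
```

In a regression setting the same ratio would use the fitted means from the model and subtract the number of estimated regression parameters from the sample size.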

In the binomial context, overdispersion typically arises because the independence assumption is violated. This is commonly caused by clustering of responses; for instance, clinics or hospitals may induce a clustering effect due to differences in patient care strategies across institutions.

Let *Y*_{i} denote a binomial response for cluster *i*, *i* = 1, …, *M*, which results from the sum of *m*_{i} binary outcomes *Y*_{ij}, that is, $Y_i = \sum_{j=1}^{m_i} Y_{ij}$, where *j* denotes individual *j*, *j* = 1, …, *m*_{i}. If the *Y*_{ij} are independent binary variables taking values 0 or 1 with probabilities (1 − *p*_{i}) and *p*_{i}, respectively, then E(*Y*_{i}) = *m*_{i}*p*_{i} and var(*Y*_{i}) = *m*_{i}*p*_{i}(1 − *p*_{i}). If there exists correlation between two responses in any given cluster, with corr(*Y*_{ij}, *Y*_{ik}) = ψ > 0 for *j* ≠ *k*, then

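$$\mathrm{E}(Y_i) = m_i p_i, \qquad \mathrm{var}(Y_i) = m_i p_i (1 - p_i)\,[1 + \psi\,(m_i - 1)], \tag{1}$$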

leading to overdispersion. Note that ψ < 0 leads to underdispersion. If we consider the *p*_{i} as random variables with E(*p*_{i}) = π and var(*p*_{i}) = ψπ(1 − π), then the unconditional mean and variance also have the form of (1), with π in place of *p*_{i}. If we further assume that the *p*_{i} follow a Beta(α, β) distribution, the distribution of *Y*_{i} is the so-called beta-binomial distribution, which has been studied extensively (see, for example, Hinde and Demétrio [26] and Molenberghs et al. [37]).
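The variance inflation in the beta-binomial case can be checked numerically. The sketch below assumes the standard result that a Beta(α, β) mixing distribution has var(*p*_{i}) = π(1 − π)/(α + β + 1), so that ψ = 1/(α + β + 1), and compares the variance computed directly from the beta-binomial probabilities with *m*π(1 − π)[1 + ψ(*m* − 1)]. The cluster size and Beta parameters are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(y, m, a, b):
    """P(Y = y) when Y | p ~ Binomial(m, p) and p ~ Beta(a, b)."""
    return comb(m, y) * exp(log_beta(y + a, m - y + b) - log_beta(a, b))


def moments(m, a, b):
    """Mean and variance computed directly from the pmf."""
    probs = [beta_binomial_pmf(y, m, a, b) for y in range(m + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


m, a, b = 10, 2.0, 3.0       # hypothetical cluster size and Beta parameters
pi = a / (a + b)             # E(p_i)
psi = 1.0 / (a + b + 1.0)    # implied within-cluster correlation
mean, var = moments(m, a, b)
binomial_var = m * pi * (1 - pi)
inflated_var = binomial_var * (1 + psi * (m - 1))
print(mean, var, binomial_var, inflated_var)
```

With these values the binomial variance is 2.4, whereas the beta-binomial variance is 6.0, an inflation factor of 1 + ψ(*m* − 1) = 2.5.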

Poisson and overdispersed Poisson data are examples of data from counting processes that arise when individuals experience repeated occurrence of events over time. Such data are known as recurrent event data (see, for example, Cook and Lawless [10] and Juarez-Colunga [29]). Consider *M* individuals each monitored for occurrence of events from a start time 0 through time τ_{i}, called the *termination time*, *i* = 1, …, *M*. Let {*N*_{i}(*t*), *t* ≥ 0} be the right-continuous counting process that records the number of events for individual *i* over the interval [0, *t*]. The termination time is here assumed to be independent of the counting process {*N*_{i}(*t*), *t* ≥ 0}. Let the intensity of the counting process be λ_{i}(*t*|*H*_{i}(*t*)) = $\lim_{\Delta t \downarrow 0} \Pr\{N_i(t + \Delta t) - N_i(t) = 1 \mid H_i(t)\}/\Delta t$, where *H*_{i}(*t*) = {*N*_{i}(*s*) : 0 ≤ *s* ≤ *t*} represents the history of the process up to time *t*. This intensity represents the instantaneous rate of occurrence of an event at time *t*. If the counting process is Poisson, given the memoryless property of the Poisson process, the intensity depends on the history only through *t*, that is, λ_{i}(*t*|*H*_{i}(*t*)) = λ_{i}(*t*), and the expected number of events over the entire follow-up can be written as $\mu_{i+} = \int_0^{\tau_i} \lambda_i(t)\,dt$. Let the total number of events in the entire follow-up be *n*_{i+} for individual *i*; then *n*_{i+} follows a Poisson distribution with mean μ_{i+} = E(*n*_{i+}) = var(*n*_{i+}).

Two types of data are common in counting processes, and we will consider both here in the context of overdispersion: (*1*) individual *i* gives rise to *n*_{i+} event times recorded as *t*_{i1} < *t*_{i2} < ··· < *t*_{i,n_{i+}} < τ_{i}, and (*2*) only counts within specific follow-up times 0 = *T*_{i,0} < *T*_{i,1} < ··· < *T*_{i,e_{i}} = τ_{i} are available; these are called *panel counts* and are denoted *n*_{ip} = *N*_{i}(*T*_{i,p}) − *N*_{i}(*T*_{i,p−1}), *p* = 1, 2, …, *e*_{i}, with the total aggregated count for individual *i* denoted $n_{i+} = \sum_{p=1}^{e_i} n_{ip}$.

A simple way to incorporate overdispersion is through the use of an individual-specific random effect *v*_{i}. Given *v*_{i}, and the covariate vector *x*_{i} corresponding to the *i*th individual, the counting process *N*_{i}(*t*) may be modeled as a Poisson process with intensity function

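$$\lambda_i(t \mid v_i, \boldsymbol{x}_i) = v_i\,\lambda(t; \boldsymbol{x}_i) = v_i\,\rho(t; \boldsymbol{\alpha}) \exp(\boldsymbol{x}_i'\boldsymbol{\beta}), \tag{2}$$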

where ρ is a twice-differentiable baseline intensity function, depending on the parameter **α**, and **β** are the regression effects. We may take E(*v*_{i}) = 1 without loss of generality, and let var(*v*_{i}) = *ϕ*. The function λ(*t*; ***x***) = ρ(*t*; **α**) exp(***x***′**β**) is now interpreted as a population average rate function among subjects with covariate vector ***x***.

Let the expected number of events over the entire follow-up [0, τ_{i}] be μ_{i+} = *R*_{i} exp(***x***′_{i}**β**), where $R_i = \int_0^{\tau_i} \rho(t; \boldsymbol{\alpha})\,dt$ is called the cumulative baseline intensity function. Similarly, defining the cumulative baseline intensity function in panel period *p* as $R_{ip} = \int_{T_{i,p-1}}^{T_{i,p}} \rho(t; \boldsymbol{\alpha})\,dt$, we have μ_{ip} = E(*n*_{ip}) = *R*_{ip} exp(***x***′_{i}**β**).

The likelihood function based on continuous or panel follow-up can be expressed in the same framework as follows. Let **θ** = (β′, α′, *ϕ*)′, and let ω_{ipl} be the time of the *l*th event, from the start of the study, for the *i*th individual in panel period *p, i* = 1, …, *M, p* = 1, …, *e*_{i}, *l* = 1, …, *n*_{ip}. The likelihood based on either the full data, consisting of event times (subscripted by *d* = *f*), or the panel data (subscripted by *d* = *p*) factorizes as:

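$$L_d(\boldsymbol{\theta}) = L_d^{*}(\boldsymbol{\alpha})\,L(\boldsymbol{\theta}), \qquad d = f, p, \tag{3}$$

where

$$L_f^{*}(\boldsymbol{\alpha}) = \prod_{i=1}^{M} n_{i+}! \prod_{p=1}^{e_i} \prod_{l=1}^{n_{ip}} \frac{\rho(\omega_{ipl}; \boldsymbol{\alpha})}{R_i}, \tag{4}$$

$$L_p^{*}(\boldsymbol{\alpha}) = \prod_{i=1}^{M} \frac{n_{i+}!}{\prod_{p=1}^{e_i} n_{ip}!} \prod_{p=1}^{e_i} \left(\frac{R_{ip}}{R_i}\right)^{n_{ip}}, \tag{5}$$

and

$$L(\boldsymbol{\theta}) = \prod_{i=1}^{M} \int_0^{\infty} \frac{(v_i \mu_{i+})^{n_{i+}} \exp(-v_i \mu_{i+})}{n_{i+}!}\,dG(v_i; \phi) \tag{6}$$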

is the likelihood for a mixed Poisson model based on the total counts observed for individual *i.* The likelihood *L*(**θ**) becomes the negative binomial if *v*_{i} is gamma distributed (i.e., *G*(*v*_{i};.) is a gamma distribution). If there is a single panel, *L*_{p}(**θ**) [see Equation (3)] will reduce to the simple mixed Poisson kernel, *L*(**θ**), where the response is the total count of events in the entire follow-up time.
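To make the gamma case concrete, the sketch below writes out the negative binomial log-probability that results when *v*_{i} is gamma distributed with mean 1 and variance *ϕ*, and checks numerically that the resulting counts have mean μ and variance μ + *ϕ*μ². The function names and the value of μ are assumptions for illustration; the value of *ϕ* is borrowed from the bladder cancer example purely as a plausible magnitude.

```python
from math import exp, lgamma, log


def neg_binomial_logpmf(n, mu, phi):
    """log P(N = n) for N | v ~ Poisson(v * mu), v ~ Gamma(mean 1, variance phi)."""
    k = 1.0 / phi  # gamma shape parameter
    return (lgamma(n + k) - lgamma(k) - lgamma(n + 1)
            + n * log(phi * mu / (1.0 + phi * mu))
            - k * log(1.0 + phi * mu))


def log_likelihood(counts, means, phi):
    """Mixed Poisson (negative binomial) log-likelihood based on total counts."""
    return sum(neg_binomial_logpmf(n, mu, phi) for n, mu in zip(counts, means))


mu, phi = 3.0, 1.351  # hypothetical mean; phi used only as an example value
probs = [exp(neg_binomial_logpmf(n, mu, phi)) for n in range(2000)]
total = sum(probs)
nb_mean = sum(n * p for n, p in enumerate(probs))
nb_var = sum((n - nb_mean) ** 2 * p for n, p in enumerate(probs))
print(total, nb_mean, nb_var)
```

The variance exceeds the Poisson variance μ by the term *ϕ*μ², so larger means are inflated more strongly than smaller ones.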

Overdispersed recurrent event counts are often encountered in trials where the main interest is to test whether certain treatments are effective in reducing the recurrence of events, as illustrated in the example in Section 1.2.3. In this case, the βs are parametrized such that the treatment effects are measured relative to treatment 1, so that β_{1} reflects the overall mean and **α** describes the shape of the intensity function ρ(*t*; **α**); common forms of ρ(*t*; **α**) are exponential (exp(α*t*)) and Weibull (α*t*^{α−1}).
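For the Weibull form ρ(*t*; α) = α*t*^{α−1}, the cumulative baseline intensity over [0, *t*] is simply *t*^{α}, so the expected panel counts follow directly from differences of this quantity at the panel boundaries. The sketch below illustrates the calculation; the function names, the covariate vector, and all parameter values are hypothetical and are not taken from the example data.

```python
from math import exp


def weibull_cumulative(t, alpha):
    """Integral of alpha * s**(alpha - 1) over [0, t], which equals t**alpha."""
    return t ** alpha


def panel_means(cut_points, alpha, x, beta):
    """Expected counts R_ip * exp(x'beta) for each panel period."""
    rate = exp(sum(xj * bj for xj, bj in zip(x, beta)))
    return [
        (weibull_cumulative(hi, alpha) - weibull_cumulative(lo, alpha)) * rate
        for lo, hi in zip(cut_points[:-1], cut_points[1:])
    ]


# Hypothetical setting: two equally spaced panels over 64 months
cuts = [0.0, 32.0, 64.0]
alpha = 1.0                                   # shape 1 gives a constant baseline
x, beta = [1.0, 0.0, 1.0], [-2.0, 0.1, -0.5]  # intercept plus two treatment indicators
mu_panels = panel_means(cuts, alpha, x, beta)
mu_total = weibull_cumulative(cuts[-1], alpha) * exp(-2.5)
print(mu_panels, mu_total)
```

Because the panel periods partition the follow-up, the panel means add up to the expected total count for the individual.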

Consider a clinical trial, conducted by the Veterans Administration Co-operative Urological Research Group, that studied the effects of placebo pills, pyridoxine pills, and periodic instillation of thiotepa into the bladder on the frequency of recurrence of bladder cancer [8]. The data appear in Andrews and Herzberg [2]. All 116 patients had bladder cancer when they entered the study; the tumors were removed, and the patients were randomly assigned to one of the three treatments. Here we consider estimation of the treatment effect under both a design with continuous follow-up, as in the study, and an artificial design, for illustrative purposes, with 2 equally spaced scheduled follow-up visits over 64 months; for the panel design, we record information on event recurrences at the scheduled follow-up times and at termination times.

Table 1 reports parameter estimates and their standard errors for a Weibull baseline model under both Poisson and negative binomial analyses, for a 2-panel design as well as for the full data based on continuous follow-up. Both the 2-panel and the full data analyses indicate substantial overdispersion in the data, with an estimated *ϕ* of 1.351 in the analysis based on continuous follow-up. The estimate of the Weibull shape parameter α is quite close to unity, and the standard errors of the regression parameter estimates from the overdispersed model are considerably larger than those from the simple Poisson analyses. As a result, the protective effect of thiotepa treatment (β_{3}) is significant under the Poisson analysis, but not under the overdispersed model.

A general class of models that allows several random effects, not necessarily independent, is the class of generalized linear mixed models. These may include an individual-specific random effect, as discussed above, and also more complex structures that can accommodate dependencies in outcome variables as well as in random effects. A generalized linear mixed model specifies that

$$g(\mu_i) = \boldsymbol{x}_i'\boldsymbol{\beta} + \boldsymbol{z}_i'\boldsymbol{\gamma}, \tag{7}$$

where μ_{i} and ***x***_{i} are the mean of the response and the vector of covariates, respectively, corresponding to the *i*th individual; ***z***_{i} is a vector of covariates determining the random effects structure, and the vector of random effects **γ** is distributed with a mean of zero and finite variance matrix; *g* is the link function. Conditional on **γ**, the responses are assumed to have a distribution in the exponential family, for example, Poisson or binomial.

Maximum likelihood estimation involves q-dimensional integration, where q is the dimension of γ; often random effects are assumed to be Gaussian. Tuerlinckx et al. [47] provide a review of methods used for estimation of generalized linear mixed models, discussing methods used to approximate the integral when integrating over the random effects distribution and methods that approximate the integrand of the marginal likelihood. Within the first set of methods, quadrature, Monte Carlo-based numerical methods, and expectation-maximization algorithms are reviewed; within the second, which approximate the integrand, Laplace’s and quasi-likelihood methods are considered.
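The integration over the random effects can be illustrated in the simplest case of a single Gaussian random intercept in a Poisson model. The sketch below approximates the marginal probability of a count with a composite Simpson rule, which is used here only as an easily readable stand-in for the Gauss-Hermite quadrature more common in practice. All function names and parameter values are assumptions for illustration.

```python
from math import exp, lgamma, log, pi, sqrt


def poisson_pmf(y, mu):
    return exp(-mu + y * log(mu) - lgamma(y + 1))


def normal_pdf(g, sigma):
    return exp(-0.5 * (g / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


def marginal_pmf(y, eta, sigma, n_intervals=400, width=8.0):
    """P(Y = y) when log-mean = eta + gamma and gamma ~ N(0, sigma^2)."""
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / n_intervals  # n_intervals must be even for Simpson's rule
    total = 0.0
    for k in range(n_intervals + 1):
        g = lo + k * h
        weight = 1 if k in (0, n_intervals) else (4 if k % 2 else 2)
        total += weight * poisson_pmf(y, exp(eta + g)) * normal_pdf(g, sigma)
    return total * h / 3.0


eta, sigma = 0.5, 0.7  # hypothetical linear predictor and random-effect SD
probs = [marginal_pmf(y, eta, sigma) for y in range(150)]
marg_mean = sum(y * p for y, p in enumerate(probs))
marg_var = sum((y - marg_mean) ** 2 * p for y, p in enumerate(probs))
print(sum(probs), marg_mean, marg_var)
```

The marginal mean agrees with the closed form exp(η + σ²/2) for a log-normal mixing distribution, and the marginal variance exceeds the mean, showing how the random intercept induces overdispersion. With *q* random effects the same idea requires a *q*-dimensional grid, which is why the approximations reviewed above become important.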

With overdispersion present, the use of the Poisson or binomial maximum likelihood equations for estimating the regression parameters in the mean is still valid. The usual likelihood equations obtained assuming a generalized linear model are unbiased estimating equations regardless of any misspecification of the variance structure. Hence, an alternative to the use of generalized linear mixed models is to use the corresponding generalized linear model and adjust the variance estimates. In this case, often as a final step, the variance is estimated by the sandwich estimator formula, which is an empirical estimator; this approach has become very popular in the last few decades [31, 46].
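A minimal sketch of the sandwich idea is given below for a Poisson log-linear model with an intercept and a single covariate. The Poisson score equations are solved by Newton-Raphson, and the model-based variance (the inverse information) is then compared with the empirical sandwich variance, in which the squared residuals replace the Poisson variance in the middle term. The helper names and the toy data are hypothetical, and the small-sample corrections often applied in practice are omitted.

```python
from math import exp, log, sqrt


def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def matmul2(p, q):
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def weighted_cross(xs, weights):
    """Sum over i of w_i * (1, x_i)(1, x_i)' as a 2x2 matrix."""
    s0 = sum(weights)
    s1 = sum(w * x for x, w in zip(xs, weights))
    s2 = sum(w * x * x for x, w in zip(xs, weights))
    return ((s0, s1), (s1, s2))


def poisson_sandwich(xs, ys, iters=25):
    b0, b1 = log(sum(ys) / len(ys)), 0.0  # start at the intercept-only fit
    for _ in range(iters):                # Newton-Raphson on the Poisson score
        mus = [exp(b0 + b1 * x) for x in xs]
        score = (sum(y - m for y, m in zip(ys, mus)),
                 sum((y - m) * x for x, y, m in zip(xs, ys, mus)))
        step = inv2(weighted_cross(xs, mus))
        b0 += step[0][0] * score[0] + step[0][1] * score[1]
        b1 += step[1][0] * score[0] + step[1][1] * score[1]
    mus = [exp(b0 + b1 * x) for x in xs]
    bread = inv2(weighted_cross(xs, mus))  # model-based variance
    meat = weighted_cross(xs, [(y - m) ** 2 for y, m in zip(ys, mus)])
    robust = matmul2(matmul2(bread, meat), bread)  # sandwich variance
    return (b0, b1), bread, robust


# Hypothetical counts for a control group (x = 0) and a treated group (x = 1)
xs = [0] * 7 + [1] * 7
ys = [0, 1, 0, 8, 2, 0, 1, 0, 0, 3, 0, 1, 0, 0]
(b0, b1), model_var, robust_var = poisson_sandwich(xs, ys)
model_se, robust_se = sqrt(model_var[1][1]), sqrt(robust_var[1][1])
print(b1, model_se, robust_se)
```

The point estimate is the same under both approaches; only the standard error changes, and for these overdispersed toy counts the sandwich standard error is noticeably larger than the model-based one.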

Nonparametric approaches for handling random effects have also been developed. Lindsay [32] provides a classic comprehensive source on the topic. More recently, Böhning and Seidel [3] provide a review of advances in estimation in mixture models, including nonparametric estimation, the EM algorithm, likelihood ratio tests for testing the number of components in the mixture, special mixtures such as zero-inflated Poisson models, multivariate mixtures, and testing and adjusting for heterogeneity. Groeneboom et al. [20] propose an algorithm, called the support reduction algorithm, to compute M-estimators in mixture models through iterative unconstrained optimization. Wang [50] proposes three algorithms based on the constrained Newton method [49] to estimate semiparametric mixture models. In these, the mixture distribution *G* is left unspecified and a finite-dimensional parameter β is common to all mixture components. The three methods are based on (*1*) alternating estimation of parameters *G* and β, (*2*) profiling the likelihood, and (*3*) modifying the support set; they all use the constrained Newton method and an additional optimization algorithm for unconstrained problems.

There have been some efforts in combining models that account both for overdispersion and clustering effects, the latter perhaps arising from longitudinal measurements. Booth et al. [4] propose a negative binomial model to account for overdispersion, which incorporates random effects, in the linear predictor of the mean, to account for such clustering effects; numerical methods or the EM algorithm is proposed for estimation. Along the same lines, Molenberghs et al. [36] discuss a similar model with gamma and normal random effects to account for overdispersion and clustering effects and Molenberghs et al. [37] generalize the model to a family of generalized linear models for repeated measures with normal and conjugate random effects. Iddi and Molenberghs [27] discuss a marginalized model to account for overdispersion and longitudinal correlation.

Serial correlation may also be accommodated, in addition to overdispersion, through Gaussian time series [23]. Jowaheer and Sutradhar [28] use generalized estimating equations to account for autocorrelation structures as well as overdispersion in longitudinal counts. Parameters are estimated via a two-stage iterative procedure. Henderson and Shimakura [24] and Fiocco et al. [18] discuss a model that, conditional on a frailty, follows a Poisson distribution for counts of events and uses a gamma serially correlated process to model dependency between observations arising from the same individual. In this generalization of the individual frailty model, the random effects are first-order autocorrelated. Henderson and Shimakura [24] estimate the parameters of the model using a composite likelihood method based on pairs of time points, while Fiocco et al. [18] discuss an alternative approach using a two-stage procedure. In the two-stage procedure all parameters except the frailty correlation are estimated at the first stage while, in the second stage, the correlation of the frailties is estimated, based on pairs of observations.

Sometimes apparent overdispersion is induced by the presence of another mode in the data, often at 0. In these cases, the remedy is to fit a model that handles the extra zeros that cannot be accounted for through the Poisson distribution [7, 40]. However, there may also be overdispersion beyond zero-inflation; models accounting for both extra zeros and overdispersion have been developed for this case, for example, the zero-inflated negative binomial [19]. There has been great interest in the last decade in accounting as well for correlation structures such as longitudinal, cluster, or spatial components. Ainsworth [1] provides a review of zero-inflated models, pointing out several references, mainly in the field of environmental statistics, that address such challenges in zero-heavy models. Hall [22] considers the challenges of simultaneously modeling within- and between-subject heterogeneity, while Dobbie and Welsh [14] consider serial correlation; both of these are framed in the context of zero-heavy count data models. Along the same lines, Wan and Chan [48] discuss a modeling approach based on a geometric process that accounts for overdispersion in zero-heavy models and, additionally, can handle serial correlation.
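To see why extra zeros look like overdispersion relative to a Poisson model, consider the usual zero-inflated Poisson formulation, in which a count is a structural zero with some probability and otherwise Poisson. The sketch below, with hypothetical parameter values and an assumed function name `zip_pmf`, computes the mean and variance of such a distribution and compares them.

```python
from math import exp, lgamma, log


def zip_pmf(k, lam, pi0):
    """P(N = k) for a zero-inflated Poisson with extra-zero probability pi0."""
    poisson = exp(-lam + k * log(lam) - lgamma(k + 1))
    return (pi0 if k == 0 else 0.0) + (1.0 - pi0) * poisson


lam, pi0 = 2.5, 0.3  # hypothetical Poisson mean and zero-inflation probability
probs = [zip_pmf(k, lam, pi0) for k in range(200)]
zip_mean = sum(k * p for k, p in enumerate(probs))
zip_var = sum((k - zip_mean) ** 2 * p for k, p in enumerate(probs))
print(zip_mean, zip_var)
```

The mean is (1 − π₀)λ while the variance is (1 − π₀)λ(1 + π₀λ), so the variance exceeds the mean whenever π₀ > 0, even though the non-zero part of the data is exactly Poisson.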

Underdispersion is less common, but also found in count and binary data. Ridout and Besbeas [41] review methods for dealing with underdispersed counts, including (*1*) weighted Poisson models, in which weights are assigned to each probability density value [9,13]; (*2*) double Poisson models, in which the distribution has one more parameter θ than the Poisson and E(*X*) ≈ λ and var(*X*) ≈ λ/θ [15]; (*3*) birth processes, which are generalizations of Poisson processes in which the birth rate at any time is a function of the number of events that have already occurred [16, 17]; [for example, Bosch and Ryan [5] propose a class of distributions λ_{k} = η(*k* + 1)^{δ}, where δ < 0 corresponds to underdispersion, δ *>* 0 to overdispersion, and δ = 0 reduces to Poisson distribution]; (*4**v**k*^{1−v}