ABSTRACT

The two principal analytical study designs used in traditional risk-factor epidemiology are cohort and case-control designs. In cohort studies, individuals are sampled and their exposure status is determined initially; subsequent disease status is the outcome variable. In case-control studies, cases and noncases are sampled, and the “outcomes” being compared are the covariates (including exposure). If the underlying cohort (or source population from which the cases arise) can be identified, the case-control study can be regarded as a special sampling design for efficiently learning about the risk-factor associations in the cohort (Wacholder et al., 1992a, 1992b, 1992c). Indeed, case-control studies are of great public health interest and utility because the odds ratio that can be estimated from a case-control study approximates the relative risk comparing exposed to unexposed individuals in a cohort study (Cornfield, 1951). Cohort and case-control designs are distinguished primarily by the direction of inference: cohort studies reason forwards in time from an exposure to disease, while case-control studies reason in the other direction, from disease back to possible causes. It is this direction of inference – not the temporal sequence of data collection – that is conceptually important. Either type of study can be conducted “retrospectively” (using records from the past) or “prospectively” (collecting new observations as they occur in the future). Although some authors have used the terms prospective and retrospective to refer to cohort and case-control designs, respectively, it is better to be explicit about the time periods over which the data were collected, whether for a cohort study or a case-control study.
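
As a rough numerical illustration of the odds ratio approximating the relative risk, the sketch below uses an entirely hypothetical cohort (the counts and the 1% control sampling fraction are assumptions chosen for illustration, not data from any study). It assumes a rare disease and assumes that noncases are sampled independently of exposure; under those assumptions the odds ratio from the case-control sample equals the cohort odds ratio, which in turn lies close to the relative risk.

```python
def risk_ratio(a, b, c, d):
    """Relative risk from a 2x2 table.

    a: exposed cases,   b: exposed noncases,
    c: unexposed cases, d: unexposed noncases.
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Exposure odds ratio (cross-product ratio) from the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical cohort: 10,000 exposed and 10,000 unexposed, rare disease.
a, b, c, d = 30, 9970, 10, 9990

rr = risk_ratio(a, b, c, d)         # risk 0.003 vs 0.001
or_cohort = odds_ratio(a, b, c, d)  # odds ratio in the full cohort

# Case-control sample: keep all cases, sample a fraction f of noncases
# regardless of exposure (expected counts, ignoring sampling variability).
f = 0.01
or_case_control = odds_ratio(a, b * f, c, d * f)

print(f"relative risk:            {rr:.3f}")
print(f"odds ratio, cohort:       {or_cohort:.3f}")
print(f"odds ratio, case-control: {or_case_control:.3f}")
```

The sampling fraction cancels in the cross-product ratio, which is why the case-control odds ratio matches the cohort odds ratio in expectation; the gap between the odds ratio and the relative risk grows as the disease becomes more common.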