## Design of experiments

When the goal in a statistical study is to understand cause and effect, experiments are the only way to obtain convincing evidence for causation. This is an introductory discussion on experimental design, introducing its vocabulary, its characteristics and its principles. We use a hypothetical example of an experiment to illustrate the concepts.

An observational study is a study in which the researchers observe individuals and measure variables of interest but do not attempt to influence the response variable. In an experiment, the researchers deliberately impose some treatment on individuals and then observe the response variables. When the goal is to demonstrate cause and effect, an experiment is the only source of fully convincing data.

Terminology
The individuals on which the experiment is performed are called the experimental units. If the experimental units are human beings, they are called subjects. A treatment is an experimental condition applied to the experimental units. The goal of an experiment is to determine whether changes in one or more explanatory variables have any effect on some response variables. For this reason, the distinction between explanatory variables and response variables is important. The explanatory variables are often called factors. Each factor may have several values (called levels). Many experiments study the joint effects of several factors. A treatment is then formed by combining a level of each of the factors.
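As a minimal sketch of how treatments arise from combining factor levels, the snippet below enumerates every combination of one level per factor. The two factors used here (a dosage and a dosing schedule) and their levels are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical factors and levels, used only to illustrate the vocabulary.
factors = {
    "dosage": ["325 mg", "500 mg", "650 mg"],
    "schedule": ["once daily", "twice daily"],
}

# A treatment is formed by picking one level from each factor.
treatments = [
    dict(zip(factors, combo)) for combo in product(*factors.values())
]

for treatment in treatments:
    print(treatment)

# 3 dosage levels x 2 schedule levels = 6 treatments
print(len(treatments))
```

With a single factor, the treatments are simply the levels of that factor.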

Introduction of Examples
To illustrate the concepts, we use a hypothetical experiment. Suppose a new medication designed to reduce fever (and relieve aches and pain) is being tested for efficacy and side effects. For convenience, we call this new medication Drug X. There are three different dosages: 325 mg, 500 mg and 650 mg. The experiment enrolls 1200 patients with high fever to test Drug X. Assume that the subjects in this experiment include 600 men and 600 women with age ranging from 18 to 70. The primary outcome measure is the drop in body temperature three hours after taking the treatment, which is the yardstick by which to measure the success of Drug X.

The three basic principles of statistical design of experiments are Control, Randomization and Repetition. When we say the design of an experiment (or experimental design), we refer to the manner in which these three principles are carried out. There are three main experimental designs: completely randomized design, randomized block design and matched pairs design. We present several examples based on the hypothetical experiment to illustrate these ideas.

More Terminology
In all the examples below, the new medication Drug X is compared to a placebo given to a separate group of subjects. A placebo is a dummy treatment. In this example, it is a medication that looks, smells and tastes identical to Drug X. The experiments described in these examples are double-blind, meaning that neither the subjects nor the experimenters know which treatment any subject has received.

Example 1a – Completely Randomized Design
The researchers randomly assign the 1200 subjects to two treatment groups, Group 1 (600 subjects taking Drug X 325 mg) and Group 2 (600 subjects taking placebo). Three hours after taking the treatments, the researchers compare the change in body temperature between the treatment groups. In this example, there are two treatments, Drug X and placebo.

The treatment of interest (Drug X) is called an intervention and the Drug X group is called the intervention group. The placebo group is sometimes called the non-intervention group.

This is a one-factor experiment, that is, it has only one explanatory variable, namely the fever reducing medication. The one factor has two levels (Drug X 325 mg and placebo). Figure 1 below is an outline of this design.

Example 1b – Completely Randomized Design
The example is similar to Example 1a except that there are four levels in the one factor. The researchers randomly assign the 1200 subjects into four treatment groups, Group 1 (300 subjects taking Drug X 325 mg), Group 2 (300 subjects taking Drug X 500 mg), Group 3 (300 subjects taking Drug X 650 mg) and Group 4 (300 subjects taking placebo). As in Example 1a, three hours after taking the treatments, the researchers compare the change in body temperature between the several treatment groups.

The various Drug X groups are called the intervention groups and the placebo group is called the non-intervention group. Figure 2 below illustrates this design.
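The random assignment in Examples 1a and 1b can be sketched in a few lines of code. This is only an illustration, not a prescribed procedure: the function name `completely_randomized`, the subject IDs 1 to 1200 and the seed are assumptions, and the sketch requires that the number of subjects divides evenly among the treatments.

```python
import random

def completely_randomized(subject_ids, treatments, seed=None):
    """Shuffle all subjects at once, then deal them into equal-sized groups."""
    shuffled = list(subject_ids)
    if len(shuffled) % len(treatments) != 0:
        raise ValueError("subjects must divide evenly among treatments")
    rng = random.Random(seed)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * size:(i + 1) * size]
        for i, treatment in enumerate(treatments)
    }

subjects = range(1, 1201)  # hypothetical subject IDs 1..1200

# Example 1a: two treatments, 600 subjects each
design_1a = completely_randomized(
    subjects, ["Drug X 325 mg", "placebo"], seed=1
)

# Example 1b: four treatments, 300 subjects each
design_1b = completely_randomized(
    subjects,
    ["Drug X 325 mg", "Drug X 500 mg", "Drug X 650 mg", "placebo"],
    seed=1,
)

print({t: len(group) for t, group in design_1a.items()})
print({t: len(group) for t, group in design_1b.items()})
```

The key feature of the completely randomized design is visible in the single shuffle: all subjects are randomized together, without regard to any of their characteristics.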

The Principles of Experimental Design
Let’s discuss the basic principles outlined in Figures 1 and 2. First, the principle of control. The placebo group is called the control group, the group of subjects who receive a dummy treatment. Why is the control group necessary? Why compare different Drug X groups with the placebo group? Why not just apply the new fever reducing medication to all patients? Without the control group, we do not know whether the favorable responses from the patients are due to the new medication or to the placebo effect. Some patients respond well to any treatment, even a placebo. However, with a control group alongside Drug X groups, both the placebo effect and other influences operate on both the control group and Drug X groups. The only difference between the groups is the varying levels of Drug X. Thus the purpose of having a control group is to prevent confounding.

Two variables are confounded when their effects on a response variable (reduction in fever in our examples) cannot be distinguished from one another. Without the control group as comparison, the effect of Drug X and the placebo effect on the response variable cannot be distinguished from one another. Other variables may also influence the response variable (these are called lurking variables or confounding variables). Without the control group, the effect of Drug X may be confounded with these lurking variables as well.
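A small simulation can make the confounding of the drug effect with the placebo effect concrete. Everything numeric here is an assumption made up for illustration: an average placebo effect of 0.4 degrees, an additional drug effect of 0.8 degrees, and normally distributed noise. None of these values come from a real study.

```python
import random
from statistics import mean

rng = random.Random(0)

PLACEBO_EFFECT = 0.4  # assumed average drop from simply being treated
DRUG_EFFECT = 0.8     # assumed additional drop caused by the drug itself

def temperature_drop(gets_drug):
    """Simulated drop in body temperature for one subject."""
    noise = rng.gauss(0, 0.5)  # assumed subject-to-subject variation
    return PLACEBO_EFFECT + (DRUG_EFFECT if gets_drug else 0.0) + noise

drug_group = [temperature_drop(True) for _ in range(600)]
placebo_group = [temperature_drop(False) for _ in range(600)]

# Without a control group, the drug effect and the placebo effect are
# mixed together in the average drop of the drug group.
uncontrolled_estimate = mean(drug_group)

# With a control group, the placebo effect appears in both groups and
# cancels out in the comparison.
controlled_estimate = mean(drug_group) - mean(placebo_group)

print(round(uncontrolled_estimate, 2), round(controlled_estimate, 2))
```

In this sketch the uncontrolled estimate lands near the sum of the two assumed effects, while the controlled comparison lands near the assumed drug effect alone.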

The first principle of experimental design is control. We have just illustrated the simplest form of control, that is, the comparison of two or more treatments (other forms of control will be discussed below). The purpose of comparing treatments is to prevent the effect of the explanatory variables (the effect of the new fever reducing medication in our examples) from being confounded with the placebo effect and other lurking variables.

The second principle of experimental design is randomization. Notice that the patients are assigned to either the Drug X groups or the placebo group through the use of random chance (conceptually, think of drawing names from a hat). The goal of randomization is to produce treatment groups that are similar (except for chance variation) before the treatments begin.

The third principle of experimental design is repetition, which refers to the practice of applying the treatments to many experimental units. The goal of repetition is to reduce the role of chance variation in the results of the experiment. For example, if each treatment group has only one patient, the results would depend too much on which group gets lucky and is assigned a patient who is less sick (e.g. with a milder fever). If we assign many patients to each group, it is unlikely that all patients in the Drug X groups will be less sick.
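The effect of repetition can be illustrated with a simulation in which the treatment has no real effect at all, so any difference between two group means is pure chance. The standard normal response, the group sizes and the number of simulated experiments are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

def spread_of_difference(group_size, n_experiments=2000, seed=0):
    """Standard deviation of (mean of group A - mean of group B)
    across many simulated experiments with NO real treatment effect."""
    rng = random.Random(seed)
    differences = []
    for _ in range(n_experiments):
        group_a = [rng.gauss(0, 1) for _ in range(group_size)]
        group_b = [rng.gauss(0, 1) for _ in range(group_size)]
        differences.append(mean(group_a) - mean(group_b))
    return stdev(differences)

for n in (1, 10, 100):
    print(n, round(spread_of_difference(n), 3))
```

Under these assumptions the spread should shrink roughly like the square root of 2/n, so larger groups leave much less room for a "lucky" group to produce a misleading difference.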

Prevention of Bias
Control (in particular, comparison of treatments) and randomization together prevent bias (i.e. systematic favoritism). For example, because of the placebo effect, uncontrolled experiments in medicine can make new medications or new therapies appear more successful than they are. If patients are not assigned to treatment groups by chance, the subjects in the new medication group and the placebo group may not have similar characteristics and thus the results may become biased. For example, randomization prevents the researchers from assigning the sicker patients to the new medication groups in an effort to help them. With randomization, there is also no bias resulting from some patients opting to take the new medication. In a randomized controlled experiment, neither the experimenters nor the participants choose the treatments.

In clinical trials involving medication, another way to prevent bias is through the technique of blinding, which refers to not disclosing which treatment a subject is receiving. There are two types of blinding. A single-blind experiment is one in which the subject does not know which treatment he or she is receiving. A double-blind experiment is one in which neither the subject nor the medical personnel in contact with the subject know which treatment the subject is receiving.

The double-blind technique avoids unconscious bias. In such an experiment, neither the medical personnel nor the subject can adjust their behavior in ways that may bias the results (e.g. a researcher who knows that a patient is taking a placebo may assume that it cannot help the patient).

Summary – Completely Randomized Design
The designs described in both Example 1a and Example 1b are called completely randomized designs and are the simplest statistical designs for experiments. These designs incorporate all three principles of control, randomization and repetition. A completely randomized design incorporates the simplest form of control, namely comparison. The goal of comparing different treatments is to prevent the confounding of the explanatory variables with lurking variables. The purpose of randomization is to produce treatment groups that are similar (except for chance variation) before the treatments begin. Comparison and randomization together prevent bias. The goal of repetition is to reduce the role of chance variation in the results of the experiment.

However, completely randomized designs are often inferior to more elaborate designs. The reason is that not all potential confounding variables are necessarily removed. For example, men and women may respond differently to medication. In the completely randomized designs in Examples 1a and 1b, the random assignment to treatment groups is done without regard to gender. These two examples ignore the differences between men and women. Though the patients are assigned by random chance to the treatment groups, it is possible that one treatment group is assigned more men than women. A better design will look separately at the responses of men and women. In other words, the researchers will separate the men from the women and then randomly assign the subjects within each gender group to the different treatment groups. This is called the randomized block design.
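To get a feel for how much gender imbalance chance alone can produce in a completely randomized design like Example 1a, one can repeat the random assignment many times and count the men in Group 1. This is a rough sketch; the number of repetitions and the use of the seed as a repetition index are arbitrary choices made for illustration.

```python
import random

def men_in_group_1(seed):
    """Randomly split 600 men and 600 women into two groups of 600
    and return the number of men who land in Group 1."""
    rng = random.Random(seed)
    genders = ["M"] * 600 + ["F"] * 600
    rng.shuffle(genders)
    return genders[:600].count("M")

counts = [men_in_group_1(seed) for seed in range(1000)]

# A perfectly balanced group would contain exactly 300 men.
share_exactly_balanced = sum(c == 300 for c in counts) / len(counts)

print(min(counts), max(counts))
print(share_exactly_balanced)
```

The counts cluster around 300 but are rarely exactly 300, which is the chance imbalance that blocking on gender removes.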

Example 2 – Randomized Block Design
The 1200 subjects are assigned to blocks, based on gender. Then subjects within each block are randomly assigned to the two treatment groups (Drug X 325 mg, and Placebo). The variable of gender is called a blocking variable. Three hours after taking the treatments, the researchers compare the change in body temperature between the treatment groups within each block. Figure 3 below outlines this randomized block design.

The randomized block design in this example is an improvement over the completely randomized design in Example 1a. In both Example 1a and Example 2, comparison of treatment groups is used to implicitly prevent confounding. However, the randomized block design in Example 2 explicitly controls the variable of gender.
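The assignment in Example 2 can be sketched as follows. The function name `randomized_block`, the subject records and the seed are assumptions for illustration; the sketch also assumes that each block divides evenly among the treatments.

```python
import random

def randomized_block(subjects, block_key, treatments, seed=None):
    """Group subjects into blocks, then randomize separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_key(subject), []).append(subject)

    design = {}
    for block, members in blocks.items():
        rng.shuffle(members)  # randomization happens inside the block only
        size = len(members) // len(treatments)
        for i, treatment in enumerate(treatments):
            design[(block, treatment)] = members[i * size:(i + 1) * size]
    return design

# Hypothetical subject records: 600 men followed by 600 women.
subjects = [
    {"id": i, "gender": "M" if i <= 600 else "F"} for i in range(1, 1201)
]

design = randomized_block(
    subjects,
    block_key=lambda s: s["gender"],
    treatments=["Drug X 325 mg", "placebo"],
    seed=1,
)

for (block, treatment), members in design.items():
    print(block, treatment, len(members))
```

Because the shuffle is performed within each gender block, every treatment group is guaranteed the same number of men and the same number of women, which a single overall shuffle cannot guarantee.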

We can also create the blocked equivalent of Example 1b by randomly assigning the subjects in each block to four treatments (Drug X 325 mg, Drug X 500 mg, Drug X 650 mg, and Placebo). The outline of this design is omitted.

Summary – Randomized Block Design
A block is a group of experimental units that are known, prior to the experiment, to be similar with respect to some variables that are expected to affect the response to the treatments. In the randomized block design, the randomization to treatments is carried out separately within each block. Blocks are another form of control. The block design controls the variables that are used to form the blocks (these variables are called the blocking variables). In Example 2, the blocking variable is gender.

The third main type of design is the matched pairs design, which is a special case of the randomized block design. This design is applicable only when the experiment has exactly two treatments and the experimental units can be grouped into pairs according to some blocking variables. Consider the following example.

Example 3 – Matched Pairs Design
The 1200 subjects are grouped into 600 matched pairs. The subjects in each pair have the same gender and are of similar age. Within each matched pair, the two subjects are assigned by random chance to the two treatments (Drug X 325 mg and placebo). The advantage of this design is that it explicitly controls both age and gender. Each matched pair is like a block (based on age and gender). Randomization is done separately within each pair. Three hours after taking the treatments, the researchers compare the change in body temperature within each matched pair.
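One simple way to form such pairs, sketched below, is to sort the subjects of each gender by age, pair off neighbors in the sorted list, and then toss a coin within each pair. This pairing rule is an assumption made for illustration (the example does not specify how pairs are formed), as are the function name `matched_pairs` and the randomly generated ages.

```python
import random

def matched_pairs(subjects, seed=None):
    """Pair subjects of the same gender and adjacent age, then randomly
    decide which member of each pair receives which treatment."""
    rng = random.Random(seed)
    pairs = []
    for gender in ("M", "F"):
        same_gender = sorted(
            (s for s in subjects if s["gender"] == gender),
            key=lambda s: s["age"],
        )
        for i in range(0, len(same_gender) - 1, 2):
            pair = [same_gender[i], same_gender[i + 1]]
            rng.shuffle(pair)  # the "coin toss" within the pair
            pairs.append({"Drug X 325 mg": pair[0], "placebo": pair[1]})
    return pairs

# Hypothetical subjects: 600 men, 600 women, ages drawn from 18 to 70.
age_rng = random.Random(0)
subjects = [
    {"id": i, "gender": "M" if i <= 600 else "F", "age": age_rng.randint(18, 70)}
    for i in range(1, 1201)
]

pairs = matched_pairs(subjects, seed=1)
print(len(pairs))
print(pairs[0])
```

The analysis would then be based on the 600 within-pair differences in temperature drop rather than on two overall group averages.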

Summary – The Matched Pairs Design
The matched pairs design is, in some ways, superior to the completely randomized design and the randomized block design. Its limitations are that it can compare only two treatments and that the experimental units must be matched in pairs (thus requiring more work on the part of the experimenters). Because matched subjects are more similar than unmatched subjects, the matched pairs design can explicitly control the variables that are used to form the pairs.

Randomization remains important in the matched pairs design. For example, which one of the subjects in a matched pair uses Drug X is decided by a coin toss. In contrast, in a completely randomized design, random chance is used to assign all the subjects all at once to the treatment groups. In a randomized block design, the random assignment is done separately within each block.

One common variation of the matched pairs design applies both treatments on the same subject. In such a design, each subject serves as his or her own control.

Conclusion
One important advantage of experiments over observational studies is that well designed experiments can provide good evidence for causation. In an experiment, an intervention (Drug X in our examples) is applied to enough experimental units that the results are unlikely to be driven by chance variation (the principle of repetition). The experimental units are randomly assigned to an intervention group and a non-intervention group (placebo group). This reflects the principles of randomization and control, which help reduce the potential for bias and prevent confounding by increasing the chance that confounding variables will operate equally on the intervention group and the placebo group. Then the only systematic difference between the intervention group and the placebo group is the intervention. When the intervention group experiences favorable results, we can be confident that the intervention makes the difference.

Reference

1. Moore, D. S., McCabe, G. P., Craig, B. A., Introduction to the Practice of Statistics, 6th ed., W. H. Freeman and Company, New York, 2009.