Statistics and Empirical Legal Studies Research Guide

Research Design

The research design process has at least the following basic components: asking research questions; reviewing the existing literature; developing theoretical answers to those questions and observable implications of those theories; thinking of rival hypotheses; and developing measures that allow you to assess the observable implications of both your theories and the rival theories. Epstein & Martin 19 (2014); Ryan 152-157 (2015).

Asking Research Questions

Good research questions have real world implications and contribute to existing knowledge. Reviewing existing literature for gaps is one way to develop good research questions. Some other sources of good research questions include discrepancies between what scholars think they know and what they observe, current events (e.g., is a new law having the intended effect?), and commissioned research. Epstein & King 55 (2002). For example, imagine that some states are beginning to consider stricter gun control laws in response to recent mass shootings. Such a current event might inspire the research question, Do stricter gun control laws reduce gun violence?

Reviewing the Literature

As should be clear from the previous section on asking research questions, empirical study does not occur in isolation. New studies should address questions left unanswered by previous scholarship or otherwise contribute to "a scholarly conversation." Ryan 153. You have to review existing literature to participate in that conversation. A good literature review should demonstrate an understanding of standard research methods in the field and "the state of discovery about a particular topic." Id. See also Lawless et al. 20 (2016). It is not sufficient to search only law review articles and legal books for the literature review; empirical legal studies may also appear in the literature of political science, economics, psychology, sociology, or other disciplines.

Developing Theories and Their Observable Implications

Once you have a research question, you should develop one or more theories to answer it. A theory is "a reasoned and precise speculation about the answer to a research question." Epstein & Martin 31. As with question development, reviewing existing literature can help you develop theoretical answers.

Because it is usually impossible to directly test a theory, you must instead test "observable implications" that follow from it. Observable implications generally take the form of "a claim about the relationship between (or among) variables that we can . . . observe." Observable implications are also known as hypotheses. See Ryan 153. A variable is something that varies, and variables come in two types: dependent variables (the outcomes you are trying to explain) and independent variables (the events or factors that you think explain the dependent variables). Epstein & Martin 35; Ryan 152.

For example, consider our research question, Do stricter gun control laws reduce gun violence? One theoretical answer might be, The stricter a state's gun control laws, the less gun violence the state will have (assuming everything else is equal). You can't directly test the theory, because you can't personally alter state gun control laws to see what happens next. But you could test an observable implication of the theory: If the theory is correct, states with more restrictions on gun ownership should have lower rates of gun-related homicide than otherwise similar states that have fewer restrictions. The gun-related homicide rate in different states is the dependent variable, the outcome of interest to the researcher. Differing state laws on gun ownership are the independent variable, since they potentially explain the states' differing gun-related homicide rates.

Assessing Rival Hypotheses

In choosing variables for your study, it is not enough to examine only variables that flow directly from the observable implications of your preferred theory. You also need to consider variables derived from the observable implications of rival theories. Lawless et al. 21. "[I]t is only by posing sufficient challenges to its theory (and its observable implications) that research can make the strongest possible case." Epstein & King 76. Imagine if you published your empirical study of the relationship between state gun control laws and the gun-related homicide rate, but you failed to consider other possible factors that might influence gun violence, such as differing rates of illegal drug use or different average sentences for homicide. You would be leaving your theory open to obvious avenues of attack. Instead of ignoring other possible explanations, you should try to control for them. For example, you could gather data on drug use and sentence lengths as well as gun control laws. Then you could compare the gun-related homicide rates of two or more states that are similar in all variables except their gun control laws.
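The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state records, the variable names, and the tolerances that define "similar" are all hypothetical, and real studies typically rely on more formal techniques such as regression or statistical matching.

```python
from itertools import combinations

# Hypothetical records: every value below is invented for illustration only.
states = [
    {"name": "A", "strict_laws": True,  "drug_use_pct": 8.0,  "avg_sentence_yrs": 20.0},
    {"name": "B", "strict_laws": False, "drug_use_pct": 8.2,  "avg_sentence_yrs": 19.5},
    {"name": "C", "strict_laws": False, "drug_use_pct": 12.5, "avg_sentence_yrs": 12.0},
    {"name": "D", "strict_laws": True,  "drug_use_pct": 12.1, "avg_sentence_yrs": 12.5},
]

# Assumed tolerances: the largest gap on each control variable that still
# counts as "similar." A researcher would need to justify these choices.
TOLERANCES = {"drug_use_pct": 0.5, "avg_sentence_yrs": 1.0}

def similar_on_controls(a, b, tolerances=TOLERANCES):
    """Return True if two states are within tolerance on every control variable."""
    return all(abs(a[key] - b[key]) <= tol for key, tol in tolerances.items())

def matched_pairs(records):
    """Pair states that differ on gun law strictness but are similar on controls."""
    pairs = []
    for a, b in combinations(records, 2):
        if a["strict_laws"] != b["strict_laws"] and similar_on_controls(a, b):
            strict, lenient = (a, b) if a["strict_laws"] else (b, a)
            pairs.append((strict["name"], lenient["name"]))
    return pairs

print(matched_pairs(states))
```

Each resulting pair holds the control variables roughly constant, so a difference in gun-related homicide rates within a pair is less likely to be explained by drug use or sentencing.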

Developing Measures

Once you have decided which variables to observe in testing your theory, you must next determine how to measure them. "Strict gun control laws" doesn't have a universal definition; you have to translate the abstract variable "strictness" into some concrete indicator(s). Epstein & Martin 42. One possibility might be to list all the different types of gun control laws you observe across different states, and record whether each exists in the states you are studying: Are background checks required? Are waiting periods imposed? Must gun owners be trained and licensed? Another possibility might be to examine penalties such as sentence lengths or fine amounts for violations of relevant legal restrictions.

After you have decided how to measure your concept, you need to identify the values of the measure. Id. at 43. For example, if you note the presence or absence of different gun control laws, do you simply add up the number of laws present in each state and call that a "strictness score"? Or should some types of laws be weighted more heavily than others? If you adopt a numeric scoring system, it should be clearly described so that others can apply it. Instead of assigning numeric scores, you might assign categorical labels to states based on which restrictions they impose on gun ownership, e.g., "least restrictive," "somewhat restrictive," "moderately restrictive," and "most restrictive." Epstein & King 82. If you use categorical labels, you should define them specifically enough that other researchers can apply them unambiguously. If other researchers cannot consistently apply your measurement scheme, your study will not be replicable.
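One way to make a scoring scheme explicit enough for others to apply is to write it down as code. The following Python sketch is an assumption-laden illustration, not a scheme drawn from the cited sources: the three provision names, the weights, and the mapping from counts to labels are all hypothetical.

```python
# Illustrative sketch only: the provision names, weights, and label cut-offs
# are assumptions made for this example, not a scheme from the cited sources.
PROVISIONS = ("background_checks", "waiting_period", "training_and_license")

LABELS = (
    "least restrictive",       # 0 provisions present
    "somewhat restrictive",    # 1 provision present
    "moderately restrictive",  # 2 provisions present
    "most restrictive",        # all 3 provisions present
)

def additive_score(laws):
    """Unweighted strictness score: the count of listed provisions present."""
    return sum(1 for p in PROVISIONS if laws.get(p, False))

def weighted_score(laws, weights):
    """Weighted strictness score: the sum of weights for provisions present."""
    return sum(weights[p] for p in PROVISIONS if laws.get(p, False))

def categorical_label(laws):
    """Map the unweighted count onto one of four ordered labels."""
    return LABELS[additive_score(laws)]

# Hypothetical state and hypothetical weights.
example_state = {
    "background_checks": True,
    "waiting_period": True,
    "training_and_license": False,
}
example_weights = {
    "background_checks": 3,
    "waiting_period": 1,
    "training_and_license": 2,
}

print(additive_score(example_state))
print(weighted_score(example_state, example_weights))
print(categorical_label(example_state))
```

Writing the rules this way forces each choice (which provisions count, how much each weighs, where the category boundaries fall) to be stated explicitly, which is what allows another researcher to reproduce the coding.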

The measures you adopt for your research will determine what information is recorded about the object of study. Everything else will be lost. Id. at 46. You should therefore be careful to measure enough dimensions of your subject to capture all the parts that are essential to your research question. In our gun control example, if we select a measure based on the number of different possible types of gun control provisions and we fail to identify some of those possible types, some information will be lost during data collection.

In empirical research, the appropriateness of a measure is evaluated on the basis of its validity and its reliability. Validity is the extent to which a measure reflects the concept it is measuring. Epstein & King 87. There are several different types of validity, including internal validity (the extent to which the research design allows the drawing of valid inferences about the relationships between variables, Lawless et al. 30); external validity (the extent to which research findings can be generalized beyond the current study, Lawless et al. 34); and construct validity (the extent to which the measures used to "operationalize" variables adequately capture the construct being studied, Lawless et al. 35). Reliability is the extent to which a measure produces "the same value (regardless of whether it is the right one) on the same standard for the same subject at the same time." Epstein & King 83; see also Lawless et al. 37. If the same person steps on the same scale three times, and all three times the scale says the person weighs 150 pounds, then the scale's measurement of weight is reliable. Nevertheless, if the person actually weighs 165 pounds, the scale's measurements are reliable but not valid. A good research design will include some method for evaluating the validity and reliability of its measures.
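The scale example can be expressed numerically. In this minimal Python sketch, the readings and the true weight come from the example above. Treating the spread of repeated readings as a rough indicator of reliability, and the gap between the average reading and the true value as a rough indicator of one aspect of validity, is a simplifying assumption made for illustration.

```python
from statistics import mean, pstdev

# Values from the scale example: three identical readings and the actual weight.
readings = [150, 150, 150]
true_weight = 165

def spread(values):
    """Population standard deviation of repeated readings (0 means perfectly consistent)."""
    return pstdev(values)

def bias(values, truth):
    """Average reading minus the true value (0 means no systematic error)."""
    return mean(values) - truth

print(spread(readings))
print(bias(readings, true_weight))
```

A spread of zero shows that the scale is consistent, while a nonzero bias shows that it is consistently wrong, which is why reliability alone does not establish validity.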