Lesson Overview: Taking Experiments Out of the Box
This lesson for social science research methods at the University of Maine at Augusta considers the classic research form of the experiment, as presented in your textbook The Process of Social Research. As we saw in the previous lesson, survey research suffers from many problems and pitfalls that can lead to useless answers when careful samples are not gathered and careful questions are not asked. When a high-quality sample is unattainable, experiments may be a useful alternative. Within certain limits, experiments can provide useful insights even when working with a non-representative group of experimental subjects. However, as with surveys, knowing what those limits are is essential to carrying out an experiment well. In this lesson’s “Do It Yourself” exercise, I challenge you to pull off a seemingly simple and popular experiment with success. Can you deliver the Pepsi Challenge?
Control: the Strength and the Weakness of Experimental Methods
If you think about it a little bit, you may realize that observational and survey methods have an approach to the idea of control that is quite different from the approach to control in experiments. In administering surveys or engaging in qualitative participant observation, social scientists (with perhaps the unscrupulous exception of push pollers) do not attempt to alter the behavior of those they study. Instead, they attempt to measure various behaviors, attitudes and expressions of those they study, and then ask how variation in one variable is associated with variation in another variable: positively, negatively, or not at all? To the extent that survey or observational researchers exercise control at all, it is through the controlled selection of samples or cases, or through the controlled analysis of resulting data to look for statistical patterns. If observational research is successful, the act of observation has minimal direct impact on the people actually being studied.
In experiments, by contrast, the point of the exercise is to produce a consistent, demonstrable change in the social world by very carefully manipulating the world. By administering some stimulus or treatment, the goal of an experiment is to determine whether such a stimulus or treatment has a desired effect on subjects’ behavior. Experimental methods are focused on the idea of controlling the experimental setting so it can be securely determined that the stimulus itself is producing changes in behavior, not some other aspect of a situation. Control occurs in various aspects of the setup of an experiment:
- The practice of random assignment of subjects to different experimental groups controls the possible effect of pre-existing differences by distributing those differences randomly between the groups.
- The use of pre-tests and post-tests allows experimental researchers to further control for pre-existing differences by studying only the change in subjects’ behavior over the course of an experiment.
- The use of a control group provides further control by allowing experimenters to test for the possibility that simple passage of time in an experimental setting leads to change. If a control group and an experimental group both change in the same way over time, a researcher may conclude that the experimental stimulus isn’t causing the change.
- When a control group is assessed in comparison to an experimental or treatment group, the effect of a stimulus is compared to the effect of simply taking part in an experiment (a variety of the Hawthorne Effect).
The greatest strength of the experimental method, therefore, is the ability to control conditions meticulously, at least in a traditional laboratory experiment. Setting up an experiment is like staging a play in which, if all props, lights, colors and sounds are managed properly, all attention is focused on the important variable at hand. If a researcher is sufficiently theatrical, it may be difficult for the subject of an experiment to tell exactly what the dependent variable of the experiment is, which can be a useful feature in research. Take the frequent instruction in psychological experiments for participants to count backward from one hundred by threes or sevens. The subject may perceive that the calculation of subtractions itself generates some explanatory variable. Actually, this is a distractor task, meant to disrupt concentration, the practicing of skills, or the commitment of information to memory.
A more uncertain example of control in experiments comes from my days as an undergraduate student, when I participated in a large number of experiments to help pay my way through school (from one experiment I still bear a literal scar). In perhaps the oddest experiment I can recall, I was asked to drink ten cups of coffee a day, then stop all caffeine intake for one day, and then enter a psychology lab to take memory tests. I still remember my massive, pounding headache as I walked across campus in the rain for those final tests. But I also still wonder whether those memory tests were beside the point, because as I sat in a bare waiting room at the appointed time, I was confronted by an open surgery journal with color photographs of brain surgery, a hand split by an axe and a lacerated face. These were all splayed on the coffee table in front of me as I waited for the (I thought at the time) tardy professor to show up. Was the real dependent variable in the experiment the extent of my reaction to these gruesome sights and to a tardy professor? I still am not sure whether the violent images and tardy professor were examples of poor or exquisite control of the experimental stage, but in combination with caffeine withdrawal they certainly had their effects: nausea and impatient annoyance. If this was the point of the experiment and some hidden camera captured my reaction, I applaud the researcher for his ingenuity in provoking my response. However, if all this was beside the point and the outcome of interest really was that memory test, then whatever results I provided were undoubtedly tarnished by the confounding variables of my reaction to the waiting room. In experiments, with the great opportunity to exert control comes a great responsibility to exert control. With the freedom to design an experiment in which an experimental stimulus is the only aspect that changes comes the burden of ensuring that an experimental stimulus is the only aspect that changes.
Experimental methods are predicated on the notion that an extreme amount of control over setting, props and proceedings is a good thing for research, because it allows one and only one variable — the stimulus serving as an independent variable — to change. But there is an adjective we use for a hyper-controlled, scripted proceeding on a stage — theatrical — and it is no mistake that one of the meanings of the adjective theatrical is artificial. Experiments are an artifice, not an observation of how the world really works. You wouldn’t mistake a piece of theater for real life, and many experimental researchers worry that you shouldn’t mistake behavior in experiments for behavior in natural settings. Experiments present subjects with unreal constraints in unreal settings, and for that reason their results may not be generalizable, which is to say that experimental subjects may act very differently in the lab than they would in everyday life.
Moving Outside the Box: Field Experiments
How can the best aspect of experimental settings (control) be combined with the best aspect of observational studies (real behavior in real life)? In recent years, new appreciation has developed for an innovative variety of experiment called the “field experiment.” In field experiments, researchers find some aspect of everyday social existence that can be controlled outside the laboratory and that can be subjected to some kind of experimental treatment. If that sounds tricky, well, it is, but a surprising number of young researchers are finding new ways to apply experimental methods in the field.
A particularly popular new form of field experiment is called an audit study. In an audit study, pairs of well-trained actors are taught to dress the same way, use the same words, adopt the same body language, use the same interaction strategies, and are given documents indicating the same level of education, income, experience, neighborhood reputation, and other indicators of status. The actors in these pairs differ in just one way: one may be a man and the other a woman. One may have dark skin and the other may have light skin. In these audit studies, that one difference in status acts to create a control condition and an experimental stimulus condition. In an inversion of a traditional experiment, in which people are brought into the laboratory, these actors are sent out into the world — to act the same way but to vary in status. If one of the pair of actors is consistently treated differently, then it is reasonable to conclude that the reason for the difference in treatment is not a difference in behavior, or education, or income, or language, or neighborhood (all of which are the same), but rather a difference in status.
Audit studies exist to examine the impact of all sorts of status in society, but the list of audit studies (also known as “field experiments”) demonstrating racial discrimination is particularly large and continues to grow (Ayres and Siegelman 1995; Bertrand and Mullainathan 2004; Pager 2003; Pager and Quillian 2005; Pager et al. 2009; Schreer et al. 2009). An October 2008 audit study involved a set of e-mail messages sent to 4,859 state legislators across the country (Butler and Broockman 2011). These e-mail messages were sent in sets, within which all characteristics of the message were held constant with two exceptions — that of race and that of political party. Here’s a template that the researchers used when they sent out e-mails; items within brackets are variables altered by the researchers:
Reference to party is straightforward: an e-mail message sent out refers explicitly to future Democratic or Republican party primary elections. Race is represented more subtly, but no less powerfully, by the fictitious sender’s name: “Jake Mueller” or “DeShawn Jackson.” These names were chosen because research has shown them to be highly racialized: the names Jake and Mueller are strongly associated with a white identity, and the names DeShawn and Jackson are strongly associated with a black identity. “Jake Mueller” received responses 5.1 percentage points more often than “DeShawn Jackson.” This overall effect masks important variation according to the race of the legislator. Republican white state legislators responded to “Jake Mueller” 7.6 percentage points more often than to “DeShawn Jackson,” and Democratic white state legislators responded to “Jake Mueller” 6.8 percentage points more often than to “DeShawn Jackson.” Democratic non-white state legislators, on the other hand, responded to “DeShawn Jackson” 16.5 percentage points more often than to “Jake Mueller.” This pattern indicates that state legislators practice racial discrimination that benefits constituents who are like them and disadvantages constituents who are unlike them.
In research published in 2015 in the journal Social Forces, sociologist S. Michael Gaddis sent out 1,008 fake job applications in which two features varied: the college or university from which an applicant graduated and the name an applicant used. Names were identified as “racialized” if they were strongly associated with black identity (DaQuan, Ebony, Jalen, Lamar, Nia, and Shanice) or white identity (Aubrey, Caleb, Charlie, Erica, Ronny and Lesly). The fake applicants’ alma maters were grouped into two categories: high-prestige universities such as Duke, Harvard or Stanford, and “second-tier universities” that are respected but not as well-ranked (University of California-Riverside and University of North Carolina-Greensboro were two such universities). The quality of applicants’ records, and of the applications themselves, was held equal within pairs; only names and university names varied (Gaddis 2015).
The dependent variable in Gaddis’ research was whether these fictitious job applications would receive a response. Here are the rates of positive responses from employers in Gaddis’ study:
As you can see, there is an added value of whiteness; applicants with Black names who graduated from elite universities obtained employer responses at about the same rate as White applicants from second-tier universities. When comparing Black and White applicants claiming graduation from universities of the same status, Black applicants received responses about 5% less often than their White peers.
Gaddis describes a kind of stratification as students exit the educational system, but what about stratification while students are trying to enter the system? Katherine Milkman and her colleagues (2012) also used audit-study techniques, matching equally qualified, similarly communicating students, to uncover stratification in graduate school admission. In order to get into graduate school, you must communicate effectively with that graduate school to obtain necessary information. Do graduate schools treat prospective students differently according to irrelevant characteristics? Do graduate schools discriminate? To find out, Milkman’s research team sent e-mail messages from fictional prospective students to the professors directing 6,300 graduate programs in the United States. Names were selected that strongly indicated applicants’ race and gender. The request from applicants: could we meet in one week’s time? Milkman and her colleagues wanted to find out how quickly professors would respond to these requests. The results:
The disparity you see here is discrimination, and it is demonstrated not in a laboratory but in the real world through the adaptation of experimental method through audit studies.
DIY Activity #10: Give the Pepsi Challenge
This lesson ends on a somber yet important note, but for the Do-It-Yourself Activity associated with this lesson, we’ll return to the whimsical. Watch the video below, which presents a classic 1981 television commercial:
In this video, actor Gabe Kaplan carries out an experiment called the “Pepsi Challenge,” in which people were asked to take a sip of Coca-Cola, take a sip of Pepsi, and then say which one they liked best.
Let’s imagine a variant of the Pepsi Challenge in which subjects aren’t asked which soda they like, but instead are simply asked to identify which of the two colas they’re presented with is Pepsi and which is Coke. Cola identification is the experiment’s dependent variable, and its possible values are “correct” and “incorrect.” Let’s further imagine an independent variable, presence of ice, as a treatment variable in which ice cubes are either present or absent in the sample of soda presented to your experimental subjects.
Your mission in this DIY Activity is to carry out five sessions of this experiment testing the effect of the independent variable on the dependent variable. In your experiment, you must create and describe an experimental method that allows you to carry out a double-blinded test (you may need assistants). Report both your method and a table containing your experimental data, uploading your work to the area under the “DIY Activities” link of our course Blackboard page called “DIY Activity #10: Give the Pepsi Challenge.”
Ayres, Ian and Peter Siegelman. 1995. “Race and Gender Discrimination in Bargaining for a New Car.” American Economic Review 85(3): 304-321.
Bertrand, Marianne and Sendhil Mullainathan. 2004. “Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” National Bureau of Economic Research, No. w9873.
Butler, Daniel M. and David E. Broockman. 2011. “Do Politicians Racially Discriminate Against Constituents? A Field Experiment on State Legislators.” American Journal of Political Science 55(3): 463-477.
Gaddis, S. Michael. 2015. “Discrimination in the Credential Society: an Audit Study of Race and College Selectivity in the Labor Market.” Social Forces 93(4): 1451-1479.
Milkman, Katherine L., Modupe Akinola, and Dolly Chugh. 2012. “Temporal Distance and Discrimination: An Audit Study in Academia.” Psychological Science 23(7): 710-717.
Moss-Racusin, Corinne A., John F. Dovidio, Victoria L. Brescoll, Mark J. Graham and Jo Handelsman. 2012. “Science Faculty’s Subtle Gender Biases Favor Male Students.” Proceedings of the National Academy of Sciences 109(41): 16474-16479.
Pager, Devah. 2003. “The Mark of a Criminal Record.” American Journal of Sociology 108(5): 937-975.
Pager, Devah and Lincoln Quillian. 2005. “Walking the Talk? What Employers Say Versus What They Do.” American Sociological Review 70(3): 355-380.
Pager, Devah, Bruce Western and Bart Bonikowski. 2009. “Discrimination in a Low-Wage Labor Market: A Field Experiment.” American Sociological Review 74(5): 777–799.
Schreer, George E., Saundra Smith, and Kirsten Thomas. 2009. “Shopping While Black: Examining Racial Discrimination in a Retail Setting.” Journal of Applied Social Psychology 39(6): 1432-1444.