Field experiment
Field experiments are experiments carried out outside of laboratory settings. They are conducted in real-world settings, often unobtrusively, and the researcher controls not only the subject pool but also the selection of participants and the overtness of the treatment, as characterized in the framework of researchers such as John A. List. This is in contrast to laboratory experiments, which enforce scientific control by testing a hypothesis in the artificial and highly controlled setting of a laboratory. Field experiments also differ from naturally occurring experiments and quasi-experiments.[1] While naturally occurring experiments rely on an external force (e.g. a government, nonprofit, etc.) to control the randomization of treatment assignment and implementation, in field experiments researchers retain control over randomization and implementation. Quasi-experiments occur when treatments are administered as-if randomly (e.g. U.S. Congressional districts where candidates win with slim margins,[2] weather patterns, natural disasters, etc.).
In a field experiment, researchers randomly assign subjects (or other sampling units) to either treatment or control groups to test claims of causal relationships. Random assignment helps establish the comparability of the treatment and control group so that any differences between them that emerge after the treatment has been administered plausibly reflect the influence of the treatment rather than preexisting differences between the groups.
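As a rough illustration of this logic, the following Python sketch simulates a randomized experiment; the sample size, baseline covariate, and treatment effect are all hypothetical. Because assignment is independent of the baseline characteristic, the treatment and control groups are comparable in expectation, and a simple difference in mean outcomes recovers the treatment effect.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Simulate 1,000 subjects with a baseline covariate that also drives outcomes.
n = 1000
baseline = rng.normal(50, 10, size=n)

# Randomly assign half the sample to treatment: assignment is independent
# of the baseline, so the groups are comparable in expectation.
treated = rng.permutation(np.repeat([True, False], n // 2))

# Hypothetical outcome model with a true treatment effect of +2.0 units.
outcome = baseline + 2.0 * treated + rng.normal(0, 5, size=n)

# Difference-in-means estimates the average treatment effect without bias.
ate_hat = outcome[treated].mean() - outcome[~treated].mean()
print(f"Estimated ATE: {ate_hat:.2f} (true effect: 2.00)")
print(f"Baseline balance: {baseline[treated].mean() - baseline[~treated].mean():+.2f}")
```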
Field experiments encompass a broad array of experimental designs, each with varying degrees of generality. Some criteria of generality (e.g. authenticity of treatments, participants, contexts, and outcome measures) refer to the contextual similarities between the subjects in the experimental sample and the rest of the population. They are increasingly used in the social sciences to study the effects of policy-related interventions in domains such as health, education, crime, social welfare, and politics.
Characteristics
Under random assignment, the outcomes of field experiments reflect the real world because subjects are assigned to groups based on non-deterministic probabilities.[3] Two other core assumptions underlie the researcher's ability to collect unbiased potential outcomes: excludability and non-interference.[4][5] The excludability assumption holds that the only relevant causal agent is receipt of the treatment itself; asymmetries in the assignment, administration, or measurement of treatment and control groups violate this assumption. The non-interference assumption, or Stable Unit Treatment Value Assumption (SUTVA), holds that a subject's outcome depends only on whether that subject is assigned the treatment, not on whether other subjects are assigned to it. When these three core assumptions are met, researchers are more likely to produce unbiased estimates through field experiments.
After designing the field experiment and gathering the data, researchers can use statistical inference tests to determine the size and strength of the intervention's effect on the subjects. Field experiments also allow researchers to collect diverse types of data. For example, a researcher could design an experiment that uses pre- and post-trial information in an appropriate statistical inference method to determine whether an intervention has an effect on subject-level changes in outcomes.
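One common inference method for randomized designs is randomization inference, which tests the sharp null hypothesis of no effect for any subject by recomputing the estimate under many hypothetical re-randomizations. The sketch below uses entirely simulated data; the outcome model, group sizes, and effect size are illustrative assumptions, not values from any cited study.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Illustrative data: 200 subjects, first 100 treated, true effect of +1.0.
treated = np.arange(200) < 100
outcome = rng.normal(10, 3, size=200) + 1.0 * treated

observed = outcome[treated].mean() - outcome[~treated].mean()

# Randomization inference: under the sharp null of no effect for any subject,
# shuffling the treatment labels generates the reference distribution of
# differences in means that chance alone could produce.
sims = 10_000
null_diffs = np.empty(sims)
for i in range(sims):
    shuffled = rng.permutation(treated)
    null_diffs[i] = outcome[shuffled].mean() - outcome[~shuffled].mean()

# Two-sided p-value: share of re-randomizations at least as extreme as observed.
p_value = np.mean(np.abs(null_diffs) >= abs(observed))
print(f"Observed difference: {observed:.2f}, randomization p-value: {p_value:.4f}")
```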
Practical uses
Field experiments offer researchers a way to test theories and answer questions with higher external validity because they take place in real-world settings.[6] Compared to surveys and lab experiments, one strength of field experiments is that they can test people without their being aware that they are in a study, an awareness that could otherwise influence how they respond (the "Hawthorne effect"). For example, researchers posted different types of employment ads in a field experiment to test people's preferences for stable versus exciting jobs, as a way to check the validity of people's responses to survey measures.[7]
Some researchers argue that field experiments are a better guard against potential bias and biased estimators. Field experiments can also act as benchmarks for comparing observational data to experimental results: using field experiments as benchmarks can help determine levels of bias in observational studies, and, since researchers often develop a hypothesis from an a priori judgment, benchmarks can add credibility to a study.[8] While some argue that covariate adjustment or matching designs might work just as well in eliminating bias, field experiments can increase certainty[9] by reducing omitted variable bias, since randomization balances both observed and unobserved factors across groups.[10]
Researchers can utilize machine learning methods to simulate, reweight, and generalize experimental data.[11] This increases the speed and efficiency of gathering experimental results and reduces the cost of implementing the experiment. Another cutting-edge technique in field experiments is the multi-armed bandit design,[12] along with similar adaptive designs for experiments with variable outcomes and variable treatments over time.[13]
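As a sketch of the idea behind such adaptive designs, the following Python example implements basic Thompson sampling for a Bernoulli multi-armed bandit, in the spirit of the Bayesian approach described by Scott.[12] The three treatment arms and their response rates are hypothetical; this is not the specific design of any cited study. Each incoming subject is assigned to the arm whose posterior draw looks best, so assignment concentrates on the most effective treatment as evidence accumulates.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical true response rates for three treatment arms (unknown to the
# experimenter); the design should concentrate assignment on the best arm.
true_rates = np.array([0.05, 0.08, 0.12])
successes = np.zeros(3)
failures = np.zeros(3)

for subject in range(5000):
    # Thompson sampling: draw from each arm's Beta posterior (uniform prior)
    # and assign the incoming subject to the arm with the highest sampled rate.
    sampled = rng.beta(successes + 1, failures + 1)
    arm = int(np.argmax(sampled))
    reward = rng.random() < true_rates[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

print("Assignments per arm:", (successes + failures).astype(int))
print("Posterior mean rates:", np.round(successes / (successes + failures), 3))
```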
Limitations
There are limitations of, and arguments against, using field experiments in place of other research designs (e.g. lab experiments, survey experiments, observational studies, etc.). Given that field experiments necessarily take place in a specific geographic and political setting, there is a concern about extrapolating the outcomes to formulate a general theory regarding the population of interest. However, researchers have begun to find strategies to generalize causal effects outside of the sample by comparing the environments of the treated population and the external population, drawing on information from larger sample sizes, and accounting for and modeling treatment effect heterogeneity within the sample.[14] Others have used covariate blocking techniques to generalize from field experiment populations to external populations.[15]
Noncompliance issues affecting field experiments (both one-sided and two-sided noncompliance)[16][17] can occur when subjects who are assigned to a certain group never receive their assigned intervention. Other data collection problems include attrition (where subjects drop out of the study and do not provide outcome data), which, under certain conditions, will bias the collected data. These problems can lead to imprecise data analysis; however, researchers who use field experiments can use statistical methods to calculate useful information even when these difficulties occur.[17]
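To illustrate one standard correction, the sketch below simulates one-sided noncompliance and computes both the intent-to-treat (ITT) effect and the Wald instrumental-variable estimate of the complier average causal effect (the LATE discussed in the cited work[17]). The take-up rate and effect size are hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Hypothetical one-sided noncompliance: some subjects assigned to treatment
# never take it up; nobody in the control group can access the treatment.
n = 2000
assigned = rng.permutation(np.repeat([True, False], n // 2))
complier = rng.random(n) < 0.6          # 60% would take up treatment if assigned
took_treatment = assigned & complier

# Outcomes: treatment itself (not mere assignment) raises the outcome by 1.5.
outcome = rng.normal(0, 2, size=n) + 1.5 * took_treatment

# Intent-to-treat effect: compares groups as assigned, diluted by noncompliance.
itt = outcome[assigned].mean() - outcome[~assigned].mean()

# Wald (instrumental-variable) estimator: ITT divided by the take-up difference
# recovers the complier average causal effect (LATE) under standard assumptions.
take_up_diff = took_treatment[assigned].mean() - took_treatment[~assigned].mean()
late = itt / take_up_diff
print(f"ITT: {itt:.2f}, take-up difference: {take_up_diff:.2f}, LATE: {late:.2f}")
```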
Using field experiments can also lead to concerns over interference[18] between subjects. When a treated subject or group affects the outcomes of the nontreated group (through displacement, communication, contagion, etc.), the nontreated group's measured outcome may no longer be its true untreated outcome. A subset of interference is the spillover effect, which occurs when the treatment of treated groups affects neighboring untreated groups.
Field experiments can be expensive, time-consuming to conduct, difficult to replicate, and plagued with ethical pitfalls. Subjects or populations might undermine the implementation process if there is a perception of unfairness in treatment selection (e.g. in 'negative income tax' experiments, communities may lobby for their community to receive a cash transfer, so that assignment is not purely random). There are limits to collecting consent forms from all subjects. Partners administering interventions or collecting data could contaminate the randomization scheme. The resulting data, therefore, could be more varied: larger standard deviation, less precision and accuracy, etc. This leads to the use of larger sample sizes for field testing. However, others argue that, even though replicability is difficult, if the results of an experiment are important, there is a larger chance that the experiment will be replicated. Field experiments can also adopt a "stepped-wedge" design that eventually gives the entire sample access to the intervention on different timing schedules, as in the sketch below.[19] Researchers can also design a blinded field experiment to remove possibilities of manipulation.
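As a minimal sketch of a stepped-wedge rollout, the following Python example builds an assignment schedule in which every cluster starts untreated and crosses over to treatment at a randomly ordered period; the numbers of clusters and periods are arbitrary illustrations, not parameters from the cited trial literature.

```python
import numpy as np

rng = np.random.default_rng(seed=11)

# Hypothetical stepped-wedge rollout: 6 clusters, 7 periods. Every cluster
# starts in control and crosses over to treatment at a randomly ordered step,
# so the entire sample eventually receives the intervention.
clusters, periods = 6, 7
crossover = rng.permutation(np.arange(1, clusters + 1))  # switch period per cluster

schedule = np.zeros((clusters, periods), dtype=int)
for c, start in enumerate(crossover):
    schedule[c, start:] = 1  # 1 = treated from the crossover period onward

print("Crossover periods:", crossover)
print(schedule)  # rows: clusters; columns: periods (0 = control, 1 = treated)
```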
Examples
The history of experiments in the lab and the field has left longstanding impacts in the physical, natural, and life sciences. The modern use of field experiments has roots in the 1700s, when James Lind utilized a controlled field experiment to identify a treatment for scurvy.[20]
Examples of disciplines that use field experiments include:
- Economists have used field experiments to analyze discrimination (e.g., in the labor market,[21][22] in housing,[23] in the sharing economy,[24] in the credit market,[25] or in integration[26]), health care programs,[27] charitable fundraising,[28] education,[29] information aggregation in markets, and microfinance programs.[30]
- Engineers often conduct field tests of prototype products to validate earlier laboratory tests and to obtain broader feedback.
- Researchers in social psychology often use field experiments, such as Stanley Milgram's lost-letter experiment and Robert Cialdini's door-in-the-face study.[31]
- Agricultural science researcher R. A. Fisher analyzed data from randomized experiments conducted on actual crop fields.[32]
- Political science researcher Harold Gosnell conducted an early field experiment on voter participation in 1924 and 1925.[33]
- Ecology researcher Joseph H. Connell conducted influential field experiments on rocky seashores.[34]
References
- ^ Meyer, B. D. (1995). "Natural and quasi-experiments in economics" (PDF). Journal of Business & Economic Statistics. 13 (2): 151–161. doi:10.2307/1392369. JSTOR 1392369.
- ^ Lee, D. S.; Moretti, E.; Butler, M. J. (2004). "Do voters affect or elect policies? Evidence from the US House". The Quarterly Journal of Economics. 119 (3): 807–859. doi:10.1162/0033553041502153. JSTOR 25098703.
- ^ Rubin, Donald B. (2005). "Causal Inference Using Potential Outcomes". Journal of the American Statistical Association. 100 (469): 322–331. doi:10.1198/016214504000001880. S2CID 842793.
- ^ Nyman, Pär (2017). "Door-to-door canvassing in the European elections: Evidence from a Swedish field experiment". Electoral Studies. 45: 110–118. doi:10.1016/j.electstud.2016.12.002.
- ^ Broockman, David E.; Kalla, Joshua L.; Sekhon, Jasjeet S. (2017). "The Design of Field Experiments with Survey Outcomes: A Framework for Selecting More Efficient, Robust, and Ethical Designs". Political Analysis. 25 (4): 435–464. doi:10.1017/pan.2017.27. S2CID 233321039.
- ^ Duflo, Esther (2006). Field Experiments in Development Economics (Report). Massachusetts Institute of Technology. Archived from the original on 2012-03-06. Retrieved 2010-03-12.
- ^ Harati, Hamidreza; Talhelm, Thomas (2023-07-01). "Cultures in Water-Scarce Environments Are More Long-Term Oriented". Psychological Science. 34 (7): 754–770. doi:10.1177/09567976231172500. ISSN 0956-7976. PMID 37227787.
- ^ Harrison, G. W.; List, J. A. (2004). "Field experiments". Journal of Economic Literature. 42 (4): 1009–1055. doi:10.1257/0022051043004577. JSTOR 3594915.
- ^ LaLonde, R. J. (1986). "Evaluating the econometric evaluations of training programs with experimental data". The American Economic Review. 76 (4): 604–620. JSTOR 1806062.
- ^ Gordon, Brett R.; Zettelmeyer, Florian; Bhargava, Neha; Chapsky, Dan (2017). "A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook". Marketing Science. doi:10.2139/ssrn.3033144. S2CID 197733986.
- ^ Athey, Susan; Imbens, Guido (2016). "Recursive partitioning for heterogeneous causal effects: Table 1". Proceedings of the National Academy of Sciences. 113 (27): 7353–7360. doi:10.1073/pnas.1510489113. PMC 4941430. PMID 27382149.
- ^ Scott, Steven L. (2010). "A modern Bayesian look at the multi-armed bandit". Applied Stochastic Models in Business and Industry. 26 (6): 639–658. doi:10.1002/asmb.874.
- ^ Raj, V.; Kalyani, S. (2017). "Taming non-stationary bandits: A Bayesian approach". arXiv:1707.09727 [stat.ML].
- ^ Dehejia, R.; Pop-Eleches, C.; Samii, C. (2015). From local to global: External validity in a fertility natural experiment (PDF) (Report). National Bureau of Economic Research. w21459.
- ^ Egami, Naoki; Hartman, Erin (19 July 2018). "Covariate Selection for Generalizing Experimental Results" (PDF). Princeton.edu. Archived from the original (PDF) on 10 July 2020. Retrieved 31 December 2018.
- ^ Blackwell, Matthew (2017). "Instrumental Variable Methods for Conditional Effects and Causal Interaction in Voter Mobilization Experiments". Journal of the American Statistical Association. 112 (518): 590–599. doi:10.1080/01621459.2016.1246363. S2CID 55878137.
- ^ a b Aronow, Peter M.; Carnegie, Allison (2013). "Beyond LATE: Estimation of the Average Treatment Effect with an Instrumental Variable". Political Analysis. 21 (4): 492–506. doi:10.1093/pan/mpt013.
- ^ Aronow, P. M.; Samii, C. (2017). "Estimating average causal effects under general interference, with application to a social network experiment". The Annals of Applied Statistics. 11 (4): 1912–1947. arXiv:1305.6156. doi:10.1214/16-AOAS1005. S2CID 26963450.
- ^ Woertman, W.; de Hoop, E.; Moerbeek, M.; Zuidema, S. U.; Gerritsen, D. L.; Teerenstra, S. (2013). "Stepped wedge designs could reduce the required sample size in cluster randomized trials". Journal of Clinical Epidemiology. 66 (7): 752–758. doi:10.1016/j.jclinepi.2013.01.009. hdl:2066/117688. PMID 23523551.
- ^ Tröhler, U. (2005). "Lind and scurvy: 1747 to 1795". Journal of the Royal Society of Medicine. 98 (11): 519–522. doi:10.1177/014107680509801120. PMC 1276007. PMID 16260808.
- ^ Bertrand, Marianne; Mullainathan, Sendhil (2004). "Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination" (PDF). American Economic Review. 94 (4): 991–1013. doi:10.1257/0002828042002561.
- ^ Gneezy, Uri; List, John A (2006). "Putting behavioral economics to work: Testing for gift exchange in labor markets using field experiments" (PDF). Econometrica. 74 (5): 1365–1384. doi:10.1111/j.1468-0262.2006.00707.x.
- ^ Ahmed, Ali M; Hammarstedt, Mats (2008). "Discrimination in the rental housing market: A field experiment on the Internet". Journal of Urban Economics. 64 (2): 362–372. doi:10.1016/j.jue.2008.02.004.
- ^ Edelman, Benjamin; Luca, Michael; Svirsky, Dan (2017). "Racial discrimination in the sharing economy: Evidence from a field experiment". American Economic Journal: Applied Economics. 9 (2): 1–22. doi:10.1257/app.20160213.
- ^ Pager, Devah; Shepherd, Hana (2008). "The sociology of discrimination: Racial discrimination in employment, housing, credit, and consumer markets". Annual Review of Sociology. 34: 181–209. doi:10.1146/annurev.soc.33.040406.131740. PMC 2915460. PMID 20689680.
- ^ Nesseler, Cornel; Gomez-Gonzalez, Carlos; Dietl, Helmut (2019). "What's in a name? Measuring access to social activities with a field experiment". Palgrave Communications. 5 (160): 1–7. doi:10.1057/s41599-019-0372-0. hdl:11250/2635691.
- ^ Ashraf, Nava; Berry, James; Shapiro, Jesse M (2010). "Can higher prices stimulate product use? Evidence from a field experiment in Zambia" (PDF). American Economic Review. 100 (5): 2383–2413. doi:10.1257/aer.100.5.2383. S2CID 6392533.
- ^ Karlan, Dean; List, John A (2007). "Does price matter in charitable giving? Evidence from a large-scale natural field experiment" (PDF). American Economic Review. 97 (5): 1774–1793. doi:10.1257/aer.97.5.1774. S2CID 10041821.
- ^ Fryer Jr, Roland G (2014). "Injecting charter school best practices into traditional public schools: Evidence from field experiments". The Quarterly Journal of Economics. 129 (3): 1355–1407. doi:10.1093/qje/qju011.
- ^ Field, Erica; Pande, Rohini (2008). "Repayment frequency and default in microfinance: evidence from India". Journal of the European Economic Association. 6 (2–3): 501–509. doi:10.1162/JEEA.2008.6.2-3.501.
- ^ Cialdini, Robert B.; Vincent, Joyce E.; Lewis, Stephen K.; Catalan, Jose; Wheeler, Diane; Darby, Betty Lee (February 1975). "Reciprocal concessions procedure for inducing compliance: The door-in-the-face technique". Journal of Personality and Social Psychology. 31 (2): 206–215. doi:10.1037/h0076284. ISSN 1939-1315.
- ^ Fisher, R.A. (1937). The Design of Experiments (PDF). Oliver and Boyd Ltd.
- ^ Gosnell, Harold F. (1926). "An Experiment in the Stimulation of Voting". American Political Science Review. 20 (4): 869–874. doi:10.1017/S0003055400110524.
- ^ Grodwohl, Jean-Baptiste; Porto, Franco; El-Hani, Charbel N. (2018-07-31). "The instability of field experiments: building an experimental research tradition on the rocky seashores (1950–1985)". History and Philosophy of the Life Sciences. 40 (3): 45. doi:10.1007/s40656-018-0209-y. ISSN 1742-6316. PMID 30066110. S2CID 51889466.
Definition and Fundamentals
Core Definition
A field experiment is a research methodology that incorporates controlled manipulation of independent variables and randomization, akin to laboratory experiments, but conducts these interventions within participants' natural environments rather than artificial settings.[10] This approach enables the observation of behavioral responses under realistic conditions, where extraneous variables like social norms, incentives, and contextual factors influence outcomes in ways that laboratory isolation cannot replicate.[1] By embedding experimental rigor into everyday contexts, such as workplaces, markets, or communities, field experiments prioritize ecological validity, allowing inferences about causal effects that generalize beyond contrived scenarios.[11]

Key characteristics include the deliberate assignment of treatments to randomly selected groups to minimize selection bias and confounding, while permitting natural participant behaviors and external influences to unfold.[4] Unlike purely observational studies, field experiments isolate treatment effects through this randomization, providing stronger evidence for causality than correlational data; however, they sacrifice some precision due to incomplete control over environmental noise.[12] In disciplines like economics and the social sciences, variations such as natural field experiments involve covert interventions where subjects remain unaware of their participation, enhancing behavioral authenticity by avoiding Hawthorne effects.[13]

The primary aim is to bridge the gap between abstract theory and practical application, testing hypotheses in settings where decisions carry real stakes, such as financial or reputational costs.[14] This method has proven particularly valuable for evaluating policy interventions, as evidenced by randomized trials in development economics that demonstrate causal impacts on outcomes like education or health adoption.[7] Despite logistical challenges, field experiments yield findings with higher external validity, informing evidence-based decisions in complex systems.[15]

Types and Variations
Field experiments are classified into types based on the extent to which they incorporate elements of the field environment, as delineated by Harrison and List in their 2004 taxonomy published in the Journal of Economic Literature.[16] This framework evaluates experiments along dimensions such as subject pool (laboratory students versus field participants), informational environment (abstract versus context-specific), tasks (standardized lab procedures versus field-relevant activities), and stakes (hypothetical or symbolic versus consequential real-world outcomes).[16] The classification emphasizes a spectrum from those retaining laboratory-like controls to those fully embedded in natural settings, enabling causal inference while varying ecological validity.[16]

Artefactual field experiments employ standard laboratory protocols but recruit participants from non-laboratory populations, such as professionals or consumers in their typical environments, to test behavioral responses under controlled conditions.[16] For instance, researchers might administer trust games (abstract economic tasks typically run in university labs) to field subjects like market vendors, preserving internal validity through randomization while introducing real-world participant heterogeneity.[16] This type mitigates selection biases from student samples but limits generalizability due to artificial tasks and low stakes.[16]

Framed field experiments extend artefactual designs by embedding laboratory tasks within field-relevant contexts, such as using actual commodities as incentives or providing domain-specific instructions to enhance realism without altering core procedures.[16] An example includes offering real consumer goods as prizes in decision-making games conducted with shoppers, which introduces salient payoffs and contextual cues to better approximate natural motivations.[16] These experiments balance experimental control with increased external validity, though they may still suffer from awareness effects if participants recognize the contrived elements.[16]

Natural field experiments represent the most field-oriented type, involving interventions in everyday environments with field participants undertaking routine tasks, often without subjects' knowledge of their involvement, to minimize behavioral distortions like Hawthorne effects.[16] Classic examples encompass altering donation solicitations during door-to-door campaigns or varying product prices in retail settings to observe purchasing patterns, leveraging randomization for causal identification amid genuine stakes and unobtrusive measurement.[16] This variation excels in external validity for policy-relevant behaviors but demands careful ethical oversight and faces challenges in scalability and replication due to contextual dependencies.[16]

Variations across disciplines adapt these types to specific domains, such as economics' focus on incentive structures in markets or psychology's emphasis on social influence in workplaces.[17] In political science, natural field experiments often test voter mobilization via randomized mailings or canvassing, as in Gerber and Green's 2000 study, which randomized get-out-the-vote appeals to 29,380 households and found that personal canvassing increased turnout by 8.7 percentage points.
Public health applications frequently employ framed or natural designs for interventions like randomized condom distribution in clinics, prioritizing real-world compliance over lab abstraction.[17] Ethical and logistical adaptations, including covert versus overt implementations, further diversify designs, with covert approaches favored for behavioral authenticity despite consent controversies.[16]

Comparison to Laboratory and Quasi-Experiments
Field experiments incorporate random assignment to treatments in naturalistic environments, paralleling laboratory experiments in enabling causal identification by equalizing groups on observables and unobservables, but diverging in setting to prioritize real-world applicability over isolation of mechanisms.[18] Laboratory experiments achieve superior internal validity through meticulous control of extraneous variables in sterile conditions, minimizing confounds and demand effects, yet their contrived stimuli and participant pools often yield low external validity, as behaviors elicited may not translate beyond the lab.[19][20] Field experiments, by embedding interventions amid authentic incentives, distractions, and social dynamics, enhance ecological validity and generalizability, though they incur risks of spillover effects, non-compliance, and measurement noise that can dilute precision.[21][22]

| Aspect | Laboratory Experiments | Field Experiments |
|---|---|---|
| Internal Validity | High: Rigorous controls and randomization isolate effects.[19] | Moderate to high: Randomization counters bias, but field confounds persist.[18][20] |
| External Validity | Low: Artificial contexts limit real-world mimicry.[23] | High: Natural settings capture genuine responses and scalability.[21] |
| Implementation | Feasible and cost-effective with small samples. | Logistically demanding, prone to attrition and ethical hurdles.[22] |