Preregistration in the Social Sciences: A Controversy and Available Resources
For years now, the practice preregistering clinical trials has worked to reduce publication bias dramatically (Drummond Rennie offers more details). Trying to build on this trend for transparency, the Open Knowledge Foundation, which runs the Open Economics Working Group, has expressed support for All Trials Registered, All Results Reported (http://www.alltrials.net). This initiative argues that all clinical trial results should be reported because the spread of this free information will reduce bad treatment decisions in the future and allow others to find missed opportunities for good treatments. The idea of preregistration, therefore, has proved valuable for the medical profession.
In a similar push for openness, a debate now is emerging about the merits of preregistration in the social sciences. Specifically, could social scientific disciplines benefit from investigators’ committing themselves to a research design before the observation of their outcome variable? The winter 2013 issue of Political Analysis takes up this issue with a symposium on research registration, wherein two articles make a case in favor of preregistration, and three responses offer alternate views on this controversy.
There has been a trend for transparency in social research: Many journals now require authors to release public replication data as a condition for publication. Additionally, public funding agencies such as the U.S. National Science Foundation require public release of data as a condition for funding. This push for additional transparency allows for other researchers to conduct secondary analyses that may build on past results and also allows empirical findings to be subjected to scrutiny as new theory, data, and methods emerge. Preregistering a research design is a natural next step in this transparency process as it would allow readers, including other scholars, to gain a sense of how the project was developed and how the researcher made tough design choices.
Another advantage of preregistering a research design is it can curb the prospects of publication bias. Gerber & Malhotra observe that papers produced in print tend to have a higher rate of positive results in hypothesis tests than should be expected. Registration has the potential to curb publication bias, or at least its negative consequences. Even if committing oneself to a research design does not change the prospect for publishing an article in the traditional format, it would signal to the larger audience that a study was developed and that a publication never emerged. This would allow the scholarly community at large to investigate further, perhaps reanalyze data that were not published in print, and if nothing else get a sense of how preponderant null findings are for commonly-tested hypotheses. Also, if more researchers tie their hands in a registration phase, then there is less room for activities that might push a result over a common significance threshold.
To illustrate how preregistration can be useful, my article in this issue of Political Analysis analyzes the effect of Republican candidates’ position on the immigration issue on their share of the two-party vote in 2010 elections for the U.S. House of Representatives. In this analysis, I hypothesized that Republican candidates may have been able to garner additional electoral support by taking a harsh stand on the issue. I designed my model to estimate the effect on vote share of taking a harsher stand on immigration, holding the propensity of taking a harsh stand constant. This propensity was based on other factors known to shape election outcomes, such as district ideology, incumbency, campaign finances, and previous vote share. I crafted my design before votes were counted in the 2010 election and publicly posted it to the Society for Political Methodology’s website as a way of committing myself to this design.
In the figure, the horizontal axis represents values that the propensity scores for harsh rhetoric could take. The tick marks along the base of the graph indicate actual values in the data of the propensity for harsh rhetoric. The vertical axis represents the expected change in proportion of the two party vote that would be expected for moving from a welcoming position to a hostile position. The figure shows a solid black line, which indicates my estimate of the effect of a Republican’s taking a harsh stand on immigration on his or her proportion of the two-party vote. The two dashed black lines indicate the uncertainty in this estimate of the treatment effect. As can be seen, the estimated effects come with considerable uncertainty, and I can never reject the prospect of a zero effect.
However, a determined researcher could have tried alternate specifications until a discernible result emerged. The figure also shows a red line representing the estimated treatment effect from a simpler model that also omits the effect of how liberal or conservative the district is. The dotted red lines represent the uncertainty in this estimate. As can be seen, this reports a uniform treatment effect of 0.079 that is discernible from zero. After “fishing” with the model specification, a researcher could have manufactured a result suggesting that Republican candidates could boost their share of the vote by 7.9 percentage points by moving from a welcoming to a hostile stand on immigration! Such a result would be misleading because it overlooks district ideology. Whenever investigators commit themselves to a research design, this reduces the prospect of fishing after observing the outcome variable.
I hope to have illustrated the usefulness of preregistration and hope the idea will spread. Currently, though, there is not a comprehensive study registry in the social sciences. However, several proto-registries are available to researchers. All of these registries offer the opportunity for self-registration, wherein the scholar can commit him or herself to a design as a later signal to readers, reviewers, and editors.
In particular, any researcher from any discipline who is interested in self-registering a study is welcome to take advantage of the Political Science Registered Studies Dataverse. This dataverse is a fully-automated resource that allows researchers to upload design information, pre-outcome data, and any preliminary code. Uploaded designs will be publicized via a variety of free media. List members are welcome to subscribe to any of these announcement services, which are linked in the header of the dataverse page.
Besides this automated system, there are also a few other proto-registries of note:
* The EGAP: Experiments in Governance and Politics (http://e-gap.org/design-registration/) website has a registration tool that now accepts and posts detailed preanalysis plans. In instances when designs are sensitive, EGAP offers the service of accepting and archiving sensitive plans with an agreed trigger for posting them publicly.
* J-PAL: The Abdul Latif Jameel Poverty Action Lab (http://www.povertyactionlab.org/Hypothesis-Registry) has been hosting a hypothesis registry since 2009. This registry is for pre-analysis plans of researchers working on randomized controlled trials, which may be submitted before data analysis begins.
* The American Political Science Association’s Experimental Research Section (http://ps-experiments.ucr.edu/) hosts a registry for experiments at its website. (Please note, however, that the website currently is down for maintenance.)