
The AEA Registry for Randomized Controlled Trials

Patrick McNeal - July 4, 2013 in External Projects, Featured, Open Tools, Trials Registration

The American Economic Association (AEA) has recently launched a registry for randomized controlled trials in economics (https://www.socialscienceregistry.org). The registry aims to address the growing number of requests for registration by funders and peer reviewers, make access to results easier and more transparent, and help solve the problem of publication bias by providing a single place where all trials are registered in advance of their start.

Screenshot of www.socialscienceregistry.org

In order to encourage registration, the process was designed to be very light. There are only 18 required fields (such as name and a small subset of IRB requirements), and the entire process should take less than 20 minutes. There is also the opportunity to add much more, including power calculations and an optional pre-analysis plan. To protect confidential and other sensitive design information, most of the information can remain hidden while the project is ongoing.

Please contact support [at] socialscienceregistry.org with any questions, comments or support issues.

Releasing the Automated Game Play Datasets

Velichka Dimitrova - March 7, 2013 in Announcements, Data Release, Featured, Open Data, Open Economics, Open Research, Open Tools

We are very happy to announce that the Open Economics Working Group is releasing the datasets of the research project “Small Artificial Human Agents for Virtual Economies”, implemented by Professor David Levine and Professor Yixin Chen at Washington University in St. Louis and funded by the National Science Foundation [See dedicated webpage].

The authors who participated in the study have given their permission to publish their data online. We hope that by making this data available online we will aid researchers working in this field. This initiative is motivated by our belief that, for economic research to be reliable and trusted, it should be possible to reproduce research findings – which is difficult or even impossible without the availability of the data and code. Making material openly available reduces the barriers to reproducible research to a minimum.

If you would like to know more, or would like help releasing research data in your field, please contact us at: economics [at] okfn.org

List of Datasets and Code

Andreoni, J. & Miller, J.H., 1993. Rational cooperation in the finitely repeated prisoner’s dilemma: Experimental evidence. The Economic Journal, pp.570–585.

Link to publication | Link to data

Dal Bó, P., 2005. Cooperation under the shadow of the future: Experimental evidence from infinitely repeated games. The American Economic Review, 95(5), pp.1591–1604.

Link to publication | Link to data

Charness, G., Frechette, G.R. & Qin, C.-Z., 2007. Endogenous transfers in the Prisoner’s Dilemma game: An experimental test of cooperation and coordination. Games and Economic Behavior, 60(2), pp.287–306.

Link to publication | Link to data

Clark, K., Kay, S. & Sefton, M., 2001. When are Nash equilibria self-enforcing? An experimental analysis. International Journal of Game Theory, 29(4), pp.495–515.

Link to publication | Link to data

Duffy, J. & Feltovich, N., 2002. Do Actions Speak Louder Than Words? An Experimental Comparison of Observation and Cheap Talk. Games and Economic Behavior, 39(1), pp.1–27.

Link to publication | Link to data

Duffy, J. & Ochs, J., 2009. Cooperative behavior and the frequency of social interaction. Games and Economic Behavior, 66(2), pp.785–812.

Link to publication | Link to data

Knez, M. & Camerer, C., 2000. Increasing cooperation in prisoner’s dilemmas by establishing a precedent of efficiency in coordination games. Organizational Behavior and Human Decision Processes, 82(2), pp.194–216.

Link to publication | Link to data

Ochs, J., 1995. Games with unique, mixed strategy equilibria: An experimental study. Games and Economic Behavior, 10(1), pp.202–217.

Link to publication | Link to data

Ong, D. & Chen, Z., 2012. Tiger Women: An All-Pay Auction Experiment on Gender Signaling of Desire to Win. Available at SSRN 1976782.

Link to publication | Link to data

Vlaev, I. & Chater, N., 2006. Game relativity: How context influences strategic decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32(1), p.131.

Link to publication | Link to data

Project Background

An important need for developing better economic policy prescriptions is an improved method of validating theories. Traditionally, economics has depended on field data from surveys and on laboratory experiments. An alternative method of validating theories is the use of artificial or virtual economies. If a virtual world is an adequate description of a real economy, then a good economic theory ought to be able to predict outcomes in that setting. An artificial environment offers enormous advantages over the field and laboratory: complete control – for example, over risk aversion and social preferences – and great speed in creating economies and validating theories. In economics, the use of virtual economies can potentially enable us to deal with heterogeneity, with small frictions, and with expectations that are backward-looking rather than determined in equilibrium – combinations that are difficult or impractical in existing calibrations or Monte Carlo simulations.

The goal of this project is to build artificial agents by developing computer programs that act like human beings in the laboratory. We focus on the simplest type of problem of interest to economists: simple one-shot two-player simultaneous-move games. There is a wide variety of existing published data on laboratory behavior, which will be our primary testing ground for our computer programs. As we achieve greater success, we want to see if our programs can adapt themselves to changes in the rules: for example, if payments are changed in a certain way, the computer programs will play differently – do people do the same? In some cases we may be able to answer these questions with data from existing studies; in others we will need to conduct our own experimental studies.

There is a great deal of existing research relevant to the current project. The state of the art in the study of virtual economies is agent-based modeling (Bonabeau (2002)). Also crucially related are the theoretical literature on learning in games and the empirical literature on behavior in the experimental laboratory. From the perspective of theory, the most relevant economic research is Foster and Vohra’s (1999) work on calibrated play and the related work on smooth fictitious play (Fudenberg and Levine (1998)) and regret algorithms (Hart and Mas-Colell (2000)). There is also relevant work on regret optimization in the computational game theory literature, such as Nisan et al. (2007). Empirical work on human play in the laboratory has two basic threads. The first is research on first-time play, such as Nagel (1995) and the hierarchical models of Stahl and Wilson (1994), Costa-Gomes, Crawford, and Broseta (2001) and Camerer, Ho, and Chong (2004). The second is the learning models, most notably the reinforcement learning model of Erev and Roth (1998) and the EWA model (Ho, Camerer, and Chong (2007)). This latter model can be considered state of the art, as it includes both reinforcement and fictitious-play-type learning and initial play from a cognitive hierarchy.
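As a toy illustration of the regret-based learning rules cited above, the sketch below implements a simple regret-matching agent (in the spirit of Hart and Mas-Colell (2000), though not their exact formulation, and not code from this project) for two artificial players in repeated matching pennies. All function names are illustrative. Over many rounds, the empirical action frequencies approach the game’s unique 50/50 mixed equilibrium:

```python
import random

# Matching-pennies payoffs for player 0; player 1 receives the negative.
PAYOFF = [[1, -1],
          [-1, 1]]

def payoff(player, a0, a1):
    """Payoff to `player` when player 0 plays a0 and player 1 plays a1."""
    v = PAYOFF[a0][a1]
    return v if player == 0 else -v

def mixed_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [0.5, 0.5]

def regret_matching(iters=20000, seed=7):
    rng = random.Random(seed)
    regrets = [[0.0, 0.0], [0.0, 0.0]]  # cumulative regret per player/action
    counts = [[0, 0], [0, 0]]           # how often each action was played
    for _ in range(iters):
        probs = [mixed_strategy(regrets[p]) for p in (0, 1)]
        a0 = 0 if rng.random() < probs[0][0] else 1
        a1 = 0 if rng.random() < probs[1][0] else 1
        counts[0][a0] += 1
        counts[1][a1] += 1
        for alt in (0, 1):  # regret = forgone payoff of the alternative action
            regrets[0][alt] += payoff(0, alt, a1) - payoff(0, a0, a1)
            regrets[1][alt] += payoff(1, a0, alt) - payoff(1, a0, a1)
    return [[c / iters for c in row] for row in counts]

freqs = regret_matching()
print(freqs)  # both players' empirical frequencies settle near 50/50
```

Matching pennies is a convenient test case because its mixed equilibrium is unique, so the long-run empirical frequencies of any no-regret learner must converge toward it; richer one-shot games like those in the datasets above would need only a different payoff matrix.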

Preregistration in the Social Sciences: A Controversy and Available Resources

James Monogan - February 20, 2013 in Featured, Open Data, Open Economics, Open Research, Open Tools

For years now, the practice of preregistering clinical trials has worked to reduce publication bias dramatically (Drummond Rennie offers more details). Building on this trend for transparency, the Open Knowledge Foundation, which runs the Open Economics Working Group, has expressed support for All Trials Registered, All Results Reported (http://www.alltrials.net). This initiative argues that all clinical trial results should be reported, because the spread of this free information will reduce bad treatment decisions in the future and allow others to find missed opportunities for good treatments. The idea of preregistration, therefore, has proved valuable for the medical profession.

In a similar push for openness, a debate is now emerging about the merits of preregistration in the social sciences. Specifically, could social scientific disciplines benefit from investigators committing themselves to a research design before the observation of their outcome variable? The winter 2013 issue of Political Analysis takes up this issue with a symposium on research registration, wherein two articles make a case in favor of preregistration, and three responses offer alternate views on this controversy.

There has been a trend for transparency in social research: Many journals now require authors to release public replication data as a condition for publication. Additionally, public funding agencies such as the U.S. National Science Foundation require public release of data as a condition for funding. This push for additional transparency allows for other researchers to conduct secondary analyses that may build on past results and also allows empirical findings to be subjected to scrutiny as new theory, data, and methods emerge. Preregistering a research design is a natural next step in this transparency process as it would allow readers, including other scholars, to gain a sense of how the project was developed and how the researcher made tough design choices.

Another advantage of preregistering a research design is that it can curb the prospects of publication bias. Gerber & Malhotra observe that papers appearing in print tend to have a higher rate of positive results in hypothesis tests than should be expected. Registration has the potential to curb publication bias, or at least its negative consequences. Even if committing oneself to a research design does not change the prospect of publishing an article in the traditional format, it would signal to the larger audience that a study was carried out even if a publication never emerged. This would allow the scholarly community at large to investigate further, perhaps reanalyze data that were not published in print, and if nothing else get a sense of how preponderant null findings are for commonly-tested hypotheses. Also, if more researchers tie their hands in a registration phase, then there is less room for activities that might push a result over a common significance threshold.

To illustrate how preregistration can be useful, my article in this issue of Political Analysis analyzes the effect of Republican candidates’ position on the immigration issue on their share of the two-party vote in 2010 elections for the U.S. House of Representatives. In this analysis, I hypothesized that Republican candidates may have been able to garner additional electoral support by taking a harsh stand on the issue. I designed my model to estimate the effect on vote share of taking a harsher stand on immigration, holding the propensity of taking a harsh stand constant. This propensity was based on other factors known to shape election outcomes, such as district ideology, incumbency, campaign finances, and previous vote share. I crafted my design before votes were counted in the 2010 election and publicly posted it to the Society for Political Methodology’s website as a way of committing myself to this design.

Comparison of estimated immigration-stance effects under the preregistered and simplified specifications

In the figure, the horizontal axis represents values that the propensity scores for harsh rhetoric could take. The tick marks along the base of the graph indicate actual values in the data of the propensity for harsh rhetoric. The vertical axis represents the expected change in the proportion of the two-party vote for moving from a welcoming position to a hostile position. The figure shows a solid black line, which indicates my estimate of the effect of a Republican’s taking a harsh stand on immigration on his or her proportion of the two-party vote. The two dashed black lines indicate the uncertainty in this estimate of the treatment effect. As can be seen, the estimated effects come with considerable uncertainty, and I can never reject the prospect of a zero effect.

However, a determined researcher could have tried alternate specifications until a discernible result emerged. The figure also shows a red line representing the estimated treatment effect from a simpler model that omits district ideology (how liberal or conservative the district is). The dotted red lines represent the uncertainty in this estimate. As can be seen, this reports a uniform treatment effect of 0.079 that is discernible from zero. After “fishing” with the model specification, a researcher could have manufactured a result suggesting that Republican candidates could boost their share of the vote by 7.9 percentage points by moving from a welcoming to a hostile stand on immigration! Such a result would be misleading because it overlooks district ideology. Whenever investigators commit themselves to a research design, this reduces the prospect of fishing after observing the outcome variable.
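A minimal simulation can illustrate the omitted-variable mechanism behind such a “fished” result. The data-generating process, numbers, and function names below are entirely hypothetical, not the article’s actual data or model: district ideology drives both the candidate’s stance and the vote share, while the stance itself has a true effect of zero, yet the regression that omits ideology reports a sizeable positive “effect”:

```python
import random

def simulate_districts(n=5000, seed=11):
    """Hypothetical districts: ideology drives both the candidate's stance
    and the vote share; the stance itself has a true effect of zero."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        ideology = rng.uniform(-1, 1)                      # district conservatism
        harsh = 1.0 if ideology + rng.gauss(0, 0.5) > 0 else 0.0
        vote = 0.5 + 0.1 * ideology + rng.gauss(0, 0.02)   # no causal effect of harsh
        data.append((harsh, ideology, vote))
    return data

def ols(X, y):
    """Least squares via the normal equations (X'X)b = X'y, solved by
    Gaussian elimination; adequate for a handful of regressors."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    for c in range(k):                     # forward elimination
        for r2 in range(c + 1, k):
            f = A[r2][c] / A[c][c]
            A[r2] = [a - f * p for a, p in zip(A[r2], A[c])]
            b[r2] -= f * b[c]
    beta = [0.0] * k
    for i in reversed(range(k)):           # back substitution
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, k))) / A[i][i]
    return beta

data = simulate_districts()
y = [v for _, _, v in data]
naive = ols([[1.0, h] for h, _, _ in data], y)[1]      # ideology omitted
full = ols([[1.0, h, z] for h, z, _ in data], y)[1]    # ideology controlled
print(f"naive estimate: {naive:.3f}, controlled estimate: {full:.3f}")
```

The naive coefficient is positive and apparently “discernible” purely because harsh-stance candidates tend to run in conservative districts; once ideology enters the model, the estimated stance effect collapses toward its true value of zero. Committing to the fuller specification in advance removes the temptation to report the naive one.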

I hope to have illustrated the usefulness of preregistration, and I hope the idea will spread. Currently, though, there is no comprehensive study registry in the social sciences. However, several proto-registries are available to researchers. All of these registries offer the opportunity for self-registration, wherein scholars can commit themselves to a design as a later signal to readers, reviewers, and editors.

In particular, any researcher from any discipline who is interested in self-registering a study is welcome to take advantage of the Political Science Registered Studies Dataverse. This dataverse is a fully-automated resource that allows researchers to upload design information, pre-outcome data, and any preliminary code. Uploaded designs will be publicized via a variety of free media. Readers are welcome to subscribe to any of these announcement services, which are linked in the header of the dataverse page.

Besides this automated system, there are also a few other proto-registries of note:

* The EGAP: Experiments in Governance and Politics (http://e-gap.org/design-registration/) website has a registration tool that now accepts and posts detailed pre-analysis plans. In instances when designs are sensitive, EGAP offers the service of accepting and archiving sensitive plans with an agreed trigger for posting them publicly.

* J-PAL: The Abdul Latif Jameel Poverty Action Lab (http://www.povertyactionlab.org/Hypothesis-Registry) has been hosting a hypothesis registry since 2009. This registry is for pre-analysis plans of researchers working on randomized controlled trials, which may be submitted before data analysis begins.

* The American Political Science Association’s Experimental Research Section (http://ps-experiments.ucr.edu/) hosts a registry for experiments at its website. (Please note, however, that the website currently is down for maintenance.)

First Open Economics International Workshop Recap

Velichka Dimitrova - January 25, 2013 in Economic Publishing, Events, Featured, Open Access, Open Data, Open Economics, Open Research, Open Tools, Workshop

The first Open Economics International Workshop gathered 40 academic economists, data publishers, funders of economics research, researchers and practitioners for a two-day event at Emmanuel College in Cambridge, UK. The aim of the workshop was to build an understanding of the value of open data and open tools for the Economics profession, the obstacles to opening up information, and the role of greater openness in the academy. This event was organised by the Open Knowledge Foundation and the Centre for Intellectual Property and Information Law and was supported by the Alfred P. Sloan Foundation. Audio and slides are available at the event’s webpage.

Open Economics Workshop

Setting the Scene

The Setting the Scene session was about giving some context to “Open Economics” in the knowledge society, with examples from outside the discipline and a discussion of reproducible research. Rufus Pollock (Open Knowledge Foundation) emphasised that change is necessary and that there is substantial potential for economics: 1) open “core” economic data outside the academy, 2) open as the default for data in the academy, 3) real growth in citizen economics and outside participation. Daniel Goroff (Alfred P. Sloan Foundation) drew attention to the work of the Alfred P. Sloan Foundation in emphasising the importance of knowledge and its use for making decisions, and of data and knowledge as a non-rival, non-excludable public good. Tim Hubbard (Wellcome Trust Sanger Institute) spoke about the potential of large-scale data collection around individuals for improving healthcare and how centralised global repositories work in the field of bioinformatics. Victoria Stodden (Columbia University / RunMyCode) stressed the importance of reproducibility for economic research, as an essential part of scientific methodology, and presented the RunMyCode project.

Open Data in Economics

The Open Data in Economics session was chaired by Christian Zimmermann (Federal Reserve Bank of St. Louis / RePEc) and covered several projects and ideas from various institutions. The session examined examples of open data in Economics and sought to discover whether these examples are sustainable and can be implemented in other contexts: whether the right incentives exist. Paul David (Stanford University / SIEPR) characterised the open science system as better than any other at the rapid accumulation of reliable knowledge, whereas proprietary systems are very good at extracting rent from existing knowledge. A balance between these two systems should be established so that they can work within the same organisational system, since separately they are distinctly suboptimal. Johannes Kiess (World Bank) underlined that having the data available is often not enough: “It is really important to teach people how to understand these datasets: data journalists, NGOs, citizens, coders, etc.” The World Bank has implemented projects to incentivise the use of the data and is helping countries to open up their data. For economists, he mentioned, having a valuable dataset to publish on is an important asset; there are therefore insufficient incentives for sharing.

Eustáquio J. Reis (Institute of Applied Economic Research – Ipea) related his experience of establishing the Ipea statistical database and other projects for historical data series and data digitisation in Brazil. He shared that the culture of the economics community is not a culture of collaboration where people willingly share or support and encourage data curation. Sven Vlaeminck (ZBW – Leibniz Information Centre for Economics) spoke about the EDaWaX project, which conducted a study of the data availability of economics journals and will establish a publication-related data archive for an economics journal in Germany.

Legal, Cultural and other Barriers to Information Sharing in Economics

The session presented different impediments to the disclosure of data in economics from the perspective of two lawyers and two economists. Lionel Bently (University of Cambridge / CIPIL) drew attention to the fact that a whole range of different legal mechanisms operate to restrict the dissemination of information, yet on the other hand there is also a range of mechanisms which help to make information available. Lionel questioned whether the open data standard would always be the optimal way to produce high-quality economic research, or whether there is also a place for modulated/intermediate positions where data is available only under conditions, only in part, or only for certain forms of use. Mireille van Eechoud (Institute for Information Law) described the EU Public Sector Information Directive – the most generic document related to open government data – and progress made in opening up information published by governments. Mireille also pointed out that legal norms have only limited value without the internalised cultural attitudes and structures in place that really make greater access to information work.

David Newbery (University of Cambridge) presented an example from the electricity markets and insisted that a good supply of data requires informed demand, coming from regulators who are charged to monitor markets, detect abuse, uphold fair competition and defend consumers. John Rust (Georgetown University) said that the government is an important provider of data which is otherwise too costly to collect, yet a number of issues exist, including confidentiality, excessive bureaucratic caution and the public finance crisis. There are also many opportunities for research in the private sector, where some of the data can be made available (redacting confidential information), and the public non-profit sector can also play a tremendous role as a force to organise markets for the better, set standards and focus on targeted domains.

Current Data Deposits and Releases – Mandating Open Data?

The session was chaired by Daniel Goroff (Alfred P. Sloan Foundation) and brought together funders and publishers to discuss their role in requiring data from economic research to be publicly available and the importance of dissemination for publishing.

Albert Bravo-Biosca (NESTA) emphasised that mandating open data begins much earlier in the process: funders can encourage the collection of particular data by the government, which is the basis for research, and can also act as an intermediary for the release of open data by the private sector. Open data is interesting, but it is even more interesting when it is appropriately linked and combined with other data, and there is value in examples and case studies for demonstrating benefits. Caution is needed, however, as opening up some data might result in less data being collected.

Toby Green (OECD Publishing) made a point of the difference between posting and publishing: making content available does not always mean that it will be accessible, discoverable, usable and understandable. In his view, the challenge is to build up an audience by putting content where people will find it, which is very costly, as proper dissemination is expensive. Nancy Lutz (National Science Foundation) explained the scope and workings of the NSF and the data management plans required from all economists applying for funding. Creating and maintaining data infrastructure and complying with the data management policy might eventually mean that there is less funding for other economic research.

Trends of Greater Participation and Growing Horizons in Economics

Chris Taggart (OpenCorporates) chaired the session which introduced different ways of participating and using data, different audiences and contributors. He stressed that data is being collected in new ways and by different communities, that access to data can be an enormous privilege and can generate data gravities with very unequal access and power to make use of and to generate more data and sometimes analysis is being done in new and unexpected ways and by unexpected contributors. Michael McDonald (George Mason University) related how the highly politicised process of drawing up district lines in the U.S. (also called Gerrymandering) could be done in a much more transparent way through an open-source re-districting process with meaningful participation allowing for an open conversation about public policy. Michael also underlined the importance of common data formats and told a cautionary tale about a group of academics misusing open data with a political agenda to encourage a storyline that a candidate would win a particular state.

Hans-Peter Brunner (Asian Development Bank) shared a vision of how open data and open analysis can aid decision-making about investments in infrastructure, connectivity and policy. Simulated models of investments can demonstrate different scenarios according to investment priorities and crowd-sourced ideas. Hans-Peter asked for feedback and input on how to make data and code available. Perry Walker (new economics foundation) spoke about conversation, arguing that a good conversation has to be designed, as it usually doesn’t happen by accident. Rufus Pollock (Open Knowledge Foundation) concluded with examples of citizen economics and the growth of contributions from the wider public, particularly through volunteer computing and volunteer thinking as ways of getting engaged in research.

During two sessions, the workshop participants also worked on a Statement of Open Economics Principles, which will be revised with further input from the community and made public at the second Open Economics Workshop, taking place on 11-12 June in Cambridge, MA.