Support Us

You are browsing the archive for Open Economics.

Open model of an oil contract

- October 22, 2013 in External Projects, Featured, Open Data, Open Economics

Please come and kick the tires of our open model of an oil contract!

In the next month or so, OpenOil and its partners will publish what we believe will be the first financial model of an oil contract under Creative Commons license. We would like to take this opportunity to invite the Open Economics community to come and kick the wheels on the model when it is ready, and help us improve it.

We need you because we expect a fair degree of heat from those with a financial or reputational stake in continued secrecy around these industries. We expect the brunt of attacks to be on the basis that we are wrong. And of course we will be wrong in some way. It’s inevitable. So we would like our defence to be not, “no we’re never wrong”, but “yes, sometimes we are wrong, but transparently so and for the right reasons – and look, here are a bunch of friends who have already pointed out these errors, which have been corrected. You got some specific critiques, come give them. But the price of criticism is improvement – the open source way!” We figure Open Economics is the perfect network to seek that constructive criticism.

screengrab

Ultimately, we want to grow an open source community which will help grow a systematic understanding of the economics of the oil and gas industry independent of investor or government stakes, since the public policy impact of these industries and relevant flows are too vital to be left to industry specialists. There are perhaps 50 countries in the world where such models could transform public understanding of industries which dominate the political economy.

The model itself is still being fine-tuned but I’d like to take this chance to throw out a few heuristics that have occurred in the process of building it.

Public interest modelling. The model is being built by professionals with industry experience but its primary purpose is to inform public policy, not to aid investment decisions or serve as negotiation support for either governments or companies. This has determined a distinct approach to key issues such as management of complexity and what is an acceptable margin of error.

Management of complexity. Although there are several dozen variables one could model, and which typically appear in the models produced for companies, we deliberately exclude a long tail of fiscal terms, such as ground rent and signature bonuses, on the basis that the gain in reduction of margin of error is less than the loss from increasing complexity for the end user. We also exclude many of the fine tuning implementations of the taxation system. We list these terms in a sheet so those who wish can extend the model with them. It would be great, for example, to get tax geek help on refining some of these issues.

A hierarchy of margins of error. Extractives projects can typically last 25 years. The biggest single margin of error is not within human power to solve – future price. All other uncertainties or estimates pale in comparison with its impact on returns to all stakeholders. Second are the capex and opex going into a project. The international oil company may be the only real source of these data, and may or may not share them in disaggregated form with the government – everyone else is in the dark. For public interest purposes, the margin of error created by all other fiscal terms and input assumptions combined is less significant, and manageable.

Moving away from the zero-sum paradigm. Because modelling has traditionally been associated with the negotiation process, and perhaps because of the wider context surrounding extractive industries, a zero-sum paradigm often predominates in public thinking around the terms of these contracts. But the model shows graphically two distinct ways in which that paradigm does not apply. First, in agreements with sufficient progressivity, rising commodity prices could mean simultaneous rise of both government take and a company’s Internal Rate of Return. Second, a major issue for governments and societies depending on oil production is volatility – the difference between using minimal and maximal assumptions across all of the inputs will likely produce a difference in result which is radical. One of a country’s biggest challenges then is focusing enough attention on regulating itself, its politicians’ appetite for spending, its public’s appetite for patronage. We know this of course in the real world. Iraq received $37 billion in 2007, then $62 billion in 2008, then $43 billion or so in 2009. But it is the old journalistic difference between show and tell. A model can show this in your country, with your conditions.

The value of contract transparency. Last only because self-evident is the need for primary extractives conracts between states and companies to enter the public domain. About seven jurisdictions around the world publish all contracts so far but it is gaining traction as a norm in the governance community. The side-effects of the way extractive industries are managed now are almost all due to the ill-understood nature of rent. Even corruption, the hottest issue politically, may often simply be a secondary effect of the rent-based nature of the core activities. Publishing all contracts is the single biggest measure that would get us closer to being able to address the root causes of Resource Curse.

See http://openoil.net/ for more details.

Open Economics: the story so far…

- August 30, 2013 in Advisory Panel, Announcements, Events, Featured, Open Data, Open Economics, Projects

A year and a half ago we embarked on the Open Economics project with the support of the Alfred P. Sloan Foundation and we would like a to share a short recap of what we have been up to.

Our goal was to define what open data means for the economics profession and to become a central point of reference for those who wanted to learn what it means to have openness, transparency and open access to data in economics.

Advisory Panel of the Open Economics Working Group:
openeconomics.net/advisory-panel/

Advisory Panel

We brought together an Advisory Panel of twenty senior academics who advised us and provided input on people and projects we needed to contact and issues we needed to tackle. The progress of the project has depended on the valuable support of the Advisory Panel.

1st Open Economics Workshop, Dec 17-18 ’12, Cambridge, UK:
openeconomics.net/workshop-dec-2012/

2nd Open Economics Workshop, 11-12 June ’13, Cambridge, MA:
openeconomics.net/workshop-june-2013

International Workshops

We also organised two international workshops, first one held in Cambridge, UK on 17-18 December 2012 and second one in Cambridge U.S. on 11-12 June 2013, convening academics, funders, data publishers, information professionals and students to share ideas and build an understanding about the value of open data, the still persisting barriers to opening up information, as well as the incentives and structures which our community should encourage.

Open Economics Principles

While defining open data for economics, we also saw the need to issue a statement on the openness of data and code – the Open Economics Principles – to emphasise that data, program code, metadata and instructions, which are necessary to replicate economics research should be open by default. Having been launched in August, this statement is now being widely endorsed by the economics community and most recently by the World Bank’s Data Development Group.

Projects

The Open Economics Working Group and several more involved members have worked on smaller projects to showcase how data can be made available and what tools can be built to encourage discussions and participation as well as wider understanding about economics. We built the award-winning app Yourtopia Italy – http://italia.yourtopia.net/ for a user-defined multidimensional index of social progress, which won a special prize in the Apps4Italy competition.




Yourtopia Italy: application of a user-defined multidimensional index of social progress: italia.yourtopia.net

We created the Failed Bank Tracker, a list and a timeline visualisation of the banks in Europe which failed during the last financial crisis and released the Automated Game Play Datasets, the data and code of papers from the Small Artificial Agents for Virtual Economies research project, implemented by Professor David Levine and Professor Yixin Chen at the Washington University of St. Louis. More recently we launched the Metametrik prototype of a platform for the storage and search of regression results in the economics.


MetaMetrik: a prototype for the storage and search of econometric results: metametrik.openeconomics.net

We also organised several events in London and a topic stream about open knowledge and sustainability at the OKFestival with a panel bringing together a diverse range of panelists from academia, policy and the open data community to discuss how open data and technology can help improve the measurement of social progress.

Blog and Knowledge Base

We blogged about issues like the benefits of open data from the perspective of economics research, the EDaWaX survey of the data availability of economics journals, pre-registration of in the social sciences, crowd-funding as well as open access. We also presented projects like the Statistical Memory of Brazil, Quandl, the AEA randomized controlled trials registry.

Some of the issues we raised had a wider resonance, e.g. when Thomas Herndon found significant errors in trying to replicate the results of Harvard economists Reinhart and Rogoff, we emphasised that while such errors may happen, it is a greater crime not to make the data available with published research in order to allow for replication.

Some outcomes and expectations

We found that opening up data in economics may be a difficult matter, as many economists utilise data which cannot be open because of privacy, confidentiality or because they don’t own that data. Sometimes there are insufficient incentives to disclose data and code. Many economists spend a lot of resources in order to build their datasets and obtain an advantage over other researchers by making use of information rents.

Some journals have been leading the way in putting in place data availability requirements and funders have been demanding data management and sharing plans, yet more general implementation and enforcement is still lacking. There are now, however, more tools and platforms available where researchers can store and share their research content, including data and code.

There are also great benefits in sharing economics data: it enables the scrutiny of research findings and gives a possibility to replicate research, it enhances the visibility of research and promotes new uses of the data, avoids unnecessary costs for data collection, etc.

In the future we hope to concentrate on projects which would involve graduate students and early career professionals, a generation of economics researchers for whom sharing data and code may become more natural.

Keep in touch

Follow us on Twitter @okfnecon, sign up to the Open Economics mailing list and browse our projects and resources at openeconomics.net.

Looking for the Next Open Economics Project Coordinator

- July 3, 2013 in Announcements, Featured, Open Economics

### Open Economics Project Coordinator

The Open Economics Working Group is looking for a project coordinator to lead the Open Economics project in the next phase. The Open Economics Project Coordinator will be the point of contact for the Working Group and will work closely with a community of economists, data publishers, research data professionals, lawyers and funders to make more data and content in economics open, coordinate the creation of tools which aid researchers and facilitate stakeholder dialogue. Some of the responsibilities include:

  • Coordinating the project through all phases of project development including initiating, planning, executing, controlling and closing the project.
  • Representing Open Economics Working Group at local and international events, point of contact for the Working Group.
  • Leading communications: Responsible for communications with the Working Group members, the community, interested individuals and organisations, point of contact for the project PI and the Advisory Panel arranging the details of conference calls and leading communication with individual AP members and their participation in the workshop and other activities.
  • Community coordinator: Writing news to the mailing list, and using social media to promote activities to the network and beyond, maintaining the website of the Open Economics project: planning design, content and presentation of the project and the Working Group, organising and coordinating online meetings / online sprints and other online communication.
  • Maintaining the website: Inviting and supervising contributions to the blog, actively searching authors, setting agenda for presented content and projects, blog author: putting together content for the blog: surveying relevant projects, publishing news about forthcoming events and documentation (slides, audios, summary) of past events and activities
  • Point of contact for the project, responsible for collaboration and communication to other projects within the Open Knowledge Foundation.
  • Preparing reports: Writing both financial and substantive midterm and final report for the funder as well as weekly reports for the project team.
  • Point of contact and support for the Open Economics fellows: Planning and supervising the recruitment process of the fellows, maintaining regular contact with the fellows, monitoring progress of the fellows’ projects and providing necessary support.
  • Event development and management: concept, planning, research on invitees and relevant projects, programme drafting, sending and following up on invitations, event budgeting, organising the entire event.

#### Person specification

  • Someone self-driven, organised and an excellent communicator. This person should be comfortable running a number of initiatives at the same time, speaking at events and travelling.
  • Having background in economics and knowledge of quantitative research and data analysis.
  • Preferably some knowledge of academic research and some familiarity with stakeholders in the area of economics research.
  • Be comfortable with using online communication and working from different locations.
  • Having ability to engage with community members at all levels – from senior academics to policy-makers, developers, and journalists.
  • #### Location
    We will consider applicants based anywhere in the world; however a mild preference is given to those close to one of our hubs in London, Berlin or Cambridge.

    #### Pay & closing date
    The rate is negotiable based on experience. The closing date for applications is July 15, 2013.

    ####How to apply
    To apply please send a cover letter highlighting relevant experience, your CV and explaining your interest in the role to [email protected]

    First Opinion series on Transparency in Social Science Research

    - June 7, 2013 in Berkeley Initiative for Transparency in the Social Sciences (BITSS), External Projects, Featured, Open Data, Open Economics, Open Research

    The Berkeley Initiative for Transparency in the Social Sciences (BITSS) is a new effort to promote transparency in empirical social science research. The program is fostering an active network of social science researchers and institutions committed to strengthening scientific integrity in economics, political science, behavioral science, and related disciplines.

    Central to the BITSS effort is the identification of useful strategies and tools for maintaining research transparency, including the use of study registries, pre-analysis plans, data sharing, and replication. With its institutuional hub at UC Berkeley, the network facilitates discussion and critique of existing strategies, testing of new methods, and broad dissemination of findings through interdisciplinary convenings, special conference sessions, and online public engagement.

    The first opinion series on transparency in social science research (see: http://cegablog.org/transparency-series/) was published on the CEGA Development Blog in March 2013. The series built on a seminal research meeting held at the University of California, Berkeley on December 7, 2012, which brought together a select interdisciplinary group of scholars – from biostatistics, economics, political science and psychology – with a shared interest in promoting transparency in empirical social science research.

    Second Open Economics International Workshop

    - June 5, 2013 in Announcements, Events, Featured, Open Data, Open Economics, Workshop

    Next week, on June 11-12, at the MIT Sloan School of Management, the Open Economics Working Group of the Open Knowledge Foundation will gather about 40 economics professors, social scientists, research data professionals, funders, publishers and journal editors for the second Open Economics International Workshop.

    The event will follow up on the first workshop held in Cambridge UK and will conclude with agreeing a statement on the Open Economics principles. Some of the speakers include Eric von Hippel, T Wilson Professor of Innovation Management and also Professor of Engineering Systems at MIT, Shaida Badiee, Director of the Development Data Group at the World Bank and champion for the Open Data Initiative, Micah Altman, Director of Research and Head of the Program on Information Science for the MIT Libraries as well as Philip E. Bourne, Professor at the University of California San Diego and Associate Director of the RCSB Protein Data Bank.

    The workshop will address topics including:

    • Research data sharing: how and where to share economics social science research data, enforce data management plans, promote better data management and data use
    • Open and collaborative research: how to create incentives for economists and social scientists to share their research data and methods openly with the academic community
    • Transparent economics: how to achieve greater involvement of the public in the research agenda of economics and social science

    The knowledge sharing in economics session will invite a discussion between Joshua Gans, Jeffrey S. Skoll Chair of Technical Innovation and Entrepreneurship at the Rotman School of Management at the University of Toronto and Co-Director of the Research Program on the Economics of Knowledge Contribution and Distribution, John Rust, Professor of Economics at Georgetown University and co-founder of EconJobMarket.org, Gert Wagner, Professor of Economics at the Berlin University of Technology (TUB) and Chairman of the German Census Commission and German Council for Social and Economic Data as well as Daniel Feenberg, Research Associate in the Public Economics program and Director of Information Technology at the National Bureau of Economic Research.

    The session on research data sharing will be chaired by Thomas Bourke, Economics Librarian at the European University Institute, and will discuss the efficient sharing of data and how to create and enforce reward structures for researchers who produce and share high quality data, gathering experts from the field including Mercè Crosas, Director of Data Science at the Institute for Quantitative Social Science (IQSS) at Harvard University, Amy Pienta, Acquisitions Director at the Inter-university Consortium for Political and Social Research (ICPSR), Joan Starr, Chair of the Metadata Working Group of DataCite as well as Brian Hole, the founder of the open access academic publisher Ubiquity Press.

    Benjamin Mako Hill, researcher and PhD Candidate at the MIT and Berkman Center for Internet and Society at Harvard Univeresity, will chair the session on the evolving evidence base of social science, which will highlight examples of how economists can broaden their perspective on collecting and using data through different means: through mobile data collection, through the web or through crowd-sourcing and also consider how to engage the broader community and do more transparent economic research and decision-making. Speakers include Amparo Ballivian, Lead Economist working with the Development Data Group of the World Bank, Michael P. McDonald, Associate Professor at George Mason University and co-principle investigator on the Public Mapping Project and Pablo de Pedraza, Professor at the University of Salamanca and Chair of Webdatanet.

    The morning session on June 12 will gather different stakeholders to discuss how to share responsibility and how to pursue joint action. It will be chaired by Mireille van Eechoud, Professor of Information Law at IViR and will include short statements by Daniel Goroff, Vice President and Program Director at the Alfred P. Sloan Foundation, Nikos Askitas, Head of Data and Technology at the Institute for the Study of Labor (IZA), Carson Christiano, Head of CEGA’s partnership development efforts and coordinating the Berkeley Initiative for Transparency in the Social Sciences (BITSS) and Jean Roth, the Data Specialist at the National Bureau of Economic Research.

    At the end of the workshop the Working Group will discuss the future plans of the project and gather feedback on possible initiatives for translating discussions in concrete action plans. Slides and audio will be available on the website after the workshop. If you have any questions please contact economics [at] okfn.org

    Open Access Economics: To share or not to share?

    - May 22, 2013 in Featured, Open Access, Open Data, Open Economics, Open Research

    Last Friday, Barry Eichengreen, professor of Economics and Political Science at Berkeley, wrote about “Open Access Economics” at the prestigious commentary, analysis and opinion page Project Syndicate, where influential professionals, politicians, economists, business leaders and Nobel laureates share opinions about current economic and political issues.

    He reaffirmed that indeed the results of the Reinhart and Rogoff study were used by some politicians to justify austerity measures taken by governments around the world with stifling public debt.

    Professor Eichengreen also criticised the National Bureau of Economic Research (NBER) for failing to require data and code for the “flawed study” of the Harvard economists, which appeared first in the distinguished working paper series of NBER.

    In line with the discussion we started at the LSE Social Impact Blog and the New Scientist, Barry Eichengreen brought home the message that indeed the enforcement of a data availability would have made a difference in this case.

    At the same time, some express doubts about the need to share data and think about excuses to avoid sharing the data related to their publication. Economists at the anonymous web forum Econjobrumors.com have been joking about the best ways to avoid sharing data.

    Here are some of “creative” suggestions on how the anonymous author could get around sending their data:

    “Refer him to your press secretary”
    “Tell him you had a computer virus that wiped out the dataset”
    “Not obliged to let anyone free ride. Can you explain it like that?”
    “Tell him its proprietary data and you can’t share it without having to kill him.”
    “Tell him, ‘I’ll show you mine if you show me yours.”
    “…say you signed NDA.”
    “Huddle in the corner of your office wrapped in a blanket and some hot coco from the machine down the hall and wait for the inevitable.”
    “Don’t reply.”

    Anonymous author: “No, did not make up the results. But let’s just say you really do not want to play with the data in any way. No good for significance.”
    Anonymous comment: “Added a couple of extra stars for good luck?”.

    While many of the discussions on the anonymous blog are employing humour and jokes, this discussion reflects a mainstream attitude towards data sharing. It also shows how uncertain are some authors of the robustness of their results – even if they did not make any Reinhart and Rogoff excel mistakes, they are hesitating about sharing lest closer scrutiny would expose weaker methodology. Maybe more disclosure – there data can be shared – could improve the way research is done.

    Automated Game Play Datasets: New Releases

    - April 24, 2013 in Announcements, Data Release, Featured, Open Data, Open Economics, Open Research

    Last month we released ten datasets from the research project “Small Artificial Human Agents for Virtual Economies“, implemented by Professor David Levine and Professor Yixin Chen at the Washington University of St. Louis and funded by the National Science Foundation [See dedicated webpage].

    We are now happy to announce that the list has grown with seven more datasets, now hosted at datahub.io which were added this month, including:


    Clark, K. & Sefton, M., 2001. Repetition and signalling: experimental evidence from games with efficient equilibria. Economics Letters, 70(3), pp.357–362.

    Link to publication | Link to data
    Costa-Gomes, M. and Crawford, V. 2006. “Cognition and Behavior in Two-Person guessing Games: An Experimental Study.” The American Economic Review, 96(5), pp.1737-1768

    Link to publication | Link to data
    Costa-Gomes, M., Crawford, V. and Bruno Broseta. 2001. “Cognition and Behavior in Normal-Form Games: An Experimental Study.” Econometrica, 69(5), pp.1193-1235

    Link to publication | Link to data
    Crawford, V., Gneezy, U. and Yuval Rottenstreich. 2008. “The Power of Focal Points is Limited: Even Minute Payoff Asymmetry May Yield Large Coordination Failures.” The American Economic Review, 98(4), pp.1443-1458

    Link to publication | Link to data
    Feltovich, N., Iwasaki, A. and Oda, S., 2012. Payoff levels, loss avoidance, and equilibrium selection in games with multiple equilibria: an experimental study. Economic Inquiry, 50(4), pp.932-952.

    Link to publication | Link to data
    Feltovich, N., & Oda, S., 2013. The effect of matching mechanism on learning in games played under limited information, Working paper

    Link to publication | Link to data
    Schmidt D., Shupp R., Walker J.M., and Ostrom E. 2003. “Playing Safe in Coordination Games: The Roles of Risk Dominance, Payoff Dominance, and History of Play.” Games and Economic Behaviour, 42(2), pp.281–299.

    Link to publication | Link to data

    Any questions or comments? Please get in touch: economics [at] okfn.org

    Reinhart-Rogoff Revisited: Why we need open data in economics

    - April 18, 2013 in Featured, Open Data, Open Economics, Open Research, Public Finance and Government Data

    Another economics scandal made the news this week. Harvard Kennedy School professor Carmen Reinhart and Harvard University professor Kenneth Rogoff argued in their 2010 NBER paper that economic growth slows down when the debt/GDP ratio exceeds the threshold of 90 percent of GDP. These results were also published in one of the most prestigious economics journals – the American Economic Review (AER) – and had a powerful resonance in a period of serious economic and public policy turmoil when governments around the world slashed spending in order to decrease the public deficit and stimulate economic growth.

    Carmen Reinhart

    Kenneth Rogoff

    Yet, they were proven wrong. Thomas Herndon, Michael Ash and Robert Pollin from the University of Massachusetts (UMass) tried to replicate the results of Reinhart and Rogoff and criticised them on the basis of three reasons:

    • Coding errors: due to a spreadsheet error five countries were excluded completely from the sample resulting in significant error of the average real GDP growth and the debt/GDP ratio in several categories
    • Selective exclusion of available data and data gaps: Reinhart and Rogoff exclude Australia (1946-1950), New Zealand (1946-1949) and Canada (1946-1950). This exclusion is alone responsible for a significant reduction of the estimated real GDP growth in the highest public debt/GDP category
    • Unconventional weighting of summary statistics: the authors do not discuss their decision to weight equally by country rather than by country-year, which could be arbitrary and ignores the issue of serial correlation.

    The implications of these results are that countries with high levels of public debt experience only “modestly diminished” average GDP growth rates and as the UMass authors show there is a wide range of GDP growth performances at every level of public debt among the twenty advanced economies in the survey of Reinhart and Rogoff. Even if the negative trend is still visible in the results of the UMass researchers, the data fits the trend very poorly: “low debt and poor growth, and high debt and strong growth, are both reasonably common outcomes.”

    Source: Herndon, T., Ash, M. & Pollin, R., “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff, Public Economy Research Institute at University of Massachusetts: Amherst Working Paper Series. April 2013.

    What makes it even more compelling news is that it is all a tale from the state of Massachusetts: distinguished Harvard professors (#1 university in the US) challenged by empiricists from the less known UMAss (#97 university in the US). Then despite the excellent AER data availability policy – which acts as a role model for other journals in economics – has failed to enforce it and make the data and code of Reinhart and Rogoff available to other researchers.

    Coding errors happen, yet the greater research misconduct was not allowing for other researchers to review and replicate the results through making the data openly available. If the data and code were available upon publication already in 2010, it may not have taken three years to prove these results wrong, which may have probably influenced the direction of public policy around the world towards stricter austerity measures. Sharing research data means a possibility to replicate and discuss, enabling the scrutiny of research findings as well as improvement and validation of research methods through more scientific enquiry and debate.

    Get in Touch

    The Open Economics Working Group advocates the release of datasets and code along with published academic articles and provides practical assistance to researchers who would like to do so. Get in touch if you would like to learn more by writing us at economics [at] okfn.org and signing for our mailing list.

    References

    Herndon, T., Ash, M. & Pollin, R., “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff, Public Economy Research Institute at University of Massachusetts: Amherst Working Paper Series. April 2013: Link to paper |
    Link to data and code

    Open Research Data Handbook – Call for Case Studies

    - April 9, 2013 in Announcements, Call for participation, Featured, Open Data, Open Economics, Open Research

    The OKF Open Research Data Handbook – a collaborative and volunteer-led guide to Open Research Data practices – is beginning to take shape and we need you! We’re looking for case studies showing benefits from open research data: either researchers who have personal stories to share or people with relevant expertise willing to write short sections.

    Designed to provide an introduction to open research data, we’re looking to develop a resource that will explain what open research data actually is, the benefits of opening up research data, as well as the processes and tools which researchers need to do so, giving examples from different academic disciplines.

    Leading on from a couple of sprints, a few of us are in the process of collating the first few chapters, and we’ll be asking for comment on these soon.

    In the meantime, please provide us with case studies to include, or let us know if you are willing to contribute areas of expertise to this handbook.

    i want you

    We now need your help to gather concrete case studies which detail your experiences of working with Open Research Data. Specifically, we are looking for:

    • Stories of the benefits you have seen as a result of open research data practices
    • Challenges you have faced in open research, and how you overcame them
    • Case studies of tools you have used to share research data or to make it openly available
    • Examples of how failing to follow open research practices has hindered the progress of science, economics, social science, etc.
    • … More ideas from you!

    Case studies should be around 200-500 words long. They should be concrete, based on real experiences, and should focus on one specific angle of open research data (you can submit more than one study!).

    Please fill out the following form in order to submit a case study:

    Link to form

    If you have any questions, please contact us on researchhandbook [at] okfn.org

    Disclosure and ‘cook booking’

    - March 25, 2013 in Contribution Economy, Economic Publishing, Featured, Open Data, Open Economics

    This blog post is cross-posted from the Contribution Economy Blog.


    Many journals now have open data policies but they are sparingly enforced. So many scientists do not submit data. The question is: what drives them not to submit? Is it laziness? Is it a desire to keep the data to themselves? Or is it something more sinister? After all, the open data rules were, in part, to allow for replication experiments to ensure that the reported results were accurate.

    Robert Trivers reports on an interesting study by Wicherts, Bakker, and Mlenar that correlates disclosure of data with the statistical strength of results in psychological journals.

    Here is where they got a dramatic result. They limited their research to two of the four journals whose scientists were slightly more likely to share data and most of whose studies were similar in having an experimental design. This gave them 49 papers. Again, the majority failed to share any data, instead behaving as a parody of academics. Of those asked, 27 percent failed to respond to the request (or two follow-up reminders)—first, and best, line of self-defense, complete silence—25 percent promised to share data but had not done so after six years and 6 percent claimed the data were lost or there was no time to write a codebook. In short, 67 percent of (alleged) scientists avoided the first requirement of science—everything explicit and available for inspection by others.

    Was there any bias in all this non-compliance? Of course there was. People whose results were closer to the fatal cut-off point of p=0.05 were less likely to share their data. Hand in hand, they were more likely to commit elementary statistical errors in their own favor. For example, for all seven papers where the correctly computed statistics rendered the findings non-significant (10 errors in all) none of the authors shared the data. This is consistent with earlier data showing that it took considerably longer for authors to respond to queries when the inconsistency in their reported results affected the significance of the results (where responses without data sharing!). Of a total of 1148 statistical tests in the 49 papers, 4 percent were incorrect based only on the scientists’ summary statistics and a full 96 percent of these mistakes were in the scientists’ favor. Authors would say that their results deserved a ‘one-tailed test’ (easier to achieve) but they had already set up a one-tailed test, so as they halved it, they created a ‘one-half tailed test’. Or they ran a one-tailed test without mentioning this even though a two-tailed test was the appropriate one. And so on. Separate work shows that only one-third of psychologists claim to have archived their data—the rest make reanalysis impossible almost at the outset! (I have 44 years of ‘archived’ lizard data—be my guest.) It is likely that similar practices are entwined with the widespread reluctance to share data in other “sciences” from sociology to medicine. Of course this statistical malfeasance is presumably only the tip of the iceberg, since in the undisclosed data and analysis one expects even more errors.

    It’s correlation but it is troubling. The issue is that authors present results selectively and sadly this is not picked up in peer review processes. Of course, it goes without saying that even with open data, it takes effort to replicate and then publish alternative results and conclusions.