
Italy must expand its online franchise: a policy promising attractive side-effects for the Italian economy

- November 26, 2013 in Featured, Open Data


This post has been reposted from the Bruegel Blog.

“55 years after its promulgation, how would you like to change the Italian constitution?” This is the rather difficult question posed to Italians in the online public consultation that closed in early October. Nonetheless, this attempt to improve the discourse between policy-making institutions and their citizens may have represented a distorted reality, skewed toward the most educated. To successfully use emerging technologies to enhance its democracy, Italy must expand the online franchise by improving broadband access and bringing down service costs for more Italians – a policy promising attractive side-effects for the Italian economy at a challenging time.

With the online public consultation on the Italian constitution, two democratic innovations were attempted in Italy. First, using a public consultation to collect citizen opinions on a large and challenging topic rather than a more specific one. Second, releasing the results of a public consultation as open data, under the Creative Commons licence CC-BY, which permits sharing and remixing of the work. But to be effective, new democratic channels need to be supported by adequate infrastructure and a consistent digital culture. As the data show, this might not currently be the case in Italy.

As in other European countries, Italy’s most active Internet users are young people between the ages of 16 and 35, with around 80% of this group accessing online services regularly. In comparison, the same is true for only 50% of those between 44 and 54 years old (Eurostat). If political interest prevails over lack of digital competence, the survey outcomes could be skewed towards the older part of the population, and vice versa. The first data released by ISTAT (the Italian Institute of Statistics, which analyzed the survey outcomes) show that respondents aged 38 to 57 alone account for over 40% of the total, twice the share of respondents aged 23 to 37.

Furthermore, as the consultation was available online only, significant parts of the population may have been put at a disadvantage when wanting to participate, either through a lack of competence or simply an absence of Internet access. Italy suffers from a significant digital divide. In 2012, household broadband access stood at 55% (against 70% for the EU-27), with regional variations across the country of up to 20%.

The range of respondents could also be narrower than hoped. New instruments of democracy may attract a part of the population less familiar with canonical democratic channels (e.g. voting in elections, debating in public forums), but once again the data betray expectations. Most of the survey participants had at least a high school diploma. This could have been foreseen from Eurostat data, which show significant positive correlations between education level and internet usage. Not only has this technological advantage potentially favored those with higher education, but the most educated respondents are also overrepresented among those already exercising their civil rights through traditional tools.

These threats hang over the validity and efficacy of the survey, in which only around 0.4% of people with the right to vote participated. Introducing new democratic measures is a first step on a path of political and economic modernization. Nonetheless, to face this challenge Italy needs to guarantee all citizens equal and homogeneous access to broadband across the country.

Some policies to this end have already been adopted. The Italian government launched the ‘National Plan for Broadband’ in 2008, aimed at reducing the infrastructure deficit that currently excludes 8.5 million people from broadband access. The European Digital Agenda stipulates that all Europeans should have access to Internet speeds above 30 megabits per second by 2020. To meet these objectives, Italy presented the ‘Strategic Project for Ultra Broadband’ at the end of 2012; the plan’s public investment tranche of 900 million euros, announced last February, aims to create over 5,000 new jobs and raise GDP by 1.3 billion euros.

Access does not mean use. Equal broadband access needs to be accompanied by affordable prices. In 2012, Italian citizens paid 25% more than the OECD average for broadband access. The government should work primarily to open the market to more competition, or even to intervene directly where the market fails on price.


Source: OECD

Betting on broadband infrastructure is a winning game. Offline, it will significantly increase GDP, the number of jobs and the level of innovation across the country. Online, though the experience of other countries suggests it offers no magic pill, it could improve dialogue with institutions by providing more options for participation, for example via new e-government tools.

Open model of an oil contract

- October 22, 2013 in External Projects, Featured, Open Data, Open Economics

Please come and kick the tires of our open model of an oil contract!

In the next month or so, OpenOil and its partners will publish what we believe will be the first financial model of an oil contract under Creative Commons license. We would like to take this opportunity to invite the Open Economics community to come and kick the wheels on the model when it is ready, and help us improve it.

We need you because we expect a fair degree of heat from those with a financial or reputational stake in continued secrecy around these industries. We expect the brunt of attacks to be on the basis that we are wrong. And of course we will be wrong in some way. It’s inevitable. So we would like our defence to be not, “no we’re never wrong”, but “yes, sometimes we are wrong, but transparently so and for the right reasons – and look, here are a bunch of friends who have already pointed out these errors, which have been corrected. You got some specific critiques, come give them. But the price of criticism is improvement – the open source way!” We figure Open Economics is the perfect network to seek that constructive criticism.


Ultimately, we want to grow an open source community which will help build a systematic understanding of the economics of the oil and gas industry independent of investor or government stakes, since the public policy impact of these industries and the relevant flows is too vital to be left to industry specialists. There are perhaps 50 countries in the world where such models could transform public understanding of industries which dominate the political economy.

The model itself is still being fine-tuned, but I’d like to take this chance to throw out a few heuristics that have emerged in the process of building it.

Public interest modelling. The model is being built by professionals with industry experience but its primary purpose is to inform public policy, not to aid investment decisions or serve as negotiation support for either governments or companies. This has determined a distinct approach to key issues such as management of complexity and what is an acceptable margin of error.

Management of complexity. Although there are several dozen variables one could model, and which typically appear in the models produced for companies, we deliberately exclude a long tail of fiscal terms, such as ground rent and signature bonuses, on the basis that the gain from a reduced margin of error is less than the loss from increased complexity for the end user. We also exclude many of the fine-tuning provisions of the taxation system. We list these terms in a sheet so that those who wish can extend the model with them. It would be great, for example, to get tax geek help on refining some of these issues.

A hierarchy of margins of error. Extractives projects can typically last 25 years. The biggest single margin of error is not within human power to solve – future price. All other uncertainties or estimates pale in comparison with its impact on returns to all stakeholders. Second are the capex and opex going into a project. The international oil company may be the only real source of these data, and may or may not share them in disaggregated form with the government – everyone else is in the dark. For public interest purposes, the margin of error created by all other fiscal terms and input assumptions combined is less significant, and manageable.

Moving away from the zero-sum paradigm. Because modelling has traditionally been associated with the negotiation process, and perhaps because of the wider context surrounding extractive industries, a zero-sum paradigm often predominates in public thinking around the terms of these contracts. But the model shows graphically two distinct ways in which that paradigm does not apply. First, in agreements with sufficient progressivity, rising commodity prices can mean a simultaneous rise in both government take and the company’s Internal Rate of Return. Second, a major issue for governments and societies depending on oil production is volatility – using minimal versus maximal assumptions across all of the inputs will likely produce radically different results. One of a country’s biggest challenges, then, is focusing enough attention on regulating itself, its politicians’ appetite for spending, its public’s appetite for patronage. We know this of course in the real world. Iraq received $37 billion in 2007, then $62 billion in 2008, then $43 billion or so in 2009. But it is the old journalistic difference between show and tell. A model can show this in your country, with your conditions.
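To make the first of these points concrete, here is a minimal, purely illustrative sketch – not OpenOil’s model – of a toy fiscal regime with a 10% royalty and a profit tax whose rate steps up above a hypothetical price threshold. Every number in it is invented; the point is only to show that, with enough progressivity, a higher oil price can raise government take and the company’s IRR at the same time.

```python
"""Toy progressive fiscal regime: illustrative only, not OpenOil's model.
All figures (capex, opex, production, tax steps) are invented."""

def irr(cashflows, lo=0.0, hi=1.0, tol=1e-6):
    """Internal rate of return by bisection (assumes NPV changes sign on [lo, hi])."""
    def npv(rate):
        return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def project(price):
    capex, years, barrels, opex_per_bbl = 60_000, 10, 500, 20.0
    royalty_rate = 0.10
    profit_tax = 0.30 if price < 80 else 0.50   # crude proxy for progressivity
    revenue = price * barrels
    royalty = royalty_rate * revenue
    taxable = revenue - royalty - opex_per_bbl * barrels   # no depreciation, for simplicity
    tax = profit_tax * taxable
    company_cf = taxable - tax                             # company's annual after-tax cash flow
    pre_take_cf = revenue - opex_per_bbl * barrels         # project cash flow before government take
    take = (royalty + tax) / pre_take_cf
    company_irr = irr([-capex] + [company_cf] * years)
    return take, company_irr

for price in (60, 100):
    take, rate = project(price)
    print(f"oil at ${price}/bbl: government take {take:.0%}, company IRR {rate:.1%}")
```

Run the same sketch with minimal and maximal price assumptions instead and it also illustrates the volatility point: the spread in outcomes dwarfs every other input.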

The value of contract transparency. Last only because it is self-evident: primary extractives contracts between states and companies need to enter the public domain. So far only about seven jurisdictions around the world publish all contracts, but doing so is gaining traction as a norm in the governance community. The side-effects of the way extractive industries are managed now are almost all due to the ill-understood nature of rent. Even corruption, the hottest issue politically, may often simply be a secondary effect of the rent-based nature of the core activities. Publishing all contracts is the single biggest measure that would get us closer to being able to address the root causes of the resource curse.

See http://openoil.net/ for more details.

Fundamental Stock Valuation on an Open Platform

- September 3, 2013 in External Projects, Featured, Open Data

Investors have traditionally relied on Wall Street analysts for projections of companies’ intrinsic values. Wall Street analysts typically come up with their valuations using Discounted Cash Flow (DCF) analysis. However, they do not disclose the proprietary models used to arrive at buy, sell or hold recommendations. Thinknum offers a solution which allows users to build their own models.

A cash flow model is a tool for translating projections of a company’s future operating performance, such as revenue growth and cost of goods sold, into an intrinsic value for the company. Without viewing the assumptions underlying a model, a leap of faith is required in order to use the model’s outputs. With Thinknum, users can view and change any formula or assumption that drives the valuation. The interactive nature of the application allows users to conduct ‘what-if’ analysis to test how sensitive a company’s valuation is to changes in a specific performance measure.

To get started, all that is needed is a stock ticker. After entering the ticker, Thinknum displays a model using the mean of analysts’ revenue growth projections. We load the historical numbers for the company’s balance sheet, income statement and the statement of cash flows from corporate filings.  We then use the growth assumptions to project how the company’s financial performance will evolve over time and how much value will ultimately accrue to shareholders. Users can modify the model or build one from scratch. Users can also download the models into Excel spreadsheets.

The Google DCF 3 Statement Model pictured above is an example of a model I recently built to value Google’s stock price. If you disagree with my assumptions about Google’s revenue growth, you can simply change those assumptions and compute the new value. DCF models can be used to make rational investment decisions by comparing the model’s intrinsic value to the current market price.
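For readers who want to see the mechanics in miniature, the following sketch is a deliberately simplified DCF – not Thinknum’s actual model, and with invented inputs rather than Google’s real figures – that projects revenue at a constant growth rate, converts it to free cash flow with a flat margin, and discounts it back to a per-share value. Changing the growth assumption and re-running is exactly the kind of ‘what-if’ analysis described above.

```python
"""Minimal DCF sketch. All inputs are hypothetical; not Thinknum's model."""

def dcf_value(revenue, growth, fcf_margin, discount_rate, terminal_growth, years=5):
    """Discount projected free cash flows plus a Gordon-growth terminal value."""
    value = 0.0
    for t in range(1, years + 1):
        revenue *= (1 + growth)                 # project next year's revenue
        fcf = revenue * fcf_margin              # flat free-cash-flow margin
        value += fcf / (1 + discount_rate) ** t
    terminal = fcf * (1 + terminal_growth) / (discount_rate - terminal_growth)
    return value + terminal / (1 + discount_rate) ** years

# Hypothetical inputs: $60bn revenue, 12% growth, 25% FCF margin, 10% discount rate, 3% terminal growth.
# Net cash and debt are ignored for simplicity, so firm value is treated as equity value.
firm_value = dcf_value(60e9, 0.12, 0.25, 0.10, 0.03)
shares_outstanding = 330e6                      # also hypothetical
print(f"intrinsic value per share: ${firm_value / shares_outstanding:,.0f}")
```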

One important caveat is that any model is only as good as the assumptions underlying it. We provide data from over 2,000 sources in an attempt to place proper context around companies and help analysts make the best assumptions based on all the information available. Thinknum users can plot any number in the cash flow models over time. Visualizing numbers over time and comparing metrics across the industry help users gain insight into a company’s historical performance and how such performance might vary going forward. For example, simply type total_revenue(goog) into the expression window to pull up Google’s total historical revenue. You can then click on the bar graphs to pull up the corporate filings used in the charts.

We are excited about the role the web can play in helping us make better decisions by rationally analyzing available data.

Open Economics: the story so far…

- August 30, 2013 in Advisory Panel, Announcements, Events, Featured, Open Data, Open Economics, Projects

A year and a half ago we embarked on the Open Economics project with the support of the Alfred P. Sloan Foundation, and we would like to share a short recap of what we have been up to.

Our goal was to define what open data means for the economics profession and to become a central point of reference for those who wanted to learn what it means to have openness, transparency and open access to data in economics.

Advisory Panel of the Open Economics Working Group:
openeconomics.net/advisory-panel/

Advisory Panel

We brought together an Advisory Panel of twenty senior academics who advised us and provided input on people and projects we needed to contact and issues we needed to tackle. The progress of the project has depended on the valuable support of the Advisory Panel.

1st Open Economics Workshop, Dec 17-18 ’12, Cambridge, UK:
openeconomics.net/workshop-dec-2012/

2nd Open Economics Workshop, 11-12 June ’13, Cambridge, MA:
openeconomics.net/workshop-june-2013

International Workshops

We also organised two international workshops, the first held in Cambridge, UK, on 17-18 December 2012 and the second in Cambridge, MA, US, on 11-12 June 2013, convening academics, funders, data publishers, information professionals and students to share ideas and build an understanding of the value of open data, the barriers that still persist to opening up information, and the incentives and structures which our community should encourage.

Open Economics Principles

While defining open data for economics, we also saw the need to issue a statement on the openness of data and code – the Open Economics Principles – to emphasise that data, program code, metadata and instructions which are necessary to replicate economics research should be open by default. Launched in August, this statement is now being widely endorsed by the economics community, most recently by the World Bank’s Data Development Group.

Projects

The Open Economics Working Group and several of its more involved members have worked on smaller projects to showcase how data can be made available and what tools can be built to encourage discussion, participation and a wider understanding of economics. We built the app Yourtopia Italy – http://italia.yourtopia.net/ – a user-defined multidimensional index of social progress, which won a special prize in the Apps4Italy competition.

Yourtopia Italy: application of a user-defined multidimensional index of social progress: italia.yourtopia.net

We created the Failed Bank Tracker, a list and timeline visualisation of the banks in Europe which failed during the last financial crisis, and released the Automated Game Play Datasets, the data and code of papers from the Small Artificial Agents for Virtual Economies research project, implemented by Professor David Levine and Professor Yixin Chen at Washington University in St. Louis. More recently we launched Metametrik, a prototype platform for the storage and search of regression results in economics.


MetaMetrik: a prototype for the storage and search of econometric results: metametrik.openeconomics.net

We also organised several events in London and a topic stream about open knowledge and sustainability at the OKFestival with a panel bringing together a diverse range of panelists from academia, policy and the open data community to discuss how open data and technology can help improve the measurement of social progress.

Blog and Knowledge Base

We blogged about issues like the benefits of open data from the perspective of economics research, the EDaWaX survey of the data availability of economics journals, pre-registration in the social sciences, crowd-funding, as well as open access. We also presented projects like the Statistical Memory of Brazil, Quandl and the AEA randomized controlled trials registry.

Some of the issues we raised had a wider resonance: for example, when Thomas Herndon found significant errors while trying to replicate the results of Harvard economists Reinhart and Rogoff, we emphasised that while such errors may happen, it is a greater crime not to make the data available with published research in order to allow for replication.

Some outcomes and expectations

We found that opening up data in economics may be a difficult matter, as many economists utilise data which cannot be open because of privacy, confidentiality or because they don’t own that data. Sometimes there are insufficient incentives to disclose data and code. Many economists spend a lot of resources in order to build their datasets and obtain an advantage over other researchers by making use of information rents.

Some journals have been leading the way in putting in place data availability requirements and funders have been demanding data management and sharing plans, yet more general implementation and enforcement is still lacking. There are now, however, more tools and platforms available where researchers can store and share their research content, including data and code.

There are also great benefits to sharing economics data: it enables the scrutiny of research findings and makes it possible to replicate research, it enhances the visibility of research, promotes new uses of the data, avoids unnecessary costs of data collection, etc.

In the future we hope to concentrate on projects which would involve graduate students and early career professionals, a generation of economics researchers for whom sharing data and code may become more natural.

Keep in touch

Follow us on Twitter @okfnecon, sign up to the Open Economics mailing list and browse our projects and resources at openeconomics.net.

EC Consultation on open research data

- July 17, 2013 in Featured, Open Access, Open Data

The European Commission held a public consultation on open access to research data on July 2 in Brussels inviting statements from researchers, industry, funders, IT and data centre professionals, publishers and libraries. The inputs of these stakeholders will play some role in revising the Commission’s policy and are particularly important for the ongoing negotiations on the next big EU research programme Horizon 2020, where about 25-30 billion Euros would be available for academic research. Five questions formed the basis of the discussion:

  • How can we define research data and what types of research data should be open?
  • When and how does openness need to be limited?
  • How should the issue of data re-use be addressed?
  • Where should research data be stored and made accessible?
  • How can we enhance “data awareness” and a “culture of sharing”?

Here is how the Open Knowledge Foundation responded to the questions:

How can we define research data and what types of research data should be open?

Research data is extremely heterogeneous, and would include (although not be limited to) numerical data, textual records, images, audio and visual data, as well as custom-written software, other code underlying the research, and pre-analysis plans. Research data would also include metadata – data about the research data itself – including uncertainties and methodology, versioned software, standards and other tools. Metadata standards are discipline-specific, but to be considered ‘open’, at a bare minimum it would be expected to provide sufficient information that a fellow researcher in the same discipline would be able to interpret and reuse the data, as well as be itself openly available and machine-readable. Here, we are specifically concerned with data that is being produced, and therefore can be controlled by the researcher, as opposed to data the researcher may use that has been produced by others.

When we talk about open research data, we are mostly concerned with data that is digital, or the digital representation of non-digital data. While primary research artifacts, such as fossils, have obvious and substantial value, the extent to which they can be ‘opened’ is not clear. However, 3D scanning techniques can and should be used to capture many physical features or an image, enabling broad access to the artifact. This would benefit both researchers who are unable to travel to visit a physical object and interested citizens who would typically be unable to access such an item.

By default there should be an expectation that all types of research data that can be made public, including all metadata, should be made available in machine-readable form and open as per the Open Definition. This means the data resulting from public work is free for anyone to use, reuse and redistribute, with at most a requirement to attribute the original author(s) and/or share derivative works. It should be publicly available and licensed with this open license.

When and how does openness need to be limited?

The default position should be that research data should be made open in accordance with the Open Definition, as defined above. However, while access to research data is fundamentally democratising, there will be situations where the full data cannot be released; for instance for reasons of privacy.

In these cases, researchers should share analysis under the least restrictive terms consistent with legal requirements, abiding by the research ethics dictated by the terms of the research grant. This should include opening up non-sensitive data, summary data, metadata and code, and providing access to the original data to those who can ensure that appropriate measures are in place to mitigate any risks.

Access to research data should not be limited by the introduction of embargo periods, and arguments in support of embargo periods should be considered a reflection of inherent conservatism among some members of the academic community. Instead, the expectation should be that data is to be released before the project that funds the data production has been completed; and certainly no later than the publication of any research output resulting from it.

How should the issue of data re-use be addressed?

Data is only meaningfully open when it is available in a format and under an open license which allow re-use by others. But simply making data available is often not sufficient for reusing it. Metadata must be supplied that provides sufficient documentation to enable other researchers to replicate empirical results.

There is a role here for data publishers and repository managers to endeavour to make the data usable and discoverable by others. This can be by providing further documentation, the use of standard code lists, etc., as these all help make data more interoperable and reusable. Submission of the data to standard registries and use of common metadata also enable greater discoverability. Interoperability and the availability of data in machine-readable form are crucial to ensure data-mining and text-mining of the data can be performed, a form of re-use that must not be restricted.

Arguments are sometimes made that we should monitor levels of data reuse, to allow us to dynamically determine which datasets should be retained. We reject this suggestion. There is a moral responsibility to preserve data created with taxpayer funds, including data that represents negative results or that is not obviously linked to publications. It is impossible to predict possible future uses, and reuse opportunities may exist that are not immediately obvious. It is also crucial to note that research interests change over time.

Where should research data be stored and made accessible?

Each discipline needs different options available to store data and open it up to their community and the world; there is no one-size-fits-all solution. The research data infrastructure should be based on open source software and interoperable based on open standards. With these provisions we would encourage researchers to use the data repository that best fits their needs and expectations, for example an institutional or subject repository. It is crucial that appropriate metadata about the data deposited is stored as well, to ensure this data is discoverable and can be re-used more easily.

Both the data and the metadata should be openly licensed. They should be deposited in machine-readable and open formats, similar to how the US government mandates this in its Executive Order on Government Information. This ensures the possibility of linking repositories and data across various portals and makes it easier to find the data. For example, the open source data portal CKAN, developed by the Open Knowledge Foundation, enables the depositing of data and metadata and makes it easy to find and re-use data. Various universities, such as the Universities of Bristol and Lincoln, already use CKAN for these purposes.
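As a small illustration of how machine-readable metadata makes deposits discoverable, here is a minimal sketch that queries a CKAN portal’s standard Action API for datasets. The portal URL is a placeholder for any institutional CKAN instance; `package_search` is part of CKAN’s documented API.

```python
"""Sketch: search a CKAN data portal for research datasets via the Action API.
The portal URL is a placeholder; package_search is CKAN's standard search action."""
import requests

CKAN_PORTAL = "https://data.example-university.ac.uk"   # hypothetical CKAN instance

resp = requests.get(
    f"{CKAN_PORTAL}/api/3/action/package_search",
    params={"q": "economics", "rows": 5},
    timeout=10,
)
resp.raise_for_status()
result = resp.json()["result"]

print(f"{result['count']} matching datasets")
for dataset in result["results"]:
    # each record carries its own metadata, including licence and resource formats
    formats = sorted({r.get("format", "") for r in dataset.get("resources", [])})
    print(dataset["title"], "|", dataset.get("license_title"), "|", formats)
```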

How can we enhance data awareness and a culture of sharing?

Academics, research institutions, funders, and learned societies all have significant responsibilities in developing a culture of data sharing. Funding agencies and organisations disbursing public funds have a central role to play and must ensure research institutions, including publicly supported universities, have access to appropriate funds for longer-term data management. Furthermore, they should establish policies and mandates that support these principles.

Publication and, more generally, sharing of research data should be ingrained in the academic culture and seen as a fundamental part of scholarly communication. However, it is often seen as detrimental to a career, partly as a result of the current incentive system set up by universities and funders, and partly as a result of much misunderstanding of the issues.

Educational and promotional activities should be set up to promote awareness of open access to research data amongst researchers, to help disentangle the many myths, and to encourage them to self-identify as supporting open access. These activities should be set up in recognition of the fact that different disciplines are at different stages in the development of the culture of sharing. Simultaneously, universities and funders should explore options for creating incentives to encourage researchers to publish their research data openly. Acknowledgements of research funding, traditionally limited to publications, could be extended to research data, and the contribution of data curators should be recognised.


Quandl: find and use numerical data on the internet

- July 2, 2013 in External Projects, Featured, Open Data

Quandl.com is a platform for numerical data that currently offers 6 million free and open time series datasets. Conceptually, Quandl aspires to do for quantitative data what Wikipedia did for qualitative information: create one location where quantitative data is easily available, immediately usable and completely open.

A screenshot from the Quandl data page

Open Economics and Quandl thus share a number of core values and objectives. In fact, at Quandl we are working to build part of the “transparent foundation” that is central to the Open Economics mission.

Quandl was invented to alleviate a problem that almost every econometrician knows well: finding, validating, formatting and cleaning data is a tedious and time consuming prerequisite to econometric analysis. We’re gradually reducing the magnitude of this problem by bringing all open time series datasets to one place, one source at a time.

To do this, we’ve built a sort of “universal data parser” which has thus far parsed about 6.4 million datasets. We’ve asked nothing of any data publisher. As long as they spit out data somehow (Excel, text file, blog post, xml, api, etc) the “Q-bot” can slurp it up.

The result is www.quandl.com, a sort of “search engine” for time series data. The idea with Quandl is that you can find data fast. And more importantly, once you find it, it is ready to use. This is because Quandl’s bot returns data in a totally standard format, which means we can then translate it into any format a user wants.

Quandl is rich in financial, economic and sociological time series data. The data is easy to find. It is transparent to source. Datasets can easily be merged with each other. The data can be visualized and shared. It is all open. It is all free. There’s much more about our vision on the about page.
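To give a feel for what “ready to use” and “easily merged” mean in practice, here is a minimal sketch that pulls two series as CSV and joins them on date. The CSV endpoint pattern and the dataset codes are assumptions for illustration only, not a documented contract.

```python
"""Sketch: pull two Quandl time series as CSV and merge them on date.
The URL pattern, dataset codes and column names are assumptions for illustration."""
import pandas as pd

BASE = "https://www.quandl.com/api/v3/datasets"   # assumed CSV endpoint pattern

def get_series(code, column_name):
    # each dataset downloads as a CSV with a Date column plus one or more value columns
    df = pd.read_csv(f"{BASE}/{code}.csv", parse_dates=["Date"], index_col="Date")
    return df.iloc[:, 0].rename(column_name)      # keep the first data column, renamed for clarity

# hypothetical dataset codes
gdp = get_series("FRED/GDP", "us_gdp")
unemployment = get_series("FRED/UNRATE", "us_unemployment")

# because both series share the same standard shape, merging is a one-liner
merged = pd.concat([gdp, unemployment], axis=1).dropna()
print(merged.tail())
```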

Every day Quandl’s coverage increases thanks to contributions made by Quandl users. We aspire to reach a point where publishers instinctively choose to put their data on Quandl. This has already started to happen because Quandl offers a solid, highly usable and totally open platform for time series data. We will work to perpetuate this trend and thus do our small part to advance the open data movement.

First Opinion series on Transparency in Social Science Research

- June 7, 2013 in Berkeley Initiative for Transparency in the Social Sciences (BITSS), External Projects, Featured, Open Data, Open Economics, Open Research

The Berkeley Initiative for Transparency in the Social Sciences (BITSS) is a new effort to promote transparency in empirical social science research. The program is fostering an active network of social science researchers and institutions committed to strengthening scientific integrity in economics, political science, behavioral science, and related disciplines.

Central to the BITSS effort is the identification of useful strategies and tools for maintaining research transparency, including the use of study registries, pre-analysis plans, data sharing, and replication. With its institutional hub at UC Berkeley, the network facilitates discussion and critique of existing strategies, testing of new methods, and broad dissemination of findings through interdisciplinary convenings, special conference sessions, and online public engagement.

The first opinion series on transparency in social science research (see: http://cegablog.org/transparency-series/) was published on the CEGA Development Blog in March 2013. The series built on a seminal research meeting held at the University of California, Berkeley on December 7, 2012, which brought together a select interdisciplinary group of scholars – from biostatistics, economics, political science and psychology – with a shared interest in promoting transparency in empirical social science research.

Second Open Economics International Workshop

- June 5, 2013 in Announcements, Events, Featured, Open Data, Open Economics, Workshop

Next week, on June 11-12, at the MIT Sloan School of Management, the Open Economics Working Group of the Open Knowledge Foundation will gather about 40 economics professors, social scientists, research data professionals, funders, publishers and journal editors for the second Open Economics International Workshop.

The event will follow up on the first workshop held in Cambridge, UK, and will conclude with agreement on a statement of the Open Economics Principles. Speakers include Eric von Hippel, T. Wilson Professor of Innovation Management and Professor of Engineering Systems at MIT; Shaida Badiee, Director of the Development Data Group at the World Bank and champion of the Open Data Initiative; Micah Altman, Director of Research and Head of the Program on Information Science for the MIT Libraries; and Philip E. Bourne, Professor at the University of California San Diego and Associate Director of the RCSB Protein Data Bank.

The workshop will address topics including:

  • Research data sharing: how and where to share economics social science research data, enforce data management plans, promote better data management and data use
  • Open and collaborative research: how to create incentives for economists and social scientists to share their research data and methods openly with the academic community
  • Transparent economics: how to achieve greater involvement of the public in the research agenda of economics and social science

The knowledge sharing in economics session will invite a discussion between Joshua Gans, Jeffrey S. Skoll Chair of Technical Innovation and Entrepreneurship at the Rotman School of Management at the University of Toronto and Co-Director of the Research Program on the Economics of Knowledge Contribution and Distribution, John Rust, Professor of Economics at Georgetown University and co-founder of EconJobMarket.org, Gert Wagner, Professor of Economics at the Berlin University of Technology (TUB) and Chairman of the German Census Commission and German Council for Social and Economic Data as well as Daniel Feenberg, Research Associate in the Public Economics program and Director of Information Technology at the National Bureau of Economic Research.

The session on research data sharing will be chaired by Thomas Bourke, Economics Librarian at the European University Institute, and will discuss the efficient sharing of data and how to create and enforce reward structures for researchers who produce and share high quality data, gathering experts from the field including Mercè Crosas, Director of Data Science at the Institute for Quantitative Social Science (IQSS) at Harvard University, Amy Pienta, Acquisitions Director at the Inter-university Consortium for Political and Social Research (ICPSR), Joan Starr, Chair of the Metadata Working Group of DataCite as well as Brian Hole, the founder of the open access academic publisher Ubiquity Press.

Benjamin Mako Hill, researcher and PhD candidate at MIT and the Berkman Center for Internet and Society at Harvard University, will chair the session on the evolving evidence base of social science, which will highlight examples of how economists can broaden their perspective on collecting and using data through different means – mobile data collection, the web, or crowd-sourcing – and also consider how to engage the broader community and do more transparent economic research and decision-making. Speakers include Amparo Ballivian, Lead Economist working with the Development Data Group of the World Bank, Michael P. McDonald, Associate Professor at George Mason University and co-principal investigator on the Public Mapping Project, and Pablo de Pedraza, Professor at the University of Salamanca and Chair of Webdatanet.

The morning session on June 12 will gather different stakeholders to discuss how to share responsibility and how to pursue joint action. It will be chaired by Mireille van Eechoud, Professor of Information Law at IViR and will include short statements by Daniel Goroff, Vice President and Program Director at the Alfred P. Sloan Foundation, Nikos Askitas, Head of Data and Technology at the Institute for the Study of Labor (IZA), Carson Christiano, Head of CEGA’s partnership development efforts and coordinating the Berkeley Initiative for Transparency in the Social Sciences (BITSS) and Jean Roth, the Data Specialist at the National Bureau of Economic Research.

At the end of the workshop the Working Group will discuss the future plans of the project and gather feedback on possible initiatives for translating discussions into concrete action plans. Slides and audio will be available on the website after the workshop. If you have any questions please contact economics [at] okfn.org.

Open Access Economics: To share or not to share?

- May 22, 2013 in Featured, Open Access, Open Data, Open Economics, Open Research

Last Friday, Barry Eichengreen, professor of Economics and Political Science at Berkeley, wrote about “Open Access Economics” at the prestigious commentary, analysis and opinion page Project Syndicate, where influential professionals, politicians, economists, business leaders and Nobel laureates share opinions about current economic and political issues.

He reaffirmed that indeed the results of the Reinhart and Rogoff study were used by some politicians to justify austerity measures taken by governments around the world with stifling public debt.

Professor Eichengreen also criticised the National Bureau of Economic Research (NBER) for failing to require data and code for the “flawed study” of the Harvard economists, which appeared first in the distinguished working paper series of NBER.

In line with the discussion we started at the LSE Social Impact Blog and the New Scientist, Barry Eichengreen brought home the message that the enforcement of a data availability policy would indeed have made a difference in this case.

At the same time, some express doubts about the need to share data and think about excuses to avoid sharing the data related to their publication. Economists at the anonymous web forum Econjobrumors.com have been joking about the best ways to avoid sharing data.

Here are some of the “creative” suggestions on how the anonymous author could get around sending their data:

“Refer him to your press secretary”
“Tell him you had a computer virus that wiped out the dataset”
“Not obliged to let anyone free ride. Can you explain it like that?”
“Tell him its proprietary data and you can’t share it without having to kill him.”
“Tell him, ‘I’ll show you mine if you show me yours.’”
“…say you signed NDA.”
“Huddle in the corner of your office wrapped in a blanket and some hot coco from the machine down the hall and wait for the inevitable.”
“Don’t reply.”

Anonymous author: “No, did not make up the results. But let’s just say you really do not want to play with the data in any way. No good for significance.”
Anonymous comment: “Added a couple of extra stars for good luck?”.

While many of the discussions on the anonymous forum employ humour and jokes, this discussion reflects a mainstream attitude towards data sharing. It also shows how uncertain some authors are about the robustness of their results – even if they did not make any Reinhart-and-Rogoff-style Excel mistakes, they hesitate to share lest closer scrutiny expose weaker methodology. Maybe more disclosure – where data can be shared – could improve the way research is done.

Securing the Knowledge Foundations of Innovation

- May 15, 2013 in Advisory Panel, Featured, Open Access, Open Data, Open Research

Last month, Paul David, professor of Economics at Stanford University, Senior Fellow of the Stanford Institute for Economic Policy Research (SIEPR) and a member of the Advisory Panel delivered a keynote presentation at the International Seminar of the PROPICE in Paris.

Professor David expresses concern that the increased use of intellectual property rights (IPR) protections “has posed problems for open collaborative scientific research” and that the IPR regime has been used by businesses, e.g. to “raise commercial rivals’ costs”, where empirical evidence has shown that business innovation “is being inhibited by patent thickets”.

In describing the anti-commons issue, Professor David also pointed out that research databases are likely sites for problems and emphasised the importance of protecting future open access to critical data.

High quality data is also very costly to produce, and “…strengthening researchers’ incentives to create transparent, fully documented and dynamically annotated datasets to be used by others remains an insufficiently addressed problem”.

Read the whole presentation below: