You are browsing the archive for Guo Xu.

The Benefits of Open Data (part II) – Impact on Economic Research

- October 21, 2012 in Open Economics

A couple of weeks ago, I wrote the first part of a three-part series on Open Data in Economics. Drawing on examples from top research on how providing information and data can help increase the quality of public service provision, that article explored economic research on open data. In this second part, I would like to explore the impact of openness on economic research.

We live in a data-driven age

There used to be a time when data was costly: There was not much data around. Comparable GDP data, for example, has only been collected since around the mid-20th century. Computing power was scarce and expensive: Data and commands were stored on punch cards, and researchers had only limited hours to run their statistical analyses on the few computers available.

Today, however, statistics and econometric analysis have arrived in every office: Open Data initiatives at the World Bank and governments have made it possible to download cross-country GDP and related data with a few mouse-clicks. The availability of open source statistical packages such as R allows virtually everyone to run quantitative analyses on their own laptops and computers. Consequently, the number of empirical papers has increased substantially. The left figure (taken from Espinosa et al. 2012) plots the number of econometric (statistical) outputs per article in a given year: Quantitative research has really taken off since the 1960s. Where researchers once used datasets with a few dozen observations, modern applied econometricians now often draw upon datasets boasting millions of detailed micro-level observations.

Why we need open data and access

The main economic argument in favour of open data is gains from trade. These gains come in several dimensions: First, open data helps avoid redundancy. As a researcher, you may know that the same basic procedures (such as cleaning and merging datasets) have often been done thousands of times, by hundreds of different researchers. You may also have experienced the time wasted compiling a dataset that someone else had already put together but was unwilling to share: Open data in these cases can save a lot of time, allowing you to build upon the work of others. By feeding your additions back into the ecosystem, you in turn ensure that others can build on your data work. Just as there is no need to re-invent the wheel several times, the sharing of data allows researchers to build on existing data work and devote valuable time to genuinely new research.

Second, open data ensures the most efficient allocation of scarce resources – in this case datasets. Again, as a researcher, you may know that academics often treat their datasets as private gold mines. Indeed, entire research careers are often built on possessing a unique dataset. This hoarding often results in valuable data lying around on a forgotten hard disk, not fully used and ultimately wasted. What’s worse, the researcher – even though owning a unique dataset – may not be the most skilled to make full use of the dataset, while someone else may possess the necessary skills but not the data. Only recently, I had the opportunity to talk to a group of renowned economists who – over the past decades – have compiled an incredibly rich dataset. During the conversation, it was mentioned that they themselves may have only exploited 10% of the data – and were urgently looking for fresh PhDs and talented researchers to unlock the full potential of their data. But when data is open, there is no need to search, and data can be allocated to the most skilled researcher.

Finally, and perhaps most importantly, open data – by increasing transparency – also fosters scientific rigour: When datasets and statistical procedures are made available to everyone, a curious undergraduate student may be able to replicate and possibly refute the results of a senior researcher. Indeed, journals are increasingly asking researchers to publish their datasets along with the paper. But while this is a great step forward, most journals still keep the actual publication closed, asking for horrendous subscription fees. For example, readers of my first post may have noticed that many of the research articles linked could not be downloaded without a subscription or university affiliation. Since dissemination, replication and falsification are key features of science, both open data and open access are essential to knowledge generation.

But there are of course challenges ahead: For example, while wider access to data and statistical tools is a good thing, the ease of running regressions with a few mouse-clicks also results in a lot of mindless data mining and nonsensical econometric outputs. Quality control, hence, is and remains important. There are, and in some cases there should be, some barriers to data sharing. In some cases, researchers have invested a substantial part of their lives in constructing their datasets, so it is understandable that some are uncomfortable sharing their “baby” with just anyone. In addition, releasing (even anonymized) micro-level data often raises privacy concerns. These issues – and existing solutions – will be discussed in the next post.

The Benefits of Open Data – Evidence from Economic Research

- October 3, 2012 in Open Data, Open Economics, Public Finance and Government Data

This contribution is by Guo Xu (OKFN Economics and LSE) and is the first part of the blog series “Mainstreaming Open Economics”.

Looking back at the Open Knowledge Festival 2012 in September, there’s an impression that openness is everywhere: There are working groups on Open Science and Open Linguistics, topic streams on Gender and Diversity in Openness, and events like Open Prom and Open Sauna: Open Knowledge and Open Data, it seems, are omnipresent.

Looking beyond the Open Knowledge community, however, the situation is very different: In Economics, for example, not many know what “open data”, “open access” or “Open Economics” exactly mean. Indeed, not many even care. A common reaction is: “Yes, it sounds interesting and important, but does it really matter? And why should I care about it?”

In this post, I would like to give some hard evidence on the positive role that opening up information has had in economics, and sketch ideas for how to involve economists – professional or in training – to mainstream ideas of openness. The blog series is divided into three parts: The first part looks at economic research on open data. The second part looks at the impact of open data on economic research. The third part discusses challenges and ways forward.

The real world impacts of open information

Making information accessible to the public can improve public service delivery. In countries where corruption is pervasive, services and funds often do not reach the frontline provider. And even if services do reach the people, the quality of services provided is often shockingly poor: Survey evidence from Bangladesh, Ecuador, India, Peru and Uganda found absence rates as high as 20% and 35% for school teachers and health workers, respectively. In many cases, the staff are poorly trained.

Releasing data on service delivery in this case can help reduce corruption and improve public services. In Uganda, researchers provided information to parents by publishing funding data for a random subset of schools in local newspapers. As a consequence, corruption decreased significantly, while schooling outcomes improved substantially. Similar evidence in health delivery and redistributive policies suggests that providing information can help the public to discipline public service providers, improving the quality of services.

Information can also expose corrupt politicians: The Federal Government of Brazil, for example, began to select and audit municipalities at random, releasing audit reports to the media. Researchers found that the audit outcomes had a significant impact on the reelection probability of politicians: Those exposed for corruption were punished at the ballots, and the impact was most pronounced in areas where the dissemination of information was favoured by local radio.

A story from fishermen in South India provides another example of how information can improve market efficiency: Studying the adoption of mobile phones in Kerala, researchers have found convincing evidence that access to information through mobile phones helped fishermen sell their catch at the market where the price was highest (and fish most demanded): Instead of sailing to a port and simply hoping for a good price, fishermen were empowered by technology to make informed decisions on how to trade.

Finally, the benefits of transparency are not only restricted to reducing corruption and lowering the cost of information: A comparative study finds that transparency – measured by accuracy and frequency of macroeconomic information released to the public – leads to lower borrowing costs in sovereign bond markets. Open data pays off in many ways – in many different contexts.

These are just a few selected examples of how cutting-edge economic research has identified the benefits of openness in a diverse range of situations. The cases I presented are not based on correlations, but on carefully established causal relationships, leaving – at least within the context studied – little doubt that information matters – big time. Perhaps most importantly, these cases have also shown that open data must be understood in a broad sense: These interventions do not take advantage of linked data, do not use CSVs that are shared through Facebook or Twitter – often, these interventions are simple solutions that ultimately help improve the everyday lives of the people.

Analyzing the Yourtopia Dataset

- February 7, 2012 in Crowd-sourcing, Yourtopia

The following post is from Dirk Heine and Guo Xu, members of the Yourtopia project team. 

Last year, the Open Economics Working Group submitted Yourtopia, a crowd-sourced indicator of social progress, to the World Bank Apps4Development competition and was awarded the third prize. Yourtopia allows users to assign weights to different dimensions of development (e.g. economy, health and education). Based on the weights submitted by all users, we constructed a robust aggregate weighting, reflecting a global “consensus weighting”, which can be used as a consensus measure of development. One year later and after more than 4,000 submitted weightings, where do we stand? And perhaps most importantly, how does our “consensus weight” compare to conventional indices, such as the Human Development Index (HDI)?

The results are quite remarkable: Compared to the default weights of the HDI, where economy, health and education receive equal weights (33% each), our consensus weight assigns 30% to economy, 34% to health and 36% to education. Two things are worth pointing out:

1) The HDI weights, even though ad-hoc and arbitrary, are nearly identical to the weights obtained by crowd-sourcing. Taking measurement errors into account, we cannot reject the hypothesis that our consensus weights are equal to the HDI weights. Despite the criticism, the HDI appears to be quite robust.

2) Looking at the point estimates only, the consensus weights also suggest that education is the most important dimension of development, followed by health. This is not surprising as human capital plays a crucial role in fostering economic growth. The economy is merely a means towards expanding capabilities.
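To give a feel for how small the gap between the two weightings is, here is a minimal sketch that scores two made-up countries under both schemes. The weights are the ones reported above; the country names, the dimension scores and the simple weighted arithmetic mean are assumptions for illustration only, and the official HDI uses its own aggregation formula.

```python
# Weights over the three dimensions. The consensus figures are the ones
# reported in this post; the HDI default is an equal split.
HDI_WEIGHTS = {"economy": 1 / 3, "health": 1 / 3, "education": 1 / 3}
CONSENSUS_WEIGHTS = {"economy": 0.30, "health": 0.34, "education": 0.36}

# Hypothetical dimension scores on a 0-1 scale (illustration only,
# not real country data).
scores = {
    "Country A": {"economy": 0.90, "health": 0.80, "education": 0.70},
    "Country B": {"economy": 0.60, "health": 0.85, "education": 0.90},
}


def composite_index(dimension_scores, weights):
    """Weighted arithmetic mean of the dimension scores (an assumed aggregation)."""
    return sum(weights[dim] * dimension_scores[dim] for dim in weights)


def compare(all_scores, weights_a, weights_b):
    """Return {country: (index under weights_a, index under weights_b)}."""
    return {
        country: (
            composite_index(dims, weights_a),
            composite_index(dims, weights_b),
        )
        for country, dims in all_scores.items()
    }


if __name__ == "__main__":
    for country, (hdi, consensus) in compare(
        scores, HDI_WEIGHTS, CONSENSUS_WEIGHTS
    ).items():
        print(f"{country}: equal weights {hdi:.3f}, consensus weights {consensus:.3f}")
```

With weights this close to an equal split, the two indices differ only in the second or third decimal place for these made-up scores, which is the sense in which the HDI weighting looks robust.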

Finally, we were also able to explore cross-country variation: By matching the IP addresses of the users to their country of residence, we were able to aggregate individual weights into country-level means. Correlating the country-level averages with other socio-economic variables enables us to address interesting questions: For example, are weightings associated with country-level variables? Are people from richer countries more likely to assign higher weights to economy, or vice versa?

The figure below plots a country’s GDP per capita level against the average country-level consensus weight for “economy”. A high value for the weight indicates that more importance is given to the economy as an indicator of development. The plot suggests a significant negative relationship: People in rich countries tend to assign less importance to the “economy” dimension, while people in poor countries perceive the economy to be more important. If this is indeed the case, we have another reason to re-consider GDP per capita as a measure of social progress.
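The aggregation and correlation steps described above can be sketched roughly as follows. This is not the code we used: the country codes, the submitted weights and the GDP figures are invented, and taking the logarithm of GDP per capita before correlating is an assumption made for the example.

```python
from collections import defaultdict
from math import log, sqrt

# Hypothetical submissions: (country code, weight given to "economy").
submissions = [
    ("AA", 0.20), ("AA", 0.30),
    ("BB", 0.35), ("BB", 0.45),
    ("CC", 0.50), ("CC", 0.40),
]

# Hypothetical GDP per capita in US dollars (illustration only).
gdp_per_capita = {"AA": 40000.0, "BB": 8000.0, "CC": 1500.0}


def country_means(rows):
    """Average the submitted weights within each country."""
    totals = defaultdict(lambda: [0.0, 0])  # country -> [sum, count]
    for country, weight in rows:
        totals[country][0] += weight
        totals[country][1] += 1
    return {country: s / n for country, (s, n) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


means = country_means(submissions)
countries = sorted(means)
r = pearson(
    [log(gdp_per_capita[c]) for c in countries],  # log GDP: an assumption
    [means[c] for c in countries],
)

if __name__ == "__main__":
    print(f"correlation between log GDP per capita and economy weight: {r:.2f}")
```

In this toy data the richer countries give the economy a lower average weight, so the coefficient comes out negative, mirroring the direction of the relationship in the plot. With only three made-up countries it says nothing about significance.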

Of course, the results should be taken with a grain of salt: The submitted weights are obviously subject to selection bias, which can be substantial in developing countries as access to internet is relatively limited. In addition, measurement errors are likely to confound the results as users were allowed to submit several times. While the large sample size can help alleviate some of these concerns, the results should be seen as tentative.

Are you interested in our project?

Help us analyze the Yourtopia dataset! We have released the dataset and are looking forward to more sophisticated analyses!
We are also currently working on Yourtopia 2. If you would like to join the project or come along for a hackday, please contact us at economics [at] okfn.org.

 

Open Economics Hack Day Saturday January 28th 2012

- January 19, 2012 in Events, Hackathon

**This post is by [Velichka Dimitrova](https://okfn.org/members/vndimitrova/), Coordinator for the [Economics Working Group](http://openeconomics.net/) at the Open Knowledge Foundation.**

On Saturday 28th January we’re getting together for an Open Economics Hackday where we’ll be wrangling data and building apps related to economics. All are welcome!

* When: Saturday 28th January, 11am GMT (12pm CET/6am EST) to ~7pm GMT (8pm CET/3pm EST)
* Sign up on the MeetUp page.
* Some people will also be around on Friday 27th (same times)
* Where: Online (IRC, Skype) and also in person in London – meet us at the public space coffee area in the main hall on floor G of the Barbican.
* Who: Anyone! Coder, data wrangler, economist, illustrator or writer …
* And here is the Etherpad.

As with all hackdays, exactly what gets worked on gets decided on the day (you can add suggestions to the etherpad). However, one particular idea, which could become a submission to Apps4Italy, is set out below.

### One Idea for What We’ll Work On: ProgressVote

One of the most fundamental questions in economic research is: how do we measure social progress? Policy makers have come up with alternative measures accounting for environmental impacts, inequality, happiness and other indicators of human development.

However, the multiplicity of factors has caused another problem – how do we decide on the importance of each individual factor in a composite index? They could be either equally important (such as in the HDI) or they could be given different weights.

In our last project [YourTopia][yourtopia] – which was one of the winners of last year’s World Bank [Apps4Development Prize][apps-prize] – we offered one possible solution by letting *you* decide on which dimensions and aspects of economic development to prioritize.

However, there are limitations to such an approach: faced with a myriad of technical indicators, people are often overwhelmed by the complexity: Does life expectancy at birth matter more than the inflation rate or the M2 money supply? And what does M2 money supply even mean?

[yourtopia]: http://yourtopia.net/
[apps-prize]: http://appsfordevelopment.challengepost.com/

In [ProgressVote][progressvote], we’d like to improve on YourTopia in a variety of ways:

First, by combining proxy voting with the crowd-based Yourtopia approach: Instead of voting for indicators, people vote for expert statements that interpret the dashboard of variables. By doing so, it is hoped to strike a balance between expert judgements and the interpretation of the general public: Experts may be more able to interpret technical data, but in the end it is the citizens who decide which expert statement to endorse.

Second, we’d like to add support for time series, so you can see how progress (or lack of it) has evolved over time, as well as better geo support, so that it is possible to look at how regions as well as countries have performed (consider Italy, for instance).

[progressvote]: http://wiki.okfn.org/ProgressVote
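As a very rough illustration of the proxy-voting idea, the sketch below tallies which expert statement each voter endorsed and reports the share of endorsements per statement. How ProgressVote would actually turn votes into an index has not been decided, so the statement IDs, the ballot format and the simple share-of-votes tally are all assumptions.

```python
from collections import Counter

# Hypothetical ballots: each entry is the ID of the expert statement
# that one voter endorsed.
ballots = [
    "statement_a",
    "statement_b",
    "statement_a",
    "statement_c",
    "statement_a",
]


def endorsement_shares(votes):
    """Return the fraction of all votes received by each statement."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return {}  # no votes cast yet
    return {statement: n / total for statement, n in counts.items()}


shares = endorsement_shares(ballots)
# The statement with the largest share of endorsements.
top_statement = max(shares, key=shares.get)

if __name__ == "__main__":
    for statement, share in sorted(shares.items()):
        print(f"{statement}: {share:.0%}")
    print("most endorsed:", top_statement)
```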

Interested? Then come join us on Saturday 28th January!

Reminder: Open Knowledge Indicator Hackday 23rd of August 2011

- August 22, 2011 in Data Party, Open Knowledge Index, Projects

Just a quick reminder that the Hackday for the Open Knowledge Indicator will be on the 23rd of August, from 10 AM to 11 PM (UTC+1).
I’m sorry if some cannot make it at this date but I hope you will be able to join at a later stage – this certainly doesn’t mean you’re excluded!

We’ll be setting up an Etherpad – to join, please add my Skype address guoxu_voip so I can put you on the group chat.

Open Economics at OKCon 2011

- June 29, 2011 in Conference, Events, Festival

A few members of the Open Economics Working Group will be attending the Open Knowledge Conference 2011 in Berlin, 30th June to 1st July. There will also be a presentation of Metametrik.

We have also been able to cooperate with a related project in the pre-OKCon Open Science workshop, working on a crowd-sourced data inputation system which could serve as a component of the Metametrik framework.

If you plan to attend the OKCon, please do contact us and have a chat! Looking forward to seeing you there.

Working Group Meeting 23rd June 2011

- June 16, 2011 in Projects

We are hosting a Skype meetup to take stock and discuss future projects on the:

23rd of June 2011, 7pm GMT+1 (British Summer Time)

We would like to invite you to join in – the Skype meeting will also be a great opportunity to learn about the Open Economics Working Group and explore the various ways in which you can contribute. To participate in the meeting, please drop a mail to [email protected] along with your Skype-id, so we can add you to the session. In the meantime, feel free to join our mailing list to stay updated.

Third Place in World Bank Contest

- April 27, 2011 in Announcements, Yourtopia

A few months back, we launched a simple app that allows anyone to say what kind of world, what “YourTopia”, they would like to live in. Created with the help of the new OKF Working Group on Economics, the app was submitted to the World Bank Apps4Development competition: Two days ago, the World Bank President Zoellick finally announced the winners of the competition and we are delighted to say that Yourtopia has been awarded the 3rd prize, chosen from among over 100 other submissions.

(Photo: © Frank Vincent / World Bank)

As an OKF project, the award ceremony also gave us the opportunity to promote open data initiatives. Dirk Heine, who represented our team at the ceremony in the World Bank HQ in DC, was also able to present Yourtopia to a wider audience of stakeholders (including Robert Zoellick, Justin Lin and other IFI officials). Overall, there was great interest in Yourtopia: The idea of an open indicator for human development appealed to many people, ranging from reporters to researchers and policymakers.

Encouraged by the positive feedback, we are planning to build on the momentum and move forward with Yourtopia. We are also volunteering the prize money for future projects. Again, we would like to encourage anyone interested to join or suggest new ideas. If you are interested, please sign up for the OKF Open Economics mailing list or just send a mail to guo.xu[at] okfn [dot] org.