Metametrik first sprint

Velichka Dimitrova - May 31, 2013 in Featured, Metametrik

This blog post is written by Martin Keegan and Velichka Dimitrova.

On Saturday, May 25, a team of economists and programming experts gathered to plan a format for saving regression results in economics. The project would allow a database of empirical results to be built and queried, making it possible to answer questions such as: do authors tend to get more significant results when using World Bank data instead of Penn World Table data, and how have conclusions about the relationships between variables evolved over time?

Based on some ideas outlined by the Working Group last year, we worked on a post-publication version of a small system in which an informed researcher would enter regression results, these would be saved in a database, and a researcher who wants to analyse the empirical literature could then query them.

We took an approach which turned out to be fairly similar to Guo’s recommendation: create a JSON schema capturing the basics of a regression result (the dependent variable, the goodness of fit, the sample size, standard errors and effect sizes of results), and then make tools which produce and consume data in this format.
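To make the idea concrete, here is a minimal sketch of what such a record might look like when produced and consumed from Python. It is not the schema written at the sprint: the class names, the field names (`dependent_variable`, `r_squared`, `sample_size`, `coefficients`) and the numbers are all assumptions chosen to mirror the items listed above.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Coefficient:
    # Hypothetical field names, not the actual Metametrik schema.
    variable: str      # name of the independent variable
    estimate: float    # effect size
    std_error: float   # standard error of the estimate


@dataclass
class RegressionResult:
    dependent_variable: str
    r_squared: float   # one possible goodness-of-fit measure
    sample_size: int
    coefficients: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() also converts the nested Coefficient objects to dicts.
        return json.dumps(asdict(self), indent=2)

    @staticmethod
    def from_json(text: str) -> "RegressionResult":
        raw = json.loads(text)
        coefs = [Coefficient(**c) for c in raw.pop("coefficients")]
        return RegressionResult(coefficients=coefs, **raw)


# Illustrative, made-up numbers only.
example = RegressionResult(
    dependent_variable="gdp_growth",
    r_squared=0.42,
    sample_size=120,
    coefficients=[Coefficient("democracy_index", 0.15, 0.05)],
)

encoded = example.to_json()                    # what a "producer" tool would emit
decoded = RegressionResult.from_json(encoded)  # what a "consumer" tool would read
```

Keeping the record as plain JSON means that any statistical package able to write JSON could act as a producer, without depending on this particular Python code.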

So far, we have a tool which generates Metametrik format data, and a tool which reads this into a database. What’s needed next is web UIs that produce this data (for articles already published) and allow you to search it.
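The "read into a database" step could look roughly like the sketch below. It is not the sprint's actual tool: it assumes SQLite as the database, and the table layout, the `load_results` helper, the field names and the sample numbers are all invented for illustration.

```python
import json
import sqlite3

# Hypothetical Metametrik-style records; field names and numbers are made up.
SAMPLE = """
[
  {"dependent_variable": "gdp_growth", "r_squared": 0.42, "sample_size": 120,
   "coefficients": [
     {"variable": "democracy_index", "estimate": 0.15, "std_error": 0.05}]},
  {"dependent_variable": "gdp_growth", "r_squared": 0.31, "sample_size": 85,
   "coefficients": [
     {"variable": "democracy_index", "estimate": -0.02, "std_error": 0.04},
     {"variable": "trade_openness", "estimate": 0.30, "std_error": 0.10}]}
]
"""


def load_results(conn, records):
    """Store each regression in one table and its coefficients in another."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS regression ("
        "id INTEGER PRIMARY KEY, dependent_variable TEXT, "
        "r_squared REAL, sample_size INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS coefficient ("
        "regression_id INTEGER REFERENCES regression(id), "
        "variable TEXT, estimate REAL, std_error REAL)"
    )
    for rec in records:
        cur = conn.execute(
            "INSERT INTO regression (dependent_variable, r_squared, sample_size) "
            "VALUES (?, ?, ?)",
            (rec["dependent_variable"], rec["r_squared"], rec["sample_size"]),
        )
        reg_id = cur.lastrowid  # key of the regression row just inserted
        conn.executemany(
            "INSERT INTO coefficient VALUES (?, ?, ?, ?)",
            [(reg_id, c["variable"], c["estimate"], c["std_error"])
             for c in rec["coefficients"]],
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_results(conn, json.loads(SAMPLE))

# Example query: every stored estimate for one independent variable.
rows = conn.execute(
    "SELECT r.sample_size, c.estimate FROM coefficient AS c "
    "JOIN regression AS r ON r.id = c.regression_id "
    "WHERE c.variable = ? ORDER BY r.id",
    ("democracy_index",),
).fetchall()
```

Splitting regressions and coefficients into two tables is one design choice among several; it keeps queries over a single variable across many papers simple.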

Metametrik Sprint in London, May 25

Velichka Dimitrova - May 2, 2013 in Announcements, Call for participation, Events, Featured, Metametrik, Sprint

The Open Economics Working Group invites you to a one-day sprint to create a machine-readable format for reporting regression results.

  • When: May 25, Saturday, 10:00-16:00
  • Where: Centre for Creative Collaboration (tbc), 16 Acton Street, London, WC1X 9NG

The event is meant for graduate students in economics and quantitative social science, as well as other scientists and researchers who work with quantitative data analysis and regressions. We would also welcome developers with some knowledge of XML and other markup languages, and anyone else interested in contributing to this project.

About Metametrik

Metametrik, as a machine-readable format and platform for storing econometric results, will offer a universal form for presenting empirical results. Furthermore, the resulting database would present new opportunities for data visualisation and “meta-regressions”, i.e. statistical analysis of all empirical contributions in a certain area.

During the sprint we will create a prototype of a format for saving the regression results of empirical economics papers, which would be the basis for meta-analysis of relationships in economics. The Metametrik format would include:

  • An XML-derived format (or one derived from another markup language) to describe regression output, capturing which dependent and independent variables were used, the type of dataset (e.g. time series, panel), the sign and magnitude of the relationship (coefficient and t-statistic), data sources, the type of regression (e.g. OLS, 2SLS, structural equations), etc.
  • A database to store the results (possibly integrated with CKAN).
  • A user interface for entering results, which would be translated and saved in the Metametrik format. Results could also be imported directly from statistical packages.
  • Visualisation of results and a GUI, enabling queries against the database and displaying basic statistics about the relationships.


Since computing power and data storage have become cheaper and more easily available, the number of empirical papers in economics has increased dramatically. Despite this large number of papers, however, there is still no unified, machine-readable standard for saving regression results. Researchers are often faced with a large volume of empirical papers that describe regression results in similar yet differing ways.

Like machine-readable bibliographic formats (e.g. BibTeX), the new standard would facilitate the dissemination and organization of existing results. Ideally, this project would offer open storage where researchers can submit their regression results (for example in an XML-type format). The standard could also be implemented in a wide range of open source econometric packages and projects such as R or RePEc.

From a practical perspective, this project would greatly help to organize the large pile of existing regressions and facilitate literature reviews. If someone is interested in the relationship between democracy and economic development, for example, s/he need not go through the large pile of current papers but can simply look up the relationship in the open storage, which will then produce a list of existing results along with intuitive visualizations (what percentage of results are positive or negative, and how the results evolve over time, i.e. whether they converge). From an academic perspective, the project would also facilitate the compilation of meta-regressions, which have become increasingly popular. Metametrik will be released under an open license.
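As a rough illustration of the summary statistics mentioned above, the sketch below computes the share of positive and negative estimates overall and per publication year. The function names and the `(year, estimate)` input layout are assumptions, and the numbers are invented; they are not results from any real paper.

```python
from collections import defaultdict


def sign_shares(estimates):
    """Return the percentage of positive and negative estimates."""
    n = len(estimates)
    if n == 0:
        return {"positive": 0.0, "negative": 0.0}
    positive = sum(1 for e in estimates if e > 0)
    negative = sum(1 for e in estimates if e < 0)
    return {"positive": 100.0 * positive / n, "negative": 100.0 * negative / n}


def shares_by_year(results):
    """Group (year, estimate) pairs by year and summarise each group."""
    by_year = defaultdict(list)
    for year, estimate in results:
        by_year[year].append(estimate)
    return {year: sign_shares(values) for year, values in sorted(by_year.items())}


# Made-up estimates of the effect of democracy on economic development.
made_up = [(2001, 0.2), (2001, -0.1), (2005, 0.3), (2005, 0.1)]

overall = sign_shares([estimate for _, estimate in made_up])
per_year = shares_by_year(made_up)
```

A fuller version would also weigh statistical significance and sample size rather than counting signs alone, which is closer to what a proper meta-regression would do.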

