
We need an open data Olympics. It’s not #Sochi2014

Sochi gold medal. Source: Zastavki.com

Find out how to scrape a medal table by scrolling down this piece

Looking for data around the Olympics? Good luck. By the time the Winter Games in Sochi are over, there may be hundreds of Olympians polishing their lovely new medals. But try to find out the speed an event was won at, or who the runners-up were, and you will be reduced to cutting and pasting. The games are open; the data is closed.

It’s not so long ago that the Olympics took place in London. I wrote that:

These Olympics are a gift for Data journalism. Every day brings us a story that cries out for analysis

So, we scraped the data, made it available and asked our readers to get involved. I still feel that we shouldn’t have to do that but we did because this data belongs out in the open.

Data is a commodity in the Olympics. You might be able to see the curling or Jenny Jones win a medal in front of you, but the numbers behind that sport are a product to be bought and sold. The result is a lack of transparency and openness.

Does it matter? Isn’t this only a sporting event?

It does, because this is not the only area of life (and reporting) where basic information comes at a price.

This year there will be over 30 elections around the world. Most of them will not take place in environments where the voting data is open and available.

If we can’t even get the data out there for a sporting event watched by millions around the world, what chance do we have of making the basic stuff of democracy available?

How to scrape a medal table

So, want to download a medal table, keep it up to date and bring it into a Google spreadsheet? It’s actually easier than you might think.

The BBC publishes this medal table on its website, constantly updated. You don’t really want to copy and paste that every day, so you can try the rather useful ImportHtml function in Google spreadsheets. This allows you to bring a table from a website into your spreadsheet.

There’s a great guide here to importing tables into Google spreadsheets using the ImportHtml function. Check it out with a Bollywood example below.

[Image: import-HTML example]

The command to scrape the medal table is:

=ImportHtml("http://www.bbc.com/sport/winter-olympics/2014/medals/countries", "table",1)

And the result is this:

[Screenshot: the medal table imported into a Google spreadsheet]

Neat, eh?
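If you would rather do the same thing outside a spreadsheet, here is a minimal Python sketch of what ImportHtml does with its "table" and index arguments: parse the page and hand back the nth table as rows of text. It is an illustration only, not how Google implements the function. The names `TableExtractor` and `import_html_table` are made up for this example, the sample page is invented rather than real medal data, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def _close_cell(self):
        if self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._close_row()
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def import_html_table(html, index=1):
    """Return the index-th table (1-based, like ImportHtml) as rows of strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables[index - 1]


# Invented sample page: placeholder countries, not real medal counts.
SAMPLE = """
<table>
  <tr><th>Country</th><th>Gold</th><th>Silver</th><th>Bronze</th></tr>
  <tr><td>Country A</td><td>2</td><td>1</td><td>0</td></tr>
  <tr><td>Country B</td><td>1</td><td>0</td><td>3</td></tr>
</table>
"""

rows = import_html_table(SAMPLE, 1)
for row in rows:
    print(",".join(row))
```

To run it against a live page you would first download the HTML (for example with `urllib.request.urlopen`) and pass the decoded text to `import_html_table`. Whether a given page still serves its data as a plain HTML table is something you would need to check for yourself.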

About Simon Rogers

Data journalist, writer, speaker. Author of 'Facts are Sacred', from Faber & Faber, and a new range of infographic books for children from Candlewick. Edited and launched the Guardian Datablog. Now works for Twitter in San Francisco as Data Editor.

Discussion

4 thoughts on “We need an open data Olympics. It’s not #Sochi2014”

  1. Dear Simon, I have 2 more tips for scraping this kind of table.

    1) An addon for Chrome “Table Capture” that I loooooove https://chrome.google.com/webstore/detail/table-capture/iebpjdmgckacbodjpijphcplhebcmeop

    2) If you work with Excel, the Export Data from the Web feature works very well and usually gets us more info from the table (e.g. metadata from images) https://twitter.com/fcoel/status/433333938077970432/photo/1

    But of course the ImportHTML function from Google Spreadsheets is the easiest & free way to scrape from any computer, even one that isn't yours! (e.g. press-room PCs without Chrome or Excel installed)

    We teach these 3 to @LNdata journalists, to use depending on the circumstances.

    Regards from Buenos Aires, Argentina
    :)

    Flor

    Posted by Florencia Coelho | February 11, 2014, 8:25 pm

Free to share

Please share me around. Everything here is free to use under a Creative Commons Attribution-NonCommercial 3.0 Unported License
