The weight of words (and data)

This is the introduction I wrote for a new book from the excellent data team at Paris Match, published in French. Check out the project here.

What’s in a word? Quite a lot, as it turns out. The fortunes of a political campaign can turn with a single word. The right ones can propel a candidate to victory. The wrong ones can mean his or her demise.

But words have traditionally been a tricky thing to explore with data journalism. At first glance, the two things are as far apart as two things can possibly be. It’s the difference between qualitative and quantitative — and traditionally textual analysis in data journalism has been restricted to word clouds and other visual displays often based on counts of how many times the words were used.

But now, there’s something else that can help lend understanding to how words are used: search data.

And this is what was behind Le Poids des mots. During the election, when Paris Match wanted to examine the candidates’ words through their speeches, they decided to combine the frequency of word use with how those words resonated across France.

The back end of the process is almost as interesting as its visual representation. Data Match, the Paris Match data journalism team, created an internal database, which they have been updating since the election with speeches by President Emmanuel Macron. Overall, they analyzed around 450 speeches consisting of about two million words.

The idea was to give readers more context on how language was used during the election campaign, and after. The visual became an examination of political language through the lens of Google Trends data.

To make something like this takes collaboration. On one side was the Paris Match data journalism team led by Anne-Sophie Lechevallier and Adrien Gaboulaud. On the other was the Google News Lab team in Paris, with data work by Trends curator Camilo Moreno. They also worked with academics from Sciences Po, CNRS and multiple French universities.

The Paris Match data team were inspired by a format used in The New York Times during the 2012 conventions to show which word was most used by which party during political speeches, and they wanted to add a Trends layer to it. Each time a user typed in a word, the system would show how often it had been used in the conservative and socialist candidates’ speeches, and how it trended on Google.

The interactive application allowed the user to filter the results in different ways, to find not only how many times each candidate had said the word in speeches, on social networks and in radio and television appearances, but also how search interest in that same word evolved in France, thanks to search data.
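For the technically curious, the counting half of that lookup can be sketched in a few lines of Python. This is only an illustration, not the Data Match team's actual code: the sample speeches, the speaker labels and the function names (`tokenize`, `word_counts_by_speaker`, `lookup`) are all made up for the example, and the search-interest side is left out entirely.

```python
import re
from collections import Counter

# Hypothetical sample data: (speaker, speech text) pairs.
# These are invented for illustration, not taken from the real database.
SPEECHES = [
    ("Candidate A", "Work, work and freedom. Freedom for Europe."),
    ("Candidate B", "Europe needs work. Security first."),
    ("Candidate A", "Security and freedom."),
]


def tokenize(text):
    """Lowercase the text and split it into runs of letters.

    The pattern matches Unicode letters only, so accented French
    words stay whole while digits and punctuation are dropped.
    """
    return re.findall(r"[^\W\d_]+", text.lower())


def word_counts_by_speaker(speeches):
    """Build one word counter per speaker across all of their speeches."""
    counts = {}
    for speaker, text in speeches:
        counts.setdefault(speaker, Counter()).update(tokenize(text))
    return counts


def lookup(word, counts):
    """Return how many times each speaker used the word (case-insensitive)."""
    word = word.lower()
    return {speaker: counter[word] for speaker, counter in counts.items()}


counts = word_counts_by_speaker(SPEECHES)
print(lookup("freedom", counts))  # {'Candidate A': 3, 'Candidate B': 0}
```

A real version would also need to handle things like elisions, plurals and multi-word expressions, and would pair each count with a search-interest series for the same word.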

If there is one data source that shows how data journalism has changed in recent years, it’s the availability of search data. There are billions of searches every day, and the dataset takes you beyond the echo chamber of social media, into a field where you can see the honest, unvarnished truth about what people *really* care about. Your search window is not a place where you can adopt a position or express your opinion, so much as it’s where you reveal your truest self.

Now that we have access to that data in new ways, we can tell new stories too.

In an era in which the notion of ‘truth’ itself is under attack every day, data journalism has never been so important. In a fast-moving, constantly changing news event, such as an election, it’s common for the facts to be ignored, but data journalism can help change that. Most importantly, it provides context, background and detail, which help us understand that story better.

We have seen this time and again, in elections across the globe. In the past, the focus on data was always on answering a simple question: who has won? And that’s fine, a perfectly respectable mission.

But increasingly, the lens of data journalism is turning to other questions around issues, opinions and the way technology is changing our lives. The answers we find are often more confusing than the questions, and lead to more queries than we ever imagined possible. It’s not an easy task, making sense of this new world, but it’s one that great data journalism is ideally suited for.

And that, for me, is one of the most exciting facets of this project: how it’s applying classic data journalism techniques to new types of data. But above all, it recalls the most important part of data journalism: to tell a great story in the best way possible.

Simon Rogers is Data Editor at the Google News Lab and Director of the Data Journalism Awards.

About Simon Rogers

Data journalist, writer, speaker. Author of 'Facts are Sacred', from Faber & Faber and a range of infographics for children books from Candlewick. Edited and launched the Guardian Datablog. Now works for Google in California as Data Editor and is Director of the Sigma awards for data journalism.


One thought on “The weight of words (and data)”

  1. Awesome….This is a beautifully motivating piece…
    And a amazing idea for reporting too.!

    “How words were used during election campaign”



    Posted by Varsha | June 22, 2018, 2:27 am

Free to share

Creative commons

Please share me around. Everything here is free to use under a Creative Commons Attribution-NonCommercial 3.0 Unported License