We can all agree that data is information about the “real world,” collected and recorded. Writing on Poynter.org, Troy Thibodeaux, the former editor of interactive newsroom technology at the Associated Press, said this about the discipline of data journalism:
Real data journalism comes down to a couple of predilections: a tendency to look for what is categorizable, quantifiable and comparable in any news topic and a conviction that technology, properly applied to these aspects, can tell us something about the story that is both worth knowing and unknowable in any other way.
This turns out to be a great working definition for data, because it defines data in terms of what we want to do with it rather than what it is. We want to quantify data: How many Americans are unemployed? We also want to categorize data: How many of the unemployed are women? How many are men? Finally, we want to make comparisons: How does the number of women unemployed now compare with the number of women unemployed a year ago?
Looking at Visualizations
Let’s look at some data visualizations and ask critical questions of them to see if they are successful. Examine Bloomberg’s visualization of Trump’s budget proposal:
The key to any visualization is that it should make comparisons. Always ask yourself, “Compared to what?” The Bloomberg budget visualization doesn’t simply present the quantities (so many million for this department and so many million for that department); it makes smart comparisons. What are those comparisons? The graphs compare the budget from one department to the next, but more importantly, they compare each department’s proposed budget with its current budget. As readers and citizens, we want to know which departments are getting more funding and which are getting less (shown here as green vs. red). Hence, the change is the most important metric to show. Critically, it’s not just the change in dollars but the change in percentage. The EPA is being defunded by 2.6 billion dollars, but the impact is clearer when we understand that the cut is 31.4% of its total budget. (Percent change is measured along the horizontal axis.) Which department’s budget is seeing the greatest percentage increase? Which department’s budget is increasing the most in terms of dollars? Does the visualization make answering these questions an easy task?
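The percent-change arithmetic behind the EPA example can be sketched in a few lines of Python. This is only an illustration: the function name `percent_change` is made up for this sketch, and the current budget is not quoted in the text, so it is inferred here from the two figures that are (a 2.6 billion dollar cut equal to 31.4% of the budget).

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (new - old) / old * 100.0


# Figures quoted in the text: a cut of 2.6 billion dollars, equal to 31.4%.
cut_billions = 2.6
cut_percent = 31.4

# Assumption: the current budget is inferred from those two numbers;
# it is not taken from Bloomberg's data.
implied_current = cut_billions / (cut_percent / 100.0)
implied_proposed = implied_current - cut_billions

print(f"Implied current budget: {implied_current:.2f} billion")
print(f"Percent change: {percent_change(implied_current, implied_proposed):.1f}%")
```

The same dollar cut applied to a much larger department would be a far smaller percentage, which is why the percentage is the more telling number to plot.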
The next visualization is from the Guardian, which presents the 2015 data for the Congressional representatives across the nation.
Successful visualizations don’t just let you compare; they also let you personalize the data. How does the data relate to me? We call this “navel gazing” because we assume readers are most interested in their own situation (that is, they like to look at their own belly buttons). The Guardian’s interactive lets you choose categories (gender, race, education, etc.) to reveal which members of Congress are “most like you.” Does this visualization encourage personal exploration of the data?
Finally, explore the 3D interactive visualization of arms import and export, created for Google.
The interactive is an amazing display, allowing you to spin the globe and choose any country to show the amount of small arms it imports from and exports to other countries. But step back from the bells and whistles and try to identify what the reader would most likely want to do with this information. We would want to make comparisons. Here’s a question: I want you to compare the exports of China and the United States. Is it easy? Click on the U.S. and you see that the Export bar is halfway up at 0.61 billion. Now click on China and you see that its Export bar is all the way to the top, but at only 58.1 million. Does this make it easy to compare? Visually, the orange bars aren’t consistent because the scale varies from country to country (why does the larger number get the shorter bar?). We must instead rely on the actual numbers. But if the graphics in a data visualization don’t help the reader, what good are they?
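To see what a consistent scale would look like, here is a minimal Python sketch that puts the two export figures quoted above on one shared axis. The helper name `bar_heights` is hypothetical, and this is not how the Google interactive is built; it only shows the comparison a reader would expect the bars to make.

```python
# Export figures quoted in the text, converted to one unit (millions).
exports_millions = {
    "United States": 0.61 * 1000,  # 0.61 billion
    "China": 58.1,                 # 58.1 million
}


def bar_heights(values: dict[str, float]) -> dict[str, float]:
    """Scale each value against the largest so every bar shares one axis."""
    largest = max(values.values())
    return {name: value / largest for name, value in values.items()}


heights = bar_heights(exports_millions)
for name, height in heights.items():
    print(f"{name}: {height:.2f} of full bar height")
```

On a shared scale the U.S. bar fills the axis and China’s bar reaches roughly a tenth of that height, the opposite of the impression the interactive’s bars give.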
Your distance learning class assignment is the same as your homework assignment, which is due next class: Pick one data visualization and critique it on your blog in a 300-word post. Is the graphical presentation effective at communicating the information? What is the takeaway message? Does it encourage exploration? Is it misleading? How are colors used? Does the form afford accurate comparisons?