How to add a Google Chart to your WordPress post

In Google Sheets, create your chart.

Click the triangle in the upper right corner of your chart and select Publish.

You must first click Publish to make your chart public. Then choose the Embed tab and copy the HTML code.

In a new WordPress post, choose the HTML tab and paste the code.

Choose the Visual tab to continue editing your post and adding more content. Your chart should appear.

Publish your post.

Storytelling with Data

 Distance Learning Module: From data to spreadsheets to visualizations

It isn’t hyperbole: Journalists today have access to more data than ever before. But exposing the stories buried in the numbers remains a challenge. From election results, budgets and census reports to Facebook updates and image uploads, journalists need to know how to find the important trends in data and shape them into compelling narratives.

Why spreadsheets are essential tools for understanding data

Here’s an example of a story emerging from data. Mc Nelly Torres, a journalist from the Florida Center for Investigative Reporting, pored over spreadsheets in front of her. They detailed Florida boating accidents from 2008 to 2011, using numbers provided by the Florida Fish and Wildlife Conservation Commission. While Torres already knew that her state ranked No. 1 in the nation for boating fatalities, she wanted to dig deeper.

Her analysis of the data uncovered an interesting pattern that led her to investigate not only boating deaths but also the multibillion-dollar recreational maritime industry, its campaign donations to members of the Florida Boating Safety Advisory Council and recent legislation that failed to address the dangerous reality behind the grim statistics. Her spreadsheets showed that the vast majority of deaths occurred among boaters 35 years and older–a group that was exempt from legislation mandating boating-safety instruction. The law that aimed to make boaters safer didn’t apply to those with the highest fatality rates. Here was a story that needed to be told, and Torres told it on NBC6 in Miami in 2013.

Torres’ discovery of this story in a spreadsheet is just one example of the utility of data-analysis skills. Given all the data available to us today, spreadsheet skills are especially critical for journalists. A spreadsheet is a simple application but a powerful tool. It can reveal a pattern that fights conventional wisdom. It can confirm a hunch. It can highlight outliers–sharp deviations from the norm–that demand further investigation. As Torres’ NBC6 report demonstrated, a complex story of death, money, and politics can emerge from a simple page of numbers.

For more about how spreadsheets are integral to reporting, head over to the Investigative Reporters and Editors site and browse their posts in Behind the Stories.

In the upcoming unit on data journalism, you’ll learn to work with data using spreadsheets. You’ll learn how you can turn rows and columns of numbers into something that reveals important trends. We’ll cover the basics of spreadsheeting, but the unit is not meant to be a comprehensive survey of every spreadsheet function. We’ll teach you only what you need to know to get up and running with your data. You’ll learn about data types and file formats. You’ll learn how to import numbers into spreadsheets and how to structure the data so you can manipulate and explore it. You’ll also learn how to look at and write about numbers responsibly. We’ll discuss normalization so that your comparisons make sense, and we’ll talk about percentage change, means and medians.

Don’t fret if you’re not good at math. Using spreadsheets doesn’t involve anything more than simple arithmetic and algebra. If you can add, subtract and use decimals, you’re in good shape.

We’ll also look at the sexier, front-facing side of data journalism: the visualizations. Charts, graphs, and maps are the finished products of your explorations and analyses of the data. Developing data visualizations isn’t something to leave to the graphics department; it’s the journalist’s responsibility, because visualizations are how you communicate your story effectively.

It’s important to remember that “data journalism” has two important sides to it–one being the exploratory side, where you use data to find patterns, trends, and stories, and the second being the explanatory side, where you use or visualize the data to communicate to your readers.

What is Data?

We can’t analyze data unless we know what we’re dealing with. But it’s tough to come up with a single definition of data because it’s such a broad term. (Perhaps it’s not unlike the definition of art: You know it when you see it.) Difficult or not, we need to try.

We all can agree that data is information about the “real world,” collected and recorded. Writing on Poynter.org, Troy Thibodeaux, the former editor of interactive newsroom technology at the Associated Press, said this about the discipline of data journalism:

Real data journalism comes down to a couple of predilections: a tendency to look for what is categorizable, quantifiable and comparable in any news topic and a conviction that technology, properly applied to these aspects, can tell us something about the story that is both worth knowing and unknowable in any other way.

This turns out to be a great working definition for data, because it defines data in terms of what we want to do with it rather than what it is. We want to quantify data: How many Americans are unemployed? We also want to categorize data: How many of the unemployed are women? How many are men? Finally, we want to make comparisons: How does the number of women unemployed now compare with the number of women unemployed a year ago?

Looking at Visualizations

Let’s look at some data visualizations and ask critical questions of them to see if they are successful. Examine Bloomberg’s visualization of Trump’s budget proposal:

The key to any visualization is that it should make COMPARISONS. Always ask yourself, “Compared to what?” The Bloomberg budget visualization doesn’t simply present the quantities (so many million for this department and so many million for that department); it makes smart comparisons. What are those comparisons? The graphs compare the budget of one department to the next, but more importantly, they compare each department’s proposed budget with its current budget. As readers and citizens, we want to know which departments are getting more funding and which are getting less (shown here as green vs. red). Hence, the change is the most important metric to show. Critically, it’s not just the change in dollars, but the change in percentage. The EPA is being defunded by 2.6 billion dollars, but the impact is clearer when we understand that the cut is 31.4% of its total budget. (Percent change is measured along the horizontal axis.) Which department’s budget is seeing the greatest percentage increase? Which department’s budget is increasing the most in terms of dollars? Does the visualization make answering these questions an easy task?
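The percent-change arithmetic can be sketched in a few lines of Python. This is only an illustration, not Bloomberg’s method: the function name percent_change is made up for this example, and because the EPA’s current budget isn’t given here, it is backed out from the two figures quoted above (a 2.6 billion dollar cut that equals 31.4%).

```python
def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    if old == 0:
        raise ValueError("percent change is undefined when the old value is 0")
    return (new - old) / old * 100


# Figures quoted in the text: a cut of 2.6 billion dollars that equals 31.4%.
cut_billions = 2.6
cut_share = 0.314

# Assumption: the current budget is implied by those two numbers (cut / share).
current_budget = cut_billions / cut_share
proposed_budget = current_budget - cut_billions

print(round(current_budget, 2), "billion (implied current budget)")
print(round(percent_change(current_budget, proposed_budget), 1), "percent change")
```

The same dollar cut applied to a much larger budget would produce a much smaller percentage, which is why the percentage is the better basis for comparing departments.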

The next visualization is from the Guardian, which presents the 2015 data for the Congressional representatives across the nation.

Successful visualizations don’t just let you compare; they also let you personalize the data. How does the data relate to me? We call this “navel gazing” because we assume readers are most interested in their own situation (that is, they like to look at their own belly buttons). The Guardian’s interactive lets you choose categories (gender, race, education, etc.) to reveal which members of Congress are “most like you.” Does this visualization encourage personal exploration of the data?

Finally, explore the 3D interactive visualization of arms import and export, created for Google.

The interactive is an amazing display, allowing you to spin the globe and choose any country to show the amount of small arms it imports from and exports to other countries. But step back from the bells and whistles and try to identify what the reader would most likely want to do with this information. We would want to make comparisons. Here’s a question: Compare the exports of China and the United States. Is it easy? Click on the U.S. and you see that the Export bar is halfway up at 0.61 billion. Now click on China and you see that its Export bar is all the way to the top, but at 58.1 million. Does this make it easy to compare? Visually, the orange bars aren’t consistent because the scale varies from country to country (why does the higher number get the shorter bar?). We must instead rely on the actual numbers. But if the graphics in a data visualization don’t help the reader, what good are they?
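One way to see the problem is to put both export figures in the same unit and draw both bars against one shared maximum. The Python sketch below is only an illustration: the function name bar_lengths and the 40-character bar width are arbitrary choices, and the two figures are the ones read off the interactive above.

```python
def bar_lengths(values, width=40):
    """Scale every value against the same maximum so bar lengths are comparable."""
    largest = max(values.values())
    return {name: round(value / largest * width) for name, value in values.items()}


# Export figures quoted in the text, converted to a common unit (millions).
exports_millions = {
    "United States": 0.61 * 1000,  # 0.61 billion
    "China": 58.1,                 # 58.1 million
}

for country, length in bar_lengths(exports_millions).items():
    print(f"{country:<14} {'#' * length} {exports_millions[country]:.1f}M")
```

On a shared scale the U.S. bar is roughly ten times as long as China’s, which is the comparison the original bars hide.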

Your distance learning class assignment is the same as your homework assignment, which is due next class: Pick one data visualization and critique it on your blog in a 300-word post. Is the graphical presentation effective at communicating the information? What is the takeaway message? Does it encourage exploration? Is it misleading? How are colors used? Does the form afford accurate comparisons?

How to Find Data

There’s a lot of data out there. But where do you start to find what you need?

Some basic strategies that work pretty well:

  • Google it. That’s never a bad place to start and it only takes a second. Use Google in a smart way: Use key words specific to your data. Use filetype: to narrow your search to specific file types. For example, use filetype:csv to return only CSV files. Use the results to dig deeper and discover related agencies that may have the data.
  • Figure out who should have the data. Who might have it? Is this information that only the NYPD or the IRS can collect? The Departments of City Planning, Buildings, Housing, Finance and Taxation all keep tabs on who owns property in New York City, where that property is located and what it can be used for. If you know who ought to have the numbers you’re looking for, you can start your search by asking them.
  • Look at recent reporting about the subject. Who has been releasing reports? Who has been cited in stories? Go ask them for data, or ask them for help finding it.
  • Wikipedia is a fantastic resource. Don’t be afraid of it. Most information there comes with a citation — don’t take some Wikipedia author’s word for it, but do look at the source they cited and confirm that the numbers are there.
  • Look for think tanks and aid organizations that specialize in the issue you’re interested in.
  • Ask a librarian.
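As a small illustration of the filetype: tip above, the Python sketch below assembles a search string and a search URL. The helper name build_query and the example key words are made up for illustration; the address is Google’s public search URL pattern.

```python
from urllib.parse import urlencode


def build_query(keywords, filetype=None):
    """Join key words and, optionally, a filetype: operator into one search string."""
    query = " ".join(keywords)
    if filetype:
        # No space between the operator and the file extension.
        query += f" filetype:{filetype}"
    return query


# Hypothetical key words for a search about boating accident data.
query = build_query(["florida", "boating", "accidents"], filetype="csv")
print(query)
print("https://www.google.com/search?" + urlencode({"q": query}))
```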

Know your sources

You can get data anywhere, so it is up to you to decide whether or not you’re working with reliable data. You should know where your sources are coming from: Do they have an agenda that can help you understand how they’re framing the data they put out? You can roughly guess who is behind the NRA Institute for Legislative Action, but what about the Law Center to Prevent Gun Violence? Don’t assume that a think tank is reliable just because it kind of feels professional.

A famous example is the misleading website www.martinlutherking.org. Though the site appears to be an informational site about the civil rights leader Martin Luther King, Jr., it actually is a mouthpiece for the white supremacist group Stormfront.org. You can verify who owns a domain name using www.betterwhois.com.

Provenance

It is also up to you to know where your data is coming from. Did the organization hire a research firm to conduct a comprehensive study? Or did they post a little box on their website asking visitors how they feel?

Be skeptical: an advocate (or government agency) insisting that these numbers mean something doesn’t make it so.

Where to look?

The Journalism School’s Research Center maintains an excellent roundup of guides, many of which will point you to great data sets. Check out the census, business and crime guides in particular.

NICAR’s database library is a great resource. So is Amanda’s tumblr’s “data sources” tag.

Here’s a working guide from last semester: https://github.com/amandabee/cunyjdata/wiki/Where-to-Find-Data

Spreadsheets Walkthrough

Delimiters and Functions with Flu Data

To review spreadsheet basics, download the data from Google’s Flu Trends. The data is just text with a lot of commas. The goal is to get your data into tidy rows and columns in a spreadsheet, so you can start looking for interesting patterns or trends.

  1. Copy all the data (Cmd+A, then Cmd+C) and in Excel, paste the data (Cmd+V).
  2. The pasted data appears all in one column.
  3. Select column A, and choose Data > Text to Columns.
  4. The Convert Text to Columns Wizard appears, which allows you to define the “delimiter,” or the character that separates columns of data. Choose the comma as your delimiter. You can see a preview of the re-formatted data in the preview window below.
  5. In the next section of the Text to Columns dialog box, you can format the date data so it’s preserved as Year-Month-Day.
  6. Now that all the data has been successfully transferred to your spreadsheet, you can start using formulas and sorting or filtering functions to explore the data. Let’s find the maximum number of flu searches in each country. At the bottom of the spreadsheet, at the end of column B, enter =Max(). All formulas begin with the equals sign, then the function name, and then a pair of parentheses. In between the parentheses, enter the range of data from which you want to find the maximum. You can select the cells with your mouse, or enter the beginning and end cell, separated with a colon.
  7. To extend your Max function to ALL your columns, simply click the bottom-right corner of the cell and drag it to the right. The function is extended and Excel “auto-increments” the range so the maximum is determined for each appropriate column.

Now that you know the maximum value for each country, you can create another =Max() function to find the highest of those maximums, which points you to the country (column) with the largest value overall. Your range would be the cells in the row that displays the max values.
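The same steps (split the text on the comma delimiter, then take the maximum of each column) can also be sketched outside Excel. This is a minimal Python illustration using the standard csv module; the three sample rows and their numbers are invented placeholders standing in for the downloaded text, not real Flu Trends values.

```python
import csv
import io

# Placeholder text in the same shape as the download: a date column, then one column per country.
raw_text = """Date,Argentina,Australia,Austria
2013-01-06,100,200,300
2013-01-13,150,120,280
2013-01-20,90,260,310
"""

# The delimiter tells the reader which character separates the columns.
rows = list(csv.reader(io.StringIO(raw_text), delimiter=","))
header, data = rows[0], rows[1:]

# Equivalent of =Max() at the bottom of each country column.
column_max = {}
for index, country in enumerate(header[1:], start=1):
    column_max[country] = max(int(row[index]) for row in data)

# Equivalent of a second =Max() across that row of maximums, plus the column it came from.
top_country = max(column_max, key=column_max.get)

print(column_max)
print(top_country, column_max[top_country])
```

Unlike the second =Max() in the spreadsheet, which returns only the number, the last step here also reports which column the number came from.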

You don’t always need to use functions. Use Data > Sort (Shift+Command+R) to sort your spreadsheet along a single column, in either descending or ascending order. That’s an easy way to re-order your data to see maximums and minimums.
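Sorting on a single column can be sketched the same way in Python. The rows below are invented placeholders (a week and a search count), used only to show the idea of re-ordering to find the maximum and minimum.

```python
# Placeholder rows (week, searches); the numbers are invented for illustration.
rows = [
    ("2013-01-06", 100),
    ("2013-01-13", 150),
    ("2013-01-20", 90),
]

# Sort on a single column (the second item of each row), largest first.
descending = sorted(rows, key=lambda row: row[1], reverse=True)
# The same column, smallest first.
ascending = sorted(rows, key=lambda row: row[1])

print("maximum:", descending[0])
print("minimum:", ascending[0])
```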

Charting and Visual Encoding

These are the studies and readings that we discussed in class regarding visual encoding of your data:

Summary findings of encodings, from most accurate to least accurate:

  1. Position
  2. Length
  3. Angle
  4. Area
  5. Density and color saturation
  6. Color hue

Know some of the common chart types:

  • Bar charts: comparisons across categories
  • Line charts: trends in a continuous series, such as changes over time along the x-axis (time series)
  • Scatter plots: correlation
  • Bubble plots: scatter plot + an additional variable
  • Pie charts: proportions
  • Area charts/stacked graphs: proportions

Excel Charts to SVG

Bitmap formats (JPEG, PNG, BMP, GIF) are images that are displayed with pixels, or tiny colored dots. Vector graphics, on the other hand, are rendered by the computer based on mathematical formulas and code. For example, a circle would be defined by a radius of a certain size, the color of the outline, the color of the filled inside, its location, and so on. Because vector graphics are defined mathematically, they don’t have a fixed resolution. You can zoom in or out of a vector image, and the display remains sharp. If you zoom into a bitmap file, you’ll see the pixels that make up the image.

An example of charts rendered as vector graphics, from the Guardian.
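To make the circle example concrete, here is a minimal Python sketch that writes out the SVG markup for one circle. The function name circle_svg, the 200-pixel canvas and the colors are arbitrary choices for illustration; the circle element and its cx, cy, r, stroke and fill attributes are standard SVG.

```python
def circle_svg(cx, cy, radius, stroke, fill, size=200):
    """Return SVG markup for one circle on a square canvas."""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}" '
        f'viewBox="0 0 {size} {size}">\n'
        f'  <circle cx="{cx}" cy="{cy}" r="{radius}" stroke="{stroke}" fill="{fill}" />\n'
        f'</svg>\n'
    )


# Location (cx, cy), radius, outline color and fill color define the whole shape.
markup = circle_svg(cx=100, cy=100, radius=80, stroke="black", fill="orange")
print(markup)
```

If you save the printed text in a file ending in .svg and open it in a browser, you can zoom in as far as you like and the edge of the circle stays sharp.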

 

Microsoft Excel can output charts as JPEG, GIF, BMP, or PNG (bitmap formats), or as PDF (vector). You can convert the PDF file into an SVG (Scalable Vector Graphics) file, which is a common format for displaying vector graphics in a browser. Your images will be sharper and easier to edit in a vector graphics editing program like Adobe Illustrator.

 

  1. In Microsoft Excel, size your chart (width and height) in the FORMAT tab.
  2. Select the text in the horizontal and vertical axes, and choose Format Selection to change the Font to Arial. The default Excel font is a bit unusual and sometimes throws off the conversion of text in later steps.
  3. Right-click (Ctrl-click) on the chart frame and choose Save As Picture.
  4. Save your chart picture as a PDF.
  5. Open the PDF in Adobe Illustrator. If you’re comfortable with Illustrator, you can edit your chart.
  6. Choose File > Save As, and save as an SVG (compressed optional).
  7. If you don’t have access to Illustrator, you can use the free online tool CloudConvert (https://cloudconvert.org/svg-to-pdf), which can convert your PDF to SVG. The only downside is that you can’t edit your graphics.
  8. The resulting SVG file is simply a text file of HTML-like (XML) markup that describes the shapes for your chart. You can open it up in a browser, or copy and paste it into an HTML document to display your chart.

Unfortunately, you can’t just paste the SVG code directly into a WordPress post, because WP doesn’t support SVG out of the box. There are plug-ins and other hacks that allow SVG uploads, but a simple solution is to upload your SVG to DigitalStorage and iframe it into your WP post, like this (note that WordPress.com doesn’t allow certain iframes):

http://russellchun.com/cuny/datajournalism/samplevector.html
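For reference, the iframe markup is a single HTML tag that points at the hosted page. The Python sketch below just assembles that tag as text: the helper name iframe_tag and the width and height values are placeholders chosen for illustration, and the URL is the sample page above.

```python
from html import escape


def iframe_tag(src, width=600, height=400):
    """Return an HTML iframe tag pointing at a hosted page."""
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width}" height="{height}"></iframe>'
    )


# URL from the example above; the width and height are placeholder values.
tag = iframe_tag("http://russellchun.com/cuny/datajournalism/samplevector.html")
print(tag)
```

Paste the printed tag into the HTML tab of your post and adjust the width and height to fit your chart.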