Instructor: Kelly Kaufhold, Ph.D.

Week of April 6

Description
Hi, everyone, and welcome to Data Journalism! You’ll learn about different types of data, how media organizations tell stories with data, and how to find your own data!

I’m Kelly Kaufhold. I worked as a multimedia journalist for 20 years and used data in my reporting on politics, science and business. But data, and telling stories with data, have both been around since WAY before I started reporting!

Data basically comes in two forms:

1) Discrete, or individual data – one case or person at a time, like property, tax or arrest records;

2) Aggregate, or grouped data – many cases on one topic all at once, like unemployment rates, FBI Uniform Crime Statistics or the U.S. Census.
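To make the difference concrete, here is a minimal Python sketch of how discrete records roll up into aggregate data. The arrest records, county names and the `aggregate_by` helper are all invented for illustration; they are not real data or part of any tool used in this module.

```python
from collections import Counter

# Hypothetical discrete records: one row per arrest (invented for illustration).
arrest_records = [
    {"case_id": 1, "county": "Hays", "offense": "theft"},
    {"case_id": 2, "county": "Travis", "offense": "assault"},
    {"case_id": 3, "county": "Hays", "offense": "theft"},
    {"case_id": 4, "county": "Travis", "offense": "theft"},
    {"case_id": 5, "county": "Hays", "offense": "assault"},
]


def aggregate_by(records, field):
    """Collapse discrete records into counts per value of `field`."""
    return Counter(record[field] for record in records)


# Aggregate data: many cases summarized at once.
arrests_by_county = aggregate_by(arrest_records, "county")
arrests_by_offense = aggregate_by(arrest_records, "offense")

for county, total in sorted(arrests_by_county.items()):
    print(f"{county}: {total} arrests")
```

Each dictionary in the list is discrete data (one case at a time); the counts that come out are aggregate data (many cases grouped on one topic).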

In the in-person session, I’ll show you some examples of discrete data, which you may find useful when working with student journalists, and I’ll include a variety of examples in this week’s video and links. For this module and section of the PhDigital Bootcamp, though, we’re going to concentrate on aggregate data.

First, watch this video introducing you to telling stories with data, including some history and examples; then see the suggested homework, below.

During our eventual in-person lecture, we’ll get a lot more hands-on and you’ll do much more than find and sort data; you’ll learn how to compare two or three variables at once in Pivot Tables; how to check your data and your analyses; and I’ll show you some visualization tools that quickly let you turn a spreadsheet into an interactive visual.
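If you’d like a preview of what a Pivot Table does, here is a rough Python sketch of the same idea: counting records across two variables at once, then checking that the cells add back up to the original number of rows. The records and the `pivot_counts` helper are hypothetical, made up for this example; in the lecture we’ll do this in a spreadsheet instead.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration: (county, offense) per arrest.
arrests = [
    ("Hays", "theft"),
    ("Hays", "theft"),
    ("Hays", "assault"),
    ("Travis", "theft"),
    ("Travis", "assault"),
    ("Travis", "assault"),
]


def pivot_counts(pairs):
    """Count rows for each (row_value, column_value) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for row_value, column_value in pairs:
        table[row_value][column_value] += 1
    return {row: dict(cols) for row, cols in table.items()}


pivot = pivot_counts(arrests)

# Check the analysis: the cells should add back up to the number of records.
grand_total = sum(sum(cols.values()) for cols in pivot.values())
assert grand_total == len(arrests), "pivot cells do not match the raw row count"

# Print a small table with counties as rows and offenses as columns.
offenses = sorted({offense for _, offense in arrests})
print("county".ljust(10) + "".join(o.ljust(10) for o in offenses))
for county in sorted(pivot):
    cells = "".join(str(pivot[county].get(o, 0)).ljust(10) for o in offenses)
    print(county.ljust(10) + cells)
```

The total check is the habit worth keeping: if the pivot cells don’t sum to the number of rows you started with, something was dropped or double-counted along the way.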

As you might expect, a large number of data stories and sources have emerged to track COVID-19, some of them vitally important, like the Johns Hopkins site tracking cases worldwide:

https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6

And the New York Times tracking U.S. cases and outcomes:

https://www.nytimes.com/interactive/2020/world/coronavirus-maps.html

And here are three twists on data surrounding coronavirus – privacy, a hint of good news and a bit of levity.

The company Unacast tracks, bundles and sells cellphone behavior data to marketers (without our permission). Unacast came up with a way to track our compliance with social isolation. Here’s a Washington Post article about what they found, including a link to the company’s “Social Distancing Scorecard” tool.

A hint of good news: Global emissions are way down because we’ve all been self-isolating, staying off the roads, shuttering factories, etc. It’s a painful economic catastrophe but has been good for our air and water.

Finally, some much-needed humor: Because of our social distancing, much behavior has declined dramatically (working, school attendance, traffic, eating out…). Guess what has increased pretty dramatically? Home alcohol consumption!

Here are a few recent examples of news stories reported with data:

Michael Bloomberg spent half a billion dollars on his failed presidential campaign. How does that compare to your personal budget and spending?

https://www.washingtonpost.com/graphics/2020/politics/wealth-comparison/

Americans are leaving the East Coast and Midwest and moving to the West Coast (derived from a moving company’s data)

http://www.dailymail.co.uk/news/article-5229083/Moving-study-People-continuing-west.html

More gun sales = more accidental gun deaths

https://www.washingtonpost.com/news/to-your-health/wp/2017/12/07/surge-in-gun-sales-after-sandy-hook-shooting-led-to-spike-in-accidental-gun-deaths-study-says/

And we’ll end with another example of tracking cell phone data (without your permission) to analyze your Thanksgiving habits!

https://www.politico.com/magazine/story/2017/11/23/how-donald-trump-ruined-thanksgiving-215862

Finally, we’ll visit the Texas Tribune after the data lecture – the Trib has a whole page of interactive data you can peruse to tell your own story with their data:

https://www.texastribune.org/series/news-apps-graphics-databases/

I especially like their visualization of how much Texas universities spend on athletics (hint: compare Texas State University’s athletic budget with the University of Texas at Austin’s).

https://college-sports.texastribune.org/

Assignments:

You’ll have two assignments for this week:

1) Find an example of data reporting and share a link to it on the #datajournalism Slack channel;

2) Search for some data, find a dataset you like and share it with us; AND, write a short note on what you learned from looking at, or “interrogating,” the data.

You can post your data and thoughts, along with any questions, comments or requests, on the #datajournalism Slack channel this week, or you can email me at kellykaufhold@txstate.edu

Here are some places to look for data, in addition to a Google search with filetype:xls

The National Institute for Computer-Assisted Reporting maintains a big list of sources for data

ProPublica, an independent investigative journalism outlet that partners with major news outlets like the New York Times and Washington Post, also maintains a list of data sources

https://www.propublica.org/datastore/

U.S. Census (select the Data tab)

https://www.census.gov/en.html

U.S. Census data download center

https://factfinder.census.gov/faces/nav/jsf/pages/download_center.xhtml

U.S. Census data downloads by topic (income, education, population, race…)

https://www.census.gov/support/USACdataDownloads.html

Enigma is a repository of databases on lots of industries and governments

https://www.enigma.com/data

Harvard’s “Dataverse” collects links to lots of open data – public, free. This is largely a collection of data from academics, like Dr. Carter and Dr. Kaufhold, who choose to share their data with other researchers.

https://dataverse.org/

Cooperative Congressional Election Study (a DETAILED multi-year database of election details)

https://cces.gov.harvard.edu/

Pew Research Center for the People & the Press (lots of opinion surveys about current events)

http://www.people-press.org/datasets/

Data World provides a large, diverse clearinghouse of global data, especially around trade and economic issues (you can get 3 datasets for free)

This UK site also has links to lots of fun business-related datasets

Crime data

https://www.bjs.gov/rawdata.cfm

https://www.ucrdatatool.gov/

Government data links (one repository for TONS of federal government data on crime, health, manufacturing…)

Data on gun thefts in the U.S.

https://www.thetrace.org/missing-pieces-data/

=== The Art and Science of Data-Driven Journalism – a whole book ===

This is a whole book on data journalism, written by experts and published by the Tow Center for Digital Journalism at the Columbia School of Journalism, Columbia University, New York.

http://towcenter.org/wp-content/uploads/2014/05/Tow-Center-Data-Driven-Journalism.pdf