Data to play with
Converging stacked bar chart: PISA attitudes toward reading
Source: North American (Canada, Mexico, and the United States) results from the 2009 Programme for International Student Assessment (PISA), provided by the OECD.
For me, reading is a waste of time | 42.2% | 40.6% | 11.0% | 6.1% |
I only read if I have to | 22.8% | 35.9% | 30.5% | 10.7% |
Reading is one of my favorite hobbies | 20.3% | 36.3% | 31.9% | 11.4% |
I feel happy if I receive a book as a present | 19.3% | 27.7% | 40.2% | 12.9% |
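If you want to check your chart against the numbers, here is a rough sketch of the arithmetic behind a converging stacked bar: each bar is shifted left so that the boundary between its second and third segments lands on zero. It is written in plain Python just to show the calculation; the same logic carries over to R. The column order (strongest disagreement first, strongest agreement last) is my assumption, since the table above has no column headers, and `converging_segments` is a made-up helper name.

```python
# Response shares (percent) for each statement, in the order listed above.
# ASSUMPTION: the four columns run from strongest disagreement to strongest
# agreement; the source table has no column headers.
pisa = {
    "For me, reading is a waste of time": [42.2, 40.6, 11.0, 6.1],
    "I only read if I have to": [22.8, 35.9, 30.5, 10.7],
    "Reading is one of my favorite hobbies": [20.3, 36.3, 31.9, 11.4],
    "I feel happy if I receive a book as a present": [19.3, 27.7, 40.2, 12.9],
}


def tidy(value):
    """Round to one decimal; adding 0.0 turns a stray -0.0 into 0.0."""
    return round(value, 1) + 0.0


def converging_segments(shares, split=2):
    """Return (start, end) x-positions for each segment of one bar.

    The first `split` segments sit left of zero and the rest sit right of
    it, so every bar meets at the same central baseline.
    """
    start = -sum(shares[:split])
    segments = []
    for share in shares:
        end = start + share
        segments.append((tidy(start), tidy(end)))
        start = end
    return segments


layout = {
    statement: converging_segments(shares)
    for statement, shares in pisa.items()
}

for statement, segments in layout.items():
    print(statement, segments)
```

Each bar's segments share the zero line, so the eye can compare total disagreement (left) against total agreement (right) across statements.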
Dot plot: Kindergarten readiness intervention
Source: Data invented by Stephanie Evergreen
Literacy | 34 | 69 |
Language | 63 | 77 |
Mathematics | 67 | 75 |
Science | 92 | 98 |
Creative arts | 96 | 100 |
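Before plotting, it can help to work out the gap between the two dots in each row and decide on a row order. The sketch below does that in plain Python. I am assuming the first value in each row is the earlier measurement and the second is the later one (the columns are unlabeled above), and sorting by gap size is just one reasonable ordering, not the only one.

```python
# Scores for each readiness domain, in the order listed above.
# ASSUMPTION: the first value is the earlier measurement and the second is
# the later one; the source does not label the two columns.
readiness = {
    "Literacy": (34, 69),
    "Language": (63, 77),
    "Mathematics": (67, 75),
    "Science": (92, 98),
    "Creative arts": (96, 100),
}

# Distance between the two dots in each row.
gaps = {
    domain: second - first
    for domain, (first, second) in readiness.items()
}

# Order rows from the largest gap to the smallest.
ordered = sorted(gaps, key=gaps.get, reverse=True)

for domain in ordered:
    first, second = readiness[domain]
    print(f"{domain:<14} {first:>3} -> {second:>3}  (gap {gaps[domain]:+d})")
```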
Slopegraph or dot plot: Bank profits head east
Source: The Economist, July 3, 2012
Asia Pacific | 18.9% | 53.9% |
North America | 26.5% | 23.4% |
Middle East and Africa | 4.2% | 6.5% |
Latin America | 2.5% | 6.5% |
Western Europe | 46.2% | 6.3% |
Central and Eastern Europe | 1.8% | 3.4% |
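Plotting tools usually want this kind of data in long form, with one row per line endpoint rather than one row per region. Here is a small sketch of that reshaping in plain Python, plus the percentage-point change that sets the direction of each line. The period names `"first"` and `"second"` are placeholders of mine, because the two columns are unlabeled above.

```python
# Share of profits (percent) by region, in the order listed above.
# ASSUMPTION: the first value is the earlier period and the second is the
# later one; "first" and "second" are placeholder period names.
regions = {
    "Asia Pacific": (18.9, 53.9),
    "North America": (26.5, 23.4),
    "Middle East and Africa": (4.2, 6.5),
    "Latin America": (2.5, 6.5),
    "Western Europe": (46.2, 6.3),
    "Central and Eastern Europe": (1.8, 3.4),
}

# Long form: one (region, period, share) record per line endpoint.
long_rows = [
    (region, period, share)
    for region, shares in regions.items()
    for period, share in zip(("first", "second"), shares)
]

# Percentage-point change for each region; the sign is the line's direction.
changes = {
    region: round(second - first, 1)
    for region, (first, second) in regions.items()
}

# Region whose line is steepest in either direction.
steepest = max(changes, key=lambda region: abs(changes[region]))

for region, change in changes.items():
    print(f"{region:<28} {change:+.1f} points")
print("Steepest line:", steepest)
```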
Resources
Books
- Alberto Cairo, The Truthful Art: Data, Charts, and Maps for Communication (Berkeley, California: New Riders, 2016).
- Stephanie D. H. Evergreen, Effective Data Visualization: The Right Chart for the Right Data (Thousand Oaks, CA: Sage, 2017).
- Dona M. Wong, The Wall Street Journal Guide to Information Graphics: The Dos and Don’ts of Presenting Data, Facts, and Figures (London: W. W. Norton & Company, 2010).
- Hadley Wickham and Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (Sebastopol, California: O’Reilly Media, 2017). [FREE online]
- Alberto Cairo, The Functional Art: An Introduction to Information Graphics and Visualization (Berkeley, California: New Riders, 2013).
- Robin Williams, The Non-Designer’s Design & Type Books: Design and Typographic Principles for the Visual Novice, Deluxe Edition. (Berkeley, California: Peachpit Press, 2008).
Interesting and excellent real-world examples
How to select the appropriate chart type
There are many useful tools for selecting the right chart type for a given dataset or question. Here are some of the best:
- The Data Visualisation Catalogue: Descriptions, explanations, examples, and tools for creating 60 different types of visualizations.
- The Data Viz Project: Descriptions and examples for 150 different types of visualizations. Also allows you to search by data shape and chart function (comparison, correlation, distribution, geographical, part to whole, trend over time, etc.).
- The Chartmaker Directory: Examples of how to create 51 different types of visualizations in 31 different software packages, including Excel, Tableau, and R.
- R Graph Catalog: R code for 124 ggplot graphs.
- Emery’s Essentials: Descriptions and examples of 26 different chart types.
Helpful data visualization resources
Working with R and ggplot2
Pro-tip: Searching for help with R on Google can be tricky because the program is, um, a single letter. Try searching for “rstats” instead. If you use Twitter, post R-related questions and content with #rstats. The R community on Stack Overflow is also incredibly kind and helpful.
R in the wild
A popular (and increasingly standard) way to share your analyses and visualizations is to post an annotated explanation of your process somewhere online. RStudio allows you to publish knitted HTML files directly to RPubs, but you can also post your output to a blog or other type of website. Reading these kinds of posts is one of the best ways to learn R, since they walk you through each step of the process and show the code and output.
Here are some of the best examples I’ve come across:
Data
- Kaggle: Kaggle hosts machine learning competitions where people compete to create the fastest, most efficient, most predictive algorithms. A byproduct of these competitions is a host of fascinating datasets that are generally free and open to the public. See, for example, the European Soccer Database, the Salem Witchcraft Dataset, or results from an Oreo flavors taste test.
- 360Giving: Dozens of British foundations follow a standard file format for sharing grant data and have made that data available online.
- US City Open Data Census: More than 100 US cities have committed to sharing dozens of types of data, including data about crime, budgets, campaign finance, lobbying, transit, and zoning. This site from the Sunlight Foundation and Code for America collects this data and rates cities by how well they’re doing.
- Political science and economics datasets: There’s a wealth of data available for political science- and economics-related topics:
  - François Briatte’s extensive curated lists: Includes data from/about intergovernmental organizations (IGOs), nongovernmental organizations (NGOs), public opinion surveys, parliaments and legislatures, wars, human rights, elections, and municipalities.
  - Thomas Leeper’s list of political science datasets: Good short list of useful datasets, divided by type of data (country-level data, survey data, social media data, event data, text data, etc.).
  - Erik Gahner’s list of political science datasets: Huge list of useful datasets, divided by topic (governance, elections, policy, political elites, etc.).
Colors
- Adobe Color: Create, share, and explore rule-based and custom color palettes.
- ColorBrewer: Sequential, diverging, and qualitative color palettes that take accessibility into account.
- Colorgorical: Create color palettes based on fancy mathematical rules for perceptual distance.
- Colorpicker for data: More fancy mathematical rules for color palettes (explanation).
- iWantHue: Yet another perceptual distance-based color palette builder.
- ColourLovers: Like Facebook for color palettes.
- Photochrome: Word-based color palettes.