Data Visualization

Abstract

A visual is successful when the information encoded in the data is efficiently transmitted to an audience. Data visualization is the discipline dedicated to the principles and methods of translating data to visual form. In this chapter we discuss the principles that produce successful visualizations. The second section illustrates the principles through examples of best and worst practices. In the final section, we navigate through the construction of our best-example graphics.

The drawing shows me at one glance what might be spread over ten pages in a book.

— Ivan Turgenev, Fathers and Sons

Notes

  1.

    Three great examples:

  2.

    If you are building interactive or large-scale graphics for the web, there are better tools. Check out Bootstrap, D3, and Crossfilter.

  3.

    When ordering is a problem, it is often referred to as the “Alabama First!” problem, given how often Alabama ends up at the top of lists that are thoughtlessly put, or left, in alphabetical order. Arrange your lists, like your factors, in an order that makes sense.

  4.

    Kernels are an interesting side area of statistics, and we will encounter them later in the chapter when we discuss loess smoothers. To be a kernel, a function must integrate to 1 and be symmetric about 0. The kernel is used to average the points in a neighborhood of a given value x. A simple average corresponds to a uniform kernel (all points get the same weight). Most high-performing kernels use weights that diminish to 0 as you move farther from a given x. The Epanechnikov kernel, which drops off with the square of distance and goes to zero outside a neighborhood, can be shown to be optimal with respect to mean square error. Most practitioners use Gaussian kernels, the default in R.

  5.

    In practice d is almost always 1 or 2.

  6.

    The purpose of these last two questions is to learn how to extract the bandwidth information from R directly.
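The kernel note above (a kernel integrates to 1, is symmetric about 0, and supplies the weights for a local average) can be illustrated with a minimal Python sketch. It is not the chapter's R code and not R's implementation. The function names, the `bandwidth` parameter that scales the neighborhood, and the toy data are all assumptions made for illustration.

```python
import math

def uniform_kernel(u):
    """Equal weight on [-1, 1], zero outside (a simple average)."""
    return 0.5 if abs(u) <= 1 else 0.0

def epanechnikov_kernel(u):
    """Weight falls off with the square of distance, zero outside [-1, 1]."""
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def gaussian_kernel(u):
    """Standard normal density: weights shrink toward 0 but never reach it."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def integrate(kernel, lo=-10.0, hi=10.0, steps=20000):
    """Midpoint-rule approximation of the kernel's integral over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(kernel(lo + (i + 0.5) * h) for i in range(steps))

def kernel_average(x0, xs, ys, kernel, bandwidth):
    """Average ys with weights kernel((x - x0) / bandwidth).

    bandwidth is a hypothetical name for the neighborhood width.
    """
    weights = [kernel((x - x0) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0:
        raise ValueError("no points fall inside the neighborhood of x0")
    return sum(w * y for w, y in zip(weights, ys)) / total

# Toy data, invented for illustration only.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 2, 5, 4]

for k in (uniform_kernel, epanechnikov_kernel, gaussian_kernel):
    print(k.__name__, round(integrate(k), 4), k(0.3) == k(-0.3))

print(kernel_average(2, xs, ys, uniform_kernel, 1.0))
print(kernel_average(2, xs, ys, epanechnikov_kernel, 2.0))
```

With the uniform kernel, every point inside the neighborhood counts equally, so the result is the plain mean of the neighbors. The Epanechnikov and Gaussian kernels give the points nearest x the largest weights.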

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Steele, B., Chandler, J., Reddy, S. (2016). Data Visualization. In: Algorithms for Data Science. Springer, Cham. https://doi.org/10.1007/978-3-319-45797-0_5

  • DOI: https://doi.org/10.1007/978-3-319-45797-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45795-6

  • Online ISBN: 978-3-319-45797-0

  • eBook Packages: Computer Science, Computer Science (R0)
