TreeCovery: Coordinated dual treemap visualization for exploring the Recovery Act

https://doi.org/10.1016/j.giq.2011.07.004

Abstract

The American Recovery and Reinvestment Act dedicated $787 billion to stimulate the U.S. economy and mandated the release of the data describing the exact distribution of that money. The dataset is a large and complex one; one of its distinguishing features is its bi-hierarchical structure, arising from the distribution of money through agencies to specific projects and the natural aggregation of awards based on location. To offer a comprehensive overview of the data, a visualization must incorporate both these hierarchies. We present TreeCovery, a tool that accomplishes this through the use of two coordinated treemaps. The tool includes a number of innovative features, including coordinated zooming and filtering and a proportional highlighting technique across the two trees. TreeCovery was designed to facilitate data exploration, and initial user studies suggest that it will be helpful in insight generation. RATB (Recovery Accountability and Transparency Board) has tested TreeCovery and is considering including the concept in their visual analytics.

Highlights

  • The Recovery Act money distribution is characterized by bi-hierarchical structure.

  • Two coordinated treemaps offer a comprehensive overview of the data set.

  • It features coordinated zooming and proportional highlighting techniques.

Introduction

In February 2009, President Obama signed an economic stimulus package into law, dedicating $787 billion to create jobs and boost the economy, with the provision that the distribution of the money would be completely transparent. The growing recognition of the importance of design excellence in e-government applications (Fedorowicz & Dias, 2010) has raised attention to models for measuring user satisfaction (Verdegem & Verleye, 2009) and usability guidelines for e-government websites (Donker-Kuijer, de Jong, & Lentz, 2010). These concerns have taken on increased importance as the Obama administration expands its data availability efforts under the Open Government Directive. In fulfillment of this requirement, the agencies in charge of distributing the money and all recipients issued periodic reports detailing how the money they controlled was spent. These publicly available reports comprise a large amount of data, containing information about the effectiveness of the stimulus package, the general trends of distribution, and potentially interesting outliers.

Some effort has already been expended toward producing visualizations of this data that could assist in revealing such details. The government commissioned a website, Recovery.gov, dedicated to this purpose, and several independent journalism outlets have produced their own applications, all offering a particular take on the data. Most of the existing visualizations consist primarily of either tabular or geographical displays. The goals of this effort were to geographically display the distribution, allocation, and expenditure of stimulus recovery funds.

While the data lends itself well to geographical layout, given that states and counties are convenient schemas for chunking the data, exclusive use of maps cannot adequately portray alternate views of the monetary distribution. Specifically, money was distributed through 28 agencies, which assigned it to projects at their discretion; funding was placed in the charge of the prime recipient, who in turn funded sub-recipients and/or vendors as necessary for the project. Agencies naturally funded projects nationwide, and recipients for each project were not necessarily all located in the same area. This view of the data—an agency > project > recipient hierarchy—cannot be adequately conveyed by a geographical substrate.

Our tool, TreeCovery, offers a way to explore data both geographically and according to the monetary outlays. TreeCovery accomplishes this goal through the use of two coordinated treemaps, one drawn with a geographic hierarchy and the other one with levels corresponding to the agency > project > recipient money flow. While the views presented by the two treemaps differ, the underlying data remains identical at all times. Filtering is coordinated across the views and a proportional highlighting technique is used for coordination.
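The coordination between the two views can be illustrated with a minimal Python sketch. It is not TreeCovery's implementation: the record fields, the sample awards, and the function names `aggregate` and `proportional_highlight` are assumptions made for illustration, and the highlighting rule shown (the share of each node's amount that comes from the selected records) is only one plausible reading of proportional highlighting.

```python
from collections import defaultdict

# Hypothetical award records; every name and amount is invented for illustration.
AWARDS = [
    {"agency": "Energy", "project": "Grid upgrade", "recipient": "Acme Power",
     "state": "MD", "county": "Howard", "amount": 40.0},
    {"agency": "Energy", "project": "Grid upgrade", "recipient": "Volt Co",
     "state": "VA", "county": "Fairfax", "amount": 10.0},
    {"agency": "Education", "project": "School repair", "recipient": "Howard Schools",
     "state": "MD", "county": "Howard", "amount": 60.0},
    {"agency": "Education", "project": "School repair", "recipient": "Fairfax Schools",
     "state": "VA", "county": "Fairfax", "amount": 30.0},
]

# The two hierarchies are just two orderings of fields over the same records.
FUNDING_LEVELS = ("agency", "project", "recipient")
GEO_LEVELS = ("state", "county")


def aggregate(records, levels):
    """Sum amounts for every node (path prefix) of the hierarchy given by `levels`."""
    totals = defaultdict(float)
    for rec in records:
        path = ()
        for level in levels:
            path += (rec[level],)
            totals[path] += rec["amount"]
    return dict(totals)


def proportional_highlight(records, selected, levels):
    """For each node in the `levels` hierarchy, return the fraction of its
    amount contributed by the records for which `selected(record)` is true."""
    all_totals = aggregate(records, levels)
    hit_totals = aggregate([r for r in records if selected(r)], levels)
    return {
        path: hit_totals.get(path, 0.0) / total
        for path, total in all_totals.items()
        if total > 0
    }


# Selecting the "Energy" agency in the funding view yields, for each node of
# the geographic view, the portion of that node that should be highlighted.
shares = proportional_highlight(AWARDS, lambda r: r["agency"] == "Energy", GEO_LEVELS)
funding_totals = aggregate(AWARDS, FUNDING_LEVELS)
```

Because both hierarchies are computed from the same list of records, a selection made in one view maps directly onto the other. A renderer could, for example, shade the fraction of each geographic rectangle given by `shares`.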

In addition to the coordinated treemap design, we incorporated a few other features to support exploration. We found that many news articles about the Recovery Act used demographic statistics such as population or unemployment rates. We therefore included census data for each county and made it possible to filter by demographic attributes. We also added the ability to save snapshots of the current state of the treemap for later comparison. Finally, we included support for emphasizing invalid data values.
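A demographic filter of this kind can be sketched as follows. This is an illustrative Python example under stated assumptions, not the tool's code: the census figures are invented placeholders rather than real census data, and `filter_by_demographic` and `totals_by` are hypothetical helper names.

```python
from collections import defaultdict

# Invented county statistics keyed by (state, county); not real census data.
CENSUS = {
    ("MD", "Howard"): {"population": 280_000, "unemployment": 5.1},
    ("VA", "Fairfax"): {"population": 1_000_000, "unemployment": 4.6},
    ("MI", "Wayne"): {"population": 1_900_000, "unemployment": 14.2},
}

# Hypothetical award records with invented amounts.
AWARDS = [
    {"agency": "Energy", "state": "MD", "county": "Howard", "amount": 40.0},
    {"agency": "Energy", "state": "MI", "county": "Wayne", "amount": 25.0},
    {"agency": "Education", "state": "VA", "county": "Fairfax", "amount": 30.0},
    {"agency": "Education", "state": "MI", "county": "Wayne", "amount": 15.0},
]


def filter_by_demographic(awards, census, attribute, low, high):
    """Keep awards whose county has `attribute` within the range [low, high]."""
    kept = []
    for award in awards:
        stats = census.get((award["state"], award["county"]))
        if stats is not None and low <= stats[attribute] <= high:
            kept.append(award)
    return kept


def totals_by(awards, key):
    """Sum award amounts grouped by one field."""
    totals = defaultdict(float)
    for award in awards:
        totals[award[key]] += award["amount"]
    return dict(totals)


# One filter, applied once to the shared records, drives both views.
high_unemployment = filter_by_demographic(AWARDS, CENSUS, "unemployment", 10.0, 100.0)
by_state = totals_by(high_unemployment, "state")    # geographic view
by_agency = totals_by(high_unemployment, "agency")  # funding view
```

Filtering the shared record set before aggregating keeps the two views consistent with each other, since both are rebuilt from the same filtered records.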

The goals of this effort were to enable users to

  • Explore the allocation of stimulus recovery funds by agency and states/counties;

  • Identify extreme cases such as concentration of spending by an agency in one state or county;

  • Discover unusual patterns of spending that show inequities by region, state, and county;

  • Understand which agencies were most active in their state or county;

  • Detect errors and omissions in the data.

Of course, a powerful interface would also enable other tasks, which were beyond the initial planning of the designers. The larger goals are to empower policy makers, journalists, and citizen groups to have increased capacity to explore key data sets that are tied to national priorities.

TreeCovery has been tested by RATB, which gave positive feedback about the concept. We hope that features of TreeCovery will be included in the next version of RATB's visual analytics platform.

Treemaps are among the growing set of information visualization tools that could increase the analytic capabilities of government agency staffers, political analysts, journalists, and other interested citizens. The capacity to identify interesting patterns, clusters, gaps, outliers, and other features is increasingly important in detecting fraud, ensuring fair allocation of resources, and refining policies to ensure effective use of public funds.

Section 2 discusses related work, while Section 3 explains our analysis process, including a detailed illustration of Spotfire's ability to support exploration of the Recovery Act data. Section 4 explains TreeCovery in detail, while Section 5 offers some sample insights found with the tool. Sections 6 and 7 suggest future work and offer conclusions.

Section snippets

Related work

Because the stimulus information is both newsworthy and publicly available, many visualizations of the data are already available. First and foremost, recovery.gov offers geographical maps displaying award locality (Fig. 1). The maps can be zoomed in to state and zip code levels and show dots, each representing a project and colored by its award type (contract, grant, or loan). The site also offers some pie and bar chart summaries, as well as tabular data. While the basic information is thus

Analysis and methodology

To design our tool, we first needed to determine the chief goals of stimulus data visualization. As recipient reports of the Recovery Act had just been released, it was not easy to find end-users who had already done extensive work on the data. Thus, instead of using direct interviews or a survey, we decided to reverse-engineer relevant news articles in order to understand how journalists analyzed the data. Further, we analyzed the data with Spotfire, one of the most versatile

TreeCovery

We designed TreeCovery to be useful for investigative journalists and citizen watchdogs who have some domain knowledge and experience in data analysis. It streamlines the exploration process available through existing visualization techniques and adds more features for data analysis. This section elaborates on the development platform, data, and UI components of TreeCovery.

Insights

To demonstrate the utility of TreeCovery, we give three examples of finding insights.

Future work

While TreeCovery provides some innovative features and encompasses many exploration aids, it can, of course, be greatly improved. As observed during usability evaluation, we can make exploration easier by adding visual references of the size comparison between previous and current status. The shoebox feature could potentially allow more extensive comparison among saved treemaps if the saved views were more interactive. The next version of TreeCovery will allow the entire treemap to be saved and

Implications and recommendations

Our experience in implementing, showing, and evaluating TreeCovery demonstrates the capabilities of information visualization tools to enable policy makers, journalists, and citizen groups to conduct more effective explorations of policy-related data. TreeCovery is especially effective when there are dual hierarchies (e.g. geography and agency structures) and quantitative values (e.g. expenditures or jobs created). Users can find specific amounts for agencies and states/counties, compare

Conclusion

The American Recovery and Reinvestment Act provided for a substantial sum of money, $787 billion, to be distributed with the goal of economic stimulus. Tracking that distribution involves a large, multi-attribute set that can be organized as a dual hierarchy of money flow and geographical allocation. Many visualizations of the stimulus data have already been developed, but none of them adequately portray this dual hierarchy or offer flexible exploration capabilities. Our tool, TreeCovery, uses

References (20)

  • J. Fedorowicz et al.

    A decade of design in digital government research

    Government Information Quarterly

    (2010)
  • P. Verdegem et al.

    User-centered e-government in practice: A comprehensive model for measuring user satisfaction

    Government Information Quarterly

    (2009)
  • M. Burch et al.

    Trees in a treemap: Visualizing multiple hierarchies

  • M.W. Donker-Kuijer et al.

    Usable guidelines for usable websites? An analysis of five e-government heuristics

    Government Information Quarterly

    (2010)
  • R. Donovan

    The 5 worst cities for urban youth — ABC news

  • M.A. Fisherkeller et al.

    Prim9: An interactive multidimensional data display and analysis system

  • A. Fredrikson et al.

    Temporal, geographical and categorical aggregations viewed through coordinated displays: A case study with highway incident data

  • A. Glantz

    Idaho gets four times more stimulus money in contracts than Louisiana — NAM

  • J. Heer et al.

    Prefuse: A toolkit for interactive information visualization

  • M. Jern et al.

    Treemaps and choropleth maps applied to regional hierarchical statistical data

There are more references available in the full text version of this article.


Miguel Rios is a visualization scientist at Twitter Inc. There he builds tools to visualize and analyze Twitter's unique data sets and discovers insights that the company shares with the world. Before Twitter, Miguel worked as a research assistant in the University of Maryland's Human-Computer Interaction Lab, where he was also a graduate student. Miguel's interests are large-scale data visualization and information design.

Puneet Sharma holds a Master's degree in Computer Science from the University of Maryland, College Park. His research interests include information visualization, software engineering, and computer systems.

Tak Yeon Lee is a Ph.D. candidate in Computer Science. He also holds a Master of Science in Industrial Design Engineering from Delft University of Technology, the Netherlands.

Rachel Schwartz graduated from the University of Maryland with an MSc in Computer Science in 2010. She is currently at Google.

Ben Shneiderman (http://www.cs.umd.edu/~ben) is a Professor in the Department of Computer Science and Founding Director (1983–2000) of the Human-Computer Interaction Laboratory (http://www.cs.umd.edu/hcil/) at the University of Maryland. He was elected as a Fellow of the Association for Computing Machinery (ACM) in 1997, a Fellow of the American Association for the Advancement of Science (AAAS) in 2001, and a Member of the National Academy of Engineering in 2010. He received the ACM SIGCHI Lifetime Achievement Award in 2001. He is the co-author with Catherine Plaisant of Designing the User Interface: Strategies for Effective Human-Computer Interaction (5th ed., 2010) http://www.awl.com/DTUI/. With Stu Card and Jock Mackinlay, he co-authored Readings in Information Visualization: Using Vision to Think (1999). His latest book, with Derek Hansen and Marc Smith, is Analyzing Social Media Networks with NodeXL (www.codeplex.com/nodexl, 2010).

1 The first four authors contributed equally to the project.
