My research combines approaches from visualization, cognitive psychology, and AI to address topics related to presenting complex data and other information to the public.

Uncertainty Representations that Work

Many people, including analysts, find uncertainty and probability difficult to reason about when working with data. The use of null hypothesis significance testing is increasingly criticized; however, we lack good representations of uncertainty that can provide analysts and readers of scientific literature with "cognitive evidence" for understanding variation, reliability, and related statistical concepts. Most uncertainty visualizations are static and either present probability continuously, as in density plots or CDFs, or rely on difficult-to-understand constructs like confidence intervals. I have pioneered the use of hypothetical outcome plots (HOPs): animated visualizations in which each frame presents a draw from a distribution describing uncertainty in a parameter. Research in perceptual psychology shows that a frequency encoding does not require conscious effort to interpret, while research in judgment and decision making demonstrates that framing probabilities as frequencies (e.g., 3 out of 10 rather than 30%) eases interpretation for novices and experts alike. Through controlled empirical studies of inference and perceptual decision making, I have shown how HOPs vastly improve multivariate probability estimation and make people more sensitive to the underlying trend in noisy data without requiring the same levels of training as more conventional static plots.
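As a rough illustration of the idea behind HOPs, the sketch below generates the data for a sequence of frames, each holding one draw from a distribution, and then summarizes the frames as a frequency. The normal distribution, the parameter values, and the function name `hop_frames` are assumptions made for this example; they are not taken from the studies described above, and rendering the animation is left out.

```python
import random

def hop_frames(mean, sd, n_frames, seed=0):
    """Return one sampled outcome per animation frame.

    Assumes, for illustration only, that uncertainty in the parameter
    is described by a normal distribution.
    """
    rng = random.Random(seed)  # seeded so the sequence is reproducible
    return [rng.gauss(mean, sd) for _ in range(n_frames)]

# Hypothetical parameter estimate: mean 10.0, standard deviation 2.0
frames = hop_frames(mean=10.0, sd=2.0, n_frames=20)

# Frequency framing: count the frames in which the outcome exceeds a threshold
n_above = sum(1 for x in frames if x > 12.0)
print(f"{n_above} out of {len(frames)} frames exceed 12.0")
```

A viewer of the animation would see the 20 outcomes one at a time; the count printed at the end is the kind of frequency judgment the frames are meant to support.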

In some settings, animated plots are not feasible. I have also worked to develop non-animated discrete outcome encodings of uncertainty information, such as the quantile dotplot: a discrete outcome representation of a probability density function. Collaborators and I have shown how this encoding better supports everyday decisions and reasoning, including deciding when to leave for the bus and judging the reliability of an effect reported in media coverage of a scientific study.
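One way to compute the dot positions for a quantile dotplot is to take evenly spaced quantiles of the distribution, so that each dot stands for an equal share of probability. The sketch below does this under stated assumptions: the bus arrival distribution (normal, mean 12 minutes, standard deviation 3 minutes) and the function name `quantile_dots` are hypothetical, and binning the dots for display is omitted.

```python
from statistics import NormalDist

def quantile_dots(dist, n_dots):
    """Return n_dots values, each representing a 1/n_dots share of probability.

    Dot i is placed at the quantile for probability (i + 0.5) / n_dots,
    i.e. the midpoint of the i-th equal-probability slice.
    """
    return [dist.inv_cdf((i + 0.5) / n_dots) for i in range(n_dots)]

# Hypothetical predictive distribution for minutes until the bus arrives
arrival = NormalDist(mu=12.0, sigma=3.0)
dots = quantile_dots(arrival, 20)

# Reading the plot as frequencies: how many dots fall after the 15-minute mark?
late = sum(1 for d in dots if d > 15.0)
print(f"{late} out of {len(dots)} dots fall after 15 minutes")
# prints "3 out of 20 dots fall after 15 minutes"
```

Counting dots gives 3 out of 20 (15%), close to the exact tail probability of about 16% for this assumed distribution; more dots give a finer approximation at the cost of a busier plot.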

Animated visualization of uncertainty.
Interactive quantile dotplots.

Representative Contributions

Integrating Prior Knowledge in Visualization Interaction

Users' prior knowledge undoubtedly impacts the conclusions they draw from data. However, visualization design and evaluation techniques rarely account for prior beliefs. My research shows how enabling users to articulate their predictions of data via graphical elicitation before they see the observed data in a visualization can improve their ability to understand and recall the data. Properties of the alignment between a person's prior beliefs, the data, and others' (visualized) beliefs can be used to predict how people will update their beliefs. I am currently developing Bayesian models of visualization cognition. By comparing posterior beliefs about a visualized phenomenon to normative beliefs under Bayesian inference, a Bayesian approach explains why some visualization designs perform poorly with greater precision than existing evaluations, for example by quantifying discrepancies between actual and perceived data sample size and by indicating where people may be reasoning with a few salient samples from their prior experiences rather than full subjective uncertainty distributions.
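To make the comparison between reported and normative beliefs concrete, here is a minimal sketch using a Beta-Binomial model. This model, the function names, and all numbers are assumptions chosen for illustration; the models under development may differ. The sketch computes the normative posterior for a proportion and then asks what sample size would make a person's reported posterior mean consistent with Bayesian updating, which is one simple reading of "perceived sample size".

```python
def normative_posterior(prior_a, prior_b, successes, n):
    """Beta(prior_a, prior_b) prior updated with `successes` out of `n` trials."""
    return prior_a + successes, prior_b + (n - successes)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

def implied_sample_size(prior_a, prior_b, data_rate, reported_mean):
    """Sample size at which a Bayesian update would yield `reported_mean`.

    Solves (s * m0 + n * p) / (s + n) = m1 for n, where s = prior_a + prior_b,
    m0 is the prior mean, p the observed rate, and m1 the reported mean.
    """
    prior_mean = beta_mean(prior_a, prior_b)
    strength = prior_a + prior_b
    # The reported mean must lie between the prior mean and the data rate.
    outside = (reported_mean - prior_mean) * (data_rate - reported_mean) < 0
    if reported_mean == data_rate or outside:
        raise ValueError("reported mean is not reachable by Bayesian updating")
    return strength * (reported_mean - prior_mean) / (data_rate - reported_mean)

# Hypothetical case: elicited prior Beta(2, 2); the chart shows 30 of 40 successes
post_a, post_b = normative_posterior(2, 2, successes=30, n=40)
print(f"normative posterior mean: {beta_mean(post_a, post_b):.3f}")

# The person reports a posterior mean of 0.6 after viewing the chart
n_eff = implied_sample_size(2, 2, data_rate=30 / 40, reported_mean=0.6)
print(f"implied sample size: {n_eff:.1f} (actual: 40)")
```

In this made-up case the reported belief corresponds to treating the 40 observations as if they were fewer than 3, the kind of discrepancy between actual and perceived sample size mentioned above.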

Representative Contributions

Formalizing Analogical Reasoning

People regularly rely on analogies to relate unfamiliar information to something more familiar. My work has contributed databases and automated algorithms to generate measurement analogies to facilitate understanding of unfamiliar measurements (e.g., "300 gal is about the volume of a hot tub", "59 acres is twice the size of Millennium Park" for a reader in Chicago). I have also developed tools to support analogical reasoning when working with multiple visualizations of related data, such as sequenced visualizations or dashboards. I proposed and evaluated a constraint-based model of multiple view consistency that can be used to address the inconsistent use of visual encodings across views (e.g., the same color means different things across views, or the same field is plotted with different axis ranges across views) that arises from system defaults based on single view effectiveness guidelines. In both lines of work, I proposed effectiveness criteria, which I instantiated in automated algorithms that explicitly model trade-offs between conflicting criteria to produce solutions that mimic manually authored representations by expert designers and journalists.
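The two kinds of inconsistency named above can be detected with a simple check, sketched below. This is only a detector over a hypothetical view specification (dictionaries with `axis_ranges` and `colors` keys, invented for this example); it is not the constraint-based model itself, which also weighs trade-offs between conflicting criteria.

```python
from collections import defaultdict

def find_inconsistencies(views):
    """Report fields with differing axis ranges and colors with differing meanings."""
    ranges = defaultdict(set)          # field -> set of axis ranges seen
    color_meanings = defaultdict(set)  # color -> set of data values it encodes
    for view in views:
        for field, axis_range in view.get("axis_ranges", {}).items():
            ranges[field].add(tuple(axis_range))
        for color, meaning in view.get("colors", {}).items():
            color_meanings[color].add(meaning)

    issues = []
    for field, seen in sorted(ranges.items()):
        if len(seen) > 1:
            issues.append(f"field '{field}' uses {len(seen)} different axis ranges")
    for color, seen in sorted(color_meanings.items()):
        if len(seen) > 1:
            issues.append(f"color '{color}' encodes {len(seen)} different values")
    return issues

# Hypothetical two-view dashboard
dashboard = [
    {"axis_ranges": {"price": (0, 100)}, "colors": {"blue": "Europe"}},
    {"axis_ranges": {"price": (20, 80)}, "colors": {"blue": "Asia"}},
]
for issue in find_inconsistencies(dashboard):
    print(issue)
```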

The distances and areas listed in a news article are re-expressed into measurements relevant to a location provided by the user.

Representative Contributions

Visualization-based Communication

Visualizations are often used to communicate about data, for example in the media where interactive graphics are designed to supplement text articles, and in communicating analysis in research and industry settings. Presentation order, annotation, and consistency in the design of multiple visualizations are just a few considerations that impact communication-oriented visualization.

Most visualization tools, however, focus on supporting analysis. I have worked in the area of narrative visualization, or the use of statistical graphics to tell stories around data. One aim of my work in this area is to develop better tools for automatically and semi-automatically constructing such visualizations. For example, how can we develop semi-automated algorithms that help suggest good presentation orders and designs to a narrative visualization designer? Can we automatically construct annotated visualizations to make it easier for news and other organizations to generate visualizations to contextualize text articles?

Annotated visualization example.
Annotated visualization showing flow of general to specific.

Representative Contributions

Talking About Science

Scientific jargon makes scientific research inaccessible to lay people and scientists in other fields. Text simplification approaches can help, but we lack usable tools that can help authors (such as science journalists or bloggers) and readers access possible simplifications of jargon when they need them. We are developing interfaces that use word embeddings and other methods to learn simplifications (e.g., mappings between complex and simple terms) from large corpora and then suggest them on demand as a person reads or writes about science.
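A minimal sketch of the lookup step follows: given embeddings for terms, suggest the word from a simple vocabulary that lies closest to a jargon term by cosine similarity. The three-dimensional vectors, the vocabulary, and the function names are made up for illustration; real embeddings would be learned from large corpora and have far more dimensions, and the interfaces in development may use other methods as well.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def suggest_simpler(term, embeddings, simple_vocab):
    """Return the simple-vocabulary term whose embedding is closest to `term`."""
    target = embeddings[term]
    candidates = [w for w in simple_vocab if w in embeddings and w != term]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))

# Toy, hand-written vectors (hypothetical; not learned from any corpus)
toy_embeddings = {
    "myocardial infarction": [0.9, 0.1, 0.0],
    "heart attack": [0.8, 0.2, 0.1],
    "headache": [0.1, 0.9, 0.2],
}
simple_terms = ["heart attack", "headache"]

print(suggest_simpler("myocardial infarction", toy_embeddings, simple_terms))
# prints "heart attack"
```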

Related Contributions