The goal of ggvis is to make it easy to build interactive graphics for exploratory data analysis. ggvis has a similar underlying theory to ggplot2 (the grammar of graphics), but it’s expressed a little differently, and adds new features to make your plots interactive. ggvis also incorporates reactive programming ideas drawn from shiny.
The graphics produced by ggvis are fundamentally web graphics and work very differently from traditional R graphics. This allows us to implement exciting new features like interactivity, but it comes at a cost. For example, every interactive ggvis plot must be connected to a running R session (static plots do not need a running R session to be viewed). This is great for exploration, because you can do anything in your interactive plot you can do in R, but it’s not so great for publication. We will overcome these issues in time, but for now be aware that we have many existing tools to reimplement before you can do everything with ggvis that you can do with base graphics.
This vignette is divided into four main sections:
qvis()
Each section will introduce a major idea in ggvis, and point to more detailed explanations in other vignettes.
qvis()
Every ggvis graphic starts with a call to ggvis() or qvis():
qvis() is a quick way of getting your hands dirty with plotting.
ggvis() makes fewer assumptions about what you’re trying to do. This gives you more control, but also means you need to type more to get a basic plot.
We’ll start with qvis() because it lets us make graphics very quickly. You can learn more about ggvis() in the data hierarchy vignette.
A basic qvis() call creates a scatterplot. It has three arguments: a data set, the variable displayed on the x-axis and the variable on the y-axis.
qvis(mtcars, ~wt, ~mpg)
(If you’re not using RStudio, you’ll notice that this plot opens in your web browser. That’s because all ggvis graphics are web graphics, and need to be shown in the browser. RStudio includes a built-in browser so it can show you the plots directly.)
If you only supply a dataset and an x-position, qvis() will draw a barchart or histogram depending on whether the x variable is categorical or continuous. We use ~ before the variable name to indicate that we don’t want to literally use the value of the wt variable (which doesn’t exist), but instead want to use the wt variable inside the dataset.
# A histogram
qvis(mtcars, ~wt)
#> Guess: transform_bin(binwidth = 0.13) # range / 30
# A barchart
qvis(mtcars, ~factor(cyl))
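The histogram message above reports that the bin width was guessed as the data range divided by 30. As a minimal sketch in base R (it does not call ggvis, and the variable names are only illustrative), the same number can be reproduced by hand:

```r
# Base R only; mtcars ships with every R installation.
wt_range <- range(mtcars$wt)        # smallest and largest car weight
binwidth <- diff(wt_range) / 30     # the "range / 30" rule from the message
round(binwidth, 2)                  # 0.13, matching the guess printed above
```

If the guessed width looks too coarse or too fine, you can supply your own value through the binwidth argument instead.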
You can add more variables to the plot by mapping them to other visual properties like fill, stroke, size and shape.
qvis(mtcars, ~wt, ~mpg, stroke = ~vs)
qvis(mtcars, ~wt, ~mpg, fill = ~vs)
qvis(mtcars, ~wt, ~mpg, size = ~vs)
qvis(mtcars, ~wt, ~mpg, shape = ~factor(cyl))
If you want to make the points a fixed colour or size, you need to use := instead of =. The := operator means to use a raw, unscaled value. This seems like something that qvis() should be able to figure out by itself, but making it explicit allows you to create some useful plots that you couldn’t otherwise. See the properties and scales vignette for more details.
qvis(mtcars, ~wt, ~mpg, fill := "red", stroke := "black")
qvis(mtcars, ~wt, ~mpg, size := 300)
qvis(mtcars, ~wt, ~mpg, shape := "cross")
As well as mapping visual properties to variables, or setting them to specific values, you can also connect them to interactive controls.
Please note that most of the interactive plots below will not work properly when viewing this document as a web page. The first example below will work because it has been manually added in an iframe, but the others will not. If you want to see them in action, run the code in R.
The following example allows you to control the size and opacity of points with two sliders:
slider_s <- input_slider(10, 100)
slider_o <- input_slider(0, 1, value = 0.5)
qvis(mtcars, ~wt, ~mpg, size := slider_s, opacity := slider_o)
You can also connect interactive components to other plot parameters like the binwidth of a histogram:
binwidth <- input_slider(0, 2, value = 1, step = 0.1)
qvis(mtcars, ~wt, binwidth = binwidth)
#> Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
#> Generating a static (non-dynamic, non-interactive) version of plot.
Behind the scenes, interactive plots are built with shiny, and you can currently only have one running at a time in a given R session. To finish with a plot, press the stop button in RStudio, or close the browser window and then press Ctrl + C in R.
As well as input_slider(), ggvis provides input_checkbox(), input_checkboxgroup(), input_numeric(), input_radiobuttons(), input_select() and input_text(). See the examples in the documentation for how you might use each one.
You can also use keyboard controls with left_right() and up_down(). Press the left and right arrows to control the size of the points in the next example.
keys_s <- left_right(10, 1000, step = 50)
qvis(mtcars, ~wt, ~mpg, size := keys_s, opacity := 0.5)
#> Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
#> Generating a static (non-dynamic, non-interactive) version of plot.
You can also add on more complex types of interaction like tooltips:
qvis(mtcars, ~wt, ~mpg) + tooltip(function(df) df$wt)
#> Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
#> Generating a static (non-dynamic, non-interactive) version of plot.
You’ll learn more about complex interaction in the interactivity vignette.
The previous examples showed how to make scatterplots and barcharts, but of course you can create many other types of visualisations. You change the visualisation using the layers parameter. There are two types of layer:
Simple, which include primitives like points, lines and rectangles.
Composite, which combine data transformations with simple layers.
There are five simple layers:
Points, layer_point(). This is the default layer when you use qvis() with two variables. It has properties x, y, shape, stroke, fill, strokeOpacity, fillOpacity, and opacity.
qvis(mtcars, ~wt, ~mpg)
qvis(mtcars, ~wt, ~mpg, layers = "point")
Paths and polygons, layer_path().
df <- data.frame(x = 1:10, y = runif(10))
qvis(df, ~x, ~y, layers = "path")
If you supply a fill, you’ll get a polygon:
t <- seq(0, 2 * pi, length = 100)
df <- data.frame(x = sin(t), y = cos(t))
qvis(df, ~x, ~y, fill := "red", layers = "path")
Filled areas, layer_area(). Use properties y and y2 to control the extent of the area.
df <- data.frame(x = 1:10, y = runif(10))
qvis(df, ~x, ~y, layers = "area")
qvis(df, ~x, ~y + 0.1, y2 = ~y - 0.1, layers = "area")
Rectangles, layer_rect(). The location and size of the rectangle are controlled by the x, x2, y and y2 properties.
df <- data.frame(x1 = runif(5), x2 = runif(5), y1 = runif(5), y2 = runif(5))
qvis(df, ~x1, ~y1, x2 = ~x2, y2 = ~y2, fillOpacity := 0.1, layers = "rect")
Text, layer_text(). The text layer has many new options to control the appearance of the text: text (the label), dx and dy (margin in pixels between text and anchor point), angle (rotate the text), font (font name), fontSize (size in pixels), fontWeight (e.g. bold or normal), fontStyle (e.g. italic or normal).
df <- data.frame(x = 3:1, y = c(1, 3, 2), label = c("a", "b", "c"))
qvis(df, ~x, ~y, text := ~label, layers = "text")
qvis(df, ~x, ~y, text := ~label, layers = "text", fontSize := 50)
qvis(df, ~x, ~y, text := ~label, layers = "text", angle := 45)
Four richer layers are:
layer_line(), which automatically orders by the x variable:
t <- seq(0, 2 * pi, length = 20)
df <- data.frame(x = sin(t), y = cos(t))
qvis(df, ~x, ~y, layers = "path")
qvis(df, ~x, ~y, layers = "line")
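The difference between the two layers comes down to row order: a path connects points in the order they appear in the data, while a line connects them in order of x. The following base R sketch illustrates the idea of ordering by x; it is not ggvis's internal code, and the name df_line is only illustrative:

```r
# Same circle data as above, built with base R only.
t <- seq(0, 2 * pi, length.out = 20)
df <- data.frame(x = sin(t), y = cos(t))

# A path would use df as-is, so x rises and falls as the circle is traced.
path_is_ordered <- !is.unsorted(df$x)

# A line would first sort the rows by x, so x never decreases.
df_line <- df[order(df$x), ]
line_is_ordered <- !is.unsorted(df_line$x)
```

Here path_is_ordered is FALSE and line_is_ordered is TRUE, which is why the path draws a circle while the line zig-zags from left to right.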
layer_histogram() and layer_barchart(), which allow you to explore the distribution of continuous and discrete variables respectively. layer_histogram() bins the data then displays the results with layer_rect(). It is the default layer when qvis() is called with a dataset and a continuous x variable.
qvis(mtcars, ~wt)
#> Guess: transform_bin(binwidth = 0.13) # range / 30
qvis(mtcars, ~wt, layers = "histogram")
#> Guess: transform_bin(binwidth = 0.13) # range / 30
The most important parameter to layer_histogram() is the bin width:
qvis(mtcars, ~wt, binwidth = 1)
qvis(mtcars, ~wt, binwidth = 0.1)
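To see why the bin width matters, it can help to count the observations per bin by hand. This is a rough base R sketch of fixed-width binning, not the algorithm ggvis uses; count_bins() is a hypothetical helper written for this illustration:

```r
# Hypothetical helper: count observations in bins of a fixed width.
count_bins <- function(x, binwidth) {
  # Breaks start at or below min(x) and end at or above max(x).
  breaks <- seq(floor(min(x) / binwidth) * binwidth,
                ceiling(max(x) / binwidth) * binwidth,
                by = binwidth)
  table(cut(x, breaks, include.lowest = TRUE))
}

wide   <- count_bins(mtcars$wt, binwidth = 1)    # a few bins with many cars each
narrow <- count_bins(mtcars$wt, binwidth = 0.1)  # many bins with few cars each
```

Both tables account for all 32 cars, but the wide version summarises them in far fewer bins, which hides detail that the narrow version shows.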
layer_smooth() fits a smooth model to the data, and displays predictions with a line and confidence interval. It’s useful to highlight the trend in noisy data:
qvis(mtcars, ~wt, ~mpg, layers = "smooth")
#> Guessing method = loess
#> Guessing formula = y ~ x
You can control the degree of wiggliness with the span parameter:
span <- input_slider(0.2, 1, value = 0.75)
qvis(mtcars, ~wt, ~mpg, layers = "smooth", span = span)
#> Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
#> Generating a static (non-dynamic, non-interactive) version of plot.
#> Guessing method = loess
#> Guessing formula = y ~ x
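The messages above show that the smoother is guessed to be loess with the formula y ~ x. As a hedged sketch of what span controls, the same kind of model can be fitted directly with loess() from base R's stats package; how ggvis calls it internally is an assumption here, and the object names are illustrative:

```r
# Fit two loess curves to the same data with different spans.
fit_wiggly <- loess(mpg ~ wt, data = mtcars, span = 0.5)  # each local fit uses about half the points
fit_smooth <- loess(mpg ~ wt, data = mtcars, span = 1)    # each local fit uses all the points

# Predict on an evenly spaced grid inside the observed range of wt.
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))
pred_wiggly <- predict(fit_wiggly, newdata = grid)
pred_smooth <- predict(fit_smooth, newdata = grid)
```

Plotting pred_wiggly and pred_smooth against grid$wt (for example with lines()) shows the smaller span following local bumps in the data more closely.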
You can learn more about layers in the layers vignette.
Rich graphics can be created by combining multiple layers on the same plot. There are two ways to combine layers. The first and simplest way is to provide a vector of layer names to layers:
qvis(mtcars, ~wt, ~mpg, layers = c("smooth", "point"))
#> Guessing method = loess
#> Guessing formula = y ~ x
This is ok for very simple plots, but it’s fundamentally limited because you can’t use different properties or parameters on different layers. A more flexible approach is to add layers to a base plot:
qvis(mtcars, ~wt, ~mpg) + layer_smooth()
#> Guessing method = loess
#> Guessing formula = y ~ x
You could use this approach to add two smoothers with varying degrees of wiggliness:
qvis(mtcars, ~wt, ~mpg) +
  layer_smooth(span = 1, se = FALSE) +
  layer_smooth(span = 0.3, se = FALSE)
#> Guessing method = loess
#> Guessing formula = y ~ x
#> Guessing method = loess
#> Guessing formula = y ~ x
There’s an important difference between qvis() and individual layers. qvis() does some magic to guess whether an argument is a property or an argument to a transformation. The individual layers don’t use this magic, so you need to identify properties by wrapping them in props():
qvis(mtcars, ~wt, ~mpg, layers = c("point", "smooth"), stroke := "red")
#> Guessing method = loess
#> Guessing formula = y ~ x
qvis(mtcars, ~wt, ~mpg) +
  layer_smooth(props(stroke := "red"))
#> Guessing method = loess
#> Guessing formula = y ~ x
It’s confusing to read code that refers to properties in two different ways, so when you start adding on more complex layers it’s better to switch from qvis() to ggvis():
ggvis(mtcars, props(x = ~wt, y = ~mpg)) +
  layer_point() +
  layer_smooth(props(stroke := "red"))
#> Guessing method = loess
#> Guessing formula = y ~ x
You’ll learn more about building up rich hierarchical graphics in the data hierarchy vignette.
There are also other optional components that you can include:
scales, to control the mapping between data and visual properties. These are described in the properties and scales vignette.
legends and axes, to control the appearance of the guides produced by the scales. See the axes and legends vignette for more details.