Properties and scales

Understanding how properties and scales work in ggvis an important step to mastering basic static plots, and is also necessary for great interactive graphics.

In this chapter, you will learn:

the convenient props() wrapper which makes the most common types of property mappings available through a concise interface
the prop() function, which is more verbose, but gives you full control over all options
what scales do, how properties and scales are connected, how you can override the defaults.

Note that unlike ggplot2, scales do not control the appearance of their guides: see the axes and legends vignette for how to customise their display.

The `props()` wrapper

Every ggvis mark is associated with a set of properties that governs how it is displayed. These properties can be constant values (like 5, “blue”, or “square”), or mapped to variables in your dataset. ggplot2 syntax made a distinction between mapping variables and setting constants. For example, in ggplot2, you might say:

geom_point(aes(x = wt, y = mpg), colour = "red", size = 5)

But in ggvis, everything is a property:

layer_paths(x = ~wt, y = ~mpg, stroke := "red", strokeWidth := 5)

This section introduces props(), a convenient function for creating property objects and binding them to property names. The next chapter shows you how to create the property objects by hand, giving you more control over the specification at the cost of some extra typing.

Mapping vs setting (scaled vs unscaled)

In a data graphic, there is a mapping between values and visual properties. A value is something like 5, or “red”, or the numbers in a column of a data frame. A visual property is something like the x-position, y-position, size, or color of a point, rectangle, or other visual object.

Compared to ggplot2, the controls in ggvis may be a little confusing. However, ggplot2 has a number of special cases that are handled under the hood. Once you get the hang of props(), you should find it simpler than aes() and various ways of setting aesthetics in ggplot2.

The props() function creates objects that govern the relationship between values and properties. One important characteristic is whether the value-property relationship is scaled or unscaled. These are sometimes called mapping (for scaled values) and setting (unscaled values). When the relationship is scaled, the data values go through a mapping function that results in values in the visual property space.

For example, suppose that you want to use a variable in a data set on the x axis, and the data values are numbers from 55 to 75. If the relationship is scaled, then the data value 55 is typically mapped to an x position on the left side of the plot, the data value 75 is mapped to an x position on the right side of the plot, and values in between are linearly mapped somewhere between those positions.

If the relationship is unscaled, then the data values 55 to 75 would be used directly by the rendering engine. In the case of ggvis and Vega, a value of 55 means “55 pixels from the left edge”, and a value of 75 means “75 pixels from the left edge”. No matter how the plot is resized, those pixel positions would remain the same.

The props() function uses the = operate for mapping (scaled), and the := operator for setting (unscaled). It also uses the ~ operator to indicate that an expression should be evaluated in the data (and in ggvis, the data can change); without the ~ operator, the expression is evaluated immediately in the current environment. Generally speaking, you’ll want to use ~ for variables in the data, and not use it for constant values.

Here are some examples of how to use = (mapping) and := (setting), as well as ~ (evaluated in data) or not.

props(x = ~displ, y = ~mpg): map engine displacement (in the data) to x and miles per gallon to y
props(stroke := "red", fill := "blue"): set the stroke colour to red and the fill colour to blue.
props(x = ~displ, y := 10): map displacement to xand set the y position to 10 pixels (for unscaled y values, the origin is at the top).

Those examples follow a common pattern: = is always scaled, and := is never scaled. ~ is always used with the name of the variable. What if you try the opposite?

props(x = 0): sets the x position to the data value 0

It’s also possible provide a scaled constant instead of a raw constant. That’s useful when you want to label different layers in a plot:

mtcars %>% ggvis(x = ~wt, y = ~mpg) %>%
  layer_points() %>%
  layer_model_predictions(model = "lm", stroke = "lm") %>%
  layer_smooths(stroke = "loess")

(Note: this isn’t currently supported in ggvis because of a limitation of vega. See https://github.com/rstudio/ggvis/issues/29 for progress.)

Valid properties

Not all marks support all properties. The complete list of all properties is available in ?marks, and mark functions will check that you’ve supplied them valid properties, and will endeavour to provide you a helpful suggestion:

mtcars %>% ggvis() %>% layer_lines(strke = ~cyl)
#> Error: Unknown properties: strke. Did you mean: stroke?
mtcars %>% ggvis(strke = ~cyl) %>% layer_lines()
#> Error: Unknown properties: strke. Did you mean: stroke?

Capture of local variables

Variable properties can refer to both variables in the dataset and variables in the local environment:

df <- data.frame(x = 1:10)
f <- function(n) {
  df %>% ggvis(x = ~x, y = ~x ^ n) %>% layer_paths()
}
f(1)

f(2)

f(4)

Technically, ggvis uses the environment captured by the formula when it is created, which may be important if you’re generating formulas in one function and using them in another. You can always override the environment by calling prop() and supplying the env argument.

`prop()`

A prop has two key properties:

value: which can be a constant, the name of a variable (or an expression involving one or more variables), or an interactive input. An interactive input must yield either a constant, or an variable name/expression.
the scale: if scaled, a vega scale is in charge of converting the data value to something that makes sense as a visual property. If unscaled, the value is used as is.

Unscaled is the equivalent of using scale_identity in ggplot2.

Type	Scaled	Common use
variable	yes	Map variable to property
variable	no	Variable already contains visual values
constant	yes	Annotation position in data space
constant	no	Pixel location, or exact colour

Special evaluation and variables

prop() doesn’t do any special evaluation which means that you if you want a variable, you need to supply the name of a property, and a quoted expression or a one-sided formula:

prop("x", quote(mpg))

#> <variable> mpg (property: x, scale: x, event: update)

prop("y", ~cyl)

#> <variable> cyl (property: y, scale: y, event: update)

If you have the name of a variable as a string, you can convert it a name with as.name():

var <- "mpg"
prop("x", as.name(var))

#> <variable> mpg (property: x, scale: x, event: update)

If you have an R expression as a string, parse() it then extract the first element of the result:

expr <- "mpg / wt"
prop("x", parse(text = expr)[[1]])

#> <variable> mpg/wt (property: x, scale: x, event: update)

Properties -> scales

Like in ggplot2, scales control the mapping between data values and values interpreted by the drawing device. Scales are added to the top-level plot specification and are usually created with dscale (short for default scale):

# Override the default data limits:
mtcars %>% ggvis(~disp, ~wt) %>%
  layer_points() %>%
  scale_numeric("x", domain = c(50, 500), nice = FALSE) %>%
  scale_numeric("y", domain = c(0, 6), nice = FALSE)

Compared to ggplot2, ggvis has far fewer scales (3 vs 70), with each function doing much more. The three basic scales in ggvis, corresponding to the three basic vega scales are:

scale_quantitative: for quantitative, numeric values
scale_ordinal: for qualitative, categorical values
scale_time: for date/time values

The vega scales are in turn relatively simple wrappers for D3 scales so if you can’t find the details in these docs, or in the vega docs, you may need to look in the D3 docs. Fortunately the arguments are by and large named the same across all three systems, although ggvis uses underscores while vega and D3 use camelCase.

Each (scaled) property needs a scale. By default, scales are added so that every scaled property gets a scale with the same name as the property (with a few exceptions y2 to y, x2 to x, fillOpacity and strokeOpacity to opacity and so on.). See ?add_default_scales for details.

Scale arguments

The scales share the following arguments:

name: a string identifier for the scale. By default, this is the name of the property it is associated with - i.e. the scale for the x values is called “x”, but it doesn’t have to be, and the examples below show some cases where you need this extra flexibility.
domain: the input data values to the scale. If left blank, these will be learned from the properties that use this scale. But you can override it if you want to expand or restrict, or you want to match domains across multiple plots so that they’re easier to compare. domain is equivalent to limits in R graphics. For quantiative scales, you can use a missing value to only constrain one side of the domain.
range: the output visual values from the scale. For quantiative scales this is usually a vector of length two, for ordinal scales, it’s a vector the same length as the domain. Vega interprets some special values: “width”, “height”, “shapes”, “category10” (10 categorical colours).
reverse: a boolean flag that will flip the order of the range.

scale_quantitative and scale_time also share a few other properties:

nice: use nice axis labels? The algorithm is described in the D3 document
zero: include a zero in the scale?
clamp: clamp any values outside the domain to the min/max?

Default scales

Because the range of a scale is usually determined by the type of variable, ggvis provides the dscale function to automatically retrieve a default scale given a property name and variable type:

scale_numeric(vis, "x")
scale_numeric(vis, "y")
scale_nominal(vis, "shape")

You can also provide other arguments to the underlying scale:

scale_numeric(vis, "x", domain = c(10, 100))
scale_numeric(vis, "x", trans = "log")

So dscale() is usually a better way to create new scales that starting from the underlying scale objects.

Custom scales

You can add your own scales for properties that don’t otherwise have defaults. For example, imagine you wanted to use the font of a label to represent some data. There’s no default scale for font, but you could create one yourself:

df <- data.frame(x = runif(5), y = runif(5),
  labels = c("a", "b", "b", "a", "b"))
df %>% ggvis(~x, ~y, text := ~labels, font = ~labels, fontSize := 40) %>%
  layer_text() %>%
  scale_ordinal("font", range = c("Helvetica Neue", "Times New Roman"))

Note the use of text := ~labels: we don’t want to scale the labels - the raw values already make sense in the visual space.

Multiple scales for one property

Generally, you will override the default name of a scale in order to use more scales than the default. You could do this in order to create a dual-axis chart (which is generally a bad idea - read this paper for more details). If you do this, you will also need to add a scale object.

mtcars %>% ggvis(y = ~mpg) %>%
  layer_points(prop("x", ~disp, scale = "xdisp")) %>%
  layer_points(prop("x", ~wt, scale = "xwt"), fill := "blue") %>%
  add_axis("x", "xdisp", orient = "bottom") %>%
  add_axis("x", "xwt", orient = "bottom", offset = 20,
    properties = axis_props(labels = list(fill = "blue")))

Multiple properties for one scale

You could also force ggplot2 to use the same scale for properties that would otherwise use different scales. I’m not sure of a useful example of this, except to force stroke and fill to use the same scale:

df <- data.frame(x = 1:5, y = 1:5, a = runif(5), b = -runif(5))

df %>% 
  ggvis(x = ~x, y = ~y, stroke = ~a, fill = ~b, 
    strokeWidth := 5, size := 1000) %>%
  layer_points() %>%
  add_legend("stroke", properties = legend_props(legend = list(y = 50)))

df %>% 
  ggvis(x = ~x, y = ~y, stroke = ~a, prop("fill", ~b, scale = "stroke"),
    strokeWidth := 5, size := 1000) %>%
  layer_points() %>%
  add_legend("stroke", properties = legend_props(legend = list(y = 50)))

In this case we don’t need to manually add the correct scale, because ggvis has detected it for us automatically.

Property values

Vega renders either svg or canvas, but fortunately most properties are shared across svg or canvas. The following list describes what the property values mean and the set of allowable values.

x, x2, width, y, y2, height, strokeWidth, innerRadius, outerRadius: pixels. Note that by convention (0, 0) is located in the top-left, so y values are relative to the top of the screen and x values are relative to the left of the screen (as opposed to R where (0,0) is on the bottom right). Pixel positions should be greater than 0.
size: area, in pixels. Greater than 0.
opacity, fillOpacity, strokeOpacity: a number between 0 and 1
stroke, fill: colours
startAngle, endAngle: angle in radians
interpolate: “linear”, “step-before”, “step-after”, “basis”, “basis-open”, “cardinal”, “cardinal-open”, “monotone”. See the D3 docs for what they mean.
tension: a number between 0 and 1 that controls a tension parameter to some interpolants.See the D3 docs for more details.
url: a url.
align: “left”, “right”, “center”.
baseline: “top”, “middle”, “bottom”
text: a string
dx, dy: pixel offsets from anchor point
angle: an angle in degrees
font: the name of a font available from the browser.
fontWeight: a font weight, as a string (“normal”, “bold”, “lighter”, “bolder”) or number (100, 200, 300, 400, 500, 600, 700, 800, 900).
fontStyle: “normal”, “italic”, “oblique”