# Plotting vowels in ggplot2

phonetics
vowels
ggplot2
R
My tutorial on how to create vowel plots using ggplot2.
Author: Guilherme D. Garcia

Published: October 28, 2023

Created: September, 2018. Last updated: January 24, 2024

This post is a copy of my tutorial on plotting vowels, which has been on my website since 2018.

As with anything else in R, there are different ways to plot vowels. There are, for example, some specific packages you can use (`phonR` and `vowels`). But you can easily plot vowels without these packages, simply by using `ggplot2`, which may be useful if you’re already familiar with the package. If you’d like to create a simple vowel trapezoid, go here.

### Step 1: Basics

For this example, I’ll create some vowels (random F1 and F2 values taken from a normal distribution), but you can load some existing data, of course (`phonR`, for example). First, let’s see what a typical `ggplot` looks like.

``````
library(tidyverse)

set.seed(10)

# Simulate 50 tokens per vowel: F1 and F2 values drawn from normal distributions
vowels = tibble(vowel = rep(c("a", "e", "i", "o", "u"), each = 50),

                F1 = c(rnorm(50, mean = 800, sd = 100),
                       rnorm(50, mean = 600, sd = 100),
                       rnorm(50, mean = 350, sd = 100),
                       rnorm(50, mean = 600, sd = 100),
                       rnorm(50, mean = 350, sd = 100)),

                F2 = c(rnorm(50, mean = 1500, sd = 150),
                       rnorm(50, mean = 2000, sd = 150),
                       rnorm(50, mean = 2500, sd = 150),
                       rnorm(50, mean = 1000, sd = 150),
                       rnorm(50, mean = 800, sd = 150)))

ggplot(data = vowels, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text() +
  theme_classic() +
  theme(text = element_text(size = 13))
``````

### Step 2: Axes

#### Reversed values

The very first problem with the plot above is that our axes must be reversed. Not only that: ideally, you’d want both F1 and F2 to start at the top-right corner of the plot, just like any typical vowel plot you see in papers.

``````
ggplot(data = vowels, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text() +
  scale_y_reverse() +
  scale_x_reverse() +
  theme_classic() + # Complete theme first, so it doesn't override the tweaks below
  theme(legend.position = "none",
        text = element_text(size = 13))
``````

#### Axis position

It’s very easy to shift the axes: simply add a `position` argument to `scale_x_reverse()` and `scale_y_reverse()`. It’s even easier if you use the `formants()` function from the `Fonology` package.

``````
library(Fonology)

ggplot(data = vowels, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text() +
  formants() +
  theme_classic() + # Complete theme first, so it doesn't override the tweaks below
  theme(legend.position = "none",
        text = element_text(size = 13))
``````
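If you would rather not load an extra package, the `position` argument can be passed directly to the two reversed scales. The sketch below uses a small simulated data set (`toy`, a hypothetical stand-in for `vowels`) and only illustrates the two arguments; it is not meant to reproduce everything `formants()` does.

```r
library(ggplot2)

set.seed(10)

# Hypothetical stand-in for the `vowels` data used in this tutorial
toy = data.frame(
  vowel = rep(c("a", "i", "u"), each = 20),
  F1 = c(rnorm(20, 800, 100), rnorm(20, 350, 100), rnorm(20, 350, 100)),
  F2 = c(rnorm(20, 1500, 150), rnorm(20, 2500, 150), rnorm(20, 800, 150))
)

p = ggplot(data = toy, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text() +
  scale_x_reverse(position = "top") +   # F2 axis drawn at the top
  scale_y_reverse(position = "right") + # F1 axis drawn on the right
  theme_classic() +
  theme(legend.position = "none",
        text = element_text(size = 13))

p
```

Both axes now meet at the top-right corner, which is the layout most vowel charts use.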

Everything else is straightforward. You can now adjust the formatting, add some error bars etc. If you don’t know how to do that, keep reading.

### Step 3: Extras

#### Density plot

You could use `geom_density_2d()` to highlight the density of the vowels.

``````
ggplot(data = vowels, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text() +
  formants() +
  geom_density_2d() +
  theme_classic() + # Complete theme first, so it doesn't override the tweaks below
  theme(legend.position = "none",
        text = element_text(size = 13))
``````
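Contour lines can get busy when vowel categories overlap. One option, sketched here on a hypothetical toy data set, is to request fewer contour levels with the `bins` argument, which `geom_density_2d()` should pass on to the contour computation. The value 4 is an arbitrary choice for illustration, and the plain reversed scales are used so the example does not depend on `Fonology`.

```r
library(ggplot2)

set.seed(10)

# Hypothetical stand-in for the `vowels` data used in this tutorial
toy = data.frame(
  vowel = rep(c("a", "i", "u"), each = 30),
  F1 = c(rnorm(30, 800, 100), rnorm(30, 350, 100), rnorm(30, 350, 100)),
  F2 = c(rnorm(30, 1500, 150), rnorm(30, 2500, 150), rnorm(30, 800, 150))
)

p_density = ggplot(data = toy, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text() +
  geom_density_2d(bins = 4) + # Fewer contour levels per vowel (arbitrary value)
  scale_x_reverse(position = "top") +
  scale_y_reverse(position = "right") +
  theme_classic() +
  theme(legend.position = "none",
        text = element_text(size = 13))

p_density
```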

#### Double error bars

Another thing you could do is plot the mean F1 and F2 values along with their standard errors. That gives you two error bars per vowel, one for each dimension/variable. There are different ways to do that; here, let’s use `geom_errorbar()` (vertical, for F1) and `geom_errorbarh()` (horizontal, for F2).

``````
# First, create summary table (tibble) with means and standard errors
# I'm using dplyr here (since I loaded tidyverse above)

means = vowels %>%
  group_by(vowel) %>%
  summarize(meanF1 = mean(F1),
            meanF2 = mean(F2),
            seF1 = sd(F1)/sqrt(n()),
            seF2 = sd(F2)/sqrt(n()))
``````
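If you have more than two formants, repeating the same two formulas gets tedious. A possible alternative, sketched below, wraps the standard error (standard deviation divided by the square root of the sample size) in a small helper and applies it with `across()`. The names `toy`, `std_error` and `means_alt` are hypothetical, and note that the resulting columns are called `F1_mean`, `F1_se` etc., not `meanF1` and `seF1`; the rest of this tutorial keeps using `means`.

```r
library(dplyr)

set.seed(10)

# Hypothetical stand-in for the `vowels` data used in this tutorial
toy = data.frame(
  vowel = rep(c("a", "i", "u"), each = 20),
  F1 = c(rnorm(20, 800, 100), rnorm(20, 350, 100), rnorm(20, 350, 100)),
  F2 = c(rnorm(20, 1500, 150), rnorm(20, 2500, 150), rnorm(20, 800, 150))
)

# Standard error of the mean: sd / sqrt(n)
std_error = function(x) sd(x) / sqrt(length(x))

# One row per vowel; one mean and one SE column per formant
means_alt = toy %>%
  group_by(vowel) %>%
  summarize(across(c(F1, F2), list(mean = mean, se = std_error)))

means_alt
```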

Now that we have all the information we need, we can just go ahead and plot the vowel means and associated standard errors.

``````
ggplot(data = means, aes(x = meanF2, y = meanF1, color = vowel)) +
  geom_errorbar(aes(ymin = meanF1 - seF1,
                    ymax = meanF1 + seF1),
                width = 0, linewidth = 1) +
  geom_errorbarh(aes(xmin = meanF2 - seF2,
                     xmax = meanF2 + seF2),
                 height = 0, linewidth = 1) +
  formants() +
  theme_classic() + # Complete theme first, so it doesn't override the tweak below
  theme(text = element_text(size = 13)) # Legend still shown here; dealt with next
``````

Ok, this looks good, but we have to fix one crucial thing: how do we want to signal the vowels…? One option is to add the vowels themselves to the plot. We probably don’t want them to be right in the middle of the error bars (since they would need to be big, and could therefore hide the actual bars).

You can add the vowels with `geom_text()` or `geom_label()`, and then adjust their position so that they don’t hide the bars (note that you need an additional `aes()` argument, namely, `label`). Another issue you probably want to fix is the presence of a legend (key), which is completely redundant given that we’re using `geom_text()`.

``````
ggplot(data = means, aes(x = meanF2, y = meanF1, label = vowel)) +
  geom_errorbar(aes(ymin = meanF1 - seF1,
                    ymax = meanF1 + seF1),
                width = 0, linewidth = 1) +
  geom_errorbarh(aes(xmin = meanF2 - seF2,
                     xmax = meanF2 + seF2),
                 height = 0, linewidth = 1) +
  geom_text(position = position_nudge(x = 50, y = 50),
            size = 5, color = "black") +
  formants() +
  theme_classic() +
  theme(text = element_text(size = 13))
``````

This looks better. You can naturally adjust the fontface, color etc. Finally, let’s adjust the labels (note the `\n` to break a line) and add Hz to our axes.

``````
library(scales)

ggplot(data = means, aes(x = meanF2, y = meanF1, label = vowel)) +
  # linewidth (not size) controls line thickness since ggplot2 3.4.0
  geom_errorbar(aes(ymin = meanF1 - seF1, ymax = meanF1 + seF1), width = 0, linewidth = 1) +
  geom_errorbarh(aes(xmin = meanF2 - seF2, xmax = meanF2 + seF2), height = 0, linewidth = 1) +
  geom_text(position = position_nudge(x = 50, y = 50), size = 5, color = "orange") +
  scale_y_reverse(position = "right", labels = unit_format(unit = "Hz", sep = "")) +
  scale_x_reverse(position = "top", labels = unit_format(unit = "Hz", sep = "")) +
  labs(x = "F2\n",
       y = "F1\n") +
  theme_classic() +
  theme(text = element_text(size = 13))
``````

### Final details

Finally, let’s revisit the density plot and adjust its formatting as well. Note that I’m changing the font size, adding some transparency to the actual density layer (so it doesn’t get too cluttered), and controlling the axes a bit better (values and breaks).

``````
ggplot(data = vowels, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text(size = 6) + # Font size for vowels
  scale_y_reverse(position = "right",
                  labels = unit_format(unit = "Hz", sep = ""),
                  breaks = seq(100, 1000, 250)) +
  scale_x_reverse(position = "top",
                  labels = unit_format(unit = "Hz", sep = ""),
                  breaks = seq(200, 3000, 500)) +
  labs(x = "F2\n",
       y = "F1\n",
       title = "Final plot (A)") +
  geom_density_2d(alpha = 0.3) +
  coord_cartesian(xlim = c(3000, 200),
                  ylim = c(1000, 100)) +
  theme_classic() +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5), # Center plot title
        text = element_text(size = 13))
``````
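The axis labels above rely on `unit_format()`. As far as I can tell, recent versions of `scales` keep that function mainly for backward compatibility and point to `label_number()` instead, so here is a minimal sketch of an equivalent labeller. The object name `hz` is hypothetical, and the space before "Hz" is a stylistic choice rather than something the plots above use.

```r
library(scales)

# label_number() returns a function that turns numeric breaks into strings
hz = label_number(accuracy = 1, big.mark = "", suffix = " Hz")

# Try it on a few values before handing it to a scale
hz(c(500, 1000, 2500))
```

The labeller can then be supplied wherever `unit_format()` is used, e.g. `scale_y_reverse(position = "right", labels = hz)`.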

Now with semi-transparent ellipses.

``````
ggplot(data = vowels, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  geom_text(size = 6) + # Font size for vowels
  scale_y_reverse(position = "right",
                  labels = unit_format(unit = "Hz", sep = ""),
                  breaks = seq(100, 1000, 250)) +
  scale_x_reverse(position = "top",
                  labels = unit_format(unit = "Hz", sep = ""),
                  breaks = seq(200, 3000, 500)) +
  labs(x = "F2\n",
       y = "F1\n",
       title = "Final plot (B)") +
  stat_ellipse(type = "norm", alpha = 0.3) +
  coord_cartesian(xlim = c(3000, 200),
                  ylim = c(1000, 100)) +
  theme_classic() +
  theme(legend.position = "none",
        text = element_text(size = 13),
        plot.title = element_text(hjust = 0.5))
``````
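The ellipses above are drawn as outlines. If you prefer shaded regions, one option is to draw them as polygons and map `fill` to the vowel, as in the sketch below (again on a hypothetical toy data set). `level = 0.95` is the default for `stat_ellipse()` and is spelled out only to show where you would change it; the `alpha` value is arbitrary.

```r
library(ggplot2)

set.seed(10)

# Hypothetical stand-in for the `vowels` data used in this tutorial
toy = data.frame(
  vowel = rep(c("a", "i", "u"), each = 30),
  F1 = c(rnorm(30, 800, 100), rnorm(30, 350, 100), rnorm(30, 350, 100)),
  F2 = c(rnorm(30, 1500, 150), rnorm(30, 2500, 150), rnorm(30, 800, 150))
)

p_ellipse = ggplot(data = toy, aes(x = F2, y = F1, color = vowel, label = vowel)) +
  # Shaded ellipses go first so the vowel labels are drawn on top of them
  stat_ellipse(aes(fill = vowel), geom = "polygon",
               type = "norm", level = 0.95, alpha = 0.2) +
  geom_text(size = 6) +
  scale_x_reverse(position = "top") +
  scale_y_reverse(position = "right") +
  theme_classic() +
  theme(legend.position = "none",
        text = element_text(size = 13))

p_ellipse
```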

Finally, let’s keep the ellipses but only show the mean F1-F2 values for each vowel (this will give us a more minimalist plot). To accomplish this, `geom_label()` will need the `means` variable created above (but `stat_ellipse()` will still require `vowels`, so you’ll need to play around with two separate datasets, as shown below).

``````
ggplot(data = means, aes(x = meanF2, y = meanF1, color = vowel, label = vowel)) +
  geom_label(size = 6, fill = "white") + # Font size for vowels
  scale_y_reverse(position = "right",
                  labels = unit_format(unit = "Hz", sep = ""),
                  breaks = seq(100, 1000, 250)) +
  scale_x_reverse(position = "top",
                  labels = unit_format(unit = "Hz", sep = ""),
                  breaks = seq(200, 3000, 500)) +
  labs(x = "F2\n",
       y = "F1\n",
       title = "Final plot (C)") +
  # Ellipses need the raw tokens, so this layer gets its own data and x/y mapping
  stat_ellipse(data = vowels, aes(x = F2, y = F1), type = "norm") +
  coord_cartesian(xlim = c(3000, 200),
                  ylim = c(1000, 100)) +
  theme_classic() +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5),
        text = element_text(size = 13))
``````

You can find more info on plotting vowels using `ggplot2` in this blog post.