Running and visualizing regression models
March 5, 2024
brms + other packages

An analytical example
A test for disease \(D\) is 95% accurate: it returns a positive result for 95% of people who have the disease, and a false positive for 5% of people who don't. \(D\) occurs in 0.1% of the population. What's \(P(D|+)\)?
Bayes’ rule
\[ P(\pi \mid \text{data}) = \frac{P(\text{data} \mid \pi)\,P(\pi)}{P(\text{data})} \]
Defining our values
\[\begin{align*} P(D) &= \frac{1}{1000} \\ P(\neg D) &= 1 - P(D) = \frac{999}{1000} \\ P(+|D) &= 0.95 \quad P(+|\neg D) = 0.05\\ \end{align*}\]
Calculating our posterior probability
\[\begin{align*} P(D|+) &= \frac{P(+|D) \times P(D)}{{P(+)}} \\ {P(+)} &= P(+|D) \times P(D) + P(+|\neg D) \times P(\neg D) \\ &= 0.95 \times 0.001 + 0.05 \times 0.999 = 0.0509 \\ P(D|+) &= \frac{0.95 \times 0.001}{0.0509} \\ &\boxed{\approx 1.87\%} \end{align*}\]
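As a quick sanity check, we can script this calculation in R (posterior_prob is a hypothetical helper defined here, not part of the original materials):

# Hypothetical helper: Bayes' rule for a binary test
posterior_prob = function(prior, sens, fpr) {
  evidence = sens * prior + fpr * (1 - prior)  # P(+), by total probability
  sens * prior / evidence                      # P(D|+)
}
posterior_prob(prior = 0.001, sens = 0.95, fpr = 0.05)  # ~0.0187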
The stronger our priors, the harder it is to change our conclusions given the data. Let's see this visually.
Ideally, our methods should incorporate what we already know about a given phenomenon (i.e., what the literature has already shown).
A simple algorithm
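To build intuition for where posterior draws come from, here is a minimal Metropolis sampler in R. This is a sketch for illustration only (the data, prior, and proposal width are assumptions made here); Stan and brms use the more sophisticated NUTS sampler, as the model output below reports.

set.seed(1)
y = rnorm(50, mean = 2, sd = 1)            # simulated data
log_post = function(mu) {
  sum(dnorm(y, mu, 1, log = TRUE)) +       # log likelihood (sd assumed known)
    dnorm(mu, 0, 10, log = TRUE)           # log prior: Normal(0, 10)
}
n_draws = 5000
draws = numeric(n_draws)
mu = 0                                     # starting value
for (i in seq_len(n_draws)) {
  proposal = rnorm(1, mu, 0.5)             # propose a nearby value
  # Accept with probability min(1, posterior ratio):
  if (log(runif(1)) < log_post(proposal) - log_post(mu)) mu = proposal
  draws[i] = mu
}
mean(draws)                                # posterior mean, close to mean(y)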
brms (Bürkner, 2018) allows us to use the familiar model syntax in R.

A simple model using Stan directly:
data {
  int<lower=0> N;        // number of data points
  vector[N] x;           // predictor variable
  vector[N] y;           // outcome variable
}
parameters {
  real alpha;            // intercept
  real beta;             // slope
  real<lower=0> sigma;   // standard deviation of the errors
}
model {
  // Priors
  alpha ~ normal(0, 10);  // Gaussian prior for intercept
  beta ~ normal(0, 10);   // Gaussian prior for slope
  sigma ~ normal(0, 1);   // half-Gaussian prior for sigma (constrained positive)
  // Likelihood
  y ~ normal(alpha + beta * x, sigma);  // likelihood for linear regression
}
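To run this Stan program directly from R, we would use an interface such as rstan. A minimal sketch (the file name and the x, y objects are placeholders):

library(rstan)
fit_stan = stan(file = "linear_model.stan",              # the program above, saved to disk
                data = list(N = length(x), x = x, y = y),
                chains = 4, iter = 2000)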
tidyverse + some specific packages to help us run and visualize Bayesian models

The danish dataset (Balling & Baayen, 2008):

| Subject | Word | Affix | LogRT | RT |
|---|---|---|---|---|
| 2s20 | vE6ltede | ede | 6.538169 | 691.02 |
| 2s16 | ruinere | ere | 6.664702 | 784.23 |
| 2s16 | dyrkede | ede | 6.370569 | 584.39 |
| 2s01 | vandrende | ende | 6.566841 | 711.12 |
| 2s18 | lyttede | ede | 6.485597 | 655.63 |
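A sketch of how a dan object like this could be built, assuming the danish dataset that ships with languageR (the original preprocessing is not shown):

library(languageR)
library(dplyr)
dan = danish |>
  select(Subject, Word, Affix, LogRT) |>
  mutate(RT = exp(LogRT))  # back-transform log RTs to ms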
We can use lm() to run our model here (the reference level is the affix bar).

Our model:
\[ y_i = \beta_0 + \beta_{ede}\,x_{i[ede]} + \beta_{ende}\,x_{i[ende]} + \beta_{ere}\,x_{i[ere]} + \beta_{lig}\,x_{i[lig]} + e_i \]

where each \(x_{i[\cdot]}\) is an indicator (dummy) variable for the affix of item \(i\), and bar is the reference level captured by the intercept \(\beta_0\).
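A sketch of the corresponding call (the table below summarizes a fit of this form; fit_lm is a name chosen here):

fit_lm = lm(LogRT ~ Affix, data = dan)
summary(fit_lm)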
Model summary (outcome: Log RT):

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 6.80 | 6.77 – 6.82 | <0.001 |
| Affix [ede] | -0.05 | -0.09 – -0.01 | 0.008 |
| Affix [ende] | 0.05 | 0.02 – 0.09 | 0.006 |
| Affix [ere] | 0.02 | -0.02 – 0.06 | 0.354 |
| Affix [lig] | -0.06 | -0.10 – -0.02 | 0.001 |
| Observations | 1040 | | |
| R² / R² adjusted | 0.045 / 0.041 | | |
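The pairwise comparisons below pair an ANOVA with Tukey's HSD; the Fit: line in the output gives the call, which we can reproduce as:

TukeyHSD(aov(LogRT ~ Affix, data = dan))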
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = LogRT ~ Affix, data = dan)
$Affix
diff lwr upr p adj
ede-bar -0.05099340 -1.034311e-01 0.001444266 0.0612021
ende-bar 0.05413199 -5.440768e-05 0.108318388 0.0503757
ere-bar 0.01781355 -3.468447e-02 0.070311574 0.8863835
lig-bar -0.06136498 -1.139239e-01 -0.008806100 0.0127046
ende-ede 0.10512539 5.129924e-02 0.158951542 0.0000012
ere-ede 0.06880695 1.668084e-02 0.120933061 0.0029895
lig-ede -0.01037158 -6.255898e-02 0.041815820 0.9827775
ere-ende -0.03631844 -9.020339e-02 0.017566517 0.3500510
lig-ende -0.11549697 -1.694412e-01 -0.061552722 0.0000001
lig-ere -0.07917853 -1.314266e-01 -0.026930484 0.0003598
Now, the same model in brms:
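A sketch of the call behind this summary. Priors are left at the brms defaults (an assumption; the original priors are not shown), while chains, iterations, and warmup match the Draws: line:

library(brms)
fit1a = brm(LogRT ~ Affix,
            data = dan,
            family = gaussian(),
            chains = 4, iter = 2000, warmup = 1000)
summary(fit1a)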
Family: gaussian
Links: mu = identity; sigma = identity
Formula: LogRT ~ Affix
Data: dan (Number of observations: 1040)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 6.80 0.01 6.77 6.82 1.00 2383 2828
Affixede -0.05 0.02 -0.09 -0.01 1.00 2808 2998
Affixende 0.05 0.02 0.01 0.09 1.00 2878 2880
Affixere 0.02 0.02 -0.02 0.06 1.00 2889 2780
Affixlig -0.06 0.02 -0.10 -0.02 1.00 2883 3171
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.20 0.00 0.19 0.21 1.00 4633 2965
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Packages assumed for the wrangling and plotting below:
library(tidyverse)   # dplyr, tidyr, forcats, ggplot2
library(posterior)   # as_draws_df()
library(tidybayes)   # mean_qi(), stat_halfeye()
library(bayestestR)  # rope(), rope_range()

# Extract posterior draws as a tibble:
post1_raw = as_draws_df(fit1a) |>
as_tibble()
post1_raw = post1_raw |>
select(contains("b_")) |>
rename("bar*" = b_Intercept,
"ede" = b_Affixede,
"ere" = b_Affixere,
"ende" = b_Affixende,
"lig" = b_Affixlig)
post1 = post1_raw |>
pivot_longer(names_to = "Parameter",
values_to = "Estimate",
cols = `bar*`:lig)
post1_summary = post1 |>
group_by(Parameter) |>
mean_qi(Estimate) |>
ungroup()
ROPE1 = rope(fit1a) |>
as_tibble()
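For linear models, bayestestR's default ROPE spans ±0.1 standard deviations of the outcome; the exact limits used above can be checked with:

rope_range(fit1a)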
# Plot the posteriors of each affix effect relative to the intercept:
ggplot(data = post1 |> filter(Parameter != "bar*"),
aes(x = Estimate, y = Parameter)) +
annotate("rect", ymin = -Inf, ymax = +Inf,
xmin = ROPE1$ROPE_low,
xmax = ROPE1$ROPE_high,
alpha = 0.3,
fill = "gray90",
color = "white") +
stat_halfeye(fill = "#B6B7DB", .width = c(0.5, 0.95)) +
theme_classic(base_family = "Futura") +
coord_cartesian(xlim = c(-0.2, 0.2)) +
labs(y = NULL,
x = "Posterior distribution",
title = "Effects relative to intercept (affix **bar**)") +
theme(plot.title = ggtext::element_markdown()) +
geom_label(data = post1_summary |> filter(Parameter != "bar*"),
aes(x = Estimate, y = Parameter, label = Parameter),
family = "Futura",
# position = position_nudge(y = -0.25),
label.padding = unit(0.4, "lines")) +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank()) +
geom_text(data = post1_summary |>
filter(Parameter != "bar*") |>
mutate(across(where(is_double), ~round(., 2))),
aes(label = glue::glue("{Estimate} [{.lower}, {.upper}]"), x = Inf),
hjust = "inward", vjust = -0.5, family = "Futura", color = "gray60") +
geom_vline(xintercept = 0, linetype = "dashed", color = "black")
post1_raw_mc = post1_raw |>
mutate(ede_ende = (`bar*` + ede) - (`bar*` + ende),
ede_ere = (`bar*` + ede) - (`bar*` + ere),
ede_lig = (`bar*` + ede) - (`bar*` + lig),
ere_lig = (`bar*` + ere) - (`bar*` + lig),
ere_ende = (`bar*` + ere) - (`bar*` + ende),
ende_lig = (`bar*` + ende) - (`bar*` + lig)) |>
select(-`bar*`) |>
rename("bar_ede" = ede,
"bar_ende" = ende,
"bar_lig" = lig,
"bar_ere" = ere)
# Now, wide-to-long transform:
post1_mc = post1_raw_mc |>
pivot_longer(names_to = "Comparison",
values_to = "Estimate",
cols = bar_ede:ende_lig)
# Create some point and interval summaries for our posteriors:
post1_summary_mc = post1_mc |>
group_by(Comparison) |>
mean_qi(Estimate) |>
ungroup()
ggplot(data = post1_mc,
aes(x = Estimate, y = fct_reorder(Comparison, Estimate))) +
annotate("rect", ymin = -Inf, ymax = +Inf,
xmin = ROPE1$ROPE_low,
xmax = ROPE1$ROPE_high,
alpha = 0.3,
fill = "gray90",
color = "white") +
geom_vline(xintercept = 0, linetype = "dashed", color = "black") +
stat_halfeye(fill = "#B6B7DB", .width = c(0.5, 0.95)) +
theme_classic(base_family = "Futura") +
coord_cartesian(xlim = c(-0.2, 0.4)) +
labs(y = NULL,
x = "Posterior distribution for each comparison",
title = "**Multiple comparisons** of posterior distributions",
subtitle = "All but **two** comparisons display a credible statistical difference",
caption = "Cf. ANOVA + TukeyHSD, where half of all comparisons had p > 0.05") +
theme(plot.title = ggtext::element_markdown(),
plot.subtitle = ggtext::element_markdown()) +
geom_label(data = post1_summary_mc,
aes(x = Estimate, y = Comparison, label = Comparison),
family = "Futura", size = 3,
# position = position_nudge(y = -0.3),
label.padding = unit(0.4, "lines")) +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank()) +
geom_text(data = post1_summary_mc |>
mutate(across(where(is_double), ~round(., 2))),
aes(label = glue::glue("{Estimate} [{.lower}, {.upper}]"), x = Inf),
hjust = "inward", vjust = -0.5, family = "Futura", color = "gray60", size = 3)
With brms, we can use the same familiar syntax from lme4.

References

Bürkner, P.-C. (2018). Advanced Bayesian multilevel modeling with the R package brms. The R Journal, 10(1), 395–411. https://doi.org/10.32614/RJ-2018-017
Updating our posterior probability
If the same person tests positive a second time, our posterior from the first test becomes the new prior: \(P(D) = 0.0187\), so \(P(\neg D) = 0.9813\).
\[\begin{align*} P(D|+) &= \frac{P(+|D) \times P(D)}{{P(+)}} \\ {P(+)} &= P(+|D) \times P(D) + P(+|\neg D) \times P(\neg D) \\ &= 0.95 \times \boxed{0.0187} + 0.05 \times 0.9813 = \boxed{0.06683} \\ P(D|+) &= \frac{0.95 \times \boxed{0.0187}}{0.06683} \\ &\boxed{\approx 26.58\%} \end{align*}\]
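Reusing the hypothetical posterior_prob() helper from the first example:

posterior_prob(prior = 0.0187, sens = 0.95, fpr = 0.05)  # ~0.266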