NOTE: This document is in progress. Feel free to suggest changes on the package’s GitHub page, or by email silas.tittes@gmail.com.
performr packageperformr implements a probabilistic Bayesian
hierarchical model (using Stan) to
predict tolerance/performance curves for a set of input taxa. The
manuscript that describes the method is in revision. This page is
intended to help those interested in using the model for their own
data.
Installing performr is done using the following
command:
#install performr
pak::pak("silastittes/performr")
#install cmdstanr and CmdStan (required to compile and run the Stan model)
install.packages("cmdstanr", repos = c("https://stan-dev.r-universe.dev", getOption("repos")))
cmdstanr::install_cmdstan()
#load other libraries used below
library(performr)
library(tidyverse)
library(ggridges)
theme_set(theme_minimal())
performr now uses cmdstanr as its Stan backend.
Unlike the previous rstan-based version, the Stan model is compiled on
first use rather than at package install time, so the install step above
should be fast. The cmdstanr::install_cmdstan() call
downloads and builds CmdStan — you only need to do this once. Run
cmdstanr::cmdstan_version() to confirm everything is set up
correctly.
performr depends on several R packages. These can take
some time to install, but usually goes smoothly as long as your R is up
to date! If you run into troubles, check the R session info at the
bottom of this page.
For demonstration and model validation, data can be simulated from the hyperpriors. The true parameter values for both hyper parameters and species-level parameters are returned in addition to a simulated data frame. All arguments have defaults, but fiddling with parameter values is encouraged.
I’ll demonstrate a data set with three species, 20 positions sampled along an environmental axis, and three replicates per species at each position along the axis (wouldn’t that be something):
set.seed(223524)
perf_df <- simulate_data(n_spp = 3, n_axis = 20, n_reps = 3)
Here are a few quick ways to look at the output:
head(perf_df$sim_data)
## # A tibble: 6 × 4
## x trait mu species
## <dbl> <dbl> <dbl> <int>
## 1 -4.72 0.0668 0.0389 1
## 2 -4.72 0.0471 0.0389 1
## 3 -4.72 0 0.0389 1
## 4 -4.32 0.0560 0.0983 1
## 5 -4.32 0.102 0.0983 1
## 6 -4.32 0.0726 0.0983 1
str(perf_df)
## List of 2
## $ true_params:List of 12
## ..$ shape1 : num [1:3] 3.84 3.17 2.25
## ..$ shape2 : num [1:3] 2.61 2.97 2.64
## ..$ stretch : num [1:3] 2.55 2.1 1.97
## ..$ x_min : num [1:3] -5.76 -5.39 -4.55
## ..$ x_max : num [1:3] 4.41 2.52 4.38
## ..$ nu : num [1:3] 17.1 33.6 26
## ..$ mu_shape1 : num 2.57
## ..$ mu_shape2 : num 2.91
## ..$ mu_stretch: num 1.75
## ..$ mu_min : num -5.39
## ..$ mu_max : num 4.22
## ..$ mu_nu : num 3.96
## $ sim_data : tibble [180 × 4] (S3: tbl_df/tbl/data.frame)
## ..$ x : num [1:180] -4.72 -4.72 -4.72 -4.32 -4.32 ...
## ..$ trait : num [1:180] 0.0668 0.0471 0 0.056 0.1024 ...
## ..$ mu : num [1:180] 0.0389 0.0389 0.0389 0.0983 0.0983 ...
## ..$ species: int [1:180] 1 1 1 1 1 1 1 1 1 1 ...
perf_df$sim_data %>%
mutate(species = factor(species)) %>%
ggplot(aes(x, trait, colour = species)) +
geom_point() +
geom_line(aes(x, mu))
Simulated data for three ‘species’ plotted with the curves used to generate the samples.
There are 3 columns of input data required to run the model, where each row represents an individual unit of sampling (name and order the columns as you please):
Integers representing the groups (or “species”) each sample belongs to – even if there is only one group.
An x-axis consisting of measurements from some environmental variable (soil moisture, temperature, etc.).
A measure of performance for each individual studied in that
environmental condition (trait). Rows with missing values must be
removed before fitting – tidyr::drop_na() is good for
this.
performr::stan_performance()Now that we have some example data, we can fit the model. Arguments that include “pr” are parameters for the priors (for brevity, I will gloss over prior details here and refer readers to the paper once available), which I’ve set to match the defaults of the simulation (so this is best case scenario for the model getting it right). Notice, I’ve set a seed to make the results reproducible, and I’ve reduced the total number of iterations to 200. The default iterations of 2000 (or more) should be used for analysis, but 200 is good for demonstration and troubleshooting purposes.
perf_out <-
stan_performance(
df = perf_df$sim_data,
response = trait,
treatment = x,
group_ids = species,
shape1_pr_mu = 2,
shape1_pr_sig = 1,
shape2_pr_mu = 2,
shape2_pr_sig = 1,
stretch_pr_mu = 1,
stretch_pr_sig = 1,
nu_pr_shape = 8,
nu_pr_scale = 3,
min_pr_mu = -5,
min_pr_sig = 1,
max_pr_mu = 5,
max_pr_sig = 1,
iter = 200,
seed = 7345
)
Stan may warn about max tree depth or divergent transitions. You can
pass max_treedepth (12–14 works well) and
adapt_delta (default 0.95) directly to
stan_performance() to address these. The optional
file_id argument saves the chain CSV files to your working
directory if you want to reload results later.
The output is the same as any Stan model.
knitr::kable(perf_out$summary(), digits = 3)
| variable | mean | median | sd | mad | q5 | q95 | rhat | ess_bulk | ess_tail |
|---|---|---|---|---|---|---|---|---|---|
| lp__ | -14.115 | -13.974 | 3.671 | 3.858 | -20.399 | -8.725 | 1.033 | 179.938 | 173.501 |
| shape1[1] | 3.593 | 3.553 | 0.363 | 0.342 | 3.053 | 4.224 | 1.020 | 322.327 | 171.916 |
| shape1[2] | 3.329 | 3.315 | 0.197 | 0.198 | 3.026 | 3.689 | 1.006 | 500.888 | 297.427 |
| shape1[3] | 2.438 | 2.435 | 0.092 | 0.090 | 2.297 | 2.604 | 1.006 | 309.818 | 275.726 |
| shape2[1] | 3.201 | 3.057 | 0.789 | 0.750 | 2.197 | 4.561 | 1.073 | 57.741 | 32.852 |
| shape2[2] | 3.085 | 3.108 | 0.303 | 0.321 | 2.594 | 3.558 | 1.008 | 364.156 | 234.587 |
| shape2[3] | 3.344 | 3.285 | 0.566 | 0.572 | 2.584 | 4.329 | 1.016 | 239.383 | 168.038 |
| stretch[1] | 2.548 | 2.553 | 0.206 | 0.216 | 2.244 | 2.866 | 1.021 | 155.902 | 245.103 |
| stretch[2] | 2.138 | 2.141 | 0.070 | 0.066 | 2.025 | 2.254 | 1.019 | 364.350 | 275.915 |
| stretch[3] | 1.823 | 1.818 | 0.095 | 0.091 | 1.665 | 1.975 | 1.003 | 305.407 | 241.361 |
| min_max[1,1] | -5.368 | -5.330 | 0.390 | 0.364 | -5.990 | -4.813 | 1.022 | 347.443 | 214.894 |
| min_max[2,1] | -5.523 | -5.501 | 0.180 | 0.180 | -5.852 | -5.273 | 1.004 | 533.458 | 356.949 |
| min_max[3,1] | -4.679 | -4.671 | 0.083 | 0.082 | -4.825 | -4.553 | 1.005 | 375.772 | 282.052 |
| min_max[1,2] | 5.013 | 4.977 | 0.675 | 0.671 | 4.011 | 6.124 | 1.046 | 96.257 | 125.304 |
| min_max[2,2] | 2.546 | 2.565 | 0.099 | 0.098 | 2.381 | 2.684 | 1.010 | 270.672 | 249.998 |
| min_max[3,2] | 4.846 | 4.820 | 0.519 | 0.518 | 4.032 | 5.706 | 1.014 | 234.324 | 242.541 |
| nu[1] | 18.249 | 18.169 | 1.854 | 1.702 | 15.088 | 21.416 | 1.016 | 528.802 | 339.784 |
| nu[2] | 38.531 | 38.591 | 3.704 | 3.788 | 32.601 | 44.525 | 1.005 | 370.027 | 243.143 |
| nu[3] | 32.698 | 32.671 | 3.111 | 2.913 | 27.295 | 37.828 | 1.011 | 659.827 | 262.034 |
| mu_shape1 | 2.822 | 2.845 | 0.520 | 0.504 | 1.978 | 3.642 | 1.010 | 400.262 | 261.553 |
| mu_shape2 | 2.906 | 2.906 | 0.571 | 0.615 | 2.048 | 3.797 | 1.009 | 267.482 | 288.215 |
| mu_stretch | 1.891 | 1.881 | 0.561 | 0.655 | 0.972 | 2.818 | 1.000 | 331.372 | 248.440 |
| mu_min | -5.121 | -5.114 | 0.550 | 0.553 | -6.036 | -4.239 | 1.022 | 266.354 | 280.657 |
| mu_max | 4.316 | 4.333 | 0.585 | 0.584 | 3.342 | 5.272 | 1.010 | 216.434 | 185.929 |
| mu_nu | 0.291 | 0.288 | 0.057 | 0.058 | 0.202 | 0.388 | 1.039 | 654.503 | 347.760 |
| x_min[1] | -5.368 | -5.330 | 0.390 | 0.364 | -5.990 | -4.813 | 1.022 | 347.443 | 214.894 |
| x_min[2] | -5.523 | -5.501 | 0.180 | 0.180 | -5.852 | -5.273 | 1.004 | 533.458 | 356.949 |
| x_min[3] | -4.679 | -4.671 | 0.083 | 0.082 | -4.825 | -4.553 | 1.005 | 375.772 | 282.052 |
| x_max[1] | 5.013 | 4.977 | 0.675 | 0.671 | 4.011 | 6.124 | 1.046 | 96.257 | 125.304 |
| x_max[2] | 2.546 | 2.565 | 0.099 | 0.098 | 2.381 | 2.684 | 1.010 | 270.672 | 249.998 |
| x_max[3] | 4.846 | 4.820 | 0.519 | 0.518 | 4.032 | 5.706 | 1.014 | 234.324 | 242.541 |
We can access the number of posterior draws directly from the model object.
ndraws <- posterior::ndraws(perf_out$draws())
There are a lot of good packages out there for visualizing posteriors. I haven’t learned any of them yet, but here’s a quick approach I’m fond of:
#choose parameter
param <- "x_max"
max_df <- perf_out$draws(variables = param) %>%
posterior::as_draws_df() %>%
select(-.chain, -.iteration, -.draw) %>%
gather(species, value)
## Warning: Dropping 'draws_df' class as required metadata was removed.
max_df %>%
ggplot(aes(value, species, fill = species)) +
geom_density_ridges(colour = "white") +
xlab(param) +
theme(legend.position = "none")
## Picking joint bandwidth of 0.116
Quickly, let’s eyeball how these marginal posteriors for the x_max parameter (where performance falls to zero above the optimum) for each species.
max_df %>%
group_by(species) %>%
summarise(
cred_025 = quantile(value, 0.025),
cred_975 = quantile(value, 0.975)
) %>%
mutate(
truth = perf_df$true_params$x_max
) %>%
mutate(
within_cred = truth > cred_025 & truth < cred_975
)
## # A tibble: 3 × 5
## species cred_025 cred_975 truth within_cred
## <chr> <dbl> <dbl> <dbl> <lgl>
## 1 x_max[1] 3.91 6.52 4.41 TRUE
## 2 x_max[2] 2.35 2.73 2.52 TRUE
## 3 x_max[3] 3.96 5.85 4.38 TRUE
Cool, looks like the 95% credible interval captures the true value in each case. It would be easy to extend this approach using a function and quickly verify each parameter. Notice the increased uncertainty in posteriors further from the treatments on the x-axis … Love that!
First, let’s generate a tidy data frame with all the parameters, plus some valuable derived parameters, like the optimum, area, and breadth for each species. We will use this data frame for other tasks below as well.
head(
tidy_perf <-
perform_df(
perf_out,
species_order = c(1, 2, 3)
)
)
## Warning: Dropping 'draws_df' class as required metadata was removed.
## # A tibble: 6 × 12
## # Groups: species [1]
## species x_min draw x_max shape1 shape2 stretch nu maxima breadth area
## <chr> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 -5.30 1 5.76 3.61 4.34 2.35 17.5 1.55 11.1 26.0
## 2 1 -5.99 2 4.97 4.12 3.59 2.37 18.7 1.65 11.0 26.0
## 3 1 -6.50 3 4.41 4.49 3.03 2.38 19.3 1.70 10.9 26.0
## 4 1 -5.12 4 4.32 3.39 2.48 2.62 17.9 1.64 9.43 24.7
## 5 1 -5.23 5 4.07 3.63 2.46 2.77 19.3 1.64 9.30 25.8
## 6 1 -5.24 6 5.20 3.41 2.93 2.65 17.3 1.86 10.4 27.7
## # ℹ 1 more variable: special <dbl>
The next step is to generate prediction intervals using the
predict_interval() function. The function generates
posterior quantiles for each set of posterior draws specified
(x_draws), and averages over the quantiles. The argument
p can takes a vector of credible levels, which can be
modified to consider other and/or additional levels.
#set up sequence over the observed data range
x_seq <- seq(min(perf_df$sim_data$x),
max(perf_df$sim_data$x),
length.out = 100)
#sub-sample draws randomly
poly_draws <- sample(1:ndraws, 100)
creds <-
predict_interval(
x = x_seq,
par_df = tidy_perf,
x_draws = poly_draws,
p = c(0.95, 0.5))
creds %>% group_by(species, level) %>% slice_head(n = 1)
## # A tibble: 6 × 6
## # Groups: species, level [6]
## species x lower upper mu level
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 -4.72 0.000850 0.0614 0.0231 50
## 2 1 -4.72 0 0.135 0.0231 95
## 3 2 -4.72 0.0780 0.121 0.0993 50
## 4 2 -4.72 0.0374 0.161 0.0993 95
## 5 3 -4.72 0.0000340 0.0236 0.00246 50
## 6 3 -4.72 0 0.0638 0.00246 95
Notice the output of predict_interval() also produces a
“mu” column, which contains the mean preduction of the curve for each
input species.
Here is an example of how the data could be plotted. Other approach
using plot() are certainly possible:
ggplot(data = filter(creds, level == 95)) +
geom_ribbon(
aes(x = x,
ymin = upper,
ymax = lower,
fill = species),
alpha = 0.3,
inherit.aes = F) +
geom_ribbon(
data = filter(creds, level == 50),
aes(x = x,
ymin = upper,
ymax = lower,
fill = species),
inherit.aes = F,
alpha = 0.7) +
geom_line(aes(x, mu, group = species), inherit.aes = F, colour = "white", lwd = 1) +
geom_point(
data = perf_df$sim_data,
aes(x,
trait,
alpha = 0.5),
inherit.aes = F,
colour = "grey70") +
theme(legend.position = "none")
The whole concept of posterior prediction and generating data from the model is one of my favorite things about Bayesian statistics. It allows for a lot of creativity in how we check our assumptions. We can generate data from the model using this approach:
Now we can generate pseudo observed data from the posterior draws.
tidy_perf %>%
filter(draw == 1) %>%
posterior_predict(par_df = .) %>%
head()
## # A tibble: 6 × 5
## x trait mu species draw
## <dbl> <dbl> <dbl> <chr> <int>
## 1 -5.41 0 0 1 1
## 2 -5.29 0 0.0000000272 1 1
## 3 -5.18 0 0.000267 1 1
## 4 -5.06 0.0645 0.00157 1 1
## 5 -4.95 0 0.00445 1 1
## 6 -4.83 0 0.00937 1 1
By default, the x-axis will be constructed as a sequence of 100
values that start from the minimum posterior draw value of x_min and end
at the maximum of x_max. This default can be changed using the x
argument. Filtering for posterior draws is not required before using the
function – posterior predictive data will be produced for every
posterior draw passed to par_df.
If you have questions, comments, or concerns, please get in touch. I would be excited to discuss!
<button data-toggle=“collapse” data-target=“#sessioninfo” class=“btn btn-primary btn-md btn-info> R session info
#writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars")))
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.6.0 (2026-04-24)
## os macOS Tahoe 26.3
## system aarch64, darwin23
## ui X11
## language (EN)
## collate C.UTF-8
## ctype C.UTF-8
## tz America/Denver
## date 2026-06-14
## pandoc 3.8.3 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
## quarto NA
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-8 2024-09-12 [1] CRAN (R 4.6.0)
## backports 1.5.1 2026-04-03 [1] CRAN (R 4.6.0)
## bslib 0.11.0 2026-05-16 [1] CRAN (R 4.6.0)
## cachem 1.1.0 2024-05-16 [1] CRAN (R 4.6.0)
## checkmate 2.3.4 2026-02-03 [1] CRAN (R 4.6.0)
## cli 3.6.6 2026-04-09 [1] CRAN (R 4.6.0)
## cmdstanr 0.9.0 2025-03-30 [1] https://stan-dev.r-universe.dev (R 4.6.0)
## data.table 1.18.4 2026-05-06 [1] CRAN (R 4.6.0)
## devtools 2.5.2 2026-04-30 [1] CRAN (R 4.6.0)
## digest 0.6.39 2025-11-19 [1] CRAN (R 4.6.0)
## distributional 0.7.1 2026-06-11 [1] CRAN (R 4.6.0)
## dplyr * 1.2.1 2026-04-03 [1] CRAN (R 4.6.0)
## ellipsis 0.3.3 2026-04-04 [1] CRAN (R 4.6.0)
## evaluate 1.0.5 2025-08-27 [1] CRAN (R 4.6.0)
## farver 2.1.2 2024-05-13 [1] CRAN (R 4.6.0)
## fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.6.0)
## forcats * 1.0.1 2025-09-25 [1] CRAN (R 4.6.0)
## fs 2.1.0 2026-04-18 [1] CRAN (R 4.6.0)
## generics 0.1.4 2025-05-09 [1] CRAN (R 4.6.0)
## ggplot2 * 4.0.3 2026-04-22 [1] CRAN (R 4.6.0)
## ggridges * 0.5.7 2025-08-27 [1] CRAN (R 4.6.0)
## glue 1.8.1 2026-04-17 [1] CRAN (R 4.6.0)
## gtable 0.3.6 2024-10-25 [1] CRAN (R 4.6.0)
## hms 1.1.4 2025-10-17 [1] CRAN (R 4.6.0)
## htmltools 0.5.9 2025-12-04 [1] CRAN (R 4.6.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.6.0)
## jsonlite 2.0.0 2025-03-27 [1] CRAN (R 4.6.0)
## knitr 1.51 2025-12-20 [1] CRAN (R 4.6.0)
## labeling 0.4.3 2023-08-29 [1] CRAN (R 4.6.0)
## lifecycle 1.0.5 2026-01-08 [1] CRAN (R 4.6.0)
## lubridate * 1.9.5 2026-02-04 [1] CRAN (R 4.6.0)
## magrittr * 2.0.5 2026-04-04 [1] CRAN (R 4.6.0)
## matrixStats 1.5.0 2025-01-07 [1] CRAN (R 4.6.0)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.6.0)
## otel 0.2.0 2025-08-29 [1] CRAN (R 4.6.0)
## performr * 0.2 2026-06-14 [1] Github (silastittes/performr@4940fb2)
## pillar 1.11.1 2025-09-17 [1] CRAN (R 4.6.0)
## pkgbuild 1.4.8 2025-05-26 [1] CRAN (R 4.6.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.6.0)
## pkgload 1.5.2 2026-04-22 [1] CRAN (R 4.6.0)
## posterior 1.7.1 2026-06-10 [1] https://stan-dev.r-universe.dev (R 4.6.0)
## processx 3.9.0 2026-04-22 [1] CRAN (R 4.6.0)
## ps 1.9.3 2026-04-20 [1] CRAN (R 4.6.0)
## purrr * 1.2.2 2026-04-10 [1] CRAN (R 4.6.0)
## R6 2.6.1 2025-02-15 [1] CRAN (R 4.6.0)
## RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.6.0)
## readr * 2.2.0 2026-02-19 [1] CRAN (R 4.6.0)
## rlang 1.2.0 2026-04-06 [1] CRAN (R 4.6.0)
## rmarkdown 2.31 2026-03-26 [1] CRAN (R 4.6.0)
## S7 0.2.2 2026-04-22 [1] CRAN (R 4.6.0)
## sass 0.4.10 2025-04-11 [1] CRAN (R 4.6.0)
## scales 1.4.0 2025-04-24 [1] CRAN (R 4.6.0)
## sessioninfo 1.2.4 2026-06-04 [1] CRAN (R 4.6.0)
## stringi 1.8.7 2025-03-27 [1] CRAN (R 4.6.0)
## stringr * 1.6.0 2025-11-04 [1] CRAN (R 4.6.0)
## tensorA 0.36.2.1 2023-12-13 [1] CRAN (R 4.6.0)
## tibble * 3.3.1 2026-01-11 [1] CRAN (R 4.6.0)
## tidyr * 1.3.2 2025-12-19 [1] CRAN (R 4.6.0)
## tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.6.0)
## tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.6.0)
## timechange 0.4.0 2026-01-29 [1] CRAN (R 4.6.0)
## truncnorm * 1.0-9 2023-03-20 [1] CRAN (R 4.6.0)
## tzdb 0.5.0 2025-03-15 [1] CRAN (R 4.6.0)
## usethis 3.2.1 2025-09-06 [1] CRAN (R 4.6.0)
## utf8 1.2.6 2025-06-08 [1] CRAN (R 4.6.0)
## vctrs 0.7.3 2026-04-11 [1] CRAN (R 4.6.0)
## withr 3.0.2 2024-10-28 [1] CRAN (R 4.6.0)
## xfun 0.58 2026-06-01 [1] CRAN (R 4.6.0)
## yaml 2.3.12 2025-12-10 [1] CRAN (R 4.6.0)
##
## [1] /Library/Frameworks/R.framework/Versions/4.6/Resources/library
## * ── Packages attached to the search path.
##
## ──────────────────────────────────────────────────────────────────────────────