Merged
Changes from 10 commits
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -24,3 +24,4 @@ vignettes/loo2-non-factorized_cache/*
^release-prep\.R$
^_pkgdown\.yml$
^pkgdown$
^touchstone$
22 changes: 22 additions & 0 deletions .github/workflows/touchstone-comment.yaml
@@ -0,0 +1,22 @@
name: Continuous Benchmarks (Comment)

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref }}
  cancel-in-progress: true

on:
  workflow_run:
    workflows: ["Continuous Benchmarks (Receive)"]
    types:
      - completed

jobs:
  upload:
    runs-on: ubuntu-latest
    if: >
      ${{ github.event.workflow_run.event == 'pull_request' }}
    steps:
      - uses: lorenzwalthert/touchstone/actions/comment@main
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Member

Do we need to set up a token for this?

Member Author

As in, can we get away with not using it? Not sure; as above, this is the default. Should I remove it and see if it works?

Member

jgabry, Apr 14, 2026

I would think we could just explicitly set a few permissions, e.g.

  permissions:
    contents: read
    pull-requests: write
    issues: write

but it's not a big deal. I think it's fine to use the token. Also, maybe I'm misunderstanding how this works.

Member Author

Ahh right, no, I think you're right. The default GHA is a touch stale, and I think we should be doing things like this to make it slightly more secure.

Member Author

Modified this. I hope the actual commenting works, but we can only check that after merging :)
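
For readers following the thread, here is a minimal sketch of how explicit permissions could be declared at the top of the comment workflow, using the three scopes suggested above. This is an illustration only, not necessarily what was committed. The commented-out `actions: read` line is an added assumption (the comment job has to fetch the benchmark artifact produced by the triggering run); nothing in this thread confirms that it is needed.

```yaml
name: Continuous Benchmarks (Comment)

# Scopes as suggested in the review above; anything not listed is denied.
permissions:
  contents: read
  pull-requests: write
  issues: write
  # actions: read  # assumption: may be needed to download the artifact from the triggering run
```

Declaring `permissions` at the workflow level applies it to every job in the file, so the single `upload` job would inherit these scopes without further changes.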


43 changes: 43 additions & 0 deletions .github/workflows/touchstone-receive.yaml
@@ -0,0 +1,43 @@
name: Continuous Benchmarks (Receive)

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref }}
  cancel-in-progress: true

on:
  pull_request:

jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      config: ${{ steps.read_touchstone_config.outputs.config }}
    steps:
      - name: Checkout repo
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - id: read_touchstone_config
        run: |
          echo "config=$(jq -c . ./touchstone/config.json)" >> $GITHUB_OUTPUT

  build:
    needs: prepare
    runs-on: ${{ matrix.config.os }}
    strategy:
      fail-fast: false
      matrix:
        config:
          - ${{ fromJson(needs.prepare.outputs.config) }}
    steps:
      - name: Checkout repo
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
      - uses: lorenzwalthert/touchstone/actions/receive@main
        with:
          r-version: ${{ matrix.config.r }}
20 changes: 20 additions & 0 deletions data-raw/wine_loglik.R
@@ -0,0 +1,20 @@
library(brms)
library(loo)
options(brms.backend = "cmdstanr")
options(mc.cores = 4)

fitos <- read.delim("data-raw/winequality-red.csv", sep = ";") |>
  unique() |>
  scale() |>
  as.data.frame() |>
  brm(
    ordered(quality) ~ .,
    family = cumulative("logit"),
    prior = prior(R2D2(mean_R2 = 1 / 3, prec_R2 = 3)),
    data = _,
    seed = 1,
    silent = 2,
    refresh = 0
  )

saveRDS(log_lik(fitos), "touchstone/wine.rds")
1,600 changes: 1,600 additions & 0 deletions data-raw/winequality-red.csv

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions touchstone/.gitignore
@@ -0,0 +1,7 @@
*
!script.R
!config.json
!.gitignore
!header.R
!footer.R
!wine.rds
8 changes: 8 additions & 0 deletions touchstone/config.json
@@ -0,0 +1,8 @@
{
  "os": "ubuntu-22.04",
  "r": "4.4.3",
  "rspm": "https://packagemanager.posit.co/cran/__linux__/jammy/latest",
  "benchmarking_repo": "",
  "benchmarking_ref": "",
  "benchmarking_path": ""
}
10 changes: 10 additions & 0 deletions touchstone/footer.R
@@ -0,0 +1,10 @@
# You can modify the PR comment footer here. You can use GitHub Markdown, e.g.
# emojis like :tada:.
# This file will be parsed and evaluated within the context of
# `benchmark_analyze` and should return the comment text as the last value.
# See `?touchstone::pr_comment`
link <- "https://lorenzwalthert.github.io/touchstone/articles/inference.html"
glue::glue(
  "\nFurther explanation regarding interpretation and",
  " methodology can be found in the [documentation]({link})."
)
13 changes: 13 additions & 0 deletions touchstone/header.R
@@ -0,0 +1,13 @@
# You can modify the PR comment header here. You can use GitHub Markdown, e.g.
# emojis like :tada:.
# This file will be parsed and evaluated within the context of
# `benchmark_analyze` and should return the comment text as the last value.
# Available variables for glue substitution:
# * ci: confidence interval
# * branches: BASE and HEAD branches benchmarked against each other.
# See `?touchstone::pr_comment`
glue::glue(
  "This is how benchmark results would change (along with a",
  " {100 * ci}% confidence interval in relative change) if ",
  "{system2('git', c('rev-parse', 'HEAD'), stdout = TRUE)} is merged into {branches[1]}:\n"
)
57 changes: 57 additions & 0 deletions touchstone/script.R
@@ -0,0 +1,57 @@
# see `help(run_script, package = 'touchstone')` on how to run this
# interactively

# installs branches to benchmark
touchstone::branch_install()

touchstone::pin_assets("touchstone/wine.rds")

# These synthetic workloads are large enough to expose real slowdowns in the
# core `loo()` paths, but still short enough to keep PR feedback reasonably fast.
touchstone::benchmark_run(
  expr_before_benchmark = {
    suppressPackageStartupMessages(library(loo))
    # benchmark_run() evaluates in a callr subprocess, so load pinned assets here.
    wine_log_lik_matrix <- readRDS(touchstone::path_pinned_asset(
      "touchstone/wine.rds"
    ))
    matrix_r_eff <- rep(1, ncol(wine_log_lik_matrix))
  },
  loo_matrix = {
    suppressWarnings(
      loo(
        wine_log_lik_matrix,
        r_eff = matrix_r_eff,
        cores = 1
      )
    )
  },
  n = 10
)

touchstone::benchmark_run(
  expr_before_benchmark = {
    suppressPackageStartupMessages(library(loo))
    wine_log_lik_matrix <- readRDS(touchstone::path_pinned_asset(
      "touchstone/wine.rds"
    ))
    function_r_eff <- rep(1, ncol(wine_log_lik_matrix))
    wine_data <- data.frame(obs = seq_len(ncol(wine_log_lik_matrix)))
    wine_llfun <- function(data_i, draws) draws[, data_i$obs, drop = FALSE]
  },
  loo_function = {
    suppressWarnings(
      loo(
        wine_llfun,
        data = wine_data,
        draws = wine_log_lik_matrix,
        r_eff = function_r_eff,
        cores = 1
      )
    )
  },
  n = 10
)

# create artifacts used downstream in the GitHub Action
touchstone::benchmark_analyze()
Binary file added touchstone/wine.rds
Binary file not shown.