Commit ac41e62

allow dummy axes for relevant layers (#442)
1 parent 8f608c5 commit ac41e62

27 files changed

Lines changed: 716 additions & 137 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -17,6 +17,12 @@
 assumption in the VegaLite writer. We now correctly use the orientation to
 dodge in the correct dimension (#439).

+### Changed
+
+- `boxplot`, `violin`, and `range` now support omitting the categorical
+aesthetic, matching `bar`. `point` now treats both position aesthetics as
+optional.
+
 ## 0.3.2 - 2026-05-05

 ### Fixed

doc/syntax/layer/type/boxplot.qmd

Lines changed: 12 additions & 1 deletion
@@ -9,10 +9,12 @@ Boxplots display a summary of a continuous distribution. In the style of Tukey,
 The following aesthetics are recognised by the boxplot layer.

 ### Required
-* Primary axis (e.g. `x`): The categorical variable to group by
 * Secondary axis (e.g. `y`): The continuous variable to summarize

 ### Optional
+* Primary axis (e.g. `x`): The categorical variable to group by. If omitted a
+single boxplot is drawn for the whole distribution and the (one-tick)
+categorical axis is hidden.
 * `stroke`: The colour of the box contours, whiskers, median line and outliers.
 * `fill`: The colour of the box interior.
 * `colour`: Shorthand for setting `stroke` and `fill` simultaneously. Note that the median line will have bad visibility if `stroke` and `fill` are the same.
@@ -96,6 +98,15 @@ DRAW boxplot
 MAPPING species AS y, bill_len AS x
 ```

+Omit the categorical axis to summarise the whole distribution as a single
+boxplot:
+
+```{ggsql}
+VISUALISE FROM ggsql:penguins
+DRAW boxplot
+MAPPING bill_len AS y
+```
+
 Pair a half-violin with a half-boxplot on the same category by setting opposite `side` values:

 ```{ggsql}

doc/syntax/layer/type/point.qmd

Lines changed: 7 additions & 2 deletions
@@ -10,10 +10,15 @@ The point layer is used to create scatterplots. The scatterplot is most useful f
 The following aesthetics are recognised by the point layer.

 ### Required
-* Primary axis (e.g. `x`): Position along the primary axis.
-* Secondary axis (e.g. `y`): Position along the secondary axis.
+The point layer has no required aesthetics.

 ### Optional
+* Primary axis (e.g. `x`): Position along the primary axis. If omitted, all
+points are drawn at a single discrete primary-axis position (a strip plot)
+and the categorical axis is hidden.
+* Secondary axis (e.g. `y`): Position along the secondary axis. Same dummy-axis
+treatment as the primary. If both axes are omitted, all rows pile up at a
+single point — only useful in combination with `aggregate`.
 * `size`: The size of each point
 * `colour`: The default colour of each point
 * `stroke`: The colour of the stroke around each point (if any). Overrides `colour`

doc/syntax/layer/type/range.qmd

Lines changed: 3 additions & 1 deletion
@@ -10,11 +10,13 @@ The range layer displays an interval between two values along the secondary axis
 The following aesthetics are recognised by the range layer.

 ### Required
-* Primary axis (e.g. `x`): Position along the primary axis.
 * Secondary axis minimum (e.g. `ymin`): Lower position along the secondary axis.
 * Secondary axis maximum (e.g. `ymax`): Upper position along the secondary axis.

 ### Optional
+* Primary axis (e.g. `x`): Position along the primary axis. If omitted a
+single interval is drawn over the whole dataset and the (one-tick)
+categorical axis is hidden.
 * `stroke`/`colour`: The colour of the lines in the range.
 * `opacity`: The opacity of the colour.
 * `linewidth`: The width of the lines in the range.

doc/syntax/layer/type/violin.qmd

Lines changed: 3 additions & 1 deletion
@@ -11,10 +11,12 @@ The violins are mirrored kernel density estimates, similar to the [density](dens
 The following aesthetics are recognised by the violin layer.

 ### Required
-* Primary axis (e.g. `x`): The categorical variable for grouping.
 * Secondary axis (e.g. `y`): The continuous variable to compute density for.

 ### Optional
+* Primary axis (e.g. `x`): The categorical variable for grouping. If omitted
+a single violin is drawn for the whole distribution and the (one-tick)
+categorical axis is hidden.
 * `stroke`: The colour of the contour lines.
 * `fill`: The colour of the inner area.
 * `colour`: Shorthand for setting `stroke` and `fill` simultaneously.

src/execute/layer.rs

Lines changed: 3 additions & 2 deletions
@@ -567,7 +567,8 @@ where
     // Apply literal default remappings from geom defaults (e.g., y2 => 0.0 for bar baseline).
     // These apply regardless of stat transform, but only if user hasn't overridden them.
     // Defaults are always in aligned orientation.
-    for (aesthetic, default_value) in layer.geom.default_remappings().defaults {
+    let implicit_remappings = layer.geom.implicit_default_remappings();
+    for (aesthetic, default_value) in &implicit_remappings {
         // Only process literal values here (Column values are handled in Transformed branch)
         if !matches!(default_value, DefaultAestheticValue::Column(_)) {
             // Only add if user hasn't already specified this aesthetic in remappings or mappings
@@ -591,7 +592,7 @@ where
     // Build stat column -> aesthetic mappings from geom defaults for renaming
     let mut final_remappings: HashMap<String, String> = HashMap::new();

-    for (aesthetic, default_value) in layer.geom.default_remappings().defaults {
+    for (aesthetic, default_value) in &implicit_remappings {
         if let DefaultAestheticValue::Column(stat_col) = default_value {
             // Stat column mapping: stat_col -> aesthetic (for rename)
             final_remappings.insert(stat_col.to_string(), aesthetic.to_string());
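
The hunks above switch from `default_remappings()` to `implicit_default_remappings()`, whose body is not part of this excerpt. One plausible reading, consistent with the boxplot test comment that the `Geom` wrapper "auto-derives" the remapping for `Dummy` aesthetics, is sketched below. The enum is a trimmed stand-in and the function name and signature are hypothetical, not the crate's API.

```rust
// Trimmed stand-in for the crate's enum; only the variants needed here.
#[derive(Debug, Clone, PartialEq)]
enum DefaultAestheticValue {
    Dummy,
    Null,
    Number(f64),
    Column(&'static str),
}

/// Hypothetical sketch: start from the explicit default remappings and add a
/// `Column(name)` entry for every aesthetic declared as `Dummy`, unless the
/// geom already lists a remapping for that aesthetic.
fn implicit_default_remappings_sketch(
    aesthetics: &[(&'static str, DefaultAestheticValue)],
    explicit: &[(&'static str, DefaultAestheticValue)],
) -> Vec<(&'static str, DefaultAestheticValue)> {
    let mut out = explicit.to_vec();
    for (name, value) in aesthetics {
        let already_listed = out.iter().any(|(n, _)| n == name);
        if matches!(value, DefaultAestheticValue::Dummy) && !already_listed {
            // The synthesised stat column is assumed to share the aesthetic's name.
            out.push((*name, DefaultAestheticValue::Column(*name)));
        }
    }
    out
}
```

Under this reading, `bar` can drop its explicit `("pos1", Column("pos1"))` remapping, as the `bar.rs` hunk does, because the entry is derived from the `Dummy` declaration instead.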

src/execute/mod.rs

Lines changed: 4 additions & 3 deletions
@@ -127,7 +127,7 @@ fn validate(
         // Validate remapping source columns are valid stat columns for this geom.
         // Geoms that opt into the Aggregate stat (`supports_aggregate`) also accept
         // `aggregate`, `count`, and any position aesthetic name as a stat source.
-        let valid_stat_columns = layer.geom.valid_stat_columns();
+        let valid_stat_columns = layer.geom.implicit_valid_stat_columns();
         let supports_aggregate = layer.geom.supports_aggregate();
         for stat_value in layer.remappings.aesthetics.values() {
             if let Some(stat_col) = stat_value.column_name() {
@@ -3048,11 +3048,12 @@ mod tests {
         )
         .unwrap();

-        // Query missing required aesthetic 'y' - should show 'y' not 'pos2'
+        // Query missing required aesthetic 'y' - should show 'y' not 'pos2'.
+        // Use line, which still requires both x and y (point's x is optional).
         let query = r#"
             SELECT * FROM test_data
             VISUALISE
-            DRAW point MAPPING a AS x
+            DRAW line MAPPING a AS x
         "#;

         let result = prepare_data_with_reader(query, &reader);

src/plot/layer/geom/area.rs

Lines changed: 0 additions & 4 deletions
@@ -60,10 +60,6 @@ impl GeomTrait for Area {
         Some(&["pos1"])
     }

-    fn needs_stat_transform(&self, _aesthetics: &Mappings) -> bool {
-        true
-    }
-
     fn apply_stat_transform(
         &self,
         query: &str,

src/plot/layer/geom/bar.rs

Lines changed: 19 additions & 13 deletions
@@ -4,7 +4,7 @@ use std::collections::HashMap;
 use std::collections::HashSet;

 use super::stat_aggregate;
-use super::types::{get_column_name, POSITION_VALUES};
+use super::types::{get_column_name, wrap_stat_with_dummy_pos1, POSITION_VALUES};
 use super::{
     has_aggregate_param, DefaultAesthetics, DefaultParamValue, GeomTrait, GeomType,
     ParamConstraint, ParamDefinition, StatResult,
@@ -35,8 +35,8 @@ impl GeomTrait for Bar {
             // if we ever want to make 'width' an aesthetic, we'd probably need to
             // translate it to 'size'.
             defaults: &[
-                ("pos1", DefaultAestheticValue::Null), // Optional - stat may provide
-                ("pos2", DefaultAestheticValue::Null), // Optional - stat may compute
+                ("pos1", DefaultAestheticValue::Dummy), // Optional - stat synthesises a dummy if omitted
+                ("pos2", DefaultAestheticValue::Null), // Optional - stat computes count when omitted
                 ("pos2end", DefaultAestheticValue::Delayed),
                 ("weight", DefaultAestheticValue::Null),
                 ("fill", DefaultAestheticValue::String("black")),
@@ -50,14 +50,13 @@ impl GeomTrait for Bar {
         DefaultAesthetics {
             defaults: &[
                 ("pos2", DefaultAestheticValue::Column("count")),
-                ("pos1", DefaultAestheticValue::Column("pos1")),
                 ("pos2end", DefaultAestheticValue::Number(0.0)),
             ],
         }
     }

     fn valid_stat_columns(&self) -> &'static [&'static str] {
-        &["count", "pos1", "proportion"]
+        &["count", "proportion"]
     }

     fn default_params(&self) -> &'static [ParamDefinition] {
@@ -85,10 +84,6 @@ impl GeomTrait for Bar {
         Some(&[])
     }

-    fn needs_stat_transform(&self, _aesthetics: &Mappings) -> bool {
-        true // Bar stat decides COUNT vs identity based on y mapping
-    }
-
     fn apply_stat_transform(
         &self,
         query: &str,
@@ -100,8 +95,8 @@ impl GeomTrait for Bar {
         dialect: &dyn SqlDialect,
         aesthetic_ctx: &crate::plot::aesthetic::AestheticContext,
     ) -> Result<StatResult> {
-        if has_aggregate_param(parameters) {
-            return stat_aggregate::apply(
+        let inner = if has_aggregate_param(parameters) {
+            stat_aggregate::apply(
                 query,
                 schema,
                 aesthetics,
@@ -110,9 +105,20 @@ impl GeomTrait for Bar {
                 dialect,
                 aesthetic_ctx,
                 self.aggregate_domain_aesthetics().unwrap_or(&[]),
-            );
+            )?
+        } else {
+            stat_bar_count(query, schema, aesthetics, group_by)?
+        };
+        // When the user omits the categorical axis, post-wrap with the dummy
+        // pos1 column so the writer suppresses the one-tick axis. Composes
+        // with both the aggregate and identity-path outputs (the `count`
+        // branch of stat_bar_count already injects its own dummy column —
+        // wrap_stat_with_dummy_pos1's idempotency keeps that path correct).
+        if get_column_name(aesthetics, "pos1").is_none() {
+            Ok(wrap_stat_with_dummy_pos1(query, inner))
+        } else {
+            Ok(inner)
         }
-        stat_bar_count(query, schema, aesthetics, group_by)
     }
 }
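
The comment in `apply_stat_transform` relies on `wrap_stat_with_dummy_pos1` being idempotent, but its body is not shown in this excerpt. The following is a minimal sketch of how such a post-wrap could behave, not the crate's implementation. The `Identity` variant, the helper `add_dummy_pos1`, the `_sketch` suffix, and the exact SQL text are all assumptions; only the four `Transformed` fields and the `__ggsql_stat_pos1` / `__ggsql_stat_dummy` substrings come from the diff.

```rust
// Stand-in for the crate's StatResult. `Transformed` mirrors the fields in
// the diff; `Identity` is an assumed variant for the untransformed path.
#[derive(Debug, Clone, PartialEq)]
enum StatResult {
    Identity,
    Transformed {
        query: String,
        stat_columns: Vec<String>,
        dummy_columns: Vec<String>,
        consumed_aesthetics: Vec<String>,
    },
}

/// Hypothetical helper: add a constant categorical column to every row.
fn add_dummy_pos1(query: &str) -> String {
    format!(
        "SELECT *, 'dummy' AS \"__ggsql_stat_pos1\" FROM ({}) AS \"__ggsql_stat_dummy\"",
        query
    )
}

/// Sketch of an idempotent post-wrap: a result that already reports a dummy
/// `pos1` column is returned unchanged, so wrapping twice equals wrapping once.
fn wrap_stat_with_dummy_pos1_sketch(input_query: &str, inner: StatResult) -> StatResult {
    match inner {
        StatResult::Transformed {
            query,
            mut stat_columns,
            mut dummy_columns,
            consumed_aesthetics,
        } => {
            if !dummy_columns.iter().any(|c| c == "pos1") {
                if !stat_columns.iter().any(|c| c == "pos1") {
                    stat_columns.push("pos1".to_string());
                }
                dummy_columns.push("pos1".to_string());
                return StatResult::Transformed {
                    query: add_dummy_pos1(&query),
                    stat_columns,
                    dummy_columns,
                    consumed_aesthetics,
                };
            }
            // Already wrapped (e.g. by a stat that injects its own dummy).
            StatResult::Transformed { query, stat_columns, dummy_columns, consumed_aesthetics }
        }
        // Untransformed input: wrap the original layer query instead.
        StatResult::Identity => StatResult::Transformed {
            query: add_dummy_pos1(input_query),
            stat_columns: vec!["pos1".to_string()],
            dummy_columns: vec!["pos1".to_string()],
            consumed_aesthetics: vec![],
        },
    }
}
```

Keying the check on `dummy_columns` is what lets the wrapper compose safely with a stat that has already injected its own dummy column, which is the situation the comment describes for the `count` branch.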

src/plot/layer/geom/boxplot.rs

Lines changed: 79 additions & 26 deletions
@@ -2,7 +2,7 @@

 use std::collections::HashMap;

-use super::types::{POSITION_VALUES, SIDE_VALUES};
+use super::types::{wrap_with_dummy_axis, POSITION_VALUES, SIDE_VALUES};
 use super::{DefaultAesthetics, GeomTrait, GeomType};
 use crate::{
     naming,
@@ -26,7 +26,10 @@ impl GeomTrait for Boxplot {
     fn aesthetics(&self) -> DefaultAesthetics {
         DefaultAesthetics {
             defaults: &[
-                ("pos1", DefaultAestheticValue::Required),
+                // pos1 is dummy-able. `stat_boxplot` handles the synthesis
+                // itself by pre-wrapping the input so the existing GROUP BY
+                // collapses to a single boxplot of the whole pos2 distribution.
+                ("pos1", DefaultAestheticValue::Dummy),
                 ("pos2", DefaultAestheticValue::Required),
                 ("stroke", DefaultAestheticValue::String("black")),
                 ("fill", DefaultAestheticValue::String("white")),
@@ -46,10 +49,6 @@ impl GeomTrait for Boxplot {
         &["pos2"]
     }

-    fn needs_stat_transform(&self, _aesthetics: &Mappings) -> bool {
-        true
-    }
-
     fn default_params(&self) -> &'static [super::ParamDefinition] {
         const PARAMS: &[ParamDefinition] = &[
             ParamDefinition {
@@ -122,9 +121,17 @@ fn stat_boxplot(
     let y = get_column_name(aesthetics, "pos2").ok_or_else(|| {
         GgsqlError::ValidationError("Boxplot requires 'y' aesthetic mapping".to_string())
     })?;
-    let x = get_column_name(aesthetics, "pos1").ok_or_else(|| {
-        GgsqlError::ValidationError("Boxplot requires 'x' aesthetic mapping".to_string())
-    })?;
+
+    // pos1 is optional. When the user omits it, wrap the input query with a
+    // synthetic dummy categorical column and group by that column, so the
+    // existing GROUP BY / summary pipeline collapses to a single boxplot.
+    let (working_query, x, use_dummy) = match get_column_name(aesthetics, "pos1") {
+        Some(col) => (query.to_string(), col, false),
+        None => {
+            let dummy_col = naming::stat_column("pos1");
+            (wrap_with_dummy_axis(query, "pos1"), dummy_col, true)
+        }
+    };

     // Get coef parameter (validated by ParamConstraint::number_min)
     let ParameterValue::Number(coef) = parameters.get("coef").unwrap() else {
@@ -153,17 +160,25 @@ fn stat_boxplot(
     }

     // Query for boxplot summary statistics
-    let summary = boxplot_sql_compute_summary(query, &groups, &value_col, coef, dialect);
-    let stats_query = boxplot_sql_append_outliers(&summary, &groups, &value_col, query, outliers);
+    let summary = boxplot_sql_compute_summary(&working_query, &groups, &value_col, coef, dialect);
+    let stats_query =
+        boxplot_sql_append_outliers(&summary, &groups, &value_col, &working_query, outliers);
+
+    let mut stat_columns = vec![
+        "type".to_string(),
+        "value".to_string(),
+        "value2".to_string(),
+    ];
+    let mut dummy_columns: Vec<String> = vec![];
+    if use_dummy {
+        stat_columns.push("pos1".to_string());
+        dummy_columns.push("pos1".to_string());
+    }

     Ok(StatResult::Transformed {
         query: stats_query,
-        stat_columns: vec![
-            "type".to_string(),
-            "value".to_string(),
-            "value2".to_string(),
-        ],
-        dummy_columns: vec![],
+        stat_columns,
+        dummy_columns,
         consumed_aesthetics: vec!["pos2".to_string()],
     })
 }
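
`stat_boxplot` pre-wraps its input with `wrap_with_dummy_axis` so the existing GROUP BY sees exactly one group. That helper's body is outside this excerpt, so the following is only a sketch of the idea. The function names carry a `_sketch` suffix to mark them as hypothetical, and the SQL shape is an assumption; the new test in this commit only confirms that the generated query contains `__ggsql_stat_pos1` and `__ggsql_stat_dummy`.

```rust
/// Assumed naming scheme for synthesised stat columns, inferred from the
/// `__ggsql_stat_pos1` substring asserted in the commit's test.
fn stat_column_sketch(aesthetic: &str) -> String {
    format!("__ggsql_stat_{}", aesthetic)
}

/// Wrap `query` so every row carries the same constant in a synthetic
/// categorical column. Grouping by that column then yields a single group,
/// i.e. one boxplot over the whole distribution.
fn wrap_with_dummy_axis_sketch(query: &str, aesthetic: &str) -> String {
    format!(
        "SELECT *, 'dummy' AS \"{}\" FROM ({}) AS \"__ggsql_stat_dummy\"",
        stat_column_sketch(aesthetic),
        query
    )
}
```

Wrapping the input, rather than special-casing the summary SQL, keeps the per-group quantile and outlier queries untouched: they simply group by the synthetic column name returned for `x`.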
@@ -522,9 +537,10 @@ mod tests {
         let boxplot = Boxplot;
         let aes = boxplot.aesthetics();

-        assert!(aes.is_required("pos1"));
+        // pos1 is optional (omit → dummy categorical axis); pos2 is required.
+        assert!(!aes.is_required("pos1"));
         assert!(aes.is_required("pos2"));
-        assert_eq!(aes.required().len(), 2);
+        assert_eq!(aes.required(), vec!["pos2"]);
     }

     #[test]
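
The updated assertions depend on `Dummy` not counting as required. A self-contained sketch of that distinction follows, using trimmed stand-ins for `DefaultAestheticValue` and `DefaultAesthetics`. The method bodies and the `boxplot_aesthetics` helper are assumptions written to match the assertions in the test above, not the crate's code.

```rust
// Trimmed stand-in for the crate's enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DefaultAestheticValue {
    Required,
    Dummy,
    String(&'static str),
}

struct DefaultAesthetics {
    defaults: &'static [(&'static str, DefaultAestheticValue)],
}

impl DefaultAesthetics {
    /// Only `Required` entries are required; `Dummy` entries may be omitted
    /// because the stat can synthesise a placeholder column for them.
    fn is_required(&self, name: &str) -> bool {
        self.defaults
            .iter()
            .any(|(n, v)| *n == name && matches!(v, DefaultAestheticValue::Required))
    }

    /// Names of all required aesthetics, in declaration order.
    fn required(&self) -> Vec<&'static str> {
        self.defaults
            .iter()
            .filter(|(_, v)| matches!(v, DefaultAestheticValue::Required))
            .map(|(n, _)| *n)
            .collect()
    }
}

/// Hypothetical helper mirroring the boxplot defaults shown in the diff.
fn boxplot_aesthetics() -> DefaultAesthetics {
    DefaultAesthetics {
        defaults: &[
            ("pos1", DefaultAestheticValue::Dummy),
            ("pos2", DefaultAestheticValue::Required),
            ("stroke", DefaultAestheticValue::String("black")),
            ("fill", DefaultAestheticValue::String("white")),
        ],
    }
}
```

Comparing `required()` against `vec!["pos2"]`, as the new test does, is stricter than the old length check because it also pins down which aesthetic remains required.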
@@ -587,6 +603,8 @@ mod tests {
         let boxplot = Boxplot;
         let remappings = boxplot.default_remappings();

+        // pos1 is `Dummy` in aesthetics() so the `Geom` wrapper auto-derives
+        // its remapping. The trait method returns only the explicit entries.
         assert_eq!(remappings.defaults.len(), 3);
         assert!(remappings
             .defaults
@@ -599,6 +617,48 @@ mod tests {
             .contains(&("type", DefaultAestheticValue::Column("type"))));
     }

+    #[test]
+    fn test_boxplot_dummy_pos1_when_unmapped() {
+        use crate::plot::AestheticValue;
+        let mut aesthetics = Mappings::new();
+        aesthetics.insert(
+            "pos2".to_string(),
+            AestheticValue::standard_column("value".to_string()),
+        );
+        let mut parameters: HashMap<String, ParameterValue> = HashMap::new();
+        parameters.insert("coef".to_string(), ParameterValue::Number(1.5));
+        parameters.insert("outliers".to_string(), ParameterValue::Boolean(true));
+
+        let result = stat_boxplot(
+            "SELECT * FROM data",
+            &aesthetics,
+            &[],
+            &parameters,
+            &AnsiDialect,
+        )
+        .expect("stat_boxplot should succeed without pos1");
+
+        match result {
+            StatResult::Transformed {
+                query,
+                stat_columns,
+                dummy_columns,
+                consumed_aesthetics,
+            } => {
+                // The wrapped input introduces a synthetic pos1 column that the
+                // GROUP BY then collapses to a single boxplot.
+                assert!(query.contains("__ggsql_stat_dummy"));
+                assert!(query.contains("__ggsql_stat_pos1"));
+                assert!(stat_columns.contains(&"pos1".to_string()));
+                assert!(stat_columns.contains(&"type".to_string()));
+                assert!(stat_columns.contains(&"value".to_string()));
+                assert_eq!(dummy_columns, vec!["pos1".to_string()]);
+                assert_eq!(consumed_aesthetics, vec!["pos2".to_string()]);
+            }
+            _ => panic!("expected Transformed"),
+        }
+    }
+
     #[test]
     fn test_boxplot_stat_consumed_aesthetics() {
         let boxplot = Boxplot;
@@ -608,13 +668,6 @@ mod tests {
         assert_eq!(consumed[0], "pos2");
     }

-    #[test]
-    fn test_boxplot_needs_stat_transform() {
-        let boxplot = Boxplot;
-        let aesthetics = Mappings::new();
-        assert!(boxplot.needs_stat_transform(&aesthetics));
-    }
-
     #[test]
     fn test_boxplot_display() {
         let boxplot = Boxplot;
