Commit 5447d26

spec: add mosaic-categorical specification (#3655)
## New Specification: `mosaic-categorical`

Related to #3650

---

### specification.md

# mosaic-categorical: Mosaic Plot for Categorical Association Analysis

## Description

A mosaic plot visualizes contingency tables by dividing a rectangular area into smaller rectangles whose areas are proportional to cell frequencies. This statistical visualization technique effectively shows relationships and associations between two or more categorical variables, making it easy to identify patterns, dependencies, and deviations from expected frequencies in cross-tabulated data.

## Applications

- Analyzing survey response patterns across demographic groups
- Exploring relationships between categorical variables in social science research
- Visualizing contingency tables in medical studies (treatment vs outcome)
- Examining association between product categories and customer segments

## Data

- `category_1` (categorical) - First categorical variable (rows in contingency table)
- `category_2` (categorical) - Second categorical variable (columns in contingency table)
- `frequency` (numeric, optional) - Count or frequency for each combination; if omitted, computed from data
- Size: Typically 2-6 levels per categorical variable for readability
- Example: Titanic survival data cross-tabulated by class and survival status

## Notes

- Rectangle widths represent marginal proportions of the first variable
- Rectangle heights within each column represent conditional proportions of the second variable
- Area of each rectangle is proportional to the cell frequency in the contingency table
- Use statsmodels.graphics.mosaicplot for the core visualization
- Color coding can indicate residuals or deviations from independence
- Gap spacing between rectangles helps distinguish categories
- Labels should identify both categorical variables clearly

---

**Next:** Add `approved` label to the issue to merge this PR.

---

:robot: *[spec-create workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20886473590)*

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent b9ded83 commit 5447d26

2 files changed

Lines changed: 59 additions & 0 deletions

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
# mosaic-categorical: Mosaic Plot for Categorical Association Analysis

## Description

A mosaic plot visualizes a contingency table by dividing a rectangular area into smaller rectangles whose areas are proportional to the cell frequencies. It shows associations between two or more categorical variables, making it easier to spot patterns, dependencies, and deviations from the frequencies expected under independence in cross-tabulated data.
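
Deviations from expected frequencies are usually measured against independence of the two variables. The following is a minimal sketch of that calculation, not part of the spec: the counts are made up for illustration and `pearson_residuals` is a hypothetical helper name. The expected count for a cell is (row total × column total) / grand total, and the Pearson residual scales the observed-minus-expected difference by the square root of the expected count.

```python
import math

# Illustrative (made-up) counts: (category_1, category_2) -> frequency
counts = {
    ("A", "yes"): 30, ("A", "no"): 10,
    ("B", "yes"): 20, ("B", "no"): 40,
}


def pearson_residuals(counts):
    """Return {cell: (observed - expected) / sqrt(expected)} under independence.

    Assumes every level of both variables has a positive total count,
    so that no expected count is zero.
    """
    total = sum(counts.values())
    row_totals, col_totals = {}, {}
    for (row, col), n in counts.items():
        row_totals[row] = row_totals.get(row, 0) + n
        col_totals[col] = col_totals.get(col, 0) + n

    residuals = {}
    for row in row_totals:
        for col in col_totals:
            expected = row_totals[row] * col_totals[col] / total
            observed = counts.get((row, col), 0)
            residuals[(row, col)] = (observed - expected) / math.sqrt(expected)
    return residuals


residuals = pearson_residuals(counts)
```

Positive residuals mark cells that occur more often than independence would predict, and negative residuals mark cells that occur less often.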

## Applications

- Analyzing survey response patterns across demographic groups
- Exploring relationships between categorical variables in social science research
- Visualizing contingency tables in medical studies (treatment vs outcome)
- Examining association between product categories and customer segments

## Data

- `category_1` (categorical) - First categorical variable (rows in contingency table)
- `category_2` (categorical) - Second categorical variable (columns in contingency table)
- `frequency` (numeric, optional) - Count or frequency for each combination; if omitted, computed from data
- Size: Typically 2-6 levels per categorical variable for readability
- Example: Titanic survival data cross-tabulated by class and survival status
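
One way to read the optional `frequency` field is sketched below: records that carry a `frequency` contribute that weight, and records without one count as a single observation. This is an assumption about the intended behavior, not something the spec states in detail. `contingency_counts` is a hypothetical helper, and the sample records are invented rather than taken from the Titanic dataset.

```python
from collections import Counter


def contingency_counts(records):
    """Aggregate records into {(category_1, category_2): count}.

    Each record is a dict with 'category_1' and 'category_2' keys.
    A missing or None 'frequency' is treated as one observation.
    """
    counts = Counter()
    for record in records:
        weight = record.get("frequency")
        if weight is None:
            weight = 1
        counts[(record["category_1"], record["category_2"])] += weight
    return dict(counts)


# Invented sample records mixing raw observations and a pre-aggregated row
records = [
    {"category_1": "First", "category_2": "Survived"},
    {"category_1": "First", "category_2": "Survived"},
    {"category_1": "First", "category_2": "Died"},
    {"category_1": "Third", "category_2": "Died", "frequency": 3},
]

table = contingency_counts(records)
```

The resulting dictionary is keyed by `(category_1, category_2)` tuples, a shape that is convenient for computing marginal totals later.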

## Notes

- Rectangle widths represent marginal proportions of the first variable
- Rectangle heights within each column represent conditional proportions of the second variable, given the level of the first
- Area of each rectangle is proportional to the cell frequency in the contingency table
- Use `statsmodels.graphics.mosaicplot.mosaic` for the core visualization
- Color coding can indicate residuals or deviations from independence
- Gap spacing between rectangles helps distinguish categories
- Labels should identify both categorical variables clearly
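
The width, height, and area rules above can be checked with a small layout sketch. It is only an illustration under stated assumptions: the counts are made up, gaps are ignored, `mosaic_layout` is a hypothetical helper, and the code does not reproduce how any plotting library computes its rectangles.

```python
# Illustrative (made-up) counts: (category_1, category_2) -> frequency
counts = {
    ("A", "yes"): 30, ("A", "no"): 10,
    ("B", "yes"): 20, ("B", "no"): 40,
}


def mosaic_layout(counts):
    """Return {(c1, c2): (x, y, width, height)} inside the unit square.

    Width  = marginal proportion of the category_1 level.
    Height = conditional proportion of category_2 given that level.
    Gaps between rectangles are not modeled here.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")

    # Keep levels in first-seen order
    levels_1 = list(dict.fromkeys(key[0] for key in counts))
    levels_2 = list(dict.fromkeys(key[1] for key in counts))

    rects = {}
    x = 0.0
    for a in levels_1:
        column_total = sum(counts.get((a, b), 0) for b in levels_2)
        width = column_total / total
        y = 0.0
        for b in levels_2:
            height = counts.get((a, b), 0) / column_total if column_total else 0.0
            rects[(a, b)] = (x, y, width, height)
            y += height
        x += width
    return rects


rects = mosaic_layout(counts)
```

Because width times height equals the cell count divided by the grand total, each rectangle's area is proportional to its cell frequency. Gaps and residual-based colors would be layered on top of these coordinates by the plotting code.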
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
# Specification-level metadata for mosaic-categorical
# Auto-synced to PostgreSQL on push to main

spec_id: mosaic-categorical
title: Mosaic Plot for Categorical Association Analysis

# Specification tracking
created: 2026-01-11T00:09:10Z
updated: null
issue: 3650
suggested: MarkusNeusinger

# Classification tags (applies to all library implementations)
# See docs/reference/tagging-system.md for detailed guidelines
tags:
  plot_type:
    - mosaic
    - heatmap
  data_type:
    - categorical
    - frequency
  domain:
    - statistics
    - research
    - general
  features:
    - basic
    - proportional
    - association
