Commit 2dc4a06

spec: add silhouette-basic specification (#2336)
## New Specification: `silhouette-basic`

Related to #2334

---

### specification.md

# silhouette-basic: Silhouette Plot

## Description

A silhouette plot visualizes the quality of clustering results by showing the silhouette coefficient for each sample, grouped by cluster assignment. Each horizontal bar represents a sample's silhouette score (-1 to 1), where positive values indicate good cluster membership and negative values suggest potential misclassification. This visualization helps evaluate cluster cohesion (how similar samples are to their own cluster) and separation (how distinct they are from neighboring clusters).

## Applications

- Evaluating K-means, hierarchical, or other clustering algorithm results
- Comparing different numbers of clusters to find optimal k value
- Identifying poorly clustered or potentially misclassified samples
- Validating cluster assignments before downstream analysis

## Data

- `samples` (numeric) - feature vectors for each data point to be clustered
- `cluster_labels` (integer) - cluster assignment for each sample (0 to k-1)
- `silhouette_values` (numeric) - silhouette coefficient per sample (-1 to 1)
- Size: 50-500 samples with 2-10 clusters for readable visualization
- Example: clustering iris dataset into 3 species groups

## Notes

- Display horizontal bars for each sample's silhouette score, sorted within each cluster
- Group samples by cluster with distinct colors per cluster
- Include vertical line at average silhouette score for reference
- Annotate each cluster section with its average silhouette score
- Use sklearn.metrics.silhouette_samples for computing individual scores
- Clusters with consistently high scores (close to 1) indicate well-separated groups

---

**Next:** Add `approved` label to the issue to merge this PR.

---

:robot: *[spec-create workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/20528132379)*

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent dfcea7c commit 2dc4a06

2 files changed

Lines changed: 57 additions & 0 deletions

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
# silhouette-basic: Silhouette Plot

## Description

A silhouette plot visualizes the quality of clustering results by showing the silhouette coefficient for each sample, grouped by cluster assignment. Each horizontal bar represents a sample's silhouette score (-1 to 1), where positive values indicate good cluster membership and negative values suggest potential misclassification. This visualization helps evaluate cluster cohesion (how similar samples are to their own cluster) and separation (how distinct they are from neighboring clusters).
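
The cohesion and separation terms in the description can be made concrete with a small sketch. It is not part of the committed spec: the function name `silhouette_coefficients`, the Euclidean distance, and the "singleton scores 0" convention are assumptions chosen for illustration. For each sample, `a` is the mean distance to the other members of its own cluster, `b` is the mean distance to the nearest other cluster, and the score is `(b - a) / max(a, b)`.

```python
from math import dist


def silhouette_coefficients(points, labels):
    """Return one silhouette coefficient per point (hypothetical helper)."""
    # Map each cluster label to the indices of its members.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette scores need at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: cohesion, mean distance to the other members of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: separation, mean distance to the nearest other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        # Guard against all-identical points, where both distances are 0.
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Two tight, well-separated pairs: every score should be close to 1.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_scores = silhouette_coefficients(points, [0, 0, 1, 1])

# The same points with scrambled labels: scores turn negative.
bad_scores = silhouette_coefficients(points, [0, 1, 0, 1])
```

This version is quadratic in the number of samples and is meant only to show what each bar in the plot measures.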

## Applications

- Evaluating K-means, hierarchical, or other clustering algorithm results
- Comparing different numbers of clusters to find optimal k value
- Identifying poorly clustered or potentially misclassified samples
- Validating cluster assignments before downstream analysis

## Data

- `samples` (numeric) - feature vectors for each data point to be clustered
- `cluster_labels` (integer) - cluster assignment for each sample (0 to k-1)
- `silhouette_values` (numeric) - silhouette coefficient per sample (-1 to 1)
- Size: 50-500 samples with 2-10 clusters for readable visualization
- Example: clustering iris dataset into 3 species groups
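
The data contract listed above could be checked before plotting. The following is a minimal sketch, not something the spec prescribes: `check_silhouette_inputs` and its messages are hypothetical, and it assumes labels are plain Python integers (NumPy integer types would need converting first).

```python
def check_silhouette_inputs(samples, cluster_labels, silhouette_values):
    """Return a list of problems; an empty list means the inputs fit the spec."""
    problems = []
    n = len(samples)

    # All three inputs describe the same samples, so lengths must agree.
    if not (len(cluster_labels) == len(silhouette_values) == n):
        problems.append(
            "samples, cluster_labels and silhouette_values must have equal length"
        )
    # Size guidance from the spec: 50-500 samples.
    if not 50 <= n <= 500:
        problems.append("50-500 samples are recommended for a readable plot")

    # Labels must be integers running from 0 to k-1.
    if all(isinstance(lb, int) and not isinstance(lb, bool) for lb in cluster_labels):
        k = len(set(cluster_labels))
        if set(cluster_labels) != set(range(k)):
            problems.append("cluster_labels must use every value from 0 to k-1")
        if not 2 <= k <= 10:
            problems.append("2-10 clusters are recommended for a readable plot")
    else:
        problems.append("cluster_labels must be integers")

    # Silhouette coefficients are bounded by definition; NaN also fails here.
    if any(not -1.0 <= value <= 1.0 for value in silhouette_values):
        problems.append("silhouette_values must lie in [-1, 1]")
    return problems


# Synthetic stand-in data: 60 two-dimensional samples in 3 clusters.
samples = [(float(i), float(i % 3)) for i in range(60)]
cluster_labels = [i % 3 for i in range(60)]
silhouette_values = [0.5] * 60

problems = check_silhouette_inputs(samples, cluster_labels, silhouette_values)
```

Returning messages instead of raising keeps the size guidance advisory, since the spec frames 50-500 samples and 2-10 clusters as a readability recommendation rather than a hard limit.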

## Notes

- Display horizontal bars for each sample's silhouette score, sorted within each cluster
- Group samples by cluster with distinct colors per cluster
- Include vertical line at average silhouette score for reference
- Annotate each cluster section with its average silhouette score
- Use sklearn.metrics.silhouette_samples for computing individual scores
- Clusters with consistently high scores (close to 1) indicate well-separated groups
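
The layout rules in the notes above (sort within each cluster, stack clusters, mark per-cluster and overall averages) can be sketched independently of any plotting library. The helper name `silhouette_layout`, the `gap` parameter, and the returned dictionary keys are assumptions for illustration; the scores themselves would come from `sklearn.metrics.silhouette_samples`, as the spec suggests.

```python
from statistics import mean


def silhouette_layout(cluster_labels, silhouette_values, gap=10):
    """Compute bar positions and reference values for a silhouette plot.

    Hypothetical helper: returns one section per cluster plus the overall mean.
    """
    sections = []
    y = gap  # leave some room below the first cluster
    for label in sorted(set(cluster_labels)):
        # Sort scores within the cluster so the bars form the usual blade shape.
        values = sorted(
            v for lb, v in zip(cluster_labels, silhouette_values) if lb == label
        )
        sections.append(
            {
                "label": label,
                "y": list(range(y, y + len(values))),  # one row per sample
                "values": values,
                "mean": mean(values),  # annotation for this cluster section
            }
        )
        y += len(values) + gap  # blank rows separate neighbouring clusters
    # The overall mean is where the vertical reference line would go.
    return sections, mean(silhouette_values)


labels = [0, 0, 0, 1, 1, 1]
values = [0.8, 0.6, 0.7, 0.2, -0.1, 0.5]
sections, overall = silhouette_layout(labels, values, gap=2)
```

With matplotlib, each section could then be drawn with `ax.barh(section["y"], section["values"], height=1.0)` in its own color, the reference line with `ax.axvline(overall, linestyle="--")`, and the cluster averages with `ax.text`. Those drawing calls are left out of the block so it runs without third-party packages.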
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
# Specification-level metadata for silhouette-basic
# Auto-synced to PostgreSQL on push to main

spec_id: silhouette-basic
title: Silhouette Plot

# Specification tracking
created: 2025-12-26T19:28:09Z
updated: null
issue: 2334
suggested: MarkusNeusinger

# Classification tags (applies to all library implementations)
# See docs/concepts/tagging-system.md for detailed guidelines
tags:
  plot_type:
    - silhouette
    - bar
  data_type:
    - numeric
    - categorical
  domain:
    - statistics
    - machine-learning
  features:
    - basic
    - clustering
    - evaluation

0 commit comments
