Commit 0048402
spec: add scatter-embedding specification (#5261)
## New Specification: `scatter-embedding`
Related to #5236
---
### specification.md
# scatter-embedding: t-SNE and UMAP Embedding Visualization
## Description
A scatter plot displaying high-dimensional data projected into 2D space
using non-linear dimensionality reduction techniques such as t-SNE or
UMAP. Points are colored by cluster or class label, revealing groupings
and latent structure in the data. This is a standard visualization in
machine learning for exploring embeddings, single-cell RNA-seq data, and
NLP document clustering, helping practitioners verify that learned
representations capture meaningful distinctions.
## Applications
- Visualizing cell-type clusters in single-cell RNA-seq data after
dimensionality reduction in bioinformatics workflows
- Exploring word or document embeddings from NLP models to verify
semantic groupings and detect outliers
- Inspecting latent space structure of autoencoders or variational
autoencoders (VAEs) to assess representation quality
- Quality-checking clustering results from K-means or DBSCAN by
overlaying cluster assignments on the 2D projection
## Data
- `x` (float) — First embedding dimension (e.g., t-SNE 1 or UMAP 1)
- `y` (float) — Second embedding dimension (e.g., t-SNE 2 or UMAP 2)
- `label` (categorical) — Cluster or class assignment for coloring
points
- Size: 500–5000 points typical
- Example: Synthetic clustered data with 5–10 groups projected via t-SNE
or UMAP
## Notes
- Color each cluster/class with a distinct, colorblind-accessible color
and include a legend mapping colors to labels
- Optionally annotate cluster centroids with the cluster label text
- Use moderate point size with slight transparency (alpha) to handle
overlapping points in dense regions
- Include a subtitle noting the algorithm and key parameter (e.g.,
"t-SNE (perplexity=30)" or "UMAP (n_neighbors=15)")
- Axes represent embedding dimensions and typically should not have tick
labels, as the coordinates are not directly interpretable
---
**Next:** Add `approved` label to the issue to merge this PR.
---
:robot: *[spec-create
workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/24290812404)*
---------
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>1 parent d6c578c commit 0048402
4 files changed
Lines changed: 55 additions & 0 deletions
File tree
- plots/scatter-embedding
- implementations
- metadata
Whitespace-only changes.
Whitespace-only changes.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
0 commit comments