MarkusNeusinger
diff --git a/plots/scatter-embedding/implementations/.gitkeep b/plots/scatter-embedding/implementations/.gitkeep
diff --git a/plots/scatter-embedding/metadata/.gitkeep b/plots/scatter-embedding/metadata/.gitkeep
diff --git a/plots/scatter-embedding/specification.md b/plots/scatter-embedding/specification.md
diff --git a/plots/scatter-embedding/specification.yaml b/plots/scatter-embedding/specification.yaml
@@ -0,0 +1,28 @@
+# scatter-embedding: t-SNE and UMAP Embedding Visualization
+
+## Description
+
+A scatter plot displaying high-dimensional data projected into 2D space using non-linear dimensionality reduction techniques such as t-SNE or UMAP. Points are colored by cluster or class label, revealing groupings and latent structure in the data. This is a standard visualization in machine learning for exploring learned embeddings, single-cell RNA-seq data, and NLP document clusters, helping practitioners verify that learned representations capture meaningful distinctions.
+
+## Applications
+
+- Visualizing cell-type clusters in single-cell RNA-seq data after dimensionality reduction in bioinformatics workflows
+- Exploring word or document embeddings from NLP models to verify semantic groupings and detect outliers
+- Inspecting latent space structure of autoencoders or variational autoencoders (VAEs) to assess representation quality
+- Quality-checking clustering results from K-means or DBSCAN by overlaying cluster assignments on the 2D projection
+
+## Data
+
+- `x` (float) — First embedding dimension (e.g., t-SNE 1 or UMAP 1)
+- `y` (float) — Second embedding dimension (e.g., t-SNE 2 or UMAP 2)
+- `label` (categorical) — Cluster or class assignment for coloring points
+- Size: 500–5000 points typical
+- Example: Synthetic clustered data with 5–10 groups projected via t-SNE or UMAP
+
+## Notes
+
+- Color each cluster/class with a distinct, colorblind-accessible color and include a legend mapping colors to labels
+- Optionally annotate cluster centroids with the cluster label text
+- Use a moderate point size with slight transparency (alpha) so that overlapping points in dense regions remain visible
+- Include a subtitle noting the algorithm and key parameter (e.g., "t-SNE (perplexity=30)" or "UMAP (n_neighbors=15)")
+- Axes represent embedding dimensions and should typically omit tick labels, since the coordinates are not directly interpretable
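
The Data and Notes sections above can be sketched in Python as follows. This is an illustrative sketch only, not part of the specification or of any library implementation in this repository. The function names (`embedding_subtitle`, `cluster_centroids`, `plot_embedding`, `demo`), the output path, the Okabe-Ito palette choice, and the point size and alpha values are all assumptions made for the example. It also assumes matplotlib and scikit-learn are installed; both are imported lazily so the pure-Python helpers work without them.

```python
from collections import defaultdict

# Okabe-Ito palette (colorblind-accessible); an assumed choice, cycled past 8 labels.
PALETTE = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
           "#0072B2", "#D55E00", "#CC79A7", "#000000"]


def embedding_subtitle(algorithm, **params):
    """Build a subtitle such as 't-SNE (perplexity=30)'."""
    if not params:
        return algorithm
    joined = ", ".join(f"{name}={value}" for name, value in params.items())
    return f"{algorithm} ({joined})"


def cluster_centroids(xs, ys, labels):
    """Return {label: (mean_x, mean_y)}, used to place centroid annotations."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for x, y, label in zip(xs, ys, labels):
        acc = sums[label]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def plot_embedding(xs, ys, labels, subtitle, axis_name="t-SNE",
                   path="scatter-embedding.png"):
    """Draw the scatter along the lines of the Notes (hypothetical helper)."""
    import matplotlib
    matplotlib.use("Agg")  # render off-screen, no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 6))
    # One scatter call per label gives each group its own color and legend entry.
    for i, label in enumerate(sorted(set(labels))):
        px = [x for x, lab in zip(xs, labels) if lab == label]
        py = [y for y, lab in zip(ys, labels) if lab == label]
        ax.scatter(px, py, s=12, alpha=0.6,
                   color=PALETTE[i % len(PALETTE)], label=str(label))
    # Optional centroid annotations with the label text.
    for label, (cx, cy) in cluster_centroids(xs, ys, labels).items():
        ax.annotate(str(label), (cx, cy), ha="center", va="center",
                    fontweight="bold")
    fig.suptitle("Embedding Visualization")
    ax.set_title(subtitle, fontsize=10)  # algorithm and key parameter
    ax.set_xlabel(f"{axis_name} 1")
    ax.set_ylabel(f"{axis_name} 2")
    ax.set_xticks([])  # coordinates are not directly interpretable
    ax.set_yticks([])
    ax.legend(title="label")
    fig.savefig(path, dpi=150)
    plt.close(fig)
    return path


def demo():
    """Synthetic clustered data projected via t-SNE, then plotted."""
    from sklearn.datasets import make_blobs
    from sklearn.manifold import TSNE

    features, labels = make_blobs(n_samples=1000, n_features=20,
                                  centers=6, random_state=0)
    xy = TSNE(n_components=2, perplexity=30,
              random_state=0).fit_transform(features)
    return plot_embedding(xy[:, 0].tolist(), xy[:, 1].tolist(),
                          labels.tolist(),
                          embedding_subtitle("t-SNE", perplexity=30))
```

Calling `demo()` writes the figure to the assumed default path and returns that path. A UMAP variant would replace the `TSNE` step with a UMAP projection and pass `embedding_subtitle("UMAP", n_neighbors=15)` and `axis_name="UMAP"`.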
@@ -0,0 +1,27 @@
+# Specification-level metadata for scatter-embedding
+# Auto-synced to PostgreSQL on push to main
+
+spec_id: scatter-embedding
+title: t-SNE and UMAP Embedding Visualization
+
+# Specification tracking
+created: "2026-04-11T20:22:05Z"
+updated: null
+issue: 5236
+suggested: MarkusNeusinger
+
+# Classification tags (applies to all library implementations)
+# See docs/reference/tagging-system.md for detailed guidelines
+tags:
+  plot_type:
+    - scatter
+  data_type:
+    - numeric
+    - categorical
+  domain:
+    - machine-learning
+    - science
+  features:
+    - color-mapped
+    - annotated
+    - 2d