Commit 1b479b8

fix(docs): update dataset size table to include vector sizes for clarity
1 parent 63dd630 commit 1b479b8

1 file changed

Lines changed: 13 additions & 12 deletions

bindings/python/docs/examples/download_data.md

@@ -176,15 +176,16 @@ These counts come from generated `vectors/stackoverflow-<size>-all.meta.json` fi
 
 ## Approximate Sizes
 
-| Dataset | Size |
-| --- | --- |
-| MovieLens small | ~3.2 MB |
-| MovieLens large | ~1.5 GB |
-| MSMARCO 1M | ~3.9 GB |
-| MSMARCO 5M | ~20 GB |
-| MSMARCO 10M | ~39 GB |
-| StackOverflow small | ~642 MB |
-| StackOverflow medium | ~2.9 GB |
-| StackOverflow large | ~10 GB |
-| StackOverflow xlarge | ~50 GB |
-| StackOverflow full | ~323 GB |
+| Dataset | Dataset size (non-vectors) | Vector size |
+| --- | --- | --- |
+| MovieLens small | ~3.2 MB ||
+| MovieLens large | ~1.5 GB ||
+| MSMARCO 1M || ~3.9 GB |
+| MSMARCO 5M || ~20 GB |
+| MSMARCO 10M || ~39 GB |
+| StackOverflow tiny | ~34 MB | ~35 MB |
+| StackOverflow small | ~642 MB | ~495 MB |
+| StackOverflow medium | ~2.9 GB | ~2.0 GB |
+| StackOverflow large | ~10 GB | ~8.8 GB |
+| StackOverflow xlarge | ~50 GB | ~42 GB |
+| StackOverflow full | ~323 GB ||
