Commit 20135e2

chore: fix broken HuggingFace wikitext dataset link (#4067)
1 parent cfd0090 commit 20135e2

1 file changed

Lines changed: 1 addition & 1 deletion

docs/examples/Python/llm_dataset_creation.rst

@@ -8,7 +8,7 @@ all of it on the disk at once. This becomes a considerable problem when you just
 In this example, we will be bypassing this problem by downloading a text dataset in parts, tokenizing it and saving it as a Lance dataset.
 This can be done for as many or as few data samples as you wish with average memory consumption approximately 3-4 GBs!
 
-For this example, we are working with the `wikitext <https://huggingface.co/datasets/wikitext>`_ dataset,
+For this example, we are working with the `wikitext <https://huggingface.co/datasets/Salesforce/wikitext>`_ dataset,
 which is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia.
 
 Preparing and pre-processing the raw dataset
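
The context lines in this hunk describe processing a text dataset in parts so that it never has to sit in memory all at once. The following is a minimal, standard-library-only sketch of that general idea, not the code from the documented example: `batched`, `toy_tokenize`, and `process_in_parts` are hypothetical names, whitespace splitting stands in for a real tokenizer, and the `sink` callback stands in for whatever writes each batch to a Lance dataset.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(samples: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most batch_size samples without materializing the stream."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(samples)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def toy_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in for a real tokenizer: split on whitespace."""
    return text.split()


def process_in_parts(
    samples: Iterable[str],
    batch_size: int,
    sink: Callable[[List[List[str]]], None],
) -> int:
    """Tokenize the stream one batch at a time, pass each batch to sink,
    and return the total number of tokens seen."""
    total = 0
    for batch in batched(samples, batch_size):
        tokenized = [toy_tokenize(text) for text in batch]
        sink(tokenized)  # assumption: the real sink would append to a Lance dataset
        total += sum(len(tokens) for tokens in tokenized)
    return total


# In-memory generator standing in for a streamed text dataset.
fake_stream = (f"sample number {i}" for i in range(5))
written: List[List[List[str]]] = []
token_count = process_in_parts(fake_stream, batch_size=2, sink=written.append)
```

Only one batch of raw text is held at a time, so peak memory in this sketch depends on `batch_size` and on what the sink retains, not on the size of the whole stream. Here the sink is a plain list for demonstration; a real pipeline would write each batch out instead of keeping it.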
