Commit 326a4ba

B4nan and claude committed
fix: export coerceNumber, update stale docs using old Configuration.set() API
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent cc117c3 commit 326a4ba

3 files changed

Lines changed: 9 additions & 14 deletions

docs/guides/parallel-scraping/parallel-scraper.mjs

Lines changed: 3 additions & 5 deletions
@@ -73,15 +73,13 @@ if (!process.env.IN_WORKER_THREAD) {
 // or a configuration option. This is just for show 😈
 workerLogger.setLevel(log.LEVELS.DEBUG);
 
-// Disable the automatic purge on start
-// This is needed when running locally, as otherwise multiple processes will try to clear the default storage (and that will cause clashes)
-Configuration.set('purgeOnStart', false);
-
 // Get the request queue
 const requestQueue = await getOrInitQueue(false);
 
-// Configure crawlee to store the worker-specific data in a separate directory (needs to be done AFTER the queue is initialized when running locally)
+// Disable the automatic purge on start and configure crawlee to store the worker-specific data in a separate directory
+// (needs to be done AFTER the queue is initialized when running locally)
 const config = new Configuration({
+    purgeOnStart: false,
     storageClientOptions: {
         localDataDirectory: `./storage/worker-${process.env.WORKER_INDEX}`,
     },
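
As a minimal sketch of why the worker-specific `localDataDirectory` keeps workers from colliding, the snippet below reproduces the template string from the hunk above in plain TypeScript. `workerStorageDir` is a hypothetical helper, not a Crawlee API, and it takes the index as a parameter instead of reading `process.env.WORKER_INDEX`.

```typescript
// Hypothetical helper (not part of Crawlee): builds the per-worker storage path
// with the same template string used in the diff above.
function workerStorageDir(workerIndex: number | string): string {
    return `./storage/worker-${workerIndex}`;
}

// Each worker index maps to its own directory, so no two workers share local storage.
const dirs: string[] = [0, 1, 2].map((index) => workerStorageDir(index));
console.log(dirs);
```

The parent process would keep using the default storage location, which is why the shared queue has to be obtained before this per-worker configuration is created.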

docs/guides/parallel-scraping/parallel-scraping.mdx

Lines changed: 5 additions & 8 deletions
@@ -132,22 +132,19 @@ We use this to ensure the parent process stays alive until all the worker proces
 
 There are three steps we want to do for the worker processes:
 
-- ensure the default storages do **not** get purged on start, as otherwise we'd lose the queue we prepared
 - get the queue that supports locking from the same location as the parent process
-- initialize a special storage for worker processes so they do not collide with each other
+- ensure the default storages do **not** get purged on start, as otherwise we'd lose the queue we prepared, and initialize a special storage for worker processes so they do not collide with each other
 
 In order, that's what these lines do:
 
 ```javascript title="src/parallel-scraper.mjs"
-// Disable the automatic purge on start (step 1)
-// This is needed when running locally, as otherwise multiple processes will try to clear the default storage (and that will cause clashes)
-Configuration.set('purgeOnStart', false);
-
-// Get the request queue from the parent process (step 2)
+// Get the request queue from the parent process (step 1)
 const requestQueue = await getOrInitQueue(false);
 
-// Configure crawlee to store the worker-specific data in a separate directory (needs to be done AFTER the queue is initialized when running locally) (step 3)
+// Disable the automatic purge on start and configure crawlee to store the worker-specific data
+// in a separate directory (needs to be done AFTER the queue is initialized when running locally) (step 2)
 const config = new Configuration({
+    purgeOnStart: false,
     storageClientOptions: {
         localDataDirectory: `./storage/worker-${process.env.WORKER_INDEX}`,
     },

packages/core/src/configuration.ts

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ export const coerceBoolean = z.preprocess((val) => {
     return val;
 }, z.boolean());
 
-const coerceNumber = z.preprocess((val) => {
+export const coerceNumber = z.preprocess((val) => {
     if (typeof val === 'string') return Number(val);
     return val;
 }, z.number());
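
The exported `coerceNumber` runs string input through `Number()` before the value is validated as a number. The sketch below mimics that two-stage behavior in plain TypeScript, without importing zod, so it runs standalone. `preprocessNumber` and `parseCoercedNumber` are hypothetical names, and the `NaN` rejection in the second stage is an assumption about how the `z.number()` check treats non-numeric strings.

```typescript
// Stage 1: mirrors the preprocess callback from the diff (strings go through Number()).
function preprocessNumber(val: unknown): unknown {
    if (typeof val === 'string') return Number(val);
    return val;
}

// Stage 2: stand-in for the z.number() check (assumption: NaN is rejected).
function parseCoercedNumber(val: unknown): number {
    const candidate = preprocessNumber(val);
    if (typeof candidate !== 'number' || Number.isNaN(candidate)) {
        throw new TypeError(`Expected number, received ${String(val)}`);
    }
    return candidate;
}

console.log(parseCoercedNumber('8080')); // 8080
console.log(parseCoercedNumber(42)); // 42
console.log(parseCoercedNumber('')); // 0, because Number('') evaluates to 0
```

One consequence worth noting is that an empty string coerces to `0` rather than failing, which may matter for values read from environment variables.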
