Crawlee optimizes crawls by deduplicating URLs (each unique URL is enqueued only once) and by giving up on a URL after a configurable number of retries. This is great, but I've run into an edge case.
Situation: while extracting data from a page, I realize it wasn't downloaded properly (e.g., the body came back truncated or empty).
Ideal Goal: re-queue the URL while decrementing its retry counter, so the bad download doesn't count against the retry limit (first sketch below).
Alternative: add the URL back to the queue as a fresh request with a reset retry counter (second sketch below).
How can I achieve either of these?
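
To make the question concrete, here's a rough sketch of the first idea using `CheerioCrawler`. The `looksBroken` check is a placeholder of my own, and I don't know whether mutating `request.retryCount` mid-handler is actually supported or persisted:

```ts
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    maxRequestRetries: 3,
    async requestHandler({ request, $ }) {
        // Placeholder check: detect a page that downloaded but is unusable.
        const looksBroken = $('body').text().trim().length === 0;

        if (looksBroken) {
            // Try to give the request a "free" retry: Crawlee increments
            // retryCount when it reclaims a failed request, so decrementing
            // it first should cancel out -- assuming this mutation sticks.
            if (request.retryCount > 0) request.retryCount -= 1;
            throw new Error('Page downloaded incompletely; retrying.');
        }

        // ... normal extraction logic ...
    },
});

await crawler.run(['https://example.com']);
```

And a sketch of the alternative: since the request queue deduplicates by `uniqueKey`, I assume re-enqueueing the URL under a fresh key would make it count as a brand-new request with a retry counter of zero. The `#retry-` suffix is just my own way of making the key unique:

```ts
// Inside the same requestHandler, with `crawler` destructured from
// the handler context alongside `request`:
await crawler.addRequests([{
    url: request.url,
    // Fresh uniqueKey so deduplication doesn't suppress the re-queued URL.
    uniqueKey: `${request.url}#retry-${Date.now()}`,
}]);
```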