Commit 3fcb862

Mirza-Samad-Ahmed-Baig authored and Mantisus committed
fix: prevent get_request from permanently blocking requests (apify#1684)
Fixed a bug in the file-system request queue client where get_request() incorrectly marked the returned request as in-progress, which could permanently block it from being returned by fetch_next_request(). I removed the side effect and added a regression test to ensure get_request() does not change a request's in-progress state.

Signed-off-by: Mirza-Samad-Ahmed-Baig <Mirzasamadahmedbaig@gmail.com>
1 parent f2f16ca commit 3fcb862

2 files changed: 14 additions and 2 deletions

src/crawlee/storage_clients/_file_system/_request_queue_client.py

Lines changed: 0 additions & 2 deletions
@@ -454,8 +454,6 @@ async def get_request(self, unique_key: str) -> Request | None:
             logger.warning(f'Request with unique key "{unique_key}" not found in the queue.')
             return None

-        state = self._state.current_value
-        state.in_progress_requests.add(request.unique_key)
         await self._update_metadata(update_accessed_at=True)
         return request
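
The effect of the two removed lines can be illustrated with a toy model. This is a minimal sketch, not crawlee's implementation: `ToyRequestQueue` and its `get_marks_in_progress` flag are hypothetical names, and it assumes that `fetch_next_request()` skips any request already recorded as in-progress, as the commit message implies.

```python
from __future__ import annotations

from collections import deque


class ToyRequestQueue:
    """Hypothetical in-memory stand-in for a request queue client (not crawlee's class)."""

    def __init__(self, *, get_marks_in_progress: bool) -> None:
        self._pending: deque[str] = deque()
        self._in_progress: set[str] = set()
        # True models the behaviour before this commit, False the behaviour after it.
        self._get_marks_in_progress = get_marks_in_progress

    def add(self, unique_key: str) -> None:
        self._pending.append(unique_key)

    def get_request(self, unique_key: str) -> str | None:
        if unique_key not in self._pending:
            return None
        if self._get_marks_in_progress:
            # Pre-fix side effect: a plain lookup claims the request.
            self._in_progress.add(unique_key)
        return unique_key

    def fetch_next_request(self) -> str | None:
        # Assumption: requests already in progress are skipped.
        for key in self._pending:
            if key not in self._in_progress:
                self._in_progress.add(key)
                return key
        return None


buggy = ToyRequestQueue(get_marks_in_progress=True)
buggy.add('https://example.com/blocked')
buggy.get_request('https://example.com/blocked')
print(buggy.fetch_next_request())  # None: the lookup blocked the request

fixed = ToyRequestQueue(get_marks_in_progress=False)
fixed.add('https://example.com/blocked')
fixed.get_request('https://example.com/blocked')
print(fixed.fetch_next_request())  # https://example.com/blocked
```

In the toy model nothing ever clears the in-progress mark set by the lookup, so the request stays unavailable, which mirrors the "permanently blocked" symptom described in the commit message.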

tests/unit/storage_clients/_file_system/test_fs_rq_client.py

Lines changed: 14 additions & 0 deletions
@@ -173,3 +173,17 @@ async def test_data_persistence_across_reopens() -> None:
     assert {request1.url, request2.url} == {'https://example.com/1', 'https://example.com/2'}

     await reopened_client.drop()
+
+
+async def test_get_request_does_not_mark_in_progress(rq_client: FileSystemRequestQueueClient) -> None:
+    """Test that get_request does not block a request from being fetched."""
+    request = Request.from_url('https://example.com/blocked')
+    await rq_client.add_batch_of_requests([request])
+
+    fetched = await rq_client.get_request(request.unique_key)
+    assert fetched is not None
+    assert fetched.unique_key == request.unique_key
+
+    next_request = await rq_client.fetch_next_request()
+    assert next_request is not None
+    assert next_request.unique_key == request.unique_key
