Commit e18d395

vdusek and claude authored
test: Fix flaky tests in adaptive crawler and sitemap loader (#1847)
## Summary

- Add `@pytest.mark.flaky(rerun=3)` to `test_adaptive_context_query_selector_beautiful_soup`. The parsel variant already had this marker for the same timing issue (#1650), but the BeautifulSoup variant was missing it. The test relies on Playwright detecting a JS-injected `<h2>` within a 200ms window, which is racy on macOS/Windows CI.
- Increase the `asyncio.wait_for` timeout from 2s to 10s in `test_data_persistence_for_sitemap_loading`. On Windows CI with httpx, the HTTP client hit a connection error and started retrying, exhausting the 2-second window. The test validates data persistence, not latency.

Failed CI jobs:

- [Unit tests (macos-latest, 3.14)](https://github.com/apify/crawlee-python/actions/runs/24454238446/job/71450602394)
- [Unit tests (windows-latest, 3.12)](https://github.com/apify/crawlee-python/actions/runs/24454238446/job/71450602420)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent f241ead commit e18d395

2 files changed: +5 -1 lines changed

tests/unit/crawlers/_adaptive_playwright/test_adaptive_playwright_crawler.py

Lines changed: 4 additions & 0 deletions
@@ -631,6 +631,10 @@ async def request_handler(context: AdaptivePlaywrightCrawlingContext) -> None:
     mocked_browser_handler.assert_called_once_with()


+@pytest.mark.flaky(
+    rerun=3,
+    reason='Test is flaky on Windows and MacOS, see https://github.com/apify/crawlee-python/issues/1650.',
+)
 async def test_adaptive_context_query_selector_beautiful_soup(test_urls: list[str]) -> None:
     """Test that `context.query_selector_one` works regardless of the crawl type for BeautifulSoup variant.
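
The idea behind rerunning a timing-sensitive test can be sketched in plain Python. This is only an illustration of retry-on-failure semantics, not how the pytest marker is implemented; `rerun_on_failure`, `racy_check`, and the `attempts` counter are hypothetical names that do not appear in the repository.

```python
import asyncio
import functools
from collections.abc import Awaitable, Callable


def rerun_on_failure(reruns: int):
    """Hypothetical decorator: retry an async check up to `reruns` extra times."""

    def decorator(func: Callable[[], Awaitable[None]]) -> Callable[[], Awaitable[None]]:
        @functools.wraps(func)
        async def wrapper() -> None:
            for attempt in range(reruns + 1):
                try:
                    await func()
                    return
                except AssertionError:
                    # Re-raise only once all reruns are used up.
                    if attempt == reruns:
                        raise

        return wrapper

    return decorator


attempts = 0


@rerun_on_failure(reruns=3)
async def racy_check() -> None:
    global attempts
    attempts += 1
    # Simulate a race that is lost on the first two attempts.
    assert attempts >= 3, 'element not detected in time'


asyncio.run(racy_check())
```

A rerun hides an intermittent failure rather than removing the race, which is why the marker in the diff carries a `reason` pointing at the tracking issue.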

tests/unit/request_loaders/test_sitemap_request_loader.py

Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@ async def wait_for_sitemap_loader_not_empty(sitemap_loader: SitemapRequestLoader
     sitemap_loader = SitemapRequestLoader([str(sitemap_url)], http_client=http_client, persist_state_key=persist_key)

     # Give time to load
-    await asyncio.wait_for(wait_for_sitemap_loader_not_empty(sitemap_loader), timeout=2)
+    await asyncio.wait_for(wait_for_sitemap_loader_not_empty(sitemap_loader), timeout=10)

     await sitemap_loader.close()
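
A minimal sketch of why the larger timeout should not slow down the passing case: `asyncio.wait_for` returns as soon as the awaited coroutine finishes, so the timeout only bounds the failure path. `FakeLoader` and `wait_until_not_empty` are hypothetical stand-ins, not the real `SitemapRequestLoader` or the test's helper.

```python
import asyncio
import time


class FakeLoader:
    """Hypothetical stand-in for a loader that becomes non-empty after a delay."""

    def __init__(self, ready_after: float) -> None:
        self._ready_at = time.monotonic() + ready_after

    async def is_empty(self) -> bool:
        return time.monotonic() < self._ready_at


async def wait_until_not_empty(loader: FakeLoader) -> None:
    # Poll until the loader reports data; wait_for bounds the polling from outside.
    while await loader.is_empty():
        await asyncio.sleep(0.01)


async def main() -> tuple[float, bool]:
    # Generous timeout: returns once the loader is ready, not after 10 seconds.
    start = time.monotonic()
    await asyncio.wait_for(wait_until_not_empty(FakeLoader(ready_after=0.05)), timeout=10)
    elapsed = time.monotonic() - start

    # Tight timeout: a slow loader exhausts the window and raises TimeoutError.
    timed_out = False
    try:
        await asyncio.wait_for(wait_until_not_empty(FakeLoader(ready_after=1.0)), timeout=0.1)
    except asyncio.TimeoutError:
        timed_out = True
    return elapsed, timed_out


elapsed, timed_out = asyncio.run(main())
```

In this sketch the first call completes in a fraction of a second despite the 10-second limit, while the second shows the failure mode the commit message describes: a loader that needs longer than the window.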
