Skip to content

Commit 8c0dae6

Browse files
authored
test: Reduce flakiness in tests (#1716)
### Description - Properly consume the first `SystemInfoEvent` in the fixture to ensure no interference with tests - Add extra wait buffer to flaky test sensitive to timing issues - Take into account that `asyncio.sleep` time can be slightly shorter than expected - Run resource-sensitive tests alone on Mac. Mac executor is the most sensitive to resource-dependent tests. Reduce flakiness by running some of them alone on Mac only. This should reduce the flakiness, without hiding issues too much. (Mac executor on GitHub has its own issues, which we do not need to deal with. If the test is flaky on other platforms, it should be investigated. If flaky on Mac only, try to run it alone first. ) ### Issues - Closes: #1652
1 parent df1347a commit 8c0dae6

File tree

7 files changed

+45
-5
lines changed

7 files changed

+45
-5
lines changed

tests/unit/README.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1 +1,11 @@
11
# Unit tests
2+
3+
Some tests may exhibit flaky behavior in CI. The reason for flaky behavior should be understood as it can indicate bug in the code or design flaw in the test. There are other reasons related to test execution, such as some tests that are not (or can not be) properly isolated, or limited resource constraints of the test executor.
4+
5+
Here are some suggested approaches to mitigate flakiness, sorted in the order of preference:
6+
- Investigate the root cause and fix the code or test.
7+
- Apply one of the pytest marks to mitigate the flakiness:
8+
- `@run_alone_on_mac` - Test with such mark will run alone on macOS exeutor in CI (normally several tests run in parallel, which can cause resource-sensitive tests to fail.) Use for resource sensitive tests that are known to be flaky only on macOS.
9+
- `@run_alone` - Test with such mark will run alone on any executor. Use for resource sensitive tests that are known to be flaky on all platforms or for tests that can not be run in parallel with other test due to their design (This should be extremely rare).
10+
- `@pytest.mark.flaky` - Test with such mark will be retried several times if it fails. Use for tests that are known to be flaky, but the reason for flakiness is not understood or can not be easily mitigated.
11+
- `@pytest.mark.skip` - Test with such mark will be skipped. Use when none of the above approaches mitigate the test flakiness. Marking test as skipped should be a last resort, as it can hide potential bugs and give false sense of security. Skipped tests should be tracked in GitHub issue.

tests/unit/_autoscaling/test_snapshotter.py

Lines changed: 17 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
from __future__ import annotations
22

3+
import asyncio
34
from datetime import datetime, timedelta, timezone
45
from logging import getLogger
56
from typing import TYPE_CHECKING, cast
@@ -23,8 +24,22 @@
2324

2425
@pytest.fixture
2526
async def event_manager() -> AsyncGenerator[LocalEventManager, None]:
26-
# Use a long interval to avoid interference from periodic system info events during tests
27-
async with LocalEventManager(system_info_interval=timedelta(hours=9999)) as event_manager:
27+
# Use a long interval to avoid interference from periodic system info events during tests and ensure the first
28+
# automatic event is consumed before yielding.
29+
30+
event_manager = LocalEventManager(system_info_interval=timedelta(hours=9999))
31+
32+
initial_system_info_consumed = asyncio.Event()
33+
34+
async def consume_automatic_system_info(_: EventSystemInfoData) -> None:
35+
initial_system_info_consumed.set()
36+
37+
event_manager.on(event=Event.SYSTEM_INFO, listener=consume_automatic_system_info)
38+
39+
async with event_manager:
40+
await initial_system_info_consumed.wait()
41+
event_manager.off(event=Event.SYSTEM_INFO, listener=consume_automatic_system_info)
42+
2843
yield event_manager
2944

3045

tests/unit/_statistics/test_request_max_duration.py

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,11 @@
77

88
async def test_request_max_duration_tracks_maximum() -> None:
99
"""Test that request_max_duration correctly tracks the maximum duration, not the minimum."""
10+
11+
# asyncio.sleep() can sleep slightly shorter than expected https://bugs.python.org/issue31539#msg302699
12+
asyncio_sleep_time_tolerance = 0.015
13+
sleep_time = 0.05
14+
1015
async with Statistics.with_default_state() as statistics:
1116
# Record a short request
1217
statistics.record_request_processing_start('request_1')
@@ -15,15 +20,15 @@ async def test_request_max_duration_tracks_maximum() -> None:
1520

1621
# Record a longer request
1722
statistics.record_request_processing_start('request_2')
18-
await asyncio.sleep(0.05) # 50ms delay
23+
await asyncio.sleep(sleep_time) # 50ms delay
1924
statistics.record_request_processing_finish('request_2')
2025
second_duration = statistics.state.request_max_duration
2126

2227
# The max duration should be updated to the longer request's duration
2328
assert second_duration is not None
2429
assert first_duration is not None
2530
assert second_duration >= first_duration
26-
assert second_duration.total_seconds() >= 0.05
31+
assert second_duration.total_seconds() >= (sleep_time - asyncio_sleep_time_tolerance)
2732

2833
# Record another short request - max should NOT decrease
2934
statistics.record_request_processing_start('request_3')

tests/unit/_utils/test_recurring_task.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@
77
import pytest
88

99
from crawlee._utils.recurring_task import RecurringTask
10+
from tests.unit.utils import run_alone_on_mac
1011

1112

1213
@pytest.fixture
@@ -41,6 +42,7 @@ async def test_start_and_stop(function: AsyncMock, delay: timedelta) -> None:
4142
assert rt.task.done()
4243

4344

45+
@run_alone_on_mac
4446
async def test_execution(function: AsyncMock, delay: timedelta) -> None:
4547
task = RecurringTask(function, delay)
4648

tests/unit/browsers/test_browser_pool.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55
import pytest
66

77
from crawlee.browsers import BrowserPool, PlaywrightBrowserPlugin
8+
from tests.unit.utils import run_alone_on_mac
89

910
if TYPE_CHECKING:
1011
from yarl import URL
@@ -92,6 +93,7 @@ async def test_new_page_with_each_plugin(server_url: URL) -> None:
9293
assert browser_pool.total_pages_count == 2
9394

9495

96+
@run_alone_on_mac
9597
async def test_with_default_plugin_constructor(server_url: URL) -> None:
9698
async with BrowserPool.with_default_plugin(headless=True, browser_type='firefox') as browser_pool:
9799
assert len(browser_pool.plugins) == 1

tests/unit/crawlers/_basic/test_basic_crawler.py

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1373,6 +1373,7 @@ async def test_timeout_in_handler(sleep_type: str) -> None:
13731373
# Test is skipped in older Python versions.
13741374
from asyncio import timeout # type:ignore[attr-defined] # noqa: PLC0415
13751375

1376+
non_realtime_system_coefficient = 2
13761377
handler_timeout = timedelta(seconds=1)
13771378
max_request_retries = 3
13781379
double_handler_timeout_s = handler_timeout.total_seconds() * 2
@@ -1401,7 +1402,7 @@ async def handler(context: BasicCrawlingContext) -> None:
14011402

14021403
# Timeout in pytest, because previous implementation would run crawler until following:
14031404
# "The request queue seems to be stuck for 300.0s, resetting internal state."
1404-
async with timeout(max_request_retries * double_handler_timeout_s):
1405+
async with timeout(max_request_retries * double_handler_timeout_s * non_realtime_system_coefficient):
14051406
await crawler.run(['https://a.placeholder.com'])
14061407

14071408
assert crawler.statistics.state.requests_finished == 1

tests/unit/utils.py

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
import sys
2+
3+
import pytest
4+
5+
run_alone_on_mac = pytest.mark.run_alone if sys.platform == 'darwin' else lambda x: x

0 commit comments

Comments
 (0)