
Commit 4d35a18

test(pickle): add Linux-only test for inotify instance exhaustion (Issue #24) (#275)
* test(pickle): add Linux-only test for inotify instance exhaustion (Issue #24)

  Adds a test to reproduce the 'inotify instance limit reached' error on Linux by rapidly creating many concurrent cache waits using the pickle backend. The test is skipped on non-Linux systems and is designed to be informative, not flaky.

  Also updates CI to lower the inotify instance limit on Linux/local jobs, increasing the likelihood of hitting the error in CI.

  References: https://github.com/python-cachier/cachier/issues/24

  - Follows project and Python best practices for test isolation, OS-specific logic, and CI configuration.
  - Test is marked with @pytest.mark.pickle and uses pytest.skip for clarity.
  - CI change is Linux/local only, with clear comments.

  This helps document and track the resource exhaustion issue, and provides a reproducible test for future backend or watchdog improvements.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* style(tests): fix long lines for Ruff E501 compliance in test_pickle_core.py

* style(tests): split long lines in test_pickle_core.py to satisfy Ruff E501

* Make inotify test more aggressive and force failure instead of skip

  - Increase thread count to 4x the system limit (up to 4096)
  - Make the slow function slower (0.5s instead of 0.1s)
  - Increase wait_for_calc_timeout to 0.1s
  - Force test to fail instead of skip when limit not hit
  - Add more debugging output to understand what's happening

  This should help reproduce the inotify instance limit issue in CI.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* Reverse inotify test logic: now fails when bug exists, passes when fixed

  - Test now FAILS when inotify instance limit is reached (bug exists)
  - Test now PASSES when no inotify errors occur (bug is fixed)
  - This makes the test useful for verifying fixes to the inotify issue
  - Updated error messages to reflect the new logic

* Add xfail marker to inotify test

  - Test is now marked as expected to fail with @pytest.mark.xfail
  - Will show as XFAIL when bug exists (expected)
  - Will show as XPASS when bug is fixed (pleasant surprise)
  - Provides clear reason for expected failure

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* Fix all ruff lint errors in inotify test (line length, subprocess path, ternary, string wrapping)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent e70f9a7 commit 4d35a18

2 files changed

Lines changed: 111 additions & 0 deletions


.github/workflows/ci-test.yml

Lines changed: 5 additions & 0 deletions
@@ -56,6 +56,11 @@ jobs:
           python -m pip install --upgrade pip
           python -m pip install -e . -r tests/requirements.txt
 
+      - name: Lower inotify instance limit (Linux only, for Issue #24 test)
+        if: runner.os == 'Linux' && matrix.backend == 'local'
+        run: sudo sysctl -w fs.inotify.max_user_instances=128
+        # This helps the test_inotify_instance_limit_reached hit the limit in CI
+
       - name: Unit tests (local)
         if: matrix.backend == 'local'
         run: pytest -m "not mongo and not sql and not redis" --cov=cachier --cov-report=term --cov-report=xml:cov.xml
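
To illustrate how this CI step interacts with the new test in tests/test_pickle_core.py: the test starts four times as many threads as the detected inotify instance limit, capped at 4096. The sketch below only restates that expression. The helper name `planned_thread_count` and its `factor` and `cap` parameters are hypothetical and not part of the commit.

```python
from typing import Optional


def planned_thread_count(
    limit: Optional[int], factor: int = 4, cap: int = 4096
) -> int:
    """Restate the test's expression for N (hypothetical helper).

    `limit` is the value of fs.inotify.max_user_instances, or None when
    it could not be determined.
    """
    return min(limit * factor, cap) if limit is not None else cap


if __name__ == "__main__":
    print(planned_thread_count(128))   # 512: limit set by the CI step
    print(planned_thread_count(None))  # 4096: limit could not be read
    print(planned_thread_count(2048))  # 4096: capped
```

With the limit lowered to 128, the test would start 512 threads, which is well above the limit but far fewer than the 4096 threads used when the limit is unknown.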

tests/test_pickle_core.py

Lines changed: 106 additions & 0 deletions
@@ -26,6 +26,7 @@
     import Queue as queue  # type: ignore
 
 import hashlib
+import sys
 
 import pandas as pd
 
@@ -607,3 +608,108 @@ def _params_with_dataframe(*args, **kwargs):
     value_b = _params_with_dataframe(1, df=df_b)
 
     assert value_a == value_b  # same content --> same key
+
+
+@pytest.mark.pickle
+@pytest.mark.skipif(
+    not sys.platform.startswith("linux"),
+    reason="inotify instance limit is only relevant on Linux",
+)
+@pytest.mark.xfail(
+    reason=(
+        "inotify instance limit issue not yet fixed - test will pass "
+        "when issue is resolved"
+    )
+)
+def test_inotify_instance_limit_reached():
+    """Reproduces the inotify instance exhaustion issue (see Issue #24).
+
+    Rapidly creates many cache waits to exhaust inotify instances.
+    Reference: https://github.com/python-cachier/cachier/issues/24
+
+    """
+    import queue
+    import subprocess
+    import time
+
+    # Try to get the current inotify limit
+    try:
+        result = subprocess.run(
+            ["/bin/cat", "/proc/sys/fs/inotify/max_user_instances"],
+            capture_output=True,
+            text=True,
+            timeout=5,
+        )
+        if result.returncode == 0:
+            current_limit = int(result.stdout.strip())
+            print(f"Current inotify max_user_instances limit: {current_limit}")
+        else:
+            current_limit = None
+            print("Could not determine inotify limit")
+    except Exception as e:
+        current_limit = None
+        print(f"Error getting inotify limit: {e}")
+
+    @cachier(backend="pickle", wait_for_calc_timeout=0.1)
+    def slow_func(x):
+        time.sleep(0.5)  # Make it slower to increase chance of hitting limit
+        return x
+
+    # Start many threads to trigger wait_on_entry_calc
+    threads = []
+    errors = []
+    results = queue.Queue()
+
+    # Be more aggressive - try to exhaust the limit
+    N = (
+        min(current_limit * 4, 4096) if current_limit is not None else 4096
+    )  # Try to exceed the limit more aggressively
+    print(f"Starting {N} threads to test inotify exhaustion")
+
+    def call():
+        try:
+            results.put(slow_func(1))
+        except OSError as e:
+            errors.append(e)
+        except Exception as e:
+            # Capture any other exceptions for debugging
+            errors.append(e)
+
+    for i in range(N):
+        t = threading.Thread(target=call)
+        threads.append(t)
+        t.start()
+        if i % 100 == 0:
+            print(f"Started {i} threads...")
+
+    print("Waiting for all threads to complete...")
+    for t in threads:
+        t.join()
+
+    print(
+        f"Test completed. Got {len(errors)} errors, {results.qsize()} results"
+    )
+
+    # If any OSError with "inotify instance limit reached" is raised,
+    # the test FAILS (expected failure due to the bug)
+    if any("inotify instance limit reached" in str(e) for e in errors):
+        print(
+            "FAILURE: Hit inotify instance limit - this indicates the bug "
+            "still exists"
+        )
+        raise AssertionError(
+            "inotify instance limit reached error occurred. "
+            f"Got {len(errors)} errors with inotify limit issues."
+        )
+
+    # If no inotify errors but other errors, fail
+    if errors:
+        print(f"Unexpected errors occurred: {errors}")
+        raise AssertionError(f"Unexpected OSErrors: {errors}")
+
+    # If no errors at all, the test PASSES (issue is fixed!)
+    print(
+        "SUCCESS: No inotify instance limit errors occurred - the issue "
+        "appears to be fixed!"
+    )
+    # No need to return - test passes naturally
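
The test obtains the limit by running /bin/cat in a subprocess. One possible alternative, sketched here as an assumption and not as part of this commit, is to read the /proc file directly, which avoids spawning a process and keeps the fallback to None in one place. The function name `read_inotify_instance_limit` is hypothetical; only the file path comes from the test.

```python
from pathlib import Path
from typing import Optional

# Path taken from the test in this commit; everything else is illustrative.
INOTIFY_LIMIT_PATH = "/proc/sys/fs/inotify/max_user_instances"


def read_inotify_instance_limit(
    path: str = INOTIFY_LIMIT_PATH,
) -> Optional[int]:
    """Return the limit as an int, or None if it cannot be read or parsed."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # OSError: file missing or unreadable (for example, not on Linux).
        # ValueError: file content is not an integer.
        return None


if __name__ == "__main__":
    limit = read_inotify_instance_limit()
    print(f"inotify max_user_instances: {limit}")
```

On systems without the /proc entry the function returns None, which matches the test's behaviour of falling back to 4096 threads when the limit is unknown.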
