feat: add smoke tests for CLI integration testing (#14)
* feat: add smoke tests for CLI integration testing
- Add smoke tests that verify end-to-end CLI functionality
- Test basic CLI operations (--version, --help, error handling)
- Test eval command with echo provider (no external dependencies)
- Test output formats (JSON, YAML, CSV)
- Test CLI flags (--repeat, --max-concurrency, --verbose, --no-cache)
- Test exit codes (0 for success, 100 for failures, 1 for errors)
- Test assertions (contains, icontains, failing assertions)
- Add pytest configuration with 'smoke' marker for selective testing
- Add comprehensive README documenting smoke test purpose and usage
Total: 20 smoke tests, all passing ✅
Smoke tests run against the installed promptfoo CLI via subprocess,
testing the Python wrapper integration with the Node.js CLI.
Run smoke tests:
    pytest tests/smoke/            # Run all smoke tests
    pytest tests/ -m smoke         # Run only smoke-marked tests
    pytest tests/ -m 'not smoke'   # Skip smoke tests (unit tests only)
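
A minimal sketch of what a subprocess-based smoke test of this kind might look like. The helper name `run_cli` and the constant names are hypothetical; only the exit-code values (0, 100, 1) come from the list above. The Python interpreter stands in for the `promptfoo` executable so the sketch runs without the CLI installed.

```python
import subprocess
import sys
from typing import List, Tuple

# Exit-code values as listed in this commit; the constant names are illustrative.
EXIT_SUCCESS = 0
EXIT_TEST_FAILURES = 100
EXIT_ERROR = 1


def run_cli(args: List[str], timeout: float = 60.0) -> Tuple[int, str, str]:
    """Run a command and return (exit_code, stdout, stderr)."""
    result = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout or "", result.stderr or ""


def test_version_flag_exits_cleanly() -> None:
    # In a real suite the command would be the installed CLI,
    # e.g. ["promptfoo", "--version"]; sys.executable is a stand-in here.
    code, stdout, stderr = run_cli([sys.executable, "--version"])
    assert code == EXIT_SUCCESS
    assert "Python" in (stdout + stderr)


def test_failure_exit_code_is_propagated() -> None:
    # Simulates a CLI run that exits with the "failures" code.
    code, _, _ = run_cli([sys.executable, "-c", "import sys; sys.exit(100)"])
    assert code == EXIT_TEST_FAILURES
```

In a pytest suite, functions like these would additionally carry the `smoke` marker so that `-m smoke` and `-m 'not smoke'` can select or exclude them.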
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* ci: run unit tests and smoke tests in CI
Previously, CI only exercised CLI invocation and did not run pytest.
Changes:
- Install dev dependencies (pytest, mypy, ruff) in test jobs
- Run unit tests with: pytest tests/ -v -m 'not smoke'
- Run smoke tests with: pytest tests/smoke/ -v
- Both 'test' and 'test-npx-fallback' jobs now run full test suite
This ensures:
✅ Unit tests run on all platforms (Ubuntu, Windows) and Python versions (3.9, 3.13)
✅ Smoke tests verify end-to-end CLI functionality
✅ Both global install and npx fallback paths are tested
* fix: use Optional for Python 3.9 compatibility in smoke tests
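
For context, the `X | None` annotation syntax is only evaluated successfully at runtime from Python 3.10 onward; on 3.9, `typing.Optional` is the portable spelling. The function below is a hypothetical example of such an annotation, not code from the test suite.

```python
from typing import Optional


def first_line(output: Optional[str]) -> Optional[str]:
    """Return the first non-blank line of output, or None if there is none."""
    if not output:
        return None
    for line in output.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped
    return None
```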
* fix: make platform-specific tests work on both Unix and Windows
- Split test_split_path into platform-specific versions (Unix/Windows)
- Split test_find_external_promptfoo_prevents_recursion for platform paths
- Use platform-appropriate node path in test_main_exits_when_neither_external_nor_npx_available
- Tests now skip appropriately on incompatible platforms
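
A sketch of the platform-split pattern described above. The `split_path` body is an assumed stand-in (the real helper is not shown in this commit), and the standard-library `unittest` skip decorators are used to keep the example dependency-free; `pytest.mark.skipif` plays the same role in a pytest suite.

```python
import os
import sys
import unittest
from typing import List


def split_path(path_value: str) -> List[str]:
    """Hypothetical helper: split a PATH-style string, dropping empty entries."""
    return [entry for entry in path_value.split(os.pathsep) if entry]


class TestSplitPath(unittest.TestCase):
    @unittest.skipIf(sys.platform == "win32", "Unix-style PATH separator")
    def test_split_path_unix(self) -> None:
        self.assertEqual(split_path("/usr/bin::/bin"), ["/usr/bin", "/bin"])

    @unittest.skipUnless(sys.platform == "win32", "Windows-style PATH separator")
    def test_split_path_windows(self) -> None:
        self.assertEqual(
            split_path(r"C:\Windows;;C:\nodejs"),
            [r"C:\Windows", r"C:\nodejs"],
        )
```

Exactly one of the two tests runs on any given platform; the other is reported as skipped rather than failed.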
* fix: increase smoke test timeout for npx fallback scenarios
The first npx call can be slow as it downloads promptfoo.
Increased timeout from 60s to 120s to accommodate this.
* fix: handle None stdout/stderr in smoke tests
Add safety checks for None values from subprocess.run() output,
which can occur on Windows in certain error conditions.
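
One way such a safety check might look, as a sketch: the helper name `combined_output` is hypothetical, and the `CompletedProcess` is constructed by hand to mimic a run whose captured streams came back as `None`.

```python
import subprocess
from typing import Optional


def combined_output(stdout: Optional[str], stderr: Optional[str]) -> str:
    """Concatenate captured streams, treating None as an empty string."""
    return (stdout or "") + (stderr or "")


# Simulated result with no captured output.
result = subprocess.CompletedProcess(
    args=["promptfoo", "--version"], returncode=1, stdout=None, stderr=None
)
output = combined_output(result.stdout, result.stderr)
assert "error" not in output.lower()  # safe: output is "", not None
```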
* fix: address linting issues and add temp output to gitignore
- Fix line too long (123 > 120) in test_cli.py
- Run ruff format on test files
- Add tests/smoke/.temp-output/ to .gitignore
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: update AGENTS.md with smoke test documentation
- Add comprehensive testing strategy section with unit vs smoke tests
- Document test directory structure
- Add smoke test details and commands
- Update CI/CD section to mention both test types
- Update project structure to include tests directory
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* style: add return type annotations and fix documentation wording
- Add `-> None` return type annotations to all smoke test methods
- Add Generator return type to setup_and_teardown fixture
- Update documentation to clarify that tests run via the Python wrapper (not just npx)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: resolve Windows CI test failures
- Add os.path.isfile mock to unit test to prevent _find_windows_promptfoo() from finding real promptfoo installations on Windows CI runners
- Add UTF-8 encoding with error replacement to smoke tests to handle Windows cp1252 encoding issues with npx output
- Add warmup_npx fixture to pre-download promptfoo via npx before tests, preventing timeout on first test when npx needs to download package
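
The encoding fix can be illustrated with a small sketch. The helper name `run_decoded` is hypothetical, and a Python child process emitting raw bytes stands in for npx output; the point is that `encoding="utf-8"` with `errors="replace"` decodes independently of the platform default (such as cp1252) and never raises on invalid bytes.

```python
import subprocess
import sys
from typing import List


def run_decoded(args: List[str]) -> str:
    """Run a command and decode stdout as UTF-8, replacing undecodable bytes."""
    result = subprocess.run(
        args,
        capture_output=True,
        encoding="utf-8",   # do not rely on the locale's default codec
        errors="replace",   # invalid bytes become U+FFFD instead of raising
    )
    return result.stdout or ""


# Child writes a UTF-8 check mark followed by one invalid byte (0xFF).
CHILD = r"import sys; sys.stdout.buffer.write(b'ok \xe2\x9c\x85 \xff')"
decoded = run_decoded([sys.executable, "-c", CHILD])
```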
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: mock telemetry in CLI unit tests
Add record_wrapper_used mock to tests that mock subprocess.run to prevent
PostHog telemetry calls from interfering with mock call counts.
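
A sketch of the mocking pattern, under stated assumptions: the `Wrapper` class below is a hypothetical stand-in for the wrapper module, and the signature and behavior of `record_wrapper_used` are invented for illustration (here it is assumed to reach `subprocess.run` itself, which is one way telemetry could inflate a mock's call count).

```python
import subprocess
from unittest.mock import MagicMock, patch


class Wrapper:
    """Hypothetical stand-in for the CLI wrapper module."""

    @staticmethod
    def record_wrapper_used(mode: str) -> None:
        # Assumed behavior: telemetry that would add an extra subprocess call.
        subprocess.run(["telemetry-client", mode])

    @staticmethod
    def main() -> int:
        Wrapper.record_wrapper_used("global")
        return subprocess.run(["promptfoo", "--version"]).returncode


def test_main_calls_cli_once() -> None:
    with patch.object(Wrapper, "record_wrapper_used") as mock_telemetry, \
         patch("subprocess.run", return_value=MagicMock(returncode=0)) as mock_run:
        assert Wrapper.main() == 0
    # With telemetry mocked out, only the real CLI invocation is counted.
    mock_run.assert_called_once_with(["promptfoo", "--version"])
    mock_telemetry.assert_called_once_with("global")
```

Without the `record_wrapper_used` patch, `mock_run` would record two calls in this sketch and `assert_called_once_with` would fail.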
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>