feat: add smoke tests for CLI integration testing #14
Merged
Changes from 10 commits (14 commits in total)
Commits
910f7c4 feat: add smoke tests for CLI integration testing (mldangelo)
60dff7d ci: run unit tests and smoke tests in CI (mldangelo)
6193feb fix: use Optional for Python 3.9 compatibility in smoke tests (mldangelo)
3f4e9fd fix: make platform-specific tests work on both Unix and Windows (mldangelo)
9cd4d11 fix: increase smoke test timeout for npx fallback scenarios (mldangelo)
13c1f1d fix: handle None stdout/stderr in smoke tests (mldangelo)
b5a25cb Merge branch 'main' into feat/add-smoke-tests (mldangelo)
7b188ee Merge remote-tracking branch 'origin/main' into pr-14 (mldangelo)
874955b fix: address linting issues and add temp output to gitignore (mldangelo)
055b211 docs: update AGENTS.md with smoke test documentation (mldangelo)
44cdf96 style: add return type annotations and fix documentation wording (mldangelo)
62ae7d5 Merge origin/main into pr-14 (mldangelo)
79e74ac fix: resolve Windows CI test failures (mldangelo)
02acd12 fix: mock telemetry in CLI unit tests (mldangelo)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -42,6 +42,7 @@ htmlcov/ | ||
| .tox/ | ||
| .mypy_cache/ | ||
| .ruff_cache/ | ||
| tests/smoke/.temp-output/ | ||
| | ||
| # Distribution | ||
| dist/ | ||
| | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| # Smoke Tests | ||
| | ||
| These smoke tests verify that the core promptfoo CLI functionality works correctly through the Python wrapper. | ||
| | ||
| ## What are Smoke Tests? | ||
| | ||
| Smoke tests are high-level integration tests that verify the most critical functionality works end-to-end. They: | ||
| | ||
| - Run against the actual installed CLI (via `npx promptfoo`) | ||
| - Test the Python wrapper integration with the Node.js CLI | ||
| - Use the `echo` provider to avoid external API dependencies | ||
| - Verify command-line arguments, file I/O, and output formats | ||
| - Check exit codes and error handling | ||
| | ||
| ## Running Smoke Tests | ||
| | ||
| ```bash | ||
| # Run all smoke tests | ||
| pytest tests/smoke/ | ||
| | ||
| # Run with verbose output | ||
| pytest tests/smoke/ -v | ||
| | ||
| # Run a specific test class | ||
| pytest tests/smoke/test_smoke.py::TestEvalCommand | ||
| | ||
| # Run a specific test | ||
| pytest tests/smoke/test_smoke.py::TestEvalCommand::test_basic_eval | ||
| ``` | ||
| | ||
| ## Test Structure | ||
| | ||
| - `test_smoke.py` - Main smoke test suite | ||
| - `fixtures/` - Test configuration files | ||
|   - `configs/` - YAML configuration files for testing | ||
| | ||
| ## Test Coverage | ||
| | ||
| ### Basic CLI Operations | ||
| - Version flag (`--version`) | ||
| - Help output (`--help`, `eval --help`) | ||
| - Unknown command handling | ||
| - Missing file error handling | ||
| | ||
| ### Eval Command | ||
| - Basic evaluation with echo provider | ||
| - Output formats (JSON, YAML, CSV) | ||
| - Command-line flags (`--max-concurrency`, `--repeat`, `--verbose`) | ||
| - Cache control (`--no-cache`) | ||
| | ||
| ### Exit Codes | ||
| - Exit code 0 for success | ||
| - Exit code 100 for assertion failures | ||
| - Exit code 1 for configuration errors | ||
| | ||
| ### Echo Provider | ||
| - Basic prompt echoing | ||
| - Variable substitution | ||
| - Multiple variable handling | ||
| | ||
| ### Assertions | ||
| - `contains` assertion | ||
| - `icontains` assertion (case-insensitive) | ||
| - Multiple assertions per test | ||
| - Failing assertion behavior | ||
| | ||
| ## Why Echo Provider? | ||
| | ||
| The `echo` provider is perfect for smoke tests because: | ||
| | ||
| 1. **No external dependencies** - Doesn't require API keys or network calls | ||
| 2. **Deterministic** - Always returns the same output for the same input | ||
| 3. **Fast** - No network latency | ||
| 4. **Predictable** - Easy to write assertions against | ||
| | ||
| ## Adding New Smoke Tests | ||
| | ||
| 1. Create a new test config in `fixtures/configs/` if needed | ||
| 2. Add test methods to the appropriate test class in `test_smoke.py` | ||
| 3. Use the `run_promptfoo()` helper to execute CLI commands | ||
| 4. Make assertions on stdout, stderr, exit codes, and output files | ||
| | ||
| ## Notes | ||
| | ||
| - Smoke tests run slower than unit tests (they spawn subprocesses) | ||
| - They require Node.js and promptfoo to be installed | ||
| - They test the integration between Python and Node.js | ||
| - They should be kept focused on critical functionality | ||
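The README above asks new tests to go through a `run_promptfoo()` helper and to assert on stdout, stderr, and exit codes. The helper's real signature is not shown in this diff, so the following is only a minimal sketch of what such a helper could look like. The names `run_cli` and `CliResult`, the `base_cmd` parameter, the `["promptfoo"]` default, and the 120-second timeout are all assumptions made for illustration. It normalizes `None` streams to empty strings, which is the kind of problem the "handle None stdout/stderr" commit title points at.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CliResult:
    """Captured outcome of one CLI invocation (hypothetical type)."""

    returncode: int
    stdout: str
    stderr: str


def run_cli(
    args: List[str],
    base_cmd: Optional[List[str]] = None,
    timeout: float = 120.0,
) -> CliResult:
    """Run a command and capture its output as text.

    base_cmd is an assumption: the real suite goes through the Python
    wrapper, while this sketch lets the caller inject any command prefix.
    """
    cmd = (base_cmd if base_cmd is not None else ["promptfoo"]) + args
    proc = subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # tests inspect non-zero exit codes themselves
    )
    # Guard against None so later string assertions never raise TypeError.
    return CliResult(proc.returncode, proc.stdout or "", proc.stderr or "")


if __name__ == "__main__":
    # Stand-in command so the sketch runs without Node.js installed.
    result = run_cli(["-c", "print('Hello World')"], base_cmd=[sys.executable])
    print(result.returncode, result.stdout.strip())
```

Injecting the command prefix keeps the sketch runnable anywhere; a real smoke test would leave it at whatever the wrapper resolves to and assert on `returncode`, `stdout`, and `stderr` in the same way.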
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| """Smoke tests for promptfoo CLI.""" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # yaml-language-server: $schema=https://promptfoo.dev/config-schema.json | ||
| description: 'Smoke test - multiple assertions' | ||
| | ||
| providers: | ||
|   - echo | ||
| | ||
| prompts: | ||
|   - 'Hello {{name}}, welcome to {{place}}' | ||
| | ||
| tests: | ||
|   - vars: | ||
|       name: Alice | ||
|       place: Wonderland | ||
|     assert: | ||
|       - type: contains | ||
|         value: Hello | ||
|       - type: contains | ||
|         value: Alice | ||
|       - type: contains | ||
|         value: Wonderland | ||
|       - type: icontains | ||
|         value: WELCOME | ||
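To make the expected outcome of this config concrete, here is a simplified model of what the test exercises: the prompt template is rendered with the test's variables, the `echo` provider returns the rendered prompt unchanged, and each assertion is checked against that output. This is not promptfoo's implementation, which runs in Node.js. The names `render`, `echo_provider`, and `check` are hypothetical, and only plain `{{var}}` substitution and the two assertion types used above are modeled.

```python
import re
from typing import Dict


def render(template: str, variables: Dict[str, str]) -> str:
    """Replace each {{name}} placeholder with its value (simplified)."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: variables[match.group(1)],
        template,
    )


def echo_provider(prompt: str) -> str:
    """Model of the echo provider: return the prompt unchanged."""
    return prompt


def check(assertion: Dict[str, str], output: str) -> bool:
    """Evaluate one assertion of type 'contains' or 'icontains'."""
    kind, value = assertion["type"], assertion["value"]
    if kind == "contains":
        return value in output
    if kind == "icontains":
        return value.lower() in output.lower()
    raise ValueError(f"unsupported assertion type in this sketch: {kind}")


output = echo_provider(
    render(
        "Hello {{name}}, welcome to {{place}}",
        {"name": "Alice", "place": "Wonderland"},
    )
)
assertions = [
    {"type": "contains", "value": "Hello"},
    {"type": "contains", "value": "Alice"},
    {"type": "contains", "value": "Wonderland"},
    {"type": "icontains", "value": "WELCOME"},
]
results = [check(a, output) for a in assertions]
print(output)        # Hello Alice, welcome to Wonderland
print(all(results))  # True
```

The last assertion passes only because `icontains` ignores case; a plain `contains` check for `WELCOME` would fail against the lowercase "welcome" in the output.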
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| # yaml-language-server: $schema=https://promptfoo.dev/config-schema.json | ||
| description: 'Smoke test - basic config validation' | ||
| | ||
| providers: | ||
|   - echo | ||
| | ||
| prompts: | ||
|   - 'Hello {{name}}' | ||
| | ||
| tests: | ||
|   - vars: | ||
|       name: World | ||
|     assert: | ||
|       - type: contains | ||
|         value: Hello | ||
|       - type: contains | ||
|         value: World | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| # yaml-language-server: $schema=https://promptfoo.dev/config-schema.json | ||
| description: 'Smoke test - config with failing assertion' | ||
| | ||
| providers: | ||
|   - echo | ||
| | ||
| prompts: | ||
|   - 'Hello {{name}}' | ||
| | ||
| tests: | ||
|   - vars: | ||
|       name: World | ||
|     assert: | ||
|       # This assertion will fail because echo returns "Hello World" | ||
|       # but we're asserting it contains "IMPOSSIBLE_STRING_NOT_IN_OUTPUT" | ||
|       - type: contains | ||
|         value: IMPOSSIBLE_STRING_NOT_IN_OUTPUT_12345 | ||
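The README in this PR lists three exit codes: 0 for success, 100 for assertion failures, and 1 for configuration errors. By that table, a run of the config above would be expected to exit with 100. The sketch below shows one way to name those codes in test code so assertions read clearly. The `ExitCode` enum and the `describe_exit` helper are illustrative assumptions, not part of the PR.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed in the smoke test README (names are hypothetical)."""

    SUCCESS = 0
    CONFIG_ERROR = 1
    ASSERTION_FAILURE = 100


def describe_exit(returncode: int) -> str:
    """Map a process return code to a readable label for test failure messages."""
    try:
        return ExitCode(returncode).name
    except ValueError:
        # Any code outside the documented set is reported verbatim.
        return f"UNKNOWN({returncode})"


print(describe_exit(100))  # ASSERTION_FAILURE
```

Because `ExitCode` is an `IntEnum`, a test can compare it directly against a subprocess return code, for example `returncode == ExitCode.ASSERTION_FAILURE`.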
The documentation states tests run "via `npx promptfoo`" but they actually run via the Python wrapper, which may use either a globally installed promptfoo or fall back to npx. Consider updating to "via the Python wrapper (using either global promptfoo or npx)" for accuracy.
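The resolution order the reviewer describes can be sketched as follows. The wrapper's actual code is not part of this excerpt, so `resolve_promptfoo_command` and its injectable `which` parameter are assumptions. The sketch only illustrates the "global install first, npx as fallback" order named in the comment, and any extra flags the real wrapper may pass to npx are left out.

```python
import shutil
from typing import Callable, List, Optional


def resolve_promptfoo_command(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a command prefix for running promptfoo (illustrative only)."""
    # Prefer a promptfoo executable already available on PATH.
    global_binary = which("promptfoo")
    if global_binary:
        return [global_binary]
    # Otherwise fall back to running promptfoo through npx.
    npx = which("npx")
    if npx:
        return [npx, "promptfoo"]
    raise FileNotFoundError("neither promptfoo nor npx was found on PATH")
```

Passing `which` as a parameter lets a unit test simulate both branches without changing the real PATH, for example by handing in the `get` method of a small dictionary.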