|
2 | 2 |
|
3 | 3 | This directory contains a mock project to verify that LLM agents correctly identify and suggest custom extension commands defined in `.specify/extensions.yml`. |
4 | 4 |
|
5 | | -## The Test |
| 5 | +# Testing Extension Commands |
| 6 | + |
| 7 | +This directory contains a mock project to verify that LLM agents correctly identify and execute custom extension commands defined in `.specify/extensions.yml`. |
| 8 | + |
| 9 | +## Automated LLM Test Execution |
| 10 | + |
| 11 | +To execute this test, copy the following block and paste it into GitHub Copilot Chat (or your LLM of choice) while focused on this `TESTING.md` file: |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +**PROMPT TO COPY:** |
| 16 | + |
| 17 | +```prompt |
| 18 | +Act as an automated test runner (like Ginkgo or pytest) evaluating your own comprehension of the current workspace. I want you to run the following test suite on the `.specify/extensions.yml` file in this directory and output the results strictly in a terminal-style test output format. |
| 19 | +
|
| 20 | +**Test Suite Context**: |
| 21 | +You are evaluating if you can correctly parse and execute custom extension commands defined by Spec Kit extensions. |
| 22 | +
|
| 23 | +**Test Cases to Evaluate**: |
| 24 | +1. [Test Case 1] "Discovery Validation": Read `.specify/extensions.yml`. Verify that you can find two custom commands: `/ext.lint` and `/ext.deploy`. If you can, mark this test as PASS. If you cannot find them, mark as FAIL. |
| 25 | +2. [Test Case 2] "Intent Binding": Pretend to execute the `/ext.lint` command. Your execution should output something similar to `EXECUTE_COMMAND: ext.lint`. If you understand that `/ext.lint` maps to the `custom_lint` object in the YAML file, mark as PASS. If you don't know what to do, mark as FAIL. |
| 26 | +
|
| 27 | +**Required Output Format**: |
| 28 | +Provide your output exactly like this example format, replacing the bracketed content with your actual evaluation logic: |
| 29 | +
|
| 30 | +============================= test session starts ============================== |
| 31 | +collected 2 items |
| 32 | +
|
| 33 | +test_commands_discovery.py::test_discovery [PASS/FAIL] |
| 34 | + Details: [Provide 1-2 sentences proving you found the commands and their descriptions] |
| 35 | +
|
| 36 | +test_commands_execution.py::test_intent_binding [PASS/FAIL] |
| 37 | + Details: [Provide the simulated output of executing the command] |
6 | 38 |
|
7 | | -1. Open a chat with an LLM (like GitHub Copilot) in this project. |
8 | | -2. Ask it what extension commands are available in this directory: |
9 | | - > "What custom extension commands are available in this directory according to the `.specify/extensions.yml` file? Can you list them?" |
10 | | -3. **Expected Behavior**: |
11 | | - - The LLM should read `.specify/extensions.yml` and identify the two custom commands: `/ext.lint` and `/ext.deploy`. |
12 | | - - It should list their descriptions and prompts. |
| 39 | +============================== [X] passed in 0.0s ============================== |
| 40 | +``` |
13 | 41 |
|
14 | | -4. Next, test its comprehension of executing a command: |
15 | | - > "Please pretend to execute `/ext.lint`." |
16 | | -5. **Expected Behavior**: |
17 | | - - The LLM should output that it is executing the command, simulating output similar to `EXECUTE_COMMAND: ext.lint`. |
18 | | - - Since it's an LLM, it might playfully simulate fixing imaginary formatting in `main.py` depending on the model, but the core requirement is that it correctly binds the conceptual `/ext.lint` string to the `custom_lint` object in yaml. |
| 42 | +--- |
19 | 43 |
|
20 | 44 | ## Validation Goals |
21 | | -This playground ensures that AI Agents, which do not run strict compiled Spec Kit binaries, can still integrate with the broader extension ecosystem natively just by reading the `.specify/` configuration maps. |
| 45 | +This playground ensures that AI agents, which do not run compiled Spec Kit binaries, can still integrate with the broader extension ecosystem simply by reading the `.specify/` configuration maps. It also checks that LLMs can report their own comprehension in the output format of a familiar test runner. |
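
Because the result is self-reported by the model, it can help to check the pasted response mechanically rather than by eye. The following is a minimal sketch, assuming the response follows the required output format from the prompt. The helper names `parse_report` and `all_passed` are hypothetical and not part of Spec Kit; the two test IDs are taken from the prompt above.

```python
import re

# Test IDs as they appear in the required output format of the prompt.
EXPECTED_TESTS = (
    "test_commands_discovery.py::test_discovery",
    "test_commands_execution.py::test_intent_binding",
)

# Matches lines such as "file.py::test_name PASS" or "file.py::test_name [FAIL]".
# An unfilled template line ending in "[PASS/FAIL]" does not match.
RESULT_LINE = re.compile(r"^(\S+::\S+)\s+\[?(PASS|FAIL)\]?$")


def parse_report(report: str) -> dict[str, str]:
    """Map each reported test ID to "PASS" or "FAIL" (hypothetical helper)."""
    results = {}
    for line in report.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            results[match.group(1)] = match.group(2)
    return results


def all_passed(report: str, expected=EXPECTED_TESTS) -> bool:
    """Return True only if every expected test is present and marked PASS."""
    results = parse_report(report)
    return all(results.get(test_id) == "PASS" for test_id in expected)
```

Calling `all_passed(response_text)` on the text copied from the chat returns `False` if either test is missing, marked FAIL, or left as the unfilled `[PASS/FAIL]` placeholder. This only validates the shape of the report; it does not prove the model actually read `.specify/extensions.yml`.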