Comprehensive strategies for testing slash commands before deployment and distribution.
Testing commands ensures they work correctly, handle edge cases, and provide good user experience. A systematic testing approach catches issues early and builds confidence in command reliability.
What to test:
- YAML frontmatter syntax
- Markdown format
- File location and naming
How to test:
# Validate YAML frontmatter
head -n 20 .claude/commands/my-command.md | grep -A 10 "^---"
# Check for closing frontmatter marker
head -n 20 .claude/commands/my-command.md | grep -c "^---" # Should be 2
# Verify file has .md extension
ls .claude/commands/*.md
# Check file is in correct location
test -f .claude/commands/my-command.md && echo "Found" || echo "Missing"Automated validation script:
#!/bin/bash
# validate-command.sh
COMMAND_FILE="$1"
if [ ! -f "$COMMAND_FILE" ]; then
echo "ERROR: File not found: $COMMAND_FILE"
exit 1
fi
# Check .md extension
if [[ ! "$COMMAND_FILE" =~ \.md$ ]]; then
echo "ERROR: File must have .md extension"
exit 1
fi
# Validate YAML frontmatter if present
if head -n 1 "$COMMAND_FILE" | grep -q "^---"; then
# Count frontmatter markers
MARKERS=$(head -n 50 "$COMMAND_FILE" | grep -c "^---")
if [ "$MARKERS" -ne 2 ]; then
echo "ERROR: Invalid YAML frontmatter (need exactly 2 '---' markers)"
exit 1
fi
echo "✓ YAML frontmatter syntax valid"
fi
# Check for empty file
if [ ! -s "$COMMAND_FILE" ]; then
echo "ERROR: File is empty"
exit 1
fi
echo "✓ Command file structure valid"What to test:
- Field types correct
- Values in valid ranges
- Required fields present (if any)
Validation script:
#!/bin/bash
# validate-frontmatter.sh
COMMAND_FILE="$1"
# Extract YAML frontmatter
FRONTMATTER=$(sed -n '/^---$/,/^---$/p' "$COMMAND_FILE" | sed '1d;$d')
if [ -z "$FRONTMATTER" ]; then
echo "No frontmatter to validate"
exit 0
fi
# Check 'model' field if present
if echo "$FRONTMATTER" | grep -q "^model:"; then
MODEL=$(echo "$FRONTMATTER" | grep "^model:" | cut -d: -f2 | tr -d ' ')
if ! echo "sonnet opus haiku" | grep -qw "$MODEL"; then
echo "ERROR: Invalid model '$MODEL' (must be sonnet, opus, or haiku)"
exit 1
fi
echo "✓ Model field valid: $MODEL"
fi
# Check 'allowed-tools' field format
if echo "$FRONTMATTER" | grep -q "^allowed-tools:"; then
echo "✓ allowed-tools field present"
# Could add more sophisticated validation here
fi
# Check 'description' length
if echo "$FRONTMATTER" | grep -q "^description:"; then
DESC=$(echo "$FRONTMATTER" | grep "^description:" | cut -d: -f2-)
LENGTH=${#DESC}
if [ "$LENGTH" -gt 80 ]; then
echo "WARNING: Description length $LENGTH (recommend < 60 chars)"
else
echo "✓ Description length acceptable: $LENGTH chars"
fi
fi
echo "✓ Frontmatter fields valid"What to test:
- Command appears in
/help - Command executes without errors
- Output is as expected
Test procedure:
# 1. Start Claude Code
claude --debug
# 2. Check command appears in help
> /help
# Look for your command in the list
# 3. Invoke command without arguments
> /my-command
# Check for reasonable error or behavior
# 4. Invoke with valid arguments
> /my-command arg1 arg2
# Verify expected behavior
# 5. Check debug logs
tail -f ~/.claude/debug-logs/latest
# Look for errors or warningsWhat to test:
- Positional arguments work ($1, $2, etc.)
- $ARGUMENTS captures all arguments
- Missing arguments handled gracefully
- Invalid arguments detected
Test matrix:
| Test Case | Command | Expected Result |
|---|---|---|
| No args | /cmd |
Graceful handling or useful message |
| One arg | /cmd arg1 |
$1 substituted correctly |
| Two args | /cmd arg1 arg2 |
$1 and $2 substituted |
| Extra args | /cmd a b c d |
All captured or extras ignored appropriately |
| Special chars | /cmd "arg with spaces" |
Quotes handled correctly |
| Empty arg | /cmd "" |
Empty string handled |
Test script:
#!/bin/bash
# test-command-arguments.sh
COMMAND="$1"
echo "Testing argument handling for /$COMMAND"
echo
echo "Test 1: No arguments"
echo " Command: /$COMMAND"
echo " Expected: [describe expected behavior]"
echo " Manual test required"
echo
echo "Test 2: Single argument"
echo " Command: /$COMMAND test-value"
echo " Expected: 'test-value' appears in output"
echo " Manual test required"
echo
echo "Test 3: Multiple arguments"
echo " Command: /$COMMAND arg1 arg2 arg3"
echo " Expected: All arguments used appropriately"
echo " Manual test required"
echo
echo "Test 4: Special characters"
echo " Command: /$COMMAND \"value with spaces\""
echo " Expected: Entire phrase captured"
echo " Manual test required"What to test:
- @ syntax loads file contents
- Non-existent files handled
- Large files handled appropriately
- Multiple file references work
Test procedure:
# Create test files
echo "Test content" > /tmp/test-file.txt
echo "Second file" > /tmp/test-file-2.txt
# Test single file reference
> /my-command /tmp/test-file.txt
# Verify file content is read
# Test non-existent file
> /my-command /tmp/nonexistent.txt
# Verify graceful error handling
# Test multiple files
> /my-command /tmp/test-file.txt /tmp/test-file-2.txt
# Verify both files processed
# Test large file
dd if=/dev/zero of=/tmp/large-file.bin bs=1M count=100
> /my-command /tmp/large-file.bin
# Verify reasonable behavior (may truncate or warn)
# Cleanup
rm /tmp/test-file*.txt /tmp/large-file.binWhat to test:
- ` commands execute correctly
- Command output included in prompt
- Command failures handled
- Security: only allowed commands run
Test procedure:
# Create test command with bash execution
cat > .claude/commands/test-bash.md << 'EOF'
---
description: Test bash execution
allowed-tools: Bash(echo:*), Bash(date:*)
---
Current date: `date`
Test output: `echo "Hello from bash"`
Analysis of output above...
EOF
# Test in Claude Code
> /test-bash
# Verify:
# 1. Date appears correctly
# 2. Echo output appears
# 3. No errors in debug logs
# Test with disallowed command (should fail or be blocked)
cat > .claude/commands/test-forbidden.md << 'EOF'
---
description: Test forbidden command
allowed-tools: Bash(echo:*)
---
Trying forbidden: `ls -la /`
EOF
> /test-forbidden
# Verify: Permission denied or appropriate errorWhat to test:
- Commands work with other plugin components
- Commands interact correctly with each other
- State management works across invocations
- Workflow commands execute in sequence
Test scenarios:
# Setup: Command that triggers a hook
# Test: Invoke command, verify hook executes
# Command: .claude/commands/risky-operation.md
# Hook: PreToolUse that validates the operation
> /risky-operation
# Verify: Hook executes and validates before command completes# Setup: Multi-command workflow
> /workflow-init
# Verify: State file created
> /workflow-step2
# Verify: State file read, step 2 executes
> /workflow-complete
# Verify: State file cleaned up# Setup: Command uses MCP tools
# Test: Verify MCP server accessible
> /mcp-command
# Verify:
# 1. MCP server starts (if stdio)
# 2. Tool calls succeed
# 3. Results included in outputCreate a test suite script:
#!/bin/bash
# test-commands.sh - Command test suite
TEST_DIR=".claude/commands"
FAILED_TESTS=0
echo "Command Test Suite"
echo "=================="
echo
for cmd_file in "$TEST_DIR"/*.md; do
cmd_name=$(basename "$cmd_file" .md)
echo "Testing: $cmd_name"
# Validate structure
if ./validate-command.sh "$cmd_file"; then
echo " ✓ Structure valid"
else
echo " ✗ Structure invalid"
((FAILED_TESTS++))
fi
# Validate frontmatter
if ./validate-frontmatter.sh "$cmd_file"; then
echo " ✓ Frontmatter valid"
else
echo " ✗ Frontmatter invalid"
((FAILED_TESTS++))
fi
echo
done
echo "=================="
echo "Tests complete"
echo "Failed: $FAILED_TESTS"
exit $FAILED_TESTSValidate commands before committing:
#!/bin/bash
# .git/hooks/pre-commit
echo "Validating commands..."
COMMANDS_CHANGED=$(git diff --cached --name-only | grep "\.claude/commands/.*\.md")
if [ -z "$COMMANDS_CHANGED" ]; then
echo "No commands changed"
exit 0
fi
for cmd in $COMMANDS_CHANGED; do
echo "Checking: $cmd"
if ! ./scripts/validate-command.sh "$cmd"; then
echo "ERROR: Command validation failed: $cmd"
exit 1
fi
done
echo "✓ All commands valid"Test commands in CI/CD:
# .github/workflows/test-commands.yml
name: Test Commands
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Validate command structure
run: |
for cmd in .claude/commands/*.md; do
echo "Testing: $cmd"
./scripts/validate-command.sh "$cmd"
done
- name: Validate frontmatter
run: |
for cmd in .claude/commands/*.md; do
./scripts/validate-frontmatter.sh "$cmd"
done
- name: Check for TODOs
run: |
if grep -r "TODO" .claude/commands/; then
echo "ERROR: TODOs found in commands"
exit 1
fiEmpty arguments:
> /cmd ""
> /cmd '' ''Special characters:
> /cmd "arg with spaces"
> /cmd arg-with-dashes
> /cmd arg_with_underscores
> /cmd arg/with/slashes
> /cmd 'arg with "quotes"'Long arguments:
> /cmd $(python -c "print('a' * 10000)")Unusual file paths:
> /cmd ./file
> /cmd ../file
> /cmd ~/file
> /cmd "/path with spaces/file"Bash command edge cases:
# Commands that might fail
`exit 1`
`false`
`command-that-does-not-exist`
# Commands with special output
`echo ""`
`cat /dev/null`
`yes | head -n 1000000`#!/bin/bash
# test-command-performance.sh
COMMAND="$1"
echo "Testing performance of /$COMMAND"
echo
for i in {1..5}; do
echo "Run $i:"
START=$(date +%s%N)
# Invoke command (manual step - record time)
echo " Invoke: /$COMMAND"
echo " Start time: $START"
echo " (Record end time manually)"
echo
done
echo "Analyze results:"
echo " - Average response time"
echo " - Variance"
echo " - Acceptable threshold: < 3 seconds for fast commands"# Monitor Claude Code during command execution
# In terminal 1:
claude --debug
# In terminal 2:
watch -n 1 'ps aux | grep claude'
# Execute command and observe:
# - Memory usage
# - CPU usage
# - Process count- Command name is intuitive
- Description is clear in
/help - Arguments are well-documented
- Error messages are helpful
- Output is formatted readably
- Long-running commands show progress
- Results are actionable
- Edge cases have good UX
Recruit testers:
# Testing Guide for Beta Testers
## Command: /my-new-command
### Test Scenarios
1. **Basic usage:**
- Run: `/my-new-command`
- Expected: [describe]
- Rate clarity: 1-5
2. **With arguments:**
- Run: `/my-new-command arg1 arg2`
- Expected: [describe]
- Rate usefulness: 1-5
3. **Error case:**
- Run: `/my-new-command invalid-input`
- Expected: Helpful error message
- Rate error message: 1-5
### Feedback Questions
1. Was the command easy to understand?
2. Did the output meet your expectations?
3. What would you change?
4. Would you use this command regularly?Before releasing a command:
- File in correct location
- Correct .md extension
- Valid YAML frontmatter (if present)
- Markdown syntax correct
- Command appears in
/help - Description is clear
- Command executes without errors
- Arguments work as expected
- File references work
- Bash execution works (if used)
- Missing arguments handled
- Invalid arguments detected
- Non-existent files handled
- Special characters work
- Long inputs handled
- Works with other commands
- Works with hooks (if applicable)
- Works with MCP (if applicable)
- State management works
- Performance acceptable
- No security issues
- Error messages helpful
- Output formatted well
- Documentation complete
- Tested by others
- Feedback incorporated
- README updated
- Examples provided
# Check file location
ls -la .claude/commands/my-command.md
# Check permissions
chmod 644 .claude/commands/my-command.md
# Check syntax
head -n 20 .claude/commands/my-command.md
# Restart Claude Code
claude --debug# Verify syntax
grep '\$1' .claude/commands/my-command.md
grep '\$ARGUMENTS' .claude/commands/my-command.md
# Test with simple command first
echo "Test: \$1 and \$2" > .claude/commands/test-args.md# Check allowed-tools
grep "allowed-tools" .claude/commands/my-command.md
# Verify command syntax
grep '!`' .claude/commands/my-command.md
# Test command manually
date
echo "test"# Check @ syntax
grep '@' .claude/commands/my-command.md
# Verify file exists
ls -la /path/to/referenced/file
# Check permissions
chmod 644 /path/to/referenced/file- Test early, test often: Validate as you develop
- Automate validation: Use scripts for repeatable checks
- Test edge cases: Don't just test the happy path
- Get feedback: Have others test before wide release
- Document tests: Keep test scenarios for regression testing
- Monitor in production: Watch for issues after release
- Iterate: Improve based on real usage data