This guide helps you debug common issues with E2E tests and provides cleanup procedures.
If infrastructure lifecycle tests fail and leave LXD resources behind:
# Check running containers
lxc list
# Stop and delete the test container
lxc stop torrust-tracker-vm
lxc delete torrust-tracker-vm
# Or use OpenTofu to clean up
cd build/tofu/lxd
tofu destroy -auto-approve

If deployment workflow tests fail and leave Docker resources behind:
# Check running containers
docker ps -a
# Stop and remove test containers (xargs -r skips the command when nothing matches)
docker ps -q --filter "ancestor=torrust-provisioned-instance" | xargs -r docker stop
docker ps -aq --filter "ancestor=torrust-provisioned-instance" | xargs -r docker rm
# Remove test images if needed
docker rmi torrust-provisioned-instance

LXD daemon not running:
sudo systemctl start lxd

Insufficient privileges:
- Ensure your user is in the lxd group
- May need to log out and back in after adding to group
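The group membership check above can be scripted. The following is a minimal sketch, not part of the project: `has_group` is a hypothetical helper name, and it assumes `id -nG` prints the user's group names as a space-separated list, as it does on typical Linux systems.

```bash
# Hypothetical helper: succeeds if the space-separated group list in $1
# contains the group named in $2.
has_group() {
  local group_list="$1" wanted="$2" g
  for g in $group_list; do
    if [ "$g" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# Example usage (id -nG prints the current user's group names):
# if has_group "$(id -nG)" lxd; then
#   echo "current session is in the lxd group"
# else
#   echo "current session is NOT in the lxd group"
# fi
```

Because `id -nG` reports the groups of the current session, a negative result right after adding the user to the group usually means the session has not been restarted yet.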
OpenTofu state corruption:
# Delete corrupted state and retry
rm build/tofu/lxd/terraform.tfstate
cargo run --bin e2e-infrastructure-lifecycle-tests

Cloud-init timeout:
- VM may need more time to complete initialization
- Check cloud-init status manually:
lxc exec torrust-tracker-vm -- cloud-init status

Docker daemon not running:
sudo systemctl start docker

Container build failures:
- Check Docker image build logs
- Ensure Dockerfile syntax is correct
- Verify base image is accessible
SSH connectivity to container:
- Verify container networking is functional
- Check SSH service is running in container
- Validate SSH key permissions (should be 600)
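The key permission check can be automated with a small helper. This is only a sketch: `key_mode_ok` is a hypothetical name, and it assumes GNU coreutils `stat` (the `-c '%a'` form); BSD and macOS `stat` use different options.

```bash
# Hypothetical helper: succeeds if the file at $1 has exactly mode 600.
# Returns 2 if the file cannot be inspected at all.
# Assumes GNU coreutils stat; BSD/macOS stat takes different flags.
key_mode_ok() {
  local key_path="$1" mode
  mode="$(stat -c '%a' "$key_path")" || return 2
  if [ "$mode" = "600" ]; then
    return 0
  fi
  return 1
}

# Example usage (the key path is a placeholder, not a project path):
# key_mode_ok path/to/private_key || chmod 600 path/to/private_key
```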
Ansible connection errors:
- Check container SSH configuration
- Verify Ansible inventory has correct IP/port
- Ensure SSH key matches between test and container
Network connectivity in VMs:
- This is a known limitation on GitHub Actions
- Use split test suites for reliable testing in CI
- Complete workflow tests are for local use only
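One way to apply this split in a wrapper script is to choose the suites based on the environment. The sketch below is illustrative only: `select_e2e_suites` is a hypothetical helper, and it assumes the two split suites are the infrastructure lifecycle and deployment workflow binaries named elsewhere in this guide. GitHub Actions sets `GITHUB_ACTIONS=true` on its runners.

```bash
# Hypothetical helper: prints the E2E binaries to run, one per line.
select_e2e_suites() {
  if [ "${GITHUB_ACTIONS:-}" = "true" ]; then
    # On GitHub Actions: run only the split suites.
    printf '%s\n' e2e-infrastructure-lifecycle-tests e2e-deployment-workflow-tests
  else
    # Locally: the complete workflow test can be used.
    printf '%s\n' e2e-complete-workflow-tests
  fi
}

# Example usage:
# for suite in $(select_e2e_suites); do
#   cargo run --bin "$suite"
# done
```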
SSH connectivity failures:
- Usually means cloud-init is still running
- Wait for cloud-init to complete before SSH attempts
- Check SSH configuration hasn't failed during cloud-init
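Waiting for cloud-init before attempting SSH can be expressed as a polling loop. The following is a sketch under stated assumptions: `retry` and `cloud_init_done` are hypothetical helpers, and the check relies on `cloud-init status` printing `status: done` once initialization has finished.

```bash
# Hypothetical helper: runs a command up to $1 times, sleeping $2 seconds
# between attempts, and succeeds as soon as the command succeeds.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example usage: poll every 10 seconds, up to 30 times, before trying SSH.
# cloud_init_done() {
#   lxc exec torrust-tracker-vm -- cloud-init status | grep -q 'status: done'
# }
# retry 30 10 cloud_init_done && echo "cloud-init finished"
```

As an alternative to polling, `cloud-init status --wait` blocks until cloud-init completes.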
Mixed infrastructure issues:
- The complete workflow test can surface any of the provisioning and deployment issues above
- Use split tests to isolate whether the issue is in infrastructure or deployment
- Check both LXD and Docker logs
Use the --keep flag to inspect the environment after test completion.
cargo run --bin e2e-infrastructure-lifecycle-tests -- --keep
# After test completion, connect to the LXD container:
lxc exec torrust-tracker-vm -- /bin/bash

cargo run --bin e2e-deployment-workflow-tests -- --keep
# After test completion, find and connect to the Docker container:
docker ps
docker exec -it <container-id> /bin/bash

cargo run --bin e2e-complete-workflow-tests -- --keep
# Connect to the LXD VM as above
lxc exec torrust-tracker-vm -- /bin/bash

Problem: GitHub Actions runners have SSH service running on port 22, which conflicts with test containers that also expose SSH on port 22.
Root Cause: When using Docker host networking (--network host), the container's SSH port 22 directly conflicts with the runner's SSH service on port 22.
Solution: Use Docker bridge networking (default) with dynamic port mapping:
- Container SSH port 22 is mapped to a random host port (e.g., 33061)
- The register command accepts an optional --ssh-port argument to specify the mapped port
- Ansible inventory is automatically updated with the custom SSH port
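When reproducing this by hand, the mapped host port can be read from `docker port`, which prints lines such as `0.0.0.0:33061` for a published container port. The sketch below is not how the project's E2E tests discover the port; `parse_host_port` is a hypothetical helper that only illustrates the idea.

```bash
# Hypothetical helper: extracts the host port from the first line of
# `docker port <container-id> 22/tcp` output, e.g. "0.0.0.0:33061".
parse_host_port() {
  local first_line
  first_line="$(printf '%s\n' "$1" | head -n 1)"
  # Strip everything up to and including the last colon.
  printf '%s\n' "${first_line##*:}"
}

# Example usage (replace <container-id> with the ID shown by docker ps):
# ssh_port="$(parse_host_port "$(docker port <container-id> 22/tcp)")"
# torrust-tracker-deployer register e2e-config --instance-ip 127.0.0.1 --ssh-port "$ssh_port"
```

Only the first output line is used because Docker may print one line per address family (IPv4 and IPv6) for the same mapping.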
Implementation:
# E2E test discovers the mapped SSH port and passes it to register command
torrust-tracker-deployer register e2e-config --instance-ip 127.0.0.1 --ssh-port 33061

Technical Details: See ADR: Register Command SSH Port Override for the complete architectural decision, implementation strategy, and alternatives considered.
This enhancement also supports real-world scenarios:
- Registering instances with non-standard SSH ports for security
- Working with containerized environments where port mapping is common
- Connecting to instances behind port-forwarding configurations
Some behaviors that appear as errors are actually expected. See docs/contributing/known-issues.md for:
- SSH host key warnings (red but normal in E2E tests)
- Expected stderr output that looks like errors but isn't
- Ansible warning messages that are safe to ignore
If you're still experiencing issues:
- Check the project's GitHub Issues for similar problems
- Review the contributing guide for development setup
- Consult the logging guide for enabling detailed logs
- Ask in project discussions or open a new issue with full context