Reinforcement learning agent for the board game Splendor, built with a custom Gymnasium environment and a PyTorch actor-critic network trained via self-play.
ai.Splendor/
├── game/ # Splendor game engine + Gymnasium environment (splendor-game)
└── ai/ # PyTorch model, agent, and training loop (splendor-ai)
Requires uv and Python ≥ 3.10.
Install everything
uv sync
Installs both splendor-game and splendor-ai into a shared .venv at the repo root.
Install only one package
# game engine only (no torch)
cd game && uv sync
# AI package only (includes splendor-game)
cd ai && uv sync
Run a specific package's scripts
# from the repo root
uv run --package splendor-ai python -m ai.train
# or from inside ai/
cd ai && uv run python -m ai.train
uv run python -m ai.train
# or via the entry point:
uv run splendor-train --episodes 5000 --lr 3e-4 --checkpoint-dir checkpoints
Checkpoints are saved to checkpoints/ every 500 episodes by default.
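To make the flags and the checkpoint cadence concrete, here is a minimal sketch of how such an entry point could parse its arguments and decide when to save. It is not the actual `splendor-train` implementation: `build_parser` and `checkpoint_episodes` are hypothetical names, the defaults simply reuse the values from the command above, and saving at the end of every 500th episode (rather than at episode 0) is an assumption.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser that accepts the three flags shown above."""
    parser = argparse.ArgumentParser(prog="splendor-train")
    # Defaults below are assumptions copied from the example command,
    # not confirmed defaults of the real entry point.
    parser.add_argument("--episodes", type=int, default=5000)
    parser.add_argument("--lr", type=float, default=3e-4)
    parser.add_argument("--checkpoint-dir", type=Path, default=Path("checkpoints"))
    return parser


def checkpoint_episodes(total_episodes, every=500):
    """Return the 1-based episode numbers after which a checkpoint is saved.

    Assumes a checkpoint is written once every `every` completed episodes.
    """
    if every <= 0:
        raise ValueError("every must be positive")
    return list(range(every, total_episodes + 1, every))


if __name__ == "__main__":
    args = build_parser().parse_args()
    for episode in checkpoint_episodes(args.episodes):
        print(f"would save a checkpoint in {args.checkpoint_dir} after episode {episode}")
```

Under these assumptions, a 5000-episode run would write ten checkpoints.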
from splendor.env import SplendorEnv
env = SplendorEnv(num_players=2)
obs, info = env.reset(seed=42)
print(obs.shape) # (184,)
print(len(info["legal_actions"]))
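The `legal_actions` entry is what lets a policy avoid illegal moves. The sketch below shows one common way to use it: a softmax restricted to the legal indices, followed by sampling. It assumes `info["legal_actions"]` is a list of integer action indices and that the policy produces one logit per action; neither is confirmed by this README. `masked_softmax` and `sample_action` are hypothetical helpers written in plain Python, not part of splendor-game or splendor-ai.

```python
import math
import random


def masked_softmax(logits, legal_actions):
    """Softmax over `logits` restricted to the indices in `legal_actions`.

    Illegal actions get probability 0. Assumes `legal_actions` is a
    non-empty collection of integer indices into `logits`.
    """
    legal = sorted(set(legal_actions))
    if not legal:
        raise ValueError("no legal actions to choose from")
    # Subtract the largest legal logit for numerical stability.
    peak = max(logits[i] for i in legal)
    weights = {i: math.exp(logits[i] - peak) for i in legal}
    total = sum(weights.values())
    probs = [0.0] * len(logits)
    for i, weight in weights.items():
        probs[i] = weight / total
    return probs


def sample_action(logits, legal_actions, rng=random):
    """Sample one legal action index from the masked distribution."""
    probs = masked_softmax(logits, legal_actions)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In a rollout, `sample_action(policy_logits, info["legal_actions"])` would then yield an index to pass to `env.step`, assuming the environment accepts integer actions.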