Fix/learning pipeline #1
Open
Grigory163 wants to merge 3 commits into
…, and live bot
- agent.py: Pre-norm DT with sinusoidal PE, return conditioning, causal mask, PER replay buffer, cosine LR scheduler, gradient clipping, match stats
- runbot.py: 7-component reward function (tower damage, destruction bonus, elixir efficiency, tempo, defense), TensorBoard logging, terminal rewards
- config.py: Hog 2.6 deck config, evolved card mappings, video parser settings
- scaler.py: Robust game area detection with fallback for portrait mode
- New: video_finder.py, video_parser.py, card_trainer.py, dataset_builder.py, katacr_converter.py, replay_trainer.py for offline training pipeline
- Updated anchors for Russian locale (battle_anchor, game_anchor)
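The "return conditioning" mentioned for agent.py means the model is fed, at every step, the total reward still to come in that game. A minimal sketch of that quantity, assuming undiscounted per-step rewards held in a plain list; `returns_to_go` is a hypothetical helper for illustration, not a function taken from agent.py:

```python
from typing import List


def returns_to_go(rewards: List[float]) -> List[float]:
    """Suffix sums: rtg[t] = rewards[t] + rewards[t+1] + ... + rewards[-1]."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future total.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


print(returns_to_go([1.0, 0.0, 2.0, 5.0]))  # [8.0, 7.0, 7.0, 5.0]
```

Because each value depends on every later reward in the same game, the sums are only meaningful when computed per episode.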
1. Epsilon: 1.0 → 0.4 start (pretrained weights exist, no need for pure random)
   Decay 0.995 → 0.98 (reach 0.05 in ~60 games instead of ~600)
2. Cross-episode sampling: ReplayBuffer.sample() now respects episode boundaries
   Previously could sample context windows spanning end of one game + start of next, corrupting returns-to-go and confusing the model
3. Action validation: learn_from_game() now filters out:
   - "do nothing" actions (not real decisions)
   - Steps where hand has <2 valid cards (vision glitches)
4. Training safeguards:
   - Min buffer size = max(context_len*3, 200) before training
   - Epochs scale with buffer size to prevent overfitting on tiny data
5. Checkpoint persistence: save/load now preserves epsilon, total_games, wins, recent_results, optimizer state; no more resetting to epsilon=1.0 on restart
6. Early playability check in decide_action(): skip if no card is affordable
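The cross-episode fix in item 2 can be illustrated with a small sketch. It assumes episodes are stored as separate lists of steps and that windows are drawn uniformly; the prioritized (PER) weighting is left out, and `valid_window_starts` and `sample_windows` are hypothetical names, not the actual ReplayBuffer API:

```python
import random
from typing import List, Sequence, Tuple


def valid_window_starts(episode_lengths: Sequence[int],
                        context_len: int) -> List[Tuple[int, int]]:
    """(episode_index, start_step) pairs whose window fits inside one episode."""
    starts = []
    for ep, length in enumerate(episode_lengths):
        # Episodes shorter than context_len yield an empty range and are skipped.
        for s in range(0, length - context_len + 1):
            starts.append((ep, s))
    return starts


def sample_windows(episodes: Sequence[Sequence], context_len: int,
                   batch_size: int, rng: random.Random) -> List[Sequence]:
    """Uniformly sample context windows that never cross a game boundary."""
    starts = valid_window_starts([len(e) for e in episodes], context_len)
    if not starts:
        raise ValueError("no episode is long enough for one context window")
    picks = [rng.choice(starts) for _ in range(batch_size)]
    return [episodes[ep][s:s + context_len] for ep, s in picks]


# Two toy games; each step is tagged with the game it came from.
episodes = [[("g0", t) for t in range(5)], [("g1", t) for t in range(3)]]
windows = sample_windows(episodes, context_len=3, batch_size=8,
                         rng=random.Random(0))
# Every sampled window contains steps from exactly one game.
print(all(len({game for game, _ in w}) == 1 for w in windows))  # True
```

Indexing by (episode, start) rather than by a flat buffer offset is what rules out windows that begin at the end of one game and run into the next. A real buffer might instead pad short windows; that choice is not stated in the commit message.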
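The epsilon schedule in item 1 can be checked with a few lines of arithmetic. This sketch assumes epsilon is multiplied by the decay factor once per game until it reaches a floor of 0.05; the commit message does not state the update cadence, and `games_until_floor` is a hypothetical helper:

```python
def games_until_floor(start: float, decay: float, floor: float = 0.05) -> int:
    """Count per-game multiplicative decays until epsilon <= floor."""
    if not (0.0 < decay < 1.0) or start <= 0.0 or floor <= 0.0:
        raise ValueError("need 0 < decay < 1 and positive start/floor")
    eps, games = start, 0
    while eps > floor:
        eps *= decay  # assumed: one decay step per finished game
        games += 1
    return games


print(games_until_floor(1.0, 0.995))  # 598
print(games_until_floor(0.4, 0.98))   # 103
```

Under this per-game assumption the old schedule takes 598 games, in line with the ~600 quoted, while the new one takes 103 rather than ~60, so the quoted figure may reflect a different update cadence or floor.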
Architecture overview, learning pipeline, setup guide, and roadmap.
580eeca to 299b9f7
No description provided.