add spatial decomp docs #26

Merged: mahf708 merged 2 commits into main from mahf708/spatial-decomp on May 15, 2026

@mahf708 mahf708 commented May 15, 2026

This pull request adds comprehensive documentation for spatial decomposition (model-parallel) training of the ACE2-ERA5 model and updates the main training guide for clarity and correctness. The new guide explains how to configure and launch multi-node, multi-GPU training with spatial decomposition, including environment variables, sizing constraints, and troubleshooting. The main workflow guide is updated to clarify the YAML configuration structure for training.

Documentation Additions and Improvements:

  • Added a new guide, ace2-spatial-decomp.md, detailing the workflow for ACE2-ERA5 training with spatial decomposition, including prerequisites, configuration, launch instructions, and troubleshooting tips.
  • Linked the new spatial decomposition guide from the navigation menu in mkdocs.yml for easier discoverability.

Training Configuration Guide Updates:

  • Updated the example YAML in ace2-workflow.md to move n_forward_steps: 1 under stepper_training, clarifying the configuration structure.
  • Removed a redundant n_forward_steps: 1 from the top-level of the YAML example in ace2-workflow.md to avoid confusion.
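
The two configuration changes above can be sketched as a before/after YAML fragment. This is only an illustration under stated assumptions: it shows just the `n_forward_steps` and `stepper_training` keys named in this PR, the "before" shape is inferred from the description rather than copied from the diff, and all other keys of the real training config in ace2-workflow.md are omitted.

```yaml
# Before (assumed shape): the step count sits at the top level.
n_forward_steps: 1
stepper_training:
  # other stepper_training keys omitted
---
# After: the step count is nested under stepper_training.
stepper_training:
  n_forward_steps: 1
```

The `---` separator only keeps the two variants as separate YAML documents in one block; it is not part of the training config.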

github-actions Bot commented May 15, 2026

PR Preview Action v1.8.1

🚀 View preview at
https://E3SM-Project.github.io/aigroup/pr-preview/pr-26/

Built to branch gh-pages at 2026-05-15 19:27 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

…al resources sections from spatial decomposition guide
@mahf708 mahf708 merged commit f1b6a4a into main May 15, 2026
1 check passed
@mahf708 mahf708 deleted the mahf708/spatial-decomp branch May 15, 2026 19:28