
Upgrade skypilot executor with 0.9.2 #246

Merged: hemildesai merged 1 commit into NVIDIA-NeMo:main from twmeissane:twmeissane/skypilot-0.9.2 on May 29, 2025

@twmeissane

Contributor

Updates follow migration guidelines: https://docs.skypilot.co/en/latest/reference/faq.html#migration-from-skypilot-0-8-1

Minor documentation improvement to ensure package compatibility during installation of sky.

Co-authored by: sahger.lad@thoughtworks.com rahul.punjabi@thoughtworks.com

hemildesai previously approved these changes May 28, 2025

@hemildesai hemildesai left a comment

Contributor


Thank you so much for the contribution 🎉. Other packagers should also be compatible with SkyPilot now; we will add support in a follow-up PR.

Do the logs work fine with the upgrade? If not, we can follow up with a fix in a separate PR as well.

Comment thread test/core/execution/test_skypilot.py Fixed
@hemildesai

Contributor

You might need to fix the formatting so that the "Ruff format / format (pull_request)" CI check passes. We can merge afterwards.

Updates follow migration guidelines: https://docs.skypilot.co/en/latest/reference/faq.html#migration-from-skypilot-0-8-1
Minor documentation improvement to ensure package compatibility during
installation of sky

Co-authored by: sahger.lad@thoughtworks.com rahul.punjabi@thoughtworks.com
Signed-off-by: Meissane Chami <meissane.chami@thoughtworks.com>
Signed-off-by: twmeissane <meissane.chami@thoughtworks.com>
@twmeissane

Contributor Author

> Thank you so much for the contribution 🎉. Other packagers should also be compatible with SkyPilot now; we will add support in a follow-up PR.
>
> Do the logs work fine with the upgrade? If not, we can follow up with a fix in a separate PR as well.

Hi @hemildesai, the logs work fine on our end, and they are available when we train on a single node. However, when we move to multiple nodes, we encounter some issues in both v0.8 and v0.9 around the working directory /nemo_run/code. We believe this is probably just some misconfiguration on our end.

@hemildesai

hemildesai commented May 29, 2025

Contributor

> Hi @hemildesai, the logs work fine on our end, and they are available when we train on a single node. However, when we move to multiple nodes, we encounter some issues in both v0.8 and v0.9 around the working directory /nemo_run/code. We believe this is probably just some misconfiguration on our end.

The working directory should be synced from the local filesystem to /nemo_run/code on all nodes. If that is not the case, feel free to file an issue and we can help investigate further.

@hemildesai hemildesai merged commit dae14e3 into NVIDIA-NeMo:main May 29, 2025
18 of 20 checks passed

3 participants