Add tolerations and affinity support for inferenceExtension (EPP) pod #1436
Open
yardenmaymon-td wants to merge 1 commit into
Propagate `inferenceExtension.tolerations` and reuse the scenario-wide `affinity` wrapper (as decode/prefill/standalone do) into the gaie values so the EPP pod can be scheduled onto tainted/specific nodes.

Scoped to fields the inferencepool chart (v1.4.0/v1.5.0) actually consumes; it has no `nodeSelector`, so propagating one would be a no-op.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Yarden Maymon <yarden.maymon@twodelta.com>
What
Propagate optional scheduling fields for the EndpointPicker (EPP) pod into the inferencepool chart values (`config/templates/jinja/12_gaie-values.yaml.j2`):

- `inferenceExtension.tolerations`: emitted only when a scenario sets a non-empty list, leaving the chart's `tolerations: []` default untouched otherwise.
- `affinity`: reuses the scenario-wide `affinity` convenience wrapper already consumed by `decode`/`prefill`/`standalone`. A flat `affinity.nodeSelector` map is expanded into a required `nodeAffinity` term, with `podAffinity`/`podAntiAffinity` passed through. Gated on `affinity.enabled`.

This lets the EPP pod be scheduled onto tainted nodes (e.g. GPU nodes) and steered toward specific nodes.
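The gating and expansion rules described above can be sketched in plain Python. This is an illustration only, not the template's actual code: the function name `build_epp_scheduling` is hypothetical, and the exact shape of the expanded term (one `In` match expression per `nodeSelector` key) is an assumption based on the standard Kubernetes `nodeAffinity` schema rather than on the template itself.

```python
from typing import Any


def build_epp_scheduling(scenario: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: derive the EPP scheduling keys from scenario values."""
    out: dict[str, Any] = {}

    # Tolerations: emit only for a non-empty list, so the chart default applies otherwise.
    tolerations = (scenario.get("inferenceExtension") or {}).get("tolerations") or []
    if tolerations:
        out["tolerations"] = list(tolerations)

    # Affinity: everything below is gated on affinity.enabled.
    wrapper = scenario.get("affinity") or {}
    if not wrapper.get("enabled"):
        return out

    affinity: dict[str, Any] = {}

    # Assumed expansion: each flat key/value pair becomes an "In" match expression
    # inside a single required nodeSelectorTerm.
    node_selector = wrapper.get("nodeSelector") or {}
    if node_selector:
        affinity["nodeAffinity"] = {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {"key": key, "operator": "In", "values": [str(value)]}
                            for key, value in node_selector.items()
                        ]
                    }
                ]
            }
        }

    # podAffinity / podAntiAffinity are passed through unchanged when set.
    for key in ("podAffinity", "podAntiAffinity"):
        if wrapper.get(key):
            affinity[key] = wrapper[key]

    if affinity:
        out["affinity"] = affinity
    return out
```

With an empty tolerations list and `affinity.enabled` unset, the helper returns an empty dict, which mirrors the intent of leaving the chart's defaults alone.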
Why these fields
Scoped to what the inferencepool/epplib chart (v1.4.0 and v1.5.0) actually consumes in its EPP pod spec: it honours `tolerations` and `affinity` but has no `nodeSelector` field, so propagating a top-level `inferenceExtension.nodeSelector` would be a silent no-op.

Note: the pod(anti)affinity passthrough uses `indent(_, true)` so the first rendered line is indented; the unparametrised `indent(6)` used by the modelservice template would leave the first line at column 0 (only hidden there because those keys default to `{}`).

Testing
- `tests/test_gaie_values_render.py`: renders the template with and without each field, including a pod-anti-affinity case that locks the first-line indentation. Values are based on `defaults.yaml` so the test stays resilient to unrelated template churn.
- `pytest tests/` → 279 passed, 4 skipped.
- `cicd/kind` canary renders cleanly (29 files, 0 errors).

🤖 Generated with Claude Code
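For reviewers unfamiliar with the first-line indentation issue that the pod-anti-affinity test guards against, here is a minimal sketch. It does not import Jinja: `indent_like_jinja` is a hypothetical pure-Python stand-in that mimics the documented behaviour of Jinja's `indent` filter (first line left alone unless `first` is true, blank lines not indented), and the sample block and width of 6 are illustrative only.

```python
def indent_like_jinja(text: str, width: int, first: bool = False) -> str:
    """Stand-in for Jinja's indent filter: pad every non-blank line except,
    by default, the first one."""
    pad = " " * width
    result = []
    for index, line in enumerate(text.splitlines()):
        skip_first = index == 0 and not first
        if skip_first or not line.strip():
            result.append(line)  # left exactly as rendered
        else:
            result.append(pad + line)
    return "\n".join(result)


# Illustrative multi-line value, as a pod-anti-affinity block might render.
BLOCK = (
    "requiredDuringSchedulingIgnoredDuringExecution:\n"
    "- topologyKey: kubernetes.io/hostname"
)

# Default behaviour: the first line stays at column 0.
default_first = indent_like_jinja(BLOCK, 6)

# With first=True every line, including the first, is indented.
indented_first = indent_like_jinja(BLOCK, 6, first=True)
```

When the indented block is placed on its own line under a YAML key, only the `first=True` variant keeps all lines at the same depth, which is the property the test pins down.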