This repository was archived by the owner on Oct 15, 2025. It is now read-only.

fixing gateway #33

Merged: Gregory-Pereira merged 2 commits into llm-d:main from Gregory-Pereira:fixing-gateway on May 6, 2025

Conversation

@Gregory-Pereira (Member) commented on May 6, 2025

cc @tumido @nerdalert @sallyom

Changes:

  • DNS-compatible model name sanitization
  • Dynamically enable metrics annotations in services via the values `modelservice.<component>.metrics.enabled`
  • Upgrade to the prod MS image and fix its runtime (Switch to prod image for llm-d/llm-d-model-service-dev #38)
  • Consistent gateway-targeted labels and selectors for all gateway-related manifests
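
The DNS-compatible sanitization in the first bullet can be illustrated with a minimal sketch. The chart's actual helper is not shown in this conversation, so the function name `sanitize_model_name` and the exact replacement rules are assumptions. The sketch only targets the general Kubernetes DNS label shape: lowercase alphanumerics and hyphens, alphanumeric at both ends, at most 63 characters.

```python
import re

MAX_LABEL_LEN = 63  # maximum length of a DNS label


def sanitize_model_name(name: str) -> str:
    """Turn an arbitrary model name into a DNS-compatible label.

    Hypothetical helper for illustration; not the chart's real template.
    """
    label = name.lower()
    # Replace every run of characters outside [a-z0-9-] with a single hyphen.
    label = re.sub(r"[^a-z0-9-]+", "-", label)
    # Truncate first, then trim hyphens, so the result never starts
    # or ends with a hyphen even after the cut.
    label = label[:MAX_LABEL_LEN].strip("-")
    if not label:
        raise ValueError(f"cannot derive a DNS label from {name!r}")
    return label
```

With a hypothetical input such as `meta-llama/Llama-3.2-3B-Instruct`, the function returns `meta-llama-llama-3-2-3b-instruct`, which is safe to embed in a Service name.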

Signed-off-by: greg pereira <grpereir@redhat.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
@Gregory-Pereira (Member, Author) commented:

lets 🚢

@Gregory-Pereira merged commit 7b6ddd7 into llm-d:main on May 6, 2025
3 of 4 checks passed

@nerdalert (Member) left a comment:

tested and lgtm

@tumido linked two issues on May 8, 2025 that may be closed by this pull request
tumido added a commit that referenced this pull request May 15, 2025
Co-authored-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Brent Salisbury <bsalisbu@redhat.com>
Co-authored-by: Chris Chase <cchase@redhat.com>
Co-authored-by: Ryan Cook <rcook@redhat.com>
Co-authored-by: sallyom <somalley@redhat.com>
Co-authored-by: Anil Kumar Vishnoi <vishnoianil@gmail.com>
Co-authored-by: Andrew Anderson <andy@clubanderson.com>

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: trigger release

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Update installer permissions

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

remove minikube flags from base installer

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

chore: renaming things

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

updating modelservice CRD for cmd support + automation

Signed-off-by: greg pereira <grpereir@redhat.com>

chore: replace proselint with vale

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

swap image back to ms official

Signed-off-by: greg pereira <grpereir@redhat.com>

chore: run test on release tags and add badge to releases in GH

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

labels + using gotpl decode service

Signed-off-by: greg pereira <grpereir@redhat.com>

temporarily remove vale lint

Signed-off-by: greg pereira <grpereir@redhat.com>

linting and fixing helpers

Signed-off-by: greg pereira <grpereir@redhat.com>

revert removing vale

Signed-off-by: greg pereira <grpereir@redhat.com>

msvc rbac v2 hack updates

Signed-off-by: greg pereira <grpereir@redhat.com>

msvc rbac v2 hack updates v3

Signed-off-by: greg pereira <grpereir@redhat.com>

quickstart README updates

bumping epp image to new amd target

Signed-off-by: greg pereira <grpereir@redhat.com>

chore: fix pre-commit-cache

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Replaced broken glusterfs for single node hostPath

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Fix minikube readme

Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com>

chore: run chart releases as bumper bot, so we can trigger workflows from it

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

using imagePullSecrets for epp/pd secrets

Signed-off-by: greg pereira <grpereir@redhat.com>

Update charts/llm-d/templates/modelservice/_helpers.tpl

non global pull-secrets as well

Co-authored-by: Tom Coufal <7453394+tumido@users.noreply.github.com>

calling the IPS template properly

Signed-off-by: greg pereira <grpereir@redhat.com>

chart bump + linting

Signed-off-by: greg pereira <grpereir@redhat.com>

chore: fix test workflow on tag push (#35)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

fixing gateway (#33)

* fixing MS upgrade no pd role flag

Signed-off-by: greg pereira <grpereir@redhat.com>

* fixing gateway

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: greg pereira <grpereir@redhat.com>

Make the storage PV and PVCs variable in the minikube installer (#49)

* Make the storage PV and PVCs variable

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

* Default to hostPath storage type

- No need to have conditionals for storage type in the
minikube installer.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

* minikube readme updates

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

---------

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Fix quickstart validation model names (#48)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

feat: add Istio backend for Gateway (#45)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

removal of dev flag (#52)

Signed-off-by: Ryan Cook <rcook@redhat.com>

feat: migrate to community redis image (#53)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

removal of -dev from sidecar and use latest image tag (#54)

Signed-off-by: Ryan Cook <rcook@redhat.com>

fix: modelservice CR was wrong (#56)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

stub OCP ingress controller into opinionated install (#47)

* stub OCP ingress controller into opinionated install

Signed-off-by: greg pereira <grpereir@redhat.com>

* remove backstage references and respect passing host

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: greg pereira <grpereir@redhat.com>

feat: update model service rbac (#58)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

modelservice servicemonitor (#22)

* add modelservice servicemonitor

Signed-off-by: sallyom <somalley@redhat.com>

* disable modelservice metrics in CI

Signed-off-by: sallyom <somalley@redhat.com>

* update installers for metrics collection

Signed-off-by: sallyom <somalley@redhat.com>

* update quickstart READMEs for metrics

Signed-off-by: sallyom <somalley@redhat.com>

* bump chart version

Signed-off-by: sallyom <somalley@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>

fix: ensure model service controller can always grant epp role (#60)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

ensure metrics stack reinstalls in quickstarts (#61)

Signed-off-by: sallyom <somalley@redhat.com>

fix: kgateway proxyUID fixes - compatibility with multiple gateway types (#64)

Signed-off-by: greg pereira <grpereir@redhat.com>

chore: fix ci values file (#63)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Add tolerations to values.yaml and baseconfigs (#50)

BYO model (#21)

* BYO model

Signed-off-by: greg pereira <grpereir@redhat.com>

* charts should not have sample app perspective, but treat MSVC as first class citizen

Signed-off-by: greg pereira <grpereir@redhat.com>

* defy remove sample app from MSVC base config - hack

Signed-off-by: greg pereira <grpereir@redhat.com>

* refactor base everything off modelartifactURI

Signed-off-by: greg pereira <grpereir@redhat.com>

* fix: use model service controller templating instead of helm

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

* more refactors

Signed-off-by: greg pereira <grpereir@redhat.com>

* minikube script compatibility

Signed-off-by: greg pereira <grpereir@redhat.com>

* linting

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: greg pereira <grpereir@redhat.com>
Signed-off-by: Tomas Coufal <tcoufal@redhat.com>
Co-authored-by: Tomas Coufal <tcoufal@redhat.com>

feat: upgrade to model service 0.0.8 (#62)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

docs: sanitize chart README, update main repo README and CONTRIBUTING with tips, faq and others (#72)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

quick helm fix (#73)

Signed-off-by: greg pereira <grpereir@redhat.com>

Update the minikube readme with byo model (#70)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Fixup the minikube model pvc to be dynamic (#69)

- Makes the PVC dynamic based on the PVC URI in the chart values
- Adds some debugging; will wrap it into a --debug flag in
a separate patch.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Fix helm set pull secrets array in minikube installer (#75)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

upgrading EPP image to tag reflecting inference-router repo migration (#74)

Signed-off-by: greg pereira <grpereir@redhat.com>

Add github workflow to run e2e test on the AWS instance (#77)

Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com>

Fix install-deps.sh execution permission (#80)

Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com>

Update quickstart for byo (#79)

Fix the llmd-deployer repo url (#81)

Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com>

Use fine-grained token to clone deployer repo (#82)

Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com>

feat: add knobs for EPP env variables (#67)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

docs: fix Quickstart link in main README (#87)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

quiet prometheus install stdout in quickstarts (#85)

Signed-off-by: sallyom <somalley@redhat.com>

evaluate if the defined storage class exists (#84)

Signed-off-by: Ryan Cook <rcook@redhat.com>

Fixup disable metric (#71)

When DISABLE_METRICS is true, inject Helm args to set
modelservice.metrics.enabled=false and modelservice.serviceMonitor.enabled=false,
to stop the chart from rendering ServiceMonitor resources.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>
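
The behavior described in #71 can be sketched as follows. The real installer is a shell script whose code is not shown here, so this Python version is only an illustration: the function name `metrics_helm_args` and the exact string comparison against `"true"` are assumptions, while the two value keys are taken from the commit message.

```python
import os


def metrics_helm_args(env=None):
    """Return extra Helm arguments that switch metrics off when requested.

    Hypothetical helper; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    # Assumption: only the literal string "true" disables metrics.
    if env.get("DISABLE_METRICS") != "true":
        return []
    return [
        "--set", "modelservice.metrics.enabled=false",
        "--set", "modelservice.serviceMonitor.enabled=false",
    ]
```

The returned list would be appended to the `helm` command line, so that the chart does not render ServiceMonitor resources.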

Remove the minikube runtime memory limit (#92)

- Makes room for cpu mem offload from vLLM.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

using new NIXL only connector (#32)

* using new NIXL only connector

Signed-off-by: greg pereira <grpereir@redhat.com>

* runs but no cache hit

Signed-off-by: greg pereira <grpereir@redhat.com>

* no p/d services in prod example

Signed-off-by: greg pereira <grpereir@redhat.com>

* restore pd services deemed non-invasive

Signed-off-by: greg pereira <grpereir@redhat.com>

* keeping configmaps around but not using them in lmcache for dual connectors later

Signed-off-by: greg pereira <grpereir@redhat.com>

* downgrade to working image

Signed-off-by: greg pereira <grpereir@redhat.com>

* removing dead code placeholder sections

Signed-off-by: greg pereira <grpereir@redhat.com>

* linting

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: greg pereira <grpereir@redhat.com>

Invert `--download-model` to `--skip-download-model` (#83)

Just flips the logic.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Add pods describe info to the logs (#96)

Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com>

Update the quickstart ingress validations (#88)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

docs: Neater preset table for NOTES.txt (#95)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Add --all-containers=true to artifact logs (#97)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Remove runasroot to fix decode breakage (#98)

- Longer term, see if decode can be run as a regular user.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Enable metrics scraping from EPP (#93)

* add epp-service metrics collection

Signed-off-by: sallyom <somalley@redhat.com>

* bump chart version

Signed-off-by: sallyom <somalley@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>

llm-d scheduler scorers configuration (#99)

Signed-off-by: Ricardo Noriega De Soto <rnoriega@redhat.com>

chore: fix test prereqs (#103)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

fix: update model service controller to 0.0.9 (#101)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

update model_id in validation endpoints (#105)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

test: get proper tests going (#102)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

All image support w/ concurrent connectors (#100)

* All image support w/ concurrent connectors

Signed-off-by: greg pereira <grpereir@redhat.com>

* dist url via pod IP + no config, fallback full env

Signed-off-by: greg pereira <grpereir@redhat.com>

* linting

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: greg pereira <grpereir@redhat.com>

use prod EPP image post rename (#108)

Signed-off-by: greg pereira <grpereir@redhat.com>

remove legacy gpu-basic preset (#110)

Signed-off-by: greg pereira <grpereir@redhat.com>

Update the quickstart validate script to v1/completion (#111)

- Temporary until the vllm nixl patch lands with v1/chat support.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

feat: add helm json schema (#114)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

fix: remove  from the chart (#113)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: update issue templates and add autolabel (#120)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

template p/d replicas in ms from sample app (#119)

* template p/d replicas in ms from sample app

Signed-off-by: greg pereira <grpereir@redhat.com>

* removing helpers feedback

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: greg pereira <grpereir@redhat.com>

docs: add section on documenting variables to the CONTRIBUTING guide (#122)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore(ci): add ghcr creds (#124)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Fix escape log message in minikube installer (#125)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Fix a string escape in llmd-installer.sh (#126)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

Fix HF_MODEL_ID validation in quickstart verify_env() (#127)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

include clusterRouterBase in the schema (#129)

Signed-off-by: greg pereira <grpereir@redhat.com>

feat: migrate images registry to ghcr.io (#121)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

feat: update to model-service:0.0.10 (#130)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Add debug logging wrapper and model value file and dir (#112)

- Fills out the --debug mode to include logging
- Add a sample `quickstart/models` directory for pre-canned
validated models.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

chore: rename to llm-d org (#135)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: remove openshift reference (#136)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: try fixing test workflow after repo move

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: downgrade min kube version to 1.30 (#139)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: rename vllm-sim to llm-d-inference-sim (#140)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

docs: add prereqs (#142)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

bugfix: swap floats to ints for epp vars (#133)

* swap floats to ints for epp vars

Signed-off-by: greg pereira <grpereir@redhat.com>

* fix: update remaining variables

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

---------

Signed-off-by: greg pereira <grpereir@redhat.com>
Signed-off-by: Tomas Coufal <tcoufal@redhat.com>
Co-authored-by: Tomas Coufal <tcoufal@redhat.com>

Wire in the bitnami redis sc (#143)

- bug reported in #141

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

fix: make epp env variables merging possible and also extend sample app with epp env vars (#145)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

feat: add various selectors, constrains, tolerations etc (#147)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Fix --values-file documentation in README.md (#148)

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

refactor: remove hfToken.create from the chart values (#149)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: update badge for release decorator (#153)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

chore: refac prerequisites for quickstart (#150)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Update the test-request.sh validation script (#156)

* Update the test-request.sh validation script

- Docs letting the user know how to work around podsec issues
if they hit them.
- Replace shuf with RANDOM since it's available by default on osx and linux
- Make namespace and model arguments

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

* fix: lint errors

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

---------

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>
Signed-off-by: Tomas Coufal <tcoufal@redhat.com>
Co-authored-by: Tomas Coufal <tcoufal@redhat.com>

feat: upgrade to `ghcr.io/llm-d/llm-d:0.0.8` (#161)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

docs: docs docs and more docs (#164)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

feat: upgrade to inference scheduler 0.0.3 (#163)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

remove create token from sample override file (#166)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

swap to HF by default to avoid RWX storage issue trajectory (#165)

Signed-off-by: greg pereira <grpereir@redhat.com>
Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Set mikefarah yq as a quickstart requirement (#167)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

upping model download time, including resource stuff in json schema, remove ee by default (#169)

Signed-off-by: greg pereira <grpereir@redhat.com>

s/quay/ghcr/ updates to quickstart readmes (#172)

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

fix: repeated of quickstart need CRD cleanups (#173)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

pvc log cleanup in uninstall (#178)

- user doesn't care if that was skipped if not PVC
- pvc gets deleted with the ns

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

keep values schema for resources but not actual values (#174)

Signed-off-by: greg pereira <grpereir@redhat.com>

Skip setting BASE_OCP_DOMAIN when not on OpenShift (#155)

* Skip setting BASE_OCP_DOMAIN when not on openshift

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* pre-commit

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

Remove redis persistence (#175)

* Remove redis persistence

* chore: bump chart version

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

---------

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>
Co-authored-by: Tomas Coufal <tcoufal@redhat.com>

safer failures on uninstall full stack (#179)

Signed-off-by: greg pereira <grpereir@redhat.com>

safe uninstall (#180)

Signed-off-by: greg pereira <grpereir@redhat.com>

feat: populate CRB for metrics collection from epp (#171)

Signed-off-by: Tomas Coufal <tcoufal@redhat.com>

Development

Successfully merging this pull request may close these issues.

  • Switch to prod image for llm-d/llm-d-model-service-dev
  • Prefill and Decode service names are sometimes invalid
