This repository was archived by the owner on Oct 15, 2025. It is now read-only.
fixing gateway #33
Merged
Signed-off-by: greg pereira <grpereir@redhat.com>
lets 🚢
tumido added a commit that referenced this pull request May 15, 2025
Co-authored-by: greg pereira <grpereir@redhat.com> Co-authored-by: Brent Salisbury <bsalisbu@redhat.com> Co-authored-by: Chris Chase <cchase@redhat.com> Co-authored-by: Ryan Cook <rcook@redhat.com> Co-authored-by: sallyom <somalley@redhat.com> Co-authored-by: Anil Kumar Vishnoi <vishnoianil@gmail.com> Co-authored-by: Andrew Anderson <andy@clubanderson.com> Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: trigger release Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Update installer permissions Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> remove minikube flags from base installer Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> chore: renaming things Signed-off-by: Tomas Coufal <tcoufal@redhat.com> updating modelservice CRD for cmd support + automation Signed-off-by: greg pereira <grpereir@redhat.com> chore: replace proselint with vale Signed-off-by: Tomas Coufal <tcoufal@redhat.com> swap image back to ms official Signed-off-by: greg pereira <grpereir@redhat.com> chore: run test on release tags and add badge to releases in GH Signed-off-by: Tomas Coufal <tcoufal@redhat.com> labels + using gotpl decode service Signed-off-by: greg pereira <grpereir@redhat.com> temporarily remove vale lint Signed-off-by: greg pereira <grpereir@redhat.com> linting and fixing helpers Signed-off-by: greg pereira <grpereir@redhat.com> revert removing vale Signed-off-by: greg pereira <grpereir@redhat.com> msvc rbac v2 hack updates Signed-off-by: greg pereira <grpereir@redhat.com> msvc rbac v2 hack updates v3 Signed-off-by: greg pereira <grpereir@redhat.com> quickstart README updates bumping epp image to new amd target Signed-off-by: greg pereira <grpereir@redhat.com> chore: fix pre-commit-cache Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Replaced broken glusterfs for single node hostPath Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Fix minikube readme Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com> chore: run chart releases as bumper bot, so we can 
trigger workflows from it Signed-off-by: Tomas Coufal <tcoufal@redhat.com> using imagepullsecrts for epp/pd secrets Signed-off-by: greg pereira <grpereir@redhat.com> Update charts/llm-d/templates/modelservice/_helpers.tpl non global pull-secrets as well Co-authored-by: Tom Coufal <7453394+tumido@users.noreply.github.com> calling the IPS template properly Signed-off-by: greg pereira <grpereir@redhat.com> chart bump + linting Signed-off-by: greg pereira <grpereir@redhat.com> chore: fix test workflow on tag push (#35) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> fixing gateway (#33) * fixing MS upgrade no pd role flag Signed-off-by: greg pereira <grpereir@redhat.com> * fixing gateway Signed-off-by: greg pereira <grpereir@redhat.com> --------- Signed-off-by: greg pereira <grpereir@redhat.com> Make the storage PV and PVCs variable in the minikube installer (#49) * Make the storage PV and PVCs variable Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> * Default to hostPath storage type - No need to have conditionals for storage type in the minikube installer. 
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> * minikube readme updates Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> --------- Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Fix quickstart validation model names (#48) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> feat: add Istio backend for Gateway (#45) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> removal of dev flag (#52) Signed-off-by: Ryan Cook <rcook@redhat.com> feat: migrate to community redis image (#53) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> removal of -dev from sidecar and use latest image tag (#54) Signed-off-by: Ryan Cook <rcook@redhat.com> fix: modelservice CR was wrong (#56) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> stub OCP ingress controller into opinionated install (#47) * stub OCP ingress controller into opinionated install Signed-off-by: greg pereira <grpereir@redhat.com> * remove backstage references and respect passing host Signed-off-by: greg pereira <grpereir@redhat.com> --------- Signed-off-by: greg pereira <grpereir@redhat.com> feat: update model service rbac (#58) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> modelservice servicemonitor (#22) * add modelservice servicemonitor Signed-off-by: sallyom <somalley@redhat.com> * disable modelservice metrics in CI Signed-off-by: sallyom <somalley@redhat.com> * update installers for metrics collection Signed-off-by: sallyom <somalley@redhat.com> * update quickstart READMEs for metrics Signed-off-by: sallyom <somalley@redhat.com> * bump chart version Signed-off-by: sallyom <somalley@redhat.com> --------- Signed-off-by: sallyom <somalley@redhat.com> fix: ensure model service controller can always grant epp role (#60) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> ensure metrics stack reinstalls in quickstarts (#61) Signed-off-by: sallyom <somalley@redhat.com> fix: kgateway proxyUID fixes - compatibility with multiple gateway types (#64) Signed-off-by: greg pereira <grpereir@redhat.com> chore: fix 
ci values file (#63) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Add tolerations to values.yaml and baseconfigs (#50) BYO model (#21) * BYO model Signed-off-by: greg pereira <grpereir@redhat.com> * charts should not have sample app perspective, but treat MSVC as first class citizen Signed-off-by: greg pereira <grpereir@redhat.com> * defy remove sample app from MSVC base config - hack Signed-off-by: greg pereira <grpereir@redhat.com> * refactor base everything off modelartifactURI Signed-off-by: greg pereira <grpereir@redhat.com> * fix: use model service controller templating instead of helm Signed-off-by: Tomas Coufal <tcoufal@redhat.com> * more refactors Signed-off-by: greg pereira <grpereir@redhat.com> * minikube script compatability Signed-off-by: greg pereira <grpereir@redhat.com> * linting Signed-off-by: greg pereira <grpereir@redhat.com> --------- Signed-off-by: greg pereira <grpereir@redhat.com> Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Co-authored-by: Tomas Coufal <tcoufal@redhat.com> feat: upgrade to model service 0.0.8 (#62) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> docs: sanitize chart README, update main repo README and CONTRIBUTING with tips, faq and others (#72) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> quick helm fix (#73) Signed-off-by: greg pereira <grpereir@redhat.com> Update the minikube readme with byo model (#70) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Fixup the minikube model pvc to be dynamic (#69) - Makes the PVC dynamic based on the PVC URI in the chart values - Adds some debugging, will wrap those into a --debug flag in a seperate patch. 
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Fix helm set pull secrets array in minikube installer (#75) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> upgrading EPP image to tag reflecting inference-router repo migration (#74) Signed-off-by: greg pereira <grpereir@redhat.com> Add github workflow to run e2e test on the AWS instance (#77) Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com> Fix install-deps.sh execution permission (#80) Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com> Update quickstart for byo (#79) Fix the llmd-deployer repo url (#81) Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com> Use fine grain token to clone deployer repo (#82) Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com> feat: add knobs for EPP env variables (#67) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> docs: fix Quickstart link in main README (#87) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> quiet prometheus install stdout in quickstarts (#85) Signed-off-by: sallyom <somalley@redhat.com> evaluate if the defined storage class exists (#84) Signed-off-by: Ryan Cook <rcook@redhat.com> Fixup disable metric (#71) When DISABLE_METRICS is true, inject Helm args to set modelservice.metrics.enabled=false and modelservice.serviceMonitor.enabled=false, to stop the chart from rendering ServiceMonitor resources. Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Remove the minikube runtime memory limit (#92) - Makes room for cpu mem offload from vLLM. 
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> using new NIXL only connector (#32) * using new NIXL only connector Signed-off-by: greg pereira <grpereir@redhat.com> * runs but no cache hit Signed-off-by: greg pereira <grpereir@redhat.com> * no p/d services in prod example Signed-off-by: greg pereira <grpereir@redhat.com> * restore pd services deemed non-invasive Signed-off-by: greg pereira <grpereir@redhat.com> * keeping confimaps around but not using them in lmcache for dual connectors later Signed-off-by: greg pereira <grpereir@redhat.com> * downgrade to working image Signed-off-by: greg pereira <grpereir@redhat.com> * removing dead code placeholder sections Signed-off-by: greg pereira <grpereir@redhat.com> * linting Signed-off-by: greg pereira <grpereir@redhat.com> --------- Signed-off-by: greg pereira <grpereir@redhat.com> Invert `--download-model` to `--skip-download-model` (#83) Just flips the logic. Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Add pods describe info to the logs (#96) Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com> Update the quickstart ingress validations (#88) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> docs: Neater preset table for NOTES.txt (#95) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Add --all-containers=true to artifact logs (#97) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Remove runasroot to fix decode breakage (#98) - Longer term, see if decode can be run as a regular user. 
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Enable metrics scraping from EPP (#93) * add epp-service metrics collection Signed-off-by: sallyom <somalley@redhat.com> * bump chart version Signed-off-by: sallyom <somalley@redhat.com> --------- Signed-off-by: sallyom <somalley@redhat.com> llm-d scheduler scorers configuration (#99) Signed-off-by: Ricardo Noriega De Soto <rnoriega@redhat.com> chore: fix test prereqs (#103) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> fix: update model service controller to 0.0.9 (#101) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> update model_id in validation endpoints (#105) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> test: get proper tests going (#102) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> All image support w/ concurrent connectors (#100) * All image support w/ concurrent connectorst Signed-off-by: greg pereira <grpereir@redhat.com> * dist url via pod IP + no config, fallback full env Signed-off-by: greg pereira <grpereir@redhat.com> * linting Signed-off-by: greg pereira <grpereir@redhat.com> --------- Signed-off-by: greg pereira <grpereir@redhat.com> use prod EPP image post rename (#108) Signed-off-by: greg pereira <grpereir@redhat.com> remove legacy gpu-basic preset (#110) Signed-off-by: greg pereira <grpereir@redhat.com> Update the quickstart validate script to v1/completion (#111) - Temporary until the vllm nixl patch lands with v1/chat support. 
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> feat: add helm json schema (#114) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> fix: remove from the chart (#113) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: update issue templates and add autolabel (#120) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> template p/d replicas in ms from sample app (#119) * template p/d replicas in ms from sample app Signed-off-by: greg pereira <grpereir@redhat.com> * removing helpers feedback Signed-off-by: greg pereira <grpereir@redhat.com> --------- Signed-off-by: greg pereira <grpereir@redhat.com> docs: add section on documenting variables to the CONTRIBUTING guide (#122) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore(ci): add ghcr creds (#124) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Fix escape log message in minikube installer (#125) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Fix a string escape in llmd-installer.sh (#126) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Fix HF_MODEL_ID validation in quickstart verify_env() (#127) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> include clusterRouterBase in the schema (#129) Signed-off-by: greg pereira <grpereir@redhat.com> feat: migrate images registry to ghcr.io (#121) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> feat: update to model-service:0.0.10 (#130) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Add debug logging wrapper and model value file and dir (#112) - Fills out the --debug mode to include logging - Add a sample `quickstart/models` directory for pre-canned validated models. 
Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> chore: rename to llm-d org (#135) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: remove openshift reference (#136) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: try fixing test workflow after repo move Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: downgrade min kube version to 1.30 (#139) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: rename vllm-sim to llm-d-inference-sim (#140) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> docs: add prereqs (#142) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> bugfix: swap floats to ints for epp vars (#133) * swap floats to ints for epp vars Signed-off-by: greg pereira <grpereir@redhat.com> * fix: update remaining variables Signed-off-by: Tomas Coufal <tcoufal@redhat.com> --------- Signed-off-by: greg pereira <grpereir@redhat.com> Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Co-authored-by: Tomas Coufal <tcoufal@redhat.com> Wire in the bitnami redis sc (#143) - bug reported in #141 Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> fix: make epp env variables merging possible and also extend sample app with epp env vars (#145) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> feat: add various selectors, constrains, tolerations etc (#147) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Fix --values-file documentation in README.md (#148) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> refactor: remove hfToken.create from the chart values (#149) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: update badge for release decorator (#153) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> chore: refac prerequisites for quickstart (#150) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Update the test-request.sh validation script (#156) * Update the test-request.sh validation script - Docs letting the user know if the hit podsec issues how to work workaround them. 
- replace shuf with RANDOM since its default on osx and linux - Make namespace and model arguments Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> * fix: lint errors Signed-off-by: Tomas Coufal <tcoufal@redhat.com> --------- Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Co-authored-by: Tomas Coufal <tcoufal@redhat.com> feat: upgrade to `ghcr.io/llm-d/llm-d:0.0.8` (#161) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> docs: docs docs and more docs (#164) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> feat: upgrade to inference scheduler 0.0.3 (#163) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> remove create token from sample override file (#166) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> swap to HF by default to avoid RWX storage issue trajectory (#165) Signed-off-by: greg pereira <grpereir@redhat.com> Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Set mikefarah yq as a quickstart requirement (#167) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> upping model download time, including resrouce stuff in json schema, remove ee by default (#169) Signed-off-by: greg pereira <grpereir@redhat.com> s/quay/ghcr/ updates to quickstart readmes (#172) Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> fix: repeated of quickstart need CRD cleanups (#173) Signed-off-by: Tomas Coufal <tcoufal@redhat.com> pvc log cleanup in uninstall (#178) - user doesnt care if that was skipped if not PVC - pvc gets deleted with the ns Signed-off-by: Brent Salisbury <bsalisbu@redhat.com> keep values schema for resources but not actual values (#174) Signed-off-by: greg pereira <grpereir@redhat.com> Skip setting BASE_OCP_DOMAIN when not on OpenShift (#155) * Skip setting BASE_OCP_DOMAIN when not on openshift Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> * pre-commit Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> --------- Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Remove 
redis persistence (#175) * Remove redis persistence * chore: bump chart version Signed-off-by: Tomas Coufal <tcoufal@redhat.com> --------- Signed-off-by: Tomas Coufal <tcoufal@redhat.com> Co-authored-by: Tomas Coufal <tcoufal@redhat.com> safer failures on uninstall full stack (#179) Signed-off-by: greg pereira <grpereir@redhat.com> safe uninstall (#180) Signed-off-by: greg pereira <grpereir@redhat.com> feat: populate CRB for metrics collection from epp (#171) Signed-off-by: Tomas Coufal <tcoufal@redhat.com>
cc @tumido @nerdalert @sallyom
Changes:
modelservice.<component>.metrics.enabled (llm-d/llm-d-model-service-dev#38)