Conversation
…nchmarks into feature/test-ci-whisk
Removed logging of MinIO client connection details.
⚠️ Warning: Rate limit exceeded

To keep reviews running without waiting, you can enable the usage-based add-on for your organization. This allows additional reviews beyond the hourly cap; account admins can enable it under billing.

⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered using the
We recommend spacing out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Paid plans have higher rate limits than the trial, open-source, and free plans. In all cases, further reviews are re-allowed after a brief timeout. Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (2)
📝 Walkthrough

This PR adds OpenWhisk regression testing infrastructure via a new CI workflow that provisions a Kind Kubernetes cluster with a local Docker registry and SeBS-backed storage, deploys OpenWhisk, and runs regression benchmarks. Supporting code updates enable container URI override for function deployment, implement MinIO download, fix storage configuration lookups, and adjust function wrapper formats across multiple runtimes.

Changes:
- OpenWhisk Regression Testing Pipeline
- Storage Configuration & Implementation
- OpenWhisk Function Deployment
- Function Wrapper Updates

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/regression-whisk.yml:
- Around line 224-242: The failure diagnostics step ("Diagnostics on failure")
can fail if the diagnostics/ directory doesn't exist when earlier setup steps
failed; modify the run block for that step to ensure the directory exists before
writing logs by adding a command such as mkdir -p diagnostics (or equivalent)
prior to the block that pipes output into tee
diagnostics/cluster-diagnostics.txt so the kubectl/curl/docker output can always
be written.
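The hardened step might look like the following shell fragment; the directory name `diagnostics/` matches the comment above, while the specific `kubectl` calls are illustrative stand-ins for whatever the workflow actually collects:

```shell
# Sketch of a failure-diagnostics step that cannot fail on a missing directory.
mkdir -p diagnostics                      # ensure the target exists even if setup failed early
{
  echo "== nodes =="
  kubectl get nodes -o wide 2>&1 || true  # tolerate a half-provisioned cluster
  echo "== pods =="
  kubectl get pods -A 2>&1 || true
} | tee diagnostics/cluster-diagnostics.txt
```

The `|| true` guards keep the step from aborting when the cluster never came up, so the log file is always produced for the artifact upload.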
- Around line 123-130: The git clone in the workflow currently pulls the default
branch which makes CI brittle; update the git clone command (the line starting
with "git clone --depth 1 https://github.com/apache/openwhisk-deploy-kube.git
/tmp/ow") to pin a stable tag or branch by adding --branch <TAG> (for example
--branch 1.0.0) so it clones a specific release (keep --depth 1 for shallow
clone); this ensures the subsequent sed patch against values.yaml and the helm
install using /tmp/ow/helm/openwhisk remain reproducible.
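As a sketch, the pinned clone could be a workflow step like the one below; the tag `1.0.0` is the example from the comment and should be replaced with whichever release the deployment actually targets:

```yaml
# Hedged sketch: pin the shallow clone to a release tag for reproducibility.
- name: Fetch openwhisk-deploy-kube
  run: |
    git clone --depth 1 --branch 1.0.0 \
      https://github.com/apache/openwhisk-deploy-kube.git /tmp/ow
```

`--branch` accepts tags as well as branches, so this keeps the shallow clone while fixing the checked-out revision.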
In @.github/workflows/regression.yml:
- Around line 4-7: The regression workflow accidentally disables the automatic
regression matrix by commenting out the push trigger; restore automatic coverage
by re-enabling the push trigger (uncomment or re-add the "push:" block and its
branch filters) in .github/workflows/regression.yml so that branches "master"
and "feature/**" will trigger regressions again, and ensure the new
regression-whisk.yml remains additive (it should add OpenWhisk coverage but not
replace the original push-triggered AWS/GCP/Azure jobs).
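A sketch of the restored trigger block, using the branch filters named above (the exact filter list should match what the file had before it was commented out):

```yaml
# .github/workflows/regression.yml -- re-enable automatic regressions on push.
on:
  push:
    branches:
      - master
      - "feature/**"
```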
In `@configs/openwhisk.json`:
- Around line 28-37: The storage block was flattened but
SelfHostedSystemResources._deserialize_storage() expects storage.object.type and
storage.object.minio; update the JSON so the storage entry nests under an
"object" key with a "type" and a "minio" object (e.g., storage.object.type =
"minio" and storage.object.minio = {
"address","mapped_port","access_key","secret_key","instance_id","input_buckets","output_buckets"
}) so the deserializer in SelfHostedSystemResources._deserialize_storage() can
find storage.object.type and storage.object.minio.
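The nested shape the deserializer expects might look like this; all values are placeholders, and the field list is taken from the comment above rather than the actual config:

```json
{
  "storage": {
    "object": {
      "type": "minio",
      "minio": {
        "address": "",
        "mapped_port": 0,
        "access_key": "",
        "secret_key": "",
        "instance_id": "",
        "input_buckets": [],
        "output_buckets": []
      }
    }
  }
}
```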
In `@sebs/storage/minio.py`:
- Around line 413-416: The except block binds an unused variable which triggers
lint failure; update the handler in the block around self.connection.fget_object
to either remove the try/except entirely or change the clause from "except
minio.error.ResponseError as err:" to "except minio.error.ResponseError:" so the
exception is still re-raised but no unused name is bound (keep the current
re-raise behavior).
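The pattern generalizes beyond MinIO. A minimal sketch, with a stdlib `OSError` standing in for `minio.error.ResponseError` and a hypothetical `download` helper:

```python
# When an except block only re-raises, drop the "as err" binding:
# no name is bound, so linters will not flag an unused variable.
def download(path: str) -> bytes:
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError:
        # re-raise unchanged; binding "as err" here would trip the linter
        raise
```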
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: eb33e970-3f61-4f20-997c-fc2956ede71e
📒 Files selected for processing (6)
- .github/workflows/regression-whisk.yml
- .github/workflows/regression.yml
- configs/openwhisk.json
- sebs/cache.py
- sebs/openwhisk/openwhisk.py
- sebs/storage/minio.py
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sebs/storage/resources.py (1)
138-145: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cached storage lookup no longer matches how the cache is written.

`update_cache()` in this file writes MinIO under `resources.storage.*` and ScyllaDB under `resources.nosql.*`, but this branch now reads `resources.storage[storage_type]`. That means the object path misses the cached MinIO shape, and the `nosql` path can dereference a key that is never written. This will break deserialization from cache after a prior `update_cache()` run.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sebs/storage/resources.py` around lines 138-145: the cached lookup uses resources.storage[storage_type], but update_cache() writes MinIO under resources.storage and ScyllaDB under resources.nosql. Adjust the branch that sets storage_impl and storage_config to read the correct cache shape based on storage_type: for object storage (storage_type == "object") read cached_config["resources"]["storage"]["object"] and derive storage_impl/storage_config from that structure (e.g. MinIO); for nosql storage (storage_type == "nosql") read cached_config["resources"]["nosql"] to find the nosql implementation and its config. Update the assignments of storage_impl and storage_config so they match the keys written by update_cache().
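A minimal sketch of a lookup that mirrors the shapes `update_cache()` is said to write; the dictionary layout and key names are assumptions drawn from the comment, not the actual SeBS code:

```python
from typing import Any, Dict, Optional

# Hypothetical helper: read the cached storage config the same way
# update_cache() writes it -- object storage under resources.storage.object,
# NoSQL under resources.nosql.
def cached_storage_config(
    cached: Dict[str, Any], storage_type: str
) -> Optional[Dict[str, Any]]:
    resources = cached.get("resources", {})
    if storage_type == "object":
        return resources.get("storage", {}).get("object")
    if storage_type == "nosql":
        return resources.get("nosql")
    return None
```

The point is simply that the read path must branch on `storage_type` to match the two distinct write locations, instead of indexing `resources.storage[storage_type]`.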
🧹 Nitpick comments (1)
.github/workflows/regression-whisk.yml (1)
63-68: ⚡ Quick win

Pin `kubectl` to the Kind cluster minor instead of `stable.txt`.

This job fixes the cluster at `kindest/node:v1.31.4` but lets `kubectl` float to whatever upstream marks as stable. Once that drifts far enough, the workflow can start failing without any repo change. Pinning `kubectl` to the same Kubernetes minor keeps this CI path reproducible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/regression-whisk.yml around lines 63-68: the workflow currently downloads kubectl using stable.txt (KCTL=$(curl -fsSL https://dl.k8s.io/release/stable.txt)), which lets kubectl float. Change this to pin kubectl to the same Kubernetes minor as the Kind node image used by the job (use the same version variable or literal as the Kind image, e.g. KIND_NODE_VERSION or the v1.31.4 value) so KCTL is set to that specific release string and the download step fetches that pinned release. Update the kubectl install block that defines KCTL and the download URL accordingly to ensure kubectl matches the Kind cluster minor.
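One way to keep the two in lockstep, sketched as a workflow fragment (the step name and env variable name are hypothetical; `dl.k8s.io/release/<version>/bin/linux/amd64/kubectl` is the standard versioned download URL):

```yaml
# Pin kubectl to the same minor as the Kind node image (kindest/node:v1.31.4).
- name: Install kubectl
  env:
    KCTL_VERSION: v1.31.4   # keep in lockstep with the Kind node image tag
  run: |
    curl -fsSLo kubectl "https://dl.k8s.io/release/${KCTL_VERSION}/bin/linux/amd64/kubectl"
    chmod +x kubectl
    sudo mv kubectl /usr/local/bin/kubectl
```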
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/regression-whisk.yml:
- Around line 134-137: The WORKER_IP extraction uses an index-based JSONPath
which can pick the wrong address; update the kubectl command that sets WORKER_IP
to explicitly filter for the InternalIP type (use a JSONPath filter selecting
addresses where type=="InternalIP") so the WORKER_IP variable is always the node
InternalIP; keep the rest of the lines (HOST_IP and exporting to GITHUB_ENV)
unchanged.
---
Outside diff comments:
In `@sebs/storage/resources.py`:
- Around line 138-145: The cached lookup is using
resources.storage[storage_type] but update_cache() writes MinIO under
resources.storage and ScyllaDB under resources.nosql, so adjust the branch that
sets storage_impl and storage_config to read the correct cache shape based on
storage_type: for object storage (storage_type == "object") read
cached_config["resources"]["storage"]["object"] and derive
storage_impl/storage_config from that structure (e.g. MinIO), and for nosql
storage (storage_type == "nosql") read cached_config["resources"]["nosql"] to
find the nosql implementation and its config; update the code that assigns
storage_impl and storage_config (the variables storage_impl, storage_config and
the cached_config checks) so they match the keys written by update_cache().
---
Nitpick comments:
In @.github/workflows/regression-whisk.yml:
- Around line 63-68: The workflow currently downloads kubectl using stable.txt
(KCTL=$(curl -fsSL https://dl.k8s.io/release/stable.txt)), which lets kubectl
float; change this to pin kubectl to the same Kubernetes minor as the Kind node
image used by the job (use the same version variable or literal used for the
Kind image, e.g., KIND_NODE_VERSION or the v1.31.4 value) so KCTL is set to that
specific release string and the curl/download step fetches that pinned release;
update the kubectl install block that defines KCTL and the download URL
accordingly to ensure kubectl matches the Kind cluster minor.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9060557a-3241-4be5-afd7-f9b017cb88c5
📒 Files selected for processing (6)
- .github/workflows/regression-whisk.yml
- benchmarks/wrappers/openwhisk/java/src/main/java/org/serverlessbench/Handler.java
- benchmarks/wrappers/openwhisk/nodejs/nosql.js
- configs/openwhisk.json
- sebs/storage/minio.py
- sebs/storage/resources.py
✅ Files skipped from review due to trivial changes (1)
- benchmarks/wrappers/openwhisk/nodejs/nosql.js
```sh
WORKER_IP=$(kubectl get node kind-worker -o jsonpath='{.status.addresses[0].address}')
HOST_IP=$(hostname -I | awk '{print $1}')
echo "WORKER_IP=${WORKER_IP}" >> $GITHUB_ENV
echo "HOST_IP=${HOST_IP}" >> $GITHUB_ENV
```
🧩 Analysis chain
🌐 Web query:
Kubernetes node address ordering guarantee status.addresses
💡 Result:
Kubernetes does not provide a general, stable ordering guarantee for Node.status.addresses.

What Kubernetes actually guarantees/does:
1) Order can change for reasons outside your control (e.g., multiple addresses of the same type and how kubelet updates Node.Status). Historically kubelet could reorder Node.Status.Addresses when multiple addresses of the same type changed; Kubernetes fixed this by ensuring kubelet updates use a replace strategy for NodeStatus.Addresses (to avoid random reordering that could change the "primary" address).
2) Consumers are expected to pick an address by type (and often by a preference list), not by assuming list position.
   - GetPreferredNodeAddress iterates Node.Status.Addresses and returns the first entry whose Type matches the preferred types list; it does not define or require a particular overall list order.
   - The helper GetNodeHostIPs explicitly re-sorts internally: it moves all NodeInternalIP entries first, then NodeExternalIP, and then chooses a "primary" first IP; it does not rely on the original list ordering.
   - kubelet's nodestatus setters build Node.Status.Addresses in a specific order (e.g., InternalIP then Hostname), but this is an implementation detail and depends on which fields are being set.

Practical recommendation:
- Never rely on Node.status.addresses[i] ordering.
- Select by NodeAddress.Type (Hostname/InternalIP/ExternalIP, etc.) and/or use Kubernetes helpers like nodeutil.GetPreferredNodeAddress (in kubelet code), or apply your own preference rules.

Bottom line: there is no "status.addresses ordering guarantee" you can depend on; order is not a contract. If you need determinism, sort/reselect by address type yourself or use Kubernetes' address-selection logic.
Citations:
- 1: https://kubernetes.io/docs/reference/kubernetes-api/cluster-resources/node-v1
- 2: Don't use strategic merge patch on Node.Status.Addresses kubernetes/kubernetes#79391
- 3: https://github.com/kubernetes/kubernetes/blob/master/pkg/util/node/node.go
- 4: https://github.com/kubernetes/kubernetes/blob/v1.26.3/pkg/util/node/node.go
- 5: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/nodestatus/setters.go
- 6: https://github.com/kubernetes/kubernetes/blob/2757a872/pkg/kubelet/kubelet_node_status.go
🌐 Web query:
kubectl jsonpath filter InternalIP address syntax
💡 Result:
kubectl JSONPath filter for InternalIP on Node objects:

kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'

Notes on syntax:
- The filter uses the JSONPath filter syntax: [?()].
- @ refers to the current array element being filtered.
- Quote the string literal ("InternalIP").

Example (human-readable with newlines):

kubectl get nodes -o jsonpath='{range .items[*]}{range .status.addresses[?(@.type=="InternalIP")]}{.address}{"\n"}{end}{end}'
Citations:
- 1: https://kubernetes.io/docs/reference/kubectl/jsonpath/
- 2: https://stackoverflow.com/questions/58732218/how-to-print-internalip-of-nodes-using-jq-in-k8s
- 3: https://oneuptime.com/blog/post/2026-02-09-kubectl-jsonpath-expressions-nested-fields/view
🏁 Script executed:
# Let's also verify if this specific jsonpath syntax is valid
# by checking for examples in common tools and documentation
cat <<'EOF'
Testing JSONPath filter syntax for Kubernetes node addresses
EOFRepository: spcl/serverless-benchmarks
Length of output: 131
🏁 Script executed:
# Check the workflow file for context around how WORKER_IP is used
head -n 150 .github/workflows/regression-whisk.yml | tail -n 30Repository: spcl/serverless-benchmarks
Length of output: 1375
🏁 Script executed:
# Also check if there are other similar patterns in the file that might need the same fix
rg 'status\.addresses\[0\]' .github/workflows/Repository: spcl/serverless-benchmarks
Length of output: 206
Select the node InternalIP explicitly instead of relying on address ordering.

`{.status.addresses[0].address}` assumes a stable address order, but Kubernetes does not guarantee this. If Hostname lands first in the list, `wsk --apihost` will be configured with the wrong value and tests will fail. Use a JSONPath filter to explicitly select the InternalIP type.

Suggested fix

```diff
- WORKER_IP=$(kubectl get node kind-worker -o jsonpath='{.status.addresses[0].address}')
+ WORKER_IP=$(kubectl get node kind-worker -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
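The same select-by-type idea, sketched in Python against a mock of `Node.status.addresses` (the helper name is hypothetical; it mirrors the JSONPath filter `[?(@.type=="InternalIP")]`):

```python
from typing import Dict, List, Optional

# Pick the InternalIP by type rather than by list position, since
# Kubernetes does not guarantee ordering of status.addresses.
def internal_ip(addresses: List[Dict[str, str]]) -> Optional[str]:
    for addr in addresses:
        if addr.get("type") == "InternalIP":
            return addr.get("address")
    return None
```

Whatever order the entries arrive in, the lookup yields the same address, which is exactly the determinism the workflow needs.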
```sh
WORKER_IP=$(kubectl get node kind-worker -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
HOST_IP=$(hostname -I | awk '{print $1}')
echo "WORKER_IP=${WORKER_IP}" >> $GITHUB_ENV
echo "HOST_IP=${HOST_IP}" >> $GITHUB_ENV
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/regression-whisk.yml around lines 134 - 137, The WORKER_IP
extraction uses an index-based JSONPath which can pick the wrong address; update
the kubectl command that sets WORKER_IP to explicitly filter for the InternalIP
type (use a JSONPath filter selecting addresses where type=="InternalIP") so the
WORKER_IP variable is always the node InternalIP; keep the rest of the lines
(HOST_IP and exporting to GITHUB_ENV) unchanged.
As the title suggests: deploy OW on kind in GitHub runners, testing one more platform for correctness and compatibility.
Summary by CodeRabbit
- New Features
- Bug Fixes
- Chores