CHANGELOG.md: 55 additions & 0 deletions
@@ -16,6 +16,61 @@ Changes
<!-- towncrier release notes start -->
## 26.4.4rc4 (2026-05-13)
### Breaking Changes
* v2 GraphQL Query/Mutation root fields and computed nested resolver fields are now nullable in the schema. Clients with strict-typed code generation (Relay, Apollo, etc.) must regenerate types and add null-handling around fields that previously came back as non-null. ([#11517](https://github.com/lablup/backend.ai/issues/11517))
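
For clients that read these fields from raw JSON rather than through generated types, the practical effect is that a root or computed field may now arrive as `null`. The following is a minimal sketch of defensive access in Python. The helper `get_path`, the field name `exampleRootField`, and the response shape are hypothetical and serve only as illustration; they are not part of the Backend.AI client SDK.

```python
from typing import Any, Optional


def get_path(payload: Optional[dict[str, Any]], *keys: str) -> Any:
    """Walk nested dict keys, returning None as soon as any level is null or missing."""
    current: Any = payload
    for key in keys:
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


# Hypothetical response in which a now-nullable root field resolved to null.
response = {"data": {"exampleRootField": None}}

name = get_path(response, "data", "exampleRootField", "name")
print(name if name is not None else "<unavailable>")
```

Strictly typed clients (Relay, Apollo, and similar) get the equivalent checks from regenerated types; the sketch only shows where the null-handling has to go.
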
### Features
* Add `updated_at` column to `vfolders` that is automatically refreshed whenever the row is updated via SQLAlchemy. ([#10821](https://github.com/lablup/backend.ai/issues/10821))
* Provide a manager-side parallel source for the legacy `live_stat` `stats.max` / `stats.avg` / `stats.rate` fields, computed from Prometheus on demand instead of from the agent's in-memory `MovingStatistics` accumulator. This source survives agent / manager / host restarts, stays consistent across sessions, and uses a sliding window (default 5m) instead of unbounded lifetime accumulation. ([#11360](https://github.com/lablup/backend.ai/issues/11360))
* Wire the bulk role-permission REST/GQL endpoints (`bulk-add`, `bulk-remove`, `replace`) through to the permission-controller processor so they actually mutate state. ([#11442](https://github.com/lablup/backend.ai/issues/11442))
* Add `vfolder:data`, `session:app_service`, and `user:email` RBAC element types as sub-entities of vfolder, session, and user, enabling fine-grained permission control over vfolder internal data, session app endpoints, and user email exposure separately from their parent entities. ([#11456](https://github.com/lablup/backend.ai/issues/11456))
* Add Alembic data migrations that seed `vfolder:data` and `session:app_service` RBAC permissions on existing roles in domain/project/user scopes, and migrate existing vfolder share invitations to per-entity `vfolder:data` grants using the entity-as-scope pattern. ([#11457](https://github.com/lablup/backend.ai/issues/11457))
* Expose a `modelCards` connection on `VFolder` GraphQL nodes for reverse lookup from a vfolder to its registered model cards. ([#11480](https://github.com/lablup/backend.ai/issues/11480))
* Add per-handler `max_retry_count` to session/deployment scheduler handler options (renaming the legacy `timeouts` JSONB key to `handler_options` carrying `{timeout, max_retry_count}` entries) and fill the missing `give_up` status transitions on `check-precondition`, `start-sessions`, and `deprioritize-sessions` lifecycle handlers. ([#11524](https://github.com/lablup/backend.ai/issues/11524))
* Expose the per-deployment `revision_number` on `ModelRevision` GraphQL nodes and REST v2 revision responses so clients can render "Revision #N" labels and order revisions without an extra round-trip. ([#11529](https://github.com/lablup/backend.ai/issues/11529))
* Route coordinator now scans lifecycle routes via `BatchQuerier`, and `RouteTargetStatuses` gains an explicit traffic-status filter axis so handlers can target only routes whose `traffic_status` is in a given list. ([#11534](https://github.com/lablup/backend.ai/issues/11534))
* Inject a capacity sentinel into kernel `live_stat` for metrics without a Prometheus capacity series. ([#11535](https://github.com/lablup/backend.ai/issues/11535))
* Add `node-exporter` to the halfstack `observability` profile so Prometheus automatically scrapes host-level metrics (CPU, memory, disk, network) in local dev environments. ([#11541](https://github.com/lablup/backend.ai/issues/11541))
* Split health probe into liveness, readiness, and informational tiers, and surface gating failures via HTTP 503 from `/livez` and `/readyz` so Kubernetes probes react automatically; `/health` detail stays at 200 with `DEGRADED` status for informational failures. ([#11544](https://github.com/lablup/backend.ai/issues/11544))
* Add required resource slot metadata so `cpu` and `mem` can be enforced during resource validation. ([#11555](https://github.com/lablup/backend.ai/issues/11555))
* Add `RequiredResourceSlotRule` to the `SessionSpec` validator chain so session creation fails with `InvalidAPIParameters` when a kernel omits a globally required resource slot. ([#11556](https://github.com/lablup/backend.ai/issues/11556))
* Route webserver traffic to the Manager and the Apollo Router (Hive Gateway) through a health-aware `HealthyEndpointPool` with pluggable selection policy (`round_robin`, `random`, `least_connections`), readiness gating on `/readyz`, per-endpoint informational status on `/health`, and configurable probe / threshold / policy tunables under `[api]` and `[apollo-router]`. ([#11558](https://github.com/lablup/backend.ai/issues/11558))
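
To make the health-aware routing in the last entry concrete, here is a rough sketch of round-robin selection restricted to endpoints that pass readiness probes. It is illustrative only and is not the actual `HealthyEndpointPool` implementation: the class `RoundRobinPoolSketch`, the `failure_threshold` parameter, the `record_probe` method, and the example URLs are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Endpoint:
    url: str
    consecutive_failures: int = 0


@dataclass
class RoundRobinPoolSketch:
    """Illustrative only; names and behavior are assumptions, not the real pool."""

    endpoints: list[Endpoint]
    failure_threshold: int = 3  # hypothetical tunable
    _counter: count = field(default_factory=count, repr=False)

    def record_probe(self, url: str, ok: bool) -> None:
        """Record the outcome of a readiness probe (for example, a GET on /readyz)."""
        for ep in self.endpoints:
            if ep.url == url:
                ep.consecutive_failures = 0 if ok else ep.consecutive_failures + 1

    def healthy(self) -> list[Endpoint]:
        """Endpoints whose consecutive failure count is below the threshold."""
        return [
            ep for ep in self.endpoints
            if ep.consecutive_failures < self.failure_threshold
        ]

    def select(self) -> Endpoint:
        """Pick the next healthy endpoint in round-robin order."""
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy endpoints available")
        return candidates[next(self._counter) % len(candidates)]


pool = RoundRobinPoolSketch(
    [Endpoint("http://manager-1:8081"), Endpoint("http://manager-2:8081")],
    failure_threshold=1,
)
pool.record_probe("http://manager-1:8081", ok=False)
print(pool.select().url)  # http://manager-2:8081
```

A `random` or `least_connections` policy would replace only the body of `select`, which is one plausible reason to keep the selection policy pluggable.
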
### Improvements
* Migrate kernel `live_stat` GraphQL resolver from Valkey to Prometheus while preserving the legacy wire shape ([#11330](https://github.com/lablup/backend.ai/issues/11330))
* Resolve effective permissions for arbitrary per-target keys in a single SQL round-trip via the new `PermissionResolutionKey` shape. ([#11356](https://github.com/lablup/backend.ai/issues/11356))
* Introduce `BackendAISchema`, a Pydantic base whose `model_validate` / `model_validate_json` auto-convert validation failures into a domain-specific `BackendAIError` (HTTP 400) via an overridable `build_validation_error` classmethod, so each model surfaces its own 400 with structured per-field error details instead of raw `pydantic.ValidationError`. ([#11514](https://github.com/lablup/backend.ai/issues/11514))
* Migrate every remaining pydantic `BaseModel` subclass across `src/ai/backend/` to `BackendAISchema`, so any `model_validate()` failure auto-converts to a `BackendAISchemaValidationFailed` (HTTP 400) instead of leaking as raw `pydantic.ValidationError`. ([#11554](https://github.com/lablup/backend.ai/issues/11554))
### Fixes
* Define the active/dead flag sets of `ContainerStatus` in a single place to prevent potential mismatches in future code edits. ([#11213](https://github.com/lablup/backend.ai/issues/11213))
* Report `current_revision_id` correctly on deployment responses during rolling updates. ([#11494](https://github.com/lablup/backend.ai/issues/11494))
* Set `reads_vfolder_config_files=true` for the `custom` runtime variant in seed fixtures so freshly populated rows match the alembic migration intent and custom-variant model services can read `model-definition.yaml` from the vfolder. ([#11503](https://github.com/lablup/backend.ai/issues/11503))
* Honor `AND`/`OR`/`NOT` clauses in `myDeployments` and `projectDeployments` GraphQL filters, which were previously ignored and caused multi-condition deployment queries to return unfiltered results. ([#11506](https://github.com/lablup/backend.ai/issues/11506))
* Allow deployment names to be reused within a project so a hidden record from another user no longer blocks creation. ([#11507](https://github.com/lablup/backend.ai/issues/11507))
* Remove the leftover `name` field from `ModelRevisionData`, `RevisionDTO`/`RevisionNode`, and the GraphQL `ModelRevision` type so the public schema matches the backend. ([#11511](https://github.com/lablup/backend.ai/issues/11511))
* Base the legacy `ModifyEndpoint` mutation's override merge on the **latest** deployment revision instead of the current/serving one, fixing a `DeploymentRevisionNotFound` failure when modifying an endpoint whose first rollout has not yet completed (`current_revision` still NULL) and preserving accumulated changes when a follow-up modify is issued while a previous revision is still deploying. ([#11512](https://github.com/lablup/backend.ai/issues/11512))
* Reject session requests whose image or caller declares a resource slot the target resource group does not provide, returning a clear 4xx instead of failing internally. ([#11515](https://github.com/lablup/backend.ai/issues/11515))
* Fix model deployment status incorrectly reported as READY for endpoints that have never been deployed ([#11516](https://github.com/lablup/backend.ai/issues/11516))
* Accept UUID-shaped strings in the legacy session-create `mounts` field. ([#11521](https://github.com/lablup/backend.ai/issues/11521))
* Accept a legacy string `start_command` in the model definition by normalizing it to an argv list via `shlex.split`. ([#11525](https://github.com/lablup/backend.ai/issues/11525))
* Make `ModelConfig` / `ModelDefinition` / `ModelServiceConfig` / `ModelHealthCheck` GraphQL input fields optional so `addModelRevision` can inherit values from the runtime variant, `model-definition.yaml`, or revision preset. ([#11531](https://github.com/lablup/backend.ai/issues/11531))
* Allow `ModelMountConfigInput.definition_path` to be omitted so the server auto-detects `model-definition.yaml` or `model-definition.yml` in the model vfolder ([#11537](https://github.com/lablup/backend.ai/issues/11537))
* Propagate `SessionRow.network_type` and `SessionRow.network_id` through scheduler queries into `SessionDataForStart`, so the launcher correctly reuses pre-created networks for `PERSISTENT` sessions instead of calling `create_network`. ([#11543](https://github.com/lablup/backend.ai/issues/11543))
* Bound the sokovan deployment provisioner: once the handler retry budget is exhausted, transition the deployment to ROLLING_BACK instead of creating new RoutingRows indefinitely when every replica spawn keeps failing. ([#11546](https://github.com/lablup/backend.ai/issues/11546))
* Fix `backend.ai admin image list` failing with `Cannot query field 'last_used_at' on type 'Image'` by removing `last_used_at` from the default field list of the v1 admin image listing. ([#11563](https://github.com/lablup/backend.ai/issues/11563))
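
Among the fixes above, the `start_command` normalization (#11525) is easy to picture in code. The sketch below shows how a legacy string form can be turned into an argv list with the standard library's `shlex.split`. The function name `normalize_start_command` is hypothetical and the actual implementation may differ.

```python
import shlex
from typing import Union


def normalize_start_command(value: Union[str, list[str]]) -> list[str]:
    """Return an argv list, splitting a legacy string form with shell-like quoting rules."""
    if isinstance(value, str):
        return shlex.split(value)
    return list(value)


print(normalize_start_command('python -m server --name "my model"'))
# ['python', '-m', 'server', '--name', 'my model']
print(normalize_start_command(["python", "-m", "server"]))
# ['python', '-m', 'server']
```

Unlike a naive `str.split`, `shlex.split` keeps quoted arguments such as `"my model"` together as a single argv element.
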
### Miscellaneous
* Add `.github/CODEOWNERS` so that pull requests auto-request reviewers from the `@lablup/core_dev` team. ([#11467](https://github.com/lablup/backend.ai/issues/11467))
### Test Updates
* Add unit tests for FixedQueryBuilder ([#11273](https://github.com/lablup/backend.ai/issues/11273))