Track bundle resource state sizes in telemetry (direct engine) #5199
Commit: d0b2183
140 interesting tests: 118 FAIL, 15 SKIP, 7 KNOWN
Top 26 slowest tests (at least 2 minutes):
db := dstate.NewDatabase("", 0)

pattern := dyn.NewPattern(dyn.Key("resources"), dyn.AnyKey(), dyn.AnyKey())
_, err := dyn.MapByPattern(b.Config.Value(), pattern, func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
Why walk the config and not the actual state? They might not match 1-1.
To capture the state for both terraform and direct deployments. Most customers are still on terraform, so this gives us approximate stats for the state sizes.
I see. In that case we should not call it StateFileSize, since it has nothing to do with the state; we should call it ConfigFileSize.
It does try to approximate the state size - by calling PrepareState:
target := cfg
if adapter, ok := adapters[resourceType]; ok {
	state, err := adapter.PrepareState(cfg)
	if err != nil {
		return nil, fmt.Errorf("prepare state: %w", err)
	}
	target = state
}
// dstate.SaveState writes resource state with MarshalIndent using these
// exact prefix/indent arguments; matching them here means each resource's
// byte length equals len(entry.State) on disk for direct deploys.
raw, err := json.MarshalIndent(target, " ", " ")
if err != nil {
	return nil, fmt.Errorf("marshal: %w", err)
}
return raw, nil
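To illustrate why the prefix/indent arguments matter for the measured size, here is a minimal standalone sketch. `encodedSizes` is a hypothetical helper, and the prefix and indent values passed in `main` are assumptions for illustration, not necessarily the ones dstate.SaveState uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodedSizes is a hypothetical helper (not part of the CLI). It returns the
// byte length of v encoded compactly and encoded with json.MarshalIndent
// using the given prefix and indent.
func encodedSizes(v any, prefix, indent string) (compact, indented int, err error) {
	c, err := json.Marshal(v)
	if err != nil {
		return 0, 0, fmt.Errorf("marshal: %w", err)
	}
	i, err := json.MarshalIndent(v, prefix, indent)
	if err != nil {
		return 0, 0, fmt.Errorf("marshal indent: %w", err)
	}
	return len(c), len(i), nil
}

func main() {
	// Made-up resource state, for illustration only.
	state := map[string]any{"name": "job-a", "max_concurrent_runs": 1}

	compact, indented, err := encodedSizes(state, " ", " ")
	if err != nil {
		panic(err)
	}
	fmt.Printf("compact=%d bytes, indented=%d bytes\n", compact, indented)
}
```

The same value yields different byte counts depending on the formatting arguments, so a size computed with different arguments would not match the bytes written to disk.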
// for sub-resources like permissions / grants / secret_acls. Sub-resources are
// tracked under the sub-resource type so they aggregate across resource
// families. Returns "" for keys that don't match.
func resourceTypeFromKey(key string) string {
Duplicate of:
bundle/config/root.go:// GetResourceTypeFromKey extracts the resource group from a resource path.
bundle/config/root.go:func GetResourceTypeFromKey(path string) string {
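For readers unfamiliar with the key format, the behaviour described in the doc comment on resourceTypeFromKey could look roughly like the sketch below. `typeFromKey` is a hypothetical stand-in, and the key shapes (`resources.<group>.<name>` and `resources.<group>.<name>.<sub>`) are assumptions; it is not the implementation in this PR nor the one in bundle/config/root.go.

```go
package main

import (
	"fmt"
	"strings"
)

// typeFromKey is a hypothetical stand-in for the helper under review.
// Assumed key shapes:
//
//	resources.<group>.<name>       -> <group>
//	resources.<group>.<name>.<sub> -> <sub> (e.g. permissions, grants)
//
// Any other key returns "".
func typeFromKey(key string) string {
	parts := strings.Split(key, ".")
	if len(parts) < 3 || parts[0] != "resources" {
		return ""
	}
	switch len(parts) {
	case 3:
		return parts[1]
	case 4:
		// Sub-resources aggregate under their own type, across families.
		return parts[3]
	default:
		return ""
	}
}

func main() {
	for _, k := range []string{
		"resources.jobs.foo",
		"resources.jobs.foo.permissions",
		"variables.x.y",
	} {
		fmt.Printf("%q -> %q\n", k, typeFromKey(k))
	}
}
```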
resources := make([]protos.ResourceMetadata, 0, len(types))
for _, t := range types {
	sizes := sizesByType[t]
	slices.Sort(sizes)
You don't need to sort to get min/max/mean; that can be done in one pass.
We need to sort to get the median though. It's worth getting median (and later maybe P90) since mean does not give a complete picture.
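The trade-off discussed in this thread can be sketched as follows: once the sizes are sorted for the median, the max falls out of the last element and the mean needs only a sum. `sizeStats` and `summarize` are hypothetical names, and the even-length median rule (average of the two middle values, integer division) is an assumption rather than what the PR necessarily does.

```go
package main

import (
	"fmt"
	"slices"
)

// sizeStats is a hypothetical summary of per-resource state sizes in bytes.
type sizeStats struct {
	Count  int
	Max    int64
	Mean   int64
	Median int64
}

// summarize sorts a copy of sizes and derives count, max, mean and median.
// Sorting is only required for the median (and any later percentile).
func summarize(sizes []int64) sizeStats {
	if len(sizes) == 0 {
		return sizeStats{}
	}
	sorted := slices.Clone(sizes)
	slices.Sort(sorted)

	var sum int64
	for _, s := range sorted {
		sum += s
	}

	n := len(sorted)
	median := sorted[n/2]
	if n%2 == 0 {
		// Assumed convention: average the two middle values.
		median = (sorted[n/2-1] + sorted[n/2]) / 2
	}
	return sizeStats{Count: n, Max: sorted[n-1], Mean: sum / int64(n), Median: median}
}

func main() {
	fmt.Printf("%+v\n", summarize([]int64{40, 10, 1000, 20}))
}
```

For the sample input the mean is 267 while the median is 30, which shows how a single large resource skews the mean.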
Adds a `resources_metadata` field to the bundle deploy telemetry event with, per resource type, the count and the max/mean/median state size in bytes, plus the whole state file size.

Only direct deploys are measured, and collection does no marshalling, file read, or JSON parsing of its own. The direct engine already serializes each resource's state during the deploy and reconstructs it via WAL replay in Finalize; ExportStateFromData now records each entry's len(state) on the ResourceState it returns. deployCore stashes that finalized state on b.Metrics, and telemetry reads the per-resource sizes straight off the in-memory map (grouping by config.GetResourceTypeFromKey). The whole-file size comes from a single os.Stat (no read/parse). Terraform stores state differently and is not collected (the field is absent there).

Because the metadata is direct-only it diverges across the DATABRICKS_BUNDLE_ENGINE test matrix, so the telemetry/deploy test captures it in a per-engine out.resources_metadata.$DATABRICKS_BUNDLE_ENGINE.txt (terraform: null) and omits it from the engine-agnostic out.telemetry.txt.

Per-resource sizes are deterministic for a fixed config and asserted exactly (no redaction). Only state_file_size_bytes is dropped from the golden: it is os.Stat of resources.json whose header embeds the CLI version string, which differs in length between linux/macos (0.0.0-dev+<sha>) and windows (0.0.0-dev). It is still emitted in real telemetry.

The universe proto (resources_metadata, BundleResourcesMetadata, ResourceMetadata) is already merged, so this is ingested rather than dropped.

Co-authored-by: Isaac
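The whole-file measurement described in the commit message (a single os.Stat, with no read or parse) can be sketched as below. `fileSize` and `writeTemp` are hypothetical helpers written for this illustration, and the temp file merely stands in for resources.json.

```go
package main

import (
	"fmt"
	"os"
)

// fileSize returns the on-disk size of path using only os.Stat, so the file
// contents are never read or parsed. Hypothetical helper, not the CLI's code.
func fileSize(path string) (int64, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, fmt.Errorf("stat state file: %w", err)
	}
	return info.Size(), nil
}

// writeTemp writes content to a new temp file and returns its path. It only
// exists to make this sketch runnable.
func writeTemp(content string) (string, error) {
	f, err := os.CreateTemp("", "resources-*.json")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.WriteString(content); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	path, err := writeTemp(`{"cli_version":"0.0.0-dev"}`)
	if err != nil {
		panic(err)
	}
	defer os.Remove(path)

	size, err := fileSize(path)
	if err != nil {
		panic(err)
	}
	fmt.Printf("state file size: %d bytes\n", size)
}
```

Because the size counts every byte in the file, anything embedded in the header, such as a version string of varying length, changes the result, which is consistent with the commit message dropping state_file_size_bytes from the golden output.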
Adds a `resources_metadata` field to the bundle deploy telemetry event with, per resource type, the `count` and the max/mean/median state size in bytes, plus the whole state file size.