Skip to content

record accessor: fix CFL root-key matching and container translation#11673

Merged
edsiper merged 4 commits intomasterfrom
ra_key-fixes
Apr 7, 2026
Merged

record accessor: fix CFL root-key matching and container translation#11673
edsiper merged 4 commits intomasterfrom
ra_key-fixes

Conversation

@edsiper
Copy link
Copy Markdown
Member

@edsiper edsiper commented Apr 4, 2026

  • Fix direct root-key regex matching in flb_cfl_ra_key so it uses the actual resolved value instead of an uninitialized out_val.
  • Fix flb_cfl_record_accessor container translation to serialize full maps/arrays correctly and use the proper root-key compare input.
  • Add internal regression coverage for root-key regex/strcmp and full container translation in cfl_record_accessor tests.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed regex matching for root-level record keys.
    • Corrected JSON serialization for arrays and key/value lists.
    • Fixed string-comparison and temporary-string handling to avoid incorrect results and memory misuse.
  • Tests

    • Added tests for container-to-JSON translation and root key string/regex matching.
    • Added commit-message validation tests for umbrella/stacked commit prefixes.
  • Chores

    • Relaxed and improved commit-prefix validation to accept stacked/umbrella prefixes and root-prefix matching.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Fixes regex input selection for root-string matches, consolidates JSON serialization loops for arrays and kvlists, corrects temporary JSON buffer handling and strcmp argument passing, adds record-accessor tests, and relaxes commit-prefix parsing plus corresponding lint tests.

Changes

Cohort / File(s) Summary
Regex key handling
src/flb_cfl_ra_key.c
Use val->data.as_string and its length for regex input in the non-capture root-string path instead of relying on out_val.
Record accessor JSON & strcmp fixes
src/flb_cfl_record_accessor.c
Unified array serialization into a single loop with immediate error returns; iterate all kvlist pairs with proper comma placement; initialize and always free temporary JSON buffer (flb_sds_t js = NULL) and only append on success; pass provided (str, len) into flb_cfl_ra_key_strcmp.
Tests (record accessor)
tests/internal/cfl_record_accessor.c
Added tests: cb_translate_container_map() (container-map → JSON) and cb_strcmp_and_regex_root_key() (root-key strcmp and regex); added <fluent-bit/flb_regex.h> include.
Commit-lint tests
.github/scripts/tests/test_commit_lint.py
Added tests to accept umbrella subjects like tests: internal: and tests: runtime: for changes under matching test paths.
Commit-prefix checker
.github/scripts/commit_prefix_check.py
Expanded prefix regex to accept stacked <token>: segments; extracted root subject prefix for umbrella matching and allow either full or root-prefix variants when validating expected prefixes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • cosmo0920
  • niedbalski
  • patrick-stephens
  • celalettin1286

Poem

🐇 I nudged a string where roots reside,
commas now march in tidy lines,
tests hopped in to verify,
regex ears perked up and tried,
🥕 code carrots for bright designs.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 41.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly summarizes the main changes: fixing CFL root-key matching and container translation in the record accessor.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ra_key-fixes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/flb_cfl_record_accessor.c (2)

648-660: ⚠️ Potential issue | 🟠 Major

$array still falls through this container-translation branch.

cfl_variant_to_ra_value() still collapses both arrays and kvlists to FLB_CFL_RA_BOOL, but this code only re-serializes kvlists. A root accessor like $records will still return NULL instead of JSON.

➕ Minimal fix
-        if (crv->v.type == CFL_VARIANT_KVLIST) {
+        if (crv->v.type == CFL_VARIANT_KVLIST ||
+            crv->v.type == CFL_VARIANT_ARRAY) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_cfl_record_accessor.c` around lines 648 - 660, The branch in
flb_cfl_record_accessor.c that handles crv->type == FLB_CFL_RA_BOOL only
re-serializes CFL_VARIANT_KVLIST, causing arrays (which cfl_variant_to_ra_value
currently collapses to FLB_CFL_RA_BOOL) to be treated as NULL; update the
condition to also detect CFL_VARIANT_ARRAY and serialize it the same way as
kvlists (use cfl_to_json on &crv->v, append to buf, free js on both success and
failure) so root accessors like $records return proper JSON; reference
crv->type, crv->v.type, cfl_to_json,
flb_sds_create_size/flb_sds_cat/flb_sds_destroy and cfl_variant_to_ra_value to
locate the relevant logic.

573-585: ⚠️ Potential issue | 🔴 Critical

Stale SDS pointer after reallocation in cfl_to_json recursive serialization.

cfl_to_json() takes flb_sds_t by value while flb_sds_cat_safe() expects a pointer-to-pointer to update reallocations. When serializing kvlists or arrays (lines 573–616), the loop appends bracket, commas, and nested values via recursive cfl_to_json() calls. If appends exceed the initial 1024-byte buffer, flb_sds_cat_safe() reallocates and updates the local stack copy of buf, not the caller's js pointer. At line 655 in cfl_ra_translate_keymap(), strlen(js) then dereferences a stale pointer to the freed original buffer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_cfl_record_accessor.c` around lines 573 - 585, The bug is that
cfl_to_json takes flb_sds_t by value so recursive calls can cause
flb_sds_cat_safe reallocations to update only the callee's local buf, leaving
the caller's js pointer stale (used later by cfl_ra_translate_keymap). Fix by
changing cfl_to_json to accept a pointer to the sds (e.g. flb_sds_t *js) and
update all call sites (including array/kvlist loops where buf is passed) to pass
the address so flb_sds_cat_safe reallocations propagate back to the caller;
ensure cfl_ra_translate_keymap and any code that reads strlen(js) use the
updated pointer dereferenced from the passed-in pointer.
🧹 Nitpick comments (1)
tests/internal/cfl_record_accessor.c (1)

1704-1755: Add a root-array container regression next to this map case.

This PR also changes full array serialization, but the new coverage only exercises $obj on a map. A small $array translation assertion here would catch the still-unhandled CFL_VARIANT_ARRAY path in cfl_ra_translate_keymap().

Based on learnings, add or update tests for behavior changes, especially protocol parsing and encoder/decoder paths; prefer targeted tests close to the changed module.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/internal/cfl_record_accessor.c` around lines 1704 - 1755, Add a sibling
test next to cb_translate_container_map named e.g. cb_translate_container_array
that constructs a root kvlist containing a CFL_VARIANT_ARRAY under the key
"array" (with two string elements like "alpha" and "beta"), create a record
accessor with fmt = "$array", call flb_cfl_ra_translate(cra, NULL, -1, *vobj,
NULL) and assert the returned flb_sds_t matches the JSON array string
'["alpha","beta"]' (check length and memcmp like the map test); this will
exercise the CFL_VARIANT_ARRAY branch in cfl_ra_translate_keymap() and mirror
the setup/teardown patterns used in the existing cb_translate_container_map
(cfl_kvlist_create, cfl_variant_create_from_*, flb_cfl_ra_create,
flb_cfl_ra_translate, flb_sds_destroy, cfl_variant_destroy).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/flb_cfl_record_accessor.c`:
- Around line 648-660: The branch in flb_cfl_record_accessor.c that handles
crv->type == FLB_CFL_RA_BOOL only re-serializes CFL_VARIANT_KVLIST, causing
arrays (which cfl_variant_to_ra_value currently collapses to FLB_CFL_RA_BOOL) to
be treated as NULL; update the condition to also detect CFL_VARIANT_ARRAY and
serialize it the same way as kvlists (use cfl_to_json on &crv->v, append to buf,
free js on both success and failure) so root accessors like $records return
proper JSON; reference crv->type, crv->v.type, cfl_to_json,
flb_sds_create_size/flb_sds_cat/flb_sds_destroy and cfl_variant_to_ra_value to
locate the relevant logic.
- Around line 573-585: The bug is that cfl_to_json takes flb_sds_t by value so
recursive calls can cause flb_sds_cat_safe reallocations to update only the
callee's local buf, leaving the caller's js pointer stale (used later by
cfl_ra_translate_keymap). Fix by changing cfl_to_json to accept a pointer to the
sds (e.g. flb_sds_t *js) and update all call sites (including array/kvlist loops
where buf is passed) to pass the address so flb_sds_cat_safe reallocations
propagate back to the caller; ensure cfl_ra_translate_keymap and any code that
reads strlen(js) use the updated pointer dereferenced from the passed-in
pointer.

---

Nitpick comments:
In `@tests/internal/cfl_record_accessor.c`:
- Around line 1704-1755: Add a sibling test next to cb_translate_container_map
named e.g. cb_translate_container_array that constructs a root kvlist containing
a CFL_VARIANT_ARRAY under the key "array" (with two string elements like "alpha"
and "beta"), create a record accessor with fmt = "$array", call
flb_cfl_ra_translate(cra, NULL, -1, *vobj, NULL) and assert the returned
flb_sds_t matches the JSON array string '["alpha","beta"]' (check length and
memcmp like the map test); this will exercise the CFL_VARIANT_ARRAY branch in
cfl_ra_translate_keymap() and mirror the setup/teardown patterns used in the
existing cb_translate_container_map (cfl_kvlist_create,
cfl_variant_create_from_*, flb_cfl_ra_create, flb_cfl_ra_translate,
flb_sds_destroy, cfl_variant_destroy).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 356a4d2b-1cca-4888-b0f7-3e79878c13a8

📥 Commits

Reviewing files that changed from the base of the PR and between 3e414ac and a2baa28.

📒 Files selected for processing (3)
  • src/flb_cfl_ra_key.c
  • src/flb_cfl_record_accessor.c
  • tests/internal/cfl_record_accessor.c

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/tests/test_commit_lint.py:
- Around line 152-173: Tests assume hierarchical test prefixes like "tests:
internal:" and "tests: runtime:" but the validator currently only accepts a
single-level "tests:" prefix; update the commit validator
(commit_prefix_check.py) so validate_commit parses and accepts nested test
prefixes by either allowing a secondary subprefix after "tests:" (e.g. treat
"tests: internal" and "tests: runtime" as valid by matching a regex like
'^tests:\s*\w+:' or by splitting on ':' and checking that the leading token is
"tests" and the second token is an allowed subcomponent), and update the
allowed_prefixes logic (or the function that checks prefix validity within
validate_commit) to permit these hierarchical entries so the two test cases in
test_commit_lint.py pass.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a4dbb3ed-5f93-4b04-8012-372a729d3331

📥 Commits

Reviewing files that changed from the base of the PR and between a2baa28 and d6916c9.

📒 Files selected for processing (1)
  • .github/scripts/tests/test_commit_lint.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/commit_prefix_check.py:
- Line 367: The current check lets any stacked subject pass when its root
matches expected (using subj_root_lower), which is too permissive; change the
logic around the conditional that uses subj_lower, expected_lower, and
subj_root_lower so the root-prefix fallback only applies for a defined set of
umbrella prefixes (e.g., create an umbrella_prefixes set containing allowed
umbrella roots like "tests:" or others and only allow subj_root_lower in
expected_lower when subj_root_lower is a member of umbrella_prefixes); update
the if expression to require either subj_lower in expected_lower OR
(subj_root_lower in expected_lower AND subj_root_lower in umbrella_prefixes) and
adjust related tests to enforce that "tests: internal:" is required for commits
under tests/internal/**.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d041fe04-ad8f-4bfa-9f2f-618a1885f1c6

📥 Commits

Reviewing files that changed from the base of the PR and between d6916c9 and 6280934.

📒 Files selected for processing (1)
  • .github/scripts/commit_prefix_check.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/scripts/commit_prefix_check.py (1)

340-367: ⚠️ Potential issue | 🟠 Major

Enforce the documented tests: subscopes.

With Line 340's tests: umbrella check and Line 367's root-prefix fallback, validate_commit() also accepts tests: runtime: for tests/internal/... files, and vice versa, because it only verifies p.startswith("tests/"). That leaves the documented tests: internal: / tests: runtime: prefixes unenforced.

🔧 Suggested guard for known test scopes
             elif subj_root_lower == "tests:":
                 if not all(p.startswith("tests/") for p in norm_paths):
                     expected_list = sorted(expected)
                     expected_str = ", ".join(expected_list)
                     return False, (
                         f"Subject prefix '{subject_prefix}' does not match files changed.\n"
                         f"Expected one of: {expected_str}"
                     )
+
+                nested_parts = subject_prefix.lower().split()
+                if len(nested_parts) > 1 and nested_parts[1] in {"internal:", "runtime:"}:
+                    scope = nested_parts[1][:-1]
+                    if not all(p.startswith(f"tests/{scope}/") for p in norm_paths):
+                        expected_list = sorted(expected)
+                        expected_str = ", ".join(expected_list)
+                        return False, (
+                            f"Subject prefix '{subject_prefix}' does not match files changed.\n"
+                            f"Expected one of: {expected_str}"
+                        )

A small negative regression like tests/internal/... + tests: runtime: would catch this too. Based on learnings: Applies to tests/internal/** : Commit messages for internal tests: use prefix tests: internal: with short imperative description; Applies to tests/runtime/** : Commit messages for runtime tests: use prefix tests: runtime: with short imperative description.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/commit_prefix_check.py around lines 340 - 367,
validate_commit() currently treats any "tests:" subject as matching all tests/*
paths, allowing mismatched subscopes (e.g., "tests: runtime:" with
tests/internal/ files). Update the logic around subj_root_lower and subj_lower
to explicitly enforce the documented test subscopes: when subj_lower == "tests:
internal:" require all(norm_paths start with "tests/internal/") (and similarly
when subj_lower == "tests: runtime:" require all(norm_paths start with
"tests/runtime/")); keep the generic "tests:" branch for top-level tests/* but
ensure the error message/expected list reflects the specific scope checked; use
the existing symbols validate_commit(), subj_root_lower, subj_lower, and
norm_paths to locate and change the checks and returned expected_str wording.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In @.github/scripts/commit_prefix_check.py:
- Around line 340-367: validate_commit() currently treats any "tests:" subject
as matching all tests/* paths, allowing mismatched subscopes (e.g., "tests:
runtime:" with tests/internal/ files). Update the logic around subj_root_lower
and subj_lower to explicitly enforce the documented test subscopes: when
subj_lower == "tests: internal:" require all(norm_paths start with
"tests/internal/") (and similarly when subj_lower == "tests: runtime:" require
all(norm_paths start with "tests/runtime/")); keep the generic "tests:" branch
for top-level tests/* but ensure the error message/expected list reflects the
specific scope checked; use the existing symbols validate_commit(),
subj_root_lower, subj_lower, and norm_paths to locate and change the checks and
returned expected_str wording.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19c83acf-b7d8-49d1-bdbf-b32a4844f880

📥 Commits

Reviewing files that changed from the base of the PR and between 6280934 and ed92765.

📒 Files selected for processing (2)
  • .github/scripts/commit_prefix_check.py
  • .github/scripts/tests/test_commit_lint.py

edsiper added 4 commits April 7, 2026 12:32
Signed-off-by: Eduardo Silva <eduardo@calyptia.com>
Signed-off-by: Eduardo Silva <eduardo@calyptia.com>
Signed-off-by: Eduardo Silva <eduardo@calyptia.com>
Signed-off-by: Eduardo Silva <eduardo@calyptia.com>
@edsiper
Copy link
Copy Markdown
Member Author

edsiper commented Apr 7, 2026

wip: fixed code conflicts.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/flb_cfl_record_accessor.c (1)

654-661: Correct fix for success/failure check.

The change from ret == -1 to ret == 0 properly guards the buffer usage to only occur on successful JSON conversion. Memory is correctly freed in both paths.

Consider using flb_sds_len(js) instead of strlen(js) at line 655 for consistency with sds type conventions.

♻️ Optional: use flb_sds_len for consistency
             if (ret == 0) {
-                len = strlen(js);
+                len = flb_sds_len(js);
                 tmp = flb_sds_cat(buf, js, len);
                 flb_sds_destroy(js);
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_cfl_record_accessor.c` around lines 654 - 661, The success branch
currently uses strlen(js) when appending the sds string to buf; replace the
strlen call with flb_sds_len(js) so the code in the block where ret == 0 uses
the sds API consistently (keep the surrounding logic with flb_sds_cat(buf, js,
len) and flb_sds_destroy(js) unchanged, and ensure flb_sds_destroy(js) still
runs in the else branch where ret != 0); reference variables/functions: ret, js,
buf, flb_sds_cat, flb_sds_destroy, flb_sds_len.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/flb_cfl_record_accessor.c`:
- Around line 654-661: The success branch currently uses strlen(js) when
appending the sds string to buf; replace the strlen call with flb_sds_len(js) so
the code in the block where ret == 0 uses the sds API consistently (keep the
surrounding logic with flb_sds_cat(buf, js, len) and flb_sds_destroy(js)
unchanged, and ensure flb_sds_destroy(js) still runs in the else branch where
ret != 0); reference variables/functions: ret, js, buf, flb_sds_cat,
flb_sds_destroy, flb_sds_len.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3eb54987-8171-4980-a9e9-1c7b8678d8b6

📥 Commits

Reviewing files that changed from the base of the PR and between ed92765 and eee8993.

📒 Files selected for processing (5)
  • .github/scripts/commit_prefix_check.py
  • .github/scripts/tests/test_commit_lint.py
  • src/flb_cfl_ra_key.c
  • src/flb_cfl_record_accessor.c
  • tests/internal/cfl_record_accessor.c
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/flb_cfl_ra_key.c
  • tests/internal/cfl_record_accessor.c
  • .github/scripts/tests/test_commit_lint.py

@edsiper edsiper merged commit e0b99b6 into master Apr 7, 2026
53 checks passed
@edsiper edsiper deleted the ra_key-fixes branch April 7, 2026 19:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant