Commit cd96b77

feat(ci): enforce lowerCamelCase and max depth in reference.conf
Add a CI gate that scans common/src/main/resources/reference.conf and fails the build when any key violates lowerCamelCase (^[a-z][a-zA-Z0-9]*$ per dot-separated segment) or exceeds the maximum hierarchy depth (6). Array element keys are validated the same way; each array step counts as one depth level, e.g. an inner field at `rate.limiter.rpc[].component` is depth 5.

Parsing is delegated to pyhocon, the reference Python HOCON implementation. It returns a fully-merged ConfigTree where dotted-form keys expand into nested objects (the same canonical key set Typesafe Config and ConfigBeanFactory see at runtime) and handles triple-strings, substitutions, includes, +=, and block comments without us re-implementing the grammar.

Four legacy PBFT* keys are grandfathered via an in-script allowlist so the gate fails only on new violations. A consolidated GHA error annotation lists every offending key, and sys.exit(1) drives step failure. The script also accepts `--debug` to print every parsed key with its depth (trailing `/` marks namespace intermediates) for manual verification against the source file.

Runs as a new step in the existing checkstyle job of pr-check.yml (setup-python + `pip install pyhocon`), so no extra runner spin-up.
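The naming and depth rules described above can be illustrated on a single path string. This is a minimal sketch, not the script from this commit: `check_path` is a hypothetical helper that works on an already-flattened path, whereas the real gate walks the parsed pyhocon tree. The regex and the limit of 6 are taken from the commit message.

```python
import re

# Values taken from the commit message; the helper name is hypothetical.
KEY_REGEX = re.compile(r"^[a-z][a-zA-Z0-9]*$")
MAX_DEPTH = 6


def check_path(path):
    """Return (first_bad_segment_or_None, depth) for a dotted key path.

    A trailing `[]` on a segment marks an array step: it adds one depth
    level but is not itself validated as a name.
    """
    depth = 0
    bad = None
    for seg in path.split("."):
        depth += 1  # the named segment itself
        while seg.endswith("[]"):
            seg = seg[:-2]
            depth += 1  # each array step is one more level
        if bad is None and not KEY_REGEX.match(seg):
            bad = seg
    return bad, depth


if __name__ == "__main__":
    for p in ("rate.limiter.rpc[].component", "node.http.PBFTEnable", "a.b.c.d.e.f.g"):
        bad, depth = check_path(p)
        print(f"{p}: bad segment={bad!r}, depth={depth}, too deep={depth > MAX_DEPTH}")
```

Under these assumptions `rate.limiter.rpc[].component` comes out at depth 5 with no bad segment, while `node.http.PBFTEnable` is flagged on `PBFTEnable`, which is why the commit needs an allowlist for the legacy PBFT keys.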
1 parent 381d369 commit cd96b77

2 files changed

Lines changed: 241 additions & 0 deletions

.github/scripts/check_reference_conf.py

Lines changed: 227 additions & 0 deletions
@@ -0,0 +1,227 @@
#!/usr/bin/env python3
"""Validate java-tron reference.conf key names and hierarchy depth.

Rules enforced:
  1. Every user-defined segment of every key path must match ^[a-z][a-zA-Z0-9]*$
     (lowerCamelCase: starts lowercase, letters/digits only).
  2. Total path depth must be <= MAX_DEPTH. Each list/array step counts as
     one additional level. For example `rate.limiter.http[].component` is
     5 levels deep (rate=1, limiter=2, http=3, []=4, component=5).
  3. ALLOWLIST entries are exempt from the format rule (legacy keys that ship
     in user configs; renaming would break compatibility).

Parsing strategy: delegated to pyhocon (https://github.com/chimpler/pyhocon),
the reference Python HOCON implementation. This avoids hand-rolled scanner
pitfalls (key = { ... } prefix loss, triple-strings, substitutions, includes,
+= operator, block comments). pyhocon returns a fully-merged ConfigTree where
dotted-form keys are expanded into nested objects, i.e. the same canonical
key set Typesafe Config / ConfigBeanFactory will see at runtime.

Array handling: keys inside object-elements of arrays are also user-defined
config keys (e.g. each entry in `rate.limiter.rpc = [{ component=..., ... }]`
is parsed by RateLimiterConfig). The walker recurses into list elements and
treats the array step as a synthetic `[]` segment that contributes to depth
but is not itself validated as a name. Element keys are deduplicated across
list entries because well-formed arrays use homogeneous object shapes.

Debug mode: pass `--debug` to print every parsed key with its depth, in
walk order (which mirrors the file top-to-bottom). Use this to eyeball the
parser's view against reference.conf.

Exit code: 0 if clean, 1 if any violation remains after allowlist filtering,
2 on environment errors (missing pyhocon, file not found, parse failure).

CI integration: invoked by the `Validate reference.conf key names and depth`
step of the `checkstyle` job in `.github/workflows/pr-check.yml`. The non-zero
exit on violations is what makes that step fail; there is intentionally NO
extra `exit 1` in the workflow shell wrapper. A single GHA `::error` workflow
command is also emitted unconditionally (not gated on the GITHUB_ACTIONS env
var) so local runs produce the same output as CI; the leading `::` is
harmless noise locally.
"""
import re
import sys
from pathlib import Path

try:
    from pyhocon import ConfigFactory, ConfigTree
except ImportError:
    print(
        "error: pyhocon is required. Install with `pip install pyhocon`.",
        file=sys.stderr,
    )
    sys.exit(2)

# Current max depth in reference.conf is 5; +1 buffer so a single new nested
# key does not break CI. A future bump warrants a design discussion (deeper
# trees hurt readability and complicate ConfigBeanFactory mapping).
MAX_DEPTH = 6
KEY_REGEX = re.compile(r'^[a-z][a-zA-Z0-9]*$')
# Legacy keys grandfathered to keep user `config.conf` files compatible.
# Do NOT extend this list for new keys; every new key must be lowerCamelCase.
# A future rename + deprecation cycle can shrink this set back to empty.
ALLOWLIST = {
    "node.http.PBFTEnable",
    "node.http.PBFTPort",
    "node.rpc.PBFTEnable",
    "node.rpc.PBFTPort",
}


def walk(node, path, depth):
    """Yield (full_path, depth, is_leaf) for every reachable user-defined key.

    - ConfigTree key adds one depth level and contributes a name segment.
    - list step adds one synthetic level rendered as `[]`. Element-internal
      keys are walked once per unique sub-path (homogeneous object arrays
      otherwise yield each field N times).
    - Scalars / null / list-of-scalars produce no further keys.

    `depth` includes the array `[]` steps. `is_leaf` is True when the value
    at this path is a scalar/list/null (i.e. not another ConfigTree), so
    callers can filter leaves vs namespace intermediates.
    """
    if isinstance(node, ConfigTree):
        for k, v in node.items():
            new_path = f"{path}.{k}" if path else k
            new_depth = depth + 1
            is_leaf = not isinstance(v, ConfigTree)
            yield new_path, new_depth, is_leaf
            yield from walk(v, new_path, new_depth)
    elif isinstance(node, list):
        array_path = f"{path}[]"
        array_depth = depth + 1
        seen = set()
        for elem in node:
            # Object element: walk its keys. Nested list element (HOCON allows
            # list-of-list, e.g. `a = [[{x=1}]]`): recurse so each inner [] step
            # also contributes to depth. Scalar elements have no sub-keys.
            if isinstance(elem, (ConfigTree, list)):
                for sub_path, sub_depth, sub_leaf in walk(elem, array_path, array_depth):
                    if sub_path in seen:
                        continue
                    seen.add(sub_path)
                    yield sub_path, sub_depth, sub_leaf


def main(argv):
    debug = False
    args = list(argv[1:])
    if args and args[0] == "--debug":
        debug = True
        args = args[1:]
    if len(args) != 1:
        print(f"usage: {argv[0]} [--debug] <path/to/reference.conf>", file=sys.stderr)
        return 2
    path = Path(args[0])
    if not path.is_file():
        print(f"error: file not found: {path}", file=sys.stderr)
        return 2

    try:
        tree = ConfigFactory.parse_file(str(path))
    except Exception as e:
        print(f"error: failed to parse {path}: {e}", file=sys.stderr)
        # Mirror the violation path: emit a single GHA annotation so the
        # parse failure surfaces in the PR check summary, not just the log.
        print(f"::error file={path},title=reference.conf::failed to parse: {e}")
        return 2

    keys = list(walk(tree, "", 0))

    if debug:
        # Keys are yielded in pyhocon insertion order, which mirrors the
        # source file top-to-bottom. Eyeball this against reference.conf to
        # confirm coverage; the depth column makes the array `[]` steps
        # explicit so MAX_DEPTH math is verifiable by inspection. Trailing
        # `/` marks namespace intermediates (have children); bare names are
        # leaves, so `grep -v '/$'` filters to just leaves.
        leaf_count = sum(1 for _, _, lf in keys if lf)
        print(
            f"DEBUG: {len(keys)} parsed keys "
            f"({leaf_count} leaves + {len(keys) - leaf_count} intermediates), "
            f"walk order:"
        )
        for full_path, depth, is_leaf in keys:
            label = full_path if is_leaf else full_path + "/"
            print(f" d={depth} {label}")
        print()

    format_violations = []
    depth_violations = []

    # Only check leaves: pyhocon expands a dotted-form declaration like
    # `a.b.c = X` into intermediate ConfigTree nodes for `a` and `a.b`. A
    # single user-written bad key would otherwise be reported once per
    # intermediate AND once as the leaf, multiplying noise. The leaf path
    # carries every segment, so checking just leaves covers all segments.
    for full_path, depth, is_leaf in keys:
        if not is_leaf:
            continue
        if full_path not in ALLOWLIST:
            for seg in full_path.split('.'):
                # Strip any number of trailing `[]` markers; nested arrays
                # produce segments like `a[][]`.
                while seg.endswith('[]'):
                    seg = seg[:-2]
                if seg and not KEY_REGEX.match(seg):
                    format_violations.append((full_path, seg))
                    break

        # The depth rule applies to allowlisted keys too; only the format
        # rule is exempted.
        if depth > MAX_DEPTH:
            depth_violations.append((full_path, depth))

    format_violations.sort()
    depth_violations.sort()

    if format_violations or depth_violations:
        lines_out = []
        if format_violations:
            lines_out.append(
                f"Format violations ({len(format_violations)}) - "
                f"each segment must match {KEY_REGEX.pattern}:"
            )
            for full_path, seg in format_violations:
                lines_out.append(f" format: {full_path} (segment: '{seg}')")
        if depth_violations:
            if lines_out:
                lines_out.append("")
            lines_out.append(
                f"Depth violations ({len(depth_violations)}) - max depth is {MAX_DEPTH} "
                f"(each `[]` array step counts as one level):"
            )
            for full_path, depth in depth_violations:
                lines_out.append(
                    f" depth: {full_path} (depth={depth}, max={MAX_DEPTH})"
                )
        print("\n".join(lines_out))
        print()

        # Emit ONE consolidated GHA workflow annotation. All offending entries
        # are packed into the annotation body via %0A (GHA's newline escape)
        # so the entries are visible in the annotation summary, not just in
        # the job log.
        entries = []
        for full_path, seg in format_violations:
            entries.append(f"format: {full_path} (segment '{seg}')")
        for full_path, depth in depth_violations:
            entries.append(f"depth: {full_path} (depth={depth}, max={MAX_DEPTH})")
        body = (
            f"reference.conf has {len(format_violations)} format + "
            f"{len(depth_violations)} depth violation(s):%0A"
            + "%0A".join(entries)
        )
        print(f"::error file={path},title=reference.conf::{body}")
        print(
            f"FAIL: {len(format_violations)} format + {len(depth_violations)} depth "
            f"violation(s) in {path}",
            file=sys.stderr,
        )
        return 1

    # Separate the path from the key count; without a delimiter the two run
    # together in the success message.
    print(f"OK: {path}: {len(keys)} keys, all lowerCamelCase, depth <= {MAX_DEPTH}")
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
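To see the walker's depth counting and array deduplication without installing pyhocon, the same traversal can be sketched over plain Python containers. This is an illustrative stand-in, assuming that a `dict` behaves like pyhocon's `ConfigTree` for iteration purposes; `walk_plain` and `sample` are hypothetical names, not part of the commit.

```python
def walk_plain(node, path="", depth=0):
    """Yield (full_path, depth, is_leaf); dicts stand in for ConfigTree."""
    if isinstance(node, dict):
        for key, value in node.items():
            new_path = f"{path}.{key}" if path else key
            yield new_path, depth + 1, not isinstance(value, dict)
            yield from walk_plain(value, new_path, depth + 1)
    elif isinstance(node, list):
        seen = set()  # dedupe keys repeated across homogeneous list elements
        for elem in node:
            if isinstance(elem, (dict, list)):
                # The array step is rendered as `[]` and costs one depth level.
                for item in walk_plain(elem, f"{path}[]", depth + 1):
                    if item[0] not in seen:
                        seen.add(item[0])
                        yield item


# Hypothetical config shaped like the rate.limiter.rpc example in the docstring.
sample = {
    "rate": {
        "limiter": {
            "rpc": [
                {"component": "A", "strategy": "s1"},
                {"component": "B", "strategy": "s2"},
            ]
        }
    }
}

if __name__ == "__main__":
    for full_path, depth, is_leaf in walk_plain(sample):
        print(f"d={depth} {full_path if is_leaf else full_path + '/'}")
```

Each of `component` and `strategy` is reported once at depth 5 even though two list elements define them, which matches the behaviour the `seen` set is meant to provide in the real walker.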

.github/workflows/pr-check.yml

Lines changed: 14 additions & 0 deletions
@@ -103,6 +103,20 @@ jobs:
     steps:
       - uses: actions/checkout@v5
 
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install pyhocon
+        run: pip install --quiet pyhocon
+
+      - name: Validate reference.conf key names and depth
+        shell: bash
+        run: |
+          python3 .github/scripts/check_reference_conf.py \
+            common/src/main/resources/reference.conf
+
       - name: Set up JDK 17
         uses: actions/setup-java@v5
         with:
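The new step depends on two outputs of the script: its exit code and the single `::error` line it prints. The script joins entries with a literal `%0A`, but it interpolates the parse-failure message unescaped. As far as I know, the GitHub Actions workflow-command format expects `%`, carriage return and line feed in the message to be percent-encoded, and `:` and `,` as well inside property values. The sketch below shows one way to build such a line from raw text. The function names are hypothetical and none of this is in the commit.

```python
def escape_data(text):
    """Percent-encode characters that would break a workflow-command message."""
    # `%` must be handled first so later replacements are not double-encoded.
    return text.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def escape_property(text):
    """Property values (file=, title=) additionally escape ':' and ','."""
    return escape_data(text).replace(":", "%3A").replace(",", "%2C")


def error_annotation(file, title, message):
    """Build one `::error` line from raw, unescaped inputs."""
    props = f"file={escape_property(file)},title={escape_property(title)}"
    return f"::error {props}::{escape_data(message)}"


if __name__ == "__main__":
    raw = "2 violation(s):\nformat: node.fooBar_baz\ndepth: a.b.c.d.e.f.g"
    print(error_annotation("reference.conf", "reference.conf", raw))
```

With this approach the caller joins entries with a real newline and lets `escape_data` produce `%0A`; pre-joined `%0A` strings should not be passed through it, since the `%` would be encoded a second time.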
