Status: Future Vision (Long-Distance Backlog, Not Active)
Priority: High
Category: Flatten/Unflatten Semantics, Performance, Hierarchical Chunking
Enable efficient and predictable work on very deep structured content (JSON/XML/TOML/YAML) by combining:
- token separators (multi-character, not only single-char),
- depth-window operations on flattened paths,
- future `unflatten` round-trip materialization.
Target pipeline:
`flatten -> filter/chunk by depth -> merge/operate -> unflatten`

Deep files create two recurring problems:
- Too much path fan-out at once (difficult to focus operationally).
- Separator collisions when data keys contain separator-like characters.
Token separators and depth windows can reduce ambiguity and improve chunkability for large workflows.
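
The collision problem can be made concrete with a small sketch. The `flatten` and `colliding_keys` helpers below are hypothetical illustrations, not the tool's implementation; the sketch assumes dict/list input and `__` as the token.

```python
from typing import Any, Iterator, List, Tuple


def flatten(node: Any, sep: str = "__", prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) pairs; segments are joined with the full token `sep`."""
    if isinstance(node, dict) and node:
        for key, child in node.items():
            yield from flatten(child, sep, prefix + (str(key),))
    elif isinstance(node, list) and node:
        for index, child in enumerate(node):
            yield from flatten(child, sep, prefix + (str(index),))
    else:
        # Scalars (and empty containers) are emitted as leaves.
        yield sep.join(prefix), node


def colliding_keys(node: Any, sep: str = "__", prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return the segment tuples of keys whose own text contains the token `sep`."""
    found: List[Tuple[str, ...]] = []
    if isinstance(node, dict):
        for key, child in node.items():
            here = prefix + (str(key),)
            if sep in str(key):
                found.append(here)
            found.extend(colliding_keys(child, sep, here))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            found.extend(colliding_keys(child, sep, prefix + (str(index),)))
    return found


nested = {"a": {"b": {"c": 1}}}
flat_key = {"a": {"b__c": 1}}

# Two different structures collapse to the same flat path.
print(list(flatten(nested)))     # [('a__b__c', 1)]
print(list(flatten(flat_key)))   # [('a__b__c', 1)]
print(colliding_keys(flat_key))  # [('a', 'b__c')]
```

Reporting the offending key path before flattening is one way to avoid a silent collapse; which policy applies after detection is left open here.
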
- `tree/files/merge` support token separators (for example `--sep "__"`).
- `flatten` still executes with single-character separator behavior.
- `unflatten` remains contract-only (Improvement 15), not CLI-implemented.
Improvement 17 is therefore intentionally parked until those dependencies mature.
Treat path hierarchy as an explicit token stream and operate on bounded depth ranges:
- Work at one depth window first (`d..d+k`).
- Run targeted transforms/filters/merge in that window.
- Expand window only when needed.
This allows focused, staged processing of deep structures instead of whole-tree churn.
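
A minimal sketch of a depth-window filter over flat records follows. It assumes the window `d..d+k` is inclusive at both ends, that depth is the number of token-delimited segments, and that records are dicts with `path`, `value`, and `kind` fields. The function names and the `kind` values are hypothetical.

```python
from typing import Any, Dict, List

# Assumed record shape: {"path": ..., "value": ..., "kind": ...}
Record = Dict[str, Any]


def path_depth(path: str, sep: str) -> int:
    """Count segments by splitting on the full separator token."""
    return len(path.split(sep)) if path else 0


def depth_window(records: List[Record], sep: str, d: int, k: int) -> List[Record]:
    """Keep records whose depth falls in the inclusive window d..d+k."""
    return [r for r in records if d <= path_depth(r["path"], sep) <= d + k]


records = [
    {"path": "app__name", "value": "demo", "kind": "string"},
    {"path": "app__db__host", "value": "localhost", "kind": "string"},
    {"path": "app__db__pool__max", "value": 8, "kind": "number"},
    {"path": "app__db__pool__limits__burst", "value": 16, "kind": "number"},
]

window = depth_window(records, sep="__", d=3, k=1)
print([r["path"] for r in window])  # ['app__db__host', 'app__db__pool__max']
```

Expanding the window is then a matter of calling the filter again with a larger `k`, so earlier stages never have to touch the deeper records.
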
- Token-stable flatten paths
  - `flatten` must honor full separator tokens (`.`, `_`, `__`, `::`, etc.).
- Depth-window filtering
  - ability to constrain operations to path depth ranges in flat records.
- Chunk planning
  - deterministic chunk partitioning by depth and prefix.
- Round-trip safety
  - `flatten -> merge(flat) -> unflatten` with collision diagnostics.
- Collision-aware policies
  - explicit handling when keys contain separator tokens.
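
As an illustration of the chunk-planning item, the sketch below groups flat paths by `(depth, prefix)` and sorts both the chunk keys and their members, so the plan does not depend on input order. `plan_chunks` and its `prefix_depth` parameter are hypothetical names, and keying on both depth and prefix is only one possible reading of "by depth and prefix".

```python
from collections import defaultdict
from typing import Dict, List, Tuple

ChunkKey = Tuple[int, str]  # (path depth, joined prefix)


def plan_chunks(paths: List[str], sep: str, prefix_depth: int) -> Dict[ChunkKey, List[str]]:
    """Partition paths by (depth, first `prefix_depth` segments), in sorted order."""
    groups: Dict[ChunkKey, List[str]] = defaultdict(list)
    for path in paths:
        segments = path.split(sep)
        key = (len(segments), sep.join(segments[:prefix_depth]))
        groups[key].append(path)
    # Sorting keys and members makes the plan independent of input order.
    return {key: sorted(groups[key]) for key in sorted(groups)}


paths = ["svc__db__host", "svc__cache__ttl", "svc__db__port", "svc__name"]
for key, members in plan_chunks(paths, sep="__", prefix_depth=2).items():
    print(key, members)
# (2, 'svc__name') ['svc__name']
# (3, 'svc__cache') ['svc__cache__ttl']
# (3, 'svc__db') ['svc__db__host', 'svc__db__port']
```
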
Use separator tokens by domain, not uniformly:
- `:`/`::` are in-file domain separators (namespaces/symbol scopes).
- File-system lane/eventness workflows should use file-safe separators (`.`, `_`, `-`, `__`).
- Before cross-file merge/eventness discovery, normalize in-file `:`/`::` paths into the target file-safe separator domain.
This avoids Windows filename constraints and keeps in-file traversal semantics separate from file-lane hierarchy semantics.
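
The normalization step might look like the sketch below. `normalize_separator` is a hypothetical helper. It assumes `::` as the in-file token and `__` as the file-safe target, and it rejects any segment that already contains the target token or a stray `:` (a character Windows does not allow in file names) rather than guessing at a repair.

```python
def normalize_separator(path: str, source_sep: str = "::", target_sep: str = "__") -> str:
    """Re-join an in-file path with a file-safe token, refusing ambiguous segments."""
    segments = path.split(source_sep)
    for segment in segments:
        if target_sep in segment:
            # Joining would make this segment look like two path levels.
            raise ValueError(f"segment {segment!r} already contains target token {target_sep!r}")
        if ":" in segment:
            # A leftover ':' would not be valid in a Windows file name.
            raise ValueError(f"segment {segment!r} still contains ':' after splitting")
    return target_sep.join(segments)


print(normalize_separator("core::io::Reader"))  # core__io__Reader

try:
    normalize_separator("core::my__mod")
except ValueError as err:
    print(err)  # segment 'my__mod' already contains target token '__'
```
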
- Improvement 15 core implementation (`unflatten` MVP).
- Flatten token-separator parity with tree/files/merge.
- Flat-record contract stability (`path`, `value`, `kind`) across operations.
| Phase | Name | Outcome | Status |
|---|---|---|---|
| A | Contract Freeze | define token + depth-window semantics | planned |
| B | Flatten Token Parity | flatten honors full token separators | planned |
| C | Unflatten Token Parity | unflatten reconstructs from token paths | planned |
| D | Depth Window Ops | depth-bounded filter/chunk workflows | planned |
| E | Performance Validation | benchmark deep/wide workloads and guardrails | planned |
- What token escaping contract is needed for keys containing the token itself?
- Which depth metrics best predict useful chunk boundaries?
- Should chunking be deterministic by lexical path, structural prefix, or both?
- How should merge provenance be retained across chunked windows?
- Token separators round-trip without silent path collapse.
- Depth-window operations reduce runtime and memory pressure on deep datasets.
- Chunked pipelines remain deterministic and composable.
- Contract tests validate `flatten -> merge(flat) -> unflatten` across deep/wide fixtures.
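
A contract test for the round trip could be shaped like the sketch below. Everything in it is an assumption made for illustration: the helper names, scalar-only leaves, dict-only input, and a flat merge policy in which the later source wins. The actual `unflatten` contract is owned by Improvement 15.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def flatten(node: Any, sep: str, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) pairs for nested dicts with scalar leaves."""
    if isinstance(node, dict) and node:
        for key, child in node.items():
            yield from flatten(child, sep, prefix + (str(key),))
    else:
        yield sep.join(prefix), node


def unflatten(records: Iterable[Tuple[str, Any]], sep: str) -> Dict[str, Any]:
    """Rebuild nested dicts, raising instead of silently collapsing paths."""
    root: Dict[str, Any] = {}
    for path, value in records:
        segments = path.split(sep)
        node = root
        for depth, segment in enumerate(segments[:-1], start=1):
            child = node.setdefault(segment, {})
            if not isinstance(child, dict):
                taken = sep.join(segments[:depth])
                raise ValueError(f"collision: {taken!r} is both a leaf and a branch")
            node = child
        if segments[-1] in node:
            raise ValueError(f"collision: {path!r} conflicts with an existing path")
        node[segments[-1]] = value
    return root


left = {"svc": {"db": {"host": "a", "port": 1}}}
right = {"svc": {"db": {"port": 2}, "name": "x"}}

# Assumed merge policy for this sketch: later source overrides earlier.
merged_flat = {**dict(flatten(left, "__")), **dict(flatten(right, "__"))}
print(unflatten(merged_flat.items(), "__"))
# {'svc': {'db': {'host': 'a', 'port': 2}, 'name': 'x'}}
```

Feeding `unflatten` the pair of paths `a__b` and `a__b__c` raises a `ValueError` in either order, which is the kind of diagnostic the round-trip criterion asks for.
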
- Immediate implementation in current phase lanes.
- Replacing Improvement 15 scope.
- Introducing opaque automatic chunking without explicit operator controls.
- `README.CORE.IMPROVEMENT15.md`
- `README.CORE.IMPROVEMENT16.MD`
- `docs/main.command.flatten.separator-token.investigation.md`
- `docs/main.improvement.17.todo.future-plan.md`