Core Improvement 17: Depth-Windowed Token Separators for Recomposition Pipelines

Status: Future Vision (Long-Distance Backlog, Not Active)
Priority: High
Category: Flatten/Unflatten Semantics, Performance, Hierarchical Chunking

Vision

Enable efficient and predictable work on very deep structured content (JSON/XML/TOML/YAML) by combining:

token separators (multi-character, not only single-char),
depth-window operations on flattened paths,
future unflatten round-trip materialization.

Target pipeline:

flatten -> filter/chunk by depth -> merge/operate -> unflatten

Why This Improvement Exists

Deep files create two recurring problems:

Too much path fan-out at once, which makes it hard to focus work on one part of the tree.
Separator collisions when data keys contain separator-like characters.

Token separators and depth windows can reduce ambiguity and improve chunkability for large workflows.
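The collision problem can be illustrated with a small sketch. `naive_flatten` below is a hypothetical helper written for this example, not the tool's actual flatten implementation; it only shows how a single-character separator can silently collapse two distinct keys into one path, and how a token separator keeps them apart for the same input.

```python
# Hypothetical illustration only; this is not the real flatten command.
def naive_flatten(node, sep, prefix=""):
    """Flatten nested dicts into {path: value}, joining keys with `sep`."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(naive_flatten(value, sep, path))
        else:
            flat[path] = value
    return flat


# One nested key pair and one literal key that contains a dot.
doc = {"a": {"b": 1}, "a.b": 2}

# With "." both leaves land on the path "a.b", so one value is lost.
collided = naive_flatten(doc, ".")

# With the token "__" the two leaves keep distinct paths.
distinct = naive_flatten(doc, "__")
```

A token separator only lowers the chance of a collision; a key that contains the token itself still needs an explicit policy.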

Current Reality (2026-03-01)

tree/files/merge support token separators (for example --sep "__").
flatten still treats its separator as a single character.
unflatten remains contract-only (Improvement 15); it is not yet implemented in the CLI.

Improvement 17 is therefore intentionally parked until those dependencies mature.

Core Idea

Treat path hierarchy as an explicit token stream and operate on bounded depth ranges:

Work at one depth window first (depths d..d+k, where d is the starting depth and k is the window width).
Run targeted transforms/filters/merge in that window.
Expand window only when needed.

This allows focused, staged processing of deep structures instead of whole-tree churn.
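A depth window over flat records could look roughly like the sketch below. It assumes that depth means the number of segments obtained by splitting a path on the separator token and that the window d..d+k is inclusive at both ends; neither convention is fixed by this document. The function names and the dict-shaped records are hypothetical.

```python
# Hypothetical sketch; depth and window conventions are assumptions.
def path_depth(path: str, sep: str) -> int:
    """Depth = number of segments when the path is split on the token."""
    return len(path.split(sep))


def depth_window(records, sep, d, k):
    """Keep records whose path depth lies in the inclusive range d..d+k."""
    return [r for r in records if d <= path_depth(r["path"], sep) <= d + k]


records = [
    {"path": "a", "value": 1},
    {"path": "a__b", "value": 2},
    {"path": "a__b__c", "value": 3},
    {"path": "a__b__c__d", "value": 4},
]

# Start with depths 2..3; widen k later only if the work requires it.
window = depth_window(records, "__", d=2, k=1)
```

Here `window` holds the records at `a__b` and `a__b__c`, so later transforms never touch the shallower or deeper paths.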

Proposed Future Capabilities

Token-stable flatten paths
- flatten must honor full separator tokens (., _, __, ::, etc.).
Depth-window filtering
- ability to constrain operations to path depth ranges in flat records.
Chunk planning
- deterministic chunk partitioning by depth and prefix.
Round-trip safety
- flatten -> merge(flat) -> unflatten with collision diagnostics.
Collision-aware policies
- explicit handling when keys contain separator tokens.
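As one possible reading of "deterministic chunk partitioning by depth and prefix", the sketch below groups paths by their first `prefix_depth` segments and sorts both the chunk keys and the paths inside each chunk. `plan_chunks` and its parameters are hypothetical names; whether chunking should be lexical, structural, or both is still an open research question here.

```python
# Hypothetical sketch of a chunk planner; not an existing command.
from collections import defaultdict


def plan_chunks(paths, sep, prefix_depth):
    """Group paths by their leading `prefix_depth` segments.

    Sorting the input and the chunk keys makes the plan independent
    of the order in which paths arrive.
    """
    chunks = defaultdict(list)
    for path in sorted(paths):
        prefix = sep.join(path.split(sep)[:prefix_depth])
        chunks[prefix].append(path)
    return dict(sorted(chunks.items()))


paths = ["b__x__1", "a__y__2", "a__x__3", "a__x__1"]
plan = plan_chunks(paths, "__", prefix_depth=2)
```

Because the plan depends only on the set of paths, running it over the same flat records in a different order yields the same chunks.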

Separator Domain Rule (Proposed)

Use separator tokens by domain, not uniformly:

: and :: are in-file domain separators (namespaces/symbol scopes).
File-system lane/eventness workflows should use file-safe separators (., _, -, __).
Before cross-file merge/eventness discovery, normalize in-file : and :: paths into the target file-safe separator domain.

This avoids Windows filename constraints (: is not a valid character in Windows file names) and keeps in-file traversal semantics separate from file-lane hierarchy semantics.
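The normalization step might be sketched as follows. `to_file_safe` is a hypothetical helper; it assumes that `:` and `::` are both treated as segment boundaries and that a segment already containing the target token should be rejected rather than rewritten.

```python
# Hypothetical sketch; the real normalization rules are not yet specified.
import re

# Try "::" before ":" so a double colon counts as one boundary.
IN_FILE_SEP = re.compile(r"::|:")


def to_file_safe(path: str, target_sep: str = "__") -> str:
    """Rewrite an in-file path (':' or '::') into a file-safe separator."""
    segments = IN_FILE_SEP.split(path)
    for segment in segments:
        if target_sep in segment:
            # Joining would make this segment look like two segments.
            raise ValueError(
                f"segment {segment!r} already contains {target_sep!r}"
            )
    return target_sep.join(segments)


normalized = to_file_safe("pkg::module:Class")
dotted = to_file_safe("pkg::module:Class", target_sep=".")
```

Raising on a colliding segment is only one policy; an escaping scheme would be another, and this document leaves that choice open.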

Dependency Chain

Improvement 15 core implementation (unflatten MVP).
Flatten token-separator parity with tree/files/merge.
Flat-record contract stability (path, value, kind) across operations.

Phased Future Plan

| Phase | Name | Outcome | Status |
| --- | --- | --- | --- |
| A | Contract Freeze | define token + depth-window semantics | planned |
| B | Flatten Token Parity | flatten honors full token separators | planned |
| C | Unflatten Token Parity | unflatten reconstructs from token paths | planned |
| D | Depth Window Ops | depth-bounded filter/chunk workflows | planned |
| E | Performance Validation | benchmark deep/wide workloads and guardrails | planned |

Research Questions

What token escaping contract is needed for keys containing the token itself?
Which depth metrics best predict useful chunk boundaries?
Should chunking be deterministic by lexical path, structural prefix, or both?
How should merge provenance be retained across chunked windows?

Success Criteria

Token separators round-trip without silent path collapse.
Depth-window operations reduce runtime and memory pressure on deep datasets.
Chunked pipelines remain deterministic and composable.
Contract tests validate flatten -> merge(flat) -> unflatten across deep/wide fixtures.
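A round-trip contract test could take roughly the following shape. Both functions are hypothetical stand-ins, since unflatten is not implemented yet and flatten does not yet honor token separators. The sketch assumes string keys, treats lists and empty dicts as leaf values, and raises instead of silently collapsing a path when a key would not survive the split.

```python
# Hypothetical stand-ins for the future flatten/unflatten pair.
def flatten(node, sep, segments=()):
    """Flatten nested dicts into {path: value}, refusing ambiguous keys."""
    flat = {}
    for key, value in node.items():
        parts = segments + (key,)
        path = sep.join(parts)
        # If splitting does not give back the same segments, the key
        # interacts with the token and the path would be ambiguous.
        if tuple(path.split(sep)) != parts:
            raise ValueError(f"key {key!r} collides with token {sep!r}")
        if isinstance(value, dict) and value:
            flat.update(flatten(value, sep, parts))
        else:
            flat[path] = value
    return flat


def unflatten(flat, sep):
    """Rebuild nested dicts from {path: value} produced by flatten."""
    root = {}
    for path, value in flat.items():
        *parents, leaf = path.split(sep)
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


doc = {"a": {"b": {"c": 1}, "d": [1, 2]}, "e": "x", "f": {}}
round_tripped = unflatten(flatten(doc, "__"), "__")
```

Comparing each joined path against its own split also catches keys that merely end or start with part of the token (for example a key `a_` under the token `__`), which a plain substring check would miss.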

Non-Goals

Immediate implementation in current phase lanes.
Replacing Improvement 15 scope.
Introducing opaque automatic chunking without explicit operator controls.

Related

README.CORE.IMPROVEMENT15.md
README.CORE.IMPROVEMENT16.MD
docs/main.command.flatten.separator-token.investigation.md
docs/main.improvement.17.todo.future-plan.md
