
Commit 5e9113b

refactor(ssa): single coercion funnel — classify + emit Any box/unbox
Cast classification is now a Zyntax-layer primitive (`cast_classify` crate module) consulted by every coercion site: field stores/loads on Any slots and explicit `as` casts. `SsaBuilder::emit_coercion` dispatches `UpcastBox` (concrete → Any) to `zyntax_box_X` and `DowncastUnbox` (Any → concrete) to `zyntax_box_get_X`, with an HIR-type guard that suppresses double-unboxing when an earlier site already widened the value.

The classifier is frontend-agnostic: any DSL lowering to TypedAST inherits the funnel, not just ZynML. `CastKind` also reserves variants for class-hierarchy widening/narrowing and union variant wrap/extract so those slot in without restructuring the call sites when they land.

`convert_type` now lowers Named-atomic `Any` to `HirType::I64` (slot size matches the `*mut DynamicBoxRepr` the runtime allocates), so value-precise tiers (BC interp) don't drop the box pointer on store.

`bench_any_cast` exercises an explicit `as` round-trip (`2.5 as Any` then `boxed as f64`) for 1M iterations. Tiered + LLVM tiers verify correctness; interp tiers also pass after the FFI marshalling fix in the follow-up commit.
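The dispatch described in the message can be sketched roughly as below. This is a toy model, not the code from this commit: `HirTy`, `Emitted`, `runtime_suffix`, and the concrete symbol spellings (`zyntax_box_f64`, `zyntax_box_get_f64` as instances of `zyntax_box_X` / `zyntax_box_get_X`) are assumptions made for illustration. The guard is modeled as "the value no longer has the `I64` slot type that `Any` lowers to", which is one plausible reading of the message rather than the confirmed implementation.

```rust
// Toy model of the coercion funnel. Every type here is a hypothetical
// stand-in; the real `SsaBuilder::emit_coercion` operates on HIR values.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CastKind {
    Identity,
    UpcastBox,
    DowncastUnbox,
    Convert,
}

/// Simplified HIR value types (assumption: only two scalars are modeled).
/// `I64` doubles as the slot type for a boxed `Any`, as the commit describes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HirTy {
    I64,
    F64,
}

/// What the sketch "emits" for a coercion.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Emitted {
    Nothing,
    RuntimeCall(String),
    CastInstr,
}

/// Assumed mapping from an HIR type to the `X` in `zyntax_box_X`.
fn runtime_suffix(ty: HirTy) -> &'static str {
    match ty {
        HirTy::I64 => "i64",
        HirTy::F64 => "f64",
    }
}

/// `value_ty` is the HIR type the value currently has; `target_ty` is the
/// HIR type the coercion should produce.
fn emit_coercion(kind: CastKind, value_ty: HirTy, target_ty: HirTy) -> Emitted {
    match kind {
        CastKind::Identity => Emitted::Nothing,
        // Box: the runtime tag follows the concrete source type.
        CastKind::UpcastBox => {
            Emitted::RuntimeCall(format!("zyntax_box_{}", runtime_suffix(value_ty)))
        }
        CastKind::DowncastUnbox => {
            // Modeled guard: a value that is no longer in the I64 box slot
            // was already unboxed by an earlier site, so emit nothing.
            if value_ty != HirTy::I64 {
                Emitted::Nothing
            } else {
                Emitted::RuntimeCall(format!("zyntax_box_get_{}", runtime_suffix(target_ty)))
            }
        }
        CastKind::Convert => Emitted::CastInstr,
    }
}
```

Under these assumptions, the benchmark's `2.5 as Any` maps to `emit_coercion(CastKind::UpcastBox, HirTy::F64, HirTy::I64)`, and `boxed as f64` maps to `emit_coercion(CastKind::DowncastUnbox, HirTy::I64, HirTy::F64)`. A second unbox request on the resulting `F64` value emits nothing.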
1 parent 4022a61 commit 5e9113b

5 files changed

Lines changed: 389 additions & 39 deletions

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
+//! Cast classification for the Zyntax compiler.
+//!
+//! Given a source type and a target type, classify the coercion as
+//! identity, upcast, downcast, or numeric/pointer conversion. The
+//! classification drives downstream code emission — whether to box, to
+//! unbox, to widen a pointer, to extract a variant, or to emit a plain
+//! numeric `Cast` instruction.
+//!
+//! This module is intentionally frontend-agnostic. Every frontend that
+//! lowers to Zyntax `TypedAST` should consult `classify_cast` when
+//! deciding how to materialize a coercion. The same classifier is used
+//! at every coercion site: field stores/loads on `Type::Any` slots,
+//! explicit `as` casts, let-binding initializers with type
+//! annotations, function-argument coercions, and return statements.
+//!
+//! Only `Identity`, `UpcastBox`, `DowncastUnbox`, and `Convert` are
+//! currently materializable. The remaining variants (class hierarchy
+//! widening / narrowing, union variant wrap / extract) are present as
+//! placeholders so the classifier shape stays stable when those
+//! language features land.
+//!
+//! ## Specificity lattice
+//!
+//! ```text
+//!                    Any
+//!                   /   \
+//!              Union     Trait object
+//!                |            |
+//!           Base class   ... (future)
+//!                |
+//!            Subclass
+//!                |
+//!   concrete primitives (i64, f64, ...)
+//! ```
+//!
+//! The direction of the cast on this lattice determines `Upcast` vs
+//! `Downcast`. Moving from a more specific position to a more general
+//! one (toward `Any`) is an upcast and is always safe — the runtime
+//! emits the appropriate `zyntax_box_X` wrap. Moving from a more
+//! general position to a more specific one is a downcast and may need
+//! a runtime tag check; for `Type::Any → T` the wrap-time tag stored
+//! in the `DynamicBox` header is used by `zyntax_box_get_X` to widen
+//! / narrow losslessly.
+
+use zyntax_typed_ast::type_registry::{Type, TypeRegistry};
+use zyntax_typed_ast::PrimitiveType;
+
+/// Classification of a coercion from a source type to a target type.
+///
+/// Used by `SsaBuilder::emit_coercion` to dispatch to the right code
+/// emitter. The classifier itself is pure — it does not touch any HIR
+/// state and can be called from any pass that needs to know what kind
+/// of cast a (source, target) pair represents.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum CastKind {
+    /// Source and target are the same type. No-op at the value level.
+    Identity,
+
+    /// Source is concrete, target is `Type::Any`. Emit
+    /// `zyntax_box_X` to wrap the value in a `DynamicBox` with the
+    /// runtime tag matching the source HIR type.
+    UpcastBox,
+
+    /// Source is `Type::Any`, target is concrete. Emit
+    /// `zyntax_box_get_X` to unwrap the `DynamicBox` and widen /
+    /// narrow to the requested concrete primitive.
+    DowncastUnbox,
+
+    /// Both source and target are concrete and structurally
+    /// different. Emit a plain `HirInstruction::Cast` with an
+    /// op selected by `select_cast_op` (sitofp, fptosi, sext, etc.).
+    Convert,
+
+    /// Class hierarchy widening: source is a subclass of target.
+    /// Both ends are heap pointers; emit at most a structural
+    /// pointer rebind. Not yet wired — placeholder for when ZynML
+    /// (and other frontends) gain class extension syntax.
+    UpcastWiden,
+
+    /// Class hierarchy narrowing: target is a subclass of source.
+    /// Needs a runtime tag check (vtable / typeid). Not yet wired.
+    DowncastChecked,
+
+    /// Union variant wrap: source is one of the union's variants,
+    /// target is the union itself. Emit a tagged-variant constructor.
+    /// Not yet wired.
+    UpcastVariant,
+
+    /// Union variant extract: source is the union, target is one of
+    /// its variants. Needs a runtime tag check. Not yet wired.
+    DowncastVariant,
+
+    /// Types are structurally unrelated and no conversion path exists.
+    /// Frontends should reject this at type-check time; if it reaches
+    /// SSA the emitter falls back to a best-effort `Convert`.
+    Incompatible,
+}
+
+/// Classify the coercion `source as target`.
+///
+/// The returned `CastKind` is the *intent*. The actual instruction
+/// emission lives in `SsaBuilder::emit_coercion` so that the classifier
+/// can stay frontend-agnostic and free of HIR mutation.
+///
+/// `type_registry` is consulted to resolve `Type::Named` aliases — a
+/// `Type::Named` whose alias target is `Type::Any` is treated as `Any`
+/// for classification purposes. This keeps frontends free to expose
+/// any spelling (`Any`, `Object`, `dynamic`, etc.) without needing to
+/// teach the classifier about each one.
+pub fn classify_cast(source: &Type, target: &Type, type_registry: &TypeRegistry) -> CastKind {
+    let s_is_any = is_any_type(source, type_registry);
+    let t_is_any = is_any_type(target, type_registry);
+
+    match (s_is_any, t_is_any) {
+        (true, true) => return CastKind::Identity,
+        (false, true) => return CastKind::UpcastBox,
+        (true, false) => return CastKind::DowncastUnbox,
+        (false, false) => {}
+    }
+
+    if types_equal(source, target) {
+        return CastKind::Identity;
+    }
+
+    // Future: class hierarchy traversal here. When ZynML gains a
+    // `class A extends B` form the registry will carry the parent
+    // chain; this is where `UpcastWiden` / `DowncastChecked` would be
+    // produced.
+
+    // Future: union variant matching here. When typed-AST surfaces
+    // `Type::Union(variants)` from a frontend with sum types, this
+    // is where `UpcastVariant` / `DowncastVariant` would be produced.
+
+    if is_concrete_scalar(source) && is_concrete_scalar(target) {
+        return CastKind::Convert;
+    }
+
+    // Anything else (reference rebinds, opaque pointer round-trips,
+    // tuple-to-tuple) falls through as Convert. The downstream emitter
+    // can refuse it if the underlying HIR types disagree.
+    CastKind::Convert
+}
+
+/// `true` if `ty` denotes the universal top type — spelled
+/// `Type::Any` directly, aliased to it, or registered as an atomic
+/// type whose name matches one of the canonical Any spellings
+/// recognized at the Zyntax layer.
+///
+/// The name-based fallback exists because frontends commonly parse
+/// their Any keyword as a Named-atomic type before the typed AST
+/// reaches the compiler. Recognizing the standard spellings here
+/// keeps the classifier working without forcing each frontend to
+/// special-case the keyword in its own parser.
+pub fn is_any_type(ty: &Type, registry: &TypeRegistry) -> bool {
+    match ty {
+        Type::Any => true,
+        Type::Alias { target, .. } => is_any_type(target, registry),
+        Type::Named { id, .. } => match registry.get_type_by_id(*id) {
+            Some(def) => {
+                if matches!(registry.resolve_alias(def.name), Some(&Type::Any)) {
+                    return true;
+                }
+                matches_any_spelling(def.name)
+            }
+            None => false,
+        },
+        Type::Unresolved(name) => {
+            matches!(registry.resolve_alias(*name), Some(&Type::Any)) || matches_any_spelling(*name)
+        }
+        _ => false,
+    }
+}
+
+fn matches_any_spelling(name: zyntax_typed_ast::InternedString) -> bool {
+    match name.resolve_global() {
+        Some(s) => matches!(s.as_str(), "Any"),
+        None => false,
+    }
+}
+
+fn types_equal(a: &Type, b: &Type) -> bool {
+    match (a, b) {
+        (Type::Primitive(p), Type::Primitive(q)) => p == q,
+        (Type::Named { id: a, .. }, Type::Named { id: b, .. }) => a == b,
+        _ => false,
+    }
+}
+
+fn is_concrete_scalar(ty: &Type) -> bool {
+    matches!(
+        ty,
+        Type::Primitive(
+            PrimitiveType::I8
+                | PrimitiveType::I16
+                | PrimitiveType::I32
+                | PrimitiveType::I64
+                | PrimitiveType::I128
+                | PrimitiveType::U8
+                | PrimitiveType::U16
+                | PrimitiveType::U32
+                | PrimitiveType::U64
+                | PrimitiveType::U128
+                | PrimitiveType::F32
+                | PrimitiveType::F64
+                | PrimitiveType::Bool
+        )
+    )
+}
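The classification order in `classify_cast` (Any checks first, then equality, then `Convert`) can be tried out in isolation with a reduced model. The sketch below is not the crate's API: `Ty`, `Aliases`, `is_any`, and `classify` are hypothetical stand-ins for `Type`, `TypeRegistry`, `is_any_type`, and `classify_cast`, and the alias table is a plain `HashMap` assumed only for illustration.

```rust
use std::collections::HashMap;

/// Reduced stand-in for the typed-AST `Type` (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Ty {
    Any,
    I64,
    F64,
    /// A named type, resolved through the alias table or by spelling.
    Named(&'static str),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CastKind {
    Identity,
    UpcastBox,
    DowncastUnbox,
    Convert,
}

/// Hypothetical alias table: type name -> aliased type.
type Aliases = HashMap<&'static str, Ty>;

/// Mirrors the shape of `is_any_type`: direct `Any`, an alias that
/// resolves to `Any`, or the canonical spelling "Any".
fn is_any(ty: &Ty, aliases: &Aliases) -> bool {
    match ty {
        Ty::Any => true,
        Ty::Named(name) => matches!(aliases.get(name), Some(Ty::Any)) || *name == "Any",
        _ => false,
    }
}

/// Mirrors the decision order of `classify_cast` for the four variants
/// that are currently materializable.
fn classify(source: &Ty, target: &Ty, aliases: &Aliases) -> CastKind {
    match (is_any(source, aliases), is_any(target, aliases)) {
        (true, true) => CastKind::Identity,
        (false, true) => CastKind::UpcastBox,
        (true, false) => CastKind::DowncastUnbox,
        (false, false) if source == target => CastKind::Identity,
        (false, false) => CastKind::Convert,
    }
}
```

With an alias table that maps `"Object"` to `Ty::Any`, `classify(&Ty::F64, &Ty::Named("Object"), &aliases)` yields `CastKind::UpcastBox`, which illustrates why a frontend can pick its own spelling for the top type as long as it registers an alias.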

crates/compiler/src/lib.rs

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ pub mod async_support;
 pub mod auto_vectorize;
 pub mod borrow_check; // HIR-level borrow checking pass
 pub mod bytecode; // HIR bytecode serialization/deserialization
+pub mod cast_classify; // Pure classification of source/target coercions → CastKind
 pub mod cfg;
 pub mod cfg_simplify;
 pub mod const_eval;
