Skip to content

all: reduce generated C size significantly#27197

Open
medvednikov wants to merge 8 commits into
masterfrom
codex/reduce-generated-c-size-sha3
Open

all: reduce generated C size significantly#27197
medvednikov wants to merge 8 commits into
masterfrom
codex/reduce-generated-c-size-sha3

Conversation

@medvednikov
Copy link
Copy Markdown
Member

Summary

This reduces generated C size for the SHA3 reproducer from #26961 and tightens the dependencies emitted by markused/codegen paths.

Changes include:

  • avoid result/panic overhead in fixed-size crypto.sha3 checksum helpers
  • avoid pulling in full interpolation runtime for simple string interpolation and time.Duration.str() formatting
  • root str_intp, StrIntpData, and isnil only for markused cases that actually need them
  • keep macOS/BSD detailed atos/demangle backtraces behind -d detailed_backtraces
  • add C output regression coverage for simple interpolation and auto-str interpolation runtime needs

For the issue reproducer, generated C changed from old V 0.5.1 289,292 bytes / 7,708 lines to latest 223,280 bytes / 6,568 lines.

Validation

  • ./v -g -keepc -o ./vnew cmd/v
  • ./vnew -silent vlib/v/compiler_errors_test.v
  • ./vnew -silent vlib/v/gen/c/coutput_test.v
  • ./vnew -silent vlib/v/slow_tests/inout/compiler_test.v
  • ./vnew -silent test vlib/crypto/sha3/
  • ./vnew -silent test vlib/time/
  • ./vnew -silent test vlib/builtin/
  • ./vnew fmt -w vlib/crypto/sha3/sha3.v vlib/time/duration.v vlib/builtin/backtraces_nix.c.v vlib/v/gen/c/coutput_test.v vlib/v/markused/walker.v
  • git diff --check

@medvednikov medvednikov marked this pull request as ready for review May 19, 2026 19:34
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d77d763631

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread vlib/v/markused/walker.v
Comment on lines +2848 to +2852
ast.Struct, ast.SumType, ast.Interface {
return true
}
else {
return false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Account for map auto-str when rooting str_intp

type_auto_str_needs_str_intp now returns false for map types, but generated map auto-str() code still emits builtin__str_intp (for example via str_intp_sq when formatting string/rune keys in vlib/v/gen/c/auto_str_methods.v). After this commit removed unconditional rooting of str_intp in finalize, programs that only print/dump maps (e.g. println({'a': 1})) can have str_intp pruned and then fail at C compile/link time with missing builtin__str_intp symbols.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4fee8c03fb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread vlib/v/markused/walker.v
Comment on lines +2849 to +2850
ast.Struct, ast.SumType, ast.Interface {
return true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Treat map auto-str as requiring str_intp when needed

type_auto_str_needs_str_intp currently falls through to false for map types, but map auto-str() generation still emits builtin__str_intp (e.g. via str_intp_sq, str_intp_rune, and str_intp_g32/g64 in vlib/v/gen/c/auto_str_methods.v), so map-only programs with string/rune/float map parts can have str_intp pruned and then fail at C compile/link time with an undefined builtin__str_intp. Fresh evidence: in this revision the match in walker.v still omits ast.Map while the map codegen path still contains those str_intp_* calls.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a9151f90b4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread vlib/v/markused/walker.v
Comment on lines +2697 to +2698
if i >= node.need_fmts.len || node.need_fmts[i] || i >= node.fmts.len {
return true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Mirror generic interpolation fallback when rooting str_intp

Update this runtime check to also treat generic-instantiated interpolations as requiring str_intp. In cgen, string_inter_literal rewrites fmts[i] to _ for generic concrete instantiations and gen_simple_string_inter_literal explicitly bails out when fmts[i] == '_', so codegen falls back to builtin__str_intp; this walker path does not model that condition, so it can mark the same node as “simple”, skip w.uses_interp, and prune str_intp/StrIntpData, producing undefined builtin__str_intp at C compile time for programs like generic helpers that only do "${x}" interpolation.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 03db92028b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread vlib/v/markused/walker.v Outdated
Comment on lines +2907 to +2910
resolved_elem_type := w.resolve_current_generic_type(elem_type).clear_option_and_result()
if resolved_elem_type == 0 || resolved_elem_type == ast.void_type
|| resolved_elem_type.has_flag(.generic) {
return false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve option/result flags in array auto-str dependency check

array_auto_str_needs_str_intp strips option/result flags before deciding whether to root interpolation runtime, so array elements like ?int are treated as plain int and return false here. That under-roots str_intp even though option auto-str() code still emits str_intp wrappers (see gen_str_for_option in vlib/v/gen/c/auto_str_methods.v), which can produce undefined builtin__str_intp symbols for programs that only print/dump arrays of option/result values.

Useful? React with 👍 / 👎.

Comment thread vlib/v/markused/walker.v Outdated
Comment on lines +2919 to +2920
return elem_sym.kind in [.array, .array_fixed, .struct, .sum_type, .interface]
&& !elem_sym.has_method('str')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Recurse auto-str dependency checks into map array elements

This whitelist omits .map, so []map[...]... and [N]map[...]... elements are treated as not needing interpolation runtime unless some other path roots it. Map auto-str() generation still emits str_intp for several key/value kinds (e.g. quoted string keys in gen_str_for_map), so map-containing arrays can be the only use site and end up with pruned str_intp, causing C compile/link failures on otherwise valid print/dump code.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e1080a5864

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread vlib/v/markused/walker.v Outdated
Comment on lines +1829 to +1830
if !node.is_variadic && node.args.any(it.expr is ast.ArrayDecompose) {
w.uses_interp = true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Mark variadic array-decompose calls as needing str_intp

This guard skips w.uses_interp for variadic calls, but C codegen still emits builtin__str_intp for array-decompose argument length checks in variadic call paths (see vlib/v/gen/c/fn.v around the ast.ArrayDecompose handling). Because this commit also stopped unconditionally rooting interpolation runtime, programs that only use ...arr in a variadic call (and otherwise avoid interpolation/auto-str cases that root it) can now prune str_intp and fail during C compile/link with undefined builtin__str_intp.

Useful? React with 👍 / 👎.

medvednikov and others added 7 commits May 21, 2026 21:23
…lker

cgen rewrites fmts[i] to `_` for generic concrete instantiations and
gen_simple_string_inter_literal bails out when fmts[i] == `_`, falling
back to builtin__str_intp. The walker did not model this and could mark
the same node as simple, prune str_intp/StrIntpData, and produce
undefined builtin__str_intp at C compile time for generic helpers
that only do `${x}` interpolation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
array_auto_str_needs_str_intp stripped option/result flags before
deciding whether to root str_intp, so `[]?int` was treated as plain
`[]int` and returned false even though gen_str_for_option still emits
str_intp wrappers. It also omitted .map from the recurse list, so
`[]map[string]int` could be the only use site and end up with str_intp
pruned, producing undefined builtin__str_intp at C compile time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous guard skipped variadic calls when checking for ArrayDecompose
arguments, but cgen can still emit builtin__str_intp for array-decompose
argument length checks (see vlib/v/gen/c/fn.v around the ast.ArrayDecompose
handling). Programs using only `...arr` in a variadic call could prune
str_intp / StrIntpData and fail at C compile/link with undefined
builtin__str_intp.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a `type Foo = Bar` alias has no str method of its own but its parent
type does, cgen emits a wrapper `indent_Foo_str` that calls
`builtin__str_intp` to format `'Foo(...)'` around the parent's str output
(auto_str_methods.v gen_str_for_alias). type_auto_str_needs_str_intp was
unaliasing first and seeing the parent's str method, so it returned false
and -skip-unused dropped str_intp / StrIntpData, producing `implicit
declaration of function 'builtin__str_intp'` for any program that just
calls .str() on an SDL Version (and other c2v-style aliases).

Check the pre-unaliased sym for the Alias-without-own-str case before
falling through to the unaliased-parent-has-str shortcut.
@medvednikov medvednikov force-pushed the codex/reduce-generated-c-size-sha3 branch from d236c5c to 2d127bc Compare May 21, 2026 18:24
When `cmd/tools/fast` runs under PR CI, the runner uses a shallow
checkout (fetch-depth: 1). `v up` then performs `git pull --rebase`,
which can fail with add/add conflicts because git cannot find a real
common ancestor between PR HEAD and master. The half-rebased worktree
keeps conflict markers in source files, which then breaks any later
V compilation step in the same job.

Run `git rebase --abort` before returning, so subsequent steps see a
clean worktree.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1071be4ade

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread vlib/v/markused/walker.v
Comment on lines +2860 to +2864
ast.Map {
return true
}
ast.Struct, ast.SumType, ast.Interface {
return true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Treat multi-return auto-str as needing str_intp when needed

type_auto_str_needs_str_intp falls through to false for ast.MultiReturn, but C auto-str() generation for multi-return values can still emit builtin__str_intp (for example, float/string elements in gen_str_for_multi_return use str_intp_* helpers). After this commit removed unconditional rooting of interpolation runtime in finalize, programs that only print/dump multi-return values can have str_intp/StrIntpData pruned and then fail at C compile/link time with missing builtin__str_intp symbols (especially with skip-unused pruning enabled).

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant