Skip to content

Compilation time optimization and cleanup#2527

Merged
stephenberry merged 17 commits into
mainfrom
compilation-optimization-cleanup
May 10, 2026
Merged

Compilation time optimization and cleanup#2527
stephenberry merged 17 commits into
mainfrom
compilation-optimization-cleanup

Conversation

@stephenberry
Copy link
Copy Markdown
Owner

@stephenberry stephenberry commented Apr 23, 2026

Compile-time and include-surface cleanup, landing as a sequence of independent commits.

New: <glaze/forward.hpp>

Lightweight header with forward declarations for the customization points (glz::meta, glz::to, glz::from, glz::to_partial, glz::skip_value, glz::parse, glz::serialize, glz::serialize_partial), format id constants, and glz::custom_t. Only depends on <cstdint>. Users can specialize these in widely-included headers without pulling in full Glaze. Addresses #2525.

glaze/glaze.hpp aggregate trimmed

Uncommon features removed from the top-level aggregate; users who want them include the individual headers directly (same convention CBOR already followed):

  • glaze/msgpack.hpp (docs/msgpack.md updated accordingly)
  • glaze/record/recorder.hpp (docs/recorder.md updated accordingly)

Core include diet

  • <atomic> / <bitset> / <complex> in core/common.hpp are now gated behind #if GLZ_REFLECTION26 — the specified<> specializations that need them are the only users and already live inside that block.
  • core/chrono.hpp dropped from core/common.hpp; moved to the direct users cbor/read.hpp and cbor/write.hpp.
  • glaze/file/write_directory.hpp now explicitly includes glaze/json/write.hpp (previously worked by transitive luck).
  • Dead <any> include removed from json/json_ptr.hpp.

Internal simplifications

  • unique_per_length_info in the reflection hash pipeline drops constexpr std::vector + std::ranges::sort for a copy of the input std::array plus a hand-rolled insertion sort — fewer template instantiations per reflected type.
  • glz::boxed<T> (internal to struct glz::schema) now manages a raw T* directly instead of wrapping std::unique_ptr<T>. Same public API, same semantics; skips the per-T unique_ptr + default_delete instantiations that every TU including schema.hpp otherwise pays.

Full build and all test suites pass (json, yaml, beve, bson, cbor, jsonschema).

common.hpp does not itself reference any chrono type; the include was
transitively relied on by cbor/{read,write}.hpp. Add the direct include to
those two files so the dependency is local rather than a global ordering
contract.
write_directory() references glz::write_json as a non-dependent name inside
a template body, so it must be visible at the point of the template
definition. Previously this only worked because glaze/beve/beve_to_json.hpp
(included earlier in glaze.hpp) transitively pulled in json/write.hpp.
Include it directly so the header is usable on its own and the include order
in glaze.hpp is no longer load-bearing.
@packit-as-a-service
Copy link
Copy Markdown

@packit-as-a-service
Copy link
Copy Markdown

@packit-as-a-service
Copy link
Copy Markdown

One of the tests failed for 454fff7. @admin check logs None, packit dashboard https://dashboard.packit.dev/jobs/copr/3502749 and external service dashboard https://copr.fedorainfracloud.org/coprs/build/10359418/

@stephenberry stephenberry force-pushed the compilation-optimization-cleanup branch from 05dcc09 to 22e25c0 Compare April 23, 2026 10:06
…larations

Users who want to specialize glz::meta, glz::to<Format, T>, glz::from<Format, T>,
or construct glz::custom_t in widely-included user headers can now include only
<glaze/forward.hpp>, which has no Glaze dependencies and pulls in <cstdint> only.
This lets them defer the full <glaze/glaze.hpp> include to the translation units
that actually call read/write.

New: include/glaze/forward.hpp — format id constants, primary templates for
meta/to/from/to_partial/skip_value/parse/serialize/serialize_partial, the
custom_t aggregate, and a namespace-scoped `using uint32_t = std::uint32_t;`
so the format-tag type has a home under glz::.

core/opts.hpp, core/meta.hpp, core/wrappers.hpp now include forward.hpp and
delete their duplicate declarations. Behavior is unchanged.

Addresses #2525.
Replace the std::vector<std::string_view> + std::ranges::sort inside
unique_per_length_info with a copy of the caller's std::array and a
hand-rolled insertion sort by key length. Also swap std::lower_bound /
std::upper_bound for a linear subrange scan (input is sorted, N is small,
and this avoids two more per-N instantiations).

The previous comment explained why the old version used std::vector rather
than std::array: using std::array<T, N> propagates per-N std::ranges::sort
instantiations. The new version keeps std::array but avoids that trap by
using a hand-rolled sort that is not template-instantiated per N.

-ftime-trace on a 20-field reflected struct shows glz::make_keys_info<20>
drop from 45.1 ms to 26.0 ms (-19 ms), with unique_per_length_info no
longer appearing as a top-level InstantiateFunction entry (inlined below
threshold). Wall-clock gain per TU is small (interleaved A/B at N=6 on
this machine gives borderline signal, ~1% on the 5-field TU and within
noise on the 20-field one), but the change removes a constexpr std::vector
from the reflection pipeline and simplifies the code.

Full build and json_test / yaml_test / beve_test / bson_test / cbor_test
all pass.
…TION26

core/common.hpp unconditionally included <atomic>, <bitset>, and <complex>
only to satisfy three glz::specified<> specializations — and those
specializations are already inside an `#if GLZ_REFLECTION26` block. Nothing
else in common.hpp referenced std::atomic, std::bitset, or std::complex.

Pre-C++26 users (the vast majority today) now skip parsing these three
headers per TU. C++26 users pick them up transitively, exactly as before,
because the includes moved into the same `#if` block that already contained
the specializations that need them.

Interleaved A/B on a TU that only `#include <glaze/glaze.hpp>` + empty main
shows a ~20 ms per-TU save (~1.7%). Smaller than the 84 ms estimated from
parse-time-alone measurements in the original analysis — other paths pull
in at least some of <atomic>/<bitset>/<complex> transitively, so the real
saving is the unique delta, not the raw parse cost.

No API change. Full build + json/yaml/beve/bson/cbor tests all pass.
Follow-up to 3d96d30. In that commit I moved the conditional
`#include <atomic>`, `<bitset>`, `<complex>` inside the `#if GLZ_REFLECTION26`
block that lives inside `namespace glz { ... }`. That caused libc++'s atomic
headers to be pulled into `::glz::std` rather than `::std`, breaking both
P2996 Reflection Tests CI jobs with errors like
  'no member named `addressof` in namespace `glz::std`'
  'unknown type name `memory_order`; did you mean `::std::memory_order`'

Hoist the conditional include block to file scope (above `namespace glz`),
still gated on GLZ_REFLECTION26. feature_test.hpp is already pulled in
above, so the macro is defined by the time the `#if` is evaluated.
boxed<T> is the heap-allocated nullable wrapper used by struct glz::schema for
its sparsely populated fields. It previously stored std::unique_ptr<T>, which
forced every TU including schema.hpp to instantiate the unique_ptr<T> and
default_delete<T> machinery for each unique boxed T — even TUs that never use
glz::schema at all.

Replace that with direct management of a raw T*: default ctor, deep-copy ctor
and assign, noexcept move ctor and assign, destructor, converting ctor/assign,
operator*, operator->, reset, emplace, has_value, operator bool, value_or.
Same public API and semantics. Strong exception safety on all mutating ops
(allocate-then-swap).

Does not support polymorphic upcast (boxed<Derived> → boxed<Base>); schema
types are concrete. Other unused unique_ptr capabilities — custom deleters,
nullptr construction, release/get, comparison operators — are also omitted
and unused by any caller (verified by grep).

Full build passes, json/yaml/beve/bson/cbor + json_test/jsonschema_test
(70 tests, 321 asserts) all green.
MSGPACK is already opt-in for most users — aligning it with the CBOR
convention (not in glaze.hpp by default; users who want it include
glaze/msgpack.hpp explicitly). Users who relied on the transitive pull
get a clear missing-declaration error pointing at the missing include.

tests/json_test/allocate_raw_pointers_test.cpp exercised both CBOR and
MSGPACK and only had the explicit CBOR include; add the MSGPACK one next
to it. All other msgpack usage sites already include the header directly.

Full build passes.
recorder is an uncommon utility — a compile-time multi-typed time-series
recorder with to<JSON>/from<JSON>/to<CSV> specializations. Same treatment
as the recent msgpack drop and the existing cbor convention: opt-in by
including glaze/record/recorder.hpp directly.

Both call sites in the tree (tests/json_test/json_test.cpp and
tests/csv_test/csv_test.cpp) already include the header explicitly, so
no test changes needed.

Also severs the transitive glaze/csv.hpp chain that recorder.hpp pulled
into any TU including glaze.hpp (recorder.hpp's own TODOs flagged this).

Interleaved A/B on t1_include: -18 ± 17 ms per TU (direction consistent
across 5 of 6 pairs). Full build passes.
Now that glaze/record/recorder.hpp is no longer pulled in by glaze/glaze.hpp,
the documentation should show the explicit include and note the opt-in
requirement — matching how docs/msgpack.md and docs/cbor.md handle the
same convention.
@packit-as-a-service
Copy link
Copy Markdown

@stephenberry
Copy link
Copy Markdown
Owner Author

@cout, would you mind looking over this PR concerning Glaze compilation time optimizations? Particularly <glaze/forward.hpp> for forward declarations.

@cout
Copy link
Copy Markdown

cout commented May 5, 2026

I have just returned from vacation and will look over this PR this week.

@stephenberry
Copy link
Copy Markdown
Owner Author

@cout, I decided this PR is in a good place to merge into main, so I'm merging it in. Just let me know if there are additional changes to make to glaze/forward.hpp to assist your needs or help compilation speeds. I'm going to do more passes to improve compile times, but this is a first step.

@stephenberry stephenberry merged commit b919ec6 into main May 10, 2026
51 checks passed
@stephenberry stephenberry deleted the compilation-optimization-cleanup branch May 10, 2026 23:27
@cout
Copy link
Copy Markdown

cout commented May 11, 2026

Sounds good! I did try looking at the code last week, but github was taking too long (i.e. over an hour) to load the diff. It's fast now. Hopefully I will have some feedback for you today.

@makekryl
Copy link
Copy Markdown
Contributor

makekryl commented May 11, 2026

One more suggestion: sometimes it's needed to use glz::generic in internal structures, but it pulls the entire json codec, polluting the TUs that don't actually perform encoding/decoding - is it also possible to split up glaze/json/generic.hpp into two headers, with one remaining "pure" without glaze/json/* includes?

UPD: I prepared a PR: #2560

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Forward declaration headers to improve compilation speed for projects that use glaze

3 participants