Compilation time optimization and cleanup#2527
Conversation
common.hpp does not itself reference any chrono type; the include was
transitively relied on by cbor/{read,write}.hpp. Add the direct include to
those two files so the dependency is local rather than a global ordering
contract.
write_directory() references glz::write_json as a non-dependent name inside a template body, so it must be visible at the point of the template definition. Previously this only worked because glaze/beve/beve_to_json.hpp (included earlier in glaze.hpp) transitively pulled in json/write.hpp. Include it directly so the header is usable on its own and the include order in glaze.hpp is no longer load-bearing.
|
One of the tests failed for b3c87c5. @admin check logs https://download.copr.fedorainfracloud.org/results/packit/stephenberry-glaze-2527/fedora-rawhide-s390x/10359336-glaze/builder-live.log, packit dashboard https://dashboard.packit.dev/jobs/copr/3502705 and external service dashboard https://copr.fedorainfracloud.org/coprs/build/10359336/ |
|
One of the tests failed for 454fff7. @admin check logs https://download.copr.fedorainfracloud.org/results/packit/stephenberry-glaze-2527/fedora-rawhide-s390x/10359418-glaze/builder-live.log, packit dashboard https://dashboard.packit.dev/jobs/copr/3502748 and external service dashboard https://copr.fedorainfracloud.org/coprs/build/10359418/ |
|
One of the tests failed for 454fff7. @admin check logs None, packit dashboard https://dashboard.packit.dev/jobs/copr/3502749 and external service dashboard https://copr.fedorainfracloud.org/coprs/build/10359418/ |
05dcc09 to
22e25c0
Compare
…larations Users who want to specialize glz::meta, glz::to<Format, T>, glz::from<Format, T>, or construct glz::custom_t in widely-included user headers can now include only <glaze/forward.hpp>, which has no Glaze dependencies and pulls in <cstdint> only. This lets them defer the full <glaze/glaze.hpp> include to the translation units that actually call read/write. New: include/glaze/forward.hpp — format id constants, primary templates for meta/to/from/to_partial/skip_value/parse/serialize/serialize_partial, the custom_t aggregate, and a namespace-scoped `using uint32_t = std::uint32_t;` so the format-tag type has a home under glz::. core/opts.hpp, core/meta.hpp, core/wrappers.hpp now include forward.hpp and delete their duplicate declarations. Behavior is unchanged. Addresses #2525.
Replace the std::vector<std::string_view> + std::ranges::sort inside unique_per_length_info with a copy of the caller's std::array and a hand-rolled insertion sort by key length. Also swap std::lower_bound / std::upper_bound for a linear subrange scan (input is sorted, N is small, and this avoids two more per-N instantiations). The previous comment explained why the old version used std::vector rather than std::array: using std::array<T, N> propagates per-N std::ranges::sort instantiations. The new version keeps std::array but avoids that trap by using a hand-rolled sort that is not template-instantiated per N. -ftime-trace on a 20-field reflected struct shows glz::make_keys_info<20> drop from 45.1 ms to 26.0 ms (-19 ms), with unique_per_length_info no longer appearing as a top-level InstantiateFunction entry (inlined below threshold). Wall-clock gain per TU is small (interleaved A/B at N=6 on this machine gives borderline signal, ~1% on the 5-field TU and within noise on the 20-field one), but the change removes a constexpr std::vector from the reflection pipeline and simplifies the code. Full build and json_test / yaml_test / beve_test / bson_test / cbor_test all pass.
…TION26 core/common.hpp unconditionally included <atomic>, <bitset>, and <complex> only to satisfy three glz::specified<> specializations — and those specializations are already inside an `#if GLZ_REFLECTION26` block. Nothing else in common.hpp referenced std::atomic, std::bitset, or std::complex. Pre-C++26 users (the vast majority today) now skip parsing these three headers per TU. C++26 users pick them up transitively, exactly as before, because the includes moved into the same `#if` block that already contained the specializations that need them. Interleaved A/B on a TU that only `#include <glaze/glaze.hpp>` + empty main shows a ~20 ms per-TU save (~1.7%). Smaller than the 84 ms estimated from parse-time-alone measurements in the original analysis — other paths pull in at least some of <atomic>/<bitset>/<complex> transitively, so the real saving is the unique delta, not the raw parse cost. No API change. Full build + json/yaml/beve/bson/cbor tests all pass.
Follow-up to 3d96d30. In that commit I moved the conditional `#include <atomic>`, `<bitset>`, `<complex>` inside the `#if GLZ_REFLECTION26` block that lives inside `namespace glz { ... }`. That caused libc++'s atomic headers to be pulled into `::glz::std` rather than `::std`, breaking both P2996 Reflection Tests CI jobs with errors like 'no member named `addressof` in namespace `glz::std`' 'unknown type name `memory_order`; did you mean `::std::memory_order`' Hoist the conditional include block to file scope (above `namespace glz`), still gated on GLZ_REFLECTION26. feature_test.hpp is already pulled in above, so the macro is defined by the time the `#if` is evaluated.
boxed<T> is the heap-allocated nullable wrapper used by struct glz::schema for its sparsely populated fields. It previously stored std::unique_ptr<T>, which forced every TU including schema.hpp to instantiate the unique_ptr<T> and default_delete<T> machinery for each unique boxed T — even TUs that never use glz::schema at all. Replace that with direct management of a raw T*: default ctor, deep-copy ctor and assign, noexcept move ctor and assign, destructor, converting ctor/assign, operator*, operator->, reset, emplace, has_value, operator bool, value_or. Same public API and semantics. Strong exception safety on all mutating ops (allocate-then-swap). Does not support polymorphic upcast (boxed<Derived> → boxed<Base>); schema types are concrete. Other unused unique_ptr capabilities — custom deleters, nullptr construction, release/get, comparison operators — are also omitted and unused by any caller (verified by grep). Full build passes, json/yaml/beve/bson/cbor + json_test/jsonschema_test (70 tests, 321 asserts) all green.
MSGPACK is already opt-in for most users — aligning it with the CBOR convention (not in glaze.hpp by default; users who want it include glaze/msgpack.hpp explicitly). Users who relied on the transitive pull get a clear missing-declaration error pointing at the missing include. tests/json_test/allocate_raw_pointers_test.cpp exercised both CBOR and MSGPACK and only had the explicit CBOR include; add the MSGPACK one next to it. All other msgpack usage sites already include the header directly. Full build passes.
recorder is an uncommon utility — a compile-time multi-typed time-series recorder with to<JSON>/from<JSON>/to<CSV> specializations. Same treatment as the recent msgpack drop and the existing cbor convention: opt-in by including glaze/record/recorder.hpp directly. Both call sites in the tree (tests/json_test/json_test.cpp and tests/csv_test/csv_test.cpp) already include the header explicitly, so no test changes needed. Also severs the transitive glaze/csv.hpp chain that recorder.hpp pulled into any TU including glaze.hpp (recorder.hpp's own TODOs flagged this). Interleaved A/B on t1_include: -18 ± 17 ms per TU (direction consistent across 5 of 6 pairs). Full build passes.
Now that glaze/record/recorder.hpp is no longer pulled in by glaze/glaze.hpp, the documentation should show the explicit include and note the opt-in requirement — matching how docs/msgpack.md and docs/cbor.md handle the same convention.
|
One of the tests failed for c870748. @admin check logs https://download.copr.fedorainfracloud.org/results/packit/stephenberry-glaze-2527/fedora-rawhide-s390x/10360790-glaze/builder-live.log, packit dashboard https://dashboard.packit.dev/jobs/copr/3505804 and external service dashboard https://copr.fedorainfracloud.org/coprs/build/10360790/ |
|
@cout, would you mind looking over this PR concerning Glaze compilation time optimizations? Particularly |
|
I have just returned from vacation and will look over this PR this week. |
|
@cout, I decided this PR is in a good place to merge into main, so I'm merging it in. Just let me know if there are additional changes to make to |
|
Sounds good! I did try looking at the code last week, but github was taking too long (i.e. over an hour) to load the diff. It's fast now. Hopefully I will have some feedback for you today. |
|
One more suggestion: sometimes it's needed to use UPD: I prepared a PR: #2560 |
Compile-time and include-surface cleanup, landing as a sequence of independent commits.
New:
<glaze/forward.hpp>Lightweight header with forward declarations for the customization points (
glz::meta,glz::to,glz::from,glz::to_partial,glz::skip_value,glz::parse,glz::serialize,glz::serialize_partial), format id constants, andglz::custom_t. Only depends on<cstdint>. Users can specialize these in widely-included headers without pulling in full Glaze. Addresses #2525.glaze/glaze.hppaggregate trimmedUncommon features removed from the top-level aggregate; users who want them include the individual headers directly (same convention CBOR already followed):
glaze/msgpack.hpp(docs/msgpack.mdupdated accordingly)glaze/record/recorder.hpp(docs/recorder.mdupdated accordingly)Core include diet
<atomic>/<bitset>/<complex>incore/common.hppare now gated behind#if GLZ_REFLECTION26— thespecified<>specializations that need them are the only users and already live inside that block.core/chrono.hppdropped fromcore/common.hpp; moved to the direct userscbor/read.hppandcbor/write.hpp.glaze/file/write_directory.hppnow explicitly includesglaze/json/write.hpp(previously worked by transitive luck).<any>include removed fromjson/json_ptr.hpp.Internal simplifications
unique_per_length_infoin the reflection hash pipeline dropsconstexpr std::vector+std::ranges::sortfor a copy of the inputstd::arrayplus a hand-rolled insertion sort — fewer template instantiations per reflected type.glz::boxed<T>(internal tostruct glz::schema) now manages a rawT*directly instead of wrappingstd::unique_ptr<T>. Same public API, same semantics; skips the per-Tunique_ptr+default_deleteinstantiations that every TU includingschema.hppotherwise pays.Full build and all test suites pass (json, yaml, beve, bson, cbor, jsonschema).