fix: CPU refit for BVH AABBs to work around Metal relaxed atomics#15
Draft
jkrumbiegel wants to merge 1 commit intoJuliaGeometry:sd/multitype-vecfrom
Draft
fix: CPU refit for BVH AABBs to work around Metal relaxed atomics#15jkrumbiegel wants to merge 1 commit intoJuliaGeometry:sd/multitype-vecfrom
jkrumbiegel wants to merge 1 commit intoJuliaGeometry:sd/multitype-vecfrom
Conversation
The GPU refit_aabbs_kernel uses atomic counters for bottom-up AABB computation (Karras 2012). On Metal, Atomix.jl hardcodes memory_order_relaxed for all atomic operations, ignoring the requested ordering. This means regular memory writes (node AABB updates) are not guaranteed to be visible across threadgroups, causing some internal nodes to end up with stale or zeroed AABBs non-deterministically. The fix copies nodes to CPU after the GPU refit kernel, recursively recomputes all internal AABBs, and copies back. Applied to build_blas, _build_tlas_topology, and refit_tlas!. Also adds a determinant threshold (1e-7) to fast_intersect_triangle to avoid division by zero for rays exactly parallel to a triangle. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
refit_aabbs_kerneluses atomic counters for bottom-up AABB computation (Karras 2012). On Metal, Atomix.jl hardcodesmemory_order_relaxedfor all atomics, so regular memory writes (node AABB updates) aren't visible across threadgroups. This causes some internal BVH nodes to have stale/zeroed AABBs non-deterministically, leading to missing geometry in renders.cpu_refit_blas_aabbs!,cpu_refit_tlas_aabbs!) that recursively recompute all internal AABBs after the GPU kernel. Applied inbuild_blas,_build_tlas_topology, andrefit_tlas!.1e-7) infast_intersect_triangleto avoid division by zero for parallel rays.How these changes were tested
KA.CPU).Created with the help of Claude Code