compiler-research · a-elrawy · Jun 17, 2026
diff --git a/_data/contributors.yml b/_data/contributors.yml
@@ -680,6 +680,37 @@
       proposal: /assets/docs/de_la_torre_gonzalez_salvador_proposal_gsoc_2026.pdf
       mentors: Vassil Vassilev, Lukas Breitwieser, Luciana Melina Luque
 
+- name: Abdelrhman Elrawy
+  photo: Abdelrhman.jpg
+  info: "Google Summer of Code 2026 Contributor"
+  email: abdelrhman.elrawy1@gmail.com
+  education: "Master of Applied Computing, Wilfrid Laurier University, Canada"
+  github: "https://github.com/a-elrawy"
+  linkedin: "https://www.linkedin.com/in/elrawy/"
+  active: 1
+  projects:
+    - title: "Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad"
+      status: Ongoing
+      description: |
+        This project proposes to enhance Clad, a Clang-based automatic differentiation (AD) tool, by
+        leveraging its liveness analysis to automatically generate lock-free backward passes for
+        highly parallel differentiable rendering pipelines. Specifically, the project targets the atomic
+        bottleneck inherent in 3D Gaussian Splatting (3DGS) rasterization.
+      proposal: /assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2026.pdf
+      mentors: Vassil Vassilev, Alexander Penev
+    - title: "Support usage of Thrust API in Clad"
+      status: Completed
+      description: |
+        This project enhances Clad, a Clang-based automatic differentiation tool,
+        by enabling it to support NVIDIA's Thrust library for GPU-parallel programming.
+        The goal is to implement custom derivative rules for Thrust primitives like
+        `thrust::transform` and `thrust::reduce`, making it possible to differentiate
+        high-performance CUDA code automatically. This work bridges the gap between
+        automatic differentiation and GPU acceleration, enabling efficient gradient
+        computations in scientific computing and machine learning workloads.
+      proposal: /assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2025.pdf
+      mentors: Vassil Vassilev, Alexander Penev
+
 - name: "This could be you!"
   photo: rock.jpg
   info: See <a href="/careers">openings</a> for more info
@@ -865,27 +896,6 @@
       proposal: /assets/docs/Jiayang_Li_Proposal_2025.pdf
       mentors: Vassil Vassilev, Martin Vassilev
 
-- name: Abdelrhman Elrawy
-  photo: Abdelrhman.jpg
-  info: "Google Summer of Code 2025 Contributor"
-  email: abdelrhman.elrawy1@gmail.com
-  education: "Master of Applied Computing, Wilfrid Laurier University, Canada"
-  github: "https://github.com/a-elrawy"
-  linkedin: "https://www.linkedin.com/in/elrawy/"
-  projects:
-    - title: "Support usage of Thrust API in Clad"
-      status: Completed
-      description: |
-        This project enhances Clad, a Clang-based automatic differentiation tool,
-        by enabling it to support NVIDIA's Thrust library for GPU-parallel programming.
-        The goal is to implement custom derivative rules for Thrust primitives like
-        `thrust::transform` and `thrust::reduce`, making it possible to differentiate
-        high-performance CUDA code automatically. This work bridges the gap between
-        automatic differentiation and GPU acceleration, enabling efficient gradient
-        computations in scientific computing and machine learning workloads.
-      proposal: /assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2025.pdf
-      mentors: Vassil Vassilev, Alexander Penev
-
 - name: Galin Bistrev
   photo: Bistrev.jpg
   info: "CERN Summer student 2025"

diff --git a/_data/standing_meetings.yml b/_data/standing_meetings.yml
@@ -7,6 +7,10 @@
       date: 2026-06-10 16:00:00 +0200
       speaker: "Muhammad Bassiouni"
       link: "[Slides](/assets/presentations/Muhammad_Bassiouni_CppAlliance_Fellowship2026_initial_presentation.pdf)"
+    - title: "Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad Initial Presentation"
+      date: 2026-06-10 16:00:00 +0200
+      speaker: "Abdelrhman Elrawy"
+      link: "[Slides](/assets/presentations/Abdelrhman_initial_presentation_enabling_differentiable_rendering_via_AD_in_clad.pdf)"
     - title: "Creating teaching materials with xeus-cpp final report"
       date: 2026-06-10 17:00:00 +0200
       speaker: "Hristiyan Shterev"

diff --git a/_posts/2026-06-17-enabling-differentiable-rendering-via-AD-in-clad.md b/_posts/2026-06-17-enabling-differentiable-rendering-via-AD-in-clad.md
@@ -0,0 +1,39 @@
+---
+title: "Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad"
+layout: post
+excerpt: "This summer, I am working on enhancing Clad to automatically generate lock-free backward passes for highly parallel differentiable rendering pipelines. This project targets the atomic bottleneck inherent in 3D Gaussian Splatting (3DGS) rasterization."
+sitemap: false
+author: Abdelrhman Elrawy
+permalink: blogs/gsoc26_abdelrhman_elrawy_introduction_blog
+banner_image: /images/blog/gsoc-banner.png
+date: 2026-06-17
+tags: gsoc llvm clang automatic-differentiation gpu cuda thrust rendering 3dgs
+---
+
+## About Me
+
+Hi! I’m Abdelrhman Elrawy, a graduate student in Applied Computing specializing in Machine Learning and Parallel Programming. After successfully adding Thrust API support to Clad during GSoC 2025, I’m back this summer to work on **Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad**.
+
+## Project Description
+
+[Clad](https://github.com/vgvassilev/clad) is a Clang-based automatic differentiation (AD) tool. This project proposes to enhance Clad by leveraging its liveness analysis to automatically generate lock-free backward passes for highly parallel differentiable rendering pipelines. Specifically, the project targets the atomic bottleneck inherent in 3D Gaussian Splatting (3DGS) rasterization.
+
+Modern GPU-based differentiable renderers rely heavily on atomic operations (e.g., `atomicAdd`) in the backward pass, because many threads try to update the same pixel and gradient buffers at once. This contention overwhelms the L2 cache's atomic units and stalls execution, so gradient computation ends up consuming a significant portion of training time.
+
+## Technical Approach
+
+My approach is twofold:
+
+1. **Redesign the Rendering Pipeline**: I will avoid write conflicts by construction, using deterministic sorting, tile-based memory ownership, and local accumulation before a single global write. I plan to build this architecture on top of my previous GSoC work, which integrated the Thrust API into Clad. Using Thrust's optimized parallel primitives (such as `thrust::sort_by_key` and `thrust::reduce`) for the sorting and tile-based operations presents the compiler with a clean, high-level C++ structure.
+
+2. **Generate the Backward Pass**: I will use Clad to generate the backward pass automatically. Using its liveness analysis, Clad will prove exclusive memory ownership and eliminate the need for `atomicAdd` calls. To isolate the compiler complexity from the graphics complexity, the project will start with a differentiable geometry path tracer as a stepping stone.
+
+## Expected Outcomes
+
+Beyond differentiable rendering, this work establishes a foundation for compiler-driven automatic differentiation of parallel C++ programs, enabling efficient gradient computation in a wide range of high-performance computing applications.
+
+## Related Links
+
+- [Clad GitHub](https://github.com/vgvassilev/clad)
+- [Project Proposal](/assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2026.pdf)
+- [My GitHub](https://github.com/a-elrawy)
diff --git a/assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2026.pdf b/assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2026.pdf
diff --git a/...ions/Abdelrhman_initial_presentation_enabling_differentiable_rendering_via_AD_in_clad.pdf b/...ions/Abdelrhman_initial_presentation_enabling_differentiable_rendering_via_AD_in_clad.pdf
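
For readers of the new post, the conflict-free accumulation named in its Technical Approach section (deterministic sorting, local accumulation, then a single global write) can be sketched on the CPU in standard C++. This is only an illustration under stated assumptions, not the project's implementation: `Contribution` and `accumulate_sorted` are hypothetical names, `std::stable_sort` and a sequential loop stand in for the Thrust primitives the post mentions, and pixel indices are assumed to lie in `[0, num_pixels)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: one partial gradient aimed at one pixel.
struct Contribution {
  int pixel;    // destination index in the gradient buffer
  float value;  // partial gradient for that pixel
};

// Sort contributions by destination, sum each run of equal keys into a
// local accumulator, then write each pixel exactly once. Every output slot
// has a single writer, so no atomic update is needed.
std::vector<float> accumulate_sorted(std::vector<Contribution> contribs,
                                     std::size_t num_pixels) {
  std::vector<float> grad(num_pixels, 0.0f);

  // stable_sort keeps the input order within a pixel, which makes the
  // floating-point summation order deterministic.
  std::stable_sort(contribs.begin(), contribs.end(),
                   [](const Contribution& a, const Contribution& b) {
                     return a.pixel < b.pixel;
                   });

  std::size_t i = 0;
  while (i < contribs.size()) {
    const int pixel = contribs[i].pixel;
    // Assumption of this sketch: indices are valid for the buffer.
    assert(pixel >= 0 && static_cast<std::size_t>(pixel) < num_pixels);

    float local = 0.0f;  // local accumulator for this run of equal keys
    for (; i < contribs.size() && contribs[i].pixel == pixel; ++i) {
      local += contribs[i].value;
    }
    grad[static_cast<std::size_t>(pixel)] = local;  // the single global write
  }
  return grad;
}
```

Because each pixel has exactly one writer after the sort, the scatter needs no `atomicAdd`. For example, the contributions `{2, 1.0f}, {0, 0.5f}, {2, 2.0f}` into a 3-pixel buffer yield `[0.5, 0.0, 3.0]`.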