52 changes: 31 additions & 21 deletions _data/contributors.yml
@@ -680,6 +680,37 @@
      proposal: /assets/docs/de_la_torre_gonzalez_salvador_proposal_gsoc_2026.pdf
      mentors: Vassil Vassilev, Lukas Breitwieser, Luciana Melina Luque

- name: Abdelrhman Elrawy
  photo: Abdelrhman.jpg
  info: "Google Summer of Code 2026 Contributor"
  email: abdelrhman.elrawy1@gmail.com
  education: "Master of Applied Computing, Wilfrid Laurier University, Canada"
  github: "https://github.com/a-elrawy"
  linkedin: "https://www.linkedin.com/in/elrawy/"
  active: 1
  projects:
    - title: "Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad"
      status: Ongoing
      description: |
        This project proposes to enhance Clad, a Clang-based automatic differentiation (AD) tool, by
        leveraging its liveness analysis to automatically generate lock-free backward passes for
        highly parallel differentiable rendering pipelines. Specifically, the project targets the atomic
        bottleneck inherent in 3D Gaussian Splatting (3DGS) rasterization.
      proposal: /assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2026.pdf
      mentors: Vassil Vassilev, Alexander Penev
    - title: "Support usage of Thrust API in Clad"
      status: Completed
      description: |
        This project enhances Clad, a Clang-based automatic differentiation tool,
        by enabling it to support NVIDIA's Thrust library for GPU-parallel programming.
        The goal is to implement custom derivative rules for Thrust primitives like
        `thrust::transform` and `thrust::reduce`, making it possible to differentiate
        high-performance CUDA code automatically. This work bridges the gap between
        automatic differentiation and GPU acceleration, enabling efficient gradient
        computations in scientific computing and machine learning workloads.
      proposal: /assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2025.pdf
      mentors: Vassil Vassilev, Alexander Penev

- name: "This could be you!"
  photo: rock.jpg
  info: See <a href="/careers">openings</a> for more info
@@ -865,27 +896,6 @@
      proposal: /assets/docs/Jiayang_Li_Proposal_2025.pdf
      mentors: Vassil Vassilev, Martin Vassilev

- name: Abdelrhman Elrawy
  photo: Abdelrhman.jpg
  info: "Google Summer of Code 2025 Contributor"
  email: abdelrhman.elrawy1@gmail.com
  education: "Master of Applied Computing, Wilfrid Laurier University, Canada"
  github: "https://github.com/a-elrawy"
  linkedin: "https://www.linkedin.com/in/elrawy/"
  projects:
    - title: "Support usage of Thrust API in Clad"
      status: Completed
      description: |
        This project enhances Clad, a Clang-based automatic differentiation tool,
        by enabling it to support NVIDIA's Thrust library for GPU-parallel programming.
        The goal is to implement custom derivative rules for Thrust primitives like
        `thrust::transform` and `thrust::reduce`, making it possible to differentiate
        high-performance CUDA code automatically. This work bridges the gap between
        automatic differentiation and GPU acceleration, enabling efficient gradient
        computations in scientific computing and machine learning workloads.
      proposal: /assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2025.pdf
      mentors: Vassil Vassilev, Alexander Penev

- name: Galin Bistrev
  photo: Bistrev.jpg
  info: "CERN Summer student 2025"
4 changes: 4 additions & 0 deletions _data/standing_meetings.yml
@@ -7,6 +7,10 @@
date: 2026-06-10 16:00:00 +0200
speaker: "Muhammad Bassiouni"
link: "[Slides](/assets/presentations/Muhammad_Bassiouni_CppAlliance_Fellowship2026_initial_presentation.pdf)"
- title: "Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad Initial Presentation"
date: 2026-06-10 16:00:00 +0200
speaker: "Abdelrhman Elrawy"
link: "[Slides](/assets/presentations/Abdelrhman_initial_presentation_enabling_differentiable_rendering_via_AD_in_clad.pdf)"
- title: "Creating teaching materials with xeus-cpp final report"
date: 2026-06-10 17:00:00 +0200
speaker: "Hristiyan Shterev"
@@ -0,0 +1,39 @@
---
title: "Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad"
layout: post
excerpt: "This summer, I am working on enhancing Clad to automatically generate lock-free backward passes for highly parallel differentiable rendering pipelines. This project targets the atomic bottleneck inherent in 3D Gaussian Splatting (3DGS) rasterization."
sitemap: false
author: Abdelrhman Elrawy
permalink: blogs/gsoc26_abdelrhman_elrawy_introduction_blog
banner_image: /images/blog/gsoc-banner.png
date: 2026-06-17
tags: gsoc llvm clang automatic-differentiation gpu cuda thrust rendering 3dgs
---

## About Me

Hi! I’m Abdelrhman Elrawy, a graduate student in Applied Computing specializing in Machine Learning and Parallel Programming. After successfully adding Thrust API support to Clad during GSoC 2025, I’m back this summer to work on **Enabling Differentiable Rendering via AD of Parallel C++ STL Primitives in Clad**.

## Project Description

[Clad](https://github.com/vgvassilev/clad) is a Clang-based automatic differentiation (AD) tool. This project proposes to enhance Clad by leveraging its liveness analysis to automatically generate lock-free backward passes for highly parallel differentiable rendering pipelines. Specifically, the project targets the atomic bottleneck inherent in 3D Gaussian Splatting (3DGS) rasterization.
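
For readers new to reverse-mode AD, here is a small hand-written sketch of the kind of "backward pass" an AD tool derives from a forward function. The function `f` and its gradient `f_grad` are made up for illustration and are not output from Clad. The code only assumes the convention that derivative outputs are accumulated into, so the caller zero-initializes them.

```cpp
#include <cassert>

// Forward function (hypothetical example): f(x, y) = x * y + x.
double f(double x, double y) { return x * y + x; }

// Hand-written reverse-mode gradient of f, a stand-in for generated code.
// Derivatives are accumulated into *dx and *dy, so callers must zero them first.
void f_grad(double x, double y, double* dx, double* dy) {
    assert(dx != nullptr && dy != nullptr);
    const double d_out = 1.0;   // seed: d f / d f = 1
    *dx += d_out * y + d_out;   // d/dx (x * y + x) = y + 1
    *dy += d_out * x;           // d/dy (x * y + x) = x
}
```

With Clad itself, the same gradient would be requested with `clad::gradient(f)` and evaluated through the returned object's `execute(x, y, &dx, &dy)`, which needs the Clad plugin and is therefore left out of this sketch.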

Modern GPU-based differentiable renderers suffer from heavy use of atomic operations (e.g., `atomicAdd`) during the backward pass because many threads attempt to update the same pixel and gradient buffers simultaneously. This contention overwhelms L2 cache atomic units, stalling execution and causing the gradient computation to consume a significant portion of training time.
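
To make the contention pattern concrete, here is a minimal CPU sketch of a scatter-style backward pass. It is an illustration under my own assumptions, not code from any renderer: `Contribution`, `atomic_add`, and `scatter_add_atomic` are hypothetical names, and `atomic_add` is a compare-exchange stand-in for CUDA's `atomicAdd`. The workers run sequentially here so the example has no threading dependency.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for this sketch: add `value` to grad[dest].
struct Contribution {
    std::size_t dest;
    float value;
};

// CPU stand-in for CUDA's atomicAdd on float, built from a compare-exchange loop.
inline void atomic_add(std::atomic<float>& target, float value) {
    float seen = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `seen`, so the loop simply retries.
    while (!target.compare_exchange_weak(seen, seen + value,
                                         std::memory_order_relaxed)) {
    }
}

// Scatter-style backward pass. Worker w handles contributions w, w + W, w + 2W, ...
// The workers run one after another here to keep the sketch dependency-free;
// on a GPU they would run concurrently, which is why every update is atomic.
std::vector<float> scatter_add_atomic(const std::vector<Contribution>& contribs,
                                      std::size_t num_slots,
                                      std::size_t num_workers) {
    assert(num_workers > 0);
    std::vector<std::atomic<float>> grad(num_slots);
    for (auto& g : grad) g.store(0.0f);

    for (std::size_t w = 0; w < num_workers; ++w) {
        for (std::size_t i = w; i < contribs.size(); i += num_workers) {
            assert(contribs[i].dest < num_slots);
            // Any worker may target any slot, so the update cannot be a plain +=.
            atomic_add(grad[contribs[i].dest], contribs[i].value);
        }
    }

    std::vector<float> result(num_slots);
    for (std::size_t s = 0; s < num_slots; ++s) result[s] = grad[s].load();
    return result;
}
```

Every contribution costs one atomic read-modify-write, and contributions that share a `dest` serialize on the same memory location. That serialization is the effect described above.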

## Technical Approach

My approach is twofold:

1. **Redesign the Rendering Pipeline**: I will avoid write conflicts by construction using deterministic sorting, tile-based memory ownership, and local accumulation before a single global write. I plan to build this architecture on top of my previous GSoC work, which integrated the Thrust API into Clad. Using Thrust's optimized parallel primitives (such as `thrust::sort_by_key` and `thrust::reduce`) for the sorting and tile-based operations gives the compiler a clean, high-level C++ structure to analyze.

2. **Generate the Backward Pass**: I will use Clad to generate the backward pass automatically. By relying on Clad's liveness analysis, the goal is for the compiler to prove exclusive memory ownership and eliminate the `atomicAdd` calls without manual intervention. To separate the compiler complexity from the graphics complexity, the project will start from a differentiable geometry path tracer as a stepping stone.
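
The "sort, accumulate locally, write once" idea from the first step can be sketched with standard C++ algorithms. This is an illustration under my own assumptions rather than the project's implementation: `Contribution` and `scatter_add_sorted` are hypothetical names, and `std::stable_sort` followed by a run-wise sum stands in for what `thrust::sort_by_key` plus a keyed reduction such as `thrust::reduce_by_key` would do on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for this sketch: add `value` to grad[dest].
struct Contribution {
    std::size_t dest;
    float value;
};

// Conflict-free accumulation: sort by destination, sum each run of equal
// destinations in a local variable, then write each slot exactly once.
std::vector<float> scatter_add_sorted(std::vector<Contribution> contribs,
                                      std::size_t num_slots) {
    // A stable sort keeps the original order within each destination,
    // so the floating-point summation order is deterministic.
    std::stable_sort(contribs.begin(), contribs.end(),
                     [](const Contribution& a, const Contribution& b) {
                         return a.dest < b.dest;
                     });

    std::vector<float> grad(num_slots, 0.0f);
    std::size_t i = 0;
    while (i < contribs.size()) {
        const std::size_t dest = contribs[i].dest;
        assert(dest < num_slots);
        float local = 0.0f;  // local accumulator for this run
        while (i < contribs.size() && contribs[i].dest == dest) {
            local += contribs[i].value;
            ++i;
        }
        grad[dest] = local;  // single write; this run is the slot's only writer
    }
    return grad;
}
```

Because each destination is owned by exactly one run after sorting, no two writers ever touch the same slot, so a plain store is enough. This is the ownership property the second step aims to have the compiler establish.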

## Expected Outcomes

Beyond differentiable rendering, this work establishes a foundation for compiler-driven automatic differentiation of parallel C++ programs, enabling efficient gradient computation in a wide range of high-performance computing applications.

## Related Links

- [Clad GitHub](https://github.com/vgvassilev/clad)
- [Project Proposal](/assets/docs/Abdelrhman_Elrawy_Proposal_GSoC_2026.pdf)
- [My GitHub](https://github.com/a-elrawy)
Binary file not shown.
Binary file not shown.