Commit e83c61d

fogsong233 authored and vgvassilev committed
add Kacent profile, blog, presentation
1 parent a0a3621 commit e83c61d

7 files changed

Lines changed: 146 additions & 0 deletions

.github/actions/spelling/allow/terms.txt

Lines changed: 8 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -206,3 +206,11 @@ PBMCs
206206
PPo
207207
SIGPLAN
208208
TGF
209+
ATen
210+
autograd
211+
dtype
212+
fogsong
213+
Kacent
214+
LibTorch
215+
Nanjing
216+
PyTorch

_data/contributors.yml

Lines changed: 32 additions & 0 deletions
@@ -209,6 +209,38 @@
209209
into a robust and consistent system.
210210
proposal: /assets/docs/Vedant_Goyal_Proposal_2026.pdf
211211
mentors: Aaron Jomy, David Lange, Vassil Vassilev
212+
213+
- name: Kacent Huang
214+
info: "Google Summer of Code 2026 Contributor"
215+
email: "fogsong233@gmail.com"
216+
github: "https://github.com/fogsong233"
217+
education: "Computer Science, Nanjing University, Nanjing, China"
218+
active: 1
219+
projects:
220+
- title: "Clad as a First-Class Gradient Engine in LibTorch"
221+
status: Ongoing
222+
description: |
223+
This proposal targets the HSF 2026 idea "Clad as a first-class gradient engine
224+
in LibTorch", a 350-hour project intended to make compiler-generated gradients
225+
available from LibTorch and, eventually, easier to reuse from the ROOT ecosystem.
226+
My goal is to build a practical and well-scoped integration path that allows
227+
a LibTorch C++ training loop to call Clad-generated derivative code for
228+
a small but meaningful subset of workloads. The primary implementation path
229+
is to wrap Clad-generated backward logic inside torch::autograd::Function, because
230+
the PyTorch C++ frontend explicitly supports custom autograd functions and presents
231+
them as the standard way to integrate optimized forward and backward code in extensions.
232+
If time permits and the API proves cleaner, I will also evaluate a custom-operator path
233+
based on the PyTorch C++ extension mechanisms. This proposal is intentionally scoped
234+
as a proof of concept, not as a general replacement for LibTorch autograd. A realistic
235+
first milestone is to differentiate selected training kernels or small model components
236+
written in C++, expose them to LibTorch, and measure where compiler-generated derivatives
237+
are correct, maintainable, and competitive. This produces a concrete result for mentors
238+
and also establishes a clear baseline for future work on broader operator coverage, deeper
239+
ROOT integration, or GPU support.
240+
proposal: /assets/docs/Kacent_Proposal_GSOC2026.pdf
241+
mentors: Aaron Jomy, David Lange, Vassil Vassilev
242+
243+
212244

213245
- name: Matthew Barton
214246
info: "Open Source Contributor"

_data/standing_meetings.yml

Lines changed: 4 additions & 0 deletions
@@ -19,6 +19,10 @@
1919
date: 2026-05-27 9:15:00 +0200
2020
speaker: "Georgi Runtolev"
2121
link: "[Slides](/assets/presentations/G_Runtolev_xeus_cpp_final_report.pdf)"
22+
- title: "Clad as a First-Class Gradient Engine in LibTorch: Initial Presentation"
23+
date: 2026-05-20 17:00:00 +0200
24+
speaker: "Kacent Huang"
25+
link: "[Slides](/assets/presentations/Kacent_Clad_As_Torch_Engine.pdf)"
2226
- title: "Ramtools multithreading final presentation"
2327
date: 2026-05-13 9:15:00 +0200
2428
speaker: "Georgi Haralanov"

_pages/team/kacent-huang.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
---
2+
title: "Compiler Research - Team - Kacent Huang"
3+
layout: gridlay
4+
excerpt: "Compiler Research: Team members"
5+
sitemap: false
6+
permalink: /team/KacentHuang
7+
email: fogsong233@gmail.com
8+
---
9+
10+
{% include team-profile.html %}
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
1+
---
2+
title: "Clad as a First-Class Gradient Engine in LibTorch"
3+
layout: post
4+
excerpt: "A GSoC 2026 project exploring whether Clad-generated gradients can become a practical backend for selected LibTorch C++ workloads."
5+
sitemap: true
6+
author: Kacent Huang
7+
permalink: /blogs/gsoc26_kacent_introduction_blog/
8+
banner_image: /images/blog/banner-clad-gsoc.png
9+
date: 2026-06-03
10+
tags: gsoc c++ clang clad libtorch pytorch machine-learning
11+
---
12+
13+
### Introduction
14+
15+
My name is Kacent Huang, and I am a second-year Computer Science undergraduate student at Nanjing University. During Google Summer of Code 2026, I will be working with the Compiler Research group on **"Clad as a First-Class Gradient Engine in LibTorch"**.
16+
17+
The project explores whether compiler-generated gradients from [Clad](https://github.com/vgvassilev/clad) can be used as a practical backend for selected [LibTorch](https://pytorch.org/cppdocs/) workloads. The goal is not to replace the full PyTorch autograd system. Instead, I want to build a focused C++ prototype that shows where source-transformed derivatives are correct, maintainable, and competitive.
18+
19+
**Mentors**: Aaron Jomy, David Lange, Vassil Vassilev
20+
21+
### Background
22+
23+
LibTorch is the C++ API of PyTorch. It lets C++ applications define tensors, run models, and execute training or inference workflows without moving the main application logic into Python. This is useful for ROOT and high-energy physics (HEP) workflows, where data processing, simulation, and analysis are already deeply rooted in C++.
24+
25+
Clad is a Clang-based automatic differentiation tool. Rather than building a dynamic computation graph at runtime, Clad works through compiler source transformation: it analyzes C++ code and emits derivative code. Recent Compiler Research work has shown that this approach can be promising for machine-learning workloads, especially when the workload is CPU-bound and the differentiated code is carefully scoped.
26+
27+
This project sits at the boundary between these two systems. LibTorch provides the tensor runtime and the user-facing ML framework, while Clad provides a compiler-driven path for generating backward code for selected parts of a workload.
28+
29+
### Project Scope
30+
31+
The first version of the project will be intentionally narrow. LibTorch supports a very large operator ecosystem, dynamic tensor dispatch, and framework-managed autograd. Trying to replace all of that in one GSoC project would be unrealistic and would not produce a useful engineering result.
32+
33+
Instead, the project will focus on a limited proof of concept:
34+
35+
- CPU execution first.
36+
- Contiguous floating-point tensors first.
37+
- A small supported operator or graph subset first.
38+
- Correctness checks against native LibTorch autograd and, where appropriate, finite differences.
39+
- Clear documentation of what is supported and what remains out of scope.
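The "contiguous tensors first" restriction in the list above can be made concrete with a small check on sizes and strides. The sketch below is framework-free and only illustrates the condition; the function name `is_row_major_contiguous` is a hypothetical helper, and in real LibTorch code one would normally just call `tensor.is_contiguous()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a LibTorch API): returns true when `strides`
// (measured in elements) describe a dense row-major layout for `sizes`.
// Dimensions of size 1 are skipped because their stride never affects
// addressing; a tensor with any zero-sized dimension counts as contiguous.
inline bool is_row_major_contiguous(const std::vector<std::int64_t>& sizes,
                                    const std::vector<std::int64_t>& strides) {
  assert(sizes.size() == strides.size());
  for (std::int64_t s : sizes) {
    if (s == 0) return true;  // empty tensor: nothing to address
  }
  std::int64_t expected = 1;  // stride the innermost dimension must have
  for (std::size_t i = sizes.size(); i-- > 0;) {
    if (sizes[i] == 1) continue;
    if (strides[i] != expected) return false;
    expected *= sizes[i];
  }
  return true;
}
```

A 2x3 tensor with strides (3, 1) passes this check, while its transposed view with strides (1, 2) does not, so a prototype limited to contiguous inputs would have to reject or copy the latter.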
40+
41+
My proposal starts from `torch::autograd::Function` because it is the smallest official LibTorch extension boundary for custom forward and backward code. The forward path can call a reference C++ kernel or component, while the backward path can call Clad-generated derivative code.
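The forward/backward split described above can be sketched without any framework dependency. This is an illustration only: `SquareSumNode` is a hypothetical stand-in for a class derived from `torch::autograd::Function`, and `square_sum_grad` is hand-written here to play the role that Clad-generated reverse-mode code would play in the real integration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference kernel: f(x) = sum_i x_i^2.
inline double square_sum(const std::vector<double>& x) {
  double acc = 0.0;
  for (double v : x) acc += v * v;
  return acc;
}

// Hand-written stand-in for generated derivative code (assumption: the real
// version would come from Clad). Accumulates upstream * df/dx_i into d_x.
inline void square_sum_grad(const std::vector<double>& x, double upstream,
                            std::vector<double>& d_x) {
  assert(d_x.size() == x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    d_x[i] += 2.0 * x[i] * upstream;
  }
}

// Hypothetical node mirroring the forward/backward contract: forward saves
// only what backward needs, backward maps the output gradient to input
// gradients.
struct SquareSumNode {
  std::vector<double> saved_x;

  double forward(const std::vector<double>& x) {
    saved_x = x;  // minimal saved state
    return square_sum(x);
  }

  std::vector<double> backward(double grad_output) const {
    std::vector<double> d_x(saved_x.size(), 0.0);
    square_sum_grad(saved_x, grad_output, d_x);
    return d_x;
  }
};
```

Calling `forward` on (1, 2, 3) and then `backward(1.0)` yields the gradient (2, 4, 6). Keeping the saved state down to the input vector matches the goal of an interface that stays easy to inspect.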
42+
43+
In the initial presentation, I also described a longer-term "torch.compile-like" direction for LibTorch C++. In eager LibTorch code, a user function such as:
44+
45+
```cpp
auto my_graph(torch::Tensor x) -> torch::Tensor {
  return (x * x).sum();
}
```
50+
51+
produces an observed graph instance at runtime: each call records the operations it actually executed, here a multiplication followed by a sum. A future version of this idea could specialize that observed graph by operator sequence, tensor shape, dtype, and layout assumptions, then lower the supported subset into Clad-friendly C++ code. That is a broader research direction; the GSoC deliverable remains a small, measurable prototype.
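One way to picture "specialize by operator sequence, tensor shape, dtype, and layout" is a cache keyed on exactly those four properties. The sketch below is purely illustrative and not part of the proposal: `GraphKey` and `SpecializationCache` are hypothetical names, and the integer id stands in for whatever generated code a real system would store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical key describing one observed graph instance.
struct GraphKey {
  std::vector<std::string> ops;     // operator sequence, e.g. {"mul", "sum"}
  std::vector<std::int64_t> shape;  // input tensor shape
  std::string dtype;                // e.g. "float32"
  bool contiguous;                  // layout assumption

  bool operator<(const GraphKey& o) const {
    return std::tie(ops, shape, dtype, contiguous) <
           std::tie(o.ops, o.shape, o.dtype, o.contiguous);
  }
};

// Hypothetical cache: each distinct key gets its own specialization id.
class SpecializationCache {
 public:
  // Returns the id for `key`, creating a new one the first time it is seen.
  int lookup_or_create(const GraphKey& key) {
    assert(!key.ops.empty());
    auto it = ids_.find(key);
    if (it != ids_.end()) return it->second;
    const int id = static_cast<int>(ids_.size());
    ids_.emplace(key, id);
    return id;
  }

  std::size_t size() const { return ids_.size(); }

 private:
  std::map<GraphKey, int> ids_;
};
```

Under this scheme the same operator sequence applied to a tensor of a different shape produces a second entry, which is the cost of specializing on shape: fewer runtime checks inside the generated code, more variants to generate and store.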
52+
53+
### Implementation Plan
54+
55+
The implementation will proceed in stages.
56+
57+
First, I will select a reference workload that is small enough to test rigorously. Candidate workloads include a compact dense layer, an activation-plus-loss path, or a toy training component that can be expressed with source-visible C++ logic and simple tensor access patterns.
58+
59+
Second, I will generate and validate gradients with Clad outside LibTorch. This separates the core AD question from the framework-integration question: before using the generated code inside LibTorch, the derivative itself should be checked against expected results.
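A minimal sketch of such a standalone check, assuming a scalar-valued function of a vector, compares an analytic gradient with central finite differences. The helper names are hypothetical, and in the project the analytic gradient would come from Clad instead of being written by hand.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Central finite differences: grad[i] ~ (f(x + h e_i) - f(x - h e_i)) / (2h).
// `x` is taken by value so each coordinate can be perturbed and restored.
inline std::vector<double> finite_difference_gradient(
    const std::function<double(const std::vector<double>&)>& f,
    std::vector<double> x, double h = 1e-6) {
  assert(h > 0.0);
  std::vector<double> grad(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double original = x[i];
    x[i] = original + h;
    const double f_plus = f(x);
    x[i] = original - h;
    const double f_minus = f(x);
    x[i] = original;  // restore before moving to the next coordinate
    grad[i] = (f_plus - f_minus) / (2.0 * h);
  }
  return grad;
}

// Largest element-wise absolute difference between two equally sized vectors.
inline double max_abs_difference(const std::vector<double>& a,
                                 const std::vector<double>& b) {
  assert(a.size() == b.size());
  double worst = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    worst = std::fmax(worst, std::fabs(a[i] - b[i]));
  }
  return worst;
}
```

For f(x) = sum of x_i^2, the finite-difference result should agree with the analytic gradient 2x up to rounding error. Finite differences are slow and only approximate, so they serve as an independent sanity check, not as the reference for performance comparisons.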
60+
61+
Third, I will build the LibTorch integration layer. The main path is a custom `torch::autograd::Function` whose `forward` method runs the selected C++ workload and whose `backward` method calls the Clad-generated derivative. The interface should keep saved tensors and metadata minimal so the data flow stays easy to inspect.
62+
63+
Fourth, I will compare the Clad-backed path with native LibTorch autograd on the same workload. Correctness comes first. Performance measurements and engineering tradeoffs will follow once the end-to-end path is stable.
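When comparing two gradient implementations, exact equality is usually too strict because the two paths may order floating-point operations differently. A common alternative is a mixed absolute/relative tolerance of the form |a - b| <= atol + rtol * |b|, which is the same shape of test that `torch::allclose` applies. The standalone helper below is a sketch of that rule; the name `all_close` and the default tolerances are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every candidate element is within
// atol + rtol * |reference| of the matching reference element.
// Size mismatches and NaN values are reported as "not close".
inline bool all_close(const std::vector<double>& candidate,
                      const std::vector<double>& reference,
                      double rtol = 1e-5, double atol = 1e-8) {
  assert(rtol >= 0.0 && atol >= 0.0);
  if (candidate.size() != reference.size()) return false;
  for (std::size_t i = 0; i < candidate.size(); ++i) {
    const double diff = std::fabs(candidate[i] - reference[i]);
    // Written as !(<=) so that a NaN difference fails the check.
    if (!(diff <= atol + rtol * std::fabs(reference[i]))) return false;
  }
  return true;
}
```

Here `reference` would hold the gradient from native LibTorch autograd and `candidate` the gradient from the Clad-backed path. Suitable tolerances depend on dtype and workload size, so they should be reported alongside the results.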
64+
65+
If the primary path is stable early enough, I will evaluate whether a PyTorch custom-operator route gives a cleaner API or lower overhead. I will treat that as a stretch goal, not a prerequisite for the core project.
66+
67+
### Expected Deliverables
68+
69+
By the end of the project, I aim to deliver:
70+
71+
- A minimal LibTorch C++ example that uses Clad-generated derivative code through `torch::autograd::Function`.
72+
- A documented CPU reference workload chosen to match Clad's strengths.
73+
- Correctness tests against native LibTorch autograd and finite-difference checks where they are useful.
74+
- Benchmark notes explaining runtime behavior, integration overhead, and workload limitations.
75+
- Developer-facing documentation describing the supported scope and possible extension points.
76+
77+
The result should give the Compiler Research and HSF communities a concrete baseline for future work. If the prototype works well, it can motivate broader operator coverage, a cleaner user-facing API, ROOT-facing examples, or GPU support. If some parts do not work well, the project should still document the boundary clearly enough to guide the next attempt.
78+
79+
### Looking Forward
80+
81+
I am interested in this project because it combines compilers, C++, machine-learning systems, and scientific software. I hope it yields not only a working demo but also a clear understanding of where compiler-generated derivatives fit naturally into a mature ML framework.
82+
83+
For users, the ideal experience should still feel like LibTorch C++. Clad should act as a compiler backend for supported pieces of the computation, with LibTorch remaining responsible for tensors, execution, and the surrounding training workflow.
84+
85+
### Related Links
86+
87+
- [Project Description](https://hepsoftwarefoundation.org/gsoc/2026/proposal_Clad-Libtorch.html)
88+
- [Clad Repository](https://github.com/vgvassilev/clad)
89+
- [PyTorch C++ Documentation](https://pytorch.org/cppdocs/)
90+
- [GSoC Project Proposal](/assets/docs/Kacent_Proposal_GSOC2026.pdf)
91+
- [Initial Presentation](/assets/presentations/Kacent_Clad_As_Torch_Engine.pdf)
92+
- [My GitHub Profile](https://github.com/fogsong233)
295 KB
Binary file not shown.
266 KB
Binary file not shown.
