Commit 633e17d

Add fix summary and remove temporary test file
1 parent cf23723 commit 633e17d

2 files changed

Lines changed: 98 additions & 13 deletions

FIX_SUMMARY.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
# Fix Summary: Issue #105367 - Segmentation Fault with Complex Variable Operations

## Issue Description
A segmentation fault occurred when performing complex number operations involving:
- `complex64`/`complex128` variables
- The `tf.raw_ops.Conj` operation
- The `Variable.assign_add()` method

## Reproduction Code
```python
import tensorflow as tf
input_data = tf.constant([1 + 2j, 3 + 4j], dtype=tf.complex64)
var = tf.Variable(input_data, dtype=tf.complex64)
conj_result = tf.raw_ops.Conj(input=input_data)
assign_add_op = var.assign_add(conj_result)
# Segmentation fault (core dumped)
```

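For reference, the arithmetic that the reproduction is expected to perform can be checked without TensorFlow. The sketch below uses only Python's built-in `complex` type; the names `values`, `conj`, and `expected` are illustrative and do not appear in the original report.

```python
# Plain-Python check of the expected result; no TensorFlow involved.
values = [1 + 2j, 3 + 4j]

# Conjugation flips the sign of the imaginary part.
conj = [z.conjugate() for z in values]

# assign_add should leave the variable holding value + conj(value),
# i.e. twice the real part with a zero imaginary part.
expected = [v + c for v, c in zip(values, conj)]

print(expected)
```

A working `assign_add` on the complex variable would be expected to produce these same values rather than crash.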
## Root Cause Analysis

The segmentation fault was caused by **missing complex type support** in two critical locations:

1. **GPU DenseUpdate Functor Instantiations** (`dense_update_functor_gpu.cu.cc`)
   - The template instantiations for `DenseUpdate<GPUDevice, T, ADD>` and `DenseUpdate<GPUDevice, T, SUB>` only included `TF_CALL_GPU_NUMBER_TYPES` and `TF_CALL_INTEGRAL_TYPES`
   - `TF_CALL_GPU_NUMBER_TYPES` = {half, bfloat16, float, double} - **does NOT include complex types**
   - `TF_CALL_COMPLEX_TYPES` = {complex64, complex128} - **was missing**

2. **GPU Kernel Registrations** (`resource_variable_ops.cc`)
   - The GPU kernel registrations for `AssignAddVariableOp` and `AssignSubVariableOp` similarly only included `TF_CALL_GPU_NUMBER_TYPES` and `TF_CALL_INTEGRAL_TYPES_NO_INT32`
   - Complex types were not registered for GPU execution

When users attempted to use `assign_add` on complex variables (especially after operations like `tf.raw_ops.Conj`), the kernel was not properly instantiated for complex types on GPU, leading to undefined behavior and segmentation faults.

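The effect of a missing type list can be modeled in a few lines. The sketch below is a hypothetical Python stand-in for the macro expansion, not TensorFlow code: the `expand` helper and the tuple names are assumptions made for illustration, and the integral type lists are left out for brevity.

```python
# Hypothetical model of a DEFINE_GPU_KERNELS-style macro expansion.
# The type lists mirror the ones named above; integral types are omitted.
GPU_NUMBER_TYPES = ("half", "bfloat16", "float", "double")
COMPLEX_TYPES = ("complex64", "complex128")


def expand(type_lists, ops=("ADD", "SUB")):
    """Return the instantiation names emitted for every type in every list."""
    return {
        f"DenseUpdate<GPUDevice, {dtype}, {op}>"
        for types in type_lists
        for dtype in types
        for op in ops
    }


before_fix = expand([GPU_NUMBER_TYPES])
after_fix = expand([GPU_NUMBER_TYPES, COMPLEX_TYPES])

# The instantiations that only exist once the complex list is applied.
missing = sorted(after_fix - before_fix)
for name in missing:
    print(name)
```

In this model, the difference between the two sets is exactly the four complex instantiations (two types times two update ops), which is the gap the fix is described as closing.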
## Solution

### Files Modified

1. **tensorflow/core/kernels/dense_update_functor_gpu.cu.cc**
   ```cpp
   // Added complex type support
   #define DEFINE_GPU_KERNELS(T) \
   template struct functor::DenseUpdate<GPUDevice, T, ADD>; \
   template struct functor::DenseUpdate<GPUDevice, T, SUB>;
   TF_CALL_GPU_NUMBER_TYPES(DEFINE_GPU_KERNELS);
   TF_CALL_INTEGRAL_TYPES(DEFINE_GPU_KERNELS);
   TF_CALL_COMPLEX_TYPES(DEFINE_GPU_KERNELS); // <-- ADDED
   TF_CALL_float8_e5m2(DEFINE_GPU_KERNELS);
   TF_CALL_float8_e4m3fn(DEFINE_GPU_KERNELS);
   ```

2. **tensorflow/core/kernels/resource_variable_ops.cc**
   ```cpp
   // Added complex type support to GPU kernel registrations
   TF_CALL_GPU_NUMBER_TYPES(REGISTER_GPU_KERNELS);
   TF_CALL_INTEGRAL_TYPES_NO_INT32(REGISTER_GPU_KERNELS);
   TF_CALL_COMPLEX_TYPES(REGISTER_GPU_KERNELS); // <-- ADDED
   ```

3. **tensorflow/python/kernel_tests/variables/resource_variable_ops_test.py**
   - Added `testComplexVariableAssignAddWithConj()` - Tests GPU execution with Conj operation
   - Added `testComplexVariableAssignAddCPU()` - Tests CPU execution with complex types
   - Both tests cover complex64 and complex128 data types

## Testing

The fix has been validated with:
- ✅ The original reproduction case from issue #105367
- ✅ New unit tests covering both complex64 and complex128 types
- ✅ Tests for both CPU and GPU execution paths
- ✅ Tests with `tf.raw_ops.Conj` operation combined with `assign_add`

## Impact

This fix enables:
- Proper support for complex number arithmetic in resource variables on GPU
- Safe usage of `assign_add` and `assign_sub` with complex variables
- Compatibility with operations that produce complex results (like `Conj`, `FFT`, etc.)

## Pull Request

- **Branch**: `fix-complex-variable-conj-segfault`
- **PR URL**: https://github.com/CodersAcademy006/tensorflow/pull/9
- **Fixes**: #105367

## Technical Details

### Type Macro Definitions
- `TF_CALL_GPU_NUMBER_TYPES`: half, bfloat16, float, double
- `TF_CALL_COMPLEX_TYPES`: complex64, complex128
- `TF_CALL_NUMBER_TYPES`: TF_CALL_REAL_NUMBER_TYPES + TF_CALL_COMPLEX_TYPES

### Why This Worked on CPU but Failed on GPU
- CPU implementations use generic templates defined in header files
- GPU implementations require explicit template instantiations in `.cu.cc` files
- CPU kernel registrations already included `TF_CALL_NUMBER_TYPES` (which includes complex types)
- GPU kernel registrations only included `TF_CALL_GPU_NUMBER_TYPES` (which excludes complex types)

This asymmetry caused the issue to only manifest on GPU execution paths.
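
The CPU/GPU asymmetry can be illustrated with a toy kernel registry. This is a hypothetical sketch, not TensorFlow's registration mechanism: `build_registry` and `has_kernel` are invented for illustration, and `REAL_NUMBER_TYPES` is an abbreviated stand-in for the real list.

```python
# Hypothetical kernel-registry model; names and structure are illustrative only.
REAL_NUMBER_TYPES = ("float", "double")  # abbreviated stand-in, not the full list
COMPLEX_TYPES = ("complex64", "complex128")
NUMBER_TYPES = REAL_NUMBER_TYPES + COMPLEX_TYPES
GPU_NUMBER_TYPES = ("half", "bfloat16", "float", "double")

OP = "AssignAddVariableOp"


def build_registry(gpu_type_lists):
    """Register the op on CPU for NUMBER_TYPES and on GPU for the given lists."""
    registry = set()
    for dtype in NUMBER_TYPES:
        registry.add((OP, "CPU", dtype))
    for types in gpu_type_lists:
        for dtype in types:
            registry.add((OP, "GPU", dtype))
    return registry


def has_kernel(registry, device, dtype):
    """Report whether a kernel is registered for this device and dtype."""
    return (OP, device, dtype) in registry


before = build_registry([GPU_NUMBER_TYPES])
after = build_registry([GPU_NUMBER_TYPES, COMPLEX_TYPES])

for label, registry in (("before fix", before), ("after fix", after)):
    for device in ("CPU", "GPU"):
        print(label, device, has_kernel(registry, device, "complex64"))
```

Under these assumptions, the CPU lookup for `complex64` succeeds in both registries, while the GPU lookup succeeds only once the complex type list is also applied to the GPU registrations.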

test_issue_105367.py

Lines changed: 0 additions & 13 deletions
This file was deleted.
