fix: resolve MACA build and runtime issues to enable GPT-2 training
CMakeLists.txt:
- Pre-set HAVE_MODE_T/HAVE_SSIZE_T and their sentinel variables
(HAVE_HAVE_MODE_T/HAVE_HAVE_SSIZE_T) before add_subdirectory(glog),
since CMake's feature-detection probes cannot find the standard POSIX
headers when run with mxcc; without the sentinels, check_type_size
re-runs and overwrites the pre-set values, causing glog to emit
conflicting fallback typedefs
- Add BUILD_TESTING=OFF to skip glog unit tests (-fPIE unsupported by mxcc)
- Add BUILD_SHARED_LIBS=OFF to build glog as a static library; mxcc
defaults to hidden symbol visibility, making libglog.so export nothing
datatype.h:
- Add is_bfloat16<T> and is_fp16<T> type traits with USE_CUDA/USE_MACA
specializations, needed by common_cpu.h Cast and init.cc ARANGE_CASE
common/cpu/common_cpu.h:
- Route fp16/bf16 destinations through float in Cast<T>(), avoiding
ambiguous integer→__half/__maca_bfloat16 conversion on MACA
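The trait-plus-float routing described in the two sections above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: FakeHalf is a hypothetical stand-in for __half/__maca_bfloat16, CastImpl is an invented helper, and the real Cast<T>() signature in common_cpu.h may differ.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a device half type: constructible from float
// and double only, which is the situation the commit describes on MACA.
struct FakeHalf {
    float value;
    FakeHalf(float f) : value(f) {}
    FakeHalf(double d) : value(static_cast<float>(d)) {}
};

// Trait: false by default, specialized to true for the half type.
// (The commit guards its specializations with USE_CUDA/USE_MACA.)
template <typename T> struct is_fp16 : std::false_type {};
template <> struct is_fp16<FakeHalf> : std::true_type {};

// Reduced-precision destination: convert to float first, so the float
// constructor is the only exact match.
template <typename DST, typename SRC>
DST CastImpl(SRC x, std::true_type) { return DST(static_cast<float>(x)); }

// Every other destination: plain static_cast.
template <typename DST, typename SRC>
DST CastImpl(SRC x, std::false_type) { return static_cast<DST>(x); }

// Dispatch on the trait at compile time.
template <typename DST, typename SRC>
DST Cast(SRC x) { return CastImpl<DST>(x, is_fp16<DST>{}); }
```

With this shape, Cast<FakeHalf>(3) compiles because the integer is widened to float before the half type is constructed.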
kernels/maca/{stack,concat,slice,transform,elementwise,split,gather}.maca:
- Add reinterpret_cast<void **> to all mcMallocAsync(&ptr, ...) calls;
MACA's mcMallocAsync requires void** but typed pointers were passed
- Fix mcDevAttrMultiProcessorCount → mcDeviceAttributeMultiProcessorCount
in elementwise.maca (correct MACA enum name)
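A minimal sketch of the pointer cast, assuming only what the message states (the allocator takes void**). FakeMallocAsync and AllocFloats are hypothetical names backed by std::malloc; the remaining mcMallocAsync arguments are elided in the message and are reduced to a byte count here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in with a void** out-parameter, the shape the commit
// describes for mcMallocAsync. Returns 0 on success.
int FakeMallocAsync(void **ptr, std::size_t bytes) {
    *ptr = std::malloc(bytes);
    return *ptr != nullptr ? 0 : 1;
}

float *AllocFloats(std::size_t n) {
    float *buf = nullptr;
    // Passing &buf directly would not compile: float** does not convert
    // implicitly to void**, so the cast has to be spelled out.
    int err = FakeMallocAsync(reinterpret_cast<void **>(&buf), n * sizeof(float));
    return err == 0 ? buf : nullptr;
}
```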
optimizer.cc:
- Change Fill<T>(0) → Fill<T>(0.f) for Adam m/v initialization;
__half(0) is ambiguous on MACA (only float/double ctors available)
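The ambiguity can be reproduced without the MACA headers. In the sketch below, FakeHalf is a hypothetical stand-in with only float and double constructors, and this Fill is an illustrative signature rather than the one in optimizer.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in: float and double constructors only.
struct FakeHalf {
    float value;
    FakeHalf(float f) : value(f) {}
    FakeHalf(double d) : value(static_cast<float>(d)) {}
};

// An int argument converts equally well to float and to double, so
// construction from int is ambiguous and therefore ill-formed.
static_assert(!std::is_constructible<FakeHalf, int>::value,
              "int argument is ambiguous");
// A float argument matches the float constructor exactly.
static_assert(std::is_constructible<FakeHalf, float>::value,
              "float argument is an exact match");

// Illustrative Fill: overwrite every element with value.
template <typename T>
void Fill(std::vector<T> &buf, T value) {
    for (std::size_t i = 0; i < buf.size(); ++i) buf[i] = value;
}
```

Calling Fill<FakeHalf>(buf, 0.f) compiles, whereas Fill<FakeHalf>(buf, 0) is rejected for the reason captured in the first static_assert.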
nn/init.cc:
- Replace std::iota + static_cast<TYPE>(start) in ARANGE_CASE with an
explicit loop via static_cast<float> to avoid ambiguous integer→fp16/
bf16 conversion for kBFLOAT16/kFLOAT16 cases
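As a rough illustration of the explicit loop, assuming a hypothetical FakeBFloat16 type and a free-standing Arange helper (the actual ARANGE_CASE macro is not shown in the message):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a bf16 type with float/double constructors only.
struct FakeBFloat16 {
    float value;
    FakeBFloat16(float f) : value(f) {}
    FakeBFloat16(double d) : value(static_cast<float>(d)) {}
};

// Produce start, start+1, ..., start+count-1 as T. Each integer goes
// through float first, so no integer is ever converted to T directly.
template <typename T>
std::vector<T> Arange(int start, int count) {
    std::vector<T> out;
    out.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        out.push_back(T(static_cast<float>(start + i)));
    }
    return out;
}
```

Unlike std::iota, the loop never increments a T value, so the element type does not need operator++.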
example/gpt2/main.cc:
- Add kDeviceMACA constant, update --device validator to accept "maca",
and add Device::DeviceType::kMACA branch in device selection