Commit 0665bd0

Merge pull request #15 from SharpAI/feature/use-upstream-mlx-swift
Feature/use upstream mlx swift
2 parents 29f11a6 + 0f59dad commit 0665bd0

1,832 files changed: +279 -571,258 lines changed

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
---
description: How to synchronize Apple MLX ecosystem updates into SharpAI forks and triage SSD-streaming bugs
---

# Upstream MLX Synchronization & SSD Streaming Maintenance

This workflow documents how the Apple MLX forks are maintained within the SharpAI repository ecosystem, how to synchronize them with upstream, and how to resolve bugs in the custom `ssd_streamer` extensions.

## 1. Ecosystem Architecture

The `mlx-server` repository now references the Swift layer, `SharpAI/mlx-swift`, via Swift Package Manager (SPM).

```
mlx-server (SharpAI/SwiftLM)

 └── SPM Dependency: SharpAI/mlx-swift (the Swift wrapper)
      ├── .gitmodules
      │    ├── submodules/mlx -> https://github.com/SharpAI/mlx (Branch: main)
      │    └── submodules/mlx-c -> https://github.com/SharpAI/mlx-c (Branch: main)
```

**Never bundle C++ source files directly into `mlx-swift`.** All core engine updates and C-wrapper modifications MUST be made in the `SharpAI/mlx` and `SharpAI/mlx-c` forks, respectively.

## 2. Upstream Feature Verification & Integration Flow

When Apple releases new features to `ml-explore/mlx` or `ml-explore/mlx-c`, follow this process to verify, integrate, and validate the changes before bringing them into the SharpAI ecosystem.

### 2.1 Double-Checking Upstream Features

Before syncing, verify whether Apple's upstream now covers all of our custom requirements; the answer determines whether the custom patches can safely be dropped:

1. **Review Upstream Changelog/Releases:** Monitor the [Apple MLX Releases page](https://github.com/ml-explore/mlx/releases) or the `main` commit history for mentions of "quantization", "streaming", "memory-mapped operations", or "out-of-core inference".
2. **Examine Target C++ Kernels:**
   - Look primarily in `mlx/backend/metal/` and `mlx/core/`.
   - Has upstream added a native equivalent to `moe_stream_op.cpp`?
   - Do the Metal shaders in `mlx/backend/metal/kernels/` introduce block execution / memory-mapped loading primitives similar to our `ssd_streamer.mm` and `fence.air` logic?
3. **Check Exported C-APIs:** Look at `mlx/c/ops.h` and `mlx/c/fast.h` in `ml-explore/mlx-c`. If Apple has added official C bindings for out-of-core tensor operations, you can safely begin stripping out the custom SharpAI C++ bridging code.
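The kernel review in step 2 can be partly scripted. The sketch below assumes a local checkout of `ml-explore/mlx` and searches the two directories named above for the identifiers used by our patches. The function name `scan_upstream`, the example path, and the choice of search terms are illustrative assumptions, not existing tooling. A text hit only shows where to start reading; it does not prove feature parity.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which files in an upstream checkout mention each term.
# Usage: scan_upstream <path-to-ml-explore-mlx-checkout> <term>...
scan_upstream() {
  local checkout="$1"
  shift
  local term hits
  for term in "$@"; do
    echo "== ${term}"
    # -r recurse, -I skip binary files, -l print file names only, -F literal match.
    # Missing directories are tolerated so the sketch works on partial checkouts.
    hits="$(grep -rIlF -- "${term}" \
      "${checkout}/mlx/backend/metal" "${checkout}/mlx/core" 2>/dev/null || true)"
    if [ -n "${hits}" ]; then
      echo "${hits}"
    else
      echo "(no match)"
    fi
  done
}

# Example (the checkout path is an assumption):
#   scan_upstream ~/src/mlx moe_stream ssd_streamer
```

Each term gets its own `==` header, so an empty section is easy to spot when comparing runs across upstream releases.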

### 2.2 Integration Flow

If Apple's features are beneficial (e.g., core Metal optimizations) but do not replace our SSD streaming, pull them in *while maintaining* the SharpAI SSD kernels.

1. **Pull Upstream to SharpAI forks**:
   ```bash
   git clone https://github.com/SharpAI/mlx && cd mlx
   git remote add upstream https://github.com/ml-explore/mlx
   git fetch upstream

   # Replay our custom SSD commits on top of Apple's latest main
   git rebase upstream/main
   # Resolve any rebase conflicts, typically around `fast.cpp` or the Make/CMake build files
   # --force-with-lease refuses to overwrite commits someone else pushed since your last fetch
   git push --force-with-lease origin main
   ```
2. Repeat the same rebase for `SharpAI/mlx-c`, watching `mlx/c/ops.cpp` for conflicts.
3. In `SharpAI/mlx-swift`, update the submodule pointers to the freshly rebased commits:
   ```bash
   cd LocalPackages/mlx-swift
   git submodule update --remote --recursive
   git commit -am "chore: sync latest Apple MLX components and re-graft SSD patches"
   git push origin main
   ```
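Because step 1 rewrites history and force-pushes, it can help to record which commits exist only in the fork before the rebase and compare the list afterwards. This is a minimal sketch, assuming the `upstream` remote from step 1 has already been fetched. The function name `fork_only_subjects` and the temporary file paths are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the subjects of commits reachable from <head> but
# not from <base>, oldest first. Run inside the fork's working copy.
fork_only_subjects() {
  local base="$1" head="$2"
  git log --reverse --format=%s "${base}..${head}"
}

# Example, run in the SharpAI/mlx clone:
#   fork_only_subjects upstream/main main > /tmp/before.txt
#   git rebase upstream/main
#   fork_only_subjects upstream/main main > /tmp/after.txt
#   diff /tmp/before.txt /tmp/after.txt   # no output: every patch subject survived
```

Comparing subjects rather than hashes is deliberate, since a rebase gives every replayed commit a new hash.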

### 2.3 Validation Flow

Do not deploy binary updates to the inference engine without running the full validation matrix.

1. **Clean Re-Build:** Always wipe the build cache before a Metal compilation test.
   ```bash
   # In the mlx-server repository root
   rm -rf .build
   ./build.sh
   ```
2. **Swift API Layer Verification:** Run the test suite in the wrapper to confirm that the Swift `->` C `->` C++ bindings still line up.
   ```bash
   cd LocalPackages/mlx-swift
   swift test
   ```
3. **Extreme Context Benchmarking (The Harness):**
   - Run the dedicated `/run-benchmark` workflow from the root `mlx-server` directory (using `run_benchmark.sh` or `profile_runner.py`).
   - Specifically target models running with >32k-token contexts. High prompt generation latency, GPU thrashing, or hard Out-of-Memory (OOM) faults indicate that the Metal barrier (`fence.air`) or `ssd_streamer.mm` broke silently during the rebase.
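The three validation steps are meant to run in order, stopping at the first failure. A minimal sketch of such a driver follows. `run_steps` is a hypothetical helper, and the commands in the usage comment are the ones listed above, assumed to be run from the `mlx-server` root. Any arguments `run_benchmark.sh` needs are not shown because this document does not list them.

```bash
#!/usr/bin/env bash
# Hypothetical driver: run each command string in order and stop at the first failure.
run_steps() {
  local step
  for step in "$@"; do
    echo "==> ${step}"
    # Each step runs in its own shell so a `cd` in one step does not leak into the next.
    if ! bash -c "${step}"; then
      echo "FAILED: ${step}" >&2
      return 1
    fi
  done
  echo "all validation steps passed"
}

# Example, from the mlx-server root:
#   run_steps \
#     "rm -rf .build && ./build.sh" \
#     "cd LocalPackages/mlx-swift && swift test" \
#     "./run_benchmark.sh"
```

Stopping early keeps a long benchmark from running against a build that already failed its unit tests.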

## 3. Triaging SSD-Stream Bugs

The SSD streaming kernels introduce custom memory synchronization routines (`ssd_streamer.h`, `ssd_streamer.mm`) that interact with Apple's core MLX framework through `mlx/core/moe_stream_op.cpp`.

**Triage Protocol:**
- **Crash in Metal Execution (`fence.air`, `moe_stream.metal`)**: Check whether Apple's upstream Metal API (`mlx/backend/metal/device.h`) changed assumptions that the SSD kernels rely on. If so, go to `SharpAI/mlx` and patch `mlx/backend/metal/ssd_streamer.mm`.
- **C-API Mapping Errors (`fast.cpp`, `ops.cpp`)**: Swift reports errors when linking to the underlying kernels. Go to `SharpAI/mlx-c` and ensure `mlx/c/ops.cpp` wraps the updated arguments from `moe_stream_op.h` in `SharpAI/mlx`.
- **Memory Leaks/High Swap Usage**: Typically arises when the `fence.air` streaming barrier is not synchronized with newly upstreamed Apple thread-pool executors.
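The first two triage rules amount to a lookup from the file named in a failure to the fork that should be patched. The sketch below encodes only the file names mentioned in this section. The function name `triage_repo` and the `unknown` fallback are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the file named in a crash or build error to the fork
# that the triage protocol says to patch.
triage_repo() {
  case "$(basename "$1")" in
    # Metal execution layer and the SSD streamer live in the C++ engine fork.
    fence.air | moe_stream.metal | ssd_streamer.mm | ssd_streamer.h)
      echo "SharpAI/mlx" ;;
    # C-API mapping errors are fixed in the C bridge fork.
    fast.cpp | ops.cpp)
      echo "SharpAI/mlx-c" ;;
    *)
      echo "unknown" ;;
  esac
}

# Example:
#   triage_repo mlx/backend/metal/ssd_streamer.mm
```

Memory leaks and swap pressure are not tied to a single file, so they are intentionally left out of the lookup.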

## 4. Retiring the Fork (When to Drop)

> [!WARNING]
> The ultimate goal is to delete the `SharpAI/mlx` and `SharpAI/mlx-c` forks and point `SharpAI/mlx-swift` directly at `ml-explore/mlx` and `ml-explore/mlx-c`.

**Indications for Dropping the Fork:**
1. Apple officially merges the Turbo Quant framework into `ml-explore/mlx/fast/turbo_quant.h` or an equivalent upstream PR.
2. Apple natively supports out-of-core SSD context offloading (e.g., streaming inference blocks directly from non-volatile memory to the GPU) in `ml-explore/mlx/backend/metal/`.
3. Apple's native `moe_stream_op` implementations match or exceed the latency speedups provided by our custom `ssd_streamer.mm`.

If any of these conditions is met, confirm that nothing in the patch inventory (section 5) is still required, then rewrite `.gitmodules` in `SharpAI/mlx-swift` to point back at `https://github.com/ml-explore/mlx` (and `https://github.com/ml-explore/mlx-c`) and delete the GitHub forks.
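Rewriting `.gitmodules` can be done with `git config -f` instead of hand-editing the file. This is a minimal sketch: the helper name `retarget_submodule` is hypothetical, and the submodule names in the usage comment are assumed to match the paths shown in the architecture diagram in section 1.

```bash
#!/usr/bin/env bash
# Hypothetical helper: point one submodule entry in a .gitmodules file at a new URL.
# Usage: retarget_submodule <gitmodules-file> <submodule-name> <new-url>
retarget_submodule() {
  local file="$1" name="$2" url="$3"
  git config -f "${file}" "submodule.${name}.url" "${url}"
}

# Example, run in the SharpAI/mlx-swift working copy (submodule names assumed):
#   retarget_submodule .gitmodules submodules/mlx   https://github.com/ml-explore/mlx
#   retarget_submodule .gitmodules submodules/mlx-c https://github.com/ml-explore/mlx-c
#   git submodule sync --recursive   # copy the new URLs into .git/config
```

Only the `url` key is changed, so the `path` entries and any other settings in the file are left as they were.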

## 5. SharpAI Custom Patches Inventory (vs. Upstream ml-explore)

As of **April 2026**, the following specific features exist ONLY in our custom forks. Knowing precisely *what* we added is the key to knowing exactly *when* we can revert to Apple's native upstream (`ml-explore`).

### 🛠️ In `SharpAI/mlx` (C++ Engine)
*Compared to `ml-explore/mlx:main`*
1. `feat: custom ssd-streaming kernels and custom MLX I/O fast loaders`
   - Added `moe_stream_op` primitives enabling SSD flash streaming (out-of-core execution).
2. `fix(metal): align moe_stream_op add_temporary signature with latest apple upstream`
   - Custom extensions needed maintaining against newer MLX memory-pool updates.
3. `fix(metal): add default initialization loop for bound encoder contexts in async`
   - Patched `device.cpp` so thread pool reassignments by Swift's async engine don't result in fatal runtime aborts due to missing context dictionaries.

### 🛠️ In `SharpAI/mlx-c` (C-API Bridge)
*Compared to `ml-explore/mlx-c:main`*
1. `chore: rebase SharpAI custom ops onto latest Apple MLX-C upstream to fix fft/dequantize signatures`
2. `fix(ops): align c wrappers with mlx 0.30.0+ upstream signatures for dequantize, qqmm, and fft`
3. `fix(fft): restore Shape type for fft methods n parameter` & `fix(fft): remove invalid norm from fftshift calls`
   - Resolves signature drift and struct mismatches linking the new C++ API modifications down to Swift C headers.

### 🛠️ In `SharpAI/mlx-swift` (Swift Wrappers)
*Compared to `ml-explore/mlx-swift:main`*
1. `Restoration of missing MLX custom extensions including C-API and Swift bridge` & `Update custom C++ kernel patches for SSD Streaming`
   - Recreated Swift integrations bridging into out-of-core functionality.
2. `chore: isolate SharpAI custom MLX/MLX-C engines into dedicated GitHub forks`
   - Submodule remotes internally pinned from `ml-explore` tracking links to `SharpAI` ecosystem forks.
3. `fix(build): bump cxxLanguageStandard to .gnucxx20 for Apple MLX upstream compatibility`
   - Custom `Package.swift` override explicitly permitting C++20 standard since upstream didn't upgrade constraints simultaneously.
4. `fix(mlx): build steel_conv_3d C++ string for Cmlx target`
   - Added missing header dependencies specifically isolated by recent upstream migrations.
5. `fix(jit): update generated mlx c++ metal headers and fix fast.h signature to match fast.cpp`
   - Recompiled Metal header string buffers internally inside `mlx-generated` ensuring `affine_qmm_t_splitk` and other functions are dynamically injected at runtime.

.github/workflows/build.yml

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@ name: Build
 
 on:
   push:
-    branches: [main, develop, feature/*]
+    branches: [main]
   pull_request:
     branches: [main]
 
@@ -23,9 +23,9 @@ jobs:
           path: .build
           # Key includes product name so any rename (e.g. mlx-server→SwiftLM)
           # automatically busts the cache and prevents stale PCH errors.
-          key: ${{ runner.os }}-spm-SwiftLM-${{ hashFiles('Package.resolved') }}
+          key: ${{ runner.os }}-spm-SwiftLM-v2-${{ hashFiles('Package.resolved') }}
           restore-keys: |
-            ${{ runner.os }}-spm-SwiftLM-
+            ${{ runner.os }}-spm-SwiftLM-v2-
 
       - name: Resolve dependencies
         run: swift package resolve

.github/workflows/e2e-test.yml

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@ name: E2E Tests
 
 on:
   push:
-    branches: [main, feature/*]
+    branches: [main]
   pull_request:
     branches: [main]
 
@@ -24,9 +24,9 @@ jobs:
         uses: actions/cache@v4
         with:
           path: .build
-          key: ${{ runner.os }}-spm-SwiftLM-${{ hashFiles('Package.resolved') }}
+          key: ${{ runner.os }}-spm-SwiftLM-v2-${{ hashFiles('Package.resolved') }}
           restore-keys: |
-            ${{ runner.os }}-spm-SwiftLM-
+            ${{ runner.os }}-spm-SwiftLM-v2-
 
       - name: Clear stale module cache
         # Prevents: "PCH was compiled with module cache path '…mlx-server…'

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -21,3 +21,4 @@ DerivedData/
 curl_out.txt
 sample.txt
 tmp/
+/homesec-benchmark/

.gitmodules

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
 [submodule "mlx-swift-lm"]
 	path = mlx-swift-lm
 	url = https://github.com/SharpAI/mlx-swift-lm.git
+[submodule "LocalPackages/mlx-swift"]
+	path = LocalPackages/mlx-swift
+	url = https://github.com/SharpAI/mlx-swift

LocalPackages/mlx-swift/.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 0 additions & 29 deletions
This file was deleted.

LocalPackages/mlx-swift/.github/pull_request_template.md

Lines changed: 0 additions & 12 deletions
This file was deleted.

LocalPackages/mlx-swift/.github/scripts/build-linux-cuda-cmake.sh

Lines changed: 0 additions & 22 deletions
This file was deleted.

LocalPackages/mlx-swift/.github/scripts/run-xcode-tests.sh

Lines changed: 0 additions & 8 deletions
This file was deleted.

LocalPackages/mlx-swift/.github/scripts/setup+build-linux-container-cmake.sh

Lines changed: 0 additions & 44 deletions
This file was deleted.
