
MLX Backend Tests

This directory contains end-to-end tests for the MLX backend. Each test verifies that a specific op or pattern is lowered to MLX correctly and that its outputs match between PyTorch eager execution and the MLX runtime.

Setup

1. Install ExecuTorch Python package (if not already installed)

python install_executorch.py --editable

2. Configure CMake with MLX preset

From the ExecuTorch root directory:

cmake --preset mlx-release -DEXECUTORCH_BUILD_TESTS=ON

This configures the build with MLX delegate support and test targets. Build files are generated in cmake-out/.

3. Build the test runner

cmake --build cmake-out --target op_test_runner

This builds the op_test_runner binary that executes .pte models using the MLX runtime.

Prerequisites

  1. Python environment: Tests must be run in an environment where the executorch Python package is installed
  2. Built C++ runtime: The op_test_runner binary must be built (see Setup above)

Running Tests

Run All Tests

To run all registered tests:

python -m executorch.backends.mlx.test.run_all_tests -j4 --clean-after

Options

Flag                   Description
-j N / --parallel N    Run tests in parallel with N workers
--clean-after          Clean up generated test files after running
--clean                Clean up generated test files and exit
--rebuild              Rebuild the C++ test runner before running
--list                 List available tests and exit
-v / --verbose         Verbose output
--timeout SECS         Timeout per test in seconds (default: 300)

Memory Management Options

Running many tests in a single process can accumulate memory (torch/MLX/Metal allocations). These flags help bound memory usage:

Flag                        Description
--isolate                   Run each test in a separate subprocess (sequential mode only). Provides full memory isolation but is slower due to per-test Python/torch import overhead.
--max-tasks-per-worker N    Recycle parallel workers after N tests (parallel mode only). Workers are terminated and replaced after completing N tests, releasing accumulated memory.

Comparison:

Mode                              Memory isolation                     Speed
-j 4                              None (workers reused)                Fastest
-j 4 --max-tasks-per-worker 10    Bounded (recycled every 10 tests)    Fast
-j 4 --max-tasks-per-worker 1     Full (new process per test)          Slower
--isolate                         Full (subprocess per test)           Slowest (sequential)

Recommended for CI with memory constraints:

python -m executorch.backends.mlx.test.run_all_tests -j4 --max-tasks-per-worker 10 --clean-after
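The recycling behavior of --max-tasks-per-worker matches the semantics of multiprocessing.Pool's maxtasksperchild; whether the runner is implemented exactly this way is an assumption, but the effect can be sketched with it:

```python
import multiprocessing as mp
import os

def run_test(name):
    # Stand-in for one op test; returns the worker PID so recycling is visible.
    return name, os.getpid()

def run_suite(num_tests=8, workers=2, max_tasks=2):
    # maxtasksperchild mirrors --max-tasks-per-worker N: a worker process
    # exits after completing `max_tasks` tests and is replaced by a fresh
    # one, releasing whatever memory the old process had accumulated.
    ctx = mp.get_context("fork")  # POSIX-only; keeps the sketch self-contained
    with ctx.Pool(processes=workers, maxtasksperchild=max_tasks) as pool:
        results = pool.map(run_test, [f"test_{i}" for i in range(num_tests)])
    return {pid for _, pid in results}

if __name__ == "__main__":
    # Without recycling, 2 workers would yield exactly 2 PIDs; with
    # max_tasks=2 and 8 tests, 4 worker generations appear instead.
    print(f"distinct worker PIDs: {len(run_suite())}")
```

The trade-off in the table above falls out of this directly: smaller max_tasks means more process churn (slower) but a tighter bound on per-process memory.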

Run a Specific Test

To run a specific test by name (e.g., linear):

python -m executorch.backends.mlx.test.run_all_tests linear

With verbose output:

python -m executorch.backends.mlx.test.run_all_tests -v linear

List Available Tests

python -m executorch.backends.mlx.test.run_all_tests --list

Test Architecture

All tests are defined in test_ops.py. Each test follows a common pattern:

  1. Define a model - A simple nn.Module that uses the op being tested
  2. Create test inputs - Generate random input tensors
  3. Export and lower - Export the model and lower it to the MLX backend
  4. Run C++ binary - Execute the lowered model using op_test_runner
  5. Compare outputs - Verify PyTorch and MLX outputs match within tolerance
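The tolerance check in step 5 follows the usual allclose rule, driven by the rtol/atol fields each test class defines. A minimal pure-Python sketch of that comparison (the real harness compares tensors, not flat lists):

```python
def within_tolerance(expected, actual, rtol=1e-5, atol=1e-5):
    # Step 5 in miniature: every element must satisfy
    #   |actual - expected| <= atol + rtol * |expected|,
    # the same rule torch.allclose applies to the real tensor outputs.
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for e, a in zip(expected, actual)
    )
```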

Test Class Structure

Tests inherit from OpTestCase and implement:

@register_test
class MyTest(OpTestCase):
    name = "my_test"           # Test name (used for output directory)
    rtol = 1e-5                # Relative tolerance for comparison
    atol = 1e-5                # Absolute tolerance for comparison

    def create_model(self) -> nn.Module:
        """Return the model to test."""
        ...

    def create_inputs(self) -> Tuple[torch.Tensor, ...]:
        """Return input tensors for export."""
        ...

    def get_dynamic_shapes(self) -> Optional[Dict]:
        """Return dynamic shape specs, or None for static shapes."""
        ...

    @classmethod
    def get_test_configs(cls) -> List["MyTest"]:
        """Return list of test configurations to run."""
        ...

Test Output

Test artifacts are saved to op_tests/<test_name>/:

  • model.pte - Exported ExecuTorch model
  • input.bin - Serialized input tensors
  • expected_output.bin - PyTorch reference output
  • actual_output.bin - MLX runtime output
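The exact .bin layout is defined by the harness and runner and is not documented here. Assuming a raw float32 buffer in native byte order (an assumption, not a documented format), a failing test's expected/actual pair could be inspected like this:

```python
import array

def load_f32(path):
    # Assumes the .bin file is a raw float32 buffer in native byte order;
    # the real layout used by op_test_runner may carry headers or dtype info.
    buf = array.array("f")
    with open(path, "rb") as f:
        buf.frombytes(f.read())
    return buf.tolist()

def max_abs_error(expected_path, actual_path):
    # Largest elementwise deviation between the reference and MLX outputs,
    # handy for judging how far outside tolerance a failing test landed.
    expected = load_f32(expected_path)
    actual = load_f32(actual_path)
    assert len(expected) == len(actual), "output element counts differ"
    return max((abs(a - e) for e, a in zip(expected, actual)), default=0.0)
```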

Adding a New Test

  1. Add a new model class and OpTestCase subclass to test_ops.py
  2. Use the @register_test decorator on the test class
  3. Implement create_model(), create_inputs(), and get_test_configs()
  4. Run the test to verify it passes end-to-end

Test Harness

MLX also plugs into the broader ExecuTorch test harness for additional operator coverage. To run it, use the following command from the ExecuTorch root directory:

pytest -c /dev/null backends/test/suite/operators/ -m flow_mlx