macOS Apple Silicon: C++ Segfault during test shutdown due to TF GCS client (tokenizer caching) #660

@prince-shakyaa

System Information:

  • OS: macOS (Apple Silicon)
  • Python Version: 3.12
  • Environment: Local pytest execution

Bug Description:
When running the gemma/gm/text test suite locally on macOS Apple Silicon, the pytest-xdist workers crash with the following unhandled C++ exception during interpreter shutdown:
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument

Root Cause:
The issue stems from gemma/gm/utils/_file_cache.py. Currently, maybe_get_from_cache does not actually download and save the tokenizer model to the local cache directory if it is missing. Instead, it returns the gs:// path, causing epath.Path('gs://...').read_bytes() to be executed.

This directly invokes the TensorFlow C++ GCS client to read the bytes over the network. On macOS Apple Silicon, tearing down the SentencePieceProcessor and the TensorFlow C++ GCS client threads during interpreter shutdown appears to trigger the mutex lock failure shown above. The uncaught exception then terminates the worker process (strictly an abort via libc++abi rather than a segfault).

Proposed Fix:
We should update _file_cache.py to intercept gs:// paths, download them over standard HTTP (e.g., using urllib.request), and save them to the local ~/.gemma/tokenizer/ cache directory before returning the path.

This ensures epath only ever performs local disk I/O, completely avoiding the C++ GCS client and preventing the crash on macOS. As an added benefit, this introduces true local caching, meaning the test suite (and user code) won't have to download the tokenizer over the network on every single run.
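
The proposed behavior could look roughly like the sketch below. It is not the actual contents of _file_cache.py: the names gcs_to_https and cached_local_path are hypothetical, the injectable fetch argument exists only to make the logic testable offline, and the code assumes the tokenizer object is publicly readable through the standard https://storage.googleapis.com/<bucket>/<object> endpoint.

```python
import os
import pathlib
import tempfile
import urllib.parse
import urllib.request


def gcs_to_https(gs_path: str) -> str:
    """Map gs://bucket/object to its public HTTPS URL (assumes a public object)."""
    if not gs_path.startswith("gs://"):
        raise ValueError(f"not a gs:// path: {gs_path!r}")
    bucket, _, blob = gs_path[len("gs://"):].partition("/")
    if not bucket or not blob:
        raise ValueError(f"expected gs://<bucket>/<object>, got {gs_path!r}")
    # quote() keeps "/" unescaped by default, so nested object names survive.
    return f"https://storage.googleapis.com/{bucket}/{urllib.parse.quote(blob)}"


def _http_fetch(url: str) -> bytes:
    """Download a URL with the standard library only (no TF GCS client)."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def cached_local_path(path, cache_dir, fetch=_http_fetch) -> pathlib.Path:
    """Return a local path for `path`, downloading gs:// files on a cache miss.

    Hypothetical helper: the cache key is just the file name, so two different
    gs:// objects with the same base name would collide.
    """
    path = str(path)
    if not path.startswith("gs://"):
        return pathlib.Path(path)  # Already local: nothing to do.

    cache_dir = pathlib.Path(cache_dir).expanduser()
    target = cache_dir / path.rsplit("/", 1)[-1]
    if target.exists():
        return target  # Cache hit: no network access at all.

    cache_dir.mkdir(parents=True, exist_ok=True)
    data = fetch(gcs_to_https(path))
    # Write to a temp file and rename it into place, so parallel pytest-xdist
    # workers never observe a partially written tokenizer file.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp_name, target)
    return target
```

With something like this in place, a call such as cached_local_path("gs://example-bucket/tokenizer.model", "~/.gemma/tokenizer") (the bucket name here is a placeholder) would hit the network once and return a plain local path on every later run, so the read_bytes() call only touches local disk.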

I will be raising a PR shortly with this fix!
