Local Development

Local development is available to the internal Kaggle team only. To develop locally, configure your environment to use the Kaggle Model Proxy.

Prerequisites:

Python 3.11+
Git
uv
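
The prerequisites above can be checked before cloning. The following is a minimal sketch, not part of the repository: the helper name `missing_prerequisites` is hypothetical, and it only confirms that the interpreter is new enough and that `git` and `uv` are discoverable on `PATH`.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)            # from the prerequisites list
REQUIRED_TOOLS = ("git", "uv")  # command-line tools that must be on PATH


def missing_prerequisites(version=None, which=shutil.which):
    """Return a list of problems; an empty list means everything was found.

    `version` and `which` are injectable only to make the helper testable.
    """
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty result means the three prerequisites were detected; it does not verify their versions beyond Python's.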

Installation & Configuration:

Clone the repository:

git clone https://github.com/Kaggle/kaggle-benchmarks.git
cd kaggle-benchmarks

Create a virtual environment and install dependencies using uv:

# Create and activate the virtual environment
uv venv
source .venv/bin/activate  # On Windows, use `.venv\Scripts\activate`

# Install dependencies
uv pip install -e .

Obtain a Kaggle MODEL_PROXY_API_KEY for access.
Create a .env file in the project root and add your configuration:
```
MODEL_PROXY_URL=https://mp-staging.kaggle.net/models/openapi
MODEL_PROXY_API_KEY={your_token}
LLM_DEFAULT=google/gemini-2.5-flash
LLM_DEFAULT_EVAL=google/gemini-2.5-pro
LLMS_AVAILABLE=anthropic/claude-sonnet-4,google/gemini-2.5-flash,meta/llama-3.1-70b,google/gemini-2.5-pro
PYTHONPATH=src
```
- LLM_DEFAULT: Sets the model identifier for kbench.llm, the default model used for running tasks.
- LLM_DEFAULT_EVAL: Sets the model identifier for kbench.judge_llm, which is typically a model used for evaluation or judging the outputs of other models.
- LLMS_AVAILABLE: A comma-separated list of models authorized for use by your proxy token.
Note: Set LLM_DEFAULT, LLM_DEFAULT_EVAL, and LLMS_AVAILABLE only to models that your proxy token is authorized to use.
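
Because both default models must be among the models listed in LLMS_AVAILABLE, a quick consistency check on the .env contents can catch typos early. This is a sketch under stated assumptions: `parse_env` and `unauthorized_defaults` are hypothetical helpers, the parser handles only plain `KEY=VALUE` lines, and it says nothing about how the framework itself loads the file.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unauthorized_defaults(values):
    """Return the default-model keys whose value is not in LLMS_AVAILABLE."""
    available = {
        model.strip()
        for model in values.get("LLMS_AVAILABLE", "").split(",")
        if model.strip()
    }
    return [
        key
        for key in ("LLM_DEFAULT", "LLM_DEFAULT_EVAL")
        if values.get(key) not in available
    ]


SAMPLE_ENV = """\
LLM_DEFAULT=google/gemini-2.5-flash
LLM_DEFAULT_EVAL=google/gemini-2.5-pro
LLMS_AVAILABLE=anthropic/claude-sonnet-4,google/gemini-2.5-flash,meta/llama-3.1-70b,google/gemini-2.5-pro
"""

if __name__ == "__main__":
    # An empty list means both defaults appear in LLMS_AVAILABLE.
    print(unauthorized_defaults(parse_env(SAMPLE_ENV)))
```

The check only compares names within the file; whether the proxy token actually authorizes those models can be confirmed only against the proxy itself.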

Caching:

To speed up development and reduce costs, the framework uses hishel to cache HTTP responses. Caching is disabled by default. To enable it, set the ENABLE_LOCAL_CACHING environment variable to true:

export ENABLE_LOCAL_CACHING=true
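
To confirm from Python that the flag is visible to the current process, a check along these lines may help. It is a sketch only: `local_caching_enabled` is a hypothetical helper, and the assumption that only the string `true` (case-insensitive) enables caching is mine, not something the framework documents here.

```python
import os


def local_caching_enabled(environ=os.environ):
    """Return True when ENABLE_LOCAL_CACHING is set to "true".

    Assumption: any other value, or an unset variable, means disabled,
    which matches the documented default of caching being off.
    """
    return environ.get("ENABLE_LOCAL_CACHING", "").strip().lower() == "true"


if __name__ == "__main__":
    print("local caching enabled:", local_caching_enabled())
```

If this reports that caching is disabled after running the export, the variable was probably set in a different shell session than the one running Python.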
