An ML intern that autonomously researches, writes, and ships good-quality ML-related code using the Hugging Face ecosystem, with deep access to docs, papers, datasets, and cloud compute.
git clone git@github.com:huggingface/ml-intern.git
cd ml-intern
uv sync
uv tool install -e .

This installs the ml-intern command. Create a .env file in the project root (or export these in your shell):
ANTHROPIC_API_KEY=<your-anthropic-api-key> # if using anthropic models
OPENAI_API_KEY=<your-openai-api-key> # if using openai models
LOCAL_LLM_BASE_URL=http://localhost:8000 # shared fallback for local model prefixes
LOCAL_LLM_API_KEY=<optional-local-api-key> # optional shared local API key
HF_TOKEN=<your-hugging-face-token>
GITHUB_TOKEN=<github-personal-access-token>

If no HF_TOKEN is set, the CLI will prompt you to paste one on first launch unless you start on a local model. To get a GITHUB_TOKEN, follow GitHub's tutorial on creating a personal access token.
Interactive mode (start a chat session):

ml-intern

Headless mode (single prompt, auto-approve):

ml-intern "fine-tune llama on my dataset"

Options:
ml-intern --model anthropic/claude-opus-4-7 "your prompt" # requires ANTHROPIC_API_KEY
ml-intern --model openai/gpt-5.5 "your prompt" # requires OPENAI_API_KEY
ml-intern --model ollama/llama3.1:8b "your prompt"
ml-intern --model vllm/meta-llama/Llama-3.1-8B-Instruct "your prompt"
ml-intern --sandbox-tools "your prompt" # use HF Space sandbox tools
ml-intern --max-iterations 100 "your prompt"
ml-intern --no-stream "your prompt"

Run ml-intern, then /model, to see the full list of suggested model ids (Claude, GPT, HF-router models like MiniMax, Kimi, GLM, DeepSeek, and local model prefixes).
Local models:
Local model support uses OpenAI-compatible HTTP endpoints through LiteLLM. The agent does not load model weights directly from disk; start your inference server first, then select it with a provider-specific model prefix:
ml-intern --model ollama/llama3.1:8b "your prompt"
ml-intern --model vllm/meta-llama/Llama-3.1-8B-Instruct "your prompt"

Inside interactive mode, switch with /model:
/model ollama/llama3.1:8b
/model lm_studio/google/gemma-3-4b
/model llamacpp/llama-3.1-8b-instruct
Supported local prefixes are ollama/, vllm/, lm_studio/, and
llamacpp/. Set LOCAL_LLM_BASE_URL and optional LOCAL_LLM_API_KEY to use
one shared local endpoint, or override a specific provider with its matching
*_BASE_URL / *_API_KEY variable, such as OLLAMA_BASE_URL or
VLLM_API_KEY. Provider-specific variables take precedence over the shared
local variables. Base URLs may include or omit /v1.
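For example, one way to serve and then select a local model (a sketch assuming vLLM is installed and listening on its default port 8000):

# terminal 1: start the OpenAI-compatible inference server
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000

# terminal 2: point ml-intern at it via the shared local endpoint
export LOCAL_LLM_BASE_URL=http://localhost:8000
ml-intern --model vllm/meta-llama/Llama-3.1-8B-Instruct "your prompt"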
CLI tool runtime:
By default, the CLI runs bash, read, write, and edit on your local
filesystem. To use HF Space sandbox tools instead, including sandbox_create,
opt in with --sandbox-tools:
ml-intern --sandbox-tools "test this training script in a GPU sandbox"
ml-intern --model llamacpp/ggml-org/gemma-3-1b-it-GGUF --sandbox-tools

Sandbox tool runtime requires HF_TOKEN, even when the selected model is local,
because it creates private HF Spaces. You can also make sandbox tools your CLI
default in ~/.config/ml-intern/cli_agent_config.json:
{ "tool_runtime": "sandbox" }Use the default local runtime when you want tools to inspect or edit files in your checkout. Use sandbox runtime when you want the agent to create or replace an HF Space sandbox, test code remotely, or request GPU sandbox hardware before launching larger HF Jobs.
Every session is auto-uploaded to your own private Hugging Face dataset in Claude Code JSONL format, which the HF Agent Trace Viewer auto-detects so you can browse turns, tool calls, and model responses directly on the Hub.
By default the dataset is named {your-hf-username}/ml-intern-sessions and is
created private. You can flip it to public from inside the CLI:
/share-traces # show current visibility + dataset URL
/share-traces public # publish (anyone can view)
/share-traces private # lock it back down

You can also flip visibility from the dataset page on huggingface.co; the agent honours whatever you set there for subsequent uploads.
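Because traces live in an ordinary dataset repo, you can also pull them down for local inspection; a minimal sketch using huggingface_hub (substitute your own username):

from huggingface_hub import snapshot_download

# download the trace dataset into the local HF cache and print its path
local_dir = snapshot_download(
    repo_id="your-hf-username/ml-intern-sessions",
    repo_type="dataset",
)
print(local_dir)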
To opt out entirely, set in your CLI config (e.g. configs/cli_agent_config.json
or ~/.config/ml-intern/cli_agent_config.json):
{ "share_traces": false }To override the destination repo, set:
{ "personal_trace_repo_template": "{hf_user}/my-custom-traces" }The shared smolagents/ml-intern-sessions dataset is unrelated and only
receives anonymized telemetry rows used by the backend KPI scheduler.
ML Intern currently supports one-way notification gateways from CLI sessions. These gateways send out-of-band status updates; they do not accept inbound chat messages.
Slack notifications use the Slack Web API to post messages when the agent needs
approval, hits an error, or completes a turn. Create a Slack app with a bot token
that has chat:write, invite the bot to the target channel, then set:
SLACK_BOT_TOKEN=xoxb-...
SLACK_CHANNEL_ID=C...

The CLI automatically creates a slack.default destination when both variables are present. Optional environment variables for the env-only default:
ML_INTERN_SLACK_NOTIFICATIONS=false
ML_INTERN_SLACK_DESTINATION=slack.ops
ML_INTERN_SLACK_AUTO_EVENTS=approval_required,error,turn_complete
ML_INTERN_SLACK_ALLOW_AGENT_TOOL=true
ML_INTERN_SLACK_ALLOW_AUTO_EVENTS=true

For a persistent user-level config, put overrides in ~/.config/ml-intern/cli_agent_config.json or point ML_INTERN_CLI_CONFIG at a JSON file:
{
"messaging": {
"enabled": true,
"auto_event_types": ["approval_required", "error", "turn_complete"],
"destinations": {
"slack.ops": {
"provider": "slack",
"token": "${SLACK_BOT_TOKEN}",
"channel": "${SLACK_CHANNEL_ID}",
"allow_agent_tool": true,
"allow_auto_events": true
}
}
}
}
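To sanity-check the token and channel outside ml-intern, you can post a test message straight to the Slack Web API (chat.postMessage is the standard Slack method; the message text here is arbitrary):

curl -s -X POST https://slack.com/api/chat.postMessage \
  -H "Authorization: Bearer $SLACK_BOT_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"channel\": \"$SLACK_CHANNEL_ID\", \"text\": \"ml-intern notification test\"}"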
Architecture overview:

User/CLI
   │ Operations (user_input, exec_approval,        ▲ Events
   ▼  interrupt, compact, ...)                     │
submission_queue                              event_queue
   │                                               ▲
   ▼                                               │
submission_loop (agent_loop.py)                    │
  1. Receive Operation from queue                  │
  2. Route to handler (run_agent/compact/...)      │
   │                                               │
   ▼                                               │
Handlers.run_agent() ───────── emits events ───────┘
  └─ Agentic Loop (max 300 iterations)
       ├─ Session
       │    ├─ ContextManager
       │    │     • Message history (litellm.Message[])
       │    │     • Auto-compaction (170k)
       │    │     • Session upload to HF
       │    └─ ToolRouter
       │         ├─ HF docs & research
       │         ├─ HF repos, datasets, jobs, papers
       │         ├─ GitHub code search
       │         ├─ Sandbox & local tools
       │         ├─ Planning
       │         └─ MCP server tools
       ├─ Doom Loop Detector
       │     • Detects repeated tool patterns
       │     • Injects corrective prompts
       └─ Loop:
            1. LLM call (litellm.acompletion)
            2. Parse tool_calls[]
            3. Approval check (jobs, sandbox, destructive ops)
            4. Execute via ToolRouter
            5. Add results to ContextManager
            6. Repeat if tool_calls exist
A single turn, in detail:

User Message
   │
   ▼
[Add to ContextManager]
   │
   ▼
Iteration Loop (max 300):
   │
   ├─▶ Get messages + tool specs
   │        │
   │        ▼
   │   litellm.acompletion()
   │        │
   │        ▼
   │   Has tool_calls? ──No──▶ Done
   │        │
   │       Yes
   │        ▼
   │   Add assistant msg (with tool_calls)
   │        │
   │        ▼
   │   Doom loop check
   │        │
   │        ▼
   │   For each tool_call:
   │     • Needs approval? ──Yes──▶ wait for user confirm
   │     • ToolRouter.execute_tool()
   │     • Add result to ContextManager
   │        │
   └────────┘ (continue loop while tool_calls exist)
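In code, the loop above corresponds roughly to the following sketch (schematic only: the model id is just an example, approval gating and doom-loop checks are elided, and tool_router stands in for the real ToolRouter):

import litellm

async def run_turn(messages, tool_specs, tool_router, max_iterations=300):
    for _ in range(max_iterations):
        response = await litellm.acompletion(
            model="anthropic/claude-sonnet-4-5-20250929",
            messages=messages,
            tools=tool_specs,
        )
        assistant = response.choices[0].message
        if not assistant.tool_calls:
            return assistant.content  # no tool calls: the turn is done
        messages.append(assistant)    # keep the assistant msg with tool_calls
        for call in assistant.tool_calls:
            # the real loop runs the approval check and doom-loop detector here
            result = await tool_router.execute_tool(call)
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )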
The agent emits the following events via event_queue:
- processing - Starting to process user input
- ready - Agent is ready for input
- assistant_chunk - Streaming token chunk
- assistant_message - Complete LLM response text
- assistant_stream_end - Token stream finished
- tool_call - Tool being called with arguments
- tool_output - Tool execution result
- tool_log - Informational tool log message
- tool_state_change - Tool execution state transition
- approval_required - Requesting user approval for sensitive operations
- turn_complete - Agent finished processing
- error - Error occurred during processing
- interrupted - Agent was interrupted
- compacted - Context was compacted
- undo_complete - Undo operation completed
- shutdown - Agent shutting down
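A minimal sketch of draining these events (the exact event object shape is an assumption; here each event is treated as a dict with a "type" key):

async def consume_events(event_queue):
    while True:
        event = await event_queue.get()
        if event["type"] == "assistant_chunk":
            print(event.get("text", ""), end="", flush=True)
        elif event["type"] == "approval_required":
            print("\n[approval required - confirm in the CLI]")
        elif event["type"] in ("turn_complete", "shutdown"):
            break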
Run Ruff before every commit:
uv run ruff check .
uv run ruff format --check .

If the format check fails, run uv run ruff format . and re-run the checks before committing.
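If you want this enforced automatically, a minimal git hook works too (a sketch; save as .git/hooks/pre-commit and make it executable):

#!/bin/sh
# block the commit if lint or format checks fail
uv run ruff check . || exit 1
uv run ruff format --check . || exit 1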
To add a new built-in tool, edit agent/core/tools.py:
def create_builtin_tools() -> list[ToolSpec]:
return [
ToolSpec(
name="your_tool",
description="What your tool does",
parameters={
"type": "object",
"properties": {
"param": {"type": "string", "description": "Parameter description"}
},
"required": ["param"]
},
handler=your_async_handler
),
# ... existing tools
]
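For reference, a matching handler might look like the sketch below; the exact signature ToolSpec expects is an assumption, so mirror the existing handlers in agent/core/tools.py:

async def your_async_handler(param: str) -> str:
    # hypothetical handler: receives the arguments declared in `parameters`
    # and returns the tool output as text
    return f"processed: {param}"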
Edit configs/cli_agent_config.json for CLI defaults, or configs/frontend_agent_config.json for web-session defaults:
{
"model_name": "anthropic/claude-sonnet-4-5-20250929",
"mcpServers": {
"your-server-name": {
"transport": "http",
"url": "https://example.com/mcp",
"headers": {
"Authorization": "Bearer ${YOUR_TOKEN}"
}
}
}
}

Note: Environment variables like ${YOUR_TOKEN} are auto-substituted from .env.
If you use ml-intern in your work, please cite it with the following BibTeX entry (or similar):
@Misc{ml-intern,
title = {ml-intern: an agent that autonomously researches, writes, and ships good quality ML related code using the Hugging Face ecosystem},
author = {Aksel Joonas Reedi and Henri Bonamy and Yoan Di Cosmo and Leandro von Werra and Lewis Tunstall},
howpublished = {\url{https://github.com/huggingface/ml-intern}},
year = {2026}
}