Commit 86d797f

codelion authored and claude committed
Replace gated test model with Qwen2.5-Coder-0.5B-Instruct, bump to 0.3.15
google/gemma-3-270m-it became gated, breaking integration-tests and conversation-logging-tests in CI. Swap in Qwen/Qwen2.5-Coder-0.5B-Instruct, which is public, instruction-tuned, has a chat_template, has no thinking mode, and works with the existing transformers pin.

Verified locally that test_json_plugin.py (9/9), test_n_parameter.py, test_reasoning_integration.py (8/8), and test_conversation_logging_server.py (10/10) now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
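The chat_template requirement matters because optillm's local inference path formats messages with the tokenizer's chat template (presumably via transformers' apply_chat_template). Qwen2.5 models follow the ChatML convention; as a rough, simplified sketch of what that formatting produces (not the model's actual Jinja template, which also handles system prompts and tools):

```python
def render_chatml(messages):
    """Simplified ChatML rendering, roughly what Qwen2.5's chat template emits."""
    out = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers
        out.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open an assistant turn so the model generates the reply (generation prompt)
    out.append("<|im_start|>assistant\n")
    return "".join(out)

prompt = render_chatml([{"role": "user", "content": "Return JSON"}])
```

A model without a chat_template (or with a gated repo, as gemma-3-270m-it became) fails at exactly this step, which is what broke the CI jobs.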
1 parent 5c07f46 commit 86d797f

6 files changed: 8 additions & 8 deletions


.github/workflows/test.yml

Lines changed: 2 additions & 2 deletions

@@ -84,7 +84,7 @@ jobs:
       - name: Start optillm server
         run: |
           echo "Starting optillm server for integration tests..."
-          OPTILLM_API_KEY=optillm python optillm.py --model google/gemma-3-270m-it --port 8000 &
+          OPTILLM_API_KEY=optillm python optillm.py --model Qwen/Qwen2.5-Coder-0.5B-Instruct --port 8000 &
           echo $! > server.pid

           # Wait for server to be ready
@@ -179,7 +179,7 @@ jobs:
           echo "Starting optillm server with conversation logging..."
           mkdir -p /tmp/optillm_conversations
           OPTILLM_API_KEY=optillm python optillm.py \
-            --model google/gemma-3-270m-it \
+            --model Qwen/Qwen2.5-Coder-0.5B-Instruct \
             --port 8000 \
             --log-conversations \
             --conversation-log-dir /tmp/optillm_conversations &

optillm/__init__.py

Lines changed: 1 addition & 1 deletion

@@ -1,5 +1,5 @@
 # Version information
-__version__ = "0.3.14"
+__version__ = "0.3.15"

 # Import from server module
 from .server import (

optillm/plugins/json_plugin.py

Lines changed: 1 addition & 1 deletion

@@ -22,7 +22,7 @@ def get_device(self):
         else:
             return torch.device("cpu")

-    def __init__(self, model_name: str = "google/gemma-3-270m-it"):
+    def __init__(self, model_name: str = "Qwen/Qwen2.5-Coder-0.5B-Instruct"):
         """Initialize the JSON generator with a specific model."""
         self.device = self.get_device()
         logger.info(f"Using device: {self.device}")

pyproject.toml

Lines changed: 1 addition & 1 deletion

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "optillm"
-version = "0.3.14"
+version = "0.3.15"
 description = "An optimizing inference proxy for LLMs."
 readme = "README.md"
 license = "Apache-2.0"
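The version string lives in two places (pyproject.toml and optillm/__init__.py) and both must be bumped together, or packaging metadata and the runtime `__version__` drift apart. A hypothetical consistency check, not part of the repo — the file contents are inlined here for illustration, where a real script would read them from disk:

```python
import re

# Inlined stand-ins for the two real files (hypothetical sample content)
PYPROJECT_TOML = 'name = "optillm"\nversion = "0.3.15"\n'
INIT_PY = '# Version information\n__version__ = "0.3.15"\n'

def first_group(pattern: str, text: str) -> str:
    """Return the first capture group of pattern in text, or raise if absent."""
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return match.group(1)

packaging_version = first_group(r'^version = "([^"]+)"$', PYPROJECT_TOML)
runtime_version = first_group(r'^__version__ = "([^"]+)"$', INIT_PY)
assert packaging_version == runtime_version, "version bump missed one file"
```

Such a check could run in CI so a release like this one cannot ship with mismatched versions.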

tests/test_conversation_logging_server.py

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 """
 Server-based integration tests for conversation logging with real model
-Tests conversation logging with actual OptILLM server and google/gemma-3-270m-it model
+Tests conversation logging with actual OptILLM server and Qwen/Qwen2.5-Coder-0.5B-Instruct model
 """

 import unittest

tests/test_utils.py

Lines changed: 2 additions & 2 deletions

@@ -12,8 +12,8 @@
 from openai import OpenAI

 # Standard test model for all tests - small and fast
-TEST_MODEL = "google/gemma-3-270m-it"
-TEST_MODEL_MLX = "mlx-community/gemma-3-270m-it-bf16"
+TEST_MODEL = "Qwen/Qwen2.5-Coder-0.5B-Instruct"
+TEST_MODEL_MLX = "mlx-community/Qwen2.5-Coder-0.5B-Instruct-bf16"

 def setup_test_env():
     """Set up test environment with local inference"""
