
feat: add default option using transformers#67

Merged
EYH0602 merged 6 commits into release-0.1.0 from feat-transformers-default
Sep 24, 2025
Conversation

@EYH0602
Member

@EYH0602 EYH0602 commented Sep 20, 2025

No description provided.

@EYH0602 EYH0602 requested a review from Copilot September 20, 2025 06:25
Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds support for using Hugging Face Transformers models as the default fallback when no specific model provider matches. The change removes the vLLM dependency and replaces it with a Hugging Face Transformers implementation.

  • Adds a new HFChat class that uses Hugging Face Transformers for model inference
  • Removes the vLLM-based implementation and its dependencies
  • Updates the router function to fall back to HFChat by default instead of returning None

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

Summary per file:

src/tfbench/lm/utils.py: updates the router to import HFChat and use it as the default fallback
src/tfbench/lm/_vllm.py: removes the entire vLLM implementation file
src/tfbench/lm/_hf.py: adds the new Hugging Face Transformers implementation
src/tfbench/experiment.py: removes an assertion check, since the router no longer returns None
src/tfbench/env.py: simplifies environment handling using load_dotenv
src/main.py: minor code reorganization, moving result_dir creation
pyproject.toml: replaces the vllm dependency with transformers[torch]
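The env.py change relies on python-dotenv's load_dotenv, which reads KEY=VALUE pairs from a .env file into the process environment. To illustrate what that call does, here is a minimal stdlib-only approximation; load_env_file is a hypothetical helper written for this sketch and is not part of the repo.

```python
import os


def load_env_file(path: str = ".env") -> None:
    """Minimal stdlib approximation of python-dotenv's load_dotenv():
    read KEY=VALUE lines into os.environ, skipping comments and blanks,
    without overriding variables that are already set."""
    try:
        with open(path) as f:
            lines = f.readlines()
    except FileNotFoundError:
        # load_dotenv is also a no-op when the file is absent.
        return
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps any value already present in the environment.
        os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

In the actual code, a single `load_dotenv()` call at import time replaces whatever manual environment handling env.py did before, which is the simplification the review summary refers to.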


EYH0602 and others added 3 commits September 19, 2025 23:31
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@EYH0602 EYH0602 merged commit dadb45e into release-0.1.0 Sep 24, 2025
4 checks passed
EYH0602 added a commit that referenced this pull request Sep 30, 2025
* restructure project

* fix: ollama new type api

* fix lint

* add ruff

* add ruff to ci

* refactor: llm api (#60)

New LLM generation workflow.

* add an empty .env

* refactor OpenAI util class

* use new OpenAI client in main

* assume .env unchanged

* fix: response processing

* use new Gemini client in main

* enable reasoning effort from cli

* document why there are two gemini wrappers

* add Claude API

* add claude models to supported list

* handle UnionType for Literal ReasoningEffort

* add vLLM support and use it as default option

* fix: use vLLM chat interface instead of gen

* env add vllm api key

* add VLLM_HOST and VLLM_PORT

* add vllm server mode

* add vLLM in dependencies

* doc: instruct to run vllm from uv

* make deprecated ollama a standalone script

* doc: revise ollama

* use 3.12

* add Ollama models

* fix: ollama model name

* fix: ollama model name

* fix: Gemini use its own EFFORT_TOKEN_MAP

* remove unused imports

* fix: google-genai version

* fix: ci with uv run

* feat: load TF-Bench from HuggingFace by default (#61)

* update tfb to huggingface with base and pure splits

* feat: load tfbench from huggingface

* remove mandatory path

* avoid loading vLLM for now

* remove vLLM option in main

* feat: update response processing inside tfbench package (#62)

* answer cannot be None from LM

* move evaluation logic inside the tfbench package

* fix: orjson writes binary, error is not an option

* fix: use pure as parameter in main._eval

* feat: script to analyze saved generation results

* use orjsonl in main for consistency

* feat: evaluation prove type equiv using TypeOperators (#64)

* fix: allow generation to fail

* remove unnecessary imports

* fix: OpenAI response add reasoning summary

* fix: load_gen_results_json type

* fix: analysis_saved script

* fix: evaluation benchmark name

* fix: OpenAI response API add summary

* use pydantic-v2

* extract incorrect task-answer pairs

* fix: groundtruth error (#63)

* fix: missing type class and typevar in benchmark

* fix: order of tasks in tfb

* fix: allow load_gen_results to load error

* remove error_cls unused imports

* extract type variables from source code

* add GHC type check by proving type equiv

* fix: cp -> process

* fix: API change for AST

* feat: type prover support new type definition

* test: ghc and type_util

* feat: use prover_evaluate for base split

* test: add real tfbench test cases, which the deprecated evaluation failed

* alt error to syntax parsing error

* feat: typeclass constraints reorder

* fix: AST.get_all_nodes_of_type ignores the root itself

* reorder_constraints using compiler frontend static analysis

* feat: add type definitions for pure tasks

* test: check type equivalence prover after rewriting mono types

* fix: handle type classes alone when adding new definitions

* feat: define new types automatically for pure tasks

* ghc prover remove standalone type class

* doc: detailed docstring for prover_evaluate

* script: analysis_saved runs both splits

* fix: experiment use prover_evaluate

* feat: error analysis with reasoning steps (#65)

* error analysis use prover

* error analysis script

* feat: record model name when doing error analysis

* add plot script for error analysis

* adjust row and column spacing

* update color map

* revise error_analysis default path

* test: list constructor

* remove tmp file

* fix: main missing pure parameter to

* error analysis only output category

* default error analysis model to gpt-5-mini

* adjust fontsize for 5 pies in a row

* doc: require GHC >= 9.2.1 for ImpredicativeTypes

* feat: add default option using transformers (#67)

* add transformers generation as default

* remove None option for router

* remove vllm option for ease of dependency

* Update src/tfbench/lm/_hf.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/tfbench/lm/_hf.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* remove unnecessary imports

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* doc: make readme and export clearer (#68)

* add transformers generation as default

* Update src/tfbench/lm/_hf.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/tfbench/lm/_hf.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* remove unnecessary imports

* doc: improve instructions

* fix: unused parameter and import

* enable github actions on main commits

* doc: add badges and images

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@EYH0602 EYH0602 deleted the feat-transformers-default branch November 26, 2025 21:21