|
1 | 1 | # AutoChecklist |
2 | 2 |
|
3 | | -[](https://github.com/ChicagoHAI/AutoChecklist) |
4 | | -[](https://www.python.org/downloads/) |
5 | | -[](LICENSE) |
| 3 | +<p align="center"> |
| 4 | + <a href="https://github.com/ChicagoHAI/AutoChecklist"><img src="https://img.shields.io/github/stars/ChicagoHAI/AutoChecklist?style=flat-square" alt="GitHub Stars"></a> |
| 5 | + <a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.10+-blue.svg?style=flat-square" alt="Python 3.10+"></a> |
| 6 | + <a href="LICENSE"><img src="https://img.shields.io/badge/License-Apache%202.0-green.svg?style=flat-square" alt="License"></a> |
| 7 | + <a href="https://autochecklist.github.io/"><img src="https://img.shields.io/badge/site-autochecklist.github.io-purple?style=flat-square" alt="Site"></a> |
| 8 | +</p> |
6 | 9 |
|
7 | 10 | A library of composable pipelines for generating and scoring checklist criteria. |
8 | 11 |
|
@@ -33,7 +36,7 @@ Each generator is customizable via prompt templates (`.md` files with `{input}`, |
33 | 36 |
|
34 | 37 | ### Built-in Pipelines |
35 | 38 |
|
36 | | -The library includes built-in pipelines implementing methods from research papers ([TICK](https://arxiv.org/abs/2410.03608), [RocketEval](https://arxiv.org/abs/2503.05142), [RLCF](https://arxiv.org/abs/2507.18624), [CheckEval](https://arxiv.org/abs/2403.18771), [InteractEval](https://arxiv.org/abs/2409.07355), and more). See [Supported Pipelines](https://github.com/ChicagoHAI/AutoChecklist/blob/main/docs/user-guide/pipelines.md) for the full list and configuration details. |
| 39 | +The library includes built-in pipelines implementing methods from research papers ([TICK](https://arxiv.org/abs/2410.03608), [RocketEval](https://arxiv.org/abs/2503.05142), [RLCF](https://arxiv.org/abs/2507.18624), [OpenRubrics](https://arxiv.org/abs/2510.07743), [CheckEval](https://arxiv.org/abs/2403.18771), [InteractEval](https://arxiv.org/abs/2409.07355), and more). See [Supported Pipelines](https://autochecklist.github.io/user-guide/supported-pipelines/) for the full list and configuration details. |
37 | 40 |
|
38 | 41 | ### Scoring |
39 | 42 |
|
@@ -78,114 +81,30 @@ pip install "autochecklist[all]" |
78 | 81 |
|
79 | 82 | For development installation from source, see the [GitHub repository](https://github.com/ChicagoHAI/AutoChecklist). |
80 | 83 |
|
81 | | -## Using the Package |
82 | | - |
83 | | -### Custom Prompts |
84 | | - |
85 | | -Write a prompt template and generate a checklist: |
86 | | - |
87 | | -```python |
88 | | -from autochecklist import DirectGenerator, ChecklistScorer |
89 | | - |
90 | | -gen = DirectGenerator( |
91 | | - custom_prompt="You are an expert evaluator. Generate yes/no checklist questions to score:\n\n{input}", |
92 | | - model="openai/gpt-5-mini", |
93 | | -) |
94 | | -checklist = gen.generate(input="Write a haiku about autumn.") |
95 | | - |
96 | | -scorer = ChecklistScorer(mode="batch", model="openai/gpt-5-mini") |
97 | | -score = scorer.score(checklist, target="Leaves fall gently down...") |
98 | | -print(f"Pass rate: {score.pass_rate:.0%}") |
99 | | -``` |
100 | | - |
101 | | -Scorers also take custom prompts. Prompts can also be loaded from `.md` files — see [Custom Prompts](https://github.com/ChicagoHAI/AutoChecklist/blob/main/docs/user-guide/custom-prompts.md) for the full guide (placeholders, custom scorers, registration). |
102 | | - |
103 | | -### Custom Pipelines |
104 | | - |
105 | | -Register a custom pipeline (generator + scorer + prompts) as a reusable unit: |
106 | | - |
107 | | -```python |
108 | | -from autochecklist import register_custom_pipeline, pipeline |
109 | | - |
110 | | -# Register from config |
111 | | -register_custom_pipeline( |
112 | | - "my_eval", |
113 | | - generator_prompt="Generate yes/no questions for:\n\n{input}", |
114 | | - scorer="weighted", |
115 | | -) |
116 | | -pipe = pipeline("my_eval", generator_model="openai/gpt-5-mini") |
117 | | - |
118 | | -# Or register from an existing pipeline instance |
119 | | -register_custom_pipeline("my_eval_v2", pipe) |
120 | | - |
121 | | -# Save/load pipeline configs as JSON |
122 | | -from autochecklist import save_pipeline_config, load_pipeline_config |
123 | | -save_pipeline_config("my_eval", "my_eval.json") |
124 | | -load_pipeline_config("my_eval.json") # registers and returns the name |
125 | | -``` |
126 | | - |
127 | | -### Built-in Pipelines |
128 | | - |
129 | | -The library includes pipelines implementing methods from research papers. Use them via `method_name` or the `pipeline()` shorthand: |
| 84 | +## Quick Start |
130 | 85 |
|
131 | 86 | ```python |
132 | 87 | from autochecklist import pipeline |
133 | 88 |
|
134 | 89 | pipe = pipeline("tick", generator_model="openai/gpt-5-mini", scorer_model="openai/gpt-5-mini") |
135 | | -result = pipe(input="Write a haiku about autumn", target="Leaves fall gently...") |
| 90 | +result = pipe(input="Write a haiku about autumn.", target="Leaves fall gently down...") |
136 | 91 | print(f"Pass rate: {result.pass_rate:.0%}") |
137 | 92 | ``` |
138 | 93 |
|
139 | | -See [Supported Pipelines](https://github.com/ChicagoHAI/AutoChecklist/blob/main/docs/user-guide/pipelines.md) for the full list of pipelines, paper details, and configuration options. |
140 | | - |
141 | | -### Batch Evaluation |
142 | | - |
143 | | -```python |
144 | | -data = [ |
145 | | - {"input": "Write a haiku", "target": "Leaves fall..."}, |
146 | | - {"input": "Write a limerick", "target": "There once was..."}, |
147 | | -] |
148 | | -result = pipe.run_batch(data, show_progress=True) |
149 | | -print(f"Macro pass rate: {result.macro_pass_rate:.0%}") |
150 | | -``` |
151 | | - |
152 | | -For pipeline composition, provider configuration, and the full API, see the [Pipeline Guide](https://github.com/ChicagoHAI/AutoChecklist/blob/main/docs/user-guide/pipeline.md). |
| 94 | +See the [Quick Start guide](https://autochecklist.github.io/getting-started/quickstart/) for custom prompts, batch evaluation, and more. |
153 | 95 |
|
154 | | -### Command-Line Interface |
155 | | - |
156 | | -Run evaluations directly from the terminal: |
| 96 | +### CLI |
157 | 97 |
|
158 | 98 | ```bash |
159 | | -# Full evaluation (generate + score) |
160 | 99 | autochecklist run --pipeline tick --data eval_data.jsonl -o results.jsonl \ |
161 | 100 | --generator-model openai/gpt-4o-mini --scorer-model openai/gpt-4o-mini |
162 | | - |
163 | | -# Generate checklists only |
164 | | -autochecklist generate --pipeline tick --data inputs.jsonl -o checklists.jsonl \ |
165 | | - --generator-model openai/gpt-4o-mini |
166 | | - |
167 | | -# Score with existing checklist |
168 | | -autochecklist score --data eval_data.jsonl --checklist checklist.json \ |
169 | | - -o results.jsonl --scorer-model openai/gpt-4o-mini |
170 | | - |
171 | | -# List available pipelines |
172 | | -autochecklist list |
173 | 101 | ``` |
174 | 102 |
|
175 | | -API keys can be set via `--api-key`, environment variables (`OPENROUTER_API_KEY`), or a `.env` file. See the [CLI Guide](https://github.com/ChicagoHAI/AutoChecklist/blob/main/docs/user-guide/cli.md) for full details. |
176 | | - |
177 | | -### Examples |
178 | | - |
179 | | -Detailed examples with runnable code: |
180 | | - |
181 | | -- **[custom_components_tutorial.ipynb](https://github.com/ChicagoHAI/AutoChecklist/blob/main/examples/custom_components_tutorial.ipynb)** - Create your own generators, scorers, and refiners |
182 | | -- **[pipeline_demo.ipynb](https://github.com/ChicagoHAI/AutoChecklist/blob/main/examples/pipeline_demo.ipynb)** - Pipeline API, registry, batch evaluation, export |
183 | | -- **[instance_level_demo.ipynb](https://github.com/ChicagoHAI/AutoChecklist/blob/main/examples/instance_level_demo.ipynb)** - DirectGenerator, ContrastiveGenerator (per-input checklists) |
184 | | -- **[corpus_level_demo.ipynb](https://github.com/ChicagoHAI/AutoChecklist/blob/main/examples/corpus_level_demo.ipynb)** - InductiveGenerator, DeductiveGenerator, InteractiveGenerator (per-dataset checklists) |
| 103 | +See the [CLI guide](https://autochecklist.github.io/user-guide/cli/) for all commands. |
185 | 104 |
|
186 | 105 | ## Links |
187 | 106 |
|
188 | | -<!-- - [Full Documentation](https://autochecklist.github.io) --> |
| 107 | +- [Documentation](https://autochecklist.github.io) |
189 | 108 | - [GitHub Repository](https://github.com/ChicagoHAI/AutoChecklist) — contributing, UI, dev setup |
190 | 109 | - [Bug Tracker](https://github.com/ChicagoHAI/AutoChecklist/issues) |
191 | 110 |
|
|
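Note on the trimmed CLI section: the remaining `autochecklist run` example reads `eval_data.jsonl`, but the README no longer shows what a row of that file looks like. The following is a minimal sketch of producing such a file. It assumes each row uses the same `input` and `target` keys as the removed Batch Evaluation example; this diff does not confirm that the CLI expects that schema, and `write_eval_jsonl` is a hypothetical helper, not part of the library.

```python
import json
from pathlib import Path


def write_eval_jsonl(records, path):
    """Write one JSON object per line and return the number of rows written.

    Hypothetical helper. The required keys are an assumption taken from the
    removed Batch Evaluation example ("input" and "target").
    """
    required = {"input", "target"}
    count = 0
    with Path(path).open("w", encoding="utf-8") as fh:
        for rec in records:
            # Fail early on rows that lack one of the assumed keys.
            missing = required.difference(rec)
            if missing:
                raise ValueError(f"row {count} is missing keys: {sorted(missing)}")
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    rows = [
        {"input": "Write a haiku about autumn.", "target": "Leaves fall gently down..."},
        {"input": "Write a limerick", "target": "There once was..."},
    ]
    write_eval_jsonl(rows, "eval_data.jsonl")
```

If the schema differs, the CLI guide linked in the new README text is the place to check before relying on this layout.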