这份文档说明如何把仓库外部发现的方法整理成 strategy_card,再接入 Meta-Harness 的通用 benchmark 流程。
公开版保留的是通用机制:
- 用
strategy_card表达外部方法 - 做 compatibility 检查与 shortlist
- 生成 benchmark spec 或直接跑 benchmark
- 结合报告决定是否把方法物化为 candidate
仓库不再内置任何特定私有项目的策略资产或专属模板。
核心字段定义见 src/meta_harness/schemas.py:
strategy_idtitlesourcecategorygroupprioritychange_typevariant_namehypothesisimplementation_idvariant_typecompatibilityexpected_benefitsexpected_costsconfig_patchcode_patchexpected_signalsrisk_notestags
最常见的三类:
config_onlypatch_basednot_yet_executable
下面是仓库内 web_scrape 方法卡的一个简化例子:
{
"strategy_id": "web_scrape/selector-only",
"title": "Selector Only",
"source": "reference://web_scrape/selector_only",
"category": "web_scrape",
"primitive_id": "web_scrape",
"change_type": "config_only",
"variant_type": "method_family",
"variant_name": "selector_only",
"hypothesis": "Stable recurring pages should avoid unnecessary AI hops.",
"config_patch": {
"workflow": {
"primitives": {
"web_scrape": {
"pipeline": "selector_only",
"timeout_ms": 5000
}
}
}
},
"expected_signals": {
"fingerprints": {
"scrape.mode": "selector"
}
}
}直接从若干 strategy cards 生成 benchmark spec:
PYTHONPATH=src python -m meta_harness.cli strategy build-spec \
--experiment web-scrape-methods \
--baseline current_pipeline \
--output configs/benchmarks/web_scrape_methods.json \
configs/strategy_cards/web_scrape/selector_only.json \
configs/strategy_cards/web_scrape/html_to_markdown_llm.json查看 compatibility:
PYTHONPATH=src python -m meta_harness.cli strategy inspect \
--profile demo_public \
--project demo_public \
--config-root configs \
configs/strategy_cards/web_scrape/selector_only.json批量 shortlist:
PYTHONPATH=src python -m meta_harness.cli strategy shortlist \
--profile demo_public \
--project demo_public \
--config-root configs \
configs/strategy_cards/web_scrape/html_to_markdown_llm.json \
configs/strategy_cards/web_scrape/selector_only.json \
configs/strategy_cards/web_scrape/vlm_visual_extract.json \
configs/strategy_cards/web_scrape/headless_fingerprint_proxy.json直接执行 benchmark:
PYTHONPATH=src python -m meta_harness.cli strategy benchmark \
--profile demo_public \
--project demo_public \
--config-root configs \
--runs-root runs \
--candidates-root candidates \
--task-set task_sets/demo/failure_repair.json \
--experiment web-scrape-methods \
--baseline current_pipeline \
configs/strategy_cards/web_scrape/selector_only.json \
configs/strategy_cards/web_scrape/html_to_markdown_llm.jsonconfigs/strategy_cards/web_scrape/html_to_markdown_llm.jsonconfigs/strategy_cards/web_scrape/selector_only.jsonconfigs/strategy_cards/web_scrape/vlm_visual_extract.jsonconfigs/strategy_cards/web_scrape/headless_fingerprint_proxy.jsonconfigs/primitives/web_scrape.jsonscripts/eval_web_scrape.py
- 先把外部方法整理成
strategy_card - 用
strategy inspect看能否在当前 profile/project 下执行 - 用
strategy shortlist做初筛 - 用
strategy build-spec或strategy benchmark跑一轮通用评测 - 只有结果稳定后,再把方法物化成 candidate 或 proposal