IWG — Inverse Workflow Generation

Recognize Your Orchestrator: An Entropy Dynamics Perspective for LLM Multi-Agent Systems. Junze Zhu, Weihao Chen, Xuanwang Zhang, Zhen Wu, Xinyu Dai. In Proceedings of ICML, 2026.

IWG is a multi-agent pipeline that synthesizes process-verifiable benchmarks by reconstructing execution environments backward from target solutions. It enables dense, step-level measurement of orchestrator scheduling entropy for mean-field dynamics analysis.
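As a rough illustration of what a step-level entropy measurement could look like, the sketch below computes the Shannon entropy of an orchestrator's distribution over candidate agents at a single step. This is an assumption for illustration only: the function name `shannon_entropy` and the choice of nats are hypothetical, and the paper's exact definition of scheduling entropy may differ.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution over agents.

    Hypothetical helper: `probs` holds non-negative weights, one per
    candidate agent, and is normalized here before use.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Zero-weight agents contribute nothing to the entropy.
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)


# One entropy value per orchestration step gives a dense, step-level signal.
step_distributions = [
    [0.25, 0.25, 0.25, 0.25],  # undecided between four agents
    [0.90, 0.05, 0.03, 0.02],  # strongly prefers the first agent
]
entropies = [shannon_entropy(dist) for dist in step_distributions]
```

Under this reading, a uniform distribution gives the maximum value (the log of the number of agents), and a fully deterministic choice gives zero.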

Setup

pip install openai

Configure your API key in iwg/config.json:

{
  "model": {
    "api_key": "sk-...",
    "model": "gpt-4.1"
  }
}
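A quick way to confirm the file is well-formed before running the pipeline is to load it and check the two keys shown above. The helper below is a minimal sketch; `load_model_config` is a hypothetical name and is not necessarily how the iwg package itself reads its configuration.

```python
import json
from pathlib import Path


def load_model_config(path="iwg/config.json"):
    """Return the "model" section of the config, checking required keys.

    Hypothetical helper for sanity-checking the file by hand.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    model_cfg = config["model"]
    missing = [key for key in ("api_key", "model") if not model_cfg.get(key)]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    return model_cfg


# Example (run from the repository root):
#   cfg = load_model_config()
#   print(cfg["model"])
```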

Data

Each seed record declares a query and its ground-truth answer.

{
  "id": "seed_001",
  "query": "What year was the director of The White Ribbon born?",
  "answer": "March 23, 1942"
}
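Before generating benchmarks, it can help to check that every seed carries the three fields shown above. The following is a minimal sketch, assuming each seed is a JSON object with string-valued `id`, `query`, and `answer` fields; `validate_seed` is a hypothetical helper and not part of the iwg package.

```python
import json

REQUIRED_FIELDS = ("id", "query", "answer")


def validate_seed(seed):
    """Return a list of problems found in one seed record (empty if valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = seed.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems


raw = """
{
  "id": "seed_001",
  "query": "What year was the director of The White Ribbon born?",
  "answer": "March 23, 1942"
}
"""
seed = json.loads(raw)
problems = validate_seed(seed)
```

Note that Python's `json.loads` follows strict JSON, so a trailing comma after the last field raises `json.JSONDecodeError`.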

Usage

The pipeline is fully decoupled: generate benchmarks first, then run any orchestrator model independently.

Phase 1 — Generate benchmarks

python3 -m iwg.generate_benchmarks                    # all seeds
python3 -m iwg.generate_benchmarks --seed-id seed_001  # single seed

Benchmarks are saved to bench/ as static JSON files (environments, checkpoints, gold agent sequences).

Phase 2 — Run orchestrator

python3 -m iwg.run_orchestrator --list                          # list benchmarks
python3 -m iwg.run_orchestrator --bench 001 --model gpt-4.1  # single run
python3 -m iwg.run_orchestrator --all --model gpt-4.1        # all benchmarks

Trajectories and metrics are saved to trajectories/ independently. Run the same benchmark with different models for direct comparison:

python3 -m iwg.run_orchestrator --bench 001 --model gemini-2.5-pro,gpt-4o,claude-sonnet-4-6

Metrics

Six trajectory-aware metrics (LCS-F1, Task Success, Step Success Rate, Exception Handling F1, Faithfulness, Consistency) are computed automatically. See iwg/metrics.py.
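To give a feel for the first metric, here is a minimal sketch of an LCS-based F1 score between a predicted agent sequence and a gold agent sequence. It assumes precision is LCS length over predicted length and recall is LCS length over gold length; the function names are hypothetical, and the definition in iwg/metrics.py may differ in detail.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_f1(predicted, gold):
    """Harmonic mean of LCS-based precision and recall (0.0 if no overlap)."""
    if not predicted or not gold:
        return 0.0
    lcs = lcs_length(predicted, gold)
    if lcs == 0:
        return 0.0
    precision = lcs / len(predicted)
    recall = lcs / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical agent names, for illustration only.
gold = ["search", "verify", "browse", "answer"]
predicted = ["search", "browse", "answer"]
score = lcs_f1(predicted, gold)  # precision 3/3, recall 3/4
```

Because the score is based on a subsequence rather than exact position matching, an orchestrator that skips one gold step still gets credit for the steps it schedules in the right order.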

Citation

@inproceedings{zhu2026recognize,
  title     = {Recognize Your Orchestrator: An Entropy Dynamics Perspective
               for LLM Multi-Agent Systems},
  author    = {Zhu, Junze and Chen, Weihao and Zhang, Xuanwang and
               Wu, Zhen and Dai, Xinyu},
  booktitle = {Proceedings of the 43rd International Conference on
               Machine Learning},
  year      = {2026},
}
