Commit 5cc5599
committed: update RL code
1 parent 2530467 commit 5cc5599

51 files changed: +142 −1681 lines


README.md
Lines changed: 11 additions & 3 deletions

@@ -28,9 +28,17 @@
 - **Versatile Applications**: Ready to use as a best-in-class reranker to improve editing outputs, or as a high-fidelity reward signal for **stable and effective Reinforcement Learning (RL) fine-tuning**.
 
 ## 🔥 News
-- **2025-10-16**: Training datasets [EditScore-Reward-Data](https://huggingface.co/datasets/EditScore/EditScore-Reward-Data) and [EditScore-RL-Data](https://huggingface.co/datasets/EditScore/EditScore-RL-Data) are available.
-- **2025-10-15**: **EditScore** is now available on PyPI — install it easily with `pip install editscore`.
-- **2025-10-15**: Best-of-N inference scripts for OmniGen2, Flux-dev-Kontext, and Qwen-Image-Edit are now available! See [this](#apply-editscore-to-image-editing) for details.
+- **2025-10-22**: **Introducing Our Reinforcement Learning Training Framework!**
+  We're excited to release our complete RL pipeline, the result of a major effort to simplify fine-tuning for image editing models. Key features include:
+  - **Ready-to-Use RL Dataset**: Includes the complete dataset used in the EditScore project, along with clear usage guidelines and preparation scripts.
+  - **An Easy-to-Use Reward Model**: Seamlessly integrate **EditScore** as a reward signal.
+  - **A Scalable Reward Server**: Built with native multi-node support for high-throughput training.
+  - **Flexible Training Code**: Supports distributed training, variable image resolutions, and mixed tasks (t2i, edit, in-context generation) out of the box.
+  Dive into our comprehensive guide on [RL Fine-Tuning](examples/OmniGen2-RL#application-2-reinforcement-fine-tuning) to get started.
+- 2025-10-16: Training datasets [EditScore-Reward-Data](https://huggingface.co/datasets/EditScore/EditScore-Reward-Data) and [EditScore-RL-Data](https://huggingface.co/datasets/EditScore/EditScore-RL-Data) are available.
+- 2025-10-15: **EditScore** is now available on PyPI — install it easily with `pip install editscore`.
+- 2025-10-15: Best-of-N inference scripts for OmniGen2, Flux-dev-Kontext, and Qwen-Image-Edit are now available! See [this](#apply-editscore-to-image-editing) for details.
 - 2025-09-30: We release **OmniGen2-EditScore7B**, unlocking online RL for image editing via high-fidelity EditScore. LoRA weights are available at [Hugging Face](https://huggingface.co/OmniGen2/OmniGen2-EditScore7B) and [ModelScope](https://www.modelscope.cn/models/OmniGen2/OmniGen2-EditScore7B).
 - 2025-09-30: We are excited to release **EditScore** and **EditReward-Bench**! Model weights and the benchmark dataset are now publicly available. You can access them on Hugging Face: [Models Collection](https://huggingface.co/collections/EditScore/editscore-68d8e27ee676981221db3cfe) and [Benchmark Dataset](https://huggingface.co/datasets/EditScore/EditReward-Bench), and on ModelScope: [Models Collection](https://www.modelscope.cn/collections/EditScore-8b0d53aa945d4e) and [Benchmark Dataset](https://www.modelscope.cn/datasets/EditScore/EditReward-Bench).
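The Best-of-N inference mentioned in the news above follows a simple pattern: generate N candidate edits, score each with the reward model, and keep the highest-scoring one. A minimal sketch of that selection step, where `score_fn` is a hypothetical stand-in for the real EditScore scorer (the actual scoring API ships with the `editscore` package and is not shown in this diff):

```python
from typing import Callable, List, Tuple

def select_best_of_n(
    candidates: List[str],
    score_fn: Callable[[str], float],
) -> Tuple[str, float]:
    """Score every candidate edit and return the best one with its score."""
    scored = [(cand, score_fn(cand)) for cand in candidates]
    return max(scored, key=lambda pair: pair[1])

# Toy reward for illustration only: longer "edit" scores higher.
best, best_score = select_best_of_n(
    ["edit_a", "edit_bb", "edit_c"],
    score_fn=lambda c: float(len(c)),
)
```

The same loop is what the repository's Best-of-N scripts amount to, just with real generated images and the real reward model in place of the toy scorer.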

evaluate.sh
Lines changed: 0 additions & 3 deletions

@@ -2,9 +2,6 @@
 SHELL_FOLDER=$(cd "$(dirname "$0")";pwd)
 cd $SHELL_FOLDER
 
-source "$(dirname $(which conda))/../etc/profile.d/conda.sh"
-conda activate py3.12+pytorch2.7.1+cu126
-
 python evaluation.py \
     --benchmark_dir EditScore/EditReward-Bench \
     --result_dir results/EditScore-7B \
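The `calculate_statistics.py` step that follows `evaluation.py` in these scripts is not shown in this commit. As a rough illustration of the kind of statistic such a script might report for a reward-model benchmark, here is a pairwise-accuracy computation; this is a common convention, not a description of the repository's actual metrics:

```python
def pairwise_accuracy(pairs):
    """Fraction of (preferred, rejected) score pairs the model ranks correctly.

    Ties count as half-correct, a common convention for reward benchmarks.
    """
    if not pairs:
        return 0.0
    correct = 0.0
    for preferred, rejected in pairs:
        if preferred > rejected:
            correct += 1.0
        elif preferred == rejected:
            correct += 0.5
    return correct / len(pairs)

# One correct pair, one tie, one wrong pair -> (1 + 0.5 + 0) / 3
acc = pairwise_accuracy([(0.9, 0.4), (0.5, 0.5), (0.2, 0.7)])
```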

evaluate_32B_vllm.sh

Lines changed: 0 additions & 24 deletions
This file was deleted.

evaluate_72B_vllm.sh
Lines changed: 14 additions & 17 deletions

@@ -2,23 +2,20 @@
 SHELL_FOLDER=$(cd "$(dirname "$0")";pwd)
 cd $SHELL_FOLDER
 
-source "$(dirname $(which conda))/../etc/profile.d/conda.sh"
-conda activate py3.12+pytorch2.7.1+cu126
-
-# python evaluation.py \
-#     --benchmark_dir EditScore/EditReward-Bench \
-#     --result_dir results/EditScore-72B \
-#     --backbone qwen25vl_vllm \
-#     --model_name_or_path Qwen/Qwen2.5-VL-72B-Instruct \
-#     --enable_lora \
-#     --lora_path /share/project/jiahao/LLaMA-Factory2/output/lora_72B_extract \
-#     --score_range 25 \
-#     --max_workers 1 \
-#     --max_model_len 4096 \
-#     --max_num_seqs 1 \
-#     --max_num_batched_tokens 4096 \
-#     --tensor_parallel_size 4 \
-#     --num_pass 1
+python evaluation.py \
+    --benchmark_dir EditScore/EditReward-Bench \
+    --result_dir results/EditScore-72B \
+    --backbone qwen25vl_vllm \
+    --model_name_or_path Qwen/Qwen2.5-VL-72B-Instruct \
+    --enable_lora \
+    --lora_path EditScore/EditScore-72B \
+    --score_range 25 \
+    --max_workers 1 \
+    --max_model_len 4096 \
+    --max_num_seqs 1 \
+    --max_num_batched_tokens 4096 \
+    --tensor_parallel_size 4 \
+    --num_pass 1
 
 python calculate_statistics.py \
     --result_dir results/EditScore-72B/qwen25vl_vllm
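The `--num_pass` flag in these scripts suggests the scorer can be queried more than once per sample; averaging several stochastic scoring passes is a standard way to reduce scorer variance. A sketch of that idea, which is an assumption about what `num_pass` controls rather than a description of `evaluation.py` internals:

```python
import statistics
from typing import Callable, List

def multi_pass_score(score_once: Callable[[], float], num_pass: int) -> float:
    """Average several scoring passes to smooth out sampling noise."""
    passes: List[float] = [score_once() for _ in range(num_pass)]
    return statistics.mean(passes)

# Deterministic toy "scorer" that yields noisy readings around 12.5.
readings = iter([12.0, 13.0, 12.5, 12.5])
avg = multi_pass_score(lambda: next(readings), num_pass=4)
```

With `--num_pass 1` as above, no averaging happens; the Best-of-N selection scripts below use `--num_pass 4` for a steadier signal at 4x the scoring cost.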

evaluate_72B_vllm_2.sh

Lines changed: 0 additions & 24 deletions
This file was deleted.

evaluate_vllm.sh
Lines changed: 14 additions & 17 deletions

@@ -2,23 +2,20 @@
 SHELL_FOLDER=$(cd "$(dirname "$0")";pwd)
 cd $SHELL_FOLDER
 
-source "$(dirname $(which conda))/../etc/profile.d/conda.sh"
-conda activate py3.12+pytorch2.7.1+cu126
-
-# python evaluation.py \
-#     --benchmark_dir EditScore/EditReward-Bench \
-#     --result_dir results/EditScore-7B \
-#     --backbone qwen25vl_vllm \
-#     --model_name_or_path Qwen/Qwen2.5-VL-7B-Instruct \
-#     --enable_lora \
-#     --lora_path EditScore/EditScore-7B \
-#     --score_range 25 \
-#     --max_workers 1 \
-#     --max_model_len 4096 \
-#     --max_num_seqs 1 \
-#     --max_num_batched_tokens 4096 \
-#     --tensor_parallel_size 1 \
-#     --num_pass 1
+python evaluation.py \
+    --benchmark_dir EditScore/EditReward-Bench \
+    --result_dir results/EditScore-7B \
+    --backbone qwen25vl_vllm \
+    --model_name_or_path Qwen/Qwen2.5-VL-7B-Instruct \
+    --enable_lora \
+    --lora_path EditScore/EditScore-7B \
+    --score_range 25 \
+    --max_workers 1 \
+    --max_model_len 4096 \
+    --max_num_seqs 1 \
+    --max_num_batched_tokens 4096 \
+    --tensor_parallel_size 1 \
+    --num_pass 1
 
 python calculate_statistics.py \
     --result_dir results/EditScore-7B/qwen25vl_vllm

examples/OmniGen2-RL/data_configs/train/example/edit/all.yml
Lines changed: 1 addition & 2 deletions

@@ -2,7 +2,6 @@ ratio_type: inside_ratio
 
 data:
   -
-    # path: '/path/to/EditScore-RL-Data/rl_abs_9tasks.jsonl'
-    path: '/share/project/chenyuan/data2/EditScore-RL-Data-v4/rl_abs_9tasks.jsonl'
+    path: '/path/to/EditScore-RL-Data/rl_abs_9tasks.jsonl'
     type: 'edit'
     ratio: !!float 1
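How the trainer interprets the `ratio` and `ratio_type` fields is defined by the repository's data loader, which this diff does not show. As a generic illustration of what ratio-weighted mixing of data sources usually means, here is a hypothetical sketch; the `build_mixture` helper and its sampling scheme are illustrative, not the repo's implementation:

```python
import random
from typing import Dict, List

def build_mixture(
    sources: Dict[str, List[str]],
    ratios: Dict[str, float],
    size: int,
    seed: int = 0,
) -> List[str]:
    """Draw `size` samples, picking each sample's source with probability
    proportional to its ratio (a generic mixing scheme)."""
    rng = random.Random(seed)
    names = list(sources)
    weights = [ratios[n] for n in names]
    mixture = []
    for _ in range(size):
        name = rng.choices(names, weights=weights, k=1)[0]
        mixture.append(rng.choice(sources[name]))
    return mixture

# With all weight on 'edit', every sample comes from the edit source.
mix = build_mixture(
    {"edit": ["e1", "e2"], "t2i": ["t1"]},
    {"edit": 1.0, "t2i": 0.0},
    size=4,
)
```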

examples/OmniGen2-RL/evaluation/GEdit-Bench/flux_kontext_dev_16samples_select_best_editscore_pass1.sh
Lines changed: 8 additions & 4 deletions

@@ -73,12 +73,16 @@ for ((i=0; i<num_gpus_per_machine; i++)); do
     --result_dir evaluation/GEdit-Bench/results/FLUX-Kontext-dev/results_gs${guidance_scale}_16samples \
     --save_dir evaluation/GEdit-Bench/results/FLUX-Kontext-dev/results_gs${guidance_scale}_16samples_pass1 \
     --num_samples 16 \
-    --backbone qwen25vl \
-    --model_variant GRM-v4 \
-    --model_path /share/project/jiahao/LLaMA-Factory2/output/merge_v7-2_8models_omnigen2-4samples_gpt4-1_range_0to25 \
+    --backbone qwen25vl_vllm \
+    --model_name_or_path Qwen/Qwen2.5-VL-7B-Instruct \
+    --enable_lora \
+    --lora_path EditScore/EditScore-7B \
+    --score_range 25 \
     --max_workers 1 \
     --max_model_len 4096 \
-    --context_version v2 \
+    --max_num_seqs 1 \
+    --max_num_batched_tokens 4096 \
+    --tensor_parallel_size 1 \
     --num_pass 1 \
     --start_index ${start_idx} --end_index ${end_idx} \
     > logs/gedit_FLUX-Kontext-dev_gs${guidance_scale}_16samples_select_best_pass1_${start_idx}_${end_idx}.log 2>&1 &
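The surrounding loop launches one process per GPU, each handling a contiguous `[start_index, end_index)` slice of the benchmark. The general pattern for sharding N items across K workers can be sketched as follows; the `shard_indices` helper is illustrative, since the script computes its ranges inline in shell:

```python
from typing import List, Tuple

def shard_indices(total: int, num_workers: int) -> List[Tuple[int, int]]:
    """Split `total` items into contiguous [start, end) ranges, one per
    worker, spreading any remainder over the first workers."""
    base, extra = divmod(total, num_workers)
    ranges = []
    start = 0
    for worker in range(num_workers):
        end = start + base + (1 if worker < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges

# 10 benchmark items over 4 workers: the first two workers take 3 each.
shards = shard_indices(total=10, num_workers=4)
```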

examples/OmniGen2-RL/evaluation/GEdit-Bench/flux_kontext_dev_16samples_select_best_editscore_pass4.sh
Lines changed: 8 additions & 4 deletions

@@ -73,12 +73,16 @@ for ((i=0; i<num_gpus_per_machine; i++)); do
     --result_dir evaluation/GEdit-Bench/results/FLUX-Kontext-dev/results_gs${guidance_scale}_16samples \
     --save_dir evaluation/GEdit-Bench/results/FLUX-Kontext-dev/results_gs${guidance_scale}_16samples_pass4 \
     --num_samples 16 \
-    --backbone qwen25vl \
-    --model_variant GRM-v4 \
-    --model_path /share/project/jiahao/LLaMA-Factory2/output/merge_v7-2_8models_omnigen2-4samples_gpt4-1_range_0to25 \
+    --backbone qwen25vl_vllm \
+    --model_name_or_path Qwen/Qwen2.5-VL-7B-Instruct \
+    --enable_lora \
+    --lora_path EditScore/EditScore-7B \
+    --score_range 25 \
     --max_workers 1 \
     --max_model_len 4096 \
-    --context_version v2 \
+    --max_num_seqs 1 \
+    --max_num_batched_tokens 4096 \
+    --tensor_parallel_size 1 \
     --num_pass 4 \
     --start_index ${start_idx} --end_index ${end_idx} \
     > logs/gedit_FLUX-Kontext-dev_gs${guidance_scale}_16samples_select_best_pass4_${start_idx}_${end_idx}.log 2>&1 &

examples/OmniGen2-RL/evaluation/GEdit-Bench/flux_kontext_dev_pass1_best.sh

Lines changed: 0 additions & 26 deletions
This file was deleted.
