
Commit 589c6d1

add loss & add delta loss & updata icon (#18)
1 parent ee5fe72 commit 589c6d1

16 files changed

Lines changed: 753 additions & 10 deletions

docs/.vuepress/notes/en/guide.ts

Lines changed: 2 additions & 0 deletions
```diff
@@ -24,6 +24,8 @@ export const Guide: ThemeNote = defineNoteConfig({
   items: [
     'quickstart',
     'tutorial',
+    'selector_delta_loss',
+    'selector_loss',
     'selector_less',
     'selector_nice',
     'selector_offline_tsds',
```

docs/.vuepress/notes/zh/guide.ts

Lines changed: 2 additions & 0 deletions
```diff
@@ -24,6 +24,8 @@ export const Guide: ThemeNote = defineNoteConfig({
   items: [
     'quickstart',
     'tutorial',
+    'selector_delta_loss',
+    'selector_loss',
     'selector_less',
     'selector_nice',
     'selector_offline_tsds',
```

docs/en/notes/guide/mixer/odm.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,7 +1,7 @@
 ---
 title: ODM Data Mixer
 createTime: 2025/01/27 10:00:00
-icon: material-symbols:casino
+icon: material-symbols:balance
 permalink: /en/guide/mixer/odm/
 ---
```

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@

---
title: Delta Loss Selector
createTime: 2025/12/27 00:44:11
permalink: /en/guide/delta-loss/
icon: carbon:select-window
---

# Delta Loss Selector Guide

This document explains how to use the **Delta Loss Selector** in **DataFlex**. It tracks loss reduction relative to an initial baseline and samples from a sliding window over the ranked delta-loss list.

---

## 1. Method Overview

**Delta Loss Selector** workflow:

1. On the first selection, compute and cache **initial losses**, then return a random warmup batch.
2. On later steps, compute current losses and define $\Delta l_i = l_i^{(init)} - l_i^{(current)}$.
3. Sort samples by $\Delta l_i$ in descending order and compute a sliding-window position from training progress.
4. Assign high sampling probability inside the window and a small base probability outside.

**Sliding window schedule:**

Let $t$ be the current update index out of $T$ total updates, $N$ the number of candidate samples, and $s$ the window ratio:

$$
u = \sigma\left(\frac{t}{T}\right), \quad
\text{start} = u \cdot (N - sN), \quad
\text{end} = \text{start} + sN
$$

If $\Delta l_i < 0$, the window end is truncated so that samples whose loss increased are not prioritized.
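
A minimal sketch of this schedule, assuming linear progress `u = t / T` in place of the unspecified $\sigma$ (the function name and `base_prob` are illustrative, not DataFlex's actual API):

```python
import numpy as np

def delta_loss_window_probs(init_losses, cur_losses, t, T,
                            window_size=0.2, base_prob=1e-3):
    """Rank samples by loss reduction, slide a window down the ranking
    as training progresses, and up-weight samples inside the window."""
    delta = np.asarray(init_losses) - np.asarray(cur_losses)  # Δl_i
    order = np.argsort(-delta)              # descending by improvement
    n = len(delta)
    w = int(window_size * n)                # window covers sN ranks
    u = t / T                               # progress proxy for σ(t/T)
    start = int(u * (n - w))
    end = start + w
    # truncate so the window never extends past the last improved sample
    end = min(end, int(np.count_nonzero(delta >= 0)))
    probs = np.full(n, base_prob)
    if end > start:
        probs[order[start:end]] = 1.0       # high weight inside the window
    return probs / probs.sum()

# e.g. draw one update's subset of 64 samples:
# idx = np.random.default_rng(0).choice(n, size=64, replace=False, p=probs)
```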
## 2. Implementation Steps

### Step 1: Environment Setup

```bash
git clone https://github.com/OpenDCAI/DataFlex.git
cd DataFlex
pip install -e .
pip install llamafactory
```

---

### Step 2: Delta Loss Selector Configuration

**Configuration file path:**

```
DataFlex/src/dataflex/configs/components.yaml
```

**Example configuration:**

```yaml
delta_loss:
  name: delta_loss
  params:
    cache_dir: ../dataflex_saves/delta_loss_output
    window_size: 0.2
```

**Parameter Description:**

* `cache_dir`: Cache directory; the initial losses computed on the first selection step are cached here.
* `window_size`: Sliding-window ratio $s$, i.e. the fraction of the ranked dataset that receives high sampling weight.
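
For example, with `window_size: 0.2` and $N = 1000$ candidates, the window always spans $sN = 200$ ranks and slides from the top of the ranking (largest loss reductions) toward the bottom as $u$ grows; at $u = 0.5$ it covers ranks 400–600.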

---

### Step 3: Dynamic Training Configuration

**Configuration file path:**

```
DataFlex/examples/train_lora/selectors/delta_loss.yaml
```

**Example configuration:**

```yaml
### model
model_name_or_path: meta-llama/Llama-3.1-8B
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
lora_rank: 16
lora_alpha: 8

### dataset
dataset: alpaca_en_demo
template: llama3
cutoff_len: 4096
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 0
seed: 42

### output
output_dir: ../dataflex_saves/Llama-3.1-8B/delta_loss
logging_steps: 10
save_steps: 100
plot_loss: true
save_only_model: false
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-4
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### Dataflex args
train_type: dynamic_select
components_cfg_file: src/dataflex/configs/components.yaml
component_name: delta_loss
warmup_step: 10
update_step: 10
update_times: 2

eval_dataset: alpaca_zh_demo
```
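
The `### Dataflex args` block is what switches plain LlamaFactory training into dynamic selection. Reading the example against the schedule above, `warmup_step` warmup steps precede the first selection, after which the subset is refreshed every `update_step` steps for `update_times` updates in total; this reading of the three knobs is inferred from the example config, not from a DataFlex reference.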

---

### Step 4: Run Training

```bash
FORCE_TORCHRUN=1 DISABLE_VERSION_CHECK=1 dataflex-cli train examples/train_lora/selectors/delta_loss.yaml
```

---

### Step 5: Model Merge and Export

**Configuration file path:**

```
DataFlex/examples/merge_lora/llama3_lora_sft.yaml
```

**Example configuration:**

```yaml
model_name_or_path: meta-llama/Llama-3.1-8B
adapter_name_or_path: ../dataflex_saves/Llama-3.1-8B/delta_loss
template: llama3
trust_remote_code: true

export_dir: ../dataflex_saves/Llama-3.1-8B_lora_sft
export_size: 5
export_device: cpu # choices: [cpu, auto]
export_legacy_format: false
```

**Parameter Description:**

* `model_name_or_path`: Base model used for training; it must match the model the adapter was trained on.
* `adapter_name_or_path`: Output path of the LoRA adapter.
* `export_dir`: Directory where the merged base model and LoRA adapter are saved.

Execute the export command:

```bash
llamafactory-cli export llama3_lora_sft.yaml
```

The merged model will be saved in:

```
../dataflex_saves/Llama-3.1-8B_lora_sft
```

## 3. Model Evaluation

It is recommended to use the [DataFlow](https://github.com/OpenDCAI/DataFlow) [Model QA Evaluation Pipeline](https://opendcai.github.io/DataFlow-Doc/zh/guide/2k5wjgls/) for systematic evaluation of the generated model.
Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@

---
title: Loss Selector
createTime: 2025/12/27 00:44:11
permalink: /en/guide/loss/
icon: carbon:select-window
---

# Loss Selector Guide

This document explains how to use the **Loss Selector** in **DataFlex**. The selector computes per-sample training loss, splits the distribution into low/medium/high bands using quantiles, and then samples with higher weight on a chosen band.

---

## 1. Method Overview

**Core idea of the Loss Selector:**

1. During training, compute the training loss for each sample. In a multi-GPU setting, results from different processes are aligned to the full dataset via the sample index (`idx`).
   * In the current implementation, `batch_size = 1`, so the loss returned by the model is a per-sample loss.
2. On the main process, collect and deduplicate all valid sample losses, and partition samples into **low / medium / high** loss bands using quantile thresholds.
3. Assign a base weight of 1 to every valid sample, and amplify the weight to `focus_weight` for samples in the chosen focus band (`focus`).
4. Smooth the weight distribution with a temperature parameter and draw samples from the resulting probability distribution; when there are too few valid samples to fill the request, sampling automatically switches to with-replacement (see the sketch after the formula below).

**Sampling probability:**

Let the loss of sample $i$ be $l_i$, its band weight be $w_i$, and the temperature be $T$:

$$
p_i = \frac{(w_i + \epsilon)^{1/T}}{\sum_j (w_j + \epsilon)^{1/T}}
$$

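A minimal sketch of the banding-and-sampling step, assuming quantile thresholds at `[0.33, 0.66]` (the function name and `eps` are illustrative, not DataFlex's actual API):

```python
import numpy as np

def loss_band_probs(losses, focus="high", focus_weight=5.0,
                    quantiles=(0.33, 0.66), temperature=1.0, eps=1e-8):
    """Split losses into low/medium/high bands by quantile and return
    sampling probabilities p_i ∝ (w_i + eps)^(1/T)."""
    losses = np.asarray(losses, dtype=float)
    lo, hi = np.quantile(losses, quantiles)
    bands = np.where(losses < lo, "low", np.where(losses < hi, "medium", "high"))
    weights = np.where(bands == focus, focus_weight, 1.0)  # base weight 1
    scores = (weights + eps) ** (1.0 / temperature)        # temperature smoothing
    return scores / scores.sum()

losses = np.random.rand(1000)
probs = loss_band_probs(losses, focus="medium")
k = 64
# fall back to with-replacement sampling when too few samples carry mass
replace = np.count_nonzero(probs > 0) < k
idx = np.random.default_rng(0).choice(len(losses), size=k, replace=replace, p=probs)
```
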
## 2. Implementation Steps

### Step 1: Environment Setup

```bash
git clone https://github.com/OpenDCAI/DataFlex.git
cd DataFlex
pip install -e .
pip install llamafactory
```

---

### Step 2: Loss Selector Configuration

**Configuration file path:**

```
DataFlex/src/dataflex/configs/components.yaml
```

**Example configuration:**

```yaml
loss:
  name: loss
  params:
    cache_dir: ../dataflex_saves/loss_output
    focus: "medium" # low | medium | high
    focus_weight: 5.0
    quantiles: [0.33, 0.66]
    replacement: false
    temperature: 1.0
```

**Parameter Description:**

* `cache_dir`: Cache directory for selection results (one `step_{id}.json` per selection step).
* `focus`: Target band to up-weight (`low` / `medium` / `high`; default `high`).
* `focus_weight`: Weight multiplier applied to samples in the focus band.
* `quantiles`: Split points between the low/medium/high bands, values in `[0, 1]`.
* `replacement`: Whether to sample with replacement; switches to with-replacement automatically when there are too few valid samples.
* `temperature`: Sharpness of the sampling distribution; `>1` smooths it, `<1` sharpens it.
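
For instance, with `focus_weight: 5.0` and `temperature: 1.0`, a focus-band sample is about 5× as likely to be drawn as a non-focus sample ($5^{1/1} = 5$); raising the temperature to `2.0` shrinks that ratio to $5^{1/2} \approx 2.24$.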

---

### Step 3: Dynamic Training Configuration

**Configuration file path:**

```
DataFlex/examples/train_lora/selectors/loss.yaml
```

**Example configuration:**

```yaml
### model
model_name_or_path: meta-llama/Llama-3.1-8B
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
lora_rank: 16
lora_alpha: 8

### dataset
dataset: alpaca_en_demo
template: llama3
cutoff_len: 4096
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 0
seed: 42

### output
output_dir: ../dataflex_saves/Llama-3.1-8B/loss
logging_steps: 10
save_steps: 100
plot_loss: true
save_only_model: false
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-4
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### Dataflex args
train_type: dynamic_select
components_cfg_file: src/dataflex/configs/components.yaml
component_name: loss
warmup_step: 10
update_step: 10
update_times: 2

eval_dataset: alpaca_zh_demo
```

---

### Step 4: Run Training

```bash
FORCE_TORCHRUN=1 DISABLE_VERSION_CHECK=1 dataflex-cli train examples/train_lora/selectors/loss.yaml
```

### Step 5: Model Merge and Export

**Configuration file path:**

```
DataFlex/examples/merge_lora/llama3_lora_sft.yaml
```

**Example configuration:**

```yaml
model_name_or_path: meta-llama/Llama-3.1-8B
adapter_name_or_path: ../dataflex_saves/Llama-3.1-8B/loss
template: llama3
trust_remote_code: true

export_dir: ../dataflex_saves/Llama-3.1-8B_lora_sft
export_size: 5
export_device: cpu # choices: [cpu, auto]
export_legacy_format: false
```

**Parameter Description:**

* `model_name_or_path`: Base model used for training; it must match the model the adapter was trained on.
* `adapter_name_or_path`: Output path of the LoRA adapter.
* `export_dir`: Directory where the merged base model and LoRA adapter are saved.

Execute the export command:

```bash
llamafactory-cli export llama3_lora_sft.yaml
```

The merged model will be saved in:

```
../dataflex_saves/Llama-3.1-8B_lora_sft
```

## 3. Model Evaluation

It is recommended to use the [DataFlow](https://github.com/OpenDCAI/DataFlow) [Model QA Evaluation Pipeline](https://opendcai.github.io/DataFlow-Doc/zh/guide/2k5wjgls/) for systematic evaluation of the generated model.