---
title: Loss Selector
createTime: 2025/12/27 00:44:11
permalink: /en/guide/loss/
icon: carbon:select-window
---
# Loss Selector Guide

This document explains how to use the **Loss Selector** in **DataFlex**. The selector computes the per-sample training loss, splits the loss distribution into low/medium/high bands at quantile thresholds, and then samples the training data with extra weight on a chosen band.

---

## 1. Method Overview

**Core idea of the Loss Selector:**

1. During training, compute the training loss of each sample. In a multi-GPU setting, results from the different processes are aligned to the full dataset via the sample index (`idx`).

   * In the current implementation, `batch_size = 1`, so the loss returned by the model is a per-sample loss.
2. On the main process, collect and deduplicate all valid sample losses, then partition the samples into **low / medium / high** loss bands using quantile thresholds.
3. Give every valid sample a base weight of 1, and give samples in the chosen focus band (`focus`) the amplified weight `focus_weight`.
4. Smooth the weight distribution with a temperature parameter and draw samples from the resulting probability distribution; if there are fewer valid samples than requested, the selector automatically switches to sampling with replacement.

**Sampling probability:**

Let the loss of sample $i$ be $l_i$, its band weight be $w_i$, and the temperature be $T$:

$$
p_i = \frac{(w_i + \epsilon)^{1/T}}{\sum_j (w_j + \epsilon)^{1/T}}
$$
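
The following is a minimal, self-contained sketch of this selection rule, not the DataFlex implementation itself; the function name `select` and its defaults are illustrative and mirror the configuration parameters described in Step 2:

```python
import numpy as np

def select(losses, num_samples, focus="high", focus_weight=5.0,
           quantiles=(0.33, 0.66), temperature=1.0, replacement=False,
           eps=1e-8, seed=42):
    """Band-weighted sampling over per-sample losses (illustrative sketch)."""
    losses = np.asarray(losses, dtype=np.float64)
    low_thr, high_thr = np.quantile(losses, quantiles)

    # Partition samples into low / medium / high loss bands.
    bands = np.where(losses < low_thr, "low",
                     np.where(losses < high_thr, "medium", "high"))

    # Base weight 1 for every sample; amplified weight for the focus band.
    weights = np.ones_like(losses)
    weights[bands == focus] = focus_weight

    # Temperature-smoothed probabilities: p_i ∝ (w_i + eps)^(1/T).
    scores = (weights + eps) ** (1.0 / temperature)
    probs = scores / scores.sum()

    # Fewer samples available than requested -> draw with replacement.
    if not replacement and num_samples > len(losses):
        replacement = True
    rng = np.random.default_rng(seed)
    return rng.choice(len(losses), size=num_samples, replace=replacement, p=probs)
```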

## 2. Implementation Steps

### Step 1: Environment Setup

```bash
git clone https://github.com/OpenDCAI/DataFlex.git
cd DataFlex
pip install -e .
pip install llamafactory
```
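
A quick import check verifies the installation (the module names `dataflex` and `llamafactory` follow from the packages installed above):

```bash
python -c "import dataflex, llamafactory; print('ok')"
```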

---

### Step 2: Loss Selector Configuration

**Configuration file path:**
```
DataFlex/src/dataflex/configs/components.yaml
```

**Example configuration:**
```yaml
loss:
  name: loss
  params:
    cache_dir: ../dataflex_saves/loss_output
    focus: "medium"          # low | medium | high
    focus_weight: 5.0
    quantiles: [0.33, 0.66]
    replacement: false
    temperature: 1.0
```

**Parameter Description:**
* `cache_dir`: Cache directory for selection results (one `step_{id}.json` file per selection step).
* `focus`: The band to up-weight (`low` / `medium` / `high`; default `high`).
* `focus_weight`: Weight multiplier applied to samples in the focus band.
* `quantiles`: Quantile split points between the low/medium/high bands; values in `[0, 1]`.
* `replacement`: Whether to sample with replacement; switches to replacement automatically when fewer valid samples are available than requested.
* `temperature`: Sharpness of the sampling distribution; values above 1 flatten it, values below 1 sharpen it.
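
As a concrete example of the weighting: with 99 valid samples split evenly into three bands of 33, `focus_weight: 5.0` on the `medium` band, and `temperature: 1.0`, the total weight is 33 × 5 + 66 × 1 = 231, so each medium-band sample is drawn with probability 5/231 ≈ 0.022, exactly five times the 1/231 ≈ 0.004 of a low- or high-band sample (the $\epsilon$ term is negligible here).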

---

### Step 3: Dynamic Training Configuration

**Configuration file path:**
```
DataFlex/examples/train_lora/selectors/loss.yaml
```

**Example configuration:**
```yaml
### model
model_name_or_path: meta-llama/Llama-3.1-8B
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
lora_rank: 16
lora_alpha: 8

### dataset
dataset: alpaca_en_demo
template: llama3
cutoff_len: 4096
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 0
seed: 42

### output
output_dir: ../dataflex_saves/Llama-3.1-8B/loss
logging_steps: 10
save_steps: 100
plot_loss: true
save_only_model: false
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-4
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### Dataflex args
train_type: dynamic_select
components_cfg_file: src/dataflex/configs/components.yaml
component_name: loss
warmup_step: 10
update_step: 10
update_times: 2

eval_dataset: alpaca_zh_demo
```
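
Here `train_type: dynamic_select` turns on dynamic selection, and `component_name: loss` picks the Loss Selector defined in the `components_cfg_file` from Step 2. Judging by their names, `warmup_step`, `update_step`, and `update_times` control how many steps run before the first selection, how many steps pass between selections, and how many selection rounds occur in total.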

---

### Step 4: Run Training

```bash
FORCE_TORCHRUN=1 DISABLE_VERSION_CHECK=1 dataflex-cli train examples/train_lora/selectors/loss.yaml
```
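
Run the command from the DataFlex repository root so the relative paths in the YAML files resolve correctly. `FORCE_TORCHRUN=1` forces a `torchrun` launch, which enables the multi-GPU loss gathering described in Section 1; to pin specific GPUs, the standard CUDA variable can be prepended:

```bash
CUDA_VISIBLE_DEVICES=0,1 FORCE_TORCHRUN=1 DISABLE_VERSION_CHECK=1 \
  dataflex-cli train examples/train_lora/selectors/loss.yaml
```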


### Step 5: Model Merge and Export

**Configuration file path:**

```
DataFlex/examples/merge_lora/llama3_lora_sft.yaml
```

**Example configuration:**

```yaml
model_name_or_path: meta-llama/Llama-3.1-8B
adapter_name_or_path: ../dataflex_saves/Llama-3.1-8B/loss
template: llama3
trust_remote_code: true

export_dir: ../dataflex_saves/Llama-3.1-8B_lora_sft
export_size: 5
export_device: cpu    # choices: [cpu, auto]
export_legacy_format: false
```

**Parameter Description:**

* `model_name_or_path`: The base model used during training.
* `adapter_name_or_path`: Path to the trained LoRA adapter (the training `output_dir`).
* `export_dir`: Directory where the merged model (base model plus LoRA adapter) is saved.

Execute the export command:

```bash
llamafactory-cli export llama3_lora_sft.yaml
```
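
Because the adapter and export paths in the YAML are relative, it is simplest to run the export from the same directory as training (the DataFlex root) and pass the config path explicitly:

```bash
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```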

The merged model will be saved in:

```
../dataflex_saves/Llama-3.1-8B_lora_sft
```


## 3. Model Evaluation

It is recommended to use the [DataFlow](https://github.com/OpenDCAI/DataFlow) [Model QA Evaluation Pipeline](https://opendcai.github.io/DataFlow-Doc/zh/guide/2k5wjgls/) for a systematic evaluation of the resulting model.
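
Before running a full evaluation pipeline, a quick generation smoke test confirms that the merged checkpoint loads and responds. A minimal sketch using `transformers`, assuming the export directory from Step 5:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path produced by the export step above.
model_dir = "../dataflex_saves/Llama-3.1-8B_lora_sft"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto")

inputs = tokenizer("Briefly explain LoRA fine-tuning.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```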