Update README.md

Feilong607 · web-flow · commit a06c94909f63 · 2025-12-25T18:44:47.000+11:00
diff --git a/README.md b/README.md
@@ -265,16 +265,61 @@ torchrun --nproc_per_node=8 --master_port=29512 attentive_prob_codec.py \
   --model_name hf_llava_vit_large_ln \
   --embedding_size 1024 \
   --default_epoch 30 \
-  --data_root /data_3/data_attentive_probe/ \
-  --cache_dir /data_3/data_attentive_probe/diving48_hevc/cache_residuals/ \
+  --data_root /path/to/your/data_attentive_probe/ \
+  --cache_dir /path/to/your/cache_residuals/ \
   --K_keep 2048 \
   --mv_compensate median
 ```
 
-**Parameter Notes:**
-- `K_keep`: Number of patches to keep. For example, 256 patches per frame × 8 frames = 2048 total patches.
-- `model_weight`: Path to your pre-trained model weights. Set this to your own model path.
-- `cache_dir`: Directory for cached codec patches. Set this to your own cache directory path.
+**Codec-Specific Parameters:**
+- `cache_dir`: Directory for cached codec patches. The codec-selected patches are stored in and loaded from this directory.
+- `K_keep`: Number of patches to keep. For example, 256 patches per frame × 8 frames = 2048 total patches. Adjust based on your frame count and desired compression ratio.
+- `mv_compensate`: Motion vector compensation method (e.g., `median`).
+
+#### Sampling Evaluation
+
+To evaluate the encoder with uniform frame sampling, first navigate to the evaluation directory:
+
+```bash
+cd eval_encoder
+```
+
+Then run the following command:
+
+```bash
+torchrun --nproc_per_node=8 --master_port=29507 attentive_probe.py \
+  --eval_freq 1 \
+  --default_lr_list 0.0001 \
+  --batch_size 32 \
+  --default_weight_decay 0 \
+  --dali_py_num_workers 8 \
+  --model_family llava_vit_sampling \
+  --dataset diving48 \
+  --num_frames 8 \
+  --model_weight lmms-lab/onevision-encoder-large \
+  --model_name hf_llava_vit_large_ln \
+  --embedding_size 1024 \
+  --frames_token_num 256
+```
+
+**Sampling-Specific Parameters:**
+- `frames_token_num`: Number of tokens per frame (e.g., 256 tokens for standard sampling).
+
+#### Shared Parameters
+
+The following parameters are common to both evaluation methods unless noted otherwise:
+
+- `dataset`: Dataset to evaluate on (e.g., `diving48`, `ssv2`, `kinetics400`). Prepare the dataset according to the Attentive Probe format.
+- `num_frames`: Total number of frames in the video sequence (e.g., 8 for sampling, 64 for codec).
+- `model_weight`: Path to the pre-trained model weights. Use `lmms-lab/onevision-encoder-large` to load directly from Hugging Face, or provide a local path.
+- `model_name`: Model architecture name (e.g., `hf_llava_vit_large_ln`).
+- `embedding_size`: Size of the embedding dimension (e.g., 1024).
+- `batch_size`: Training batch size (varies by evaluation type).
+- `default_lr_list`: Learning rate for probe training (e.g., `0.0001`).
+- `default_weight_decay`: Weight decay used by the optimizer (e.g., `0`).
+- `eval_freq`: Evaluation frequency during training.
+- `dali_py_num_workers`: Number of DALI data loading workers.
+- `data_root`: Root directory containing your prepared dataset (codec evaluation only).