docs: archive curated N50 FedAvg baseline #2264
Pull request overview
This PR archives a curated, reproducible subset of the N=50 FedAvg no-attack / no-defense baseline assets extracted from PR #2263, including the experiment runner, offline analysis tooling, and the full 30-run metrics/config snapshot matrix.
Changes:
- Added `doc_fedavg/` archive documentation plus `n50_results/` snapshot (30 metrics JSONL + 30 config YAML).
- Added offline analysis script `doc_fedavg/tools/analyze_baseline.py` for summary stats and AC checks.
- Added batch runner `python/examples/federate/prebuilt_jobs/shieldfl/scripts/batch_baseline_n50.sh` to execute the 30-run baseline matrix.
Reviewed changes
Copilot reviewed 46 out of 66 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| doc_fedavg/README.md | Archive overview, contents, and naming/provenance notes for the baseline asset set. |
| doc_fedavg/n50_results/README.md | Describes the archived results snapshot and integrity expectations. |
| doc_fedavg/tools/analyze_baseline.py | Offline metrics loader + summary/AC checks for the archived JSONL results. |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed0.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.1, seed=0). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed1.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.1, seed=1). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed2.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.1, seed=2). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed3.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.1, seed=3). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed4.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.1, seed=4). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed0.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.5, seed=0). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed1.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.5, seed=1). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed2.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.5, seed=2). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed3.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.5, seed=3). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed4.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=0.5, seed=4). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed0.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=100, seed=0). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed1.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=100, seed=1). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed2.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=100, seed=2). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed3.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=100, seed=3). |
| doc_fedavg/n50_results/metrics/metrics_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed4.jsonl | Baseline metrics JSONL (LeNet5/mnist, α=100, seed=4). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed0.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.1, seed=0). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed1.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.1, seed=1). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed2.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.1, seed=2). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed3.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.1, seed=3). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_gauto_seed4.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.1, seed=4). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed0.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.5, seed=0). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed1.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.5, seed=1). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed2.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.5, seed=2). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed3.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.5, seed=3). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_gauto_seed4.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=0.5, seed=4). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed0.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=100, seed=0). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed1.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=100, seed=1). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed2.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=100, seed=2). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed3.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=100, seed=3). |
| doc_fedavg/n50_results/metrics/metrics_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_gauto_seed4.jsonl | Baseline metrics JSONL (ResNet18/cifar10, α=100, seed=4). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_seed0.yaml | Baseline config YAML (LeNet5/mnist, α=0.1, seed=0). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_seed1.yaml | Baseline config YAML (LeNet5/mnist, α=0.1, seed=1). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_seed2.yaml | Baseline config YAML (LeNet5/mnist, α=0.1, seed=2). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_seed3.yaml | Baseline config YAML (LeNet5/mnist, α=0.1, seed=3). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.1_pmr0.0_seed4.yaml | Baseline config YAML (LeNet5/mnist, α=0.1, seed=4). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml | Baseline config YAML (LeNet5/mnist, α=0.5, seed=0). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_seed1.yaml | Baseline config YAML (LeNet5/mnist, α=0.5, seed=1). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_seed2.yaml | Baseline config YAML (LeNet5/mnist, α=0.5, seed=2). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_seed3.yaml | Baseline config YAML (LeNet5/mnist, α=0.5, seed=3). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a0.5_pmr0.0_seed4.yaml | Baseline config YAML (LeNet5/mnist, α=0.5, seed=4). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_seed0.yaml | Baseline config YAML (LeNet5/mnist, α=100, seed=0). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_seed1.yaml | Baseline config YAML (LeNet5/mnist, α=100, seed=1). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_seed2.yaml | Baseline config YAML (LeNet5/mnist, α=100, seed=2). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_seed3.yaml | Baseline config YAML (LeNet5/mnist, α=100, seed=3). |
| doc_fedavg/n50_results/configs/config_LeNet5_mnist_shieldfl_atknone_defnone_a100_pmr0.0_seed4.yaml | Baseline config YAML (LeNet5/mnist, α=100, seed=4). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_seed0.yaml | Baseline config YAML (ResNet18/cifar10, α=0.1, seed=0). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_seed1.yaml | Baseline config YAML (ResNet18/cifar10, α=0.1, seed=1). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_seed2.yaml | Baseline config YAML (ResNet18/cifar10, α=0.1, seed=2). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_seed3.yaml | Baseline config YAML (ResNet18/cifar10, α=0.1, seed=3). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.1_pmr0.0_seed4.yaml | Baseline config YAML (ResNet18/cifar10, α=0.1, seed=4). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml | Baseline config YAML (ResNet18/cifar10, α=0.5, seed=0). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed1.yaml | Baseline config YAML (ResNet18/cifar10, α=0.5, seed=1). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed2.yaml | Baseline config YAML (ResNet18/cifar10, α=0.5, seed=2). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed3.yaml | Baseline config YAML (ResNet18/cifar10, α=0.5, seed=3). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed4.yaml | Baseline config YAML (ResNet18/cifar10, α=0.5, seed=4). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_seed0.yaml | Baseline config YAML (ResNet18/cifar10, α=100, seed=0). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_seed1.yaml | Baseline config YAML (ResNet18/cifar10, α=100, seed=1). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_seed2.yaml | Baseline config YAML (ResNet18/cifar10, α=100, seed=2). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_seed3.yaml | Baseline config YAML (ResNet18/cifar10, α=100, seed=3). |
| doc_fedavg/n50_results/configs/config_ResNet18_cifar10_shieldfl_atknone_defnone_a100_pmr0.0_seed4.yaml | Baseline config YAML (ResNet18/cifar10, α=100, seed=4). |
| python/examples/federate/prebuilt_jobs/shieldfl/scripts/batch_baseline_n50.sh | Batch script to run the full 30-experiment baseline matrix. |

    local CMD="bash scripts/run_experiment.sh \
        --model $MODEL --dataset $DATASET \
        --attack $ATTACK --defense $DEFENSE \
        --pmr $PMR --alpha $ALPHA --seed $SEED \
        --rounds $ROUNDS --clients $CLIENTS \

`scripts/run_experiment.sh` is invoked here, but that file does not exist anywhere in the repository (a full repo search only finds this reference). As-is, the batch runner cannot be executed; either add/restore the referenced script (and any dependencies) or update this to call the actual existing entrypoint, and consider adding an early check with a clear error if the dependency is missing.

    else
        local T1=$(date +%s)
        local DUR=$((T1 - T0))
        echo "[FAIL] $TAG duration=${DUR}s exit=$?"
        echo "[FAIL] $TAG duration=${DUR}s" >>"$DONE_FILE"

In the failure branch, `exit=$?` does not report the exit status of the experiment command because `$?` is overwritten by the subsequent `date`/assignments inside the `else` block. Capture the return code immediately after `eval` fails (e.g., store it in a variable) so the logged exit code is correct.

    for alpha in ["0.1", "0.5", "100"]:
        c10_finals = [load_exp("ResNet18", "cifar10", alpha, s)[-1]['test_accuracy'] for s in range(5)]
        mn_finals = [load_exp("LeNet5", "mnist", alpha, s)[-1]['test_accuracy'] for s in range(5)]
        c10_mean = np.mean(c10_finals)
        mn_mean = np.mean(mn_finals)

These list comprehensions assume `load_exp(...)` always returns a list; if any metrics file is missing, `load_exp` returns `None` and this will raise at `[-1]`. Either assert all required files exist with a clear error message before this section, or filter/skip missing seeds here (similar to the earlier loop) to keep the analysis script robust.
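
The "filter/skip missing seeds" option could look roughly like the sketch below. It is only an illustration: `final_accuracies` and the stand-in `fake_load_exp` are hypothetical names that are not part of the PR, and `statistics.fmean` is used in place of `np.mean` purely to keep the example dependency-free.

```python
from statistics import fmean


def final_accuracies(load_exp, model, dataset, alpha, seeds=range(5)):
    """Collect final-round accuracies, skipping seeds whose metrics are missing.

    `load_exp` is assumed to behave like the function under review:
    it returns a list of per-round dicts, or None when the file is absent.
    """
    finals, missing = [], []
    for seed in seeds:
        data = load_exp(model, dataset, alpha, seed)
        if not data:  # None (missing file) or an empty list
            missing.append(seed)
            continue
        finals.append(data[-1]["test_accuracy"])
    return finals, missing


def fake_load_exp(model, dataset, alpha, seed):
    """Hypothetical stand-in loader: seed 3 is 'missing', others end at 0.5."""
    if seed == 3:
        return None
    return [{"test_accuracy": 0.25}, {"test_accuracy": 0.5}]


finals, missing = final_accuracies(fake_load_exp, "ResNet18", "cifar10", "0.1")
mean_final = fmean(finals) if finals else None
```

Returning the list of skipped seeds alongside the values lets the caller report how many runs each mean is based on instead of silently averaging over fewer than five seeds.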

    for alpha in ["0.1", "0.5", "100"]:
        for seed in range(5):
            data = load_exp(model, dataset, alpha, seed)
            accs = [d['test_accuracy'] for d in data]
            last10_std = np.std(accs[-10:]) * 100

`data = load_exp(...)` can be `None` when a metrics file is missing, but this block immediately iterates over `data` without a guard. This will raise a `TypeError` and stop the analysis; add a `None` check (or preflight validation) before computing `accs`/`last10_std`.
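
A preflight validation of the kind mentioned above might be sketched as follows. The file-name pattern and the 2 × 3 × 5 matrix are taken from the archived file list; the helper names `expected_metrics_paths` and `find_missing` are hypothetical and do not exist in the PR.

```python
import tempfile
from pathlib import Path

# Experiment matrix as listed in the archived file table.
MATRIX = [("LeNet5", "mnist"), ("ResNet18", "cifar10")]
ALPHAS = ["0.1", "0.5", "100"]
SEEDS = range(5)


def expected_metrics_paths(results_dir):
    """Build the 30 metrics paths the analysis expects to find."""
    results_dir = Path(results_dir)
    return [
        results_dir
        / f"metrics_{model}_{dataset}_shieldfl_atknone_defnone_a{alpha}_pmr0.0_gauto_seed{seed}.jsonl"
        for model, dataset in MATRIX
        for alpha in ALPHAS
        for seed in SEEDS
    ]


def find_missing(results_dir):
    """Return the expected metrics paths that do not exist on disk."""
    return [p for p in expected_metrics_paths(results_dir) if not p.exists()]


# Demonstration in a temporary directory with one file deliberately absent.
with tempfile.TemporaryDirectory() as tmp:
    paths = expected_metrics_paths(tmp)
    for p in paths[:-1]:
        p.touch()
    missing_names = [p.name for p in find_missing(tmp)]
```

In the real script, a non-empty result could be turned into a single error listing every absent file before any statistics are computed, so that the later loops never see `None`.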
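
    - The original JSONL / YAML file names keep the `shieldfl` naming of the existing experiment pipeline; in this batch of experiments it denotes **no-attack / no-defense baseline records** produced by the ShieldFL experiment framework, not that some additional defense was enabled.
    - The `gauto` suffix in the metrics file names is also kept as-is; because these experiments use `attack=none`, `gamma_actual` inside the files is `null`, and the suffix is retained only for consistency with the original PR assets.
    - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` already exists on `master` and therefore did not appear in the PR diff; it was added during this archive so that the full 30-run matrix is covered.

This README refers to the default branch as `master`, but this repository uses `main` (as indicated by the diff base). To avoid confusion for readers trying to locate the referenced existing config, update the branch name here (or rephrase to "main branch/default branch").

Suggested change:

    - - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` already exists on `master` and therefore did not appear in the PR diff; it was added during this archive so that the full 30-run matrix is covered.
    + - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` already exists on `main` and therefore did not appear in the PR diff; it was added during this archive so that the full 30-run matrix is covered.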

    - The original JSONL / YAML file names keep the `shieldfl` naming of the existing experiment pipeline; in this batch of experiments it denotes **no-attack / no-defense baseline records** produced by the ShieldFL experiment framework, not that some additional defense was enabled.
    - The `gauto` suffix in the metrics file names is also kept as-is; because these experiments use `attack=none`, `gamma_actual` inside the files is `null`, and the suffix is retained only for consistency with the original PR assets.
    - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` already exists on `master` and therefore did not appear in the PR diff; it was added during this archive so that the full 30-run matrix is covered.

This bullet says the `...a0.5...seed0.yaml` config "already existed" and therefore "did not appear in the PR diff", but in this PR it is actually being added as a new file. Please reconcile the provenance note (either remove the "already existed/not in diff" claim or clarify what PR/branch that statement refers to).

Suggested change:

    - - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` already exists on `master` and therefore did not appear in the PR diff; it was added during this archive so that the full 30-run matrix is covered.
    + - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` is included in this archive so that the full `2 datasets × 3 α × 5 seeds = 30` experiment matrix is covered.

    - File names keep the original `shieldfl` / `gauto` naming to stay consistent with the reports, scripts, and PR evidence chain.
    - This archive keeps only the raw metrics / configs needed for final review; run-time process files such as `batch_logs/` and `done.txt` are not included in the repository.
    - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` comes from an existing file on the main branch and is used to complete the full 30-run experiment matrix; the other 29 configs and all 30 metrics come from the PR branch.

This note claims `...a0.5...seed0.yaml` comes from the main branch (and only the other 29 configs came from the PR branch), but in this PR the file is being added as a new artifact. Please update this provenance statement to match what's actually in the repo/diff (or clarify the exact source branch/commit where it was taken from).

Suggested change:

    - - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` comes from an existing file on the main branch and is used to complete the full 30-run experiment matrix; the other 29 configs and all 30 metrics come from the PR branch.
    + - `config_ResNet18_cifar10_shieldfl_atknone_defnone_a0.5_pmr0.0_seed0.yaml` is a config snapshot included in this archive to complete the 30-run experiment matrix; the other 29 configs and all 30 metrics come from the PR branch.

    def load_exp(model, dataset, alpha, seed):
        path = RESULTS_DIR / f'metrics_{model}_{dataset}_shieldfl_atknone_defnone_a{alpha}_pmr0.0_gauto_seed{seed}.jsonl'
        if not path.exists():
            return None
        with path.open(encoding='utf-8') as f:
            data = [json.loads(l) for l in f]
        return data

`load_exp` parses every line with `json.loads(l)`, but the archived JSONL metrics files in this PR contain a trailing blank line (e.g., line 101 is empty), which will raise `JSONDecodeError`. Filter out empty/whitespace-only lines (or `strip()` and `continue`) when loading to make the analysis script work with the archived artifacts.
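
One way to apply the suggested filtering is sketched below. `load_jsonl` is a hypothetical, path-based variant of `load_exp` written only for illustration; it keeps the "return `None` when the file is missing" behavior of the reviewed function and skips blank lines before parsing.

```python
import json
import tempfile
from pathlib import Path


def load_jsonl(path):
    """Return the parsed records of a JSONL file, or None if it is missing."""
    path = Path(path)
    if not path.exists():
        return None
    records = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # tolerate blank or whitespace-only lines
                continue
            records.append(json.loads(line))
    return records


# Demonstration: two records followed by a trailing blank line.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.jsonl"
    demo.write_text(
        '{"round": 0, "test_accuracy": 0.5}\n'
        '{"round": 1, "test_accuracy": 0.6}\n'
        "\n",
        encoding="utf-8",
    )
    demo_records = load_jsonl(demo)
    missing_records = load_jsonl(Path(tmp) / "absent.jsonl")
```

Malformed non-empty lines still raise `json.JSONDecodeError`, which seems preferable for an archive check since silent skipping could hide a corrupted record.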

    - Original results root directory: `python/examples/federate/prebuilt_jobs/shieldfl/results/`
    - Integrity checks:
      - all `30/30` metrics files are present
      - each metrics file has `100` lines, with final round `round=99`

The integrity check claims each metrics file has exactly 100 lines, but the archived JSONL files in this PR include an extra trailing blank line (they show up as 101 lines in the repo and have an empty final line). Please either remove the trailing blank lines from the metrics files or update this statement to match the actual archived artifacts.

Suggested change:

    - - each metrics file has `100` lines, with final round `round=99`
    + - each metrics file contains `100` JSONL records, with final round `round=99`; because the archived files keep one trailing blank line, they show as `101` lines in the repository
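
The difference between physical lines and JSONL records can be checked mechanically. The sketch below is an assumption-laden illustration: `summarize_jsonl` is a hypothetical helper, and the sample text is synthetic data shaped like the description above (100 records with a `round` field plus one trailing blank line), not the archived metrics themselves.

```python
import json


def summarize_jsonl(text):
    """Count physical lines and non-blank JSON records, and report the last round."""
    lines = text.splitlines()
    records = [json.loads(line) for line in lines if line.strip()]
    last_round = records[-1]["round"] if records else None
    return {
        "physical_lines": len(lines),
        "records": len(records),
        "last_round": last_round,
    }


# Synthetic sample: rounds 0..99, each on its own line, then one blank line.
sample = "".join(json.dumps({"round": r}) + "\n" for r in range(100)) + "\n"
summary = summarize_jsonl(sample)
```

Checking `records` and `last_round` rather than the raw line count keeps the integrity statement true whether or not the trailing blank line is later removed.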
This PR contains the curated subset extracted from PR #2263 and archives only the N=50 FedAvg no-attack baseline assets.
Included:
- `doc_fedavg/` archive with design/report/README files
- `batch_baseline_n50.sh`

Excluded from the original PR:
This PR is intended to supersede #2263 with a clean, reviewable merge surface.