Record LeakyCLIP boundary gate #303
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,132 @@ | ||
| # LeakyCLIP CLIP-Inversion Boundary Gate | ||
|
|
||
| > Date: 2026-05-25 | ||
| > Status: CLIP inversion privacy watch-plus / official code-public / diffusion used only as refinement / no row-bound membership artifact / no download / no GPU release / no admitted row | ||
|
|
||
| ## Question | ||
|
|
||
| Does arXiv `2508.00756` / `LeakyCLIP: Extracting Training Data from CLIP` | ||
| expose a public DiffAudit-ready diffusion membership target, split, score | ||
| packet, or verifier that should change the active Research slots? | ||
|
|
||
| This was selected as a narrow Lane A metadata gate because the recent arXiv | ||
| search surfaced a privacy paper whose abstract mentions diffusion models and | ||
| training-data membership leakage, and GitHub search found an official public | ||
| repository. The check used arXiv API metadata, GitHub repository metadata, the | ||
| recursive tree, releases/tags metadata, and the official README. It did not | ||
| clone the repository, download CLIP models, download Stable Diffusion or VAE | ||
| weights, download LAION/Flickr/LFW/Furniture datasets, generate images, train | ||
| embedding alignment, run inversion, or compute metrics. | ||
|
|
||
| ## Public Surface | ||
|
|
||
| | Field | Value | | ||
| | --- | --- | | ||
| | Paper line | `LeakyCLIP: Extracting Training Data from CLIP` | | ||
| | arXiv | `https://arxiv.org/abs/2508.00756v4` | | ||
| | Published / updated | `2025-08-01T16:32:48Z` / `2026-05-21T07:46:00Z` | | ||
| | Official code | `https://github.com/dongdongunique/LeakyCLIP` | | ||
| | Repository state | default branch `main`, pushed `2026-02-27T08:12:16Z`, updated `2026-05-11T07:59:31Z`, `24` stars, no license in GitHub metadata, `0` releases, `0` tags | | ||
| | Repository topics | `clip-inversion`, `clip-memorization`, `clip-privacy-leakage`, `multimodal-model-privacy`, `training-data-extraction-from-clip` | | ||
| | Current scope | CLIP inversion / multimodal privacy; Stable Diffusion is an optional refinement component, not the audited target model | | ||
|
|
||
| The official repository is a real code surface. The checked tree includes | ||
| configuration files, inversion/refinement code, evaluation metrics, scripts, | ||
| and documentation images: | ||
|
|
||
| ```text | ||
| README.md | ||
| config.py | ||
| data.py | ||
| ea_train.py | ||
| eval/metrics.py | ||
| inversion/inverter.py | ||
| main.py | ||
| refinement/sd_refiner.py | ||
| configs/baseline/{flickr,furniture_object,laion}.json | ||
| configs/method/{flickr,furniture_object,laion,single}.json | ||
| scripts/run_dataset_inversion_baseline.sh | ||
| scripts/run_ea_train.sh | ||
| scripts/run_inversion.sh | ||
| scripts/run_laion_inversion.sh | ||
| docs/example_output.png | ||
| docs/pipeline.png | ||
| ``` | ||
|
|
||
| The README describes a three-stage pipeline: adversarial fine-tuning of CLIP, | ||
| embedding alignment from text embeddings to pseudo-image embeddings, and | ||
| optional Stable Diffusion refinement for texture/detail recovery. It lists | ||
| required public CLIP, robust CLIP, Stable Diffusion XL, VAE, SSCD, and dataset | ||
| downloads. It also states that membership can be inferred from reconstruction | ||
| metrics. | ||
|
|
||
| No committed result packet was visible in the checked public tree: no frozen | ||
| CLIP checkpoint hashes, no immutable LAION/Flickr/LFW/Furniture member and | ||
| nonmember manifests, no generated reconstruction packet, no per-row membership | ||
| score file, no ROC arrays, no metric JSON, no trained embedding-alignment | ||
| weights, and no output from a no-training verifier. | ||
|
|
||
| ## Claim Boundary | ||
|
|
||
| LeakyCLIP is scientifically relevant privacy evidence, but it does not audit a | ||
| diffusion model as the target. The target under attack is CLIP. Diffusion | ||
| appears as an image-refinement stage after CLIP inversion and as a general | ||
| background risk in the paper abstract, not as the membership target whose | ||
| training set is being inferred. | ||
|
|
||
| That makes it adjacent to DiffAudit's image-generation privacy map but outside | ||
| the current diffusion / latent-image per-sample membership consumer contract. | ||
| Treating it as a diffusion MIA asset would conflate a CLIP inversion threat | ||
| model with a diffusion-model membership row. | ||
|
|
||
| ## Gate Result | ||
|
|
||
| | Gate | Result | | ||
| | --- | --- | | ||
| | Current image/latent-image fit | Partial. It is multimodal image privacy work, but the audited target family is CLIP rather than a diffusion or latent-diffusion generator. | | ||
| | Target identity | Fail for DiffAudit replay. The README lists CLIP/robust CLIP/SDXL/VAE model sources, but no paper-bound hashes or frozen target bundle is committed. | | ||
| | Exact member split | Fail. No immutable member row IDs, image filenames, captions, URLs, or split manifests are committed. | | ||
| | Exact nonmember split | Fail. No row-bound holdout/nonmember manifest is committed. | | ||
| | Query/response or score coverage | Fail. The repository ships code and examples, not reconstruction packets, per-row membership scores, ROC arrays, metric JSON, or verifier output. | | ||
| | Mechanism delta | Pass as watch-plus only. CLIP inversion with reconstruction metrics is a distinct privacy surface, not a diffusion membership mechanism within the current admitted boundary. | ||
| | Download justification | Fail. Running it would require CLIP, SDXL, VAE, and SSCD model downloads plus dataset downloads, with no released row-bound replay packet to justify them. | ||
| | GPU release | Fail. The blocker is target-family boundary plus missing replay artifacts, not local compute. | | ||
|
|
||
| ## Decision | ||
|
|
||
| `CLIP inversion privacy watch-plus / official code-public / diffusion used only | ||
| as refinement / no row-bound membership artifact / no download / no GPU release | ||
| / no admitted row`. | ||
|
|
||
| Keep LeakyCLIP as Research-only adjacent privacy evidence. It is useful for | ||
| framing multimodal model privacy and CLIP training-data extraction risk, but it | ||
| does not reopen the current diffusion asset path and does not justify model, | ||
| dataset, or GPU work. | ||
|
|
||
| Current slots become `active_gpu_question = none`, `next_gpu_candidate = none`, | ||
| and `CPU sidecar = none selected after LeakyCLIP CLIP-inversion boundary gate`. | ||
|
|
||
| Smallest valid reopen condition: | ||
|
|
||
| - authors publish compact row-bound reconstruction and membership score | ||
| packets with immutable member/nonmember manifests, model hashes, ROC arrays, | ||
| metric JSON, and a no-training verifier; and | ||
| - DiffAudit explicitly opens a CLIP / multimodal-model privacy consumer | ||
| boundary separate from diffusion / latent-image admitted rows. | ||
|
|
||
| Stop condition: | ||
|
|
||
| - Do not download CLIP, robust CLIP, SDXL, VAE, SSCD, LAION, Flickr, LFW, | ||
| Furniture, generated reconstructions, or embedding-alignment weights from | ||
| this gate. | ||
| - Do not clone or run LeakyCLIP for execution from this gate. | ||
| - Do not run `main.py`, `ea_train.py`, `scripts/run_*`, inversion, refinement, | ||
| metric computation, or GPU work from this gate. | ||
| - Do not add Platform/Runtime rows, schemas, product copy, or recommendation | ||
| logic until a reviewed CLIP privacy consumer boundary or row-bound replay | ||
| artifacts exist. | ||
|
|
||
| ## Platform and Runtime Impact | ||
|
|
||
| None. Platform and Runtime continue consuming only the admitted `recon / PIA | ||
| baseline / PIA defended / GSA / DPDM W-1` set. |
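
The artifact checks recorded in the gate table can be approximated offline from a repository tree listing alone, without cloning or downloading anything. The sketch below is illustrative only: the glob patterns, the `REPLAY_ARTIFACT_PATTERNS` name, and the `find_replay_artifacts` helper are assumptions made for this example, not DiffAudit tooling or criteria. The path list is the checked tree quoted in the gate record, with the brace shorthand expanded.

```python
from fnmatch import fnmatchcase

# Hypothetical glob patterns for replay artifacts. The real gate criteria
# are not specified in this record, so these are illustrative assumptions.
REPLAY_ARTIFACT_PATTERNS = {
    "model_hashes": ["*sha256*", "*checksums*"],
    "split_manifest": ["*manifest*", "*nonmember*", "*split*.csv", "*split*.json"],
    "score_packet": ["*scores*.csv", "*scores*.json", "*scores*.npy",
                     "*roc*.csv", "*roc*.json", "*roc*.npy"],
    "metric_json": ["results/*.json", "*metrics*.json"],
    "weights": ["*.pt", "*.pth", "*.ckpt", "*.safetensors"],
}

# Paths from the checked public tree listed in the gate record.
CHECKED_TREE = [
    "README.md",
    "config.py",
    "data.py",
    "ea_train.py",
    "eval/metrics.py",
    "inversion/inverter.py",
    "main.py",
    "refinement/sd_refiner.py",
    "configs/baseline/flickr.json",
    "configs/baseline/furniture_object.json",
    "configs/baseline/laion.json",
    "configs/method/flickr.json",
    "configs/method/furniture_object.json",
    "configs/method/laion.json",
    "configs/method/single.json",
    "scripts/run_dataset_inversion_baseline.sh",
    "scripts/run_ea_train.sh",
    "scripts/run_inversion.sh",
    "scripts/run_laion_inversion.sh",
    "docs/example_output.png",
    "docs/pipeline.png",
]


def find_replay_artifacts(paths, patterns=REPLAY_ARTIFACT_PATTERNS):
    """Return {artifact_kind: [matching paths]} for kinds with at least one hit.

    Matching is by file name pattern only; file contents are never read.
    """
    hits = {}
    for kind, globs in patterns.items():
        matched = [
            p for p in paths
            if any(fnmatchcase(p.lower(), g) for g in globs)
        ]
        if matched:
            hits[kind] = matched
    return hits


if __name__ == "__main__":
    # Print whichever artifact kinds match the checked tree.
    print(find_replay_artifacts(CHECKED_TREE))
```

A name-based scan like this can only flag candidate files for manual review. It cannot confirm that a matched file is row-bound or immutable, so it would support a metadata gate rather than replace one.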
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -39,6 +39,7 @@ Smoke tests and dry runs are engineering validation, not benchmark claims. | |||||
| | GUARD surgical mitigation | `hold-semantic-shift` | arXiv `2603.00133` / `You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Models` is official code-public memorization-mitigation watch evidence. The `kairanzhao/GUARD` repo exposes `sdv1_500_mem` inference, detection, mask-generation, metric, and vendored `open_clip` code, but the release requires Google Drive benchmark assets and Stable Diffusion/reference model execution rather than shipping checkpoint-bound target identities, immutable row manifests, generated response packets, pre/post GUARD score rows, ROC arrays, metric JSON, retained-utility artifacts, or a no-training verifier. No arXiv source, GitHub archive, Google Drive payload, Stable Diffusion/reference model weight, generated image, mask, checkpoint, script execution, CPU/GPU sidecar, or Platform/Runtime row is selected. See [guard-surgical-mitigation-artifact-gate-20260523.md](guard-surgical-mitigation-artifact-gate-20260523.md). | | ||||||
| | BAF LoRA parameter-space mitigation | `hold-semantic-shift` | arXiv `2605.10439` / `Filtering Memorization from Parameter-Space in Diffusion Models` is a weight-only LoRA memorization-mitigation watch item. The paper proposes Base-Anchored Filtering, a post-hoc, training-free, data-free method that decomposes LoRA updates into spectral channels and suppresses weakly backbone-aligned channels as possible memorization carriers. The public surface is supplementary-code-claim-only: arXiv HTML says code is in supplementary material, but GitHub exact-title/arXiv-id/BAF searches found no official public repository, target LoRA/checkpoint bundle, training-image manifest, member/nonmember rows, generated response packet, per-row score file, ROC array, metric JSON, retained-utility artifact, or verifier. No arXiv source/supplement, LoRA weights, SD base weights, training images, mitigation implementation, CPU/GPU sidecar, or Platform/Runtime row is selected. See [baf-lora-parameter-space-mitigation-gate-20260523.md](baf-lora-parameter-space-mitigation-gate-20260523.md). | | ||||||
| | Broken Memories | `hold-semantic-shift` | arXiv `2605.22050` / `Broken Memories: Detecting and Mitigating Memorization in Diffusion Models with Degraded Generations` is fresh Stable Diffusion memorization detection/mitigation evidence with reported SD `1.4` `AUC > 0.999`, `0.0%` post-mitigation memorization rate, and about `0.01s` overhead. It is not a current per-sample membership row: arXiv metadata and GitHub searches expose no official code, exact prompt/image manifest, generated image packet, internal trace, per-row score file, ROC array, metric JSON, mitigation-decision artifact, or verifier. No arXiv source tarball, Stable Diffusion weights, LAION/Webster assets, implementation-from-paper, CPU/GPU sidecar, or Platform/Runtime row is selected. See [broken-memories-artifact-gate-20260523.md](broken-memories-artifact-gate-20260523.md). | | ||||||
| | LeakyCLIP CLIP inversion | `hold-semantic-shift` | arXiv `2508.00756` / `LeakyCLIP: Extracting Training Data from CLIP` is official code-public CLIP inversion / multimodal privacy evidence. The repository exposes real inversion/refinement code, configs, metrics, scripts, and README instructions, but the audited target is CLIP; Stable Diffusion is only an optional refinement component. The checked public tree has no frozen CLIP/SDXL/VAE hashes, immutable member/nonmember manifests, generated reconstruction packet, per-row membership score file, ROC array, metric JSON, trained embedding-alignment weights, or no-training verifier. No CLIP/SDXL/VAE/SSCD/model/dataset download, inversion/refinement run, CPU/GPU sidecar, or Platform/Runtime row is selected. See [leakyclip-clip-inversion-boundary-gate-20260525.md](leakyclip-clip-inversion-boundary-gate-20260525.md). | | ||||||
|
This table row contains a very long line of text, which harms readability and maintainability of the markdown source. Consider adding line breaks (
|
||||||
| | IAR Privacy Attacks | `hold-code-report-only` | arXiv `2502.02514` / `Privacy Attacks on Image AutoRegressive Models` is image-generation privacy watch-plus with official code and strong reported MIA/DI/extraction claims, including IAR `TPR@FPR=1% = 94.57%`, dataset inference with as few as `4` samples, and `698` extracted samples from `VAR-d30`. The target family is image autoregressive generation rather than current diffusion/latent-image rows. The official repository exposes `main.py`, `environment.yaml`, MIA/DI/memorization analysis scripts, VAR/RAR/MAR configs, and attack code, but no model hashes, immutable ImageNet row manifests, generated sample packet, per-row MIA scores, ROC arrays, metric JSON, DI CSV, memorization CSV, or verifier is committed. No ImageNet/model/upstream-repo download, MIA/DI/extraction run, CPU/GPU sidecar, or Platform/Runtime row is selected. See [iar-privacy-attacks-artifact-gate-20260523.md](iar-privacy-attacks-artifact-gate-20260523.md). | | ||||||
| | Silent Brush / Art Arena | `hold-semantic-shift` | arXiv `2605.17500` / `The Silent Brush: Evaluating Artistic Style Leakage in AI Art Generation` is text-to-image diffusion-adjacent privacy evidence, but the claim is style leakage / copyright evaluation rather than the current per-sample membership contract. The anonymous resource surface exposes code/notebook inventory only: `ArtArena.ipynb`, `README.md`, ET/MD eval and infer scripts, `FT_models.py`, `get_leadger.py`, prep scripts, `CSD/model.py`, `CSD/utils.py`, and figure PDFs. It exposes no target checkpoint hash, immutable member/nonmember artwork manifest, generated image packet, per-row membership score file, ROC array, metric JSON, or ready verifier. No artwork/model/source-tarball download, script execution, CPU/GPU sidecar, or Platform/Runtime row is selected. See [silent-brush-artarena-artifact-gate-20260523.md](silent-brush-artarena-artifact-gate-20260523.md). | | ||||||
| | Trajectory Generation Privacy | `hold-paper-source-only` | arXiv `2605.15246` / `Privacy Evaluation of Generative Models for Trajectory Generation` is cross-domain trajectory/mobility privacy evidence, not a current image or latent-image asset. It evaluates LSTM-TrajGAN, MoveSim, DiffTraj, and Diff-RNTraj; the diffusion trajectory results are near random (`0.5012` and `0.4949` AUC-ROC), while the only clearly positive table value is GAN MoveSim `AUC-ROC = 0.7002`. The checked public surface is TeX/source only, GitHub searches returned no official code or artifact hits, and no model checkpoint, immutable member/nonmember trajectory manifest, generated trajectory packet, per-row score file, ROC array, metric JSON, or verifier is public. No trajectory dataset/model download, attack implementation, CPU/GPU sidecar, or Platform/Runtime row is selected. See [trajectory-generation-privacy-artifact-gate-20260523.md](trajectory-generation-privacy-artifact-gate-20260523.md). | | ||||||
|
|
||||||
This line is very long and difficult to read. The use of backticks for a long paragraph of text is also unconventional. For better maintainability and readability, consider restructuring this information. Instead of a single long line within backticks, you could use a multi-line blockquote or break it down into a nested list. This would make the information much easier to parse and edit.