Python 3.10+ | License: MIT | Backend: OpenAI Vision API
Vision-to-Policy (V2P) is the official artifact repository for the paper:
Rethinking Access-Control Policy Authoring as a Multimodal Challenge
Accepted at SACMAT 2026
This repository implements the direct vision-to-specification translation pipeline described in Section 3 of the paper, where access-control policy diagrams are translated into structured policy specifications using vision-language models (VLMs).
V2P is a model-agnostic pipeline that converts Access Control Directed Acyclic Graph (DAG) images into structured knowledge graphs. Given a diagram depicting users, resources, and policies, the system supports:
- Entity extraction (nodes),
- Relation classification (typed edges),
- End-to-end graph reconstruction (nodes + edges + paths).
These correspond to the experimental pipeline evaluated in:
- Section 3 — Direct VLM Translation Pipeline
- Section 4 — Empirical Framing of Vision-to-Specification Translation
This repository reproduces the two empirical tables in the paper. Each table maps to a single batch script (run from the repository root):
| Paper Table | What it reports | Script to run |
|---|---|---|
| Table 1 — Entity and Relation Recovery (micro-averaged P/R/F1) | End-to-end node + edge recovery across SAG/MAG/DAG, with and without legend | bash run_v1.sh |
| Table 2 — Assignment Directional Structural Error (Misdirection, Mis/FN) | Reversed-edge analysis across SAG/MAG/DAG, with and without legend | bash run_assignment_misdirection_analyzer.sh followed by bash run_make_assignment_misdirection_table.sh |
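Both tables cover the three structural regimes, Sparse (SAG), Moderate (MAG), and Dense (DAG) Authorization Graphs, under both legend conditions. In regime names, DAG stands for Dense Authorization Graph, not the directed acyclic graph format of the input diagrams.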
Step-by-step commands, per-regime invocations, and the underlying CLI flags are documented in Reproducing Results below.
The results in the paper were generated under the following environment. We recommend reviewers use a comparable setup; deviations (notably in the OpenAI model version) can change absolute numbers but should preserve the qualitative trends in Tables 1 and 2.
Hardware
- CPU: standard x86_64 workstation (e.g., Intel Core i7 / Xeon, 8+ cores)
- Memory: 16 GB RAM
- GPU: not required — all inference runs through the OpenAI Vision API; no local model weights
- Network: a stable connection to api.openai.com is required for the duration of the run
Software
- OS: Ubuntu 22.04 LTS (also tested on macOS 14)
- Python: 3.10+
- Python dependencies: pinned in requirements.txt (openai>=1.0, pydantic>=2.0, python-dotenv>=1.0, Pillow>=10.0)
- VLM backend: gpt-5-nano (default; configurable via --model)
- Image detail: high for Table 1 relation extraction; low is supported for cost-sensitive runs
Estimated wall-clock runtime
End-to-end relation extraction takes approximately 3 minutes per image (case) with gpt-5-nano at --image_detail high. Total runtime scales linearly with the number of images in each regime × legend combination, and inversely with --workers for parallel execution.
| Step | Dataset coverage | Approx. runtime |
|---|---|---|
| bash run_v1.sh (Table 1) | 6 regime × legend combinations, 130 images total (SAG: 51, MAG: 9, DAG: 5, each ×2 legend conditions) | ~3 min per case; ~6.5 hours sequential, ~1.6 hours with --workers 4 |
| bash run_assignment_misdirection_analyzer.sh (Table 2, step 1) | Post-hoc analysis of predicted JSON from step above | < 5 min total (local analysis only, no API calls) |
| bash run_make_assignment_misdirection_table.sh (Table 2, step 2) | Aggregation across the 6 CSV outputs | < 1 min (no API calls) |
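The runtime figures in the table follow from simple arithmetic, which the sketch below reproduces. It assumes the idealized scaling described above (linear in images, inverse in workers) and ignores rate limits and retries; the function name `estimate_runtime_hours` is hypothetical and not part of the repository.

```python
def estimate_runtime_hours(num_images: int,
                           minutes_per_image: float = 3.0,
                           workers: int = 1) -> float:
    """Idealized wall-clock estimate: linear in images, inverse in workers."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return num_images * minutes_per_image / workers / 60.0


# Image counts per regime as listed in the table; each regime is run
# once with a legend and once without.
images_per_regime = {"SAG": 51, "MAG": 9, "DAG": 5}
total_images = 2 * sum(images_per_regime.values())

print(total_images)                                      # 130
print(estimate_runtime_hours(total_images))              # 6.5
print(estimate_runtime_hours(total_images, workers=4))   # 1.625
```

Actual runs may be slower than this estimate if the API throttles parallel requests.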
Note: VLM endpoints are non-deterministic, so absolute numbers may shift slightly across runs; qualitative trends across SAG/MAG/DAG and legend conditions are preserved.
The repository structure directly corresponds to the architecture in Figure 2 (Direct VLM Pipeline):
| Paper Component | Implementation |
|---|---|
| VLM Translation Pipeline | access_control_run.py, src/core_processor.py |
| Prompt + Structured Output | src/access_prompt.py |
| Processing Strategies | src/processing_strategies.py |
| Evaluation Metrics | src/evaluation.py, src/eval_metric.py |
| Assignment Misdirection Analysis | assignment_misdirection_analyzer.py |
Rather than using multi-stage diagram parsers, V2P performs single-pass structured generation, producing JSON outputs that explicitly enumerate:
- nodes,
- typed relations,
- policy paths.
This design makes structural errors (omissions, misdirection, relation failures) directly observable and measurable.
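To make the shape of such an output concrete, here is a minimal sketch of a specification holding nodes, typed relations, and policy paths. The class and field names (`Node`, `Relation`, `PolicyGraph`, `kind`, `relation_type`) and the example entities are assumptions for illustration; the actual schema is defined in src/access_prompt.py and may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # hypothetical labels, e.g. "user", "object", "policy_class"


@dataclass
class Relation:
    source: str
    target: str
    relation_type: str  # e.g. "assignment"


@dataclass
class PolicyGraph:
    nodes: list[Node] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)
    paths: list[list[str]] = field(default_factory=list)

    def dangling_relations(self) -> list[Relation]:
        """Relations whose endpoints are missing from the node list."""
        known = {node.name for node in self.nodes}
        return [r for r in self.relations
                if r.source not in known or r.target not in known]


graph = PolicyGraph(
    nodes=[Node("alice", "user"), Node("PC1", "policy_class")],
    relations=[Relation("alice", "PC1", "assignment"),
               Relation("alice", "records", "assignment")],  # "records" is not a node
    paths=[["alice", "PC1"]],
)

print(json.dumps(asdict(graph), indent=2))
print([(r.source, r.target) for r in graph.dangling_relations()])
```

Because every node and edge is enumerated explicitly, an omission or inconsistency such as the dangling edge above can be detected with a plain set lookup.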
- Sherifdeen Lawal, University of Texas at San Antonio
- Xingmeng Zhao, University of Colorado
- Enrique Navarroespino, University of Texas at San Antonio
- Anthony Rios, University of Texas at San Antonio
- Ram Krishnan, University of Texas at San Antonio
flowchart LR
images["DAG images<br/>(PNG / JPEG)"] --> entry[access_control_run.py]
entry --> cli[src/cli.py]
cli --> engine[src/core_processor.py]
engine --> strategies[src/processing_strategies.py]
strategies --> openai["OpenAI Vision API"]
openai --> strategies
strategies --> output["JSON results<br/>experiments/"]
The pipeline supports three primary modes:
| Mode (--method) | What it does |
|---|---|
| extract_entities | Identify nodes (users, objects, policy classes) from the image. |
| relation_classification | Binary relation check for each entity pair (requires entity list). |
| relation_extraction | End-to-end: extract nodes + edges from the image in one pass. This is the mode used to populate Table 1. |
Additional experimental methods (enumerate_paths, path_generation, extract_relation) are also available via the CLI.
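The relation_classification mode checks each entity pair, so the number of checks grows quadratically with the entity list. The sketch below shows one way such pairs could be enumerated. It assumes ordered pairs, since edge direction matters for the misdirection analysis; the helper `ordered_entity_pairs` is hypothetical and is not the code in src/entity_pair_generator.py.

```python
from itertools import permutations


def ordered_entity_pairs(entities):
    """Return every ordered (source, target) pair of distinct entities."""
    unique = list(dict.fromkeys(entities))  # drop duplicates, keep order
    return list(permutations(unique, 2))


pairs = ordered_entity_pairs(["u1", "o1", "pc1", "u1"])
print(len(pairs))   # 6, i.e. n * (n - 1) for n = 3 unique entities
print(pairs[:2])    # [('u1', 'o1'), ('u1', 'pc1')]
```

With n entities this yields n × (n − 1) candidate pairs, which is why a flag such as --subset_size is useful for quick tests.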
Input: PNG or JPEG images of Access Control DAG graphs.
Images can include or exclude a legend (--with_legend / --no_legend).
Output: JSON files containing extracted entities, classified relations, or full knowledge graphs, saved under the output directory.
Dataset layout (when using the bundled SubgraphsWithTriples data):
datasets/
GroundTruthGraphsImages/ # --input points here
SAG_with_legend/
MAG_with_legend/
DAG_with_legend/
SAG_wo_legend/
MAG_wo_legend/
DAG_wo_legend/
GroundTruthGraphsJSON/ # ground-truth (auto-resolved)
SAG/ # GT for SAG_with_legend + SAG_wo_legend
MAG/ # GT for MAG_with_legend + MAG_wo_legend
DAG/ # GT for DAG_with_legend + DAG_wo_legend
PredictedPathGenerationJSON/
SAG_with_legend/
MAG_with_legend/
DAG_with_legend/
SAG_wo_legend/
MAG_wo_legend/
DAG_wo_legend/
Place your own images in datasets/ or pass an explicit --input path.
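The layout above implies a naming convention: the image folder name encodes the regime and the legend condition, and both legend variants share one ground-truth folder. A minimal sketch of that mapping follows. The helper names are hypothetical, and the repository's own auto-resolution logic may work differently.

```python
from pathlib import Path

REGIMES = ("SAG", "MAG", "DAG")


def resolve_ground_truth_dir(image_dir, datasets_root="datasets"):
    """Map e.g. .../SAG_wo_legend to <datasets_root>/GroundTruthGraphsJSON/SAG."""
    folder = Path(image_dir).name
    regime = folder.split("_", 1)[0]
    if regime not in REGIMES:
        raise ValueError(f"cannot infer regime from folder name: {folder!r}")
    return Path(datasets_root) / "GroundTruthGraphsJSON" / regime


def has_legend(image_dir) -> bool:
    """True for *_with_legend folders, False for *_wo_legend folders."""
    return Path(image_dir).name.endswith("_with_legend")


image_dir = "datasets/GroundTruthGraphsImages/SAG_wo_legend"
print(resolve_ground_truth_dir(image_dir).as_posix())  # datasets/GroundTruthGraphsJSON/SAG
print(has_legend(image_dir))                           # False
```

For custom image folders that do not follow this naming scheme, passing --gt_input explicitly avoids relying on any inference.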
git clone git@github.com:UTSA-ICS/Rethinking-Access-Control-Policy-Authoring-as-a-Multimodal-Challenge.git
cd Rethinking-Access-Control-Policy-Authoring-as-a-Multimodal-Challenge
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt

Dependencies (see requirements.txt):
- openai >= 1.0
- pydantic >= 2.0
- python-dotenv >= 1.0
- Pillow >= 10.0
Copy the example environment file and fill in your API key:
cp .env.example .env

Edit .env:

OPENAI_API_KEY="<OPENAI_API_KEY>"

Security: .env is in .gitignore. Never commit real API keys. Use placeholders (<OPENAI_API_KEY>, <YOUR_NAME>, etc.) in documentation and pull requests.
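A common failure is running the pipeline with the placeholder still in .env. The sketch below shows a pre-flight check that could catch this before any API call is made. It uses only the standard library and takes the environment as a mapping; the function names are hypothetical, and the repository itself loads .env through python-dotenv.

```python
import os
from typing import Mapping, Optional

PLACEHOLDER = "<OPENAI_API_KEY>"


def find_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the configured key, or None if it is missing or still a placeholder."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip().strip('"')
    if not key or key == PLACEHOLDER:
        return None
    return key


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Like find_api_key, but fail early with an actionable message."""
    key = find_api_key(env)
    if key is None:
        raise RuntimeError(
            "OpenAI API key not provided: set OPENAI_API_KEY in .env "
            "or export it in your shell."
        )
    return key


# Explicit mappings are used here so the example does not depend on the real environment.
print(find_api_key({"OPENAI_API_KEY": '"<OPENAI_API_KEY>"'}))   # None
print(require_api_key({"OPENAI_API_KEY": "dummy-key-for-testing"}))
```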
The two paper tables are reproduced by two independent workflows. Run from the repository root (Rethinking-Access-Control-Policy-Authoring-as-a-Multimodal-Challenge/).
Table 1 reports micro-averaged precision, recall, and F1 for entity recovery and relation recovery across the three structural regimes (SAG/MAG/DAG) under both legend conditions. It is produced by the end-to-end relation extraction pipeline (--method relation_extraction), which emits a single JSON specification per image enumerating nodes, typed edges, and paths.
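For readers unfamiliar with micro-averaging, the sketch below pools true positives, false positives, and false negatives over all images before computing the metrics once. It assumes exact set matching of predicted and ground-truth items; the function `micro_prf` and the example edges are illustrative and are not taken from src/evaluation.py.

```python
def micro_prf(samples):
    """Micro-average: pool TP/FP/FN over all images, then compute P/R/F1 once."""
    tp = fp = fn = 0
    for predicted, gold in samples:
        predicted, gold = set(predicted), set(gold)
        tp += len(predicted & gold)   # items recovered correctly
        fp += len(predicted - gold)   # items predicted but not in ground truth
        fn += len(gold - predicted)   # ground-truth items that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Two made-up images, each given as (predicted edges, ground-truth edges).
samples = [
    ({("u1", "a1"), ("a1", "pc1"), ("u2", "a1")}, {("u1", "a1"), ("a1", "pc1")}),
    ({("o1", "a2")}, {("o1", "a2"), ("a2", "pc1"), ("o2", "a2")}),
]
precision, recall, f1 = micro_prf(samples)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.75 0.6 0.667
```

Because counts are pooled, images with many edges weigh more heavily than small ones, which matters when regimes differ in graph density.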
Batch run (recommended for full reproduction):
bash run_v1.sh

This script iterates over all six regime × legend combinations, writes per-image predictions under experiments/relation_extraction/<regime>_<legend>/, and computes the micro-averaged metrics that populate Table 1.
Per-regime invocation (for inspection or partial reruns):
# SAG, with legend
python access_control_run.py \
--input datasets/GroundTruthGraphsImages/SAG_with_legend \
--output experiments/relation_extraction/SAG_with_legend \
--method relation_extraction \
--model gpt-5-nano --image_detail high --few_shot zero

# MAG, without legend
python access_control_run.py \
--input datasets/GroundTruthGraphsImages/MAG_wo_legend \
--output experiments/relation_extraction/MAG_wo_legend \
--no_legend --method relation_extraction \
--model gpt-5-nano --image_detail high --few_shot zero

Swap SAG/MAG/DAG and with_legend/wo_legend to reproduce the remaining four rows, passing --no_legend for the wo_legend directories.
Table 2 quantifies reversed-edge errors among predicted assignment relations, normalized by assignment false negatives (Mis/FN). It is a post-hoc analysis of predictions already produced for Table 1 (no additional API calls).
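The Mis/FN ratio can be illustrated with a small sketch. It assumes that a misdirected edge is a predicted assignment (s, t) that is absent from the ground truth while its reverse (t, s) is a missed ground-truth assignment; the exact definition used by assignment_misdirection_analyzer.py may differ, and the function name and example edges are hypothetical.

```python
def assignment_misdirection(predicted, gold):
    """Count reversed assignment edges and normalize by false negatives."""
    predicted, gold = set(predicted), set(gold)
    false_negatives = gold - predicted
    # A predicted edge is misdirected if flipping it recovers a missed gold edge.
    misdirected = {(s, t) for (s, t) in predicted - gold
                   if (t, s) in false_negatives}
    ratio = len(misdirected) / len(false_negatives) if false_negatives else 0.0
    return len(misdirected), len(false_negatives), ratio


gold = {("u1", "ua1"), ("ua1", "pc1"), ("u2", "ua1")}
predicted = {("ua1", "u1"), ("ua1", "pc1")}   # first edge is reversed, u2 edge is missing

mis, fn, ratio = assignment_misdirection(predicted, gold)
print(mis, fn, ratio)  # 1 2 0.5
```

Under this definition, a ratio of 0.5 means that half of the missed assignments were detected by the model but drawn in the wrong direction.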
Step 1 — Per-regime misdirection analysis:
bash run_assignment_misdirection_analyzer.sh

This iterates over the six regime × legend combinations and writes per-sample CSVs to experiments/assignment_misdirection_results/<regime>_<legend>/.
Step 2 — Aggregate into the table:
bash run_make_assignment_misdirection_table.sh

This consumes the per-sample CSVs and emits the aggregated Mis/FN values reported in Table 2.
The following modes are available for ablations and inspection but are not required to reproduce the paper's tables:
# Entity-only extraction (Section 3 ablation)
python access_control_run.py --method extract_entities
# Binary relation classification given a fixed entity list
python access_control_run.py \
--input datasets/GroundTruthGraphsImages/DAG_with_legend \
--output experiments/relation_classification/DAG_with_legend \
--entities_input datasets/GroundTruthGraphsJSON/DAG \
--gt_input datasets/GroundTruthGraphsJSON/DAG \
--method relation_classification \
--relation_source ground_truth \
--model gpt-5-nano --image_detail low

Per-regime misdirection inspection (single regime, useful for debugging):
python3 assignment_misdirection_analyzer.py \
--dir datasets/GroundTruthGraphsImages/SAG_with_legend \
--glob "*path_generation.json" \
--out_prefix assign_misdirection_with \
--out_dir experiments/assignment_misdirection_results/SAG_with_legend

python3 make_assignment_misdirection_table.py \
--with_sag /path/to/sag_assign_misdirection_with_per_sample.csv \
--wo_sag /path/to/sag_assign_misdirection_wo_per_sample.csv \
--out_csv /path/to/assign_misdirection_results.csv

python access_control_run.py --help

| Argument | Default | Description |
|---|---|---|
| --input | datasets/ | Image file or directory. |
| --output | experiments/ | Output file or directory. |
| --method | extract_entities | Processing mode (see table above). |
| --model | gpt-5-nano | Vision model (gpt-5-nano, gpt-5-mini, gpt-4o-mini, gpt-4o). |
| --image_detail | low | low (cost-efficient, ~2.8k tokens) or high (~54k tokens). |
| --few_shot | zero | zero or few (Context7-style few-shot). |
| --workers | 4 | Parallel workers for batch processing (1 = sequential). |
| --relation_source | ground_truth | Entity source for relation_classification: ground_truth or predicted. |
| --entities_input | (none) | Entity folder for relation_classification. |
| --gt_input | (none) | Explicit ground-truth directory for evaluation. |
| --with_legend / --no_legend | with | Legend handling. |
| --subset_size | (none) | Limit to N random relations per graph (testing). |
| --comprehensive_eval | off | Broader evaluation across all possible relations. |
| --fuzzy_matching | off | Fuzzy entity name matching in evaluation. |
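As an illustration of what fuzzy entity name matching can mean in practice, the sketch below normalizes names and compares them with the standard library's `difflib.SequenceMatcher`. The normalization rule, the 0.85 threshold, and the function names are assumptions for this example only; the behavior behind --fuzzy_matching is defined in the repository's evaluation code and may differ.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def fuzzy_match(predicted: str, gold_names, threshold: float = 0.85):
    """Return the most similar ground-truth name, or None if below the threshold."""
    best_name, best_score = None, 0.0
    target = normalize(predicted)
    for gold in gold_names:
        score = SequenceMatcher(None, target, normalize(gold)).ratio()
        if score > best_score:
            best_name, best_score = gold, score
    return best_name if best_score >= threshold else None


gold_names = ["PolicyClass1", "User1"]
print(fuzzy_match("Policy_Class 1", gold_names))  # PolicyClass1
print(fuzzy_match("Printer", gold_names))         # None
```

A lenient matcher like this raises recall for entities whose labels the model transcribes with minor formatting differences, at the cost of occasionally accepting a wrong but similar name.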
Access-Control-Policy/
├── README.md
├── access_control_run.py
├── assignment_misdirection_analyzer.py
├── datasets
│ ├── GroundTruthGraphsImages
│ ├── GroundTruthGraphsJSON
│ ├── PredictedGraphsJSON
│ └── test.md
├── experiments
│ ├── assignment_misdirection_results
│ ├── relation_classification
│ ├── relation_extraction
│ └── test.md
├── logs
├── make_assignment_misdirection_table.py
├── performance_results.csv
├── requirements.txt
├── run_assignment_misdirection_analyzer.sh
├── run_make_assignment_misdirection_table.sh
├── run_v1.sh
└── src
├── __init__.py
├── __pycache__
├── access_prompt.py
├── cli.py
├── config.py
├── core_processor.py
├── entity_pair_generator.py
├── eval_metric.py
├── evaluation.py
├── file_utils.py
└── processing_strategies.py
| Problem | Solution |
|---|---|
| OpenAI API key not provided | Set OPENAI_API_KEY in .env or export it in your shell. |
| ModuleNotFoundError: No module named 'src' | Run from the repository root (the directory that contains access_control_run.py). |
| Wrong output directory | Set --output explicitly; the default auto-nests under experiments/<method>/. |
| Truncated model responses | Increase max_tokens in src/config.py → APIConfig. |
| Numbers differ slightly from the paper | OpenAI VLM endpoints are non-deterministic; small variations are expected. Qualitative trends across SAG/MAG/DAG and legend conditions should be preserved. |
- Never commit .env, API keys, or paths that reveal personal information.
- Use these placeholders in all shared docs: <OPENAI_API_KEY>, <YOUR_NAME>, <YOUR_ORG>, <YOUR_EMAIL>.
- .env is listed in .gitignore.
This project is licensed under the MIT License.
See the LICENSE file for details.
@inproceedings{lawal2026v2p,
title = {Rethinking Access-Control Policy Authoring as a Multimodal Challenge},
author = {Lawal, Sherifdeen and Zhao, Xingmeng and Navarroespino, Enrique and Rios, Anthony and Krishnan, Ram},
booktitle = {Proceedings of the ACM Symposium on Access Control Models and Technologies (SACMAT)},
year = {2026},
note = {Artifact: Vision-to-Policy (V2P)},
url = {https://github.com/UTSA-ICS/Rethinking-Access-Control-Policy-Authoring-as-a-Multimodal-Challenge}
}