Perf/secrets scanner performance #354
Open
shreelshah12 wants to merge 2 commits into
…xed sleeps)

The SAT Secrets Scanner could run ~14h and time out on large or locked-down accounts. Address the bottlenecks:

- Batch TruffleHog: materialize a chunk of notebooks and scan the whole chunk in one invocation instead of two subprocesses per notebook.
- Parallelize I/O: thread-pool the 403-fallback workspace traversal, notebook export/FUSE copy, and per-cluster config fetch + scan.
- Skip the per-leaf get-status in the workspace/list fallback when the list response already carries modified_at; otherwise fetch in parallel.
- Remove the unconditional 10s inter-page and per-cluster sleeps; rate limiting is now reactive (429-only exponential backoff with retry).
- Scan workspaces concurrently and run notebook+cluster scans in parallel; run_ids are pre-allocated sequentially to avoid racing on SELECT max(runID). Configurable via secrets_max_parallel_workspaces.
- Raise inner notebook.run timeout 1h -> 4h and add an 8h job-level timeout (DABS + Terraform) so runs finish or fail fast.

Co-authored-by: Isaac
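For reviewers, the reactive rate limiting described in this commit could look roughly like the sketch below. It is not the code from the diff: `RateLimited` and `call_with_backoff` are hypothetical names, and the retry count and delays are illustrative assumptions. It shows that a call that is never throttled pays no sleep at all, while a throttled call waits exponentially longer between retries.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(); sleep and retry only when it raises RateLimited.

    Any other exception propagates immediately, and a successful call
    returns without sleeping.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted; surface the throttling error
            # Exponential delay capped at max_delay, plus up to 10% jitter
            # so parallel workers do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable only so the behaviour can be checked without waiting in real time.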
…llisions

When multiple workspaces are scanned concurrently, every child notebook shares the driver-local filesystem. A fixed shared SCAN_BATCH_DIR let one workspace's rmtree/writes clobber another's in-flight batch and cross-wire finding attribution. Use tempfile.mkdtemp() per chunk and clean it up in a finally block so concurrent and sequential scans can never collide.

Co-authored-by: Isaac
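A minimal sketch of the per-chunk temp-directory pattern this commit describes, under stated assumptions: `scan_chunk` and `scan_dir_fn` are hypothetical names, the notebooks are assumed to arrive as a flat mapping of file name to source text, and `scan_dir_fn` stands in for the single TruffleHog invocation over the directory. The actual change may be structured differently.

```python
import os
import shutil
import tempfile


def scan_chunk(notebooks, scan_dir_fn):
    """Materialize one chunk into a private temp dir, scan it, always clean up.

    notebooks: dict of flat file name -> source text (assumed shape).
    scan_dir_fn: callable taking the directory path and returning findings.
    """
    # mkdtemp creates a uniquely named directory, so concurrent chunks
    # never share a path the way a fixed SCAN_BATCH_DIR would.
    batch_dir = tempfile.mkdtemp(prefix="secrets_scan_")
    try:
        for name, source in notebooks.items():
            with open(os.path.join(batch_dir, name), "w", encoding="utf-8") as fh:
                fh.write(source)
        return scan_dir_fn(batch_dir)
    finally:
        # Runs on success and on failure, so no batch is left behind.
        shutil.rmtree(batch_dir, ignore_errors=True)
```

Because each call owns its directory for its whole lifetime, the cleanup in one scan cannot remove files another scan is still reading.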
Description
The scanner now batches TruffleHog over a chunk of notebooks (one pass instead of two subprocesses per file), parallelizes notebook/cluster I/O and scans workspaces concurrently, drops the fixed 10s inter-page/per-cluster sleeps for 429-only backoff, and skips the per-notebook get-status in the 403 fallback. Also raises the inner timeout to 4h and adds an 8h job timeout.
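The concurrent workspace scan with pre-allocated run IDs could be sketched as follows. This is an illustration, not the PR's code: `scan_workspaces`, `scan_fn`, and `last_run_id` are hypothetical names, and `max_parallel` is assumed to play the role of the `secrets_max_parallel_workspaces` setting.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_workspaces(workspace_ids, last_run_id, scan_fn, max_parallel=4):
    """Scan workspaces concurrently with run ids assigned up front.

    last_run_id: the highest run id already in use (assumed to be read
    once, before any worker starts).
    scan_fn: callable (workspace_id, run_id) -> result.
    """
    # Allocate ids sequentially in the calling thread, so no two workers
    # can read the same maximum and pick the same next id.
    run_ids = {ws: last_run_id + i
               for i, ws in enumerate(workspace_ids, start=1)}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {ws: pool.submit(scan_fn, ws, run_ids[ws])
                   for ws in workspace_ids}
        # result() re-raises any exception from the worker thread.
        return {ws: fut.result() for ws, fut in futures.items()}
```

Allocating the IDs before submitting any work keeps the only shared, order-sensitive step single-threaded, which is the point of the pre-allocation described in the commit message.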
Type of Change