[FIX] Unblock OSS first-time setup - sample envs, host-gateway, ollama hint #2107
- backend/sample.env: set INTERNAL_SERVICE_API_KEY=dev-internal-key-123 so
workers' internal API calls don't 500 against backend middleware. Add
TEMPORARY_REMOTE_STORAGE alongside PERMANENT_REMOTE_STORAGE.
- workers/sample.env: mirror PERMANENT_REMOTE_STORAGE, TEMPORARY_REMOTE_STORAGE,
REMOTE_PROMPT_STUDIO_FILE_PATH. Without these the executor / ide-callback
workers crash at json.loads("") on first Prompt Studio index.
- unstract/sdk1 file_storage/env_helper.py: raise FileStorageError with a
clear message when the storage env var is unset or invalid JSON, instead
of letting json.loads("") raise an inscrutable JSONDecodeError.
- adapters/llm1 + embedding1 ollama.json: fix typo
docker.host.internal -> host.docker.internal in the Base URL hint.
- docker/docker-compose.yaml: extract a YAML anchor x-host-gateway and
apply it (<<: *host_gateway) to backend, prompt-service, runner,
celery-*, and every worker-* service. Before this only backend and
prompt-service had extra_hosts, so adapter Test passed in
prompt-service but real prompt execution from the worker pool failed
with EAI_NONAME against host.docker.internal.
- run-platform.sh: fail fast with actionable hint when `docker info`
can't reach the daemon (missing docker group membership, etc.).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Br691aZbhfcB4xdswrUjuw
Walkthrough
Updates sample environment files, strengthens remote storage JSON validation, centralizes Docker host-gateway wiring, adds a Docker daemon check to startup, and corrects Ollama hostname examples.
| Filename | Overview |
|---|---|
| backend/sample.env | Adds dev default for INTERNAL_SERVICE_API_KEY and TEMPORARY_REMOTE_STORAGE; removes stray quotes from REMOTE_PROMPT_STUDIO_FILE_PATH. |
| workers/sample.env | Adds PERMANENT_REMOTE_STORAGE, TEMPORARY_REMOTE_STORAGE, and REMOTE_PROMPT_STUDIO_FILE_PATH required by executor/ide-callback workers for Prompt Studio flows. |
| docker/docker-compose.yaml | Introduces x-host-gateway YAML anchor with an inline warning about merge-key shadowing, and applies it to 16 services; removes now-redundant inline extra_hosts from backend and prompt-service. |
| run-platform.sh | Adds docker info pre-flight check with OS-specific (Linux/macOS/Windows) remediation hints; fails fast before any compose actions when the daemon is unreachable. |
| unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py | Replaces the silent json.loads("") crash with layered validation (unset/empty, invalid JSON, non-object, missing provider, non-dict credentials) all raising FileStorageError with actionable messages. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/ollama.json | Fixes typo in Base URL hint: docker.host.internal → host.docker.internal. |
| unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/ollama.json | Same typo fix as the LLM adapter: docker.host.internal → host.docker.internal. |
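
The `run-platform.sh` row above describes a `docker info` pre-flight check with OS-specific remediation hints. The real check lives in a shell script; the Python sketch below only illustrates the shape of that logic. The function names and the hint wording are hypothetical and are not taken from the script.

```python
import platform
import shutil
import subprocess


def remediation_hint(system: str) -> str:
    """Return an OS-specific hint. The wording is illustrative, not the script's."""
    hints = {
        "Linux": "Check that the Docker service is running and that your user "
        "is in the 'docker' group.",
        "Darwin": "Start Docker Desktop and wait for it to finish starting.",
        "Windows": "Start Docker Desktop and make sure its engine is running.",
    }
    return hints.get(system, "Make sure the Docker daemon is running and reachable.")


def docker_daemon_reachable() -> bool:
    """True when `docker info` exits with status 0.

    In this sketch a missing docker binary also counts as unreachable.
    """
    if shutil.which("docker") is None:
        return False
    result = subprocess.run(
        ["docker", "info"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


def preflight() -> None:
    """Fail fast, before any compose action, when the daemon cannot be reached."""
    if not docker_daemon_reachable():
        hint = remediation_hint(platform.system())
        raise SystemExit(f"Cannot reach the Docker daemon. {hint}")
```

Keeping the hint selection in a pure function makes the OS branching easy to check without a running daemon.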
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["EnvHelper.get_storage(storage_type, env_name)"] --> B{env var set\nand non-empty?}
B -- No --> C["raise FileStorageError\n'unset or empty'"]
B -- Yes --> D{valid JSON?}
D -- No --> E["raise FileStorageError\n'not valid JSON'"]
D -- Yes --> F{is a dict?}
F -- No --> G["raise FileStorageError\n'must be JSON object'"]
F -- Yes --> H{provider key\npresent & valid?}
H -- KeyError/ValueError --> I["raise FileStorageError\n'invalid storage config'"]
H -- OK --> J{credentials\nis a dict?}
J -- No --> K["raise FileStorageError\n'credentials must be JSON object'"]
J -- Yes --> L{storage_type?}
L -- PERMANENT --> M["PermanentFileStorage(provider, **credentials)"]
L -- SHARED_TEMPORARY --> N["SharedTemporaryFileStorage(provider, **credentials)"]
L -- other --> O["raise NotImplementedError"]
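
The validation order in the flowchart can be sketched in Python as follows. This is a minimal illustration, not the code in `env_helper.py`: the `Provider` enum, the `"provider"` and `"credentials"` key names, and the message wording are assumptions, and `FileStorageError` here is a local stand-in for the SDK class.

```python
import json
import os
from enum import Enum


class FileStorageError(Exception):
    """Local stand-in for the SDK's FileStorageError."""


class Provider(Enum):
    """Hypothetical stand-in for FileStorageProvider."""

    LOCAL = "local"
    MINIO = "minio"


def load_storage_config(env_name: str) -> tuple[Provider, dict]:
    """Validate the env var layer by layer, raising FileStorageError at each step."""
    raw = os.environ.get(env_name)
    if not raw:  # covers both unset and empty string
        raise FileStorageError(f"Env var '{env_name}' is unset or empty")
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as e:
        raise FileStorageError(f"Env var '{env_name}' is not valid JSON") from e
    if not isinstance(config, dict):
        raise FileStorageError(f"Env var '{env_name}' must be a JSON object")
    try:
        provider = Provider(config["provider"])  # assumed key name
    except (KeyError, ValueError) as e:
        raise FileStorageError(f"Invalid storage config in '{env_name}'") from e
    credentials = config.get("credentials", {})  # assumed key name
    if not isinstance(credentials, dict):
        raise FileStorageError(
            f"'credentials' in '{env_name}' must be a JSON object"
        )
    return provider, credentials
```

Each failure mode maps to one branch of the flowchart, so a caller only ever has to handle `FileStorageError`.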
Reviews (5): Last reviewed commit: "[FIX] Split long f-string in EnvHelper.g..."
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py`:
- Around line 41-49: The code at lines 48-49 where FileStorageProvider is
instantiated and credentials are accessed can still throw raw TypeError or
ValueError exceptions (for example, if file_storage_creds is not a dictionary,
or if the provider value is invalid) which bypass the FileStorageError handling.
Wrap the FileStorageProvider instantiation and credentials dictionary access in
a try-except block that catches TypeError and ValueError, and re-raise them as
FileStorageError with a descriptive message that includes the
EnvHelper.ENV_CONFIG_FORMAT reference, similar to how the JSONDecodeError is
handled. This ensures all configuration validation errors are normalized to
FileStorageError for consistent error handling.
📒 Files selected for processing (7)
- backend/sample.env
- docker/docker-compose.yaml
- run-platform.sh
- unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/ollama.json
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/ollama.json
- unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py
- workers/sample.env
- env_helper.py: validate parsed JSON is a dict; normalize KeyError /
TypeError / ValueError from provider construction into FileStorageError with
the remediation message, so callers never see raw json/dict exceptions on a
misconfigured env var.
- docker-compose.yaml: extend the x-host-gateway anchor's docstring to warn
that YAML merge does NOT concatenate lists: adding a sibling extra_hosts to a
service shadows the anchor entry rather than appending. Future contributors
must inline all entries instead.
- run-platform.sh: branch the docker-daemon remediation hints by OS (Linux /
macOS / Windows / other), so macOS users see the Docker-Desktop hint instead
of irrelevant getent/usermod commands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Br691aZbhfcB4xdswrUjuw
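
The merge-key warning in the commit message above can be illustrated without a YAML parser. The sketch below models `<<: *host_gateway` as a shallow dict merge in which keys written on the service win over keys from the anchor. The service contents and the `myhost:10.0.0.5` entry are made up for illustration; only the `host.docker.internal:host-gateway` entry comes from the PR.

```python
# Anchor contents as the PR says `docker compose config` renders them.
HOST_GATEWAY = {"extra_hosts": ["host.docker.internal:host-gateway"]}


def apply_merge_key(anchor: dict, service: dict) -> dict:
    """Model `<<: *anchor`: a shallow merge where explicit service keys shadow the anchor."""
    return {**anchor, **service}


# A service without its own extra_hosts inherits the anchor's entry.
inherited = apply_merge_key(HOST_GATEWAY, {"image": "example/worker"})

# A sibling extra_hosts replaces the anchor's list; the two lists are not concatenated.
shadowed = apply_merge_key(
    HOST_GATEWAY,
    {"image": "example/worker", "extra_hosts": ["myhost:10.0.0.5"]},
)

# To keep both entries, the service has to list them all explicitly.
inlined = apply_merge_key(
    HOST_GATEWAY,
    {"extra_hosts": ["host.docker.internal:host-gateway", "myhost:10.0.0.5"]},
)
```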
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py (1)
65-66: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Use `logger.exception` so the validation failure keeps its traceback.
Line 65 is inside the `except` block and is currently flagged by SonarCloud; logging the exception once preserves context while keeping the normalized `FileStorageError` behavior unchanged.
Proposed fix
- logger.error(f"Invalid storage configuration in env: {str(e)}")
- logger.error(f"The configuration format is {EnvHelper.ENV_CONFIG_FORMAT}")
+ logger.exception(
+     "Invalid storage configuration in env var '%s'. Expected: %s",
+     env_name,
+     EnvHelper.ENV_CONFIG_FORMAT,
+ )
Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py`:
- Around line 65-66: In the except block of the EnvHelper class where the
invalid storage configuration error is being logged, replace the logger.error
call that logs the exception message (the one with str(e)) with logger.exception
instead. This will preserve the full traceback context for debugging while
keeping the FileStorageError handling behavior intact. Keep the second
logger.error call that displays the ENV_CONFIG_FORMAT as is.
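
The difference the nitpick relies on can be checked with the standard `logging` module alone. The sketch below is a standalone demonstration, not the SDK code; the logger names and the message text are arbitrary.

```python
import io
import logging


def capture(log_call_name: str) -> str:
    """Log a caught ValueError via logger.<log_call_name> and return what was written."""
    stream = io.StringIO()
    logger = logging.getLogger(f"demo.{log_call_name}")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        try:
            raise ValueError("unknown provider")
        except ValueError as e:
            # logger.error records only the message;
            # logger.exception also attaches the active traceback.
            getattr(logger, log_call_name)("Invalid storage configuration: %s", e)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


with_traceback = capture("exception")
without_traceback = capture("error")
```

`logger.exception` logs at ERROR level with `exc_info` set, which is why it is only meaningful inside an `except` block.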
📒 Files selected for processing (3)
- docker/docker-compose.yaml
- run-platform.sh
- unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py
🚧 Files skipped from review as they are similar to previous changes (2)
- run-platform.sh
- docker/docker-compose.yaml
jaseemjaskp left a comment
Reviewed via PR Review Toolkit (code-reviewer, silent-failure-hunter, comment-analyzer, code-simplifier). Verified independently: the x-host-gateway anchor expands into all 16 services (docker compose config renders host.docker.internal:host-gateway for each), and the get_storage exception-type change is caller-safe — none of the ~25 callers in backend/prompt-service catch KeyError/JSONDecodeError from it. No blocking issues; the change cleanly unblocks first-time OSS setup. A few non-blocking nits below.
- env_helper.py: narrow except (KeyError, ValueError) to the
FileStorageProvider resolution only; constructor / fsspec exceptions now
propagate untouched instead of being mislabeled as env config errors.
- env_helper.py: drop dead `except FileStorageError: raise e` (inner try
doesn't catch FileStorageError, and `raise e` would reset traceback).
- workers/sample.env: refresh comment to match new FileStorageError behavior.
- backend/sample.env: unquote REMOTE_PROMPT_STUDIO_FILE_PATH for consistency
with workers/sample.env.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Br691aZbhfcB4xdswrUjuw
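
The narrowing described in the first bullet above can be sketched as follows. The names are hypothetical stand-ins (`Provider` for `FileStorageProvider`, `factory` for the storage constructor, and the `"provider"` / `"credentials"` keys are assumed); the point is only where the `try` block ends.

```python
from enum import Enum


class FileStorageError(Exception):
    """Local stand-in for the SDK's FileStorageError."""


class Provider(Enum):
    """Hypothetical stand-in for FileStorageProvider."""

    LOCAL = "local"


def build_storage(config: dict, factory):
    # Only the provider lookup is guarded: a missing key or an unknown value
    # is a configuration problem and is reported as FileStorageError.
    try:
        provider = Provider(config["provider"])
    except (KeyError, ValueError) as e:
        raise FileStorageError("Invalid storage config") from e
    # The constructor call sits outside the try, so its own errors propagate
    # unchanged instead of being relabeled as env config errors.
    return factory(provider, **config.get("credentials", {}))
```

With a wider `try`, a `ValueError` raised inside the constructor would have been reported as a bad env var, which would point the user at the wrong fix.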
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py`:
- Around line 61-65: `create_file_storage` (or the file-storage factory logic in
`env_helper.py`) currently unpacks `credentials` into `PermanentFileStorage` and
`SharedTemporaryFileStorage` without validating its type, so non-dict values
raise a raw TypeError. Add an explicit check right after reading
`CredentialKeyword.CREDENTIALS` to ensure `credentials` is an object/mapping
before using `**credentials`, and raise `FileStorageError` with a clear message
when the shape is invalid. Keep the validation in the same branch that selects
`StorageType.PERMANENT` and `StorageType.SHARED_TEMPORARY` so both paths use the
same guard.
📒 Files selected for processing (3)
- backend/sample.env
- unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py
- workers/sample.env
🚧 Files skipped from review as they are similar to previous changes (2)
- workers/sample.env
- backend/sample.env
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
The coderabbit suggestion added an isinstance(credentials, dict) guard whose
error message exceeded the 90-char line limit. Split the f-string across two
adjacent string literals to keep behavior identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Br691aZbhfcB4xdswrUjuw
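
The line-length fix above relies on Python joining adjacent string literals at compile time. A small sketch of why the split does not change the message follows; the message wording is made up and is not the text used in `env_helper.py`.

```python
env_name = "PERMANENT_REMOTE_STORAGE"

# One long f-string, over a typical line-length limit.
single = f"'credentials' in env var '{env_name}' must be a JSON object of key/value pairs"

# The same message split across two adjacent literals. Only the part that
# interpolates a value needs the f prefix.
split = (
    f"'credentials' in env var '{env_name}' "
    "must be a JSON object of key/value pairs"
)
```

The trailing space at the end of the first literal matters: the literals are joined exactly as written, with nothing inserted between them.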
What
Fix six unrelated-but-co-incident bugs that make a first-time OSS install non-functional out of the box (specifically Prompt Studio indexing + host-installed adapters).
- `backend/sample.env`: `INTERNAL_SERVICE_API_KEY` was blank → workers' `X-API-Key` calls into backend 500. Set to dev default that matches `workers/sample.env`.
- `backend/sample.env`: add `TEMPORARY_REMOTE_STORAGE` (only `PERMANENT_*` was set).
- `workers/sample.env`: add `PERMANENT_REMOTE_STORAGE`, `TEMPORARY_REMOTE_STORAGE`, `REMOTE_PROMPT_STUDIO_FILE_PATH`. Without these the executor/ide-callback workers crash at `json.loads("")` on first Prompt Studio index.
- `unstract/sdk1/.../file_storage/env_helper.py`: raise `FileStorageError` with a clear remediation message when the storage env var is unset or invalid JSON, instead of `json.loads("")` exploding with a useless `JSONDecodeError`.
- `adapters/llm1` + `adapters/embedding1` `ollama.json`: fix typo `docker.host.internal` → `host.docker.internal` in the Base URL hint.
- `docker/docker-compose.yaml`: extract a YAML anchor `x-host-gateway` and apply it (`<<: *host_gateway`) to backend, prompt-service, runner, celery-*, and every worker-* (16 services). Before this only backend and prompt-service had `extra_hosts`, so adapter Test passed in prompt-service but real prompt execution from the worker pool failed with `EAI_NONAME` against `host.docker.internal`.
- `run-platform.sh`: fail fast with actionable hint when `docker info` can't reach the daemon (missing docker group membership, etc.).
Why
Reported by a new OSS user (2026-05-12): even after correctly setting up Ollama on the host and the bundled stack via `./run-platform.sh`, Prompt Studio indexing hangs forever and host-installed LLM/embedding adapters give confusing "Test passes, prompt fails" errors. Each fix here removes one step that blocks a fresh `git clone` from reaching a successful first prompt run.
How
- `<<: *host_gateway` merges into each service's mapping. Validated locally with `docker compose config`: the anchor expands to 16 services.
- `docker info` check added immediately after the existing `command -v docker` check in `run-platform.sh`.
Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
- `extra_hosts` entry; no service config is changed otherwise.
- `run-platform.sh` check fires before any container actions; only impacts the case where the daemon is unreachable, which already fails downstream.
Database Migrations
Env Config
- `backend/sample.env` and `workers/sample.env`. Existing `.env` files are not touched; users who already have working configs are unaffected. New installs that copy from `sample.env` get a functional Prompt Studio out of the box.
Relevant Docs
- `unstract-docs` for the "host-installed Ollama" / docker socket prerequisites notes (out of scope for this PR).
Related Issues or PRs
Dependencies Versions
Notes on Testing
- `docker compose -f docker/docker-compose.yaml config` resolves; YAML anchor expands `host.docker.internal:host-gateway` into all 16 expected services.
- `PERMANENT_REMOTE_STORAGE` now raises `FileStorageError` with the expected message rather than `JSONDecodeError`.
- `run-platform.sh` docker-info check: simulated by stopping the daemon; script exits with the new hint instead of the original opaque error from `docker compose`.
🤖 Generated with Claude Code
https://claude.ai/code/session_01Br691aZbhfcB4xdswrUjuw