feat(nodes): add text-to-speech node with OpenAI TTS support #515
charliegillet wants to merge 6 commits into
Conversation
Add a new tts_openai pipeline node that converts text input to audio output using the OpenAI TTS API. Supports tts-1 and tts-1-hd models, six voices (alloy, echo, fable, onyx, nova, shimmer), configurable speed (0.25-4.0x), and multiple output formats (mp3, opus, aac, flac, wav, pcm). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough
Adds a new OpenAI Text-to-Speech node (global manager + instance), a service manifest and dependency, SSRF protection utilities with HTTP client/driver integration, plus unit tests for SSRF behavior.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Pipeline as Pipeline
    participant IGlobal as IGlobal (TTS)
    participant IInstance as IInstance (TTS)
    participant OpenAI as OpenAI API
    Pipeline->>IGlobal: validateConfig()
    IGlobal->>IGlobal: load deps, read config, validate model/voice
    IGlobal->>OpenAI: client.models.list() (connectivity check)
    OpenAI-->>IGlobal: ok / error (handled)
    Pipeline->>IGlobal: beginGlobal()
    IGlobal->>IGlobal: clamp params, instantiate OpenAI client
    Pipeline->>IInstance: writeText(text)
    IInstance->>IInstance: trim & validate text
    IInstance->>IGlobal: access shared _client
    IInstance->>OpenAI: audio.speech.create(input, model, voice, speed, format)
    OpenAI-->>IInstance: audio bytes
    IInstance->>Pipeline: writeAudio(BEGIN, WRITE(bytes), END)
    Pipeline->>IGlobal: endGlobal()
    IGlobal->>IGlobal: clear client
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Add a shared SSRF protection module that validates URLs and resolved IP addresses before HTTP requests are made. This prevents agents from making requests to private, loopback, link-local, and reserved IP ranges — including cloud metadata endpoints like 169.254.169.254.

Key features:
- Blocks all RFC 1918 private ranges, loopback, link-local, multicast, and other reserved IP ranges by default
- Resolves DNS before checking IPs to prevent DNS rebinding attacks
- Supports a configurable allowlist via the ROCKETRIDE_SSRF_ALLOWLIST env var (comma-separated CIDRs) for self-hosted/on-premise operators
- Supports a per-node allowlist via the ssrfAllowlist config parameter
- Only allows http/https schemes (blocks file://, ftp://, gopher://, etc.)
- Blocks known dangerous hostnames (localhost, metadata.google.internal)
- Clear error messages guiding users to the allowlist when blocked

Integrated into the tool_http_request node; the shared module in library/ssrf_protection.py can be imported by any other node.

Closes rocketride-org#482

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
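The range-based blocking described above can be sketched with only the stdlib `ipaddress` module. This is an illustrative subset, not the module's actual API: the function name `is_blocked` and the exact range list are assumptions for the sketch, and real enforcement must also run after DNS resolution as the commit message notes.

```python
import ipaddress

# Illustrative subset of the blocked ranges described in the commit message.
_BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        '10.0.0.0/8',      # RFC 1918 private
        '172.16.0.0/12',   # RFC 1918 private
        '192.168.0.0/16',  # RFC 1918 private
        '127.0.0.0/8',     # loopback
        '169.254.0.0/16',  # link-local, incl. cloud metadata 169.254.169.254
        '224.0.0.0/4',     # multicast
        '::1/128',         # IPv6 loopback
        'fe80::/10',       # IPv6 link-local
    )
]

def is_blocked(ip: str, allowlist: tuple[str, ...] = ()) -> bool:
    """Return True if `ip` falls in a blocked range and is not allowlisted."""
    addr = ipaddress.ip_address(ip)
    # Allowlist wins: operators can explicitly permit private CIDRs.
    for cidr in allowlist:
        if addr in ipaddress.ip_network(cidr, strict=False):
            return False
    return any(addr in net for net in _BLOCKED_RANGES)
```

Note that `addr in net` is version-aware in `ipaddress` (an IPv4 address is never "in" an IPv6 network), so mixed v4/v6 range lists are safe to scan in one pass.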
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tts_openai/IGlobal.py`:
- Around line 155-162: The speed coercion can raise on malformed configs; in
IGlobal (where self._speed is set from config.get) validate and safely parse the
value instead of calling float(...) unguarded: check if the raw value is None or
not a number, then try converting inside a try/except (catch
ValueError/TypeError) and on failure fall back to the default 1.0 before
clamping, finally apply the existing clamp logic (max(0.25, min(4.0,
parsed_speed))). Update the initialization in IGlobal (and any related
beginGlobal path if it reuses this) to use this safe-parse-and-fallback flow so
non-numeric strings/None don’t crash startup.
In `@nodes/src/nodes/tts_openai/IInstance.py`:
- Around line 77-79: The calls to writeAudio in IInstance are missing the
required third buffer argument; update the three calls so they pass a buffer
(use an explicit empty buffer such as b'' or bytearray() for BEGIN and END) so
the contract writeAudio(action, mimeType, buffer) is honored—e.g., change
writeAudio(AVI_ACTION.BEGIN, mime_type) and writeAudio(AVI_ACTION.END,
mime_type) to writeAudio(AVI_ACTION.BEGIN, mime_type, b'') and
writeAudio(AVI_ACTION.END, mime_type, b''), leaving the WRITE call as
writeAudio(AVI_ACTION.WRITE, mime_type, audio_data).
In `@nodes/src/nodes/tts_openai/services.json`:
- Line 61: The services.json entry references an icon "tts-openai.svg" that is
missing; either add the SVG asset named exactly "tts-openai.svg" into the
repository's icon/assets directory following the same naming and sizing
conventions used by other icons, or remove the "icon": "tts-openai.svg" property
from the service definition in services.json (and update any asset manifest or
import list if present) so the service no longer points to a non-existent file.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 018cb4a9-71bd-42b8-b823-56abbcf78afc
📒 Files selected for processing (5)
- nodes/src/nodes/tts_openai/IGlobal.py
- nodes/src/nodes/tts_openai/IInstance.py
- nodes/src/nodes/tts_openai/__init__.py
- nodes/src/nodes/tts_openai/requirements.txt
- nodes/src/nodes/tts_openai/services.json
```python
        self._model = config.get('model', 'tts-1')
        self._voice = config.get('voice', 'alloy')
        self._speed = config.get('speed', 1.0)
        self._response_format = config.get('response_format', 'mp3')
        apikey = config.get('apikey')

        # Clamp speed to valid range
        self._speed = max(0.25, min(4.0, float(self._speed)))
```
Guard speed coercion against malformed config values.
Config.getNodeConfig() merges raw connection data, so imported/manual JSON can still surface None or non-numeric strings here. float(self._speed) on Line 162 will then raise during beginGlobal() and prevent the node from starting instead of falling back to the default.
🛡️ Proposed fix

```diff
-        self._speed = max(0.25, min(4.0, float(self._speed)))
+        try:
+            self._speed = max(0.25, min(4.0, float(self._speed)))
+        except (TypeError, ValueError):
+            warning(f'Invalid TTS speed: {self._speed!r}. Falling back to 1.0')
+            self._speed = 1.0
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/tts_openai/IGlobal.py` around lines 155 - 162, The speed
coercion can raise on malformed configs; in IGlobal (where self._speed is set
from config.get) validate and safely parse the value instead of calling
float(...) unguarded: check if the raw value is None or not a number, then try
converting inside a try/except (catch ValueError/TypeError) and on failure fall
back to the default 1.0 before clamping, finally apply the existing clamp logic
(max(0.25, min(4.0, parsed_speed))). Update the initialization in IGlobal (and
any related beginGlobal path if it reuses this) to use this
safe-parse-and-fallback flow so non-numeric strings/None don’t crash startup.
```python
        self.instance.writeAudio(AVI_ACTION.BEGIN, mime_type)
        self.instance.writeAudio(AVI_ACTION.WRITE, mime_type, audio_data)
        self.instance.writeAudio(AVI_ACTION.END, mime_type)
```
writeAudio() is missing the required buffer argument on BEGIN/END.
The framework writeAudio() contract is (action, mimeType, buffer). Line 77 and Line 79 only pass two arguments, so the first BEGIN call raises TypeError and the audio lane never receives output.
🐛 Proposed fix

```diff
-        self.instance.writeAudio(AVI_ACTION.BEGIN, mime_type)
+        self.instance.writeAudio(AVI_ACTION.BEGIN, mime_type, b'')
         self.instance.writeAudio(AVI_ACTION.WRITE, mime_type, audio_data)
-        self.instance.writeAudio(AVI_ACTION.END, mime_type)
+        self.instance.writeAudio(AVI_ACTION.END, mime_type, b'')
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        self.instance.writeAudio(AVI_ACTION.BEGIN, mime_type, b'')
        self.instance.writeAudio(AVI_ACTION.WRITE, mime_type, audio_data)
        self.instance.writeAudio(AVI_ACTION.END, mime_type, b'')
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/tts_openai/IInstance.py` around lines 77 - 79, The calls to
writeAudio in IInstance are missing the required third buffer argument; update
the three calls so they pass a buffer (use an explicit empty buffer such as b''
or bytearray() for BEGIN and END) so the contract writeAudio(action, mimeType,
buffer) is honored—e.g., change writeAudio(AVI_ACTION.BEGIN, mime_type) and
writeAudio(AVI_ACTION.END, mime_type) to writeAudio(AVI_ACTION.BEGIN, mime_type,
b'') and writeAudio(AVI_ACTION.END, mime_type, b''), leaving the WRITE call as
writeAudio(AVI_ACTION.WRITE, mime_type, audio_data).
```jsonc
    // Optional:
    // The icon is the icon to display in the UI for this node
    //
    "icon": "tts-openai.svg",
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
# Expect at least one match for the icon referenced by nodes/src/nodes/tts_openai/services.json.
fd -i '^tts-openai\.svg$'
```

Repository: rocketride-org/rocketride-server
Length of output: 58
Add the missing icon asset file or remove the icon reference.
The icon file tts-openai.svg referenced on line 61 does not exist in the repository. Either create the missing SVG asset or remove the icon property from the service definition to prevent a broken icon.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/tts_openai/services.json` at line 61, The services.json entry
references an icon "tts-openai.svg" that is missing; either add the SVG asset
named exactly "tts-openai.svg" into the repository's icon/assets directory
following the same naming and sizing conventions used by other icons, or remove
the "icon": "tts-openai.svg" property from the service definition in
services.json (and update any asset manifest or import list if present) so the
service no longer points to a non-existent file.
…ls.list

The validateConfig method was calling client.audio.speech.create() which generates audio and incurs cost on every config validation. Replaced with client.models.list() which only verifies the API key without generating audio. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
services.json references tts-openai.svg but no file existed. Added a speaker/volume icon consistent with other audio node icons. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moved _client, _model, _voice, _speed, _response_format from class-level attributes to instance initialization in beginGlobal to prevent shared mutable state across instances. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The except block was silently swallowing all exceptions. Now rate-limit and connection errors are treated as transient (warn only), while all other errors are re-raised so the pipeline knows the operation failed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 7
♻️ Duplicate comments (2)
nodes/src/nodes/tts_openai/IGlobal.py (1)
158-163: ⚠️ Potential issue | 🟠 Major

Guard `speed` parsing before clamping. Line 163 still calls `float(...)` on raw merged config. `Config.getNodeConfig()` (packages/ai/src/ai/common/config.py:71-175) preserves non-`None` user values, so imported/manual JSON like `'fast'` will abort `beginGlobal()` instead of falling back to `1.0`.

🛡️ Proposed fix

```diff
         # Clamp speed to valid range
-        self._speed = max(0.25, min(4.0, float(self._speed)))
+        try:
+            self._speed = max(0.25, min(4.0, float(self._speed)))
+        except (TypeError, ValueError):
+            warning(f'Invalid TTS speed: {self._speed!r}. Falling back to 1.0')
+            self._speed = 1.0
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nodes/src/nodes/tts_openai/IGlobal.py` around lines 158 - 163: the code calls float(...) on the raw config value when setting self._speed, which can raise for non-numeric inputs (e.g., "fast"); update the logic in beginGlobal / the initializer that sets self._speed to first attempt a safe numeric parse (wrap float(config.get('speed')) in a try/except catching ValueError/TypeError) and fall back to the default 1.0 when parsing fails or the value is None, then apply the existing clamp (max/min) to the parsed/fallback value so non-numeric user-provided values don't abort initialization.

nodes/src/nodes/tts_openai/IInstance.py (1)
76-79: ⚠️ Potential issue | 🔴 Critical

Pass the `data` argument on `BEGIN` and `END`. Per nodes/src/nodes/response/IInstance.py:194-230, `writeAudio()` expects `(aviAction, mimeType, data)`. Line 77 and Line 79 omit that third argument, so the node raises before any audio is emitted.

🐛 Proposed fix

```diff
-        self.instance.writeAudio(AVI_ACTION.BEGIN, mime_type)
+        self.instance.writeAudio(AVI_ACTION.BEGIN, mime_type, b'')
         self.instance.writeAudio(AVI_ACTION.WRITE, mime_type, audio_data)
-        self.instance.writeAudio(AVI_ACTION.END, mime_type)
+        self.instance.writeAudio(AVI_ACTION.END, mime_type, b'')
```

Run this read-only check to confirm the contract and the call sites. Expected result: `writeAudio` takes three positional arguments after `self`, while the `BEGIN`/`END` calls here only pass two.

```shell
#!/bin/bash
set -e
rg -n -C3 'def writeAudio\(' nodes/src/nodes/response/IInstance.py
rg -n -C1 'writeAudio\(AVI_ACTION\.(BEGIN|WRITE|END)' nodes/src/nodes/tts_openai/IInstance.py
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nodes/src/nodes/tts_openai/IInstance.py` around lines 76 - 79, The BEGIN and END calls to self.instance.writeAudio are missing the third positional data argument required by writeAudio (signature in IInstance.writeAudio expects aviAction, mimeType, data); update the two calls that use AVI_ACTION.BEGIN and AVI_ACTION.END to pass a data value (e.g., empty bytes or an explicit sentinel) just like the WRITE call passes audio_data so the calls become writeAudio(AVI_ACTION.BEGIN, mime_type, <data>) and writeAudio(AVI_ACTION.END, mime_type, <data>), leaving the WRITE call unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/library/ssrf_protection.py`:
- Around line 246-251: Rename the unused loop variable "family" to "_family" in
the for-loop that unpacks addrinfos (for family, _type, _proto, _canonname,
sockaddr in addrinfos) inside ssrf_protection.py so static analysis no longer
flags an unused variable; update the unpacking to (_family, _type, _proto,
_canonname, sockaddr) and ensure there are no other references to "family" in
the surrounding function (e.g., _check_ip usage remains unchanged).
- Around line 92-97: Update the _BLOCKED_HOSTNAMES frozenset in
ssrf_protection.py to include additional cloud metadata hostnames (e.g.,
'metadata.azure.com', 'management.azure.com', and other provider metadata hosts
you want to cover) so the SSRF protection covers Azure and other cloud providers
in addition to existing 'localhost' and 'metadata.google.internal'; modify the
_BLOCKED_HOSTNAMES set literal (symbol: _BLOCKED_HOSTNAMES) to add these
hostnames and keep the existing IP-range-based checks for AWS/other providers
intact.
- Around line 194-220: The _build_allowlist function currently swallows
ValueError for malformed CIDR strings; change the except blocks to log a warning
instead of silently passing: get a module logger via logging.getLogger(__name__)
(or reuse an existing logger if one is available), and in both exception
handlers call logger.warning with the malformed cidr string and its source
(e.g., "SSRF_ALLOWLIST_ENV" vs "per-call extra") and note that the entry will be
skipped; keep the behavior of skipping malformed entries but surface a clear
warning including the bad value and context.
In `@nodes/src/nodes/tool_http_request/http_client.py`:
- Around line 71-72: The SSRF check only validates the initial resolved_url via
validate_url(resolved_url, allowed_private=ssrf_allowed_private) but
requests.request follows redirects by default, allowing redirect-based SSRF; fix
by disabling automatic redirects in the HTTP call (set allow_redirects=False on
requests.request used in this module) and implement manual redirect handling: on
3xx responses read the Location header, resolve the next URL, call
validate_url(next_resolved_url, allowed_private=ssrf_allowed_private) before
following, enforce a max redirect count, and repeat; alternatively, if you must
keep automatic redirects, implement a requests.Session with a custom redirect
hook that validates each redirected Location using validate_url before allowing
the redirect.
In `@nodes/src/nodes/tool_http_request/IGlobal.py`:
- Around line 109-127: The node reads cfg.get('ssrfAllowlist') in the
_build_ssrf_allowlist method but that configuration key isn't declared in
services.json, so add a new entry for "http_request.ssrfAllowlist" to the node's
services.json fields section; declare it as a string (or appropriate type) with
a title like "SSRF Allowlist", a description noting it expects a JSON array of
CIDR strings (e.g. ["192.168.1.0/24"]), and a sensible default (e.g. "[]") so
the UI exposes and persists the ssrfAllowlist config consumed by
_build_ssrf_allowlist.
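A field declaration along the lines this prompt asks for might look like the fragment below. The exact schema keys (`title`, `description`, `default`) are assumptions inferred from the defaults already present in this services.json, so verify against a neighboring field before committing:

```json
"ssrfAllowlist": {
    "title": "SSRF Allowlist",
    "description": "JSON array of CIDR strings allowed despite SSRF protection, e.g. [\"192.168.1.0/24\"]",
    "default": "[]"
}
```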
In `@nodes/test/test_ssrf_protection.py`:
- Around line 129-144: Add a new test method to TestBlockedIPv6 that uses the
existing _fake_getaddrinfo helper to return an IPv4-mapped IPv6 address (e.g.
'::ffff:192.168.1.1') and asserts validate_url raises SSRFError with the
"private/reserved range" message; place the test alongside test_blocked_ipv6 and
follow the same patch pattern (patching 'ssrf_protection.socket.getaddrinfo') so
the code path in ssrf_protection that handles IPv4-mapped IPv6 is exercised.
- Around line 71-80: The helper function _fake_getaddrinfo_multi in
test_ssrf_protection.py is unused; remove this dead code or add tests that
exercise it. Either delete the _fake_getaddrinfo_multi definition to clean up
the file, or add a test (e.g., test_blocked_and_allowed_multi_homed_host) that
calls _fake_getaddrinfo_multi to simulate a multi-homed host with both blocked
and allowed IPs and asserts the SSRF protection behavior; update imports/mocks
as needed so the new test actually uses _fake_getaddrinfo_multi.
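The IPv4-mapped IPv6 behavior the first suggestion above wants tested can be seen directly with the stdlib `ipaddress` module: a mapped address like `::ffff:192.168.1.1` parses as IPv6, and the embedded IPv4 address is exposed via the `ipv4_mapped` property, which is what an SSRF check must unwrap before range-matching:

```python
import ipaddress

# An IPv4-mapped IPv6 literal parses as an IPv6Address...
addr = ipaddress.ip_address('::ffff:192.168.1.1')
assert isinstance(addr, ipaddress.IPv6Address)

# ...but carries its embedded IPv4 address in .ipv4_mapped
# (None for ordinary IPv6 addresses).
mapped = addr.ipv4_mapped
assert mapped == ipaddress.ip_address('192.168.1.1')

# Range checks must run against the unwrapped IPv4 address,
# or the private range would not match the IPv6 form.
assert mapped in ipaddress.ip_network('192.168.0.0/16')
assert ipaddress.ip_address('2001:db8::1').ipv4_mapped is None
```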
---
Duplicate comments:
In `@nodes/src/nodes/tts_openai/IGlobal.py`:
- Around line 158-163: The code calls float(...) on the raw config value when
setting self._speed which can raise for non-numeric inputs (e.g., "fast");
update the logic in beginGlobal / the initializer that sets self._speed to first
attempt a safe numeric parse (wrap float(config.get('speed')) in a try/except
catching ValueError/TypeError) and fall back to the default 1.0 when parsing
fails or the value is None, then apply the existing clamp (max/min) to the
parsed/fallback value so non-numeric user-provided values don't abort
initialization.
In `@nodes/src/nodes/tts_openai/IInstance.py`:
- Around line 76-79: The BEGIN and END calls to self.instance.writeAudio are
missing the third positional data argument required by writeAudio (signature in
IInstance.writeAudio expects aviAction, mimeType, data); update the two calls
that use AVI_ACTION.BEGIN and AVI_ACTION.END to pass a data value (e.g., empty
bytes or an explicit sentinel) just like the WRITE call passes audio_data so the
calls become writeAudio(AVI_ACTION.BEGIN, mime_type, <data>) and
writeAudio(AVI_ACTION.END, mime_type, <data>), leaving the WRITE call unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2bf63b98-7dd7-4878-999a-da89304d759e
⛔ Files ignored due to path filters (1)
- packages/shared-ui/src/assets/nodes/tts-openai.svg is excluded by !**/*.svg
📒 Files selected for processing (7)
- nodes/src/nodes/library/ssrf_protection.py
- nodes/src/nodes/tool_http_request/IGlobal.py
- nodes/src/nodes/tool_http_request/http_client.py
- nodes/src/nodes/tool_http_request/http_driver.py
- nodes/src/nodes/tts_openai/IGlobal.py
- nodes/src/nodes/tts_openai/IInstance.py
- nodes/test/test_ssrf_protection.py
```python
_BLOCKED_HOSTNAMES = frozenset(
    {
        'localhost',
        'metadata.google.internal',
    }
)
```
🧹 Nitpick | 🔵 Trivial
Consider adding additional cloud metadata hostnames.
The current list blocks localhost and metadata.google.internal. For more comprehensive cloud protection, consider adding:

- GCP: metadata.google.internal ✓ (present)
- AWS: IP-based (169.254.169.254 ✓ blocked by range)
- Azure: metadata.azure.com, management.azure.com (some instance metadata scenarios)
- DigitalOcean, Oracle Cloud, etc.

This is optional since the IP-based blocking catches most metadata endpoints.
💡 Optional: Expand blocked hostnames

```diff
 _BLOCKED_HOSTNAMES = frozenset(
     {
         'localhost',
         'metadata.google.internal',
+        'metadata.azure.com',
+        'instance-data',  # some Linode/legacy
     }
 )
```

📝 Committable suggestion
```python
_BLOCKED_HOSTNAMES = frozenset(
    {
        'localhost',
        'metadata.google.internal',
        'metadata.azure.com',
        'instance-data',  # some Linode/legacy
    }
)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/library/ssrf_protection.py` around lines 92 - 97, Update the
_BLOCKED_HOSTNAMES frozenset in ssrf_protection.py to include additional cloud
metadata hostnames (e.g., 'metadata.azure.com', 'management.azure.com', and
other provider metadata hosts you want to cover) so the SSRF protection covers
Azure and other cloud providers in addition to existing 'localhost' and
'metadata.google.internal'; modify the _BLOCKED_HOSTNAMES set literal (symbol:
_BLOCKED_HOSTNAMES) to add these hostnames and keep the existing IP-range-based
checks for AWS/other providers intact.
```python
def _build_allowlist(
    extra: Optional[Sequence[str]] = None,
) -> list[ipaddress.IPv4Network | ipaddress.IPv6Network]:
    """Merge per-call allowlist with the global env-var allowlist."""
    nets: list[ipaddress.IPv4Network | ipaddress.IPv6Network] = []

    # Global allowlist from environment
    env_val = os.environ.get(SSRF_ALLOWLIST_ENV, '').strip()
    if env_val:
        for cidr in env_val.split(','):
            cidr = cidr.strip()
            if cidr:
                try:
                    nets.append(ipaddress.ip_network(cidr, strict=False))
                except ValueError:
                    pass  # silently skip malformed entries

    # Per-call allowlist
    for cidr in extra or []:
        cidr_s = str(cidr).strip()
        if cidr_s:
            try:
                nets.append(ipaddress.ip_network(cidr_s, strict=False))
            except ValueError:
                pass

    return nets
```
🧹 Nitpick | 🔵 Trivial
Silent failure on malformed CIDRs is intentional but worth logging.
The code silently ignores malformed CIDR entries (lines 208-209, 217-218). While this is defensive, operators may not realize their allowlist is partially broken. Consider emitting a warning for malformed entries.
📝 Optional: Log malformed CIDR warnings

```diff
+import logging
+
+_logger = logging.getLogger(__name__)
+
 def _build_allowlist(
     extra: Optional[Sequence[str]] = None,
 ) -> list[ipaddress.IPv4Network | ipaddress.IPv6Network]:
     ...
                 try:
                     nets.append(ipaddress.ip_network(cidr, strict=False))
                 except ValueError:
-                    pass  # silently skip malformed entries
+                    _logger.warning('Ignoring malformed SSRF allowlist CIDR: %r', cidr)
```

🧰 Tools
🪛 Ruff (0.15.7)

[warning] 204-204: `for` loop variable `cidr` overwritten by assignment target (PLW2901)

[warning] 206-209: Use `contextlib.suppress(ValueError)` instead of `try`-`except`-`pass` (SIM105)

[warning] 215-218: Use `contextlib.suppress(ValueError)` instead of `try`-`except`-`pass`; replace with `with contextlib.suppress(ValueError): ...` (SIM105)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/library/ssrf_protection.py` around lines 194 - 220, The
_build_allowlist function currently swallows ValueError for malformed CIDR
strings; change the except blocks to log a warning instead of silently passing:
get a module logger via logging.getLogger(__name__) (or reuse an existing logger
if one is available), and in both exception handlers call logger.warning with
the malformed cidr string and its source (e.g., "SSRF_ALLOWLIST_ENV" vs
"per-call extra") and note that the entry will be skipped; keep the behavior of
skipping malformed entries but surface a clear warning including the bad value
and context.
```python
    for family, _type, _proto, _canonname, sockaddr in addrinfos:
        ip_str = sockaddr[0]
        addr = ipaddress.ip_address(ip_str)
        _check_ip(addr, hostname, allow_nets)
        if ip_str not in resolved_ips:
            resolved_ips.append(ip_str)
```
🧹 Nitpick | 🔵 Trivial
Rename unused loop variable family to _family.
Static analysis correctly flags that family is unpacked but not used. Prefix with underscore to indicate intentional discard.
♻️ Proposed fix

```diff
-    for family, _type, _proto, _canonname, sockaddr in addrinfos:
+    for _family, _type, _proto, _canonname, sockaddr in addrinfos:
```

📝 Committable suggestion
```python
    for _family, _type, _proto, _canonname, sockaddr in addrinfos:
        ip_str = sockaddr[0]
        addr = ipaddress.ip_address(ip_str)
        _check_ip(addr, hostname, allow_nets)
        if ip_str not in resolved_ips:
            resolved_ips.append(ip_str)
```
🧰 Tools

🪛 Ruff (0.15.7)

[warning] 246-246: Loop control variable `family` not used within loop body; rename unused `family` to `_family` (B007)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/library/ssrf_protection.py` around lines 246 - 251, Rename
the unused loop variable "family" to "_family" in the for-loop that unpacks
addrinfos (for family, _type, _proto, _canonname, sockaddr in addrinfos) inside
ssrf_protection.py so static analysis no longer flags an unused variable; update
the unpacking to (_family, _type, _proto, _canonname, sockaddr) and ensure there
are no other references to "family" in the surrounding function (e.g., _check_ip
usage remains unchanged).
```python
        # --- SSRF protection: validate the resolved URL before connecting ---
        validate_url(resolved_url, allowed_private=ssrf_allowed_private)
```
Critical: SSRF protection bypassed via HTTP redirects.
The SSRF validation at line 72 only checks the initial URL. However, requests.request() follows redirects by default (allow_redirects=True). An attacker can bypass SSRF protection:
1. Submit https://attacker.com/redirect (passes validation - public IP)
2. Attacker's server returns a 302 redirect to http://169.254.169.254/...
3. requests follows the redirect without re-validation
4. Cloud metadata or internal service accessed
Disable automatic redirects and manually validate each hop, or use a custom transport adapter.
🔒 Proposed fix: disable automatic redirects

```diff
     req_kwargs: Dict[str, Any] = {
         'method': method.upper(),
         'url': resolved_url,
         'headers': req_headers,
         'params': merged_params or None,
         'auth': req_auth,
+        'allow_redirects': False,
     }
```

Then handle redirects manually in the response, or document that redirects are intentionally not followed. If redirect-following is required, implement a session with a custom redirect hook that validates each Location header before following.
Also applies to: 99-100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/tool_http_request/http_client.py` around lines 71 - 72, The
SSRF check only validates the initial resolved_url via
validate_url(resolved_url, allowed_private=ssrf_allowed_private) but
requests.request follows redirects by default, allowing redirect-based SSRF; fix
by disabling automatic redirects in the HTTP call (set allow_redirects=False on
requests.request used in this module) and implement manual redirect handling: on
3xx responses read the Location header, resolve the next URL, call
validate_url(next_resolved_url, allowed_private=ssrf_allowed_private) before
following, enforce a max redirect count, and repeat; alternatively, if you must
keep automatic redirects, implement a requests.Session with a custom redirect
hook that validates each redirected Location using validate_url before allowing
the redirect.
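One way to structure the manual redirect handling this prompt describes is a small hop-validating loop. The sketch below is library-agnostic: `fetch` and `validate_url` are injected (with `requests`, `fetch` would wrap `requests.request(..., allow_redirects=False)`), and the names `request_with_ssrf_checks` and `MAX_REDIRECTS` are illustrative, not part of the PR:

```python
from urllib.parse import urljoin

MAX_REDIRECTS = 5  # illustrative cap on hops

def request_with_ssrf_checks(url, fetch, validate_url):
    """Follow redirects manually, validating every hop's URL first.

    `fetch(url)` must perform ONE request with auto-redirects disabled and
    return a response exposing `.status_code` and `.headers`; `validate_url(url)`
    must raise on blocked targets (mirroring the module's SSRF check).
    """
    for _ in range(MAX_REDIRECTS + 1):
        validate_url(url)                      # re-check before every hop
        resp = fetch(url)
        if resp.status_code not in (301, 302, 303, 307, 308):
            return resp
        location = resp.headers.get('Location')
        if not location:
            return resp                        # malformed redirect: stop here
        url = urljoin(url, location)           # resolve relative Location headers
    raise RuntimeError(f'Exceeded {MAX_REDIRECTS} redirects')
```

Because validation runs before each fetch, a public URL that 302-redirects to 169.254.169.254 fails at the second hop instead of being silently followed.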
```python
    @staticmethod
    def _build_ssrf_allowlist(cfg: dict) -> List[str]:
        """Read the SSRF private-IP allowlist from the node config.

        Expects ``cfg['ssrfAllowlist']`` to be a JSON array of strings
        (CIDR notation), e.g. ``["192.168.1.0/24", "10.0.0.5/32"]``.
        """
        raw = cfg.get('ssrfAllowlist') or []
        if not isinstance(raw, list):
            try:
                raw = _json.loads(str(raw))
            except (_json.JSONDecodeError, TypeError, ValueError):
                raw = []
        result: List[str] = []
        for entry in raw:
            val = str(entry).strip() if entry else ''
            if val:
                result.append(val)
        return result
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Verify ssrfAllowlist field is missing from services.json
rg -n 'ssrfAllowlist' nodes/src/nodes/tool_http_request/services.json || echo "ssrfAllowlist field NOT found in services.json"
```

Repository: rocketride-org/rocketride-server
Length of output: 122
🏁 Script executed:
```shell
# Check how services.json is used in the codebase
rg -n 'services\.json' nodes/src/nodes/tool_http_request/ | head -20
```

Repository: rocketride-org/rocketride-server
Length of output: 58
🏁 Script executed:
```shell
# Look for how cfg parameter is constructed/populated
rg -n 'cfg\[' nodes/src/nodes/tool_http_request/IGlobal.py | head -20
```

Repository: rocketride-org/rocketride-server
Length of output: 150
🏁 Script executed:
```shell
# Check for other optional fields in services.json and how they're handled
rg -n '"default"' nodes/src/nodes/tool_http_request/services.json | head -20
```

Repository: rocketride-org/rocketride-server
Length of output: 328
🏁 Script executed:
# Search for any test files or examples using ssrfAllowlist
find nodes/src/nodes/tool_http_request -name '*test*' -o -name '*example*' | xargs grep -l 'ssrfAllowlist' 2>/dev/null || echo "No test files found with ssrfAllowlist"Repository: rocketride-org/rocketride-server
Length of output: 114
🏁 Script executed:
# Look at the overall structure of services.json to understand field patterns
head -60 nodes/src/nodes/tool_http_request/services.jsonRepository: rocketride-org/rocketride-server
Length of output: 1515
🏁 Script executed:
# Find where _build_ssrf_allowlist is called and where cfg comes from
rg -n '_build_ssrf_allowlist' nodes/src/nodes/tool_http_request/Repository: rocketride-org/rocketride-server
Length of output: 290
🏁 Script executed:
# Look for where IGlobal is instantiated and cfg is passed
rg -n 'IGlobal' nodes/src/nodes/tool_http_request/ -A 3Repository: rocketride-org/rocketride-server
Length of output: 2164
🏁 Script executed:
# Search for config loading/initialization code
rg -n 'cfg.*=' nodes/src/nodes/tool_http_request/IGlobal.py | head -20Repository: rocketride-org/rocketride-server
Length of output: 242
🏁 Script executed:
# Check if there's documentation about how config is populated
find nodes/src/nodes/tool_http_request -name '*.md' -o -name '*.txt' | xargs grep -l 'config\|cfg' 2>/dev/null || echo "No docs found"Repository: rocketride-org/rocketride-server
Length of output: 89
🏁 Script executed:
# Look at the __init__ or setup methods to see how cfg is used
rg -n 'def __init__|def setup|def __new__' nodes/src/nodes/tool_http_request/IGlobal.py -A 10Repository: rocketride-org/rocketride-server
Length of output: 58
🏁 Script executed:
# Find the Config class
rg -n 'class Config' nodes/src/nodes/tool_http_request/Repository: rocketride-org/rocketride-server
Length of output: 58
🏁 Script executed:
# Find Config imports in IGlobal.py
rg -n 'import.*Config|from.*Config' nodes/src/nodes/tool_http_request/IGlobal.pyRepository: rocketride-org/rocketride-server
Length of output: 114
🏁 Script executed:
# Search in rocketlib for Config class
rg -n 'def getNodeConfig' --type pyRepository: rocketride-org/rocketride-server
Length of output: 548
🏁 Script executed:
# Check if connConfig contains raw/undeclared config values
rg -n 'connConfig' nodes/src/nodes/tool_http_request/ -B 2 -A 2Repository: rocketride-org/rocketride-server
Length of output: 965
🏁 Script executed:
# Read the Config.getNodeConfig implementation
sed -n '71,150p' packages/ai/src/ai/common/config.pyRepository: rocketride-org/rocketride-server
Length of output: 3338
🏁 Script executed:
# Look for more context around getNodeConfig
sed -n '60,200p' packages/ai/src/ai/common/config.pyRepository: rocketride-org/rocketride-server
Length of output: 5272
🏁 Script executed:
# Look for cfg.get() calls with defaults in other node files, to see if undeclared optional params are common
rg -n 'cfg\.get\(' nodes/src/nodes/ --type py | head -30Repository: rocketride-org/rocketride-server
Length of output: 3238
🏁 Script executed:
# Check another node's services.json to compare patterns
find nodes/src/nodes -name 'services.json' -type f | head -3 | xargs wc -lRepository: rocketride-org/rocketride-server
Length of output: 247
🏁 Script executed:
# Compare fields declared in another node vs cfg.get calls
fd 'services.json' nodes/src/nodes/tool_anthropic/ | xargs cat | head -100Repository: rocketride-org/rocketride-server
Length of output: 194
🏁 Script executed:
# Check tool_python services.json for timeout and allowedModules fields
rg -n 'timeout|allowedModules' nodes/src/nodes/tool_python/services.jsonRepository: rocketride-org/rocketride-server
Length of output: 270
🏁 Script executed:
# Check tool_http_request services.json for urlWhitelist
rg -n 'urlWhitelist' nodes/src/nodes/tool_http_request/services.jsonRepository: rocketride-org/rocketride-server
Length of output: 357
🏁 Script executed:
# Look at the complete tool_http_request services.json to see all declared fields
cat nodes/src/nodes/tool_http_request/services.json | jq '.fields | keys' 2>/dev/null || rg -n '".*".*:' nodes/src/nodes/tool_http_request/services.json | grep -E '^\s*"[^"]+":' | head -30Repository: rocketride-org/rocketride-server
Length of output: 373
Add http_request.ssrfAllowlist field declaration to services.json.
The `_build_ssrf_allowlist` method reads `cfg.get('ssrfAllowlist')` (line 116), but this field is not declared in the node's services.json. Without this declaration, the configuration system will not expose the field to users via the UI, making the SSRF allowlist feature undiscoverable and non-functional. All other optional parameters in this node (`urlWhitelist`, `serverName`) are properly declared in services.json; `ssrfAllowlist` must be too.
Add to the services.json fields section:

```json
"http_request.ssrfAllowlist": {
  "type": "string",
  "title": "SSRF Allowlist",
  "description": "JSON array of CIDR strings to permit (e.g. [\"192.168.1.0/24\"])",
  "default": "[]"
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/tool_http_request/IGlobal.py` around lines 109 - 127, The
node reads cfg.get('ssrfAllowlist') in the _build_ssrf_allowlist method but that
configuration key isn't declared in services.json, so add a new entry for
"http_request.ssrfAllowlist" to the node's services.json fields section; declare
it as a string (or appropriate type) with a title like "SSRF Allowlist", a
description noting it expects a JSON array of CIDR strings (e.g.
["192.168.1.0/24"]), and a sensible default (e.g. "[]") so the UI exposes and
persists the ssrfAllowlist config consumed by _build_ssrf_allowlist.
```python
def _fake_getaddrinfo_multi(ips: List[Tuple[str, int]]):
    """Return a patched getaddrinfo that resolves to multiple addresses."""

    def _patched(host, port, **_kw):
        results = []
        for ip, family in ips:
            results.append((family, socket.SOCK_STREAM, socket.IPPROTO_TCP, '', (ip, port)))
        return results

    return _patched
```
🧹 Nitpick | 🔵 Trivial
Remove unused _fake_getaddrinfo_multi helper.
This function is defined but never called in any test. Either remove it or add tests that use it (e.g., testing multi-homed hosts where some IPs are blocked and some are not).
🗑️ Proposed fix: remove dead code

```diff
-def _fake_getaddrinfo_multi(ips: List[Tuple[str, int]]):
-    """Return a patched getaddrinfo that resolves to multiple addresses."""
-
-    def _patched(host, port, **_kw):
-        results = []
-        for ip, family in ips:
-            results.append((family, socket.SOCK_STREAM, socket.IPPROTO_TCP, '', (ip, port)))
-        return results
-
-    return _patched
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def _fake_getaddrinfo_multi(ips: List[Tuple[str, int]]):
    """Return a patched getaddrinfo that resolves to multiple addresses."""
    def _patched(host, port, **_kw):
        results = []
        for ip, family in ips:
            results.append((family, socket.SOCK_STREAM, socket.IPPROTO_TCP, '', (ip, port)))
        return results
    return _patched
```
🧰 Tools
🪛 Ruff (0.15.7)
[warning] 71-71: Missing return type annotation for private function _fake_getaddrinfo_multi
(ANN202)
[warning] 74-74: Missing return type annotation for private function _patched
(ANN202)
[warning] 74-74: Unused function argument: host
(ARG001)
[warning] 74-74: Missing type annotation for **_kw
(ANN003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/test/test_ssrf_protection.py` around lines 71 - 80, The helper function
_fake_getaddrinfo_multi in test_ssrf_protection.py is unused; remove this dead
code or add tests that exercise it. Either delete the _fake_getaddrinfo_multi
definition to clean up the file, or add a test (e.g.,
test_blocked_and_allowed_multi_homed_host) that calls _fake_getaddrinfo_multi to
simulate a multi-homed host with both blocked and allowed IPs and asserts the
SSRF protection behavior; update imports/mocks as needed so the new test
actually uses _fake_getaddrinfo_multi.
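The multi-homed scenario the reviewer suggests reduces to one policy decision: reject the host if any resolved address is private, so one public A record cannot whitewash a private one. A self-contained sketch of that policy (illustrative names, standard library only, not the module's actual code):

```python
import ipaddress


def any_address_blocked(resolved_ips):
    """Reject a host if *any* of its resolved addresses is private/reserved.

    This is the conservative check a multi-homed SSRF test would assert
    against a host resolving to both public and private records.
    """
    for ip in resolved_ips:
        addr = ipaddress.ip_address(ip)
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_multicast:
            return True
    return False
```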
```python
class TestBlockedIPv6:
    """Private / reserved IPv6 addresses must be blocked."""

    @pytest.mark.parametrize(
        'ip',
        [
            '::1',           # loopback
            'fc00::1',       # unique local
            'fd12:3456::1',  # unique local
            'fe80::1',       # link-local
        ],
    )
    def test_blocked_ipv6(self, ip):
        with patch('ssrf_protection.socket.getaddrinfo', _fake_getaddrinfo(ip)):
            with pytest.raises(SSRFError, match='private/reserved range'):
                validate_url('http://example.com/')
```
🧹 Nitpick | 🔵 Trivial
Consider adding IPv4-mapped IPv6 test case.
The ssrf_protection.py module has special handling for IPv4-mapped IPv6 addresses (lines 263-265). Add a test to verify this path works correctly.
💚 Proposed test addition

```python
def test_ipv4_mapped_ipv6_blocked(self):
    """IPv4-mapped IPv6 addresses with private v4 must be blocked."""
    # ::ffff:192.168.1.1 embeds a private IPv4
    with patch('ssrf_protection.socket.getaddrinfo', _fake_getaddrinfo('::ffff:192.168.1.1')):
        with pytest.raises(SSRFError, match='private/reserved range'):
            validate_url('http://example.com/')
```

🧰 Tools
🪛 Ruff (0.15.7)
[warning] 142-143: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
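The SIM117 rule flagged here asks for a single `with` statement carrying multiple context managers instead of nesting. A self-contained toy illustration (not the test module's code) showing the two forms are equivalent:

```python
from contextlib import contextmanager


@contextmanager
def tag(log, name):
    # Toy context manager that records enter/exit order.
    log.append(f"enter {name}")
    try:
        yield
    finally:
        log.append(f"exit {name}")


log = []

# Nested form (what Ruff SIM117 flags):
with tag(log, "a"):
    with tag(log, "b"):
        pass

# Single-with form; managers enter left to right and exit in reverse order.
with tag(log, "c"), tag(log, "d"):
    pass
```

Both forms produce identical enter/exit ordering; since Python 3.10 the single-with form can also be parenthesized across multiple lines, which keeps long `patch(...)` / `pytest.raises(...)` pairs readable.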
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/test/test_ssrf_protection.py` around lines 129 - 144, Add a new test
method to TestBlockedIPv6 that uses the existing _fake_getaddrinfo helper to
return an IPv4-mapped IPv6 address (e.g. '::ffff:192.168.1.1') and asserts
validate_url raises SSRFError with the "private/reserved range" message; place
the test alongside test_blocked_ipv6 and follow the same patch pattern (patching
'ssrf_protection.socket.getaddrinfo') so the code path in ssrf_protection that
handles IPv4-mapped IPv6 is exercised.
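The special case this proposed test exercises can be shown with the standard `ipaddress` module alone: an IPv6 address like `::ffff:192.168.1.1` exposes its embedded IPv4 via `.ipv4_mapped`, which is how the private v4 range inside it can be detected. A minimal sketch under illustrative names (not the module's actual code):

```python
import ipaddress


def is_blocked_v6(ip: str) -> bool:
    """Treat an IPv4-mapped IPv6 address as its embedded IPv4 for range checks."""
    addr = ipaddress.ip_address(ip)
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped  # e.g. ::ffff:192.168.1.1 -> 192.168.1.1
    return addr.is_private or addr.is_loopback or addr.is_link_local
```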
Closing this. The PR bundles the TTS node together with an SSRF protection library that overlaps with #517, and at 1174 lines across 11 files it's doing too much in one shot. I'll resubmit the TTS node and the SSRF hardening as separate, focused PRs.

@charliegillet thanks for the effort. I have been working since last week on an audio TTS PR, #583, which supports different models as well as cloud services such as OpenAI TTS.
Adds `permissions: contents: read` (least-privilege default) to the top of five workflow files that had no top-level permissions declaration:

- .github/workflows/_build.yaml (Scorecard #513)
- .github/workflows/_docker.yaml (Scorecard #514)
- .github/workflows/_init.yaml (Scorecard #515)
- .github/workflows/_release.yaml (Scorecard #516)
- .github/workflows/sync-models.yml (Scorecard #605)

Existing job-level `permissions:` blocks within these files are unaffected; the top-level acts as a default ceiling that individual jobs can raise where they actually need write access (Docker push, release tag push, etc.).

Stage 1 of 3 in cleaning up TokenPermissionsID findings:

- Stage 1 (this PR): no top-level permissions -> 5 alerts
- Stage 2 (separate): broad top-level writes -> 3 alerts
- Stage 3 (separate): job-level writes for legitimate release ops -> 7 alerts

SOC2 CC7.1 vulnerability management — mechanical batch resolution following SECURITY.md disposition framework (Fix).

Fixes #789

Co-authored-by: Anand Ray <anand.ray@rocketride.ai>
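For reference, the pattern that commit describes reads like this in a workflow file; this is a generic sketch with assumed job names, not one of the five files listed:

```yaml
# Top-level default: read-only token for every job unless a job raises it.
permissions:
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

  release:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # this job pushes a release tag, so it raises the ceiling
    steps:
      - uses: actions/checkout@v4
```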
Summary
- Adds a new `tts_openai` pipeline node that converts text input to speech audio output using the OpenAI TTS API
- Adds an SSRF protection module (`library/ssrf_protection.py`) and integrates it into the existing `tool_http_request` node

Type
Feature
Why this feature fits this codebase
RocketRide's node system uses a standard `IGlobal`/`IInstance` pattern for pipeline components. The new `tts_openai` node follows this exactly: `IGlobal.beginGlobal()` reads config via `Config.getNodeConfig()`, validates model/voice/speed/format against `VALID_MODELS`/`VALID_VOICES`/`VALID_FORMATS` constants, and creates a shared `OpenAI` client. `IInstance.writeText()` receives text from the input lane, calls `self.IGlobal._client.audio.speech.create()`, and writes the resulting bytes through the `AVI_ACTION.BEGIN/WRITE/END` pattern to the audio output lane — the same streaming pattern used by other media-producing nodes. The `services.json` registers with `classType: ["audio"]`, defines `lanes: {"text": ["audio"]}` for pipeline wiring, and uses `llm.cloud.apikey` for the API key field (shared with other OpenAI nodes).

The SSRF protection addition addresses a security gap in the existing `tool_http_request` node: `http_client.py`'s `execute_request()` now calls `validate_url()` before connecting, and `http_driver.py`'s `HttpDriver.__init__()` accepts an `ssrf_allowed_private` list so self-hosted operators can allowlist internal services. The `validate_url()` function in `library/ssrf_protection.py` checks the URL scheme, resolves DNS, and validates all resulting IPs against RFC 1918/loopback/link-local/metadata ranges.

What changed
- `nodes/src/nodes/tts_openai/__init__.py` — module entry point exporting `IGlobal` and `IInstance`
- `nodes/src/nodes/tts_openai/IGlobal.py` — global config: API key validation via `client.models.list()`, model/voice/speed/format validation and clamping, shared `OpenAI` client creation
- `nodes/src/nodes/tts_openai/IInstance.py` — instance: `writeText()` calls the OpenAI TTS API, writes audio via the `AVI_ACTION` pattern, handles `RateLimitError`/`APIConnectionError` as transient warnings
- `nodes/src/nodes/tts_openai/services.json` — node definition with lanes (text -> audio), two preconfig profiles (tts-1, tts-1-hd), voice/speed/format fields, and test cases
- `nodes/src/nodes/tts_openai/requirements.txt` — declares the `openai` dependency
- `nodes/src/nodes/library/ssrf_protection.py` — new SSRF protection module: `validate_url()` checks scheme, blocked hostnames, DNS resolution, and IP ranges (RFC 1918, loopback, link-local, metadata, multicast); supports the `ROCKETRIDE_SSRF_ALLOWLIST` env var and per-call allowlists
- `nodes/src/nodes/tool_http_request/IGlobal.py` — added `_build_ssrf_allowlist()` static method; passes `ssrf_allowed_private` to `HttpDriver`
- `nodes/src/nodes/tool_http_request/http_driver.py` — `HttpDriver.__init__()` now accepts `ssrf_allowed_private`; passes it through to `execute_request()`
- `nodes/src/nodes/tool_http_request/http_client.py` — `execute_request()` now calls `validate_url(resolved_url, allowed_private=ssrf_allowed_private)` before making the HTTP request
- `nodes/test/test_ssrf_protection.py` — 304-line test suite covering blocked IPs, allowed public IPs, scheme validation, hostname blocking, allowlist (env var + per-call), and DNS resolution mocking

Validation
- `FORMAT_MIME_TYPES`
- `validateConfig`
- `tool_http_request` now blocks requests to `http://127.0.0.1`, `http://169.254.169.254` (cloud metadata), and `http://192.168.1.1`; confirm public URLs still work; confirm allowlisted CIDRs via `ROCKETRIDE_SSRF_ALLOWLIST` bypass the block
- `pytest nodes/test/test_ssrf_protection.py` — all tests should pass
- `ruff check` and `ruff format --check` on all new files

How this could be extended
The TTS node pattern can be reused for other speech providers (ElevenLabs, Google Cloud TTS, Azure Speech) by swapping the API client in `IGlobal` and the `speech.create()` call in `IInstance.writeText()`. The SSRF protection module is generic and should be adopted by any future node that makes outbound HTTP requests (e.g., `tool_exa_search`, `tool_firecrawl`) to prevent agents from exfiltrating data to internal services.

Closes #411
#Hack-with-bay-2
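The `validate_url()` flow this PR describes (scheme check, then DNS resolution, then per-IP range checks with a private-range allowlist) can be sketched as below. The function and error names and the injected `resolve` callable are illustrative stand-ins for testability, not the module's actual API:

```python
import ipaddress


class SSRFError(Exception):
    """Raised when a URL resolves to a blocked target (illustrative)."""


def check_url(scheme, host, resolve, allowed_private=()):
    """Sketch of the scheme -> DNS -> IP-range validation flow.

    `resolve(host)` returns the host's IP strings, standing in for
    socket.getaddrinfo; `allowed_private` is a list of CIDR strings.
    """
    if scheme not in ("http", "https"):
        raise SSRFError(f"blocked scheme: {scheme}")
    networks = [ipaddress.ip_network(c, strict=False) for c in allowed_private]
    for ip in resolve(host):
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in networks if addr.version == net.version):
            continue  # explicitly allowlisted private range
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_multicast:
            raise SSRFError(f"{host} resolves to private/reserved {ip}")
```

Checking every resolved address, rather than just the first, is what defends against multi-homed and DNS-rebinding hosts.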