Skip to content

chore: run ruff format across all Python node files#505

Closed
charliegillet wants to merge 1 commit into
rocketride-org:developfrom
charliegillet:chore/ruff-format-all-python
Closed

chore: run ruff format across all Python node files#505
charliegillet wants to merge 1 commit into
rocketride-org:developfrom
charliegillet:chore/ruff-format-all-python

Conversation

@charliegillet
Copy link
Copy Markdown
Contributor

@charliegillet charliegillet commented Mar 30, 2026

Summary

  • Ran ruff format and ruff check --fix on all 324 Python files under nodes/src/nodes/
  • 45 files reformatted, 279 files already compliant, 1 lint auto-fix applied
  • This is a formatting-only change with no logic modifications -- purely whitespace, blank lines, quote style, and trailing whitespace cleanup per the project's existing pyproject.toml ruff config

Closes #432

Test plan

  • Verify ruff format --check nodes/src/nodes/ passes with zero reformats needed
  • Verify ruff check nodes/src/nodes/ shows no new errors (remaining warnings are pre-existing and out of scope)
  • Confirm all existing tests pass (no logic was modified)
  • Spot-check a few diffs to confirm changes are formatting-only

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Code formatting and style improvements across multiple modules for consistency and maintainability

Apply ruff format and ruff check --fix to all 324 Python files under
nodes/src/nodes/. 45 files were reformatted and 1 lint auto-fix was
applied. This is a formatting-only change with no logic modifications.

Closes rocketride-org#432

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

Large-scale formatting pass across Python files in the nodes module applying consistent quote styles, collapsing multi-line expressions to single lines, and normalizing whitespace. All changes are purely stylistic with no functional logic alterations.

Changes

Cohort / File(s) Summary
Agent Nodes
nodes/src/nodes/agent_crewai/crewai.py, nodes/src/nodes/agent_langchain/IInstance.py, nodes/src/nodes/agent_langchain/langchain.py, nodes/src/nodes/agent_rocketride/IGlobal.py, nodes/src/nodes/agent_rocketride/executor.py, nodes/src/nodes/agent_rocketride/rocketride_agent.py
Condensed multi-line string literals to single lines, normalized blank lines, and updated quote style consistency in agent initialization and tool handling.
Index Search
nodes/src/nodes/index_search/*
Collapsed multi-line expressions in query builders, normalized quote styles across parsing/validation helpers, condensed debug/warning messages, and removed trailing whitespace.
Anonymize Module
nodes/src/nodes/anonymize/IInstance.py, nodes/src/nodes/anonymize/anonymize.py, nodes/src/nodes/anonymize/glinerRecognizer.py
Condensed list/expression literals to single lines and normalized blank-line spacing around method definitions and logic blocks.
Database Nodes
nodes/src/nodes/db_mysql/IGlobal.py, nodes/src/nodes/db_postgres/IGlobal.py
Removed alignment padding in dictionary literal formatting within _connection_params methods.
LLM & Accessibility Nodes
nodes/src/nodes/accessibility_describe/accessibility_vision.py, nodes/src/nodes/llm_openai/IGlobal.py, nodes/src/nodes/llm_ibm_watson/ibm_watson.py
Simplified prompt construction, collapsed error messages to single lines, reformatted constant definitions, and normalized quote styles.
Chart & Audio Nodes
nodes/src/nodes/audio_transcribe/IGlobal.py, nodes/src/nodes/chart_chartjs/IInstance.py, nodes/src/nodes/chart_chartjs/chartjs_driver.py
Compressed list comprehensions and string literals to single lines, reformatted multi-line Question/instruction arguments, and adjusted constant formatting.
MCP Client
nodes/src/nodes/mcp_client/IGlobal.py, nodes/src/nodes/mcp_client/IInstance.py, nodes/src/nodes/mcp_client/__init__.py, nodes/src/nodes/mcp_client/mcp_driver.py, nodes/src/nodes/mcp_client/mcp_sse_client.py, nodes/src/nodes/mcp_client/mcp_stdio_client.py, nodes/src/nodes/mcp_client/mcp_streamable_http_client.py
Condensed multi-line exception messages, normalized type annotation quotes, and removed trailing blank lines throughout transport and protocol handlers.
Other Nodes
nodes/src/nodes/autopipe/IGlobal.py, nodes/src/nodes/library/msgraph/application_permissions.py, nodes/src/nodes/library/permissions/utils.py, nodes/src/nodes/llamaparse/IInstance.py, nodes/src/nodes/memory_internal/memory.py, nodes/src/nodes/ner/IGlobal.py, nodes/src/nodes/ner/IInstance.py, nodes/src/nodes/ocr/IGlobal.py, nodes/src/nodes/ocr/IInstance.py, nodes/src/nodes/tool_firecrawl/firecrawl_driver.py, nodes/src/nodes/tool_http_request/IGlobal.py, nodes/src/nodes/tool_http_request/http_client.py, nodes/src/nodes/tool_http_request/http_driver.py, nodes/src/nodes/vectordb_postgres/vectordb_postgres.py
Applied consistent formatting across module initialization, error messages, SQL query strings, slice expressions, and dictionary/list literals; removed trailing whitespace and normalized blank-line placement.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

Suggested labels

module:nodes

Suggested reviewers

  • jmaionchi
  • stepmikhaylov
  • Rod-Christensen

Poem

🐰 A hop and a bound through the code we go,
Ruff format clean—let the Python strings flow!
From double to single, the whitespace aligned,
Each line finds its place, no stray spaces to find. ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Linked Issues check ❓ Inconclusive The PR satisfies the primary objective from issue #432 (ruff format pass on all Python files), but the code changes alone do not implement CI enforcement or CONTRIBUTING.md updates mentioned in the acceptance criteria. Verify that related changes to pyproject.toml CI configuration and CONTRIBUTING.md documentation are included in this PR or merged separately to fully satisfy issue #432's acceptance criteria.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main change: running ruff format across all Python node files as a formatting-only update.
Out of Scope Changes check ✅ Passed All changes are in-scope formatting modifications (whitespace, blank lines, quote style) directly resulting from running ruff format, with no unrelated logic or feature changes introduced.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nodes/src/nodes/accessibility_describe/accessibility_vision.py`:
- Line 108: Replace the long inline raise messages with a named constant or a
dedicated exception to satisfy TRY003: create a module-level constant (e.g.,
MISSING_GOOGLE_AI_API_KEY_MSG) or a custom exception class (e.g.,
MissingGoogleAPIKeyError) in accessibility_vision.py and use that constant/class
in the three raise sites where ValueError is raised (the current inline raises
around the Google AI API key checks). Keep the raise calls concise (raise
ValueError(MISSING_GOOGLE_AI_API_KEY_MSG) or raise MissingGoogleAPIKeyError) so
the message text is centralized and the raises are short.
- Line 70: Replace the double-quoted regular string literal used as the value
for the 'clock' key with a single-quoted literal (i.e., change "\"\\n\\nUse
clock positions for spatial references (12 o'clock = straight ahead).\"" to a
single-quoted version) to match the module's quote convention; also locate and
convert the other double-quoted regular string literal noted in the review (the
second similar occurrence in this module) to single quotes so all regular string
literals in accessibility_vision.py follow the project's single-quote style.

In `@nodes/src/nodes/anonymize/glinerRecognizer.py`:
- Line 237: Replace the wrapped generator-to-list expression assigning
existing_matches with a direct list comprehension to satisfy Ruff C400; update
the assignment that currently uses list((... for ...)) to use [(m['offset'],
m['length']) for m in ((m.get('location', {}).get('inChars') or m) for m in
text_matches)] so existing_matches is a plain list of tuples derived from
text_matches (keep the inner fallback logic using m.get('location',
{}).get('inChars') or m).

In `@nodes/src/nodes/anonymize/IInstance.py`:
- Line 32: The class-level DEFAULT_PII_LABELS is defined as a mutable list on
IInstance; replace it with an immutable tuple (e.g., DEFAULT_PII_LABELS =
('person', 'name', ...)) to prevent accidental cross-instance mutation, and
update any usages or type hints that expect a list to accept a tuple (or convert
with list(...) at use sites) so behavior remains the same while the constant
stays immutable.

In `@nodes/src/nodes/index_search/elasticsearch_store.py`:
- Line 106: Replace the generic Exception raises with specific exception types:
where the code validates the similarity option (the check that currently raises
"Similarity must be one of: cosine, l2_norm, dot_product") change the raise to
ValueError with the same message; for the bulk indexing failure path (the raise
at the other location around line 540) raise RuntimeError (or a module-specific
custom exception class) instead of Exception and preserve the original error
message and any wrapped exception info. Update the surrounding
docstring/comments and any tests that expect the old Exception type to assert
the new specific exceptions (ValueError for config validation and RuntimeError
for bulk indexing failures).

In `@nodes/src/nodes/index_search/IGlobal.py`:
- Line 214: In IGlobal.py update the regular string literals used in the warning
calls to use single quotes instead of double quotes; locate the warning(...)
invocation(s) (e.g., the call that currently logs "Index name is invalid. Use
1-255 lowercase chars: letters, digits, '_', '-', '.'; no '/' or spaces") and
replace the double-quoted literal with a single-quoted literal to match the repo
convention for nodes/**/*.py (also apply the same change to the similar string
at the other occurrence around line 260).
- Around line 336-340: The try/except around parsing slop in IGlobal is too
broad; narrow the catch to only handle coercion errors—replace the generic
except Exception in the block that reads connConfig.get('slop') and sets
self.search_exact_slop (used with self.search_match_operator == 'exact') with
except (ValueError, TypeError) so only int() conversion/coercion errors are
swallowed and other exceptions propagate.

In `@nodes/src/nodes/llamaparse/IInstance.py`:
- Line 322: Replace the explicit str() conversion in the f-string that builds
error_text (the variable named error_text in IInstance.py where the exception
variable is e) by using the f-string conversion flag {e!s} instead of {str(e)};
update the f-string in the error handling block that constructs the LlamaParse
Processing Error message so it reads {e!s} to satisfy Ruff RUF010 while
preserving the rest of the message and surrounding variables (mime_type,
document_data).

In `@nodes/src/nodes/llm_ibm_watson/ibm_watson.py`:
- Line 71: The current raise ValueError(...) in ibm_watson.py can be replaced by
a custom exception to satisfy TRY003: create an InvalidLocationError(Exception
or ValueError subclass) that accepts location and valid_locations, builds the
message using location and _VALID_LOCATIONS, and replace the raise
ValueError(...) with raise InvalidLocationError(location, _VALID_LOCATIONS);
ensure the new class is defined near related helpers (e.g., above where location
is validated) and update any tests or callers to catch InvalidLocationError if
needed.

In `@nodes/src/nodes/ocr/IInstance.py`:
- Around line 49-50: The private helper function _diag is missing a return type
annotation (Ruff ANN202); update its signature (def _diag(msg) -> None:) to
explicitly return None so the linter is satisfied, keeping the function body
unchanged; locate the _diag definition inside the IInstance class/function and
add the "-> None" return annotation to the def line.

In `@nodes/src/nodes/vectordb_postgres/vectordb_postgres.py`:
- Around line 53-55: The SQL for the 'create_collection' entry is a single long
literal and the surrounding "# fmt: off/on" implies you wanted preserved manual
formatting; split the SQL into a multi-line Python string by concatenating
adjacent string literals (or a triple-quoted string) so the CREATE TABLE clause
is readable while keeping the same placeholders ({collection}, {vector_size}),
and then either remove the "# fmt: off/on" markers or ensure your ruff config
respects them; update the 'create_collection' value accordingly in
vectordb_postgres.py so the statement remains identical semantically but is
formatted across multiple lines for maintainability.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2db39755-61cb-4cab-8158-93842b038ea0

📥 Commits

Reviewing files that changed from the base of the PR and between b6b2dc1 and 0e0eddd.

📒 Files selected for processing (45)
  • nodes/src/nodes/accessibility_describe/accessibility_vision.py
  • nodes/src/nodes/agent_crewai/crewai.py
  • nodes/src/nodes/agent_langchain/IInstance.py
  • nodes/src/nodes/agent_langchain/langchain.py
  • nodes/src/nodes/agent_rocketride/IGlobal.py
  • nodes/src/nodes/agent_rocketride/executor.py
  • nodes/src/nodes/agent_rocketride/rocketride_agent.py
  • nodes/src/nodes/anonymize/IInstance.py
  • nodes/src/nodes/anonymize/anonymize.py
  • nodes/src/nodes/anonymize/glinerRecognizer.py
  • nodes/src/nodes/audio_transcribe/IGlobal.py
  • nodes/src/nodes/autopipe/IGlobal.py
  • nodes/src/nodes/chart_chartjs/IInstance.py
  • nodes/src/nodes/chart_chartjs/chartjs_driver.py
  • nodes/src/nodes/db_mysql/IGlobal.py
  • nodes/src/nodes/db_postgres/IGlobal.py
  • nodes/src/nodes/index_search/IEndpoint.py
  • nodes/src/nodes/index_search/IGlobal.py
  • nodes/src/nodes/index_search/IInstance.py
  • nodes/src/nodes/index_search/__init__.py
  • nodes/src/nodes/index_search/constants.py
  • nodes/src/nodes/index_search/elasticsearch_store.py
  • nodes/src/nodes/index_search/opensearch_client.py
  • nodes/src/nodes/library/msgraph/application_permissions.py
  • nodes/src/nodes/library/permissions/utils.py
  • nodes/src/nodes/llamaparse/IInstance.py
  • nodes/src/nodes/llm_ibm_watson/ibm_watson.py
  • nodes/src/nodes/llm_openai/IGlobal.py
  • nodes/src/nodes/mcp_client/IGlobal.py
  • nodes/src/nodes/mcp_client/IInstance.py
  • nodes/src/nodes/mcp_client/__init__.py
  • nodes/src/nodes/mcp_client/mcp_driver.py
  • nodes/src/nodes/mcp_client/mcp_sse_client.py
  • nodes/src/nodes/mcp_client/mcp_stdio_client.py
  • nodes/src/nodes/mcp_client/mcp_streamable_http_client.py
  • nodes/src/nodes/memory_internal/memory.py
  • nodes/src/nodes/ner/IGlobal.py
  • nodes/src/nodes/ner/IInstance.py
  • nodes/src/nodes/ocr/IGlobal.py
  • nodes/src/nodes/ocr/IInstance.py
  • nodes/src/nodes/tool_firecrawl/firecrawl_driver.py
  • nodes/src/nodes/tool_http_request/IGlobal.py
  • nodes/src/nodes/tool_http_request/http_client.py
  • nodes/src/nodes/tool_http_request/http_driver.py
  • nodes/src/nodes/vectordb_postgres/vectordb_postgres.py
💤 Files with no reviewable changes (7)
  • nodes/src/nodes/mcp_client/mcp_driver.py
  • nodes/src/nodes/agent_langchain/IInstance.py
  • nodes/src/nodes/autopipe/IGlobal.py
  • nodes/src/nodes/mcp_client/mcp_stdio_client.py
  • nodes/src/nodes/mcp_client/IInstance.py
  • nodes/src/nodes/mcp_client/mcp_streamable_http_client.py
  • nodes/src/nodes/mcp_client/init.py

# Spatial format prompt modifiers
SPATIAL_PROMPTS = {
'clock': '\n\nUse clock positions for spatial references (12 o\'clock = straight ahead).',
'clock': "\n\nUse clock positions for spatial references (12 o'clock = straight ahead).",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use single quotes for regular string literals in nodes/**/*.py.

These two updated literals switched to double quotes and now diverge from the node-level quote convention.

Proposed quote-style fix
-    'clock': "\n\nUse clock positions for spatial references (12 o'clock = straight ahead).",
+    'clock': '\n\nUse clock positions for spatial references (12 o\'clock = straight ahead).',
...
-            return f"Model '{self._model}' is currently unavailable. Please try a different model."
+            return f'Model \'{self._model}\' is currently unavailable. Please try a different model.'
As per coding guidelines, `nodes/**/*.py`: Python pipeline nodes: use single quotes, ruff for linting/formatting, PEP 257 docstrings, target Python 3.10+.

Also applies to: 147-147

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/accessibility_describe/accessibility_vision.py` at line 70,
Replace the double-quoted regular string literal used as the value for the
'clock' key with a single-quoted literal (i.e., change "\"\\n\\nUse clock
positions for spatial references (12 o'clock = straight ahead).\"" to a
single-quoted version) to match the module's quote convention; also locate and
convert the other double-quoted regular string literal noted in the review (the
second similar occurrence in this module) to single quotes so all regular string
literals in accessibility_vision.py follow the project's single-quote style.

raise ValueError(
'Missing Google AI API key. Get one at https://aistudio.google.com/apikey'
)
raise ValueError('Missing Google AI API key. Get one at https://aistudio.google.com/apikey')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

TRY003 warnings remain on long inline ValueError messages.

Ruff is still flagging these lines; consider moving message text to named constants or a dedicated exception type to keep raises concise and consistent.

Also applies to: 112-112, 189-189

🧰 Tools
🪛 Ruff (0.15.7)

[warning] 108-108: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/accessibility_describe/accessibility_vision.py` at line 108,
Replace the long inline raise messages with a named constant or a dedicated
exception to satisfy TRY003: create a module-level constant (e.g.,
MISSING_GOOGLE_AI_API_KEY_MSG) or a custom exception class (e.g.,
MissingGoogleAPIKeyError) in accessibility_vision.py and use that constant/class
in the three raise sites where ValueError is raised (the current inline raises
around the Google AI API key checks). Keep the raise calls concise (raise
ValueError(MISSING_GOOGLE_AI_API_KEY_MSG) or raise MissingGoogleAPIKeyError) so
the message text is centralized and the raises are short.

(m['offset'], m['length'])
for m in ((m.get('location', {}).get('inChars') or m) for m in text_matches)
)
existing_matches = list((m['offset'], m['length']) for m in ((m.get('location', {}).get('inChars') or m) for m in text_matches))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the exact pattern and nearby context for C400-style constructions.
rg -n -C2 "existing_matches\s*=\s*list\(" nodes/src/nodes/anonymize/glinerRecognizer.py

Repository: rocketride-org/rocketride-server

Length of output: 388


Rewrite existing_matches as a direct list comprehension to comply with Ruff C400.

Use [(m['offset'], m['length']) for m in ...] instead of wrapping the comprehension with list().

♻️ Proposed cleanup
-        existing_matches = list((m['offset'], m['length']) for m in ((m.get('location', {}).get('inChars') or m) for m in text_matches))
+        existing_matches = [(m['offset'], m['length']) for m in ((m.get('location', {}).get('inChars') or m) for m in text_matches)]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
existing_matches = list((m['offset'], m['length']) for m in ((m.get('location', {}).get('inChars') or m) for m in text_matches))
existing_matches = [(m['offset'], m['length']) for m in ((m.get('location', {}).get('inChars') or m) for m in text_matches)]
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 237-237: Unnecessary generator (rewrite as a list comprehension)

Rewrite as a list comprehension

(C400)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/anonymize/glinerRecognizer.py` at line 237, Replace the
wrapped generator-to-list expression assigning existing_matches with a direct
list comprehension to satisfy Ruff C400; update the assignment that currently
uses list((... for ...)) to use [(m['offset'], m['length']) for m in
((m.get('location', {}).get('inChars') or m) for m in text_matches)] so
existing_matches is a plain list of tuples derived from text_matches (keep the
inner fallback logic using m.get('location', {}).get('inChars') or m).

'bank account', 'passport number', 'driver license'
]

DEFAULT_PII_LABELS = ['person', 'name', 'email', 'phone number', 'address', 'social security number', 'credit card number', 'date of birth', 'organization', 'company', 'location', 'ip address', 'bank account', 'passport number', 'driver license']
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether DEFAULT_PII_LABELS is mutated anywhere (should return no mutation sites).
rg -n -C2 "DEFAULT_PII_LABELS\s*(\[|\=|\.append|\.extend|\.remove|\.pop|\.clear|\[.*\]\s*=)" nodes/src/nodes/anonymize/IInstance.py nodes/src/nodes/anonymize

Repository: rocketride-org/rocketride-server

Length of output: 247


🏁 Script executed:

# First, let me examine the file structure and context around line 32
cat -n nodes/src/nodes/anonymize/IInstance.py | head -50

Repository: rocketride-org/rocketride-server

Length of output: 2840


🏁 Script executed:

# Search for DEFAULT_PII_LABELS usage and potential mutations with simpler, corrected patterns
rg -n "DEFAULT_PII_LABELS" nodes/src/nodes/anonymize/

Repository: rocketride-org/rocketride-server

Length of output: 530


🏁 Script executed:

# Check for mutations via append, extend, remove, pop, clear, or direct assignment
rg -n "\.append\(|\.extend\(|\.remove\(|\.pop\(|\.clear\(" nodes/src/nodes/anonymize/IInstance.py

Repository: rocketride-org/rocketride-server

Length of output: 58


🏁 Script executed:

# Search the entire anonymize directory for DEFAULT_PII_LABELS mutations or reassignment
rg -n "DEFAULT_PII_LABELS" nodes/src/nodes/anonymize/ -A 2 -B 2

Repository: rocketride-org/rocketride-server

Length of output: 1109


🏁 Script executed:

# Check if there are any attempts to modify the list through indexing or methods
rg -n "DEFAULT_PII_LABELS\[|DEFAULT_PII_LABELS\.pop|DEFAULT_PII_LABELS\.remove|DEFAULT_PII_LABELS\.append|DEFAULT_PII_LABELS\.extend|DEFAULT_PII_LABELS\.clear" nodes/src/nodes/anonymize/

Repository: rocketride-org/rocketride-server

Length of output: 58


Avoid mutable class-level default for DEFAULT_PII_LABELS.

Line 32 defines a mutable list at class scope; while no mutations are currently made in the codebase, accidental mutation would leak across instances. Prefer an immutable tuple constant.

🛡️ Proposed fix
-    DEFAULT_PII_LABELS = ['person', 'name', 'email', 'phone number', 'address', 'social security number', 'credit card number', 'date of birth', 'organization', 'company', 'location', 'ip address', 'bank account', 'passport number', 'driver license']
+    DEFAULT_PII_LABELS = (
+        'person',
+        'name',
+        'email',
+        'phone number',
+        'address',
+        'social security number',
+        'credit card number',
+        'date of birth',
+        'organization',
+        'company',
+        'location',
+        'ip address',
+        'bank account',
+        'passport number',
+        'driver license',
+    )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
DEFAULT_PII_LABELS = ['person', 'name', 'email', 'phone number', 'address', 'social security number', 'credit card number', 'date of birth', 'organization', 'company', 'location', 'ip address', 'bank account', 'passport number', 'driver license']
DEFAULT_PII_LABELS = (
'person',
'name',
'email',
'phone number',
'address',
'social security number',
'credit card number',
'date of birth',
'organization',
'company',
'location',
'ip address',
'bank account',
'passport number',
'driver license',
)
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 32-32: Mutable default value for class attribute

(RUF012)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/anonymize/IInstance.py` at line 32, The class-level
DEFAULT_PII_LABELS is defined as a mutable list on IInstance; replace it with an
immutable tuple (e.g., DEFAULT_PII_LABELS = ('person', 'name', ...)) to prevent
accidental cross-instance mutation, and update any usages or type hints that
expect a list to accept a tuple (or convert with list(...) at use sites) so
behavior remains the same while the constant stays immutable.

raise Exception(
'Similarity must be one of: cosine, l2_norm, dot_product'
)
raise Exception('Similarity must be one of: cosine, l2_norm, dot_product')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Use more specific exception types in touched paths.

Lines 106 and 540 still raise generic Exception; this keeps Ruff TRY warnings alive and weakens error contracts. Prefer ValueError for config validation and RuntimeError (or a custom error type) for bulk indexing failures.

Proposed refinement
-            raise Exception('Similarity must be one of: cosine, l2_norm, dot_product')
+            raise ValueError('Similarity must be one of: cosine, l2_norm, dot_product')
...
-                    raise Exception(f'Elasticsearch index failed (id={doc_id}): {reason}') from e
+                    raise RuntimeError(f'Elasticsearch index failed (id={doc_id}): {reason}') from e

Also applies to: 540-540

🧰 Tools
🪛 Ruff (0.15.7)

[warning] 106-106: Create your own exception

(TRY002)


[warning] 106-106: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/index_search/elasticsearch_store.py` at line 106, Replace the
generic Exception raises with specific exception types: where the code validates
the similarity option (the check that currently raises "Similarity must be one
of: cosine, l2_norm, dot_product") change the raise to ValueError with the same
message; for the bulk indexing failure path (the raise at the other location
around line 540) raise RuntimeError (or a module-specific custom exception
class) instead of Exception and preserve the original error message and any
wrapped exception info. Update the surrounding docstring/comments and any tests
that expect the old Exception type to assert the new specific exceptions
(ValueError for config validation and RuntimeError for bulk indexing failures).

Comment on lines 336 to 340
try:
slop_val = connConfig.get('slop')
self.search_exact_slop = (
int(slop_val or 0) if self.search_match_operator == 'exact' else 0
)
self.search_exact_slop = int(slop_val or 0) if self.search_match_operator == 'exact' else 0
except Exception:
self.search_exact_slop = 0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Narrow the exception in slop parsing.

Catching Exception here is broader than needed; constrain to TypeError/ValueError for config coercion.

Proposed fix
-        except Exception:
+        except (TypeError, ValueError):
             self.search_exact_slop = 0
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 339-339: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/index_search/IGlobal.py` around lines 336 - 340, The
try/except around parsing slop in IGlobal is too broad; narrow the catch to only
handle coercion errors—replace the generic except Exception in the block that
reads connConfig.get('slop') and sets self.search_exact_slop (used with
self.search_match_operator == 'exact') with except (ValueError, TypeError) so
only int() conversion/coercion errors are swallowed and other exceptions
propagate.

f'**File Type:** {mime_type}\n**File Size:** {len(document_data)} bytes\n\n'
f'*The document could not be processed due to an error in the LlamaParse service.*'
)
error_text = f'# LlamaParse Processing Error\n\n**Error Type:** {type(e).__name__}\n**Error Message:** {str(e)}\n\n**File Type:** {mime_type}\n**File Size:** {len(document_data)} bytes\n\n*The document could not be processed due to an error in the LlamaParse service.*'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify Ruff RUF010 status for this file and inspect the exact line.
set -euo pipefail

if command -v ruff >/dev/null 2>&1; then
  ruff check --select RUF010 nodes/src/nodes/llamaparse/IInstance.py
else
  echo "ruff not found; fallback pattern check:"
  rg -nP '\{str\([^}]+\)\}' nodes/src/nodes/llamaparse/IInstance.py
fi

Repository: rocketride-org/rocketride-server

Length of output: 3279


Line 322: replace {str(e)} with {e!s} to satisfy Ruff RUF010.

Ruff detects an explicit f-string conversion violation at this location. Use the conversion flag {e!s} instead of wrapping with str(...).

Suggested diff
-            error_text = f'# LlamaParse Processing Error\n\n**Error Type:** {type(e).__name__}\n**Error Message:** {str(e)}\n\n**File Type:** {mime_type}\n**File Size:** {len(document_data)} bytes\n\n*The document could not be processed due to an error in the LlamaParse service.*'
+            error_text = f'# LlamaParse Processing Error\n\n**Error Type:** {type(e).__name__}\n**Error Message:** {e!s}\n\n**File Type:** {mime_type}\n**File Size:** {len(document_data)} bytes\n\n*The document could not be processed due to an error in the LlamaParse service.*'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
error_text = f'# LlamaParse Processing Error\n\n**Error Type:** {type(e).__name__}\n**Error Message:** {str(e)}\n\n**File Type:** {mime_type}\n**File Size:** {len(document_data)} bytes\n\n*The document could not be processed due to an error in the LlamaParse service.*'
error_text = f'# LlamaParse Processing Error\n\n**Error Type:** {type(e).__name__}\n**Error Message:** {e!s}\n\n**File Type:** {mime_type}\n**File Size:** {len(document_data)} bytes\n\n*The document could not be processed due to an error in the LlamaParse service.*'
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 322-322: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/llamaparse/IInstance.py` at line 322, Replace the explicit
str() conversion in the f-string that builds error_text (the variable named
error_text in IInstance.py where the exception variable is e) by using the
f-string conversion flag {e!s} instead of {str(e)}; update the f-string in the
error handling block that constructs the LlamaParse Processing Error message so
it reads {e!s} to satisfy Ruff RUF010 while preserving the rest of the message
and surrounding variables (mime_type, document_data).

f'Unknown IBM Cloud location: {location!r}. '
f'Valid locations: {", ".join(sorted(_VALID_LOCATIONS))}'
)
raise ValueError(f'Unknown IBM Cloud location: {location!r}. Valid locations: {", ".join(sorted(_VALID_LOCATIONS))}')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

LGTM! Formatting change is correct.

The single-line f-string is a valid formatting improvement by ruff. The error message content and logic remain unchanged.

Optional future improvement: Ruff flags TRY003 here, suggesting that long exception messages should be defined within custom exception classes for better reusability and testing. Since this PR is formatting-only, consider addressing this in a follow-up if the error message is raised in multiple locations or if you want to standardize exception handling.

📝 Example TRY003 fix (optional, out of scope for this PR)

Create a custom exception:

class InvalidLocationError(ValueError):
    """Raised when an unknown IBM Cloud location is provided."""
    def __init__(self, location: str, valid_locations: frozenset):
        msg = f'Unknown IBM Cloud location: {location!r}. Valid locations: {", ".join(sorted(valid_locations))}'
        super().__init__(msg)

Then use it:

-        raise ValueError(f'Unknown IBM Cloud location: {location!r}. Valid locations: {", ".join(sorted(_VALID_LOCATIONS))}')
+        raise InvalidLocationError(location, _VALID_LOCATIONS)
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 71-71: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/llm_ibm_watson/ibm_watson.py` at line 71, The current raise
ValueError(...) in ibm_watson.py can be replaced by a custom exception to
satisfy TRY003: create an InvalidLocationError(Exception or ValueError subclass)
that accepts location and valid_locations, builds the message using location and
_VALID_LOCATIONS, and replace the raise ValueError(...) with raise
InvalidLocationError(location, _VALID_LOCATIONS); ensure the new class is
defined near related helpers (e.g., above where location is validated) and
update any tests or callers to catch InvalidLocationError if needed.

Comment on lines 49 to 50
def _diag(msg):
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify only this rule on this file
ruff check nodes/src/nodes/ocr/IInstance.py --select ANN202

Repository: rocketride-org/rocketride-server

Length of output: 626


Add return type annotation for _diag function.

The private function at line 49 is missing a return type annotation, triggering Ruff rule ANN202.

Suggested fix
-        def _diag(msg):
+        def _diag(msg) -> None:
             pass
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _diag(msg):
pass
def _diag(msg) -> None:
pass
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 49-49: Missing return type annotation for private function _diag

Add return type annotation: None

(ANN202)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/ocr/IInstance.py` around lines 49 - 50, The private helper
function _diag is missing a return type annotation (Ruff ANN202); update its
signature (def _diag(msg) -> None:) to explicitly return None so the linter is
satisfied, keeping the function body unchanged; locate the _diag definition
inside the IInstance class/function and add the "-> None" return annotation to
the def line.

Comment on lines 53 to 55
# fmt: off
'create_collection': (
'CREATE TABLE IF NOT EXISTS {collection} ('
'id bigserial PRIMARY KEY, '
'content text, '
'objectId text, '
'nodeId text, '
'parent text, '
'permissionId int, '
'isDeleted boolean, '
'chunkId int, '
'isTable boolean, '
'tableId int, '
'vectorSize int, '
'modelName text, '
'embedding vector({vector_size})'
');'
),
'create_collection': ('CREATE TABLE IF NOT EXISTS {collection} (id bigserial PRIMARY KEY, content text, objectId text, nodeId text, parent text, permissionId int, isDeleted boolean, chunkId int, isTable boolean, tableId int, vectorSize int, modelName text, embedding vector({vector_size}));'),
# fmt: on
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Formatting change consistent with PR objectives; consider multi-line SQL for readability.

The CREATE TABLE statement has been collapsed to a single ~260-character line, which aligns with the ruff format pass described in the PR objectives. However, the # fmt: off / # fmt: on markers suggest an intent to preserve manual formatting. If these markers are meant to prevent ruff from reformatting this section, you may need to verify your ruff configuration respects them.

For improved readability when reviewing or modifying the schema, consider formatting the SQL across multiple lines:

📖 Optional: Multi-line SQL for readability
-    # fmt: off
-    'create_collection': ('CREATE TABLE IF NOT EXISTS {collection} (id bigserial PRIMARY KEY, content text, objectId text, nodeId text, parent text, permissionId int, isDeleted boolean, chunkId int, isTable boolean, tableId int, vectorSize int, modelName text, embedding vector({vector_size}));'),
-    # fmt: on
+    'create_collection': (
+        'CREATE TABLE IF NOT EXISTS {collection} ('
+        'id bigserial PRIMARY KEY, '
+        'content text, '
+        'objectId text, '
+        'nodeId text, '
+        'parent text, '
+        'permissionId int, '
+        'isDeleted boolean, '
+        'chunkId int, '
+        'isTable boolean, '
+        'tableId int, '
+        'vectorSize int, '
+        'modelName text, '
+        'embedding vector({vector_size})'
+        ');'
+    ),

This keeps the SQL readable without requiring # fmt: off markers, as Python automatically concatenates adjacent string literals.

🧰 Tools
🪛 Ruff (0.15.7)

[warning] 53-53: This suppression comment is invalid because it cannot be in an expression, pattern, argument list, or other non-statement

Remove this comment

(RUF028)


[warning] 55-55: This suppression comment is invalid because it cannot be in an expression, pattern, argument list, or other non-statement

Remove this comment

(RUF028)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/vectordb_postgres/vectordb_postgres.py` around lines 53 - 55,
The SQL for the 'create_collection' entry is a single long literal and the
surrounding "# fmt: off/on" implies you wanted preserved manual formatting;
split the SQL into a multi-line Python string by concatenating adjacent string
literals (or a triple-quoted string) so the CREATE TABLE clause is readable
while keeping the same placeholders ({collection}, {vector_size}), and then
either remove the "# fmt: off/on" markers or ensure your ruff config respects
them; update the 'create_collection' value accordingly in vectordb_postgres.py
so the statement remains identical semantically but is formatted across multiple
lines for maintainability.

@charliegillet
Copy link
Copy Markdown
Contributor Author

Closing — duplicate of #485 by josephmccann which was submitted first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:nodes Python pipeline nodes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run ruff on all Python files and enforce formatting in CI

1 participant