Fix jira#11883

Merged
ea-rus merged 12 commits into releases/25.11.1 from fix_jira
Nov 28, 2025
Conversation

@ZoranPandovski
Member

Description

This PR fixes the Jira handler so that queries still return results when issue data is null or missing, instead of failing.

Fixes https://linear.app/mindsdb/issue/FQE-1425/jira-handler-querying-the-issues-table-fails-with-a-keyerror-table

Type of change

(Please delete options that are not relevant)

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ⚡ New feature (non-breaking change which adds functionality)
  • 📢 Breaking change (fix or feature that would cause existing functionality not to work as expected)
  • 📄 This change requires a documentation update

Checklist:

  • My code follows the style guidelines (PEP 8) of MindsDB.
  • I have appropriately commented on my code, especially in complex areas.
  • Necessary documentation updates are either made or tracked in issues.
  • Relevant unit and integration tests are updated or added.

@entelligence-ai-pr-reviews
Contributor

🔒 Entelligence AI Vulnerability Scanner

No security vulnerabilities found!

Your code passed our comprehensive security analysis.

📊 Files Analyzed: 2 files


@entelligence-ai-pr-reviews
Contributor

Review Summary

🏷️ Draft Comments (2)

Skipped posting 2 draft comments that were valid but scored below your review threshold (>=13/15). Feel free to update them here.

mindsdb/integrations/handlers/jira_handler/jira_tables.py (2)

133-134: JiraIssuesTable.normalize will raise a KeyError if any expected column is missing from the input, due to direct column selection with issues_df[self.get_columns()].

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/handlers/jira_handler/jira_tables.py, lines 133-134, the code selects columns directly with `issues_df[self.get_columns()]`, which raises a KeyError if any expected column is missing from the data. Replace this with `issues_df.reindex(columns=self.get_columns(), fill_value=None)` to ensure missing columns are filled with None and prevent runtime errors.
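
For illustration, the difference between direct column selection and `reindex` can be sketched as below. This is a minimal sketch, not the handler's code: the column list and the sample row are hypothetical stand-ins for what `get_columns()` and the Jira API would return, and the sketch relies on the default fill, so absent columns show up as missing values (NaN).

```python
import pandas as pd

# Hypothetical column list; in the handler this would come from get_columns().
EXPECTED_COLUMNS = ["id", "key", "summary", "assignee"]

# Hypothetical sample row that lacks the "assignee" column.
issues_df = pd.DataFrame([{"id": "1", "key": "PROJ-1", "summary": "Fix login"}])

# issues_df[EXPECTED_COLUMNS] would raise KeyError here, because "assignee" is absent.
# reindex returns every expected column and fills the absent ones with a missing value.
normalized = issues_df.reindex(columns=EXPECTED_COLUMNS)
```

Rows keep their existing values, and only the columns that were absent from the input are filled.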

92-96: JiraIssuesTable.list fetches all issues for all projects when no conditions are provided, which can cause severe performance degradation for large Jira instances.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 3/5
  • Urgency Impact: 3/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/handlers/jira_handler/jira_tables.py, lines 92-96, the code fetches all issues for all projects when no conditions are provided, which can cause severe performance degradation for large Jira instances. Update this logic to always respect the `limit` parameter if provided, and add a warning or safeguard to prevent unbounded fetching when no limit is set. Ensure that the code does not fetch all issues from all projects unless explicitly requested, and that it stops fetching once the limit is reached.
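
One way to stop fetching once the limit is reached is a generator over pages, sketched below. All names here (`fetch_issues`, `fetch_page`, `fake_fetch_page`) are hypothetical and do not come from the Jira handler or any Jira client; the fake page function only stands in for a paginated API.

```python
from typing import Callable, Dict, Iterator, List, Optional


def fetch_issues(
    fetch_page: Callable[[int, int], List[Dict]],
    limit: Optional[int] = None,
    page_size: int = 50,
) -> Iterator[Dict]:
    """Yield issues page by page, stopping once `limit` issues have been yielded."""
    yielded = 0
    start = 0
    while limit is None or yielded < limit:
        page = fetch_page(start, page_size)
        if not page:
            return  # no more data upstream
        for issue in page:
            if limit is not None and yielded >= limit:
                return  # limit reached mid-page: request no further pages
            yield issue
            yielded += 1
        start += len(page)


# Stand-in for a paginated API: 120 fake issues, with a record of the requested offsets.
_ALL_ISSUES = [{"key": f"PROJ-{i}"} for i in range(120)]
calls: List[int] = []


def fake_fetch_page(start: int, size: int) -> List[Dict]:
    calls.append(start)
    return _ALL_ISSUES[start:start + size]


first_ten = list(fetch_issues(fake_fetch_page, limit=10))
```

With `limit=10` only the first page is requested; with `limit=None` the generator still walks every page, so a caller-side safeguard for the unbounded case would be a separate decision.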

StpMax
StpMax previously approved these changes Nov 17, 2025
@ZoranPandovski ZoranPandovski changed the base branch from develop to releases/25.11.1 November 20, 2025 13:32
@ZoranPandovski ZoranPandovski changed the base branch from releases/25.11.1 to develop November 20, 2025 13:32
@ZoranPandovski ZoranPandovski dismissed StpMax’s stale review November 20, 2025 13:32

The base branch was changed.

@ea-rus ea-rus changed the base branch from develop to releases/25.11.1 November 27, 2025 16:55
# Conflicts:
#	mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py
@entelligence-ai-pr-reviews
Contributor

Review Summary

🏷️ Draft Comments (22)

Skipped posting 22 draft comments that were valid but scored below your review threshold (>=13/15). Feel free to update them here.

docs/sdks/python/knowledge_bases/create.mdx (1)

105-105: snowflake provider is now documented as supporting both embedding_model and reranking_model, but Snowflake Cortex AI does not offer embedding models; this misleads users and may cause runtime errors if attempted.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 11/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In docs/sdks/python/knowledge_bases/create.mdx, on line 105, the documentation incorrectly states that the `snowflake` provider supports both `embedding_model` and `reranking_model`. However, Snowflake Cortex AI does not offer embedding models. Please change the line to: 'This provider is supported for `reranking_model`. Note that Snowflake Cortex AI does not offer embedding models as of now.'

mindsdb/api/executor/planner/plan_join.py (1)

677-744: The get_filters_from_join_conditions method is currently disabled for filter pushdown, causing large cross-database joins to always fetch full tables and perform joins in memory, which is highly inefficient for large datasets.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 3/5
  • Urgency Impact: 3/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/api/executor/planner/plan_join.py, lines 677-744, the `get_filters_from_join_conditions` method is currently not used for filter pushdown, resulting in full-table scans and in-memory joins for large cross-database joins. Consider enabling and carefully integrating this method to push down join filters (e.g., via IN clauses or subqueries) to the source databases when safe, especially for cases where the join key cardinality is low. Ensure that the implementation avoids generating massive IN clauses and only applies filter pushdown when the resulting query size is manageable. This will significantly improve performance for large joins.

mindsdb/api/executor/planner/query_planner.py (1)

81-89: integration_name and integration are not lowercased when added to self.integrations and _projects, which can cause key mismatches and runtime errors when resolving integrations or projects by name.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/api/executor/planner/query_planner.py, lines 81-89, the integration and project names are not consistently lowercased when added to self.integrations and _projects. This can cause runtime key mismatches when resolving integrations or projects by name. Update the code so that both integration_name and integration are always lowercased before being used as keys or added to _projects.
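
The idea can be sketched with a small registry that lowercases names on both write and read. `IntegrationRegistry` is a hypothetical class for illustration only and is not part of the query planner.

```python
from typing import Dict, Optional


class IntegrationRegistry:
    """Hypothetical registry that normalizes names to lowercase on write and on read."""

    def __init__(self) -> None:
        self._integrations: Dict[str, dict] = {}

    def add(self, name: str, info: dict) -> None:
        # Normalize once at insertion so every later lookup uses the same key form.
        self._integrations[name.lower()] = info

    def get(self, name: str) -> Optional[dict]:
        return self._integrations.get(name.lower())


registry = IntegrationRegistry()
registry.add("My_Jira", {"engine": "jira"})
```

Normalizing in one place avoids the case where a name is stored as `My_Jira` but resolved as `my_jira`.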

mindsdb/api/http/initialize.py (2)

328-340: check_session_auth is used in the before_request handler, but if it raises an exception or is not implemented correctly, all protected endpoints may become inaccessible, causing runtime failures.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/api/http/initialize.py, lines 328-340, the use of `check_session_auth()` in the `before_request` handler can cause all protected endpoints to become inaccessible if it raises an exception. Wrap the call to `check_session_auth()` in a try/except block, log any exception, and treat exceptions as authentication failures to prevent runtime crashes and ensure endpoints remain accessible when possible.
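
A framework-independent sketch of the suggested wrapper is shown below. `is_authenticated` and `broken_check` are hypothetical names; the real `check_session_auth()` and the `before_request` hook are not reproduced here.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def is_authenticated(check: Callable[[], bool]) -> bool:
    """Run an auth check and treat any exception as a failed authentication."""
    try:
        return bool(check())
    except Exception:
        # Log the traceback, then fail closed instead of crashing the request.
        logger.exception("Session auth check failed")
        return False


def broken_check() -> bool:
    # Hypothetical check that fails the way a broken session backend might.
    raise RuntimeError("session store unavailable")
```

Failing closed means an error in the check rejects the request rather than letting it through unauthenticated.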

202-379: initialize_app function is overly large and complex (73 statements, 22 branches), making it difficult to maintain and optimize for performance as the system grows.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 3/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor the `initialize_app` function in mindsdb/api/http/initialize.py (lines 202-379) to reduce its size and complexity. Break it into smaller, well-named helper functions for route setup, namespace registration, error handling, and request context management. This will improve maintainability and make future performance optimizations easier.

mindsdb/api/mysql/mysql_proxy/mysql_proxy.py (2)

344-350: send_query_answer does not handle RESPONSE_TYPE.EOF, causing protocol errors for commands like COM_DEBUG that expect an EOF packet.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/api/mysql/mysql_proxy/mysql_proxy.py, lines 344-350, the function `send_query_answer` does not handle `RESPONSE_TYPE.EOF`, which causes protocol errors for commands like COM_DEBUG that expect an EOF packet. Add a branch to handle `RESPONSE_TYPE.EOF` by sending an `EofPacket`.

626-799: handle method in MysqlProxy is over 120 lines, with deep nesting and many branches, making it hard to maintain and reason about as the protocol grows.

📊 Impact Scores:

  • Production Impact: 1/5
  • Fix Specificity: 2/5
  • Urgency Impact: 2/5
  • Total Score: 5/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor the `handle` method in mindsdb/api/mysql/mysql_proxy/mysql_proxy.py (lines 626-799) to reduce its complexity and improve maintainability. Move the command dispatch logic (the large if-elif-else block) into a new helper method (e.g., `_handle_command_packet`). The main `handle` loop should only be responsible for connection/session setup and repeatedly calling this new method for each packet. Preserve all existing logic and logging.

mindsdb/integrations/handlers/chromadb_handler/chromadb_handler.py (1)

226-347: select method is excessively complex (too many branches/statements), making it hard to maintain and optimize for performance as the codebase grows.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 3/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor the `select` method in mindsdb/integrations/handlers/chromadb_handler/chromadb_handler.py (lines 226-347) to reduce its cyclomatic complexity and improve maintainability. The function currently has too many branches and statements, making it hard to optimize and maintain. Simplify the logic by consolidating condition handling, reducing nested blocks, and separating concerns where possible, while preserving all existing functionality and performance.

mindsdb/integrations/handlers/hubspot_handler/hubspot_handler.py (3)

492-494: all_statistics is overwritten inside the property loop, causing only the last column's statistics to be retained, resulting in missing statistics for other columns.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/handlers/hubspot_handler/hubspot_handler.py, lines 492-494, there is a bug where `all_statistics` is overwritten inside the property column loop, causing only the last column's statistics to be retained and all previous statistics to be lost. Remove these lines entirely to ensure all collected statistics are preserved.
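
The accumulation pattern the comment asks for can be sketched as follows. `collect_statistics` and the statistics it computes are hypothetical; they only show the list being created once and appended to per column.

```python
from typing import Dict, List, Optional


def collect_statistics(columns: Dict[str, List[Optional[str]]]) -> List[dict]:
    all_statistics: List[dict] = []  # created once, before the loop
    for name, values in columns.items():
        non_null = [v for v in values if v is not None]
        # Append per column; reassigning all_statistics here would drop earlier entries.
        all_statistics.append(
            {
                "column": name,
                "null_count": len(values) - len(non_null),
                "distinct_count": len(set(non_null)),
            }
        )
    return all_statistics


stats = collect_statistics({"email": ["a@x.io", None], "city": ["Oslo", "Oslo"]})
```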

288-302: try-except inside a loop in get_tables (lines 288-302) causes significant performance overhead when processing many tables, as exception handling is expensive in Python.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 3/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor mindsdb/integrations/handlers/hubspot_handler/hubspot_handler.py lines 288-302 to move the try-except block outside the main loop body in `get_tables`. This avoids the performance penalty of exception handling inside a loop. Only wrap the API call in try-except, and use `continue` to skip inaccessible tables. Place the metadata appending and logging outside the try-except so they only execute on success.

509-607: The _discover_columns method (lines 509-607) is overly complex (13+ branches), making it hard to maintain and optimize, which can impact performance and scalability as more features are added.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 4/5
  • Urgency Impact: 2/5
  • Total Score: 8/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor the `_discover_columns` method in mindsdb/integrations/handlers/hubspot_handler/hubspot_handler.py (lines 509-607) to reduce cyclomatic complexity and improve maintainability. Break the function into smaller helper methods for data fetching, property extraction, and column construction. This will make the code easier to optimize and scale as new features are added.

mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py (1)

191-207: try-except inside the for company in companies loop in get_companies (lines 191-207) causes significant performance overhead for large datasets due to repeated exception handler setup.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 4/5
  • Urgency Impact: 2/5
  • Total Score: 8/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor the `get_companies` method in mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py (lines 191-207) to move the `try`-`except` block outside the loop. Use a helper function to handle exceptions for each company, so the exception handler is not recreated on every iteration. This will reduce performance overhead for large datasets.

mindsdb/integrations/handlers/shopify_handler/shopify_handler.py (2)

184-190: check_connection sets response.success = True only if the Shopify connection succeeds, but if Yotpo credentials are present, it overwrites this with the Yotpo check result, potentially marking the connection as failed even if Shopify is up (or vice versa).

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 4/5
  • Urgency Impact: 3/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/handlers/shopify_handler/shopify_handler.py, lines 184-190: The Yotpo connection check in `check_connection` overwrites the Shopify connection result by setting `response.success` directly, which can cause false negatives if either service is down. Change the logic so that `response.success` is only set to False if the Yotpo check fails, but do not overwrite a successful Shopify connection. Only set `response.success = False` if the Yotpo check fails, otherwise leave it unchanged.
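
A minimal sketch of that logic, assuming two independent checks that raise on failure: `Status`, `check_primary`, and `check_secondary` are hypothetical names standing in for the handler's response object and the Shopify and Yotpo checks.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Status:
    success: bool = False
    error_message: Optional[str] = None


def check_connection(
    check_primary: Callable[[], None],
    check_secondary: Optional[Callable[[], None]] = None,
) -> Status:
    status = Status()
    try:
        check_primary()
        status.success = True
    except Exception as exc:
        status.error_message = f"primary: {exc}"
        return status  # a passing secondary check must not mask this failure
    if check_secondary is not None:
        try:
            check_secondary()
        except Exception as exc:
            # Only downgrade on failure; a passing secondary check changes nothing.
            status.success = False
            status.error_message = f"secondary: {exc}"
    return status


def ok() -> None:
    pass


def bad() -> None:
    raise ValueError("service down")
```

Here the result is successful only when every configured check passes, and the error message says which check failed.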

187-187: The requests.get call in check_connection does not specify a timeout, risking resource exhaustion and thread blocking under network issues.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 5/5
  • Urgency Impact: 2/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/handlers/shopify_handler/shopify_handler.py, line 187, add a `timeout` parameter (e.g., `timeout=10`) to the `requests.get` call to prevent indefinite blocking and improve resource usage under network issues.

mindsdb/integrations/handlers/sqlite_handler/sqlite_handler.py (2)

59-68: connect() returns the connection object, but check_connection() expects a StatusResponse, causing downstream logic to break when using the returned value.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 2/5
  • Urgency Impact: 2/5
  • Total Score: 6/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/handlers/sqlite_handler/sqlite_handler.py, lines 59-68, the `connect()` method returns the raw connection object, but `check_connection()` and possibly other callers expect a `StatusResponse`. Update `connect()` so it always returns a `StatusResponse` object indicating success, not the connection itself.

105-145: native_query executes raw SQL queries directly from user input without sanitization, enabling SQL injection and unauthorized data access or modification.

📊 Impact Scores:

  • Production Impact: 5/5
  • Fix Specificity: 2/5
  • Urgency Impact: 4/5
  • Total Score: 11/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/handlers/sqlite_handler/sqlite_handler.py, lines 105-145, the `native_query` method executes raw SQL queries directly from user input, making it vulnerable to SQL injection and unauthorized data access or modification. Update this method to strictly validate or restrict the input query, ensuring only trusted, non-user-supplied queries are executed. Add checks to block dangerous SQL keywords and patterns, and return an error response if detected. Do not allow direct execution of arbitrary user input.

mindsdb/integrations/libs/api_handler.py (2)

199-202: The select method in APIResource redundantly filters and slices large DataFrames in memory, which can cause high memory and CPU usage for large datasets.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 2/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/libs/api_handler.py, lines 199-202, the select method applies in-memory filtering and then slices the DataFrame for limit, which can be inefficient for large datasets. Refactor so that filtering and limiting are only applied if necessary, and avoid computing len(result) before slicing. Use the following pattern: if filters or raw_conditions: result = filter_dataframe(...); if limit is not None: result = result[:int(limit)].

556-566, 589-599, 623-635, 659-667, 694-702: The repeated pattern of DataFrame column checks and creation in meta_get_tables, meta_get_columns, meta_get_column_statistics, meta_get_primary_keys, and meta_get_foreign_keys causes code duplication and maintainability issues.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 3/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/integrations/libs/api_handler.py, lines 556-566, 589-599, 623-635, 659-667, and 694-702, there is significant code duplication for initializing empty DataFrames with specific columns. Refactor this repeated logic into a shared utility function to improve maintainability and reduce the risk of inconsistencies.

mindsdb/integrations/utilities/sql_utils.py (1)

150-194: The function project_dataframe (lines 150-194) contains a complex, deeply nested structure with multiple branches, making it difficult to maintain and extend, especially as projection logic grows.

📊 Impact Scores:

  • Production Impact: 1/5
  • Fix Specificity: 4/5
  • Urgency Impact: 1/5
  • Total Score: 6/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor the function `project_dataframe` in mindsdb/integrations/utilities/sql_utils.py (lines 150-194) to reduce complexity and improve maintainability. Extract the inner logic for handling `ast.Star` and `ast.Identifier` into helper functions, minimize deep nesting, and avoid in-place DataFrame operations. Ensure the function remains functionally equivalent and preserves exact formatting.

mindsdb/interfaces/data_catalog/data_catalog_retriever.py (2)

163-163: primary_keys_df.sort_values(by="ORDINAL_POSITION", inplace=True) mutates the input DataFrame in-place, which can cause subtle bugs and data corruption when the same DataFrame is reused elsewhere, especially in large-scale or multi-table scenarios.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 5/5
  • Urgency Impact: 2/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/interfaces/data_catalog/data_catalog_retriever.py, line 163, replace the in-place sort `primary_keys_df.sort_values(by="ORDINAL_POSITION", inplace=True)` with an out-of-place sort to avoid mutating the input DataFrame, which can cause subtle bugs and data corruption when the same DataFrame is reused elsewhere. Change it to `primary_keys_df = primary_keys_df.sort_values(by="ORDINAL_POSITION")`.
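
The effect of the out-of-place sort can be checked on a small frame. The data below is made up for illustration; only the `sort_values` call mirrors the suggestion.

```python
import pandas as pd

# Hypothetical primary-key metadata, deliberately out of order.
primary_keys_df = pd.DataFrame(
    {"COLUMN_NAME": ["b", "a"], "ORDINAL_POSITION": [2, 1]}
)

# Without inplace=True, sort_values returns a new sorted DataFrame
# and leaves the caller's object untouched.
sorted_df = primary_keys_df.sort_values(by="ORDINAL_POSITION")
```

The original frame keeps its row order, so other code holding a reference to it is unaffected.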

215-280: _construct_metadata_string_for_column_statistics is overly complex (15+ branches), making it hard to maintain and optimize, which can slow down future development and introduce performance regressions.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 4/5
  • Urgency Impact: 2/5
  • Total Score: 8/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/interfaces/data_catalog/data_catalog_retriever.py, lines 215-280, refactor the `_construct_metadata_string_for_column_statistics` method to reduce its cyclomatic complexity (currently 15+ branches). Break it into smaller helper functions for each statistics section (e.g., most common values, null percentage, distinct values, min/max), and simplify nested logic. This will improve maintainability and reduce the risk of performance regressions.

mindsdb/utilities/config.py (1)

235-394: prepare_env_config (lines 235-394) is a single function with high cyclomatic complexity and too many branches/statements, making it hard to maintain and error-prone as config logic grows.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 3/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor the function `prepare_env_config` in mindsdb/utilities/config.py (lines 235-394) to reduce its cyclomatic complexity and improve maintainability. Split the logic into smaller, focused helper methods (e.g., _prepare_env_storage_paths, _prepare_env_permanent_storage, etc.), and have `prepare_env_config` call these helpers in sequence. Ensure the refactor preserves all existing logic and config merging behavior.

🔍 Comments beyond diff scope (2)
mindsdb/api/http/initialize.py (1)

254-255: root_index route uses send_from_directory(static_root, path) without strict filename validation, risking path traversal if is_relative_to logic is bypassed or on older Python versions.
Category: security


mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py (1)

399-412: get_contacts does not apply where_conditions filtering if the search API cannot be used, causing results to ignore filters and return all contacts.
Category: correctness


@ea-rus ea-rus merged commit 186acf1 into releases/25.11.1 Nov 28, 2025
29 of 33 checks passed
@ea-rus ea-rus deleted the fix_jira branch November 28, 2025 13:26
@github-actions github-actions Bot locked and limited conversation to collaborators Nov 28, 2025
8 participants