
[Opt](ai-func) Improving AI function performance#62494

Open
linrrzqqq wants to merge 1 commit into apache:master from linrrzqqq:opt-ai-pr

Conversation

@linrrzqqq (Contributor) commented Apr 14, 2026

Release note

This PR improves the performance of AI functions through batch sending. For EMBED, the session variable embed_max_batch_size controls the number of (text/file) items sent in a single batch; the remaining AI functions internally maintain a conservative context window.

The batched request format is similar to:

"input": [
    {"role": "system", "content": "system_prompt here"},
    {"role": "user",
     "content": [
        {"idx": 1, "text": "xxx"},
        {"idx": 2, "text": "xxx"}
     ]
    }
]
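As a rough illustration of the batching idea (this is not the Doris BE implementation, which lives in C++ under be/src/exprs/function/ai/; function names here are hypothetical), rows can be chunked by the batch-size limit and each chunk packed into one request payload of the shape shown above:

```python
def build_batches(texts, max_batch_size):
    """Split input rows into chunks of at most max_batch_size items.

    Hypothetical helper illustrating the batching idea; names are not
    taken from the Doris source.
    """
    for start in range(0, len(texts), max_batch_size):
        yield texts[start:start + max_batch_size]


def build_request(system_prompt, batch):
    """Pack one chunk of rows into a single request payload."""
    return {
        "input": [
            {"role": "system", "content": system_prompt},
            {"role": "user",
             "content": [{"idx": i + 1, "text": t}
                         for i, t in enumerate(batch)]},
        ]
    }


texts = ["a", "b", "c", "d", "e"]
batches = list(build_batches(texts, 2))   # -> [["a", "b"], ["c", "d"], ["e"]]
req = build_request("classify each item", batches[0])
```

The per-item `idx` lets the response be mapped back to the originating row after the provider answers the whole batch in one round-trip.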

Performance:

-- AI_CLASSIFY
SELECT
    COUNT(*) AS total_rows,
    SUM(IF(res = 'science', 1, 0)) AS expected_eq_res
FROM (
    SELECT AI_CLASSIFY('deepseek-chat', str, ['science', 'sport']) AS res
    FROM test_str
) t;

-- before
+------------+-----------------+
| total_rows | expected_eq_res |
+------------+-----------------+
|        100 |             100 |
+------------+-----------------+
1 row in set (2 min 11.579 sec)

-- now
+------------+-----------------+
| total_rows | expected_eq_res |
+------------+-----------------+
|        100 |             100 |
+------------+-----------------+
1 row in set (10.487 sec)

-- AI_FILTER
SELECT 
    COUNT(*) AS total_rows,
    SUM(IF(res = 1, 1, 0)) AS zero_res_rows
FROM (
    SELECT AI_FILTER('deepseek-chat', str) AS res 
    FROM test_str
) t;

-- before
+------------+---------------+
| total_rows | zero_res_rows |
+------------+---------------+
|        100 |             0 |
+------------+---------------+
1 row in set (2 min 2.979 sec)

-- now
+------------+---------------+
| total_rows | zero_res_rows |
+------------+---------------+
|        100 |             0 |
+------------+---------------+
1 row in set (5.007 sec)

-- EMBED
select count(embed('qwen-embed', str)) FROM test_str;

-- before
+---------------------------------+
| count(embed('qwen-embed', str)) |
+---------------------------------+
|                             100 |
+---------------------------------+
1 row in set (4 min 4.888 sec)

-- now
set embed_max_batch_size = 10;
+---------------------------------+
| count(embed('qwen-embed', str)) |
+---------------------------------+
|                             100 |
+---------------------------------+
1 row in set (23.424 sec)

-- Multimodal_Embed
SELECT COUNT(EMBED('qwen_mul_embed', to_json(file))) FROM test_jpg2;
-- before: could not get results even after a long time (over 20 min).
-- now
set embed_max_batch_size = 20;
+----------------------------------------------------+
|                                               .... |
|                                               1152 |
+----------------------------------------------------+
1142 rows in set (1 min 13.577 sec)
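The speedups above are consistent with a simple round-trip count: batching cuts the number of provider requests by roughly the batch size, and request latency dominates these queries. Illustrative arithmetic (not taken from the PR's code):

```python
import math


def request_count(rows, batch_size):
    """Provider round-trips needed to process `rows` inputs in batches."""
    return math.ceil(rows / batch_size)


# Before this PR: effectively one request per row.
before = request_count(100, 1)    # 100 round-trips
# With `set embed_max_batch_size = 10`: the same 100 rows need 10 requests.
after = request_count(100, 10)    # 10 round-trips
```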

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen (Contributor)

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq (Contributor, Author)

run buildall

@hello-stephen (Contributor)

FE UT Coverage Report

Increment line coverage 100.00% (2/2) 🎉
Increment coverage report
Complete coverage report

@hello-stephen (Contributor)

FE Regression Coverage Report

Increment line coverage 100.00% (2/2) 🎉
Increment coverage report
Complete coverage report

@linrrzqqq (Contributor, Author)

/review

@github-actions (bot) left a comment

Found 1 blocking issue.

  1. be/src/exprs/function/ai/embed.h: text embedding batching breaks Gemini resources. _execute_text_embed() now batches multiple prompts into one build_embedding_request(inputs, ...) call, but GeminiAdapter::build_embedding_request() still serializes them into a single content object and parse_embedding_response() still returns a single embedding for the text path. Running embed() on multiple rows with a GEMINI AI resource will now fail the cardinality check (expected N got 1) or effectively only embed the last input. Please either keep Gemini on the old per-row path or implement Gemini's true batch text embedding protocol before enabling batching here.
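The cardinality failure described above can be sketched as follows (illustrative Python, not the BE C++ code; the function name and response shape are hypothetical): after batching, the response parser must yield one embedding per input row, and a provider adapter that collapses the batch into a single content object returns only one.

```python
def parse_batch_embeddings(response, expected_rows):
    """Validate that a provider response carries one embedding per input row.

    Hypothetical check mirroring the cardinality problem described above:
    an adapter that serializes the whole batch into a single content object
    returns one embedding, which fails whenever expected_rows > 1.
    """
    embeddings = response.get("embeddings", [])
    if len(embeddings) != expected_rows:
        raise ValueError(
            f"embedding cardinality mismatch: expected {expected_rows}, "
            f"got {len(embeddings)}"
        )
    return embeddings


# A batch-capable provider returns one vector per input row:
ok = parse_batch_embeddings({"embeddings": [[0.1], [0.2]]}, expected_rows=2)

# An adapter that collapses the batch returns a single vector and trips
# the check:
try:
    parse_batch_embeddings({"embeddings": [[0.1]]}, expected_rows=2)
    collapsed_failed = False
except ValueError:
    collapsed_failed = True
```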

Critical checkpoint conclusions:

  • Goal and correctness: The PR aims to improve AI-function performance via batching. That is only partially achieved because multi-row text embed() is no longer correct for all supported providers. Existing tests do not cover the failing Gemini text-embedding path.
  • Scope/minimality: The change is focused, but the generic text-embedding batching applies to providers with different protocol semantics.
  • Concurrency: No new thread-safety or locking issue identified; the path remains synchronous.
  • Lifecycle/static init: No special lifecycle or static initialization issue found.
  • Configuration: multimodal_embed_max_batch_file_count is added and forwarded to BE correctly through TQueryOptions.
  • Compatibility: No storage-format or persistence compatibility issue found; FE/BE query-option propagation looks complete for the new variable.
  • Parallel paths: Multimodal embedding and string AI functions were updated, but the provider-specific Gemini text embedding path was not handled consistently.
  • Special conditions/checks: The new multimodal input validation is reasonable.
  • Test coverage: Unit coverage improved, but there is no test for multi-row embed() with a GEMINI resource, which is the broken path here.
  • Test result files: Not applicable.
  • Observability: Existing observability is sufficient for this review; no blocker here.
  • Transaction/persistence/data writes/FE-BE variable passing: Not applicable beyond query-option forwarding, which is covered.
  • Performance: Batching should help supported providers, but this regression must be fixed first.
  • Other issues: No additional blocking issue confirmed beyond the Gemini regression.

@linrrzqqq force-pushed the opt-ai-pr branch 2 times, most recently from ffce457 to 831a82e on April 15, 2026 at 02:20
@linrrzqqq
Copy link
Copy Markdown
Contributor Author

run buildall
