Basic checks
What's broken?
RubyLLM's VertexAI embedding provider does not expose a way to send Vertex AI's per-request embedding task_type for gemini-embedding-001.
This matters because Vertex AI embedding task types select embeddings optimized for different use cases, such as SEMANTIC_SIMILARITY, RETRIEVAL_QUERY, and RETRIEVAL_DOCUMENT. If task_type is omitted, embeddings can be generated in a different space than the caller intended, with no obvious failure.
Current RubyLLM behaviour appears to be:
- RubyLLM::Embedding.embed accepts and forwards dimensions:, but not task_type: or other provider-specific embedding params.
- RubyLLM::Providers::VertexAI::Embeddings.render_embedding_payload builds instances as { content: ... } only, so task_type cannot reach the Vertex AI request body.
We had to carry a local monkey patch to thread task_type: through RubyLLM.embed and merge it into each VertexAI instances[] entry.
How to reproduce
- Configure RubyLLM for Vertex AI.
- Try to create an embedding that explicitly selects a Vertex AI task type:
RubyLLM.embed(
  "hello",
  model: "gemini-embedding-001",
  provider: :vertexai,
  dimensions: 1536,
  task_type: "SEMANTIC_SIMILARITY"
)
- Ruby raises ArgumentError (unknown keyword: :task_type) because RubyLLM.embed does not accept task_type:.
- If the caller omits task_type: to avoid the unsupported keyword:
RubyLLM.embed(
  "hello",
  model: "gemini-embedding-001",
  provider: :vertexai,
  dimensions: 1536
)
- The request succeeds, but the Vertex AI payload cannot include the intended task type.
Expected behavior
RubyLLM should support forwarding VertexAI embedding task type information, for example:
RubyLLM.embed(
  "hello",
  model: "gemini-embedding-001",
  provider: :vertexai,
  dimensions: 1536,
  task_type: "SEMANTIC_SIMILARITY"
)
That should produce a VertexAI payload shaped like:
{
  "instances": [
    {
      "content": "hello",
      "task_type": "SEMANTIC_SIMILARITY"
    }
  ],
  "parameters": {
    "outputDimensionality": 1536
  }
}
For RETRIEVAL_DOCUMENT, it may also be worth supporting title, since Vertex's embedding docs expose that alongside task_type.
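As a rough illustration of the title idea, the sketch below builds a single instance as a plain Ruby hash and serializes it with the standard json library. The helper build_document_instance is hypothetical and not part of RubyLLM. The field names content, task_type, and title are assumed to match the Vertex AI text embeddings request format, and dropping title for other task types is a design assumption of this sketch, not confirmed provider behaviour.

```ruby
require "json"

# Hypothetical helper (not part of RubyLLM): builds one Vertex AI embedding
# instance. The title is only attached for RETRIEVAL_DOCUMENT; for any other
# task type it is dropped (an assumption made for this sketch).
def build_document_instance(content, task_type:, title: nil)
  instance = { content: content.to_s, task_type: task_type }
  instance[:title] = title if title && task_type == "RETRIEVAL_DOCUMENT"
  instance
end

payload = {
  instances: [
    build_document_instance("hello", task_type: "RETRIEVAL_DOCUMENT", title: "Greeting")
  ],
  parameters: { outputDimensionality: 1536 }
}

puts JSON.pretty_generate(payload)
```

Whether RubyLLM should drop an unsupported title silently or raise is an open design choice; raising would make misuse more visible.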
What actually happened
Passing task_type: to RubyLLM.embed is not currently supported:
ArgumentError: unknown keyword: :task_type
If task_type: is omitted, RubyLLM's VertexAI provider sends a payload with only content per instance:
{
  "instances": [
    {
      "content": "hello"
    }
  ],
  "parameters": {
    "outputDimensionality": 1536
  }
}
That means the request succeeds, but the selected embedding task type is silently lost.
Environment
- Ruby version: 3.3.10
- RubyLLM version: 1.16.0
- Provider: VertexAI
- Model: gemini-embedding-001
- OS: macOS
AI Suggested Fix
A minimal fix would be to plumb provider-specific embedding options through the embedding path and let the VertexAI provider consume the ones it supports.
One possible shape:
# lib/ruby_llm/embedding.rb
def self.embed(text,
               model: nil,
               provider: nil,
               assume_model_exists: false,
               context: nil,
               dimensions: nil,
               **provider_params)
  # existing setup...
  RubyLLM.instrument('embedding.ruby_llm', payload, config: config) do |event|
    result = provider_instance.embed(
      text,
      model: model_id,
      dimensions: dimensions,
      **provider_params
    )
    # existing instrumentation...
    result
  end
end
# lib/ruby_llm/providers/vertexai/embeddings.rb
def render_embedding_payload(text, model:, dimensions:, task_type: nil, title: nil)
  instances = [text].flatten.map do |t|
    instance = { content: t.to_s }
    instance[:task_type] = task_type if task_type
    instance[:title] = title if title
    instance
  end
  { instances: instances }.tap do |payload|
    payload[:parameters] = { outputDimensionality: dimensions } if dimensions
  end
end
The provider's embed method may also need to forward these kwargs into render_embedding_payload, depending on the current provider protocol method signature.
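To make the forwarding step concrete, here is a self-contained sketch that can be run without RubyLLM or network access. SketchVertexEmbeddings is a stand-in module invented for this example; its embed method returns the rendered payload instead of performing an HTTP request, and its signature is an assumption rather than RubyLLM's actual provider protocol.

```ruby
# Stand-in for the provider module. Method names mirror the ones discussed in
# this issue, but this is NOT RubyLLM's actual code.
module SketchVertexEmbeddings
  module_function

  # Builds the request body, merging task_type/title into every instance.
  def render_embedding_payload(text, model:, dimensions:, task_type: nil, title: nil)
    instances = [text].flatten.map do |t|
      instance = { content: t.to_s }
      instance[:task_type] = task_type if task_type
      instance[:title] = title if title
      instance
    end
    payload = { instances: instances }
    payload[:parameters] = { outputDimensionality: dimensions } if dimensions
    payload
  end

  # Hypothetical embed: forwards extra keywords to the payload builder rather
  # than rejecting them up front. Returns the payload for inspection only.
  def embed(text, model:, dimensions: nil, **provider_params)
    render_embedding_payload(text, model: model, dimensions: dimensions, **provider_params)
  end
end

payload = SketchVertexEmbeddings.embed(
  %w[hello world],
  model: "gemini-embedding-001",
  dimensions: 1536,
  task_type: "SEMANTIC_SIMILARITY"
)
p payload
```

Because render_embedding_payload declares its keywords explicitly, an unsupported option still raises ArgumentError at the provider layer, so typos are not silently ignored.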