problem-framing(extract-worker): document embedding #53

@ClemDoum

Description

Inputs

  • document in Markdown format
  • embedding configuration

Outputs

  • document embedded as chunks stored in a vector DB
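
To make the input/output contract above concrete, here is a minimal sketch of the flow. Everything in it is an assumption for illustration: the `EmbeddingConfig` fields, the paragraph-based chunking strategy, the `toy_embed` function (a deterministic hash-based stand-in, not a real embedding model), and the plain Python list used in place of a vector DB. None of these are decided in this issue.

```python
from __future__ import annotations

import hashlib
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical embedding configuration (field names are assumptions)."""
    model_name: str = "toy-hash-embedder"  # placeholder, not a real model
    vector_size: int = 8
    max_chunk_chars: int = 200


@dataclass(frozen=True)
class EmbeddedChunk:
    doc_id: str
    chunk_index: int
    text: str
    vector: tuple[float, ...]


def chunk_markdown(markdown: str, max_chars: int) -> list[str]:
    """Split on blank lines, then pack paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole (not split further).
    """
    paragraphs = [p.strip() for p in markdown.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def toy_embed(text: str, size: int) -> tuple[float, ...]:
    """Deterministic stand-in for a real model: hash bytes, L2-normalized."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    raw = [digest[i % len(digest)] - 127.5 for i in range(size)]
    norm = math.sqrt(sum(x * x for x in raw))
    return tuple(x / norm for x in raw)


def embed_document(
    doc_id: str,
    markdown: str,
    config: EmbeddingConfig,
    store: list[EmbeddedChunk],
) -> list[EmbeddedChunk]:
    """Chunk the document, embed each chunk, and append the results to the store."""
    chunks = chunk_markdown(markdown, config.max_chunk_chars)
    for index, text in enumerate(chunks):
        vector = toy_embed(text, config.vector_size)
        store.append(EmbeddedChunk(doc_id, index, text, vector))
    return store


if __name__ == "__main__":
    store: list[EmbeddedChunk] = []  # stand-in for a vector DB
    config = EmbeddingConfig(vector_size=8, max_chunk_chars=20)
    embed_document("doc-1", "# Title\n\nFirst paragraph.\n\nSecond paragraph.", config, store)
    for chunk in store:
        print(chunk.chunk_index, repr(chunk.text), len(chunk.vector))
```

Keeping the chunker, the embedder, and the store as separate pieces would let each be swapped independently once the model and storage backend are chosen.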

Success metrics (TBD)

Elements to help us define success:

  • for which downstream tasks do we need vectors: similar-document search, search, RAG, ???
  • what storage and vector size can we afford?
  • review the embedding leaderboard: https://huggingface.co/spaces/mteb/leaderboard
  • where do we want to store the vectors (storage in ES?)
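
As a rough aid for the storage and vector size question, the raw footprint of the vectors can be estimated as number of chunks × dimensions × bytes per value. The sketch below uses made-up figures (1,000,000 documents, 10 chunks each, 768 dimensions, float32); none of them come from this issue, and the result excludes index overhead, replication, and the stored chunk text.

```python
def estimate_vector_storage_bytes(
    num_docs: int,
    chunks_per_doc: int,
    dims: int,
    bytes_per_value: int = 4,  # 4 bytes for float32; 2 for float16, 1 for int8
) -> int:
    """Raw vector payload only: no index structures, metadata, or replicas."""
    return num_docs * chunks_per_doc * dims * bytes_per_value


if __name__ == "__main__":
    # Hypothetical corpus size and model dimension, chosen for illustration only.
    total = estimate_vector_storage_bytes(
        num_docs=1_000_000, chunks_per_doc=10, dims=768
    )
    print(f"{total} bytes = {total / 1e9:.2f} GB")  # 30720000000 bytes = 30.72 GB
```

Halving the vector dimension or the bytes per value halves this figure, which is one way to compare leaderboard candidates against a storage budget.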
