The `DocEmbedder` class provides an end-to-end pipeline for ingesting documents into Feast's online vector store. It handles chunking, embedding generation, and writing results -- all in a single step.

#### Key Components

* **`DocEmbedder`**: High-level orchestrator that runs the full pipeline: chunk → embed → schema transform → write to online store
* **`BaseChunker` / `TextChunker`**: Pluggable chunking layer. `TextChunker` splits text by word count with configurable `chunk_size`, `chunk_overlap`, `min_chunk_size`, and `max_chunk_chars`
* **`BaseEmbedder` / `MultiModalEmbedder`**: Pluggable embedding layer with modality routing. `MultiModalEmbedder` supports text (via sentence-transformers) and image (via CLIP) with lazy model loading
* **`SchemaTransformFn`**: A user-defined function that transforms the chunked + embedded DataFrame into the format expected by the FeatureView schema
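To make the `TextChunker` parameters more concrete, here is a minimal standalone sketch of word-count chunking with overlap. It is not Feast's implementation: the `chunk_by_words` function is hypothetical, the default values are arbitrary, `max_chunk_chars` is left out, and the handling of short trailing chunks is an assumption.

```python
from typing import List


def chunk_by_words(
    text: str,
    chunk_size: int = 100,
    chunk_overlap: int = 20,
    min_chunk_size: int = 5,
) -> List[str]:
    """Split text into overlapping word-count chunks (illustrative only)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks: List[str] = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        # Assumption: drop a trailing fragment shorter than min_chunk_size,
        # but always keep the first chunk so short documents are not lost.
        if len(window) < min_chunk_size and chunks:
            break
        chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


sample = "one two three four five six seven eight nine ten"
print(chunk_by_words(sample, chunk_size=4, chunk_overlap=1, min_chunk_size=2))
# ['one two three four', 'four five six seven', 'seven eight nine ten']
```

With `chunk_size=4` and `chunk_overlap=1`, each chunk starts three words after the previous one, so consecutive chunks share exactly one word.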
```python
# Create DocEmbedder -- automatically generates a FeatureView and applies the repo
embedder = DocEmbedder(
    repo_path="feature_repo/",
    feature_view_name="text_feature_view",
)

# Embed and ingest documents in one step
# (`df` holds the input documents, with an "id" column and a "text" column)
result = embedder.embed_documents(
    documents=df,
    id_column="id",
    source_column="text",
    column_mapping=("text", "text_embedding"),
)
```
#### Features

* **Auto-generates FeatureView**: Creates a Python file with Entity and FeatureView definitions compatible with `feast apply`
* **Auto-applies repo**: Registers the generated FeatureView in the registry automatically
* **Custom schema transform**: Provide your own `SchemaTransformFn` to control how chunked + embedded data maps to your FeatureView schema
* **Extensible**: Subclass `BaseChunker` or `BaseEmbedder` to plug in your own chunking or embedding strategies

For a complete walkthrough, see the [DocEmbedder tutorial notebook](../../examples/rag-retriever/rag_feast_docembedder.ipynb).

### Feature Transformation for LLMs

Feast supports transformations that can be used to:
@@ -89,6 +136,17 @@ Implement semantic search by:
2. Converting search queries to embeddings
3. Finding semantically similar documents using vector search
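The last step, finding semantically similar documents, can be illustrated with a brute-force cosine-similarity search. This is only a sketch of the idea: in Feast the search runs inside the online vector store, and the `cosine_similarity` and `top_k` helpers, the toy index, and the three-dimensional vectors below are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k best matches."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings standing in for real model output.
toy_index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.0],
}
query_embedding = [1.0, 0.0, 0.0]  # would come from embedding the search query
results = top_k(query_embedding, toy_index, k=2)
print([doc_id for doc_id, _ in results])  # ['doc_a', 'doc_c']
```

A real vector store typically replaces the linear scan with an approximate nearest-neighbor index, but the ranking principle is the same.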
### AI Agents with Context and Memory

Feast can serve as both the **context provider** and the **persistent memory layer** for AI agents. Unlike stateless RAG pipelines, agents decide autonomously which tools to call and can write state back to the feature store. Feast supports this through:

1. **Structured context**: Retrieve customer profiles, account data, and other entity-keyed features
2. **Knowledge retrieval**: Search vector embeddings for relevant documents
3. **Persistent memory**: Store and recall per-entity interaction history (last topic, resolution, preferences) using `write_to_online_store`
4. **Governed access**: All reads and writes are subject to the same RBAC, TTL, and audit policies as any other feature

With MCP enabled, agents built with any framework (LangChain, LlamaIndex, CrewAI, AutoGen, or custom) can discover and call Feast tools dynamically. See the [Feast-Powered AI Agent example](../../examples/agent_feature_store/) and the blog post [Building AI Agents with Feast](https://feast.dev/blog/feast-agents-mcp/) for a complete walkthrough.
### Scaling with Spark Integration

Feast integrates with Apache Spark to enable large-scale processing of unstructured data for GenAI applications:
@@ -167,20 +225,25 @@ The MCP integration uses the `fastapi_mcp` library to automatically transform yo
The `fastapi_mcp` integration automatically exposes your Feast feature server's FastAPI endpoints as MCP tools. This means AI assistants can:

* **Call `/get-online-features`** to retrieve features from the feature store
* **Call `/retrieve-online-documents`** to perform vector similarity search
* **Call `/write-to-online-store`** to persist agent state (memory, notes, interaction history)
* **Use `/health`** to check server status

For a basic MCP example, see the [MCP Feature Store Example](../../examples/mcp_feature_store/). For a full agent with persistent memory, see the [Feast-Powered AI Agent Example](../../examples/agent_feature_store/).
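As a rough illustration of what an agent sends to two of these endpoints, the sketch below builds JSON request bodies without making any network call. The payload shapes (`features` and `entities` for `/get-online-features`, `feature_view_name` and `df` for `/write-to-online-store`) are assumptions about the feature server's request format and should be checked against your Feast version. The feature view names, feature names, and entity key are hypothetical.

```python
import json
from typing import Any, Dict, List


def get_online_features_payload(
    features: List[str],
    entities: Dict[str, List[Any]],
) -> str:
    """Build a request body for POST /get-online-features (assumed shape)."""
    return json.dumps({"features": features, "entities": entities})


def write_to_online_store_payload(
    feature_view_name: str,
    columns: Dict[str, List[Any]],
) -> str:
    """Build a request body for POST /write-to-online-store (assumed shape).

    `columns` maps each column name to a list of values, one per row.
    """
    return json.dumps({"feature_view_name": feature_view_name, "df": columns})


# Read structured context for one customer (hypothetical names throughout).
read_body = get_online_features_payload(
    features=["customer_profile:plan", "agent_memory:last_topic"],
    entities={"customer_id": [1001]},
)

# Persist what the agent learned during the interaction.
write_body = write_to_online_store_payload(
    feature_view_name="agent_memory",
    columns={
        "customer_id": [1001],
        "last_topic": ["billing"],
        "event_timestamp": ["2024-01-01T00:00:00Z"],
    },
)
print(read_body)
print(write_body)
```

With MCP enabled, an agent would normally invoke these endpoints as discovered tools rather than building the bodies by hand; the sketch only shows the kind of data that flows in each direction.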
File: `docs/how-to-guides/feast-on-kubernetes.md` (4 additions, 0 deletions)
@@ -10,6 +10,10 @@ Kubernetes is a common target environment for running Feast in production. You c
2. Run scheduled and ad-hoc jobs (e.g. materialization jobs) as Kubernetes Jobs.
3. Operate Feast components using Kubernetes-native primitives.

{% hint style="info" %}
**Planning a production deployment?** See the [Feast Production Deployment Topologies](./production-deployment-topologies.md) guide for architecture diagrams, sample FeatureStore CRs, RBAC policies, infrastructure recommendations, and scaling best practices across Minimal, Standard, and Enterprise topologies.
{% endhint %}

## Feast Operator

To deploy Feast components on Kubernetes, use the included [feast-operator](../../infra/feast-operator).