The project's core value proposition is "$0/month when idle." This dictates every architectural decision, including model choice.
-
-- **Cost Efficiency**: At $0.10 per 1M input tokens vs $1.25 for Pro (12.5x cheaper), Flash models align with our scale-to-zero philosophy
-- **Speed**: Lower latency is crucial for streaming chat experiences
-- **RAG Sufficiency**: In RAG pipelines, intelligence is split between retrieval (Upstash Vector) and synthesis (LLM). Flash excels at synthesis, it doesn't need to know everything; it just needs to read retrieved chunks and summarize them effectively
-- **Massive Context Window**: 1M tokens means you can feed significantly more retrieved chunks, reducing the risk of missing information
-
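The 12.5x figure in the cost bullet follows directly from the two quoted per-token prices. A minimal TypeScript sketch of that arithmetic, where the helper name `inputCostUsd` and the 50M-token monthly volume are illustrative assumptions rather than project figures:

```typescript
// Input-token prices quoted in the README, in USD per 1M tokens.
const FLASH_USD_PER_MILLION = 0.10;
const PRO_USD_PER_MILLION = 1.25;

// Hypothetical helper: input-token cost for a given token volume.
function inputCostUsd(tokens: number, usdPerMillion: number): number {
  return (tokens / 1_000_000) * usdPerMillion;
}

// Illustrative volume only (an assumption, not a measured figure).
const monthlyTokens = 50_000_000;

const flashCost = inputCostUsd(monthlyTokens, FLASH_USD_PER_MILLION);
const proCost = inputCostUsd(monthlyTokens, PRO_USD_PER_MILLION);
const ratio = PRO_USD_PER_MILLION / FLASH_USD_PER_MILLION;

// prints: 5.00 62.50 12.5
console.log(flashCost.toFixed(2), proCost.toFixed(2), ratio.toFixed(1));
```

The ratio is independent of volume, so the saving scales linearly with traffic and both costs drop to zero when there is none.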
-### Why Client-Side PDF Parsing?
-
-- **Edge Runtime Limits**: Vercel Edge has 1MB bundle size limits and strict CPU time constraints
-- **Cost Efficiency**: Offloads parsing to user's browser (zero server cost)
-- **Reliability**: Avoids "Module not found: fs" errors common in serverless RAG apps
-
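The client-side approach described in these bullets can be sketched as a function that walks a loaded PDF page by page and joins its text items. This is a sketch under assumptions: the `PdfLike`, `PageLike`, and `TextItemLike` interfaces are hypothetical minimal types mirroring only the parts of the PDF.js document, page, and text-content objects used here (`numPages`, `getPage`, `getTextContent`, `str`), and `extractPdfText` is an illustrative name, not the project's actual code.

```typescript
// Minimal structural types (hypothetical) mirroring the PDF.js objects used here.
interface TextItemLike { str?: string }
interface PageLike { getTextContent(): Promise<{ items: TextItemLike[] }> }
interface PdfLike { numPages: number; getPage(pageNumber: number): Promise<PageLike> }

// Extract plain text page by page; PDF.js page numbers start at 1.
async function extractPdfText(pdf: PdfLike): Promise<string> {
  const pages: string[] = [];
  for (let n = 1; n <= pdf.numPages; n++) {
    const page = await pdf.getPage(n);
    const content = await page.getTextContent();
    // Items without a `str` (e.g. marked-content entries) are skipped.
    const text = content.items
      .map((item) => item.str ?? "")
      .filter((s) => s.length > 0)
      .join(" ");
    pages.push(text);
  }
  // Separate pages with a blank line so later chunking can see page breaks.
  return pages.join("\n\n");
}
```

In a browser, the `pdf` argument would typically come from something like `pdfjsLib.getDocument({ data: await file.arrayBuffer() }).promise`, so no `fs` access or server-side parsing is involved; only the extracted text needs to be sent to the server.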
### Why Upstash Over Pinecone?

- **True Scale-to-Zero**: No $50/month minimum (Pinecone Serverless has a floor)
@@ -208,6 +207,24 @@ The app will automatically:
- Use Node.js runtime for `/api/ingest`
- Serve static assets (including PDF.js worker) from CDN

+### Optional: Run the gRPC Vector Gateway
+
+For internal services that prefer gRPC/Protobuf over HTTP+JSON, this repo includes a small vector gateway:
+
+- Proto: `services/vector-grpc/vector.proto` (`VectorService` with `UpsertChunks` and `QueryChunks`)
+- Server: `services/vector-grpc/server.ts` (Node.js, wraps Upstash Vector over HTTP)
+
+To run it locally:
+
+```bash
+UPSTASH_VECTOR_REST_URL=... \
+UPSTASH_VECTOR_REST_TOKEN=... \
+VECTOR_GRPC_PORT=50051 \
+npm run vector-grpc:server
+```
+
+This starts a gRPC server exposing a binary-efficient, schema-safe API for document upsert and similarity search. You can point other backend services or batch jobs at this endpoint instead of calling `/api/ingest` or Upstash HTTP directly.
+
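To make "wraps Upstash Vector over HTTP" concrete, the sketch below shows how a `QueryChunks` handler might translate an incoming request into an Upstash Vector REST call. Several things here are assumptions: `QueryChunksRequest` and its fields are hypothetical (the real message shapes live in `vector.proto`, which is not shown in this diff), and `buildUpstashQuery` is an illustrative name. The `/query` path, bearer-token header, and `vector`/`topK`/`includeMetadata` body fields are based on Upstash's REST API for vector queries and should be checked against the current docs.

```typescript
// Hypothetical request shape; the real fields are defined in vector.proto.
interface QueryChunksRequest {
  embedding: number[];
  topK: number;
}

// The two environment variables the gateway is started with.
interface UpstashEnv {
  UPSTASH_VECTOR_REST_URL?: string;
  UPSTASH_VECTOR_REST_TOKEN?: string;
}

// Build the HTTP request without sending it, so the mapping is easy to inspect.
function buildUpstashQuery(req: QueryChunksRequest, env: UpstashEnv) {
  const { UPSTASH_VECTOR_REST_URL: baseUrl, UPSTASH_VECTOR_REST_TOKEN: token } = env;
  if (!baseUrl || !token) {
    throw new Error("UPSTASH_VECTOR_REST_URL and UPSTASH_VECTOR_REST_TOKEN must be set");
  }
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/query`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        vector: req.embedding,
        topK: req.topK,
        includeMetadata: true,
      }),
    },
  };
}

// Example with placeholder credentials (not real values).
const built = buildUpstashQuery(
  { embedding: [0.1, 0.2, 0.3], topK: 5 },
  { UPSTASH_VECTOR_REST_URL: "https://example.invalid/", UPSTASH_VECTOR_REST_TOKEN: "test-token" },
);
console.log(built.url); // prints: https://example.invalid/query
```

A handler along these lines would pass `built.url` and `built.init` to `fetch` and map the JSON response back onto the gRPC reply, which keeps the Upstash credentials inside the gateway rather than in each calling service.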
## Cost Analysis
### Idle State (0 Users)
@@ -252,4 +269,4 @@ If this project helped you, please consider giving it a star! ⭐
---

-**Built with ❤️ for the serverless community**
+**Built for the serverless community while drinking a lot of 🧃**