Skip to content

Commit 646d206

Browse files
committed
Update journals and documentation for LangChain and StreamIO
- Revised the journal entry for May 1, 2025, to include new sections on CS and LangChain, enhancing the organization of topics. - Added a new journal entry for May 2, 2025, focusing on AI Chat Application technicalities with a link to a LangGraph how-to guide. - Created a new documentation page detailing LangChain's updates on content blocks and retry logic, including code examples and best practices. - Introduced a new page on handling rate limits and implementing exponential backoff in LangGraph, providing comprehensive strategies and examples. - Added documentation for StreamIO's rate limits and a new page on exponential backoff as a retry strategy.
1 parent d8761b3 commit 646d206

7 files changed

Lines changed: 211 additions & 4 deletions

journals/2025_05_01.md

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,15 @@
1-
## StreamIO
2-
- [[StreamIO/Chat/How To/Attach Data Files Larger than 5KB with File Uploads]]
3-
- ## Adjacent to #EdTech
1+
## CS
2+
- [[Exponential Backoff]]
3+
- ## #EdTech
44
- #Filed
55
- [[Org/Nonprofit/Burning Glass Institute]]
66
- they created [[Org/Skills First]]
77
- the top 5-year demand growth #Skills for [[Software/Developer]] roles, each of which is expected to at least double in five years (> 100% growth) in descending order:
88
- [[Kubernetes]]
99
- [[CICD]]
1010
- [[AI]]
11-
- [[Microsoft/Azure]]
11+
- [[Microsoft/Azure]]
12+
- ## LangChain
13+
- [[LangChain/Blog/25/04/17 LangChain Python Improved Content Blocks Retry Logic and More]]
14+
- ## StreamIO
15+
- [[StreamIO/Chat/How To/Attach Data Files Larger than 5KB with File Uploads]]

journals/2025_05_02.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
## AI Chat Application Technicalities
2+
- [[langgraph/How To/Stream messages into StreamIO with Exponential Backoff]]
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
date-created:: [[2025-04-17 Thu]]
2+
tags:: [[LangChain]], [[Blog]], [[Content-Blocks]], [[Retry-Logic]]
3+
4+
- # LangChain Python Updates: Improved Content Blocks, Retry Logic and More
5+
- [Original Announcement](https://changelog.langchain.com/announcements/langchain-python-updates-improved-contenet-blocks-retry-logic-and-more)
6+
- ## Standardized Multimodal Content Blocks
7+
- [Documentation](https://python.langchain.com/docs/how_to/multimodal_inputs/)
8+
- Example:
9+
- ```python
10+
from langchain_core.messages import HumanMessage, AIMessage
11+
from langchain_core.messages.content import ImageContent, TextContent
12+
13+
# Create a message with both text and image content
14+
message = HumanMessage(
15+
content=[
16+
TextContent(text="What's in this image?"),
17+
ImageContent(
18+
image_url="https://example.com/image.jpg",
19+
image_path=None,
20+
mime_type="image/jpeg"
21+
)
22+
]
23+
)
24+
```
25+
- ## ChatPromptTemplate with Arbitrary Content Blocks
26+
- [Documentation](https://python.langchain.com/docs/how_to/multimodal_prompts/)
27+
- Example:
28+
- ```python
29+
from langchain_core.prompts import ChatPromptTemplate
30+
from langchain_core.messages import HumanMessage
31+
32+
# Create a template that can handle text and images
33+
template = ChatPromptTemplate.from_messages([
34+
("human", [
35+
{"type": "text", "text": "Describe this image:"},
36+
{"type": "image_url", "image_url": "{image_url}"}
37+
])
38+
])
39+
40+
# Format the template
41+
messages = template.format_messages(
42+
image_url="https://example.com/image.jpg"
43+
)
44+
```
45+
- ## Custom [[Programming/Error Handling/Retry Logic/Exponential Backoff]] for Runnable.with_retry
46+
- [Documentation](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_retry)
47+
- Example:
48+
- ```python
49+
from langchain_core.runnables import RunnableConfig
50+
51+
# Configure custom retry parameters
52+
chain_with_retries = chain.with_retry(
53+
retry_patterns={
54+
"ConnectionError": {
55+
"max_attempts": 3,
56+
"initial_delay": 1.0,
57+
"max_delay": 60.0,
58+
"exponential_base": 2.0,
59+
}
60+
}
61+
)
62+
```
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
alias:: [[Rate Limit]], [[Rate Limits]]
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
tags:: [[Programming]], [[Error Handling]], [[Retry Logic]], [[Algorithm]]
2+
alias:: [[Exponential Backoff]]
3+
4+
- # Exponential Backoff
5+
- A retry strategy where the delay between retry attempts increases exponentially
6+
- ## Key Concepts
7+
- Initial delay: The first waiting period after a failure
8+
- Multiplier/Base: The factor by which the delay increases each time (typically 2)
9+
- Max delay: Upper limit on how long to wait between retries
10+
- Max attempts: Maximum number of retry attempts before giving up
11+
- ## Common Use Cases
12+
- Network requests and API calls
13+
- Distributed systems communication
14+
- Rate limiting and throttling
15+
- Database connection retries
16+
- ## Benefits
17+
- Prevents overwhelming systems under stress
18+
- Allows temporary issues to resolve naturally
19+
- Reduces network congestion
20+
- More efficient than fixed-interval retries
21+
- ## Example Formula
22+
- delay = min(max_delay, initial_delay * (base ^ attempt_number))
23+
- For base=2:
24+
- 1st retry: 1s
25+
- 2nd retry: 2s
26+
- 3rd retry: 4s
27+
- 4th retry: 8s
28+
- etc.
29+
- ## Implementation Examples
30+
- [[LangChain/Blog/25/04/17 LangChain Python Improved Content Blocks Retry Logic and More]] - LangChain's Runnable.with_retry implementation
31+
- ## See also
32+
- [[LangChain/output_parsers/retry/RetryOutputParser]]
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
# [Rate Limits - Python Chat Messaging Docs](https://getstream.io/chat/docs/python/rate_limits/)
Lines changed: 105 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,105 @@
1+
# How to stream response chunks into [[StreamIO/Chat/Message]] from [[langgraph]] while addressing [[Rate Limits]] with [[Exponential Backoff]]
2+
- ## Problem Context
3+
- While streaming chunks from a LangGraph graph into a StreamIO chat message, the [[StreamIO/Chat/Rate Limits]] at the user or App level may refuse a request. In that case, addressing updating each chunk with exponential backoff would not be advisable.
4+
- In some streaming scenarios, subsequent chunks replace earlier chunks, in other streaming scenarios, subsequent chunks need to be combined with earlier chunks to obtain the full output.
5+
- In LangGraph, chunks are streamed with generators. The generators maintain the sequence of the chunks. If one of the chunks cannot update a StreamIO message because of a rate limit error, there are a few available approaches.
6+
- ## Analysis
7+
- ### LangGraph Streaming Overview
8+
- LangGraph supports streaming outputs from a graph using generators, with several streaming modes ("values", "updates", "messages", "custom", "debug"). Each chunk yielded by the generator represents a unit of work, such as a partial LLM output or a state update. See [LangGraph Streaming Concepts](https://langchain-ai.github.io/langgraph/concepts/streaming/) and [How to stream](https://langchain-ai.github.io/langgraph/how-tos/streaming/).
9+
- ### StreamIO Rate Limits
10+
- StreamIO applies rate limits at both the user and app level, typically 60 requests per minute per user and higher per app/platform. Exceeding these limits results in HTTP 429 errors. See [StreamIO Rate Limits](https://getstream.io/chat/docs/python/rate_limits/).
11+
- When a rate limit is hit, StreamIO recommends exponential backoff and retry, and provides headers to inspect remaining quota and reset time.
12+
- ### Exponential Backoff in Streaming Context
13+
- Exponential backoff is a standard approach to handling rate limits: after a 429 error, wait an increasing amount of time before retrying (e.g., 1s, 2s, 4s, ... up to a max interval).
14+
- In the context of streaming, this means that if a chunk update to StreamIO fails due to rate limiting, the application should pause and retry the update with exponential backoff, rather than immediately proceeding to the next chunk.
15+
- ### Tradeoffs and Best Practices
16+
- **Serial Processing with Backoff:** Processing each chunk serially and waiting for a successful update before yielding the next chunk ensures strict adherence to rate limits, but may result in unnecessary updates and increased latency.
17+
- **Batching Chunks:** If rate limits are frequently hit, consider batching multiple chunks together and updating StreamIO less frequently. This reduces the number of API calls and better utilizes the allowed quota.
18+
- **Skipping Redundant Updates:** In scenarios where only the latest chunk matters (e.g., replacing message content), it may be optimal to skip intermediate updates that failed due to rate limits and only update with the most recent chunk after the backoff period.
19+
- **Inspect Rate Limit Headers:** Always inspect the `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers in StreamIO responses to dynamically adjust backoff timing and avoid unnecessary retries.
20+
- **Configurable Retry Policy:** In LangGraph, you can configure retry policies for nodes (see [How to add node retry policies](https://langchain-ai.github.io/langgraph/how-tos/node-retries/)), allowing for exponential backoff and custom retry logic on API errors like 429.
21+
- ### Example Retry Policy in LangGraph
22+
- Use the `RetryPolicy` when adding a node that updates StreamIO, specifying `initial_interval`, `backoff_factor`, `max_interval`, and `max_attempts`.
23+
- Example:
24+
~~~python
25+
from langgraph.pregel import RetryPolicy
26+
builder.add_node(
27+
"update_streamio",
28+
update_streamio_fn,
29+
retry=RetryPolicy(initial_interval=1.0, backoff_factor=2.0, max_interval=32.0, max_attempts=5)
30+
)
31+
~~~
32+
- This ensures that if a rate limit error occurs, the node will retry with exponential backoff, up to the specified maximum attempts.
33+
- ### Summary
34+
- When streaming from LangGraph to StreamIO, design your update logic to:
35+
- Handle 429 errors with exponential backoff
36+
- Consider batching or skipping redundant updates
37+
- Use LangGraph's retry policies for robust error handling
38+
- Monitor rate limit headers to optimize retry timing
39+
- This approach balances responsiveness, efficiency, and compliance with StreamIO's rate limits.
40+
- ## Algorithms
41+
- ### Exponential-Backoff Skip Algorithm
42+
- **Idea:** keep pushing chunks through LangGraph's generator, but only update Stream Chat when (a) the last update succeeded or (b) the back-off window has expired—whichever is later. If several new chunks arrive while you're waiting, keep just the latest (append/replace logic) so you don't waste requests.
43+
- **Steps**
44+
- Initialise `retry_interval = 1 s`, `max_interval = 32 s`, `backoff_factor = 2`.
45+
- For each `chunk` from `graph.astream(..., mode="messages")` ([Streaming](https://langchain-ai.github.io/langgraph/concepts/streaming/))
46+
- If *not* in back-off → try `update_message_partial`.
47+
- On **HTTP 429** → read `X-RateLimit-Reset`/`Remaining`, enter back-off for `retry_interval`, then double `retry_interval *= backoff_factor` up to `max_interval` ([Rate Limits - Python Chat Messaging Docs - getstream.io](https://getstream.io/chat/docs/python/rate_limits/?utm_source=chatgpt.com)).
48+
- While in back-off, buffer new chunks, replacing any previous buffered text if your UI only shows the latest content.
49+
- When the timer expires, send a single `update_message_partial` with the buffered text (or batch of concatenated chunks for additive streams).
50+
- On success → reset `retry_interval = 1 s`, clear buffer.
51+
- **Why:** guarantees you never exceed Stream's per-user 60 req/min default yet minimises redundant updates.
52+
- ### Sample Python (Async)
53+
- ```python
54+
import asyncio, time, itertools
55+
from stream_chat import StreamChat
56+
from langgraph import some_graph # your LangGraph instance
57+
58+
chat = StreamChat(api_key=API_KEY, api_secret=API_SECRET)
59+
channel = chat.channel("messaging", "general")
60+
bot_id = "ai-bot-general"
61+
62+
async def stream_to_streamio(run_id: str, message_id: str):
63+
retry_int = 1 # seconds
64+
max_int = 32
65+
backoff_factor= 2
66+
backoff_until = 0
67+
buffer_text = ""
68+
69+
async for (_, (chunk, _)) in some_graph.astream(run_id, mode="messages"):
70+
now = time.time()
71+
buffer_text += chunk.content # or `buffer_text = chunk.content` for replace-only
72+
if now < backoff_until:
73+
continue # still cooling down
74+
75+
try:
76+
await chat.update_message_partial(
77+
message_id,
78+
{"set": {"text": buffer_text, "generating": True}},
79+
bot_id,
80+
) # :contentReference[oaicite:2]{index=2}
81+
buffer_text, retry_int = "", 1 # reset on success
82+
except chat.exceptions.StreamAPIException as e:
83+
if e.status_code == 429:
84+
reset_ts = int(e.response.headers.get("X-RateLimit-Reset", now + retry_int))
85+
backoff_until = max(now + retry_int, reset_ts)
86+
retry_int = min(retry_int * backoff_factor, max_int)
87+
# keep accumulating chunks during back-off
88+
else:
89+
raise # surface non-rate-limit errors
90+
91+
```
92+
- Uses Stream's **partial-update** endpoint so you never overwrite undeclared fields ([Build an AI Assistant Using Python - getstream.io](https://getstream.io/blog/python-assistant/?utm_source=chatgpt.com)).
93+
- Works with any LangGraph streaming mode; just adapt the buffer strategy for "replace" vs "append".
94+
- ### Node-Level Retry Policy (optional)
95+
- ```python
96+
from langgraph.pregel import RetryPolicy
97+
builder.add_node(
98+
"update_streamio",
99+
lambda state: stream_to_streamio(state["run_id"], state["msg_id"]),
100+
retry=RetryPolicy(initial_interval=1.0, backoff_factor=2.0,
101+
max_interval=32.0, max_attempts=5)
102+
)
103+
104+
```
105+
- This lets LangGraph itself re-invoke the node when a 429 bubbles up. ([Streaming](https://langchain-ai.github.io/langgraph/concepts/streaming/))

0 commit comments

Comments
 (0)