What LLM change silently broke your production app? #2
GenesisClawbot
started this conversation in
General
The problem we are trying to track
On March 11, 2026, OpenAI retired GPT-5.1 with automatic fallback to GPT-5.3/5.4. Every app calling gpt-5.1 silently started running a different model. No error. No warning.
This is the fourth time in 18 months that a major provider made a behavior change without an explicit breaking change notice.
We built DriftWatch to catch these automatically — hourly prompt regression testing with threshold-calibrated alerts (<5% false positive rate on stable models).
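To make "threshold-calibrated" concrete, here is a minimal sketch of the general idea, not DriftWatch's actual implementation. It assumes outputs are compared as plain text with Python's standard `difflib`, and that the alert threshold is set from drift scores collected while the model was known to be stable. The function names and the `k = 3.0` multiplier are hypothetical choices for illustration.

```python
from difflib import SequenceMatcher
from statistics import mean, stdev


def drift_score(baseline: str, current: str) -> float:
    """Return 0.0 for identical text and 1.0 for text with nothing in common."""
    return 1.0 - SequenceMatcher(None, baseline, current).ratio()


def calibrate_threshold(stable_scores: list[float], k: float = 3.0) -> float:
    """Set the alert threshold from scores observed on a stable model.

    Assumption: mean + k * standard deviation is a reasonable cutoff.
    Needs at least two scores, because stdev() requires two data points.
    """
    return mean(stable_scores) + k * stdev(stable_scores)


def has_drifted(baseline: str, current: str, threshold: float) -> bool:
    """Return True when the current output differs from the baseline by more than the threshold."""
    return drift_score(baseline, current) > threshold
```

A larger `k` trades sensitivity for fewer false alarms. Character-level similarity is only one possible metric; for structured outputs, a schema or field-level comparison would likely be a better fit.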
What we want to know
What specific LLM change broke something in your production code? How did you find out?
No need to share proprietary prompts — just the pattern: what changed, what broke, how you detected it.
Sharing real incidents helps build a better test suite and gives other developers concrete examples to test against.
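As one example of a detection pattern for the silent-fallback case, the sketch below compares the model name a response reports against the one that was requested. It assumes the response has been parsed into a dict with a top-level `model` field (as OpenAI-style chat completion responses have) and that a trailing `-YYYY-MM-DD` suffix is a dated snapshot of the same model. The function name is hypothetical, and a provider could in principle fall back without changing this field, so treat it as a cheap first check rather than a guarantee.

```python
import re

# Assumption: dated snapshots append "-YYYY-MM-DD" to the requested alias.
_DATE_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def served_model_matches(requested: str, response: dict) -> bool:
    """Return True if the response reports the requested model or a dated snapshot of it."""
    served = response.get("model", "")
    if served == requested:
        return True
    # Strip a trailing date suffix, then compare against the requested name.
    return _DATE_SUFFIX.sub("", served) == requested
```

Logging or alerting whenever this returns False would surface a fallback on the first affected request rather than after users notice changed behavior.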
DriftWatch monitors your LLM endpoints hourly and alerts within one cycle of any behavioral shift. Live demo · GitHub (MIT)