Rewrite on welcome to lf section #2872
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| --- | ||
| title: AI Engineering Loop | ||
| description: A high-level map of the AI engineering lifecycle, from tracing and monitoring to building datasets, experimenting, and evaluating. | ||
| --- | ||
|
|
||
| import { Activity, BadgeCheck, Database, FlaskConical, Route } from "lucide-react"; | ||
|
|
||
| # The AI Engineering Loop | ||
|
|
||
| The AI Engineering Loop is how teams approach the continuous evolution and improvement of their AI-powered systems. It connects what happens in production directly to the work of improving quality, cost, latency, and reliability during development. | ||
|
|
||
| Many of the underlying concepts mirror traditional software engineering, but a key differentiator is the probabilistic nature of LLM outputs and the sheer number of paths a system can take. You cannot unit-test your way to confidence. You need a systematic way to observe, learn, and improve. | ||
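The point about probabilistic outputs can be made concrete with a small sketch (simulated data, not a real model call or a Langfuse API): the same prompt can yield different outputs across runs, so a unit test that pins one exact string fails intermittently even when every output is acceptable, while a check that scores a property of the output across many samples stays stable.

```python
import random

# Stand-in for an LLM with temperature-driven sampling: the same
# prompt produces different (all correct) phrasings across runs.
def fake_llm(prompt: str, rng: random.Random) -> str:
    candidates = [
        "Paris is the capital of France.",
        "The capital of France is Paris.",
        "France's capital city is Paris.",
    ]
    return rng.choice(candidates)

rng = random.Random(0)
outputs = [fake_llm("What is the capital of France?", rng) for _ in range(20)]

# A traditional unit test pins one exact string and fails intermittently:
exact_matches = sum(o == "Paris is the capital of France." for o in outputs)

# A quality check scores a property of the output across samples instead:
pass_rate = sum("Paris" in o for o in outputs) / len(outputs)

print(exact_matches, pass_rate)  # exact match is flaky; the property holds
```

This is why the loop below leans on observation and scored evaluation rather than exact-match assertions.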
|
|
||
|  | ||
|
|
||
| The loop clusters into two areas of work. | ||
|
|
||
| ## 1. Understanding what's happening in production | ||
|
|
||
| The first part is about visibility. What is your system actually doing in the real world? Which requests are going well, and which are failing in ways that matter? | ||
|
|
||
| <Cards num={2} className="gap-6"> | ||
| <Cards.Card | ||
| title="1. Tracing" | ||
| href="/academy/tracing" | ||
| icon={<Route className="w-5 h-5" />} | ||
| arrow | ||
| > | ||
| Capture the full path of a request, including prompts, retrieved context, tool calls, outputs, latency, and cost. Tracing is the raw record of what your system actually did. | ||
| </Cards.Card> | ||
| <Cards.Card | ||
| title="2. Monitoring" | ||
| href="/academy/monitoring" | ||
| icon={<Activity className="w-5 h-5" />} | ||
| arrow | ||
| > | ||
| Track how the system behaves over time and surface the traces that deserve attention. Monitoring turns a stream of raw data into an ongoing understanding of how the system evolves. | ||
| </Cards.Card> | ||
| </Cards> | ||
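The tracing idea above can be sketched in plain Python. This is illustrative only; the class names and fields are assumptions for the sketch, not the Langfuse SDK. A trace records each step of a request as a span with input, output, latency, and cost, and monitoring then aggregates across spans and traces.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Span:
    name: str
    input: str
    output: str = ""
    latency_ms: float = 0.0
    cost_usd: float = 0.0

@dataclass
class Trace:
    request: str
    spans: list = field(default_factory=list)

    def step(self, name, input, fn, cost_usd=0.0):
        # Run one step of the request and record it as a span.
        start = time.perf_counter()
        output = fn(input)
        latency = (time.perf_counter() - start) * 1000
        self.spans.append(Span(name, input, output, latency, cost_usd))
        return output

# One request: a retrieval step, then a (fake) LLM call.
trace = Trace(request="What is RAG?")
ctx = trace.step("retrieve", "What is RAG?",
                 lambda q: "RAG = retrieval-augmented generation")
answer = trace.step("llm", ctx,
                    lambda c: f"Answer based on: {c}", cost_usd=0.0004)

# Monitoring aggregates over many traces; here, over one:
total_cost = sum(s.cost_usd for s in trace.spans)
print(len(trace.spans), round(total_cost, 4))
```

The raw record (tracing) and the aggregation over time (monitoring) are the two halves of the visibility work described above.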
|
|
||
| ## 2. Improving systematically during development | ||
|
|
||
| The second part is about turning what you have observed into improvements you can trust — without degrading the parts of the system that are already working. | ||
|
|
||
| <Cards num={3} className="gap-6"> | ||
| <Cards.Card | ||
| title="3. Building datasets" | ||
|
|
||
| href="/academy/datasets" | ||
| icon={<Database className="w-5 h-5" />} | ||
| arrow | ||
| > | ||
| Turn real scenarios surfaced through monitoring into repeatable test cases. Instead of testing against a handful of hand-picked examples, you build a set that reflects how the system actually gets used. | ||
| </Cards.Card> | ||
| <Cards.Card | ||
| title="4. Experimenting" | ||
| href="/academy/experiments" | ||
| icon={<FlaskConical className="w-5 h-5" />} | ||
| arrow | ||
| > | ||
| Change one variable at a time — a prompt, a model, a retrieval strategy — and compare it against a stable baseline. That way you know what actually improved instead of guessing. | ||
| </Cards.Card> | ||
| <Cards.Card | ||
| title="5. Evaluating" | ||
| href="/academy/evaluate" | ||
| icon={<BadgeCheck className="w-5 h-5" />} | ||
| arrow | ||
| > | ||
| Decide whether results are good enough to ship using manual review, code-based checks, or LLM judges. Evaluation is how you turn a comparison into a decision. | ||
| </Cards.Card> | ||
| </Cards> | ||
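The dataset → experiment → evaluate flow above can be sketched as follows (hypothetical dataset items and systems, not a Langfuse API): run a baseline and a candidate over the same dataset, score each output with a code-based evaluator, and compare pass rates before deciding to ship.

```python
# A dataset built from real scenarios: inputs plus expected outcomes.
dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "3*3", "expected": "9"},
]

def baseline(q: str) -> str:   # current prompt/model (simulated)
    return {"2+2": "4", "capital of France": "Lyon", "3*3": "9"}[q]

def candidate(q: str) -> str:  # one variable changed (simulated)
    return {"2+2": "4", "capital of France": "Paris", "3*3": "9"}[q]

def contains_expected(output: str, expected: str) -> bool:
    # Code-based evaluator; could also be manual review or an LLM judge.
    return expected in output

def run_experiment(system) -> float:
    scores = [contains_expected(system(item["input"]), item["expected"])
              for item in dataset]
    return sum(scores) / len(scores)

base_rate = run_experiment(baseline)
cand_rate = run_experiment(candidate)
print(base_rate, cand_rate)  # ship only if the candidate holds or improves
```

Because both runs use the same dataset and the same evaluator, the comparison isolates the one variable that changed.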
|
|
||
| Once you ship a change, the cycle starts again. The updated system produces new traces, new monitoring signals, and new opportunities to improve. | ||
|
|
||
| ## You don't have to close the full loop on day one | ||
|
|
||
| Most teams don't start with all five steps in place. That is fine. | ||
|
|
||
| The value of the loop is cumulative. Each step you add gives you better signal, more systematic coverage, and more confidence in what you are shipping. The goal is not to implement everything at once — it is to understand where you are and take the next step toward closing the loop. | ||
|
|
||
| {/* TODO: Link blog article about patterns of AI engineering lifecycle adoption once written */} | ||
|
|
||
| ## Start with tracing | ||
|
|
||
| The natural place to begin is tracing. You cannot monitor what you cannot see, and you cannot improve what you cannot measure. Tracing is the foundation everything else builds on. | ||
|
|
||
| [→ Start with Tracing](/academy/tracing) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,5 @@ | ||
| { | ||
| "title": "Datasets", | ||
| "title": "Building Datasets", | ||
| "pages": [ | ||
| "overview" | ||
| ] | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,5 @@ | ||
| { | ||
| "title": "Evaluate", | ||
| "title": "Evaluating", | ||
| "pages": [ | ||
| "overview" | ||
| ] | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,5 @@ | ||
| { | ||
| "title": "Experiments", | ||
| "title": "Experimenting", | ||
| "pages": [ | ||
| "overview" | ||
| ] | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,44 +1,42 @@ | ||
| --- | ||
| title: Langfuse Academy | ||
| description: Build a mental model for AI engineering. Learn the core disciplines teams rely on as LLM applications move from prototype to production. | ||
| description: Understand why LLM engineering is different and how to navigate the full AI engineering lifecycle. | ||
| --- | ||
|
|
||
| # Welcome to Langfuse Academy | ||
|
|
||
| This is the place to build a mental model for AI engineering. We'll introduce the core disciplines teams rely on as LLM applications move from prototype to production. | ||
| Building with LLMs changes what it means for a system to work. Outputs are probabilistic. A system can run fine and still produce responses that are wrong, off-brand, or useless. Teams need to reason about quality, cost, latency, and the tradeoffs between them. | ||
|
|
||
| Rather than focusing on individual product features, Academy is meant to help you understand the bigger picture, and how teams can work with that change in a systematic way. | ||
| Langfuse Academy maps the AI engineering lifecycle so you understand how the pieces fit and what it takes to ship from prototype to production. | ||
|
|
||
| ## Why LLM observability is different | ||
|
|
||
| Traditional observability remains essential. Teams still need to know whether their systems are up, whether requests are slow, whether dependencies are failing, and whether costs are under control. Those questions do not disappear when an application starts using LLMs. | ||
|
|
||
| But LLM applications introduce a different kind of challenge. Their behavior is probabilistic: the same input can produce different outputs, and a response can look plausible even when it is wrong, incomplete, off-brand, unsafe, or simply unhelpful. In other words, a request can succeed technically and still fail for the user. | ||
| ## What you will find here | ||
|
|
||
| <Callout> | ||
| TODO: insert a visual or an example here to break up the text | ||
| </Callout> | ||
|
|
||
| AI engineering is not only about reliability. It is also about quality. Teams need to understand whether the output was useful, grounded, safe, and worth the cost. Observability for LLM applications therefore sits closer to product quality and iteration than traditional application monitoring usually does. | ||
|
|
||
| Modern observability platforms for LLM systems increasingly treat prompts, responses, token usage, quality signals, and model-specific behavior as first-class telemetry. | ||
| The Langfuse Academy follows the AI engineering lifecycle from first visibility into production behavior all the way to structured improvement and evaluation. The goal is to explain why each step exists, what problem it solves, and how the steps connect. | ||
|
|
||
| ## The AI engineering loop | ||
| Start with [The AI Engineering Loop](/academy/ai-engineering-loop) for the high-level map, then go deeper into the individual parts: | ||
|
|
||
| Because of this, AI engineering is iterative. Teams do not build once, ship once, and assume the work is done. They observe behavior, learn from it, improve the system, and evaluate the result over time. | ||
| - [Tracing](/academy/tracing) | ||
| - [Monitoring](/academy/monitoring) | ||
| - [Building Datasets](/academy/datasets) | ||
| - [Experimenting](/academy/experiments) | ||
| - [Evaluating](/academy/evaluate) | ||
|
|
||
|  | ||
| Some pages explain the high-level concepts. Others are deeper dives into individual parts of the lifecycle. You can read the full sequence or jump to the topic that is most relevant to your team right now. | ||
|
|
||
| <Callout> | ||
| TODO: replace with final loop visual; should we explain each step in 1-2 sentences? Keep it concise | ||
| </Callout> | ||
|
|
||
| ## What comes next | ||
| ## Academy and docs do different jobs | ||
|
|
||
| The rest of Langfuse Academy goes deeper into each step of the loop. | ||
| Academy focuses on high-level concepts and how the lifecycle fits together. The [docs](/docs) and [guides](/guides) cover Langfuse features, product implementation details, and step-by-step how-tos. | ||
|
|
||
| Each section is designed to work on its own: it gives you an overview first, and then lets you go deeper if and when that makes sense for your use case. You can follow the full loop, or focus only on the parts that are most relevant for your team right now. | ||
| Use Academy to understand the lifecycle. Use the docs and guides when you are ready to implement it in Langfuse. | ||
|
|
||
| You also do not need to adopt everything at once. Most teams improve their setup iteratively over time, adding new practices as they become useful. Doing part of this loop is already better than having no LLM engineering practices at all. | ||
| <Callout type="info" title="Who this is for"> | ||
| - AI engineers and software engineers building LLM applications and agentic systems | ||
| - Product managers who need to reason about quality, iteration, and tradeoffs | ||
| - Technical and business leaders who need a working understanding of how AI systems are built and improved | ||
| - AI agents that support humans in understanding AI engineering concepts and workflows | ||
| </Callout> | ||
|
|
||
| Let's dive in! | ||
| ## Why we are publishing this | ||
| Langfuse is open source, and we want to open source the conceptual side of AI engineering too. The Academy is our way of making the core ideas, vocabulary, and workflows behind LLM application development easier to access for everyone. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,6 +2,7 @@ | |
| "title": "Academy", | ||
| "pages": [ | ||
| "index", | ||
| "ai-engineering-loop", | ||
|
Contributor (author) comment:

> Not sure if it's here, but it's weird that the AI engineering loop is nested in the AI engineering loop. It should just be one tab that I click on, and then I get the content; nothing to unfold.
||
| "---The Loop---", | ||
| "tracing", | ||
| "monitoring", | ||
|
|
||
🟡 On line 47, the card title reads `title="3. Building datasets"` (lowercased), but the corresponding sidebar label in `content/academy/datasets/meta.json` and the link in `content/academy/index.mdx:22` both read `Building Datasets` (uppercase `D`). A reader sees "Building datasets" on the card and then "Building Datasets" on the destination page. Suggest changing this to `title="3. Building Datasets"` to match.

**Extended reasoning**

**What the bug is.** The new AI Engineering Loop page introduces five cards linking to the loop sub-sections. Four of those cards have single-word titles (`Tracing`, `Monitoring`, `Experimenting`, `Evaluating`) where casing is unambiguous. The one multi-word card, the third one for datasets, uses sentence case on the card but Title Case everywhere else in the same PR.

**The specific code path.**
- `content/academy/ai-engineering-loop.mdx:47` → `title="3. Building datasets"` (lowercased).
- `content/academy/datasets/meta.json` → `"title": "Building Datasets"` (uppercase `D`). This drives the sidebar entry and the page heading.
- `content/academy/index.mdx:22` → `[Building Datasets](/academy/datasets)` (uppercase `D`). This is the link in the welcome page bullet list.

So the card on the loop page is the only place that uses lowercase `d`, and it diverges from both the sidebar label of the page it links to and the sibling link in the welcome page list.

**Why existing code does not prevent it.** There is no shared constant or central source of truth for these section titles; each is hand-typed. Nothing in the build will warn when a card label diverges from its target page title. The other four cards happen to be safe because their titles are single words.

**Step-by-step proof.**
1. A reader opens `/academy/ai-engineering-loop` and sees five cards. The third reads "3. Building datasets".
2. Clicking it navigates to `/academy/datasets`.
3. The sidebar label there comes from `content/academy/datasets/meta.json`, whose `title` is `"Building Datasets"`.
4. On `/academy` (the index page), the bullet list at line 22 shows `Building Datasets`, also disagreeing with the loop card.

**Impact.** Purely cosmetic: no broken navigation, no functional issue. But it is the only multi-word card and the only one whose label disagrees with its destination, so the inconsistency is asymmetric and stands out.

**Fix.** Change `content/academy/ai-engineering-loop.mdx:47` from `title="3. Building datasets"` to `title="3. Building Datasets"` to align with `datasets/meta.json` and `index.mdx`.