Summary
The langextract observability integration is fully implemented in the core wrapper (praisonai/observability/langextract.py) and exposed via CLI (praisonai langextract view|render) and the --observe langextract flag, but it has no user-facing documentation in this repo.
This issue requests new content (not an update) — specifically a new page at docs/observability/langextract.mdx, plus two small cross-reference updates on existing pages.
Decision: Create new content, confirmed by:
- grep -r langextract docs/ returns zero matches (only the daily-synced SDK source under praisonai/ shows up, which is not a user doc).
- docs/observability/overview.mdx lists 20+ providers; langextract is absent from the supported-providers table.
- The docs.json Observability group (lines 720–740) has no langextract entry.
Source PRs (SDK ground truth)
- Feature introduction: MervinPraison/PraisonAI#1413 (langextract TraceSinkProtocol adapter + CLI commands)
- Bug fix / event-capture bridge: MervinPraison/PraisonAI#1420, merged 2026-04-17 on main @ 2bd0287828a1bf055d35be363743c8333cbaa1c6
  - Adds _ContextToActionBridge so normal Agent.start() flows emit events (before this fix, only RouterAgent / PlanningAgent events were captured, so render produced an empty HTML).
What is langextract?
An optional observability integration that turns a PraisonAI agent run into a self-contained interactive HTML trace, grounded in the original input text. Unlike hosted providers (Langfuse, LangSmith, etc.), it writes two local files:
- <name>.jsonl: annotated-documents JSONL (langextract format)
- <name>.html: interactive visualization (opens in browser)
Extractions produced per run:
| Event | Extraction class | Grounded |
| --- | --- | --- |
| AGENT_START | agent_run | First 200 chars of input |
| TOOL_START | tool_call | No (ungrounded) |
| TOOL_END | tool_result | No |
| OUTPUT | final_output | First 1000 chars of output |
| ERROR | error | No |
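The mapping above can be sketched as a small lookup, mainly to make the grounding rule concrete. This is a minimal illustration, not the SDK's implementation: the names EXTRACTION_MAP and to_extraction are hypothetical, and only the event names, extraction classes, and character limits come from the table.

```python
from typing import Optional, Tuple

# Event name -> (extraction class, max grounded chars, or None if ungrounded).
# Values mirror the table; the dict and helper names are illustrative only.
EXTRACTION_MAP = {
    "AGENT_START": ("agent_run", 200),
    "TOOL_START": ("tool_call", None),
    "TOOL_END": ("tool_result", None),
    "OUTPUT": ("final_output", 1000),
    "ERROR": ("error", None),
}


def to_extraction(event: str, text: str = "") -> Tuple[str, Optional[str]]:
    """Return (extraction_class, grounded_text) for one trace event."""
    extraction_class, limit = EXTRACTION_MAP[event]
    # Grounded events keep a prefix of the text; ungrounded ones keep none.
    grounded = text[:limit] if limit is not None else None
    return extraction_class, grounded


print(to_extraction("AGENT_START", "Write a haiku about code."))
print(to_extraction("TOOL_START"))
```

Under these assumptions, only AGENT_START and OUTPUT carry text that can be highlighted in the rendered HTML; the tool and error events appear as ungrounded extractions.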
SDK ground truth — files to read before writing docs
Per AGENTS.md §1.1 / §1.3 (SDK-first), read these before authoring the page:
| Concern | File (in praisonai/ synced tree of this repo) |
| --- | --- |
| Sink + config dataclass | praisonai/observability/langextract.py |
| Package export (lazy) | praisonai/observability/__init__.py |
| view / render CLI | praisonai/cli/commands/langextract.py |
| --observe langextract wiring | praisonai/cli/app.py → _setup_langextract_observability |
| Context-event bridge (added in #1420) | praisonai/observability/langextract.py → _ContextToActionBridge, LangextractSink.context_sink(), LangextractSink.bridge_context_events() |
Configuration options (extract verbatim — do not guess)
From @dataclass LangextractSinkConfig in praisonai/observability/langextract.py:
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| output_path | str | "praisonai-trace.html" | HTML file written on close() |
| jsonl_path | Optional[str] | None (derived from output_path) | Annotated-documents JSONL path |
| document_id | str | "praisonai-run" | Document ID in the JSONL |
| auto_open | bool | False | Open the HTML in a browser after render |
| include_llm_content | bool | True | Include response text in attributes |
| include_tool_args | bool | True | Include tool args in attributes |
| enabled | bool | True | Master switch |
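For reviewers who want to sanity-check the documented defaults, the options table can be mirrored as a stand-alone dataclass. This is a sketch, not the SDK class: SinkConfigSketch and resolved_jsonl_path are hypothetical names, and the rule used to derive the JSONL path (swap the suffix for .jsonl) is an assumption to confirm against praisonai/observability/langextract.py.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SinkConfigSketch:
    """Illustrative mirror of the options table; not the SDK class."""

    output_path: str = "praisonai-trace.html"
    jsonl_path: Optional[str] = None
    document_id: str = "praisonai-run"
    auto_open: bool = False
    include_llm_content: bool = True
    include_tool_args: bool = True
    enabled: bool = True

    def resolved_jsonl_path(self) -> str:
        # Assumption: when jsonl_path is unset, the derived path reuses
        # output_path with a ".jsonl" suffix. Verify against the source.
        if self.jsonl_path is not None:
            return self.jsonl_path
        return str(Path(self.output_path).with_suffix(".jsonl"))


cfg = SinkConfigSketch(output_path="trace.html")
print(cfg.resolved_jsonl_path())
```

The final docs page should still copy types and defaults from the real LangextractSinkConfig; the mirror is only useful for spotting drift between the table and the source.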
Install
pip install 'praisonai[langextract]'
User-facing entry points to document
- CLI: render a YAML workflow
    praisonai langextract render workflow.yaml -o trace.html
    praisonai langextract render workflow.yaml -o trace.html --no-open
- CLI: view an existing JSONL
    praisonai langextract view trace.jsonl -o trace.html
- CLI: instrument any praisonai run
    praisonai --observe langextract agents.yaml
- Programmatic (agent-centric; lead with this, per AGENTS.md §1.1.9)
    from praisonaiagents import Agent
    from praisonaiagents.trace.protocol import TraceEmitter, set_default_emitter
    from praisonai.observability import LangextractSink, LangextractSinkConfig

    # Create the sink and register it as the default trace emitter.
    sink = LangextractSink(LangextractSinkConfig(output_path="trace.html", auto_open=True))
    set_default_emitter(TraceEmitter(sink=sink, enabled=True))

    # Bridge context events so Agent.start/tool/llm activity is captured.
    LangextractSink.bridge_context_events(sink=sink, session_id="my-run")

    agent = Agent(name="Writer", instructions="Write a haiku about code.")
    agent.start("Write a haiku about code.")

    # Rendering happens on close(): writes trace.jsonl + trace.html.
    sink.close()
The bridge_context_events(...) call is required for typical single-agent flows. Without it, only RouterAgent token-usage and PlanningAgent.plan_created events are captured and the HTML will be empty. This is the exact gap fixed by #1420 — please lead with the bridged example and only mention the un-bridged path as a footnote.
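Before re-rendering an existing trace with the view command, it can help to confirm that the JSONL file parses line by line. The helper below is a generic sketch using only the standard library; load_jsonl is a hypothetical name, and it makes no assumption about the keys langextract writes into each record.

```python
import json
from pathlib import Path
from typing import Any, List


def load_jsonl(path: str) -> List[Any]:
    """Parse a JSONL file; raise ValueError naming the first bad line."""
    records: List[Any] = []
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for number, line in enumerate(lines, 1):
        if not line.strip():
            continue  # tolerate blank lines between records
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"{path}:{number}: invalid JSON ({exc.msg})") from exc
    return records
```

A file that loads with zero records points at the empty-trace symptom described above rather than a rendering problem.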
Files to create / modify
1. Create docs/observability/langextract.mdx (NEW)
Must follow AGENTS.md §2 page template exactly. Suggested section skeleton:
---
title: "Langextract"
sidebarTitle: "Langextract"
description: "Render PraisonAI agent runs as self-contained interactive HTML traces grounded in the input text"
icon: "file-code"
---
<intro sentence>
<hero mermaid — see AGENTS.md §3.3, LR graph: Agent → Sink → JSONL+HTML → Browser>
## Quick Start
<Steps>
<Step title="Install"> pip install 'praisonai[langextract]' </Step>
<Step title="Agent-centric (recommended)"> … code with bridge_context_events … </Step>
<Step title="CLI — render a workflow"> praisonai langextract render … </Step>
</Steps>
## How It Works
sequenceDiagram: User → CLI/Agent → LangextractSink → ContextTraceEmitter (bridged) → JSONL → HTML
## Configuration Options
<full LangextractSinkConfig table above>
## CLI Reference
- praisonai langextract render <yaml> [-o FILE] [--no-open] [--api-url URL]
- praisonai langextract view <jsonl> [-o FILE] [--no-open]
- praisonai --observe langextract <agents.yaml>
## Extraction Mapping
<event → extraction_class table above>
## Common Patterns
- Single-agent Agent.start() with bridge
- YAML workflow via render
- Post-hoc: re-render an existing JSONL with view
## Troubleshooting
- "Trace was not rendered" / empty HTML → ensure bridge_context_events(...) is called (fixed in PR #1420)
- ImportError → pip install 'praisonai[langextract]'
## Best Practices (AccordionGroup)
- Always call sink.close() (render happens there); CLI commands do this automatically
- Use auto_open=False in CI
- Scope the emitter: restore the previous emitter after your run
## Related (CardGroup)
- /observability/overview
- /observability/langfuse (hosted alternative)
Mermaid colours must follow AGENTS.md §3.1.
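The "always call sink.close()" best practice in the skeleton could be illustrated with contextlib.closing, which calls close() even when the run raises. The sketch below uses a hypothetical StandInSink so that it runs without the SDK installed; it assumes only that the real sink exposes a close() method, as the programmatic example shows.

```python
from contextlib import closing


class StandInSink:
    """Hypothetical stand-in with the same close() contract as a trace sink."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        # A real sink would write its JSONL and HTML files here.
        self.closed = True


def run_with_sink(sink, work):
    """Run work() and call sink.close() even if work() raises."""
    with closing(sink):
        return work()


sink = StandInSink()
try:
    run_with_sink(sink, lambda: 1 / 0)  # simulate a failing agent run
except ZeroDivisionError:
    pass
print(sink.closed)
```

Whether the docs page recommends this wrapper or a plain try/finally is a style choice; either way the trace is still written when the agent run fails.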
2. Update docs/observability/overview.mdx (small edit)
Add one row to the supported-providers table (around line 50):
| [Langextract](/observability/langextract) | – (local HTML) | `pip install 'praisonai[langextract]'` |
Consider noting in a small callout that langextract is a local file sink (HTML + JSONL) rather than a hosted backend — it’s qualitatively different from the rest of the table.
3. Update docs.json (small edit)
Insert "docs/observability/langextract" into the Observability group, between lines 720–740 (natural spot is right after overview). Keep alphabetical / logical ordering consistent with neighbours. Verify JSON remains valid after the edit (AGENTS.md §1.9).
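The docs.json edit can be rehearsed in a few lines of Python. This is a sketch under assumptions: insert_page_after is a hypothetical helper, and the group excerpt (a "pages" list of path strings) is a guess at the layout, so check it against the actual file before relying on it.

```python
import json
from typing import List


def insert_page_after(pages: List[str], new_page: str, after: str) -> List[str]:
    """Return a copy of pages with new_page placed right after `after`."""
    if new_page in pages:
        return list(pages)  # already present; keep the edit idempotent
    index = pages.index(after) + 1  # ValueError if the anchor page is missing
    return pages[:index] + [new_page] + pages[index:]


# Hypothetical excerpt; the real docs.json layout may differ.
group = {
    "group": "Observability",
    "pages": ["docs/observability/overview", "docs/observability/langfuse"],
}
group["pages"] = insert_page_after(
    group["pages"], "docs/observability/langextract", "docs/observability/overview"
)

# Round-trip through the json module to confirm the result still parses.
assert json.loads(json.dumps(group)) == group
print(group["pages"])
```

For the real file, loading it with json.load before and after the edit is enough to satisfy the validity check.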
Placement rules (AGENTS.md §1.8)
- ✅ docs/observability/ is agent-writable.
- ❌ Do not touch docs/concepts/: this feature is a provider integration, not a core concept.
- ❌ Do not touch docs/js/ or docs/rust/: langextract is Python-only; those trees are auto-generated.
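Acceptance checklist (per AGENTS.md §9)
- [ ] docs/observability/langextract.mdx created and renders in Mintlify
- [ ] Frontmatter complete (title, sidebarTitle, description, icon)
- [ ] <Steps>, <AccordionGroup>, <CardGroup> all used
- [ ] Every LangextractSinkConfig field documented with exact type + default from source
- [ ] Agent-centric example uses bridge_context_events(...)
- [ ] CLI entry points documented: render, view, --observe langextract
- [ ] docs/observability/overview.mdx updated with a langextract row
- [ ] docs.json updated with docs/observability/langextract; JSON validates
- [ ] No changes to docs/concepts/, docs/js/, or docs/rust/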
References
- PraisonAI PR #1413 (feature introduction)
- PraisonAI PR #1420 (bugfix wiring the context emitter bridge, merged on main)
- Source-of-truth files listed in the “SDK ground truth” table above