docs: add Langextract observability integration page (new content)

## Summary

The `langextract` observability integration is fully implemented in the core wrapper (`praisonai/observability/langextract.py`) and exposed via CLI (`praisonai langextract view|render`) and the `--observe langextract` flag, but it has **no user-facing documentation** in this repo.

This issue requests **new content** (not an update) — specifically a new page at `docs/observability/langextract.mdx`, plus two small cross-reference updates on existing pages.

Decision: **Create new content** — confirmed by:
- `grep -r langextract docs/` returns **zero** matches (only the daily-synced SDK source under `praisonai/` shows up, which is not a user doc).
- `docs/observability/overview.mdx` lists 20+ providers; langextract is **absent** from the supported-providers table.
- `docs.json` Observability group (lines 720–740) has no langextract entry.

## Source PRs (SDK ground truth)

- Feature introduction: **[MervinPraison/PraisonAI#1413](https://github.com/MervinPraison/PraisonAI/pull/1413)** (langextract TraceSinkProtocol adapter + CLI commands)
- Bug fix / event-capture bridge: **[MervinPraison/PraisonAI#1420](https://github.com/MervinPraison/PraisonAI/pull/1420)** — merged `2026-04-17` on `main` @ `2bd0287828a1bf055d35be363743c8333cbaa1c6`
  - Adds `_ContextToActionBridge` so that ordinary `Agent.start()` flows emit events. Before this fix, only `RouterAgent` / `PlanningAgent` events were captured, so `render` produced an empty HTML file.

## What is langextract?

An optional observability integration that turns a PraisonAI agent run into a **self-contained interactive HTML trace**, grounded in the original input text. Unlike hosted providers (Langfuse, LangSmith, etc.), it writes two local files:

- `<name>.jsonl` — annotated-documents JSONL (langextract format)
- `<name>.html` — interactive visualization (opens in browser)

Extractions produced per run:

| Event | Extraction class | Grounding |
|-------|------------------|-----------|
| `AGENT_START` | `agent_run` | First 200 chars of input |
| `TOOL_START` | `tool_call` | None (ungrounded) |
| `TOOL_END` | `tool_result` | None (ungrounded) |
| `OUTPUT` | `final_output` | First 1000 chars of output |
| `ERROR` | `error` | None (ungrounded) |
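
The mapping above can be read as a lookup from event type to extraction class, plus a truncation rule for the grounded text. The following is a minimal sketch of that rule only, not the code in `praisonai/observability/langextract.py`; the names `EXTRACTION_RULES` and `to_extraction` are hypothetical and exist purely for illustration.

```python
from typing import Optional, Tuple

# Event type -> (extraction class, max grounded chars, or None if ungrounded).
# Values come from the table above; the data structure itself is illustrative.
EXTRACTION_RULES = {
    "AGENT_START": ("agent_run", 200),
    "TOOL_START": ("tool_call", None),
    "TOOL_END": ("tool_result", None),
    "OUTPUT": ("final_output", 1000),
    "ERROR": ("error", None),
}


def to_extraction(event_type: str, text: str) -> Tuple[str, Optional[str]]:
    """Return (extraction_class, grounded_text) for a single event."""
    extraction_class, limit = EXTRACTION_RULES[event_type]
    # Grounded events keep only a prefix of the text; others carry no span.
    grounded = text[:limit] if limit is not None else None
    return extraction_class, grounded
```

A page author could use a sketch like this to sanity-check the documented limits against the SDK source before publishing the table.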

## SDK ground truth — files to read before writing docs

Per AGENTS.md §1.1 / §1.3 (SDK-first), read these before authoring the page:

| Concern | File (in `praisonai/` synced tree of this repo) |
|---|---|
| Sink + config dataclass | `praisonai/observability/langextract.py` |
| Package export (lazy) | `praisonai/observability/__init__.py` |
| `view` / `render` CLI | `praisonai/cli/commands/langextract.py` |
| `--observe langextract` wiring | `praisonai/cli/app.py` → `_setup_langextract_observability` |
| Context-event bridge (added in #1420) | `praisonai/observability/langextract.py` → `_ContextToActionBridge`, `LangextractSink.context_sink()`, `LangextractSink.bridge_context_events()` |

## Configuration options (extract verbatim — do not guess)

From `@dataclass LangextractSinkConfig` in `praisonai/observability/langextract.py`:

| Option | Type | Default | Description |
|---|---|---|---|
| `output_path` | `str` | `"praisonai-trace.html"` | HTML file written on `close()` |
| `jsonl_path` | `Optional[str]` | `None` (derived from `output_path`) | Annotated-documents JSONL path |
| `document_id` | `str` | `"praisonai-run"` | Document ID in the JSONL |
| `auto_open` | `bool` | `False` | Open the HTML in a browser after render |
| `include_llm_content` | `bool` | `True` | Include response text in attributes |
| `include_tool_args` | `bool` | `True` | Include tool args in attributes |
| `enabled` | `bool` | `True` | Master switch |
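
To make the defaults concrete, here is a stand-in dataclass that mirrors the fields in the table. It is not the real `LangextractSinkConfig`: `SinkConfigSketch` and `resolved_jsonl_path` are hypothetical names, and the derivation rule (swap the HTML suffix for `.jsonl`) is an assumption inferred from the `trace.html` / `trace.jsonl` pairing in the programmatic example in this issue. Check the SDK source before documenting the actual rule.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SinkConfigSketch:
    """Illustrative mirror of the documented fields; not the SDK class."""

    output_path: str = "praisonai-trace.html"
    jsonl_path: Optional[str] = None
    document_id: str = "praisonai-run"
    auto_open: bool = False
    include_llm_content: bool = True
    include_tool_args: bool = True
    enabled: bool = True

    def resolved_jsonl_path(self) -> str:
        # Assumption: when jsonl_path is unset, reuse output_path with a
        # .jsonl suffix. The real derivation lives in the SDK source.
        if self.jsonl_path is not None:
            return self.jsonl_path
        return str(Path(self.output_path).with_suffix(".jsonl"))
```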

## Install

```bash
pip install 'praisonai[langextract]'
```

## User-facing entry points to document

1. **CLI — render a YAML workflow**
   ```bash
   praisonai langextract render workflow.yaml -o trace.html
   praisonai langextract render workflow.yaml -o trace.html --no-open
   ```
2. **CLI — view an existing JSONL**
   ```bash
   praisonai langextract view trace.jsonl -o trace.html
   ```
3. **CLI — instrument any `praisonai` run**
   ```bash
   praisonai --observe langextract agents.yaml
   ```
4. **Programmatic (agent-centric — lead with this, per AGENTS.md §1.1.9)**
   ```python
   from praisonaiagents import Agent
   from praisonaiagents.trace.protocol import TraceEmitter, set_default_emitter
   from praisonai.observability import LangextractSink, LangextractSinkConfig

   sink = LangextractSink(LangextractSinkConfig(output_path="trace.html", auto_open=True))
   set_default_emitter(TraceEmitter(sink=sink, enabled=True))
   LangextractSink.bridge_context_events(sink=sink, session_id="my-run")  # captures Agent.start/tool/llm

   agent = Agent(name="Writer", instructions="Write a haiku about code.")
   agent.start("Write a haiku about code.")

   sink.close()  # writes trace.jsonl + trace.html
   ```

> The `bridge_context_events(...)` call is **required** for typical single-agent flows. Without it, only `RouterAgent` token-usage and `PlanningAgent.plan_created` events are captured and the HTML will be empty. This is the exact gap fixed by #1420 — **please lead with the bridged example and only mention the un-bridged path as a footnote.**

## Files to create / modify

### 1. Create `docs/observability/langextract.mdx` (NEW)

Must follow AGENTS.md §2 page template exactly. Suggested section skeleton:

```
---
title: "Langextract"
sidebarTitle: "Langextract"
description: "Render PraisonAI agent runs as self-contained interactive HTML traces grounded in the input text"
icon: "file-code"
---

<intro sentence>

<hero mermaid — see AGENTS.md §3.3, LR graph: Agent → Sink → JSONL+HTML → Browser>

## Quick Start
  <Steps>
    <Step title="Install">  pip install 'praisonai[langextract]'  </Step>
    <Step title="Agent-centric (recommended)">  … code with bridge_context_events …  </Step>
    <Step title="CLI — render a workflow">  praisonai langextract render …  </Step>
  </Steps>

## How It Works
  sequenceDiagram: User → CLI/Agent → LangextractSink → ContextTraceEmitter (bridged) → JSONL → HTML

## Configuration Options
  <full LangextractSinkConfig table above>

## CLI Reference
  - praisonai langextract render <yaml> [-o FILE] [--no-open] [--api-url URL]
  - praisonai langextract view <jsonl> [-o FILE] [--no-open]
  - praisonai --observe langextract <agents.yaml>

## Extraction Mapping
  <event → extraction_class table above>

## Common Patterns
  - Single-agent Agent.start() with bridge
  - YAML workflow via render
  - Post-hoc: re-render an existing JSONL with view

## Troubleshooting
  - "Trace was not rendered" / empty HTML → ensure bridge_context_events(...) is called (the bridge was added in PR #1420)
  - ImportError → pip install 'praisonai[langextract]'

## Best Practices (AccordionGroup)
  - Always call sink.close() (render happens there); CLI commands do this automatically
  - Use auto_open=False in CI
  - Scope the emitter: restore the previous emitter after your run

## Related (CardGroup)
  - /observability/overview
  - /observability/langfuse   (hosted alternative)
```

Mermaid colours must follow AGENTS.md §3.1.
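
Two of the best practices in the skeleton (always close the sink, restore the previous emitter) combine naturally into a context manager. The sketch below shows the pattern with injected callables so that it runs without the SDK. `scoped_trace`, `get_emitter`, and `set_emitter` are hypothetical; this issue only confirms `set_default_emitter`, so whether a matching getter exists in `praisonaiagents.trace.protocol` needs to be checked against the source.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def scoped_trace(
    sink: Any,
    emitter: Any,
    get_emitter: Callable[[], Any],
    set_emitter: Callable[[Any], None],
) -> Iterator[Any]:
    """Install `emitter` for the duration of the block, then clean up."""
    previous = get_emitter()
    set_emitter(emitter)
    try:
        yield sink
    finally:
        # Restore the previous emitter first, then close the sink,
        # which is where the JSONL and HTML files are written.
        set_emitter(previous)
        sink.close()
```

Because the cleanup sits in `finally`, the trace is still rendered when the agent run raises, which is usually when the trace is most useful.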

### 2. Update `docs/observability/overview.mdx` (small edit)

Add one row to the supported-providers table (around line 50):

```md
| [Langextract](/observability/langextract) | – (local HTML) | `pip install 'praisonai[langextract]'` |
```

Consider noting in a small callout that langextract is a **local file** sink (HTML + JSONL) rather than a hosted backend — it’s qualitatively different from the rest of the table.

### 3. Update `docs.json` (small edit)

Insert `"docs/observability/langextract"` into the Observability group (lines 720–740); the natural spot is right after `overview`. Keep the alphabetical or logical ordering consistent with neighbouring entries, and verify that the JSON remains valid after the edit (AGENTS.md §1.9).
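
The validity check can be scripted. The sketch below parses the file and searches every nested value for the page path, so it makes no assumption about how the Observability group is structured. `contains_page` and `check_docs_json` are hypothetical helpers, not part of any existing tooling in this repo.

```python
import json
from typing import Any


def contains_page(node: Any, page: str) -> bool:
    """Recursively search parsed JSON for a string equal to `page`."""
    if isinstance(node, str):
        return node == page
    if isinstance(node, dict):
        return any(contains_page(value, page) for value in node.values())
    if isinstance(node, list):
        return any(contains_page(item, page) for item in node)
    return False


def check_docs_json(path: str, page: str = "docs/observability/langextract") -> bool:
    # json.load raises json.JSONDecodeError if the edit broke the file.
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return contains_page(data, page)
```

Running `check_docs_json("docs.json")` from the repo root would then either raise on invalid JSON or report whether the new entry is present.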

## Placement rules (AGENTS.md §1.8)

- ✅ `docs/observability/` is agent-writable.
- ❌ Do **not** touch `docs/concepts/` — this feature is a provider integration, not a core concept.
- ❌ Do **not** touch `docs/js/` or `docs/rust/` — langextract is Python-only; those trees are auto-generated.

## Acceptance checklist (per AGENTS.md §9)

- [ ] `docs/observability/langextract.mdx` created and renders in Mintlify
- [ ] Frontmatter complete (`title`, `sidebarTitle`, `description`, `icon`)
- [ ] Hero Mermaid diagram present using the standard colour scheme
- [ ] `<Steps>`, `<AccordionGroup>`, `<CardGroup>` all used
- [ ] Every `LangextractSinkConfig` field documented with exact type + default from source
- [ ] Agent-centric code example leads (AGENTS.md §1.1.9) and includes `bridge_context_events(...)`
- [ ] CLI section covers all three surfaces: `render`, `view`, `--observe langextract`
- [ ] Troubleshooting section calls out the PR #1420 empty-trace failure mode
- [ ] `docs/observability/overview.mdx` updated with a langextract row
- [ ] `docs.json` updated with `docs/observability/langextract`; JSON validates
- [ ] No edits to `docs/concepts/`, `docs/js/`, or `docs/rust/`
- [ ] All code examples runnable copy-paste (AGENTS.md §5.1)
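
The frontmatter item in the checklist is easy to verify mechanically. A minimal sketch, assuming the page starts with a `---` delimited block of simple `key: value` lines as in the skeleton in this issue; `missing_frontmatter_keys` is a hypothetical helper and does not handle nested or multi-line YAML.

```python
from typing import List

REQUIRED_KEYS = ("title", "sidebarTitle", "description", "icon")


def missing_frontmatter_keys(mdx_text: str) -> List[str]:
    """Return the required frontmatter keys that are absent from the page."""
    lines = mdx_text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No frontmatter block at all: every required key is missing.
        return list(REQUIRED_KEYS)
    found = set()
    for line in lines[1:]:
        if line.strip() == "---":  # end of the frontmatter block
            break
        key, sep, _ = line.partition(":")
        if sep:
            found.add(key.strip())
    return [key for key in REQUIRED_KEYS if key not in found]
```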

## References

- PraisonAI PR #1413 (feature introduction)
- PraisonAI PR #1420 (bugfix wiring the context emitter bridge — merged on `main`)
- Source-of-truth files listed in the “SDK ground truth” table above


