Structured Output Prompts Implementation Plan

Issue: #402 - Add structured output prompt support to langstar CLI
Milestone: ls-prompt-structured-outputs
Date: 2025-11-29
Status: ✅ Completed

Executive Summary

This document describes the implementation of structured output prompts in Langstar, enabling users to create prompts with JSON Schema constraints that ensure LLM outputs conform to predefined structures.

What Was Built

SDK Support - StructuredPrompt types with LC-JSON serialization
CLI Integration - --schema and --schema-method flags
Schema Validation - Client-side JSON Schema validation before push
Full Round-trip - Push and pull structured prompts to/from LangSmith

Key Deliverables

Component	Status	PRs
SDK Types	✅ Complete	#415
SDK Client Methods	✅ Complete	#420
CLI Commands	✅ Complete	#431
Documentation	✅ Complete	#409

Research Phase

Research Report

Issue: #398
Document: 398-structured-output-prompts-scout.md

Key findings:

LangSmith stores prompts as LC-JSON serialized objects
StructuredPrompt class in Python SDK is the reference implementation
JSON Schema must be passed as dict, not Pydantic class
Two methods supported: json_schema and function_calling

Design Decisions

Issue: #403
Section: Research document, Section 11

Key decisions:

Use --schema <FILE> flag (matches dataset import pattern)
Default method: json_schema
Client-side validation before push
No new environment variables required

OpenAPI Validation

Issue: #404

The implementation was validated against the LangSmith OpenAPI spec to confirm API compatibility (tracked in issue #404).

Implementation Phases

Phase 1: SDK Types (#405, #415)

Goal: Create Rust types for structured prompts with LC-JSON serialization.

Implementation: sdk/src/prompts.rs:18-199

Types Created

LcJson<T> - Generic LC-JSON wrapper

pub struct LcJson<T> {
    pub lc: u8,
    pub type_: String,
    pub id: Vec<String>,
    pub kwargs: T,
    pub name: Option<String>,
}
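
To make the wrapper concrete, the sketch below renders the kind of envelope that LcJson<T> models. It is an illustration only: lc_envelope is a hypothetical helper, the hand-rolled string formatting stands in for whatever serializer the SDK actually uses, and the "constructor" type value is an assumption based on how the Python SDK serializes StructuredPrompt. The optional name field is left out, and id parts are assumed to contain no characters that need JSON escaping.

```rust
/// Hypothetical helper: renders a minimal LC-JSON envelope around an
/// already-serialized `kwargs` object. Real code would use a JSON serializer.
fn lc_envelope(id: &[&str], kwargs_json: &str) -> String {
    // Quote each id segment; segments are assumed to need no escaping.
    let id_items: Vec<String> = id.iter().map(|part| format!("\"{}\"", part)).collect();
    format!(
        "{{\"lc\":1,\"type\":\"constructor\",\"id\":[{}],\"kwargs\":{}}}",
        id_items.join(","),
        kwargs_json
    )
}
```

Passing an id path such as langchain_core / prompts / structured / StructuredPrompt (again an assumption about the Python class path) together with the serialized prompt fields yields a single JSON object with lc, type, id, and kwargs keys.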

StructuredPrompt - Main structured prompt type

pub struct StructuredPrompt {
    pub input_variables: Option<Vec<String>>,
    pub messages: Vec<LcJson<MessagePromptTemplateKwargs>>,
    pub schema_: Value,
    pub structured_output_kwargs: StructuredOutputKwargs,
}
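
The input_variables field lists the placeholder names a template expects. This document does not say how Langstar derives them, so the following is only a sketch of one plausible approach, assuming f-string style {name} placeholders like the one in the manual test template "Answer: {question}". The function name extract_input_variables is hypothetical.

```rust
/// Hypothetical helper: collects unique `{name}` placeholders in order of
/// first appearance. `{{` is treated as an escaped literal brace, and an
/// unterminated `{name` is ignored.
fn extract_input_variables(template: &str) -> Vec<String> {
    let mut vars: Vec<String> = Vec::new();
    let mut chars = template.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '{' {
            continue;
        }
        if chars.peek() == Some(&'{') {
            chars.next(); // escaped "{{", not a placeholder
            continue;
        }
        // Read up to the closing brace, remembering whether we found one.
        let mut name = String::new();
        let mut closed = false;
        for ch in chars.by_ref() {
            if ch == '}' {
                closed = true;
                break;
            }
            name.push(ch);
        }
        if closed && !name.is_empty() && !vars.contains(&name) {
            vars.push(name);
        }
    }
    vars
}
```

Deduplicating while preserving order keeps the resulting list stable when the same placeholder appears several times in a template.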

StructuredOutputKwargs - Method configuration

pub struct StructuredOutputKwargs {
    pub method: String,  // "json_schema" or "function_calling"
}

Helper Types
- MessagePromptTemplateKwargs - Message template wrapper
- PromptTemplateKwargs - Base prompt template

Schema Validation

pub fn validate_json_schema(schema: &Value) -> Result<()>
pub fn validate_method(method: &str) -> Result<()>

Validation approach:

Uses jsonschema crate to compile and validate schemas
Validates method is json_schema or function_calling
Fails fast with clear error messages
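
The method check is simple enough to sketch in full. The version below is a minimal illustration, not the SDK's code: the String error type and the exact wording of the message are assumptions, since the real function returns the SDK's own Result<()> alias.

```rust
/// The two structured-output methods named in this document.
const SUPPORTED_METHODS: [&str; 2] = ["json_schema", "function_calling"];

/// Returns Ok(()) for a supported method, or a descriptive error otherwise.
/// The `String` error type is an assumption made for this sketch.
fn validate_method(method: &str) -> Result<(), String> {
    if SUPPORTED_METHODS.contains(&method) {
        Ok(())
    } else {
        // Name both the rejected value and the accepted ones so the user
        // can fix the flag without consulting the docs.
        Err(format!(
            "Invalid schema method '{}'; expected one of: {}",
            method,
            SUPPORTED_METHODS.join(", ")
        ))
    }
}
```

Listing the accepted values in the error is one way to meet the "fail fast with clear error messages" goal.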

Phase 2: SDK Client Methods (#406, #420)

Goal: Implement push/pull methods for structured prompts.

Implementation: sdk/src/client.rs (langchain_create_commit)

Push Logic

Validate schema with validate_json_schema()
Validate method with validate_method()
Build StructuredPrompt from CLI inputs
Wrap in LcJson format
Serialize to JSON manifest
POST to /api/v1/commits/{owner}/{repo}/

Key code:

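let structured_prompt = StructuredPrompt {
    // Clone here: `input_variables` is used again by `build_messages` below,
    // and struct fields are evaluated in the order they are written.
    input_variables: Some(input_variables.clone()),
    messages: build_messages(template, input_variables),
    schema_: schema,
    structured_output_kwargs: StructuredOutputKwargs { method },
};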

let manifest = structured_prompt.to_lc_json();
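
The last push step posts the manifest to the commits endpoint. As a small sketch of how that path could be assembled, assuming a hypothetical commit_url helper and a caller-supplied base URL (neither appears in the SDK as described here):

```rust
/// Hypothetical helper: builds the commit endpoint URL for a prompt repo.
/// Trailing slashes on `base_url` are trimmed so the result never contains "//"
/// between the host and the path.
fn commit_url(base_url: &str, owner: &str, repo: &str) -> String {
    format!(
        "{}/api/v1/commits/{}/{}/",
        base_url.trim_end_matches('/'),
        owner,
        repo
    )
}
```

The owner and repo values correspond to the -o and -r arguments of the push command.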

Pull Logic

Pull uses the existing langchain_get_commit() method, so no changes were needed. The manifest field contains the full LC-JSON structure.

Phase 3: CLI Integration (#407, #431)

Goal: Add --schema and --schema-method flags to prompt push command.

Implementation: cli/src/commands/prompt.rs:76-113

CLI Flags

Push {
    // Existing flags...

    /// Path to JSON Schema file for structured output
    #[arg(long, value_name = "FILE")]
    schema: Option<std::path::PathBuf>,

    /// Structured output method: json_schema or function_calling
    #[arg(long, default_value = "json_schema")]
    schema_method: String,
}
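
To illustrate what these two flags resolve to, here is a dependency-free sketch of their semantics. The real command relies on the clap derive attributes shown above; the SchemaArgs struct and parse_schema_args function are hypothetical stand-ins written only to show the optional --schema value and the json_schema default.

```rust
use std::path::PathBuf;

/// Parsed values of the two structured-output flags (hypothetical type).
#[derive(Debug, PartialEq)]
struct SchemaArgs {
    schema: Option<PathBuf>,
    schema_method: String,
}

/// Hand-rolled stand-in for the clap-derived parser: scans `args` for the
/// two flags and ignores everything else.
fn parse_schema_args(args: &[&str]) -> Result<SchemaArgs, String> {
    let mut parsed = SchemaArgs {
        schema: None,
        // Default applied when --schema-method is omitted.
        schema_method: String::from("json_schema"),
    };
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        match *arg {
            "--schema" => {
                let value = iter.next().ok_or("--schema requires a FILE argument")?;
                parsed.schema = Some(PathBuf::from(*value));
            }
            "--schema-method" => {
                let value = iter.next().ok_or("--schema-method requires a value")?;
                parsed.schema_method = (*value).to_string();
            }
            _ => {} // other flags are out of scope for this sketch
        }
    }
    Ok(parsed)
}
```

Because schema is an Option, a push without --schema stays on the existing unstructured path, and the method default only matters once a schema file is supplied.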

Implementation Flow

Check if --schema flag provided
Read schema file from disk
Parse JSON with serde_json
Validate schema and method
Call SDK with schema parameters
Handle errors with user-friendly messages

Error handling:

let schema: Value = match std::fs::read_to_string(&schema_path) {
    Ok(content) => serde_json::from_str(&content)
        .map_err(|e| anyhow!("Schema file contains invalid JSON: {}", e))?,
    Err(e) => return Err(anyhow!("Failed to read schema file: {}", e)),
};

validate_json_schema(&schema)?;
validate_method(&schema_method)?;
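
The file-reading half of this flow can be tried in isolation with the standard library alone. The sketch below is an assumption-laden variant rather than the CLI's code: read_schema_text is a hypothetical name, errors are plain strings instead of anyhow errors, and the empty-file check is an extra guard that the original snippet does not include.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper: reads a schema file and rejects empty content
/// before the text is handed to the JSON parser.
fn read_schema_text(path: &Path) -> Result<String, String> {
    // Include the path in the message so the user can see which file failed.
    let content = fs::read_to_string(path)
        .map_err(|e| format!("Failed to read schema file '{}': {}", path.display(), e))?;
    if content.trim().is_empty() {
        return Err(format!("Schema file '{}' is empty", path.display()));
    }
    Ok(content)
}
```

Keeping the read error and the parse error as separate messages, as the original snippet does, lets users tell a wrong path apart from malformed JSON.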

Phase 4: Documentation (#409)

Goal: Document the structured output prompts feature.

Deliverables:

✅ README updates with examples
✅ Usage guide: docs/examples/structured-output-prompts.md
✅ Implementation plan (this document)
✅ Rustdoc comments on SDK types

Testing Approach

Unit Tests

Location: sdk/src/prompts.rs

Tests cover:

LC-JSON serialization/deserialization
Schema validation (valid and invalid schemas)
Method validation
StructuredPrompt construction

Integration Tests

Location: sdk/tests/prompts_integration.rs

Tests cover:

Push structured prompt to LangSmith (requires API key)
Pull structured prompt from LangSmith
Round-trip: push then pull, verify schema preserved

Manual Testing

# Create test schema
cat > test-schema.json << 'EOF'
{
  "type": "object",
  "properties": {
    "answer": {"type": "string"},
    "confidence": {"type": "number"}
  },
  "required": ["answer"]
}
EOF

# Push structured prompt
cargo run -- prompt push \
  -o test -r structured-test \
  -t "Answer: {question}" \
  --schema test-schema.json

# Pull and verify
cargo run -- prompt pull test/structured-test

Architecture Decisions

Why LC-JSON Format?

Decision: Use LangChain's LC-JSON serialization format for manifests.

Rationale:

LangSmith stores prompts in this format
Python SDK uses this format
Round-trip compatibility with Python ecosystem
Structured and well-documented format

Alternative considered: Custom JSON format

❌ Would break Python SDK compatibility
❌ Would require custom deserialization on LangSmith side

Why Client-Side Validation?

Decision: Validate JSON Schema on client before pushing.

Rationale:

Fail fast with clear error messages
Reduce API round-trips for invalid schemas
Better user experience (immediate feedback)

Implementation: Uses jsonschema crate

[dependencies]
jsonschema = "0.18"

Why PathBuf for Schema Argument?

Decision: Use std::path::PathBuf for --schema flag.

Rationale:

Proper path handling across platforms
Consistent with dataset import --file pattern
Type-safe file path representation

Alternative considered: String

❌ Less type-safe
❌ Requires manual path validation
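
As a small illustration of what the typed path buys, the sketch below inspects a file extension through the Path API instead of string slicing. The has_json_extension helper is hypothetical and is not described as part of the CLI; it only shows the kind of check that is awkward and platform-dependent on a raw String.

```rust
use std::path::Path;

/// Hypothetical helper: true when the path's extension is `json`,
/// compared case-insensitively. Paths with no extension return false.
fn has_json_extension(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| ext.eq_ignore_ascii_case("json"))
}
```

Since PathBuf dereferences to Path, a value parsed from --schema can be passed to such a helper as &schema_path without conversion.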

Code References

SDK

File	Lines	Description
`sdk/src/prompts.rs`	18-70	LC-JSON types and helpers
`sdk/src/prompts.rs`	71-149	StructuredPrompt types
`sdk/src/prompts.rs`	150-232	Schema validation functions
`sdk/src/client.rs`	(commit method)	Push/pull implementation

CLI

File	Lines	Description
`cli/src/commands/prompt.rs`	76-113	CLI flags definition
`cli/src/commands/prompt.rs`	(execute method)	Schema file handling

Tests

File	Description
`sdk/src/prompts.rs`	Unit tests for types and validation
`sdk/tests/integration_test.rs`	Integration tests with LangSmith API

Future Enhancements

Not in Scope (Intentional)

Pydantic class support - Users should export schema to JSON first
Model binding - No include_model parameter (Python SDK feature)
Transform logic - No RunnableSequence conversion
Schema generation - No automatic schema inference from templates

Potential Future Work

Schema library - Common schemas for typical use cases
Inline schema - Accept schema as JSON string via --schema-inline
Schema validation on pull - Warn if pulled schema is invalid
Schema diff - Compare schemas between prompt versions
OpenAPI to JSON Schema - Convert OpenAPI specs to prompt schemas

Lessons Learned

What Went Well

Research first - Thorough research saved implementation time
Validation early - Client-side validation prevented many API errors
Type safety - Rust's type system caught serialization bugs
Incremental PRs - Splitting work into SDK → CLI → docs worked well

Challenges

LC-JSON complexity - Nested structure took time to understand
Schema validation - Finding the right jsonschema crate version
Error messages - Balancing detail vs. simplicity

Recommendations for Similar Features

Start with comprehensive research and experiments
Design CLI flags before implementation
Validate against OpenAPI specs early
Write unit tests alongside code
Document as you go, not after

Related Issues

Completed

#398 - Research
#403 - Design DX consistency
#404 - OpenAPI validation
#405 - SDK types
#406 - SDK client methods
#407 - CLI commands
#408 - Testing
#409 - Documentation

Related Milestones

ls-prompt-structured-outputs - Parent milestone

References

Internal Documentation

Research: 398-structured-output-prompts-scout.md
Usage guide: structured-output-prompts.md
OpenAPI validation: Completed per issue #404

Summary

The structured output prompts feature is fully implemented and tested. Users can now:

Create JSON Schema files defining output structure
Push prompts with --schema flag
Pull prompts and view their schemas
Use prompts with LLMs to get structured, validated outputs

The implementation follows Langstar's design principles:

✅ Thin wrapper over LangSmith API
✅ Type-safe Rust implementation
✅ Automation-friendly CLI
✅ Clear error messages
✅ Comprehensive documentation

Next steps: Users should refer to the usage guide for detailed examples and best practices.
