Proposal: Pumas-Core Integration in Pantograph

Context

Pantograph's ModelProviderTask node (workflow-nodes/src/input/model_provider.rs) currently has a TODO to integrate pumas-core for model management. The node just passes through a model name string with no validation, path resolution, or library lookup. PumaBot (and any Pantograph consumer) needs real model discovery and validation when building inference workflows.

This proposal covers the full pumas-core integration: revising ModelProviderTask, optional integration in the inference crate, and fixing the dependency setup.

Reference: This is Ask 4 from PROPOSAL-PANTOGRAPH-INTEGRATION.md (Phase 4).


Current State

ModelProviderTask (what exists)

// workflow-nodes/src/input/model_provider.rs, line 121-124
// TODO: In future, use pumas-core to:
// 1. Validate model exists
// 2. Get model path and metadata
// 3. Search models if search_query is provided

The task defaults to "llama2" and outputs ModelInfo { name, path: None, model_type: Some("llm") }.

Dependency (broken)

# workflow-nodes/Cargo.toml — absolute path, no feature flag
pumas-library = { path = "/media/jeremy/OrangeCream/Linux Software/Pumas-Library/rust/crates/pumas-core" }

This breaks on any machine without that exact path. It's also unconditional — all builds require pumas-library.


Pumas-Core API Surface (relevant subset)

Construction

let api = PumasApi::builder("/path/to/launcher")
    .auto_create_dirs(true)
    .with_hf_client(false)         // disable HuggingFace for headless
    .with_process_manager(false)    // disable process management
    .build()
    .await?;

Model Discovery

// List all indexed models
api.list_models().await -> Result<Vec<ModelRecord>>

// Full-text search with pagination
api.search_models("llama 7b", limit, offset).await -> Result<SearchResult>

// Get specific model by ID
api.get_model("model-uuid").await -> Result<Option<ModelRecord>>

// Rebuild search index
api.rebuild_model_index().await -> Result<usize>

ModelRecord (the key type)

pub struct ModelRecord {
    pub id: String,              // UUID
    pub path: String,            // Absolute path to model file
    pub cleaned_name: String,    // Normalized name for display
    pub official_name: String,   // Canonical model name
    pub model_type: String,      // "llm", "diffusion", "embedding", etc.
    pub tags: Vec<String>,       // Searchable tags
    pub hashes: HashMap<String, String>,  // sha256, blake3, etc.
    pub metadata: serde_json::Value,      // Additional metadata JSON
    pub updated_at: String,      // ISO timestamp
}

SearchResult

pub struct SearchResult {
    pub models: Vec<ModelRecord>,
    pub total_count: usize,
    pub query_time_ms: f64,
    pub query: String,
}

System Resources (for pre-flight checks)

api.get_disk_space().await -> Result<DiskSpaceResponse>
api.get_system_resources().await -> Result<SystemResourcesResponse>
api.is_ollama_running().await -> bool

Security Tiers

pub enum SecurityTier {
    Safe,      // safetensors, gguf, ggml, onnx
    Unknown,   // undetected format
    Pickle,    // potentially unsafe PyTorch format
}
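
The ModelInfo proposed later in this document carries the tier as a lowercase string. A minimal mapping sketch, assuming the tier is obtained from pumas-core separately (ModelRecord does not expose it in the API surface above):

fn security_tier_label(tier: &SecurityTier) -> &'static str {
    match tier {
        SecurityTier::Safe => "safe",       // safetensors, gguf, ggml, onnx
        SecurityTier::Unknown => "unknown", // undetected format
        SecurityTier::Pickle => "pickle",   // potentially unsafe PyTorch format
    }
}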

What We Need

1. Fix the dependency setup

Problem: Absolute path, unconditional dependency.

Solution: Feature-gated workspace dependency.

# Pantograph workspace Cargo.toml
[workspace.dependencies]
# `optional` is not allowed in [workspace.dependencies]; optionality is declared per crate below.
pumas-library = { path = "../Pumas-Library/rust/crates/pumas-core" }

# workflow-nodes/Cargo.toml
[dependencies]
pumas-library = { workspace = true, optional = true }

[features]
default = ["desktop"]
desktop = []
model-library = ["dep:pumas-library"]

# inference/Cargo.toml
[dependencies]
pumas-library = { workspace = true, optional = true }

[features]
model-library = ["dep:pumas-library"]

When model-library is not enabled, ModelProviderTask compiles but operates in passthrough mode (current behavior). When enabled, it gains full pumas-core integration.

2. Revise ModelProviderTask

The revised task should:

Accept PumasApi via workflow context

The WorkflowExecutor context needs a well-known key for the PumasApi handle. Since graph_flow::Context stores serde_json::Value, and PumasApi is not serializable, use a separate mechanism — a shared Arc<PumasApi> stored in an extension map on the executor.

// node-engine: Add optional typed context extensions
use std::any::Any;
use std::collections::HashMap;

pub struct ExecutorExtensions {
    extensions: HashMap<String, Box<dyn Any + Send + Sync>>,
}

impl ExecutorExtensions {
    pub fn set<T: Send + Sync + 'static>(&mut self, key: &str, value: T) {
        self.extensions.insert(key.to_string(), Box::new(value));
    }
    pub fn get<T: Send + Sync + 'static>(&self, key: &str) -> Option<&T> {
        self.extensions.get(key).and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

The WorkflowExecutor exposes this so consumers can inject dependencies:

executor.extensions_mut().set("pumas_api", api.clone());

Alternatively, if extensions are too complex, accept a PumasApi factory closure at executor creation time and store it alongside the context. The key requirement is that ModelProviderTask can access Arc<PumasApi> during execution without requiring it to be serializable.
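
A minimal sketch of that alternative, assuming the executor simply owns an optional pre-built handle (shown with a handle rather than a closure, since PumasApi construction is async; the field and method names are illustrative, not existing node-engine API):

use std::sync::Arc;

#[cfg(feature = "model-library")]
use pumas_library::PumasApi;

pub struct WorkflowExecutor {
    // ...existing executor state elided...
    #[cfg(feature = "model-library")]
    pumas_api: Option<Arc<PumasApi>>,
}

#[cfg(feature = "model-library")]
impl WorkflowExecutor {
    /// Inject the library handle at construction time; tasks that need it
    /// receive a clone of the Arc when the workflow graph is built.
    pub fn with_pumas_api(mut self, api: Arc<PumasApi>) -> Self {
        self.pumas_api = Some(api);
        self
    }
}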

Operate in two modes based on feature flag

#[async_trait]
impl Task for ModelProviderTask {
    async fn run(&self, context: Context) -> graph_flow::Result<TaskResult> {
        let model_name_input: Option<String> = context.get(&input_key).await;
        let search_query: Option<String> = context.get(&search_key).await;

        #[cfg(feature = "model-library")]
        {
            if let Some(api) = get_pumas_api(&context) {
                return self.run_with_library(api, model_name_input, search_query, &context).await;
            }
        }

        // Fallback: passthrough mode (current behavior)
        self.run_passthrough(model_name_input, &context).await
    }
}

When pumas-core is available, the task should (a minimal resolve_model sketch follows the port table below):

  1. Search by query: If search_query is provided, call api.search_models(query, 10, 0) and output the first match (or all matches as JSON array on a search_results port).

  2. Resolve by name: If model_name is provided, call api.search_models(name, 1, 0) to find a matching model. If no match, return an error rather than silently defaulting.

  3. Validate existence: Confirm the model file exists at ModelRecord.path using std::path::Path::exists().

  4. Output full ModelInfo: Expand the output to include everything downstream inference nodes need:

pub struct ModelInfo {
    pub name: String,
    pub path: Option<String>,
    pub model_type: Option<String>,
    pub family: Option<String>,         // from metadata
    pub official_name: Option<String>,
    pub security_tier: Option<String>,  // "safe", "unknown", "pickle"
    pub hashes: HashMap<String, String>,
    pub tags: Vec<String>,
    pub metadata: serde_json::Value,
}
  5. Output ports: Keep existing ports, add new ones:

| Port | Type | Description |
|------|------|-------------|
| model_name | String | Selected model name (existing) |
| model_path | String | Absolute file path to model (existing, now populated) |
| model_info | Json | Full ModelInfo as JSON (existing, now populated) |
| model_type | String | "llm", "diffusion", "embedding" (new) |
| model_id | String | Pumas-core model UUID (new) |
| search_results | Json | Array of matching models when search_query is used (new) |
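
A minimal sketch of the library-backed path covering steps 1-4, written as a plain helper so it does not guess at graph_flow's TaskResult or error types. The "family" metadata key and the security_tier handling are assumptions, since neither appears on ModelRecord in the API surface above.

#[cfg(feature = "model-library")]
async fn resolve_model(
    api: &PumasApi,
    model_name: Option<&str>,
    search_query: Option<&str>,
) -> Result<(ModelInfo, Vec<ModelRecord>), String> {
    // Steps 1-2: search by query when given, otherwise resolve the explicit name.
    let query = search_query
        .or(model_name)
        .ok_or("no model name or search query provided")?;
    let limit = if search_query.is_some() { 10 } else { 1 };
    let results = api
        .search_models(query, limit, 0)
        .await
        .map_err(|e| format!("model search failed: {e}"))?;

    // No match is an error, not a silent default.
    let record = results
        .models
        .first()
        .ok_or_else(|| format!("no model in the library matches '{query}'"))?;

    // Step 3: confirm the indexed file actually exists on disk.
    if !std::path::Path::new(&record.path).exists() {
        return Err(format!(
            "model '{}' is indexed but missing at {}",
            record.official_name, record.path
        ));
    }

    // Step 4: assemble the expanded ModelInfo for the output ports.
    let info = ModelInfo {
        name: record.cleaned_name.clone(),
        path: Some(record.path.clone()),
        model_type: Some(record.model_type.clone()),
        family: record
            .metadata
            .get("family")
            .and_then(|v| v.as_str())
            .map(str::to_owned),
        official_name: Some(record.official_name.clone()),
        security_tier: None, // ModelRecord does not expose a tier; populate once pumas-core does
        hashes: record.hashes.clone(),
        tags: record.tags.clone(),
        metadata: record.metadata.clone(),
    };

    Ok((info, results.models))
}

run_with_library would then write info to model_info, the path and type to their scalar ports, ModelRecord.id to model_id, and the full match list to search_results when a search_query was used.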

3. Optional inference crate integration

The inference crate can optionally use pumas-core for model path resolution, backend selection, and pre-flight resource checks. This is lower priority than the ModelProviderTask revision.

Model path resolution: Given a model name from a workflow, resolve it to an actual file path via pumas-core instead of requiring the user to specify the full path.

// inference/src/gateway.rs — when model-library feature is enabled
#[cfg(feature = "model-library")]
pub async fn resolve_model_path(
    api: &PumasApi,
    model_name: &str,
) -> Option<String> {
    let results = api.search_models(model_name, 1, 0).await.ok()?;
    results.models.first().map(|m| m.path.clone())
}

Backend selection: Pumas-core tracks which backends are running. Use this to choose between Ollama and llama.cpp:

#[cfg(feature = "model-library")]
pub async fn suggest_backend(api: &PumasApi) -> &'static str {
    if api.is_ollama_running().await {
        "ollama"
    } else {
        "llama-cpp"
    }
}

Pre-flight resource checks: Before starting inference, verify sufficient resources:

#[cfg(feature = "model-library")]
pub async fn check_resources(api: &PumasApi) -> Result<(), String> {
    let disk = api.get_disk_space().await.map_err(|e| e.to_string())?;
    if disk.free < 1_000_000_000 { // 1GB minimum
        return Err("Insufficient disk space for inference".into());
    }
    Ok(())
}
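
For illustration, the three helpers could be chained before dispatching a request; prepare_inference is a hypothetical name, not existing gateway API:

#[cfg(feature = "model-library")]
pub async fn prepare_inference(
    api: &PumasApi,
    model_name: &str,
) -> Result<(String, &'static str), String> {
    // Fail fast if disk space is insufficient.
    check_resources(api).await?;

    // Resolve the workflow's model name to an on-disk path via the library.
    let path = resolve_model_path(api, model_name)
        .await
        .ok_or_else(|| format!("model '{model_name}' not found in the model library"))?;

    // Pick a backend based on what is currently running.
    Ok((path, suggest_backend(api).await))
}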

4. Binding layer updates

pantograph-rustler: Add NIFs for PumasApi lifecycle if the executor needs an injected API handle:

executor_set_pumas_api(executor, launcher_root_path) -> :ok
  # Creates PumasApi internally and stores in executor extensions

This is only needed if the extensions mechanism (from section 2) is used. If PumasApi is constructed inside Pantograph from a path, the NIF just needs the root path.

pantograph-uniffi: Same pattern — FfiWorkflowEngine gets a set_model_library_path(path) method.


Files to Modify

| File | Change |
|------|--------|
| Cargo.toml (workspace) | Add pumas-library to workspace deps with relative path |
| crates/workflow-nodes/Cargo.toml | Use workspace dep behind model-library feature flag |
| crates/workflow-nodes/src/input/model_provider.rs | Full revision with dual-mode operation |
| crates/node-engine/src/engine.rs | Add ExecutorExtensions (or equivalent) for typed dependency injection |
| crates/inference/Cargo.toml | Optional pumas-library dependency behind model-library feature |
| crates/inference/src/gateway.rs | Model resolution, backend selection, resource checks (all #[cfg] gated) |
| crates/pantograph-rustler/src/lib.rs | NIF for injecting PumasApi path into executor |
| crates/pantograph-uniffi/src/lib.rs | FFI method for setting model library path |

Implementation Order

  1. Fix dependency — Move to workspace dep with relative path, add model-library feature flag
  2. Add ExecutorExtensions — Typed dependency injection in node-engine
  3. Revise ModelProviderTask — Dual-mode with full pumas-core integration when available
  4. Update bindings — NIF/FFI for injecting PumasApi path
  5. Inference crate (optional) — Model resolution, backend selection, resource checks

Verification

  1. cargo build -p workflow-nodes --no-default-features — Compiles without pumas-library
  2. cargo build -p workflow-nodes --features model-library — Compiles with pumas-core integration
  3. cargo test -p workflow-nodes — Passthrough tests pass (existing)
  4. cargo test -p workflow-nodes --features model-library — Integration tests pass (one test is sketched after this list):
    • Search by query returns matching models
    • Resolve by name finds correct model and path
    • Missing model returns error (not silent default)
    • ModelInfo output includes path, type, hashes, security tier
  5. cargo build --workspace — Full workspace builds with both features
  6. From Elixir: Create workflow with ModelProvider node, set model name input, execute, verify model_path output is populated
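
A hedged sketch of one step-4 test, written against the pumas-core API only; the PUMAS_TEST_LAUNCHER_ROOT environment variable and the tokio test runtime are assumptions about the test setup, and the task-level assertion (empty result becomes an error) lives alongside it:

#[cfg(all(test, feature = "model-library"))]
mod model_library_tests {
    #[tokio::test]
    async fn missing_model_is_an_error_not_a_default() {
        let root = std::env::var("PUMAS_TEST_LAUNCHER_ROOT")
            .expect("set PUMAS_TEST_LAUNCHER_ROOT to a test library");
        let api = pumas_library::PumasApi::builder(&root)
            .with_hf_client(false)
            .with_process_manager(false)
            .build()
            .await
            .expect("pumas-core should initialise against the test library");

        // A name that should never be indexed must produce zero matches...
        let results = api
            .search_models("definitely-not-an-indexed-model", 1, 0)
            .await
            .expect("search should not fail outright");
        assert!(results.models.is_empty());

        // ...and ModelProviderTask must turn that into an Err, never the old
        // silent "llama2" default (asserted against the task in the full test).
    }
}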