Agents

Types:

from giskard_hub.types import (
    Agent,
    AgentDetectStatefulness,
    AgentOutput,
    ChatMessage,
    Header,
)

Methods:

Audit

Types:

from giskard_hub.types import (
    Audit,
    AuditDisplay,
    APIPaginatedMetadata,
)

Methods:

Checks

Types:

from giskard_hub.types import (
    Check,
    CheckResult,
    ConformityParams,
    CorrectnessParams,
    GroundednessParams,
    MetadataParams,
    SemanticSimilarityParams,
    StringMatchParams,
)

Methods:

Datasets

Types:

from giskard_hub.types import (
    Dataset,
    TestCase,
    TaskProgress,
    APIPaginatedMetadata,
)

Methods:

Evaluations

Types:

from giskard_hub.types import (
    Agent,
    AgentOutput,
    CheckResult,
    Dataset,
    DatasetSubset,
    DivergenceWarning,
    Evaluation,
    Metric,
    MinimalAgent,
    OutputAnnotation,
)

Methods:

Helpers

The helpers resource exposes convenience methods for common workflows such as running evaluations and waiting for completion.

Types:

from giskard_hub.resources.helpers import (
    HelpersResource,
    AsyncHelpersResource,
)
from giskard_hub.types import (
    Evaluation,
    ChatMessage,
)

Methods:

  • client.helpers.wait_for_completion(entity, *, poll_interval=5.0, max_retries=360, running_states={"running"}, error_states={"error"}, raise_on_error=True) -> TStateful
    Waits until a task-like entity (such as an evaluation) leaves a running state or reaches an error state.

  • client.helpers.evaluate(agent, *, dataset, project=None, name=None, tags=None) -> Evaluation
    Runs an evaluation for a given agent over a dataset. The agent can be:

    • a remote agent identifier (str or Agent), which creates a regular evaluation, or
    • a local Python callable with signature (messages: list[ChatMessage]) -> AgentReturn, which creates a local evaluation and submits outputs on your behalf.

  • client.helpers.print_metrics(entity) -> None
    Prints metrics for an evaluation or scan result to the console. For an evaluation, it displays a table of metric names, success rates, and pass/fail/error/skipped counts. For a scan result, it displays probe categories, names, severity, and issue/attack counts.
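
The parameters of wait_for_completion suggest a bounded polling loop. The following is a minimal, self-contained sketch of that pattern, not the library's implementation: wait_until_done, refresh, get_state, and sleep are hypothetical names introduced for illustration, and the exact retry and sleep semantics of the real helper are assumptions. Under this reading, the defaults (poll_interval=5.0, max_retries=360) would bound the wait at roughly 30 minutes.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def wait_until_done(
    entity: T,
    refresh: Callable[[T], T],       # hypothetical: re-fetches the entity
    get_state: Callable[[T], str],   # hypothetical: reads the entity's state
    *,
    poll_interval: float = 5.0,
    max_retries: int = 360,
    running_states: Iterable[str] = ("running",),
    error_states: Iterable[str] = ("error",),
    raise_on_error: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Poll until the entity leaves a running state or reaches an error state."""
    running = set(running_states)
    errors = set(error_states)

    for attempt in range(max_retries):
        if attempt > 0:
            # Wait, then fetch a fresh copy before checking the state again.
            sleep(poll_interval)
            entity = refresh(entity)

        state = get_state(entity)
        if state in errors:
            if raise_on_error:
                raise RuntimeError(f"entity reached error state {state!r}")
            return entity
        if state not in running:
            return entity

    raise TimeoutError(f"entity still running after {max_retries} polls")


if __name__ == "__main__":
    # Simulated entity that finishes on the third poll; sleep is a no-op here.
    upcoming = iter(["running", "finished"])
    result = wait_until_done(
        {"state": "running"},
        refresh=lambda _e: {"state": next(upcoming)},
        get_state=lambda e: e["state"],
        sleep=lambda _seconds: None,
    )
    print(result["state"])
```

Passing raise_on_error=False in this sketch returns the entity in its error state instead of raising, which lets the caller inspect it before deciding how to proceed.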

For asynchronous usage, use the corresponding methods on async_client.helpers (for example, await async_client.helpers.evaluate(...)).
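
To illustrate the two agent forms that evaluate accepts, here is a small sketch of a local callable with the documented shape (messages: list[ChatMessage]) -> AgentReturn. It does not import giskard_hub: the ChatMessage class below is a stand-in whose role and content fields are assumptions, returning a plain string is assumed to be one acceptable AgentReturn, and evaluation_kind is a hypothetical helper that only mirrors the local/remote distinction described above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ChatMessage:
    """Stand-in for giskard_hub.types.ChatMessage (fields are assumed)."""
    role: str
    content: str


def echo_agent(messages: list[ChatMessage]) -> str:
    """A local agent: receives the conversation, returns the reply text."""
    user_turns = [m.content for m in messages if m.role == "user"]
    if not user_turns:
        return "Hello!"
    return f"You said: {user_turns[-1]}"


def evaluation_kind(agent: Any) -> str:
    """Hypothetical helper: a callable implies a local evaluation,
    anything else (an ID string or an agent object) a regular one."""
    return "local" if callable(agent) else "remote"


if __name__ == "__main__":
    conversation = [ChatMessage(role="user", content="ping")]
    print(echo_agent(conversation))
    print(evaluation_kind(echo_agent))
    print(evaluation_kind("agent-id-placeholder"))
```

A callable of this shape would be passed as the agent argument, while a string or Agent value would select the regular, remotely executed evaluation.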

Results

Types:

from giskard_hub.types.evaluation import (
    FailureCategory,
    TaskState,
    TestCaseEvaluation,
)
from giskard_hub.types import (
    TestCase,
    APIPaginatedMetadata,
)

Methods:

KnowledgeBases

Types:

from giskard_hub.types import (
    KnowledgeBase,
    KnowledgeBaseDocumentRow,
    KnowledgeBaseDocumentDetail,
    APIPaginatedMetadata,
)

Methods:

Projects

Types:

from giskard_hub.types import (
    Project,
)

Methods:

Scenarios

Types:

from giskard_hub.types import (
    Scenario,
    ScenarioPreview,
)

Methods:

Scans

Types:

from giskard_hub.types import (
    Agent,
    AgentReference,
    KnowledgeBase,
    Scan,
    ScanAvailableProbe,
    ScanCategory,
    ScanProbe,
)

Methods:

Probes

Types:

from giskard_hub.types.scan import (
    ScanProbe,
    ScanProbeAttempt,
)

Methods:

Attempts

Types:

from giskard_hub.types.scan import (
    ReviewStatus,
    ScanProbeAttempt,
    Severity,
)

Methods:

ScheduledEvaluations

Types:

from giskard_hub.types import (
    Agent,
    Dataset,
    ScheduledEvaluation,
    Evaluation,
    ErrorExecutionStatus,
    FrequencyOption,
    SuccessExecutionStatus,
)

Methods:

TestCases

Types:

from giskard_hub.types import (
    TestCase,
    TestCaseComment,
    TestCaseCheckConfig,
    ChatMessageWithMetadata,
)

Methods:

Comments

Types:

from giskard_hub.types import (
    TestCaseComment,
)

Methods:

Tasks

Types:

from giskard_hub.types import (
    Task,
    TaskStatus,
    TaskPriority,
)

Methods:

PlaygroundChats

Types:

from giskard_hub.types import (
    Agent,
    PlaygroundChat,
)

Methods: