Merged
190 changes: 190 additions & 0 deletions autonomous-ai-agents/data_quality_agent/README.md
@@ -0,0 +1,190 @@
# Select AI - Data Quality Check Agent for Oracle Autonomous Database

## Release Metadata

- Release Version: `1.1`
- Release Date: `19-May-2026`

## Overview

The **Data Quality Check Agent** provides schema-aware data quality assessment for Oracle Autonomous Database tables using Select AI Agent tools.

It supports:

- Table profiling
- Null/duplicate/outlier detection
- Quality score computation with history tracking
- Drift detection based on recent vs baseline score windows
- Issue listing with severity and remediation guidance
- Safe remediation preview and controlled apply mode
- OML Services monitoring setup and run trigger hooks

For definitions of **Tool**, **Task**, **Agent**, and **Agent Team**, see the top-level guide: [README](../README.md#simple-agent-execution-flow).

---

## Repository Contents

```text
.
├── database_quality_check_tools.sql
│   ├── Installer bootstrap and grants
│   ├── DATABASE_QUALITY package (core DQ logic)
│   ├── SELECT_AI_DATA_QUALITY_AGENT package (tool wrappers)
│   └── Tool registration
├── database_quality_check_agent.sql
│   ├── Task definition (DATA_QUALITY_TASKS)
│   ├── Agent creation (DATA_QUALITY_ADVISOR)
│   ├── Team creation (DATA_QUALITY_TEAM)
│   └── Default target schema behavior (DQ_TARGET_SCHEMA)
└── README.md
```

---

## Architecture Overview

```text
User Request
DATA_QUALITY_TASKS
DATA_QUALITY_ADVISOR Reasoning
  ├── PROFILE_TABLE_TOOL
  ├── DETECT_NULLS_TOOL
  ├── DETECT_DUPLICATES_TOOL
  ├── DETECT_OUTLIERS_TOOL
  ├── DETECT_DRIFT_TOOL
  ├── GENERATE_QUALITY_RULES_TOOL
  ├── EVALUATE_QUALITY_SCORE_TOOL
  ├── LIST_QUALITY_ISSUES_TOOL
  ├── SUGGEST_REMEDIATION_TOOL
  ├── APPLY_REMEDIATION_TOOL
  ├── SETUP_OML_DATA_MONITORING_TOOL
  └── RUN_OML_DATA_MONITORING_TOOL
Issue Summary + Severity + Quality Score + Next Action
```

---

## Prerequisites

- Oracle Autonomous AI Database (26ai recommended)
- Select AI and `DBMS_CLOUD_AI_AGENT` enabled
- `ADMIN` (or equivalent privileged user) for installation
- A valid AI profile (`DBMS_CLOUD_AI.CREATE_PROFILE`)
- Object privileges on the target schema's tables, granted to the install schema (required only for cross-schema checks)
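
As a minimal sketch of the cross-schema prerequisite, the grant below gives the install schema read access to one target table. The names are assumptions for illustration: `DQ_AGENT_USER` stands in for your install schema, and `SH.SALES` is borrowed from the OML example prompt later in this README. Applying remediation would presumably need DML privileges as well; that is not stated in this README, so confirm it against the tools script.

```sql
-- Hypothetical names: DQ_AGENT_USER = install schema, SH.SALES = target table.
-- Run as the table owner or a privileged user such as ADMIN.
GRANT SELECT ON SH.SALES TO DQ_AGENT_USER;

-- Check which object privileges the install schema holds on the target schema.
-- DBA_TAB_PRIVS needs a privileged session.
SELECT owner, table_name, privilege
FROM   dba_tab_privs
WHERE  grantee = 'DQ_AGENT_USER'
AND    owner   = 'SH'
ORDER  BY table_name, privilege;
```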

For OML monitoring tools:

- `SELECTAI_AGENT_CONFIG` entries for `AGENT='DATA_QUALITY'`:
  - `OML_MONITORING_ENDPOINT`
  - `OML_MONITORING_CREDENTIAL`

For controlled remediation apply:

- `SELECTAI_AGENT_CONFIG` entry for `AGENT='DATA_QUALITY'`:
  - `REMEDIATION_APPROVAL_CODE`
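
The sketch below shows one way to inspect and seed these settings. Only the `AGENT` column is confirmed by this README; the `KEY` and `VALUE` column names are assumptions, so run `DESCRIBE SELECTAI_AGENT_CONFIG` first and adjust the statement to the actual table definition. The value shown is a placeholder.

```sql
-- List what is already configured for this agent (AGENT column is named in this README).
SELECT *
FROM   SELECTAI_AGENT_CONFIG
WHERE  AGENT = 'DATA_QUALITY';

-- ASSUMPTION: settings are stored as key/value rows in columns named KEY and VALUE.
-- Verify the real column names before running this insert.
INSERT INTO SELECTAI_AGENT_CONFIG (AGENT, KEY, VALUE)
VALUES ('DATA_QUALITY', 'REMEDIATION_APPROVAL_CODE', '<your_approval_code>');

COMMIT;
```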

---

## Installation

Run as `ADMIN` (or privileged user) from this folder:

```bash
sqlplus admin@<adb_connect_string> @database_quality_check_tools.sql
sqlplus admin@<adb_connect_string> @database_quality_check_agent.sql
```

Prompts in the tools script:

- `SCHEMA_NAME` (schema where the packages and tools are installed)

Prompts in the agent script:

- `SCHEMA_NAME` (the same install schema)
- `AI_PROFILE_NAME`
- `DQ_TARGET_SCHEMA` (default schema for DQ checks; if left blank, `SCHEMA_NAME` is used)

Important:

- Re-run `database_quality_check_agent.sql` whenever task instructions are changed.
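
After both scripts finish, a quick sanity check is to confirm that the task, agent, team, and tools exist. The queries below are a sketch that assumes the Select AI Agent dictionary views are named as shown; check the `DBMS_CLOUD_AI_AGENT` documentation for your database version if a view is not found. `SELECT *` is used so that no column names are assumed.

```sql
-- Run while connected as the install schema (the SCHEMA_NAME given at the prompts).
-- ASSUMPTION: these view names match the Select AI Agent views in your database version.
SELECT * FROM USER_AI_AGENT_TOOLS;   -- expect the twelve *_TOOL entries
SELECT * FROM USER_AI_AGENT_TASKS;   -- expect DATA_QUALITY_TASKS
SELECT * FROM USER_AI_AGENTS;        -- expect DATA_QUALITY_ADVISOR
SELECT * FROM USER_AI_AGENT_TEAMS;   -- expect DATA_QUALITY_TEAM
```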

---

## Internal Tables

Created in the install schema if missing:

- `DQ_RUN_HISTORY$`:
  - score history per table and run
  - stores score component metrics and worst severity
- `DQ_FINDINGS$`:
  - issue registry with severity, recommendation, and optional fix SQL
- `DQ_OML_MONITORS$`:
  - registered OML monitor metadata and last run response
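
To confirm the tables were created, a data dictionary lookup is enough. This sketch assumes you are connected as the install schema; from another user, query `ALL_TABLES` and filter on `OWNER` instead.

```sql
-- Lists whichever of the three internal tables exist in the current schema.
SELECT table_name
FROM   user_tables
WHERE  table_name IN ('DQ_RUN_HISTORY$', 'DQ_FINDINGS$', 'DQ_OML_MONITORS$')
ORDER  BY table_name;
```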

---

## Tool-to-Function Mapping

| Tool | Function | Purpose |
|---|---|---|
| `PROFILE_TABLE_TOOL` | `select_ai_data_quality_agent.profile_table` | Baseline profile |
| `DETECT_NULLS_TOOL` | `select_ai_data_quality_agent.detect_nulls` | Null issue detection |
| `DETECT_DUPLICATES_TOOL` | `select_ai_data_quality_agent.detect_duplicates` | Duplicate detection |
| `DETECT_OUTLIERS_TOOL` | `select_ai_data_quality_agent.detect_outliers` | Outlier detection |
| `DETECT_DRIFT_TOOL` | `select_ai_data_quality_agent.detect_drift` | Drift analysis |
| `GENERATE_QUALITY_RULES_TOOL` | `select_ai_data_quality_agent.generate_quality_rules` | Rule suggestions |
| `EVALUATE_QUALITY_SCORE_TOOL` | `select_ai_data_quality_agent.evaluate_quality_score` | Score + persistence |
| `LIST_QUALITY_ISSUES_TOOL` | `select_ai_data_quality_agent.list_quality_issues` | Issue review |
| `SUGGEST_REMEDIATION_TOOL` | `select_ai_data_quality_agent.suggest_remediation` | SQL guidance |
| `APPLY_REMEDIATION_TOOL` | `select_ai_data_quality_agent.apply_remediation` | Preview/apply fix SQL |
| `SETUP_OML_DATA_MONITORING_TOOL` | `select_ai_data_quality_agent.setup_oml_data_monitoring` | Register OML monitor |
| `RUN_OML_DATA_MONITORING_TOOL` | `select_ai_data_quality_agent.run_oml_data_monitoring` | Trigger OML monitor run |

---

## Operational Behavior

- If `owner_name` is omitted, the agent defaults to `DQ_TARGET_SCHEMA`.
- For schema-wide requests (for example, “all tables”), the task instruction tells the agent to discover tables automatically instead of asking the user to list table names.
- `APPLY_REMEDIATION_TOOL`:
  - default mode is `PREVIEW`
  - `APPLY` requires a matching `approval_code`
  - SQL safety checks block unsafe statements
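
The gating described for `APPLY_REMEDIATION_TOOL` can be pictured with the sketch below. It is not the shipped implementation: the function name, its parameters, and the list of blocked keywords are all hypothetical, chosen only to illustrate the default-to-preview, approval-code, and statement-check sequence.

```sql
-- Hypothetical illustration only; the real logic lives in the installed packages.
CREATE OR REPLACE FUNCTION dq_apply_gate_sketch(
  p_execute_mode  IN VARCHAR2,  -- 'PREVIEW' or 'APPLY'
  p_approval_code IN VARCHAR2,  -- code supplied by the user
  p_expected_code IN VARCHAR2,  -- configured REMEDIATION_APPROVAL_CODE
  p_fix_sql       IN CLOB       -- candidate remediation statement
) RETURN VARCHAR2
AS
  -- Anything other than an explicit APPLY is treated as PREVIEW.
  l_mode VARCHAR2(16) := UPPER(NVL(TRIM(p_execute_mode), 'PREVIEW'));
BEGIN
  IF l_mode <> 'APPLY' THEN
    RETURN 'PREVIEW';
  END IF;

  -- APPLY needs a configured code and an exact match.
  IF p_expected_code IS NULL
     OR p_approval_code IS NULL
     OR p_approval_code <> p_expected_code THEN
    RETURN 'BLOCKED: approval_code missing or mismatched';
  END IF;

  -- ASSUMPTION: an example deny list; the actual safety checks may differ.
  IF REGEXP_LIKE(p_fix_sql,
                 '(^|[^A-Z_])(DROP|TRUNCATE|GRANT|ALTER)([^A-Z_]|$)', 'i') THEN
    RETURN 'BLOCKED: statement type not allowed';
  END IF;

  RETURN 'APPLY';
END;
/
```

Checking the mode first keeps the safe path cheap: a request that never asks for `APPLY` returns before any approval code is compared.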

---

## Example Prompts

- `Check null issues in SALES and show columns with null_count, null_rate_pct, and severity.`
- `Detect duplicates in SALES using all columns and show duplicate_row_count, duplicate_rate_pct, and severity.`
- `Find numeric outliers in SALES using z-score threshold 3 and rank by severity.`
- `Evaluate quality score for SALES and explain null, duplicate, outlier, and drift components.`
- `Evaluate quality score for every table in the default target schema and return table-wise summary.`
- `List open HIGH severity quality issues for SALES with recommendation and generated_fix_sql.`
- `Preview remediation for issue_id 1 on SALES.`
- `Apply remediation for issue_id 1 on SALES with execute_mode APPLY and approval_code <code>.`

OML examples:

- `Set up OML data monitoring for SH.SALES with monitor name SH_SALES_DQ_MON, baseline query "<baseline_sql>", and new-data query "<new_sql>".`
- `Run OML data monitoring for monitor SH_SALES_DQ_MON and return the job response.`
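
One way to send these prompts is to set the team for the session and use the `SELECT AI AGENT` action, sketched below. It assumes you are connected as the install schema from a client that supports Select AI statements, and that `DBMS_CLOUD_AI_AGENT.SET_TEAM` takes the team name as its argument; verify both against the documentation for your database version.

```sql
-- Make DATA_QUALITY_TEAM the active agent team for this session.
EXEC DBMS_CLOUD_AI_AGENT.SET_TEAM('DATA_QUALITY_TEAM');

-- Send one of the example prompts to the team.
SELECT AI AGENT Check null issues in SALES and show columns with null_count, null_rate_pct, and severity;
```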

---

## Troubleshooting

- `ORA-00942` during package compilation:
  - Re-run `database_quality_check_tools.sql`; it pre-creates `DQ_*` tables.
- Agent asks for table list during “all tables” request:
  - Re-run `database_quality_check_agent.sql` to recreate the task with the latest instructions.
- OML monitoring tool errors with missing config:
  - Insert the required keys in `SELECTAI_AGENT_CONFIG` for `AGENT='DATA_QUALITY'`.
- Apply remediation blocked:
  - Ensure `REMEDIATION_APPROVAL_CODE` is configured and supplied as `approval_code`.
@@ -0,0 +1,192 @@
rem ============================================================================
rem LICENSE
rem Copyright (c) 2026 Oracle and/or its affiliates.
rem Licensed under the Universal Permissive License (UPL), Version 1.0
rem https://oss.oracle.com/licenses/upl/
rem
rem NAME
rem database_quality_check_agent.sql
rem
rem DESCRIPTION
rem Installer and configuration script for Data Quality Check AI Agent Team.
rem
rem RELEASE VERSION
rem 1.0
rem
rem RELEASE DATE
rem 18-May-2026
rem ============================================================================

SET SERVEROUTPUT ON
SET VERIFY OFF

PROMPT ======================================================
PROMPT Data Quality Check AI Agent Installer
PROMPT ======================================================

VAR v_schema VARCHAR2(128)
EXEC :v_schema := '&SCHEMA_NAME';

VAR v_ai_profile_name VARCHAR2(128)
EXEC :v_ai_profile_name := '&AI_PROFILE_NAME';

PROMPT
PROMPT DQ_TARGET_SCHEMA:
PROMPT Schema to inspect by default for data quality checks.
PROMPT If blank, SCHEMA_NAME is used as default.
PROMPT

VAR v_dq_target_schema VARCHAR2(128)
EXEC :v_dq_target_schema := '&DQ_TARGET_SCHEMA';

DECLARE
  l_sql          VARCHAR2(500);
  l_schema       VARCHAR2(128);
  l_session_user VARCHAR2(128);
BEGIN
  -- Reject anything that is not a simple SQL identifier before building DDL with it.
  l_schema       := DBMS_ASSERT.SIMPLE_SQL_NAME(:v_schema);
  l_session_user := SYS_CONTEXT('USERENV', 'SESSION_USER');

  -- A user cannot grant privileges to itself, so grant only to a different schema.
  IF UPPER(l_schema) <> UPPER(l_session_user) THEN
    l_sql := 'GRANT EXECUTE ON DBMS_CLOUD_AI_AGENT TO ' || l_schema;
    EXECUTE IMMEDIATE l_sql;

    l_sql := 'GRANT EXECUTE ON DBMS_CLOUD_AI TO ' || l_schema;
    EXECUTE IMMEDIATE l_sql;

    l_sql := 'GRANT EXECUTE ON DBMS_CLOUD TO ' || l_schema;
    EXECUTE IMMEDIATE l_sql;

    -- Report completion only when grants were actually issued.
    DBMS_OUTPUT.PUT_LINE('Grants completed.');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Skipping grants for schema ' || l_schema ||
                         ' (same as session user).');
  END IF;
END;
/

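BEGIN
  -- Validate the schema name here too; the raw bind value must not reach dynamic SQL.
  EXECUTE IMMEDIATE 'ALTER SESSION SET CURRENT_SCHEMA = '
    || DBMS_ASSERT.SIMPLE_SQL_NAME(:v_schema);
END;
/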

CREATE OR REPLACE PROCEDURE install_data_quality_check_agent(
p_install_schema IN VARCHAR2,
p_profile_name IN VARCHAR2,
p_dq_target_schema IN VARCHAR2
)
AUTHID DEFINER
AS
l_target_schema VARCHAR2(128);
BEGIN
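  -- Trim before NVL so a whitespace-only answer also falls back to the install schema,
  -- and validate the result because it is embedded in the task instruction text.
  l_target_schema := DBMS_ASSERT.SIMPLE_SQL_NAME(
    UPPER(NVL(TRIM(p_dq_target_schema), TRIM(p_install_schema))));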

DBMS_OUTPUT.PUT_LINE('--------------------------------------------');
DBMS_OUTPUT.PUT_LINE('Starting Data Quality Check AI installation');
DBMS_OUTPUT.PUT_LINE('--------------------------------------------');

BEGIN
DBMS_CLOUD_AI_AGENT.DROP_TASK('DATA_QUALITY_TASKS');
EXCEPTION
WHEN OTHERS THEN
NULL;
END;

DBMS_CLOUD_AI_AGENT.CREATE_TASK(
task_name => 'DATA_QUALITY_TASKS',
description => 'Task for data quality profiling, scoring, and remediation planning',
attributes => '{
"instruction": "You are a Data Quality specialist for Oracle Autonomous Database. '
|| 'Default target schema for data quality checks is ' || l_target_schema || '. '
|| 'If the user does not provide owner_name, use owner_name=' || l_target_schema || '. '
|| 'If the user provides a different schema explicitly, use that schema. '
|| 'Cross-schema analysis is allowed when object privileges are granted to the install schema. '
|| 'When the user asks for all tables or schema-wide analysis, automatically discover table names from the target schema and run checks without asking the user to provide table lists. '
|| 'If no owner_name is provided in schema-wide requests, use owner_name=' || l_target_schema || '. '
|| 'Do not ask the user for table names when this can be derived from ALL_TABLES/USER_TABLES metadata. '
|| 'Use PROFILE_TABLE_TOOL first to establish table baseline when user provides owner/table. '
|| 'Use DETECT_NULLS_TOOL, DETECT_DUPLICATES_TOOL, and DETECT_OUTLIERS_TOOL to identify quality issues with severity. '
|| 'Use GENERATE_QUALITY_RULES_TOOL to propose enforceable quality rules. '
|| 'Use EVALUATE_QUALITY_SCORE_TOOL to compute/store overall quality score and history point. '
|| 'Use DETECT_DRIFT_TOOL to identify recent score drift against baseline history. '
|| 'Use SETUP_OML_DATA_MONITORING_TOOL for automated OML Services data monitoring setup when requested. '
|| 'Use RUN_OML_DATA_MONITORING_TOOL to trigger OML monitoring jobs and report run response. '
|| 'Use LIST_QUALITY_ISSUES_TOOL for issue review. '
|| 'Use SUGGEST_REMEDIATION_TOOL to produce practical SQL-based fixes. '
|| 'Only use APPLY_REMEDIATION_TOOL in PREVIEW mode unless the user explicitly asks to apply changes and provides approval_code. '
|| 'Always return: issue summary, severity, quality score, and next remediation step. '
|| 'User request: {query}",
"tools": [
"PROFILE_TABLE_TOOL",
"DETECT_NULLS_TOOL",
"DETECT_DUPLICATES_TOOL",
"DETECT_OUTLIERS_TOOL",
"DETECT_DRIFT_TOOL",
"SETUP_OML_DATA_MONITORING_TOOL",
"RUN_OML_DATA_MONITORING_TOOL",
"GENERATE_QUALITY_RULES_TOOL",
"EVALUATE_QUALITY_SCORE_TOOL",
"LIST_QUALITY_ISSUES_TOOL",
"SUGGEST_REMEDIATION_TOOL",
"APPLY_REMEDIATION_TOOL"
],
"enable_human_tool": "true"
}'
);
DBMS_OUTPUT.PUT_LINE('Created task DATA_QUALITY_TASKS');

BEGIN
DBMS_CLOUD_AI_AGENT.DROP_AGENT('DATA_QUALITY_ADVISOR');
EXCEPTION
WHEN OTHERS THEN
NULL;
END;

DBMS_CLOUD_AI_AGENT.CREATE_AGENT(
agent_name => 'DATA_QUALITY_ADVISOR',
attributes =>
'{' ||
'"profile_name":"' || p_profile_name || '",' ||
'"role":"You are a Data Quality Advisor. You profile data, detect anomalies and drift, compute quality scores, and recommend safe remediation steps for Oracle Autonomous Database tables."' ||
'}',
description => 'AI agent for Oracle Autonomous Database data quality monitoring and remediation guidance'
);
DBMS_OUTPUT.PUT_LINE('Created agent DATA_QUALITY_ADVISOR');

BEGIN
DBMS_CLOUD_AI_AGENT.DROP_TEAM('DATA_QUALITY_TEAM');
EXCEPTION
WHEN OTHERS THEN
NULL;
END;

DBMS_CLOUD_AI_AGENT.CREATE_TEAM(
team_name => 'DATA_QUALITY_TEAM',
attributes => '{
"agents":[{"name":"DATA_QUALITY_ADVISOR","task":"DATA_QUALITY_TASKS"}],
"process":"sequential"
}'
);

DBMS_OUTPUT.PUT_LINE('Created team DATA_QUALITY_TEAM');
DBMS_OUTPUT.PUT_LINE('--------------------------------------------');
DBMS_OUTPUT.PUT_LINE('Data Quality Check AI installation COMPLETE');
DBMS_OUTPUT.PUT_LINE('--------------------------------------------');
END install_data_quality_check_agent;
/

PROMPT Executing installer procedure ...
BEGIN
install_data_quality_check_agent(
p_install_schema => :v_schema,
p_profile_name => :v_ai_profile_name,
p_dq_target_schema => :v_dq_target_schema
);
END;
/

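BEGIN
  -- Restore the connected user's own schema rather than assuming the installer is ADMIN.
  EXECUTE IMMEDIATE 'ALTER SESSION SET CURRENT_SCHEMA = '
    || DBMS_ASSERT.ENQUOTE_NAME(SYS_CONTEXT('USERENV', 'SESSION_USER'), FALSE);
END;
/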

PROMPT ======================================================
PROMPT Installation finished successfully
PROMPT ======================================================