Commit 8be3ff4

Merge pull request #2888 from oracle-devrel/lakehouse-update
AI Skills added for Lakehouse
2 parents b32b9ff + 9dbaa22 commit 8be3ff4

53 files changed

Lines changed: 3098 additions & 0 deletions

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
Copyright (c) 2026 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any
person obtaining a copy of this software, associated documentation and/or data
(collectively the "Software"), free of charge and under any and all copyright
rights in the Software, and any and all patent rights owned or freely
licensable by each licensor hereunder covering either (i) the unmodified
Software as contributed to or provided by such licensor, or (ii) the Larger
Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
one is included with the Software (each a "Larger Work" to which the Software
is contributed by such licensors),

without restriction, including without limitation the rights to copy, create
derivative works of, display, perform, and distribute the Software and make,
use, sell, offer for sale, import, export, have made, and have sold the
Software and the Larger Work(s), and to sublicense the foregoing rights on
either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at
a minimum a reference to the UPL must be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# Autonomous AI Lakehouse Skills

This repository contains two community-driven Skills designed to accelerate common workflows when working with Oracle Autonomous AI Lakehouse:

- **AI Lakehouse Ops Skill**
- **AI Lakehouse Data Loader Skill**

These Skills help users automate and simplify operational and data-loading tasks related to AI Lakehouse environments.

**IMPORTANT NOTE:** These Skills are not official Oracle products and are provided as community-driven resources to assist in expediting AI Lakehouse-related workflows. While they aim to be helpful, they are not guaranteed to cover all scenarios or provide complete accuracy. Users are strongly encouraged to consult the official Oracle documentation for definitive guidance and support.

Reviewed: 06.05.2026

# Included Skills

## AI Lakehouse Ops Skill

The **AI Lakehouse Ops Skill** is designed to assist with operational tasks related to Oracle Autonomous AI Lakehouse environments.

It helps users review operational inputs, summarize environment status, identify potential issues, and generate guidance for troubleshooting or validating AI Lakehouse-related configurations.

Typical use cases include:

- Reviewing AI Lakehouse operational logs or diagnostic information
- Summarizing environment status
- Identifying possible configuration or runtime issues
- Generating operational reports
- Providing troubleshooting guidance
- Suggesting next steps for validation or investigation

This Skill is intended to support operations, enablement, and troubleshooting activities, especially in non-production or test environments.

## AI Lakehouse Data Loader Skill

The **AI Lakehouse Data Loader Skill** is designed to assist with loading, preparing, and validating data for AI Lakehouse workflows.

It helps users understand data-loading requirements, prepare data ingestion steps, review source data inputs, and generate guidance for moving data into an AI Lakehouse environment.

Typical use cases include:

- Preparing data-loading workflows
- Reviewing source files, metadata, or schemas
- Generating data ingestion guidance
- Identifying possible data preparation issues
- Creating step-by-step data loading instructions
- Summarizing loaded or pending datasets
- Supporting demos, prototypes, and enablement scenarios

This Skill is intended to accelerate data onboarding and ingestion-related tasks for AI Lakehouse use cases.
# When to use this asset?

Use these Skills when you are working with Oracle Autonomous AI Lakehouse and need assistance with either operational tasks or data-loading workflows.

Use the **AI Lakehouse Ops Skill** when you need to:

- Review operational information
- Analyze logs or configuration snippets
- Generate an operations-oriented summary
- Troubleshoot AI Lakehouse-related issues
- Document findings or recommended next steps

Use the **AI Lakehouse Data Loader Skill** when you need to:

- Prepare data for AI Lakehouse usage
- Review data-loading inputs
- Generate ingestion steps
- Summarize datasets, schemas, or metadata
- Validate data readiness before loading

These Skills are intended for advisory and productivity purposes. They should not replace official Oracle documentation, formal validation, testing, or Oracle support.

# How to use this asset?

Each Skill can be used independently, depending on the task. The Skills are built to work with any agent, such as OpenAI or Claude.
Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
---
name: autonomous-data-loader
description: generate and safely execute oracle autonomous ai lakehouse data loading and oci object storage lakehouse access workflows using dbms_cloud. use when the user wants to list oci object storage files, choose files or prefixes to load, create conservative csv staging tables, generate or run copy_data or copy_collection, tune dbms_cloud format options, load json documents into soda collections, create external tables to query apache iceberg data stored in oci object storage using direct metadata.json or hadoop catalog patterns, monitor user_load_operations or dba_load_operations, inspect logfile_table or badfile_table, troubleshoot rejected rows, reconcile loads, or profile staged data after loading. this skill is mcp-first with generate-only fallback and is scoped to oci object storage and dbms_cloud-based workflows.
---

# Autonomous AI Lakehouse Data Loader

## Purpose

Use this skill to help users load data from OCI Object Storage into Oracle Autonomous AI Lakehouse with `DBMS_CLOUD`, and to create external-table access to Apache Iceberg data stored in OCI Object Storage. The skill is designed for a portable Agent Skill workflow: it can generate SQL/PLSQL for manual execution, or execute through an available MCP SQL tool when connected to the target Autonomous database.
## Core Scope

Handle these workflows:

- Discover objects in OCI Object Storage with `DBMS_CLOUD.LIST_OBJECTS` (see the discovery sketch at the end of this section).
- Normalize Object Storage URIs and choose a file, selected file list, prefix, wildcard, or regex pattern.
- Prefer existing `DBMS_CLOUD` credential names and never request secrets in chat.
- Check whether target tables exist before generating `COPY_DATA`.
- Generate and optionally execute `DBMS_CLOUD.COPY_DATA` for supported file loads into existing relational tables.
- Generate and optionally execute `DBMS_CLOUD.COPY_COLLECTION` for JSON documents into SODA collections.
- For CSV without an existing target table, offer conservative staging from the CSV header using `VARCHAR2(4000)` columns.
- Generate format options for CSV, JSON, Parquet, ORC, and Avro. Treat XML as version-specific and verify official documentation before generating XML load workflows.
- Create and validate external tables that query Apache Iceberg data stored in OCI Object Storage, using only the direct `metadata.json` and HadoopCatalog-on-OCI patterns documented for Autonomous AI Database.
- Monitor and reconcile loads with native `USER_LOAD_OPERATIONS` or `DBA_LOAD_OPERATIONS`.
- Inspect `LOGFILE_TABLE` and `BADFILE_TABLE` after failures or rejected rows.
- Profile staged data after load and propose curated DDL, clearly marked as a proposal.

Do not make Data Pump or `DBMS_CLOUD_PIPELINE` part of any default workflow. Do not add non-OCI Iceberg providers such as Unity, Polaris, AWS Glue, S3, Azure, or GCS to the default workflow.
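As a minimal sketch of the discovery step, assuming an existing credential named `OBJ_STORE_CRED` and a placeholder bucket URI (both illustrative):

```sql
-- Read-only discovery: list candidate objects under a prefix before planning a load.
-- Replace the credential name and URI with real values.
select object_name, bytes, last_modified
  from dbms_cloud.list_objects(
         credential_name => 'OBJ_STORE_CRED',
         location_uri    => 'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>/o/raw/sales/'
       )
 where bytes > 0
 order by object_name;
```

Filtering on `bytes > 0` mirrors the guardrail about skipping zero-byte marker files.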
## Execution Model

Default to MCP-enabled execution when a SQL execution tool is available. If no MCP SQL tool is available, use generate-only mode.

### MCP-enabled mode

- Use the available MCP SQL execution tool for read-only inspection queries.
- Do not assume a specific tool name. Prefer the SQL tool connected to the target Autonomous AI Lakehouse database.
- Execute read-only checks directly when useful: dictionary queries, `LIST_OBJECTS`, load-history queries, log and badfile inspection, and Iceberg external-table sanity checks such as `COUNT(*)` (see the sketch after this list).
- For mutating operations, generate the SQL/PLSQL first, explain the impact, and require approval before execution.
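For example, a minimal read-only pre-check of target table existence (the table name is illustrative):

```sql
-- Dictionary check: does the intended COPY_DATA target already exist in this schema?
select table_name
  from user_tables
 where table_name = upper('STG_SALES_CSV');
```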
### Generate-only mode

- Generate SQL/PLSQL and ask the user to execute it manually in their preferred Oracle client.
- Ask the user to paste results back when the next step depends on inspection output.

## Approval Policy

Support two approval styles:

- **Strict approval**: ask before every mutating operation. This is the default.
- **Batch approval**: show the complete non-destructive mutating plan first, then execute the approved plan. Use only when the user asks for batch approval or clearly approves the entire plan.

Always require strict approval for destructive operations, even when batch approval is active.

Mutating operations include:

- `CREATE TABLE`, `ALTER TABLE`, `CREATE COLLECTION` patterns, and similar DDL.
- `DBMS_CLOUD.COPY_DATA`.
- `DBMS_CLOUD.COPY_COLLECTION` (it may create a missing SODA collection, so treat it as mutating even before any rows or documents are loaded).
- `DBMS_CLOUD.CREATE_CREDENTIAL`.
- `DBMS_CLOUD.CREATE_EXTERNAL_TABLE` for Iceberg access.
- `DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE` for Iceberg Object Storage ACL setup.
- `INSERT`, `UPDATE`, `DELETE`, `MERGE`.

Destructive operations include:

- `DROP TABLE`.
- `TRUNCATE TABLE`.
- `ALTER TABLE DROP COLUMN`.
- `DELETE` without a narrowly scoped predicate.
- Replacing, truncating, or recreating an existing staging table.

Prefer non-destructive alternatives, such as a new staging table name, before recommending destructive cleanup.
## Guardrails

- Never ask users to paste secrets, API keys, auth tokens, private keys, or passwords into the prompt.
- Prefer an existing `DBMS_CLOUD` credential name.
- If a credential is missing, generate a `CREATE_CREDENTIAL` template with placeholders (see the template after this list) and warn users to replace the placeholders outside the chat.
- Do not infer a final CSV business schema from a filename, bucket, or folder alone.
- For CSV without a target table, ask whether the user wants conservative staging, a user-provided schema, or profiling first.
- Do not mix formats in a single `COPY_DATA` operation.
- Do not load from a whole prefix until object discovery shows the files are homogeneous enough.
- Exclude marker/control files such as `_SUCCESS`, `.crc`, manifests, readme files, and zero-byte files unless the user explicitly requests otherwise.
- Treat generated curated DDL as proposed until the user approves it.
- For Iceberg workflows, keep the scope to OCI Object Storage only and generate only external table access patterns; do not treat Iceberg as a `COPY_DATA` load.
- For Iceberg direct metadata, warn that the table points to a specific `metadata.json` snapshot and may need to be recreated after snapshot or schema changes.
- For Iceberg HadoopCatalog on OCI, require the lakehouse folder URI and `iceberg_table_path`.
- Warn users about documented Iceberg limitations before creating an external table: fixed external-table schema, no query-time time travel, unsupported merge-on-read delete files, and provider/version-specific restrictions.
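A hedged template for the missing-credential case; every angle-bracket value is a placeholder that the user fills in outside the chat:

```sql
-- Template only: never paste real secrets into the conversation.
-- Replace the placeholders in a trusted SQL client before running.
begin
  dbms_cloud.create_credential(
    credential_name => '<credential_name>',
    username        => '<oci_username>',
    password        => '<auth_token_created_in_the_oci_console>'
  );
end;
/
```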
## Workflow Decision Tree

1. Identify the source request:

- bucket or prefix discovery: use `references/object-discovery-and-selection.md`.
- direct relational table load: use `references/copy-data.md`.
- JSON document collection load: use `references/copy-collection-json.md`.
- Apache Iceberg data stored in OCI Object Storage: use `references/iceberg-oci-object-storage.md`.
- failed load or rejected rows: use `references/monitoring-and-troubleshooting.md`.
- CSV with no target table: use `references/csv-staging-and-profiling.md`.

2. Collect minimum inputs:

- `credential_name` or instruction to create one.
- OCI Object Storage URI, bucket/prefix, or exact file URI.
- target table or collection name, unless the user wants discovery/planning only.
- format or enough evidence to infer the format from selected object names.
- for Iceberg: external table name, credential name, OCI Object Storage URI for `metadata.json` or lakehouse folder, and optionally `iceberg_table_path` for HadoopCatalog.

3. Run read-only pre-checks when MCP is available:

- list object candidates with `DBMS_CLOUD.LIST_OBJECTS`.
- check target table or collection existence.
- inspect target columns when loading into a relational table.
- inspect recent load history when troubleshooting.
- inspect Iceberg metadata file or lakehouse folder candidates when building an Iceberg external table.

4. Plan the load or access pattern:

- choose exact file list, prefix/wildcard, or regex pattern.
- select `COPY_DATA`, `COPY_COLLECTION`, or `CREATE_EXTERNAL_TABLE` for Iceberg query access.
- select format options or Iceberg access protocol configuration.
- decide direct load versus user-named staging.

5. For mutating operations:

- present the SQL/PLSQL.
- explain the risk.
- ask for strict or batch approval.
- execute only after approval if MCP is available.

6. After execution:

- query `USER_LOAD_OPERATIONS` or `DBA_LOAD_OPERATIONS` for load operations (see the monitoring sketch after this list).
- for Iceberg external tables, run a read-only sanity check such as `SELECT COUNT(*)` and inspect table columns.
- reconcile status, operation ID, start/update time, log table, badfile table, and row counts where possible.
- if failed or rejected rows are present, switch to troubleshooting.
- if CSV staging was used, offer post-load profiling and a curated DDL proposal.
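As a sketch of the post-execution check, assuming the load targeted a table named `STG_SALES_CSV` (illustrative):

```sql
-- Read-only reconciliation of recent load operations for one target table.
select id, type, status, start_time, update_time,
       rows_loaded, logfile_table, badfile_table
  from user_load_operations
 where table_name = 'STG_SALES_CSV'
 order by start_time desc
 fetch first 5 rows only;
```

The `logfile_table` and `badfile_table` values name per-operation tables that can be queried directly when rows are rejected.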
## Response Style

Be flexible and concise. Do not force every answer into a rigid template. For mutating or destructive operations, always clearly show:

- what will change,
- the SQL/PLSQL involved,
- whether approval is required,
- how to monitor the result,
- and how to troubleshoot failures.

For Iceberg external-table access, clearly state that the operation creates query access to data in Object Storage; it does not copy the Iceberg data into Autonomous.
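A hedged sketch of the direct `metadata.json` pattern; the table name, credential, and URI are placeholders, and the exact format options should be verified against `references/iceberg-oci-object-storage.md` and the Oracle documentation before use:

```sql
-- Creates query access only; no Iceberg data is copied into the database.
begin
  dbms_cloud.create_external_table(
    table_name      => 'SALES_ICEBERG_EXT',
    credential_name => 'OBJ_STORE_CRED',
    file_uri_list   => 'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>/o/sales/metadata/v2.metadata.json',
    format          => '{"access_protocol": {"protocol_type": "iceberg"}}'
  );
end;
/

-- Read-only sanity check after creation.
select count(*) from SALES_ICEBERG_EXT;
```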
## References

Use these files when relevant:

- `references/oracle-docs-index.md` for official Oracle documentation links.
- `references/version-notes.md` for the v0.1 scope and release notes.
- `references/minimum-inputs.md` for minimum required inputs by workflow.
- `references/object-discovery-and-selection.md` for object listing and file selection.
- `references/source-and-credentials.md` for OCI Object Storage credentials and URI patterns.
- `references/copy-data.md` for relational table loads with `DBMS_CLOUD.COPY_DATA`.
- `references/copy-collection-json.md` for JSON document loads into SODA collections.
- `references/format-options.md` for format option guidance.
- `references/iceberg-oci-object-storage.md` for querying Iceberg data in OCI Object Storage with external tables.
- `references/csv-staging-and-profiling.md` for CSV staging and post-load profiling.
- `references/monitoring-and-troubleshooting.md` for load monitoring, logs, badfiles, and retry guidance.
- `references/mcp-execution.md` for MCP-first execution behavior.

Use examples in `examples/` only when the user needs a concrete pattern.
@@ -0,0 +1,3 @@
interface:
  display_name: "Autonomous AI Lakehouse Data Loader"
  short_description: "Load OCI Object Storage data and create Iceberg query access with DBMS_CLOUD."
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
# Example: CSV conservative staging

User request:

> There is a CSV in this bucket. I do not have a table. Load it.

Recommended response:

- Explain that CSV does not provide reliable types.
- List objects first.
- Read or ask for the header.
- Ask the user for a staging table name.
- Create a staging table with all CSV columns as `VARCHAR2(4000)`.
- Load with `COPY_DATA` and `skipheaders => 1`.
- Profile after load.

Assume header:

```text
order_id,customer_id,order_date,amount,currency
```

Proposed staging table:

```sql
create table STG_SALES_CSV (
  ORDER_ID    varchar2(4000),
  CUSTOMER_ID varchar2(4000),
  ORDER_DATE  varchar2(4000),
  AMOUNT      varchar2(4000),
  CURRENCY    varchar2(4000)
);
```

Load:

```sql
declare
  l_operation_id number;
begin
  dbms_cloud.copy_data(
    table_name      => 'STG_SALES_CSV',
    credential_name => 'OBJ_STORE_CRED',
    file_uri_list   => 'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>/o/raw/sales/sales.csv',
    format          => json_object(
      'type' value 'csv',
      'skipheaders' value 1,
      'delimiter' value ',',
      'quote' value '"',
      'blankasnull' value 'true',
      'rejectlimit' value 100,
      'enablelogs' value 'true',
      'logretention' value 7
    ),
    operation_id    => l_operation_id
  );
  dbms_output.put_line('operation_id=' || l_operation_id);
end;
/
```

Post-load profiling:

```sql
select count(*) as row_count from STG_SALES_CSV;

select
  max(length(ORDER_ID)) as order_id_max_len,
  max(length(CUSTOMER_ID)) as customer_id_max_len,
  max(length(CURRENCY)) as currency_max_len
from STG_SALES_CSV;
```
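If profiling shows clean values, a curated table can then be proposed; the types below are purely illustrative and must be confirmed against the profiled data before approval:

```sql
-- Proposal only: review and approve before creating the curated table and reloading.
create table SALES (
  ORDER_ID    number,
  CUSTOMER_ID number,
  ORDER_DATE  date,
  AMOUNT      number(12,2),
  CURRENCY    varchar2(3 char)
);
```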
