redpanda-data · nvartolomei · Apr 8, 2026 · Apr 9, 2026 · Apr 9, 2026 · kbatuigas
@@ -3,6 +3,10 @@
 :page-categories: Iceberg, Tiered Storage, Management, High Availability, Data Replication, Integration
 
 // tag::single-source[]
+:page-topic-type: how-to
+:personas: platform_admin, data_engineer
+:learning-objective-1: Configure a Unity Catalog integration for Redpanda Iceberg topics with AWS S3
+:learning-objective-2: Query Redpanda topic data as Iceberg tables in Databricks SQL
 
 ifndef::env-cloud[]
 [NOTE]
@@ -13,6 +17,11 @@ endif::[]
 
 This guide walks you through querying Redpanda topics as managed Iceberg tables in Databricks, with AWS S3 as object storage and a catalog integration using https://docs.databricks.com/aws/en/data-governance/unity-catalog[Unity Catalog^]. For general information about Iceberg catalog integrations in Redpanda, see xref:manage:iceberg/use-iceberg-catalogs.adoc[].
 
+After reading this page, you will be able to:
+
+* [ ] {learning-objective-1}
+* [ ] {learning-objective-2}
+
 == Prerequisites
 
 ifndef::env-cloud[]
@@ -78,11 +87,38 @@ endif::[]
 
 Follow the steps in the https://docs.databricks.com/aws/en/connect/unity-catalog/cloud-storage/external-locations[Databricks documentation] to *manually* create an external location. You can create the external location in the Catalog Explorer or with SQL. You must create the external location manually because the location needs to be associated with the existing Tiered Storage bucket URL, `s3://<bucket-name>`.
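+
+As a minimal sketch of the SQL approach, the following statement creates an external location for the Tiered Storage bucket. It assumes that a storage credential with access to the bucket already exists. `<external-location-name>` and `<storage-credential-name>` are placeholders introduced here for illustration.
+
+[,sql]
+----
+-- Assumption: a storage credential for the bucket already exists in Unity Catalog.
+-- <external-location-name> and <storage-credential-name> are illustrative placeholders.
+CREATE EXTERNAL LOCATION IF NOT EXISTS `<external-location-name>`
+  URL 's3://<bucket-name>'
+  WITH (STORAGE CREDENTIAL `<storage-credential-name>`);
+----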
 
-== Create a new catalog
+== Choose a catalog setup
+
+You can either create a new catalog dedicated to Redpanda topics or use an existing catalog. If you create a new catalog, Redpanda automatically creates the required schema for you. If you need to integrate with an existing catalog, you must manually create the schema in that catalog before Redpanda creates any Iceberg tables.
+
+After you set up your catalog, the authorization and Redpanda configuration steps are the same for both options.
+
+=== Option 1: Create a new catalog (recommended)
 
 Follow the steps in the Databricks documentation to https://docs.databricks.com/aws/en/catalogs/create-catalog[create a standard catalog^]. When you create the catalog, specify the external location you created in the previous step as the storage location.
 
-You use the catalog name when you set the Iceberg cluster configuration properties in Redpanda in a later step.
+In this setup, Redpanda creates the default `redpanda` schema for you. You use the catalog name when you set the Iceberg cluster configuration properties in Redpanda in a later step.
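+
+If you prefer SQL to the Catalog Explorer, creating the catalog might look like the following sketch. It assumes that the external location covers `s3://<bucket-name>` and that you have the privileges to create catalogs. Adjust the path if your external location uses a prefix.
+
+[,sql]
+----
+-- Assumption: s3://<bucket-name> is covered by the external location created earlier.
+CREATE CATALOG IF NOT EXISTS `<unity-catalog-name>`
+  MANAGED LOCATION 's3://<bucket-name>';
+----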
+
+=== Option 2: Use an existing catalog with a pre-created schema
+
+If you need to integrate Redpanda with an existing Unity Catalog catalog object, follow the steps to https://docs.databricks.com/aws/en/schemas/create-schema[create a schema^] in the catalog.
+
+* By default, Redpanda creates tables in a schema named `redpanda`. If you want to use a different schema, set config_ref:iceberg_default_catalog_namespace,true,properties/cluster-properties[`iceberg_default_catalog_namespace`] before enabling Iceberg, then manually create that schema in the catalog.
+* Set the schema's managed storage location to the same S3 bucket used for Redpanda Tiered Storage, using the external location you created in the previous step.
+
+Unity Catalog resolves managed storage locations through a hierarchy of metastore > catalog > schema. If you assign the schema its own managed storage location, Redpanda can use the existing catalog while the schema stores its managed Iceberg data in the schema-specific location.
+
+For example:
+
+* Your existing Unity Catalog catalog stores managed data in `s3://<catalog-bucket-name>`.
+ifdef::env-cloud[]
+* You manually create a `redpanda` schema in that catalog and override its managed storage location, through the external location, to the S3 bucket that Redpanda uses for your cluster's object storage (`s3://redpanda-cloud-storage-<cluster-id>` for BYOC, or your customer-managed bucket for BYOVPC).
+endif::[]
+ifndef::env-cloud[]
+* You manually create a `redpanda` schema in that catalog and override its managed storage location, through the external location, to `s3://<cluster-bucket-name>`, which matches the S3 bucket that Redpanda uses for Tiered Storage.
+endif::[]
+
+For more information, see the https://docs.databricks.com/aws/en/data-governance/unity-catalog/#managed-storage-location-hierarchy[Unity Catalog managed storage location hierarchy^] in the Databricks documentation.
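+
+The schema override described above can be sketched in Databricks SQL as follows. This example assumes the default `redpanda` namespace and uses `<cluster-bucket-name>` as a stand-in for the bucket that Redpanda uses for object storage. Substitute your own schema name if you set `iceberg_default_catalog_namespace`.
+
+[,sql]
+----
+-- Assumption: the external location created earlier covers this bucket path.
+-- Replace `redpanda` if you configured a custom namespace.
+CREATE SCHEMA IF NOT EXISTS `<unity-catalog-name>`.redpanda
+  MANAGED LOCATION 's3://<cluster-bucket-name>';
+----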
 
 == Authorize access to Unity Catalog
 
@@ -118,6 +154,8 @@ iceberg_rest_catalog_client_id: <service-principal-client-id>
 iceberg_rest_catalog_client_secret: <service-principal-client-secret>
 iceberg_rest_catalog_warehouse: <unity-catalog-name>
 iceberg_disable_snapshot_tagging: true
+# Optional. Set a custom namespace only if you want to use a schema other than the default `redpanda`
+# iceberg_default_catalog_namespace: ["<custom-namespace>"]
 ----
 endif::[]
 ifdef::env-cloud[]
@@ -142,6 +180,8 @@ rpk cluster config set \
   iceberg_rest_catalog_client_secret='${secrets.<service-principal-client-secret-name>}' \
   iceberg_rest_catalog_warehouse=<unity-catalog-name> \
   iceberg_disable_snapshot_tagging=true
+  # Optional. Set a custom namespace only if you want to use a schema other than the default `redpanda`
+  # iceberg_default_catalog_namespace='["<custom-namespace>"]'
 ----
 endif::[]
 +
@@ -210,7 +250,11 @@ The following example shows how to query the Iceberg table using SQL in Databric
 +
 [,sql]
 ----
--- Ensure that the catalog and table name are correctly parsed in case they contain special characters
+/* Quote the catalog and table names with backticks so that they are parsed
+correctly if they contain special characters.
+
+If you set iceberg_default_catalog_namespace to a custom namespace, replace
+`redpanda` with that namespace in the query below. */
 SELECT * FROM `<catalog-name>`.redpanda.`<table-name>` LIMIT 10;
 ----
 +