Commit 15e77ad

(fix): cln and other CLD notebook related updates
1 parent c5799a3 commit 15e77ad

1 file changed

Lines changed: 37 additions & 1 deletion

site/sfguides/src/lakehouse-iceberg-production-pipelines/lakehouse-iceberg-production-pipelines.md

@@ -314,7 +314,12 @@ This chapter creates the Glue Iceberg REST catalog integration, tightens IAM tru
314314

315315
### Easy Path — Interactive Notebook
316316

317-
Open [cld_lab_guide.ipynb](https://github.com/Snowflake-Labs/sfguide-lakehouse-iceberg-production-pipelines/blob/main/notebooks/cld_lab_guide.ipynb) in Snowflake Notebooks for an interactive walkthrough. The notebook covers the same steps with inline IAM policy output and live SQL execution.
317+
Open [cld_lab_guide.ipynb](https://github.com/Snowflake-Labs/sfguide-lakehouse-iceberg-production-pipelines/blob/main/notebooks/cld_lab_guide.ipynb) in Snowflake Notebooks for an interactive walkthrough. The notebook uses:
318+
319+
- **`ENABLED = FALSE`** on the catalog integration to break the IAM chicken-and-egg dependency: the trust policy values are generated without connecting to Glue
320+
- A companion **CloudFormation template** ([cfn-snowflake-cld.yaml](https://github.com/Snowflake-Labs/sfguide-lakehouse-iceberg-production-pipelines/blob/main/notebooks/cfn-snowflake-cld.yaml)) that deploys all IAM roles, Lake Formation registration, and permissions in a single stack
321+
- **Cortex Code prompts** embedded in each step — use Cmd+K or the chat sidebar to generate SQL and CLI commands from natural language
322+
- **`SYSTEM$VERIFY_CATALOG_INTEGRATION`** to validate connectivity before creating the CLD
318323

319324
Follow the **Detailed Path** below for step-by-step shell commands.
320325

@@ -334,6 +339,8 @@ This writes the IAM role ARN to **.aws-config/snowflake-glue-catalog-iam-role-ar
334339

335340
After this step, return to the Bronze chapter and run **task bronze:lakeformation-setup** to grant the SIGV4 role access via Lake Formation before proceeding.
336341

342+
> **Notebook path (CloudFormation):** The interactive notebook uses a single CloudFormation template instead of individual **task** commands for IAM and Lake Formation setup. It creates the catalog integration as **ENABLED = FALSE** first, extracts trust values from **DESC INTEGRATION**, and passes them as stack parameters — eliminating the manual trust-render-apply cycle. See the notebook for details.
343+
337344
#### Generate SQL
338345

339346
Generate runnable SQL from your **.aws-config/** artifacts:
@@ -370,6 +377,8 @@ The generated SQL creates **glue_rest_catalog_int** (default name) with these se
370377
- **CATALOG_NAMESPACE** = **GLUE_DATABASE**
371378
- **SIGV4_IAM_ROLE** = ARN from **.aws-config/snowflake-glue-catalog-iam-role-arn.txt**
372379

380+
> **Tip: Breaking the chicken-and-egg dependency.** The notebook creates the catalog integration with **ENABLED = FALSE**. Snowflake then generates **API_AWS_IAM_USER_ARN** and **API_AWS_EXTERNAL_ID** immediately, without requiring the IAM role to exist yet. After the CloudFormation stack is deployed with the real trust values, the integration is enabled via **ALTER CATALOG INTEGRATION … SET ENABLED = TRUE**. This avoids a two-pass trust policy update.
381+
373382
#### Describe and Capture Trust Fields
374383

375384
Print catalog integration properties including the Snowflake-generated trust fields:
@@ -402,6 +411,16 @@ task snowflake:apply-glue-catalog-trust-from-rendered
402411

403412
This updates the IAM role's trust policy to scope access to Snowflake's specific IAM user ARN and external ID. Alternatively, paste the rendered JSON from **.aws-config/snowflake-glue-catalog-trust-policy.rendered.json** directly in the IAM console under **Trust relationships**.
404413

414+
#### Verify Catalog Integration
415+
416+
After applying the trust policy, verify the integration can connect to Glue:
417+
418+
```sql
419+
SELECT SYSTEM$VERIFY_CATALOG_INTEGRATION('glue_rest_catalog_int');
420+
```
421+
422+
Expect `"success": true` in the JSON response. If verification fails, check that the trust policy values match the **DESC INTEGRATION** output exactly and that the IAM change has finished propagating (this can take up to 30 seconds).
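
If you run the verification from a script rather than a worksheet, the success check can be automated. The following is a minimal sketch, not part of the guide's task set: `verify_succeeded` is a hypothetical helper name, and it assumes only that the response text contains a `"success"` field, as described above. Pipe the query output from whichever SQL client you use into it.

```bash
# Hypothetical helper (not part of the guide's tasks).
# Reads JSON text on stdin and returns 0 only if it contains "success": true,
# tolerating optional whitespace around the colon.
verify_succeeded() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Example with a literal response; replace the echo with your SQL client's output.
if echo '{"success": true}' | verify_succeeded; then
  echo "catalog integration verified"
else
  echo "verification failed; re-check the trust policy values" >&2
fi
```

A plain `grep` keeps the sketch dependency-free; a JSON-aware tool would be more robust if one is available in your environment.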
423+
405424
#### Create Catalog-Linked Database
406425

407426
Apply the generated CLD and verify script:
@@ -1011,6 +1030,15 @@ aws s3 rm "s3://$BRONZE_BUCKET_NAME/iceberg/" --recursive
10111030

10121031
Lake Formation registrations and IAM roles created for LF are not removed by **bronze:cleanup** — delete those in the AWS console or via CLI as needed.
10131032

1033+
> **Notebook users (CloudFormation):** If you used the notebook's CloudFormation template, deleting the stack removes all IAM roles, policies, Lake Formation registration, and permissions. The second command below waits for the deletion to finish:
1034+
>
1035+
> ```bash
1036+
> aws cloudformation delete-stack --stack-name snowflake-cld-iam --region $AWS_REGION
1037+
> aws cloudformation wait stack-delete-complete --stack-name snowflake-cld-iam --region $AWS_REGION
1038+
> ```
1039+
>
1040+
> If deletion fails due to active Lake Formation dependencies, check the CloudFormation **Events** tab for the failed resource.
1041+
10141042
### Optional: Delete SIGV4 Lab Role
10151043
10161044
If **task snowflake:create-glue-catalog-read-role** created the IAM role, remove it after Snowflake teardown:
@@ -1056,6 +1084,14 @@ SHOW ICEBERG TABLES IN SCHEMA balloon_game_events."ksampath_balloon_pops";
10561084

10571085
IAM trust policy changes can take up to 30 seconds to propagate. Wait briefly and re-run **DESC CATALOG INTEGRATION** — the status should update. If it stays DISABLED, confirm **GLUE_AWS_IAM_USER_ARN** and **GLUE_AWS_EXTERNAL_ID** in the rendered trust JSON match the current **DESC** output exactly.
10581086

1087+
Use **SYSTEM$VERIFY_CATALOG_INTEGRATION** to test connectivity explicitly:
1088+
1089+
```sql
1090+
SELECT SYSTEM$VERIFY_CATALOG_INTEGRATION('glue_rest_catalog_int');
1091+
```
1092+
1093+
The JSON response includes error details when the trust policy or IAM permissions are misconfigured.
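
Because IAM changes can take up to 30 seconds to propagate, a scripted check may need to retry for a short time before reporting failure. The sketch below shows one way to do that under stated assumptions: `retry_for` is a hypothetical helper, not one of the guide's tasks, and the command it retries is whatever check you already run (for example, a script that runs the verification query and exits non-zero on failure).

```bash
# Hypothetical helper (not part of the guide's tasks).
# Usage: retry_for LIMIT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds. Gives up and returns 1 once at least
# LIMIT_SECONDS of sleep time has elapsed without a successful run.
retry_for() {
  local limit="$1"
  local interval="$2"
  shift 2
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$limit" ]; then
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example: allow roughly 30 seconds for propagation, checking every 5 seconds.
# "true" stands in for your own verification command.
retry_for 30 5 true && echo "check passed"
```

The limit counts only sleep time, so a slow check command can make the total wall-clock time longer than the limit.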
1094+
10591095
### Empty Windowed DTs
10601096

10611097
**dt_realtime_scores**, **dt_balloon_colored_pops**, and **dt_color_performance_trends** use 15-second **TIME_SLICE** windows. They are empty when all bronze events fall in a single bucket or when DTs have not yet completed an initial refresh.
