[SS-279] turn on Nightly GCP tests (Iceberg)#37285

Merged
ublubu merged 3 commits into MaterializeInc:main from ublubu:gcp-nightly
Jun 25, 2026

@ublubu ublubu commented Jun 24, 2026

I think the struggling QA Canary tests might've been slowing down the GCP Iceberg environment enough to cause this Nightly test to flake.

Now that those QA Canary tests are disabled, I wonder if the Nightly GCP Iceberg tests will stop flaking.

@ublubu ublubu requested a review from a team as a code owner June 24, 2026 21:13

ublubu commented Jun 24, 2026

gcp-iceberg-e2e.td:14:1: error: executing query failed: db error:
ERROR: internal error: connection validation panicked

Well, that's not a flake 🤔

@ublubu ublubu requested a review from a team as a code owner June 24, 2026 21:42

ublubu commented Jun 24, 2026

What in the world is this?

gcp-materialized-1  | environmentd: 2026-06-24T21:55:40.807375Z ERROR mz_adapter::coord::sequencer::inner: connection validation panicked: 
gcp-materialized-1  | Could not automatically determine the process-level CryptoProvider from Rustls crate features.
gcp-materialized-1  | Call CryptoProvider::install_default() before this point to select a provider manually, or make sure exactly one of the 'aws-lc-rs' and 'ring' features is enabled.
gcp-materialized-1  | See the documentation of the CryptoProvider type for more information.
environmentd: 2026-06-24T22:02:02.747867Z ERROR mz_adapter::coord::sequencer::inner: connection validation panicked: 
Could not automatically determine the process-level CryptoProvider from Rustls crate features.
Call CryptoProvider::install_default() before this point to select a provider manually, or make sure exactly one of the 'aws-lc-rs' and 'ring' features is enabled.
See the documentation of the CryptoProvider type for more information.
             (at /Users/kynan/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/rustls-0.23.38/src/crypto/mod.rs:249:14) backtrace="   0: <std::backtrace::Backtrace>::create\n   1: mz_ore::panic::install_enhanced_handler::{closure#0}\n   2: std::panicking::panic_with_hook\n   3: std::panicking::panic_handler::{closure#0}\n   4: std::sys::backtrace::__rust_end_short_backtrace::<std::panicking::panic_handler::{closure#0}, !>\n   5: __rustc::rust_begin_unwind\n   6: core::panicking::panic_fmt\n   7: core::option::expect_failed\n   8: <core::option::Option<rustls::crypto::CryptoProvider>>::expect\n   9: <rustls::crypto::CryptoProvider>::get_default_or_install_from_crate_features\n  10: <rustls::client::client_conn::ClientConfig>::builder_with_protocol_versions\n  11: <rustls::client::client_conn::ClientConfig>::builder\n  12: <hyper_rustls::connector::builder::ConnectorBuilder<hyper_rustls::connector::builder::WantsTlsConfig>>::with_native_roots\n  13: <gcp_auth::types::HttpClient>::new\n  14: <gcp_auth::custom_service_account::CustomServiceAccount>::from_json\n  15: <mz_storage_types::connections::gcp::GcpConnection>::read_credentials::{closure#0}\n  16: <mz_storage_types::connections::IcebergCatalogConnection>::connect_rest::{closure#0}\n  17: <mz_storage_types::connections::IcebergCatalogConnection>::connect::{closure#0}\n  18: <mz_storage_types::connections::IcebergCatalogConnection>::validate::{closure#0}\n  19: <mz_storage_types::connections::Connection>::validate::{closure#0}\n  20: <core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}> as core::future::future::Future>::poll\n  21: <futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>> as core::future::future::Future>::poll::{closure#0}\n  22: 
<core::panic::unwind_safe::AssertUnwindSafe<<futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>> as core::future::future::Future>::poll::{closure#0}> as core::ops::function::FnOnce<()>>::call_once\n  23: std::panicking::catch_unwind::do_call::<core::panic::unwind_safe::AssertUnwindSafe<<futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>> as core::future::future::Future>::poll::{closure#0}>, core::task::poll::Poll<core::result::Result<(), mz_storage_types::connections::ConnectionValidationError>>>\n  24: __rust_try\n  25: std::panic::catch_unwind::<core::panic::unwind_safe::AssertUnwindSafe<<futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>> as core::future::future::Future>::poll::{closure#0}>, core::task::poll::Poll<core::result::Result<(), mz_storage_types::connections::ConnectionValidationError>>>\n  26: <futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>> as core::future::future::Future>::poll\n  27: <tokio::task::task_local::TaskLocalFuture<bool, futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>>> as core::future::future::Future>::poll::{closure#0}\n  28: <tokio::task::task_local::LocalKey<bool>>::scope_inner::<<tokio::task::task_local::TaskLocalFuture<bool, futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>>> as core::future::future::Future>::poll::{closure#0}, 
core::option::Option<core::task::poll::Poll<core::result::Result<core::result::Result<(), mz_storage_types::connections::ConnectionValidationError>, alloc::boxed::Box<dyn core::any::Any + core::marker::Send>>>>>\n  29: <tokio::task::task_local::TaskLocalFuture<bool, futures_util::future::future::catch_unwind::CatchUnwind<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>>> as core::future::future::Future>::poll\n  30: <mz_ore::future::OreCatchUnwindWithDetails<core::panic::unwind_safe::AssertUnwindSafe<<mz_storage_types::connections::Connection>::validate::{closure#0}>> as core::future::future::Future>::poll\n  31: <mz_adapter::coord::Coordinator>::sequence_create_connection::{closure#0}::{closure#0}::{closure#1}\n  32: <core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>> as core::future::future::Future>::poll\n  33: <tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>> as core::future::future::Future>::poll\n  34: <tokio::runtime::task::core::Core<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll::{closure#0}\n  35: <tokio::runtime::task::core::Core<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll\n  36: tokio::runtime::task::harness::poll_future::<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}\n  37: 
<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}> as core::ops::function::FnOnce<()>>::call_once\n  38: std::panicking::catch_unwind::do_call::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}>, core::task::poll::Poll<()>>\n  39: __rust_try\n  40: std::panic::catch_unwind::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}>, core::task::poll::Poll<()>>\n  41: tokio::runtime::task::harness::poll_future::<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>\n  42: <tokio::runtime::task::harness::Harness<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll_inner\n  43: <tokio::runtime::task::harness::Harness<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll\n  44: 
tokio::runtime::task::raw::poll::<tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = ()> + core::marker::Send>>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>\n  45: <tokio::runtime::task::raw::RawTask>::poll\n  46: <tokio::runtime::task::LocalNotified<alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::run\n  47: <tokio::runtime::scheduler::multi_thread::worker::Context>::run_task::{closure#0}\n  48: <tokio::runtime::scheduler::multi_thread::worker::Context>::run_task\n  49: <tokio::runtime::scheduler::multi_thread::worker::Context>::run\n  50: tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}::{closure#0}\n  51: <tokio::runtime::context::scoped::Scoped<tokio::runtime::scheduler::Context>>::set::<tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}::{closure#0}, ()>\n  52: tokio::runtime::context::set_scheduler::<(), tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}::{closure#0}>::{closure#0}\n  53: <std::thread::local::LocalKey<tokio::runtime::context::Context>>::try_with::<tokio::runtime::context::set_scheduler<(), tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}::{closure#0}>::{closure#0}, ()>\n  54: <std::thread::local::LocalKey<tokio::runtime::context::Context>>::with::<tokio::runtime::context::set_scheduler<(), tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}::{closure#0}>::{closure#0}, ()>\n  55: tokio::runtime::context::set_scheduler::<(), tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}::{closure#0}>\n  56: tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}\n  57: tokio::runtime::context::runtime::enter_runtime::<tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}, ()>\n  58: tokio::runtime::scheduler::multi_thread::worker::run\n  59: <tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}\n  60: 
<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}> as core::future::future::Future>::poll\n  61: <tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>> as core::future::future::Future>::poll\n  62: <tokio::runtime::task::core::Core<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll::{closure#0}\n  63: <tokio::runtime::task::core::Core<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll\n  64: tokio::runtime::task::harness::poll_future::<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}\n  65: <core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}> as core::ops::function::FnOnce<()>>::call_once\n  66: std::panicking::catch_unwind::do_call::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}>, core::task::poll::Poll<()>>\n  67: __rust_try\n  68: 
std::panic::catch_unwind::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}>, core::task::poll::Poll<()>>\n  69: tokio::runtime::task::harness::poll_future::<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>\n  70: <tokio::runtime::task::harness::Harness<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll_inner\n  71: <tokio::runtime::task::harness::Harness<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll\n  72: tokio::runtime::task::raw::poll::<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>\n  73: <tokio::runtime::task::raw::RawTask>::poll\n  74: <tokio::runtime::task::UnownedTask<tokio::runtime::blocking::schedule::BlockingSchedule>>::run\n  75: <tokio::runtime::blocking::pool::Task>::run\n  76: <tokio::runtime::blocking::pool::Inner>::run\n  77: <tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}\n  78: std::sys::backtrace::__rust_begin_short_backtrace::<<tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}, ()>\n  79: 
std::thread::lifecycle::spawn_unchecked::<<tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}, ()>::{closure#1}::{closure#0}\n  80: <core::panic::unwind_safe::AssertUnwindSafe<std::thread::lifecycle::spawn_unchecked<<tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}, ()>::{closure#1}::{closure#0}> as core::ops::function::FnOnce<()>>::call_once\n  81: std::panicking::catch_unwind::do_call::<core::panic::unwind_safe::AssertUnwindSafe<std::thread::lifecycle::spawn_unchecked<<tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}, ()>::{closure#1}::{closure#0}>, ()>\n  82: __rust_try\n  83: std::thread::lifecycle::spawn_unchecked::<<tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}, ()>::{closure#1}\n  84: <std::thread::lifecycle::spawn_unchecked<<tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}, ()>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}\n  85: <std::sys::thread::unix::Thread>::new::thread_start\n  86: <unknown>\n  87: <unknown>\n"

ublubu commented Jun 24, 2026

Claude:

gcp_auth builds its HTTPS client (HttpClient::new → hyper-rustls → ClientConfig::builder()) and rustls 0.23 panics because it can't auto-pick a crypto provider — both ring and aws-lc-rs are now compiled in, so the choice is ambiguous and nobody called install_default(). That's why it regressed without our code changing: something landed recently (likely the AWS Glue / aws-lc-rs work) that pulled the second provider into the build, flipping rustls from "exactly one" to "ambiguous."

Want me to wire up rustls::crypto::aws_lc_rs::default_provider().install_default() in both binaries, or do you prefer ring?

Nice. 😎
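
To make the failure mode in the quoted explanation concrete, here is a minimal std-only sketch of the selection rule it describes. It is a model, not rustls's actual code: `Provider`, `CompiledFeatures`, `from_crate_features`, and `resolve_provider` are hypothetical names chosen for illustration, and the two booleans stand in for the `ring` and `aws-lc-rs` Cargo features.

```rust
/// Crypto backends that can be compiled in (stand-ins for the two rustls features).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    Ring,
    AwsLcRs,
}

/// Which backends ended up in the binary (hypothetical stand-in for Cargo features).
#[derive(Debug, Clone, Copy)]
struct CompiledFeatures {
    ring: bool,
    aws_lc_rs: bool,
}

/// Pick a backend from the compiled features alone.
/// The choice is only unambiguous when exactly one backend is present.
fn from_crate_features(features: CompiledFeatures) -> Option<Provider> {
    match (features.ring, features.aws_lc_rs) {
        (true, false) => Some(Provider::Ring),
        (false, true) => Some(Provider::AwsLcRs),
        // Zero or two backends: nothing to pick automatically.
        _ => None,
    }
}

/// Resolve the process-level provider: an explicitly installed provider wins,
/// otherwise fall back to the compiled features. The real library panics where
/// this model returns `Err`.
fn resolve_provider(
    installed: Option<Provider>,
    features: CompiledFeatures,
) -> Result<Provider, String> {
    installed
        .or_else(|| from_crate_features(features))
        .ok_or_else(|| {
            "could not automatically determine the process-level CryptoProvider".to_string()
        })
}
```

In this model, a build with only `ring` resolves without any explicit install, a build with both backends fails until a provider is installed, and installing one up front succeeds regardless of which features are compiled in. That matches the two remedies mentioned above: call `install_default()` early, or get back to exactly one backend.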

@ublubu ublubu force-pushed the gcp-nightly branch 2 times, most recently from feefa1c to 058e45a Compare June 25, 2026 19:01

ublubu commented Jun 25, 2026

Fix #1: Set gcp_auth's aws-lc-rs feature flag.

Marty's AWS Glue stuff brought in a new dependency, aws-lc-rs. This caused gcp_auth to panic because it couldn't choose between ring and aws-lc-rs. This broke Iceberg sinks to GCP.

Fix #2: Read Iceberg snapshot's manifest list. Get the table's row count that way.

The original test flake: The Iceberg snapshot summary only counted rows added during that commit. The first snapshot's summary would show the correct row count. But any subsequent snapshots would show 0 rows. (I checked with BigQuery to confirm the rows were actually still there.)
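
A rough sketch of the difference behind Fix #2, under stated assumptions. This is not the code from this PR: the structs are hypothetical, and only the field names `added_rows_count` and `existing_rows_count` are borrowed from the manifest-list entries in the Iceberg table spec. It also assumes the manifest list contains only data manifests and that there are no row-level deletes.

```rust
/// One entry in a snapshot's manifest list (hypothetical struct; the field
/// names mirror the Iceberg spec's manifest-list entry fields).
struct ManifestListEntry {
    added_rows_count: u64,
    existing_rows_count: u64,
}

/// A snapshot as the test might see it (hypothetical struct).
struct Snapshot {
    /// Rows added by this commit only, as reported in the snapshot summary.
    summary_added_records: u64,
    /// Every manifest that is still part of the table at this snapshot.
    manifest_list: Vec<ManifestListEntry>,
}

/// Count the rows in the table by summing over the manifest list instead of
/// trusting the per-commit summary counter.
/// Assumption: data manifests only, no delete files.
fn live_row_count(manifest_list: &[ManifestListEntry]) -> u64 {
    manifest_list
        .iter()
        .map(|m| m.added_rows_count + m.existing_rows_count)
        .sum()
}

/// Build the two-snapshot scenario described above: the first commit adds
/// three rows, the second commit adds none but still references the same manifest.
fn example_snapshots() -> (Snapshot, Snapshot) {
    let first = Snapshot {
        summary_added_records: 3,
        manifest_list: vec![ManifestListEntry { added_rows_count: 3, existing_rows_count: 0 }],
    };
    let second = Snapshot {
        summary_added_records: 0,
        manifest_list: vec![ManifestListEntry { added_rows_count: 3, existing_rows_count: 0 }],
    };
    (first, second)
}
```

With `example_snapshots()`, the second snapshot's summary counter is 0 while `live_row_count` still reports 3 for both snapshots, which is the behavior the re-enabled test needs to assert on.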

@ublubu ublubu merged commit 5066cda into MaterializeInc:main Jun 25, 2026
134 checks passed
@ublubu ublubu deleted the gcp-nightly branch June 25, 2026 19:35

def- commented Jun 25, 2026

> Marty's AWS Glue stuff brought in a new dependency, aws-lc-rs. This caused gcp_auth to panic because it couldn't choose between ring and aws-lc-rs. This broke Iceberg sinks to GCP.

Ouch, good thing we have the test and you reenabled it! Is this bad enough that we need to backport it?
