Experimental support for IPv6 #7671

Merged: achamayou merged 49 commits into main from ipv6_experiment on Jun 22, 2026

@achamayou achamayou commented Feb 13, 2026

Member

Tracking issue: #3215

Overview

Experimental support for IPv6 in CCF's host networking. Node RPC and node-to-node interface hosts may be specified as IPv6 literals in bracketed form (e.g. [::1]:8000), and addresses are consistently parsed, bound, connected (with fallback across mixed IPv4/IPv6 resolved addresses), serialised, and embedded in redirect URLs for both IPv4 and IPv6. IPv4 and DNS behaviour is unchanged.

Changes

Host TCP (src/host/tcp.h)

  • Client socket creation is deferred to connect_resolved(), where the resolved address family (AF_INET/AF_INET6) is known, instead of being hard-coded to AF_INET.
  • TCP_USER_TIMEOUT is applied on every client socket, including the client_bind() path, via set_connection_timeout() / set_connection_timeout_on_uv_handle().
  • The client recreates its socket when the resolved address family changes between connection attempts, enabling fallback across mixed IPv4/IPv6 resolved address lists (a uv_tcp_t cannot switch family in place).
  • Pre-libuv sockets are closed on error paths to avoid leaks.
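
The fallback behaviour described in this list can be sketched in Python, the language of the test infrastructure. This is only an illustration of the idea, not the libuv-based C++ code in src/host/tcp.h: the function name connect_with_fallback, its parameters, and the shape of the candidates list are assumptions made for the example.

```python
import socket


def connect_with_fallback(candidates, connect_timeout_s=2.0, user_timeout_ms=5000):
    """Try each (family, sockaddr) candidate in order; return the first connected socket.

    Hypothetical helper: the socket is created only once the candidate's
    address family is known, instead of being hard-coded to AF_INET.
    """
    last_error = None
    for family, sockaddr in candidates:
        sock = None
        try:
            # A socket is tied to one address family, so each candidate
            # gets a fresh socket of the matching family.
            sock = socket.socket(family, socket.SOCK_STREAM)
            # TCP_USER_TIMEOUT is Linux-specific, so guard it for portability.
            if hasattr(socket, "TCP_USER_TIMEOUT"):
                sock.setsockopt(
                    socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, user_timeout_ms
                )
            sock.settimeout(connect_timeout_s)
            sock.connect(sockaddr)
            sock.settimeout(None)
            return sock
        except OSError as e:
            last_error = e
            # Close the socket on the error path so it does not leak.
            if sock is not None:
                sock.close()
    raise last_error or OSError("no candidate addresses")
```

A candidate list of this shape can be built from a resolver result, for example `[(fam, addr) for fam, _, _, _, addr in socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)]`.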

Address handling

  • make_net_address / split_net_address (include/ccf/service/node_info_network.h) and cli::validate_address (src/ds/cli_helper.h) handle bracketed IPv6 literals ([host]:port): IPv6 hosts are wrapped in brackets when joined and stripped when split.
  • split_net_address keeps a port-less host in the host position (not the port slot), and cli::validate_address validates the port range and rejects malformed brackets. The two are intentionally kept separate (documented inline): split_net_address is on the consensus deserialisation path and must stay lenient, while validate_address is strict input validation at the CLI boundary.
  • Consensus Configuration::NodeInfo serialisation (src/kv/kv_types.h) and resolved RPC address output (src/host/run.cpp) use these helpers for consistent, URL-safe addresses.
  • Node certificate SANs detect IPv6 literals (is_ip via inet_pton) and store them unbracketed (src/node/node_state.h).
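
As a rough illustration of the split between lenient parsing and strict validation, the helpers can be approximated in Python. This is a sketch under assumptions, not the C++ in node_info_network.h or cli_helper.h: the treatment of unbracketed IPv6 input, the optional port, and the 0 to 65535 port range are all choices made for the example.

```python
def make_net_address(host: str, port: str) -> str:
    """Join host and port, bracketing IPv6 literals (idempotent for bracketed input)."""
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"
    return f"{host}:{port}"


def split_net_address(addr: str) -> tuple[str, str]:
    """Lenient split: never raises, returns (host, port) with brackets stripped."""
    if addr.startswith("[") and "]" in addr:
        host, _, rest = addr[1:].partition("]")
        return host, rest[1:] if rest.startswith(":") else ""
    if addr.count(":") != 1:
        # No port separator, or an unbracketed IPv6 literal: the whole string
        # stays in the host position (an assumption of this sketch).
        return addr, ""
    host, _, port = addr.partition(":")
    return host, port


def validate_address(addr: str) -> None:
    """Strict check for CLI-style input: raises ValueError on malformed input."""
    if addr.count("[") != addr.count("]") or ("[" in addr and not addr.startswith("[")):
        raise ValueError(f"malformed brackets in {addr!r}")
    host, port = split_net_address(addr)
    if not host:
        raise ValueError(f"empty host in {addr!r}")
    # Assumed range; the port is treated as optional here.
    if port and (not port.isdigit() or not 0 <= int(port) <= 65535):
        raise ValueError(f"invalid port in {addr!r}")
```

In this sketch, `split_net_address("[::1]:8000")` yields `("::1", "8000")` and a port-less `"example.com"` yields `("example.com", "")`, while `validate_address("[::1:8000")` raises.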

Test infrastructure

  • A network-wide ipv6 option (tests/infra/network.py) is threaded to each node at construction (tests/infra/node.py); when set, localhost is expanded to the IPv6 loopback ::1 via expand_localhost(ipv6=...) (tests/infra/net.py), so nodes added later are covered too. The previous CCF_IPV6 / IPv4-mapped ::ffff: mechanism has been removed.
  • The ipv6 option is also inherited by networks constructed from an existing_network (tests/infra/network.py), so it propagates across recovery, where recovered networks are built from the previous one.
  • infra.net.ipv6_loopback_available() reports whether ::1 can be bound, so IPv6 tests skip cleanly where it is unavailable.
  • infra.interfaces.make_address is bracket-aware and idempotent, and is used for published addresses and for redirect Location assertions (tests/e2e_operations.py, tests/redirects.py), the recovery-share script (tests/infra/member.py), and node addresses.
  • The raw client socket uses socket.create_connection (address-family agnostic) instead of a hard-coded AF_INET (tests/infra/clients.py).
  • tests/infra/node.py skips the IPv4-only "client interface" used for partition simulation when the node RPC host is IPv6.
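
A loopback probe and localhost expansion along these lines might look as follows. This is a hedged sketch: ipv6_loopback_available matches the name given above, but its body is a guess at one reasonable implementation, and expand_host (with its host parameter and 127.0.0.1 fallback) is a hypothetical stand-in for expand_localhost, whose real signature is not shown here.

```python
import socket


def ipv6_loopback_available() -> bool:
    """Report whether a TCP socket can be bound to the IPv6 loopback ::1."""
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as s:
            # Port 0 asks the OS for any free port; only bindability matters.
            s.bind(("::1", 0))
        return True
    except OSError:
        return False


def expand_host(host: str, ipv6: bool = False) -> str:
    """Hypothetical helper: map 'localhost' to a loopback literal, leave other hosts alone."""
    if host != "localhost":
        return host
    return "::1" if ipv6 else "127.0.0.1"
```

A test can then return early when `ipv6_loopback_available()` is false. For client connections, `socket.create_connection((host, port))` resolves the host and picks the address family itself, which is why it works for both IPv4 and IPv6 targets.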

Tests

  • common_ipv6 (tests/e2e_common_endpoints.py, registered in tests/e2e_logging.py) starts a network on the IPv6 loopback ::1 and exercises common endpoints and redirects.
  • reconfiguration_ipv6 (tests/reconfiguration.py, registered in tests/nodes.py) runs the full reconfiguration scenario - join, retire, replace, snapshot - with every node bound to ::1, and asserts that no IPv4 literal appears in any node config (assert_no_ipv4_in_node_configs).
  • recovery_ipv6 (tests/recovery.py) runs the full recovery scenario - repeated service recoveries with and without snapshots, primary changes, and historical receipt checks - with every node bound to ::1, and reuses assert_no_ipv4_in_node_configs to confirm no IPv4 literal appears in any node config.
  • All skip automatically when IPv6 loopback is unavailable.
  • Unit tests for the address helpers (src/node/test/node_info_json.cpp).
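
A check in the spirit of assert_no_ipv4_in_node_configs could be written roughly as below. How the real helper is implemented is not shown in this description, so the regular expression, the function names, and the dict-of-config-text input are all assumptions for illustration.

```python
import re

# Dotted-quad pattern, not preceded or followed by word characters or dots.
# It also matches dotted version strings such as "1.2.3.4", a limitation of
# this simple sketch.
IPV4_LITERAL = re.compile(r"(?<![\w.])(?:\d{1,3}\.){3}\d{1,3}(?![\w.])")


def find_ipv4_literals(text: str) -> list[str]:
    """Return every IPv4-looking literal found in the given text."""
    return IPV4_LITERAL.findall(text)


def assert_no_ipv4(configs: dict[str, str]) -> None:
    """Hypothetical check: fail if any node's config text contains an IPv4 literal."""
    for node_id, text in configs.items():
        found = find_ipv4_literals(text)
        assert not found, f"node {node_id} config contains IPv4 literals: {found}"
```

An IPv4-mapped form such as `::ffff:127.0.0.1` is also flagged by this pattern, which is consistent with that mechanism having been removed.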

Documentation

  • CHANGELOG.md records the experimental IPv6 support under the current version.

Out of scope (for now)

  • A dedicated dual-stack run (mixed IPv4 and IPv6 nodes in a single network).
  • UDP / QUIC IPv6 parity. The host UDP address plumbing represents peer addresses as a fixed-size sockaddr (udp::sockaddr_encode/sockaddr_decode, PendingIO, the host UDP read/write path), which truncates IPv6 sockaddr_in6 addresses, so UDP/QUIC replies do not reach IPv6 peers reliably. Only the TCP path has been made address-family-aware; UDP/QUIC over IPv6 is left for follow-up.
  • Partition simulation on IPv6 (the IPv4-based client-interface harness is skipped for IPv6 hosts).
  • Confidential-hardware coverage of the IPv6 tests: SNP CI runs on ACI, which cannot bind the ::1 loopback, so the IPv6 e2e tests only execute on the virtual/VMSS runners and are skipped on SNP.
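
The UDP truncation problem noted in this list can be made concrete with a small sketch. It assumes a Linux-style layout in which sockaddr_in occupies 16 bytes and sockaddr_in6 occupies 28, and it assumes the fixed-size buffer is 16 bytes; the actual buffer size used by the host UDP code is not stated in this description.

```python
import socket
import struct

# Linux-style layouts, without compiler padding:
# sockaddr_in:  family, port, 4-byte address, 8 bytes of padding -> 16 bytes
# sockaddr_in6: family, port, flowinfo, 16-byte address, scope id -> 28 bytes
SOCKADDR_IN = struct.Struct("=HH4s8x")
SOCKADDR_IN6 = struct.Struct("=HHI16sI")


def encode_sockaddr_in6(host: str, port: int) -> bytes:
    """Pack an IPv6 address and port the way sockaddr_in6 lays them out."""
    return SOCKADDR_IN6.pack(
        socket.AF_INET6,
        socket.htons(port),
        0,  # flowinfo
        socket.inet_pton(socket.AF_INET6, host),
        0,  # scope id
    )


def surviving_address_bytes(encoded: bytes, buffer_size: int = SOCKADDR_IN.size) -> bytes:
    """Return the address bytes left after copying into a too-small buffer."""
    truncated = encoded[:buffer_size]
    # The IPv6 address starts at offset 8 (family + port + flowinfo).
    return truncated[8:]
```

Under these assumptions only the first 8 of the 16 address bytes survive, so `2001:db8::a` loses its interface identifier and can no longer be used to reach the peer.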

@achamayou

Member Author

@copilot build, run the tests, summarize failures and take them into account to apply a tentative fix.

Copilot AI commented Mar 11, 2026

Contributor

@achamayou I've opened a new pull request, #7732, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
@achamayou achamayou marked this pull request as ready for review March 11, 2026 16:19
@achamayou achamayou requested a review from a team as a code owner March 11, 2026 16:19
Copilot AI review requested due to automatic review settings March 11, 2026 16:19

Copilot AI left a comment

Contributor

Pull request overview

This PR adds initial end-to-end coverage for IPv6 loopback usage in the test suite, and updates host/node networking code paths to better handle IPv6 addresses (notably in TCP connection setup and certificate SAN generation).

Changes:

  • Add an e2e test that starts a network with RPC interfaces bound to ::1 and exercises common endpoints.
  • Adjust test infra to avoid binding the IPv4-only “client interface” used for partition simulation when the node RPC host is IPv6.
  • Improve IPv6 handling in node certificate SAN construction and in host TCP client socket creation.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/infra/node.py | Skips node_client_host binding when RPC host is IPv6 to avoid incompatible partition-simulation client interface. |
| tests/e2e_logging.py | Registers a new common_ipv6 sub-test using the logging sample app. |
| tests/e2e_common_endpoints.py | Adds run_ipv6() which forces nodes to bind to ::1 (skipping if IPv6 loopback is unavailable). |
| src/node/node_state.h | Strips brackets from IPv6 hosts when constructing default SANs; updates IPv6-related comment in is_ip(). |
| src/host/tcp.h | Defers client socket creation to use the resolved address family (IPv4 vs IPv6). |
Comments suppressed due to low confidence (1)

src/node/node_state.h:2347

  • The is_ip() heuristic does not correctly detect general IPv6 literals: it returns true only when the final ':'-separated component is purely numeric, so valid IPv6 addresses ending in hex letters (e.g. "2001:db8::a") are treated as DNS names. This can produce incorrect SAN types in generated node certificates and break TLS validation when connecting by IP. Consider replacing the heuristic with a real IPv4/IPv6 parse (e.g. inet_pton / uv_inet_pton) after stripping brackets (and any zone ID, if present).
    bool is_ip(const std::string_view& hostname)
    {
      // IP address components are purely numeric. DNS names may be largely
      // numeric, but at least the final component (TLD) must not be
      // all-numeric. So this distinguishes "1.2.3.4" (an IP address) from
      // "1.2.3.c4m" (a DNS name). "1.2.3." is invalid for either, and will
      // throw. Handles IPv6 by splitting on ':' after splitting on '.'.
      const auto final_component =
        ccf::nonstd::split(ccf::nonstd::split(hostname, ".").back(), ":")
          .back();
      if (final_component.empty())
      {
        throw std::runtime_error(fmt::format(
          "{} has a trailing period, is not a valid hostname", hostname));
      }

      return std::ranges::all_of(
        final_component, [](char c) { return c >= '0' && c <= '9'; });
    }
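
The parse-based check suggested in this comment can be illustrated in Python, whose socket.inet_pton wraps the same system call. This is a sketch of the approach only: the bracket and zone-ID stripping shown here are assumptions, and the C++ version that was eventually merged may differ.

```python
import socket


def is_ip(hostname: str) -> bool:
    """Return True if hostname parses as an IPv4 or IPv6 literal."""
    host = hostname
    # Strip URL-style brackets around an IPv6 literal, e.g. "[::1]".
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    # Drop a zone ID such as "%eth0", which inet_pton does not accept.
    host = host.split("%", 1)[0]
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, host)
            return True
        except (OSError, ValueError):
            # Not a valid literal for this family; try the next one.
            continue
    return False
```

Unlike the heuristic quoted above, this treats "2001:db8::a" as an IP address and "1.2.3.c4m" as a DNS name, and it returns False for a trailing-period input instead of throwing.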

Comment thread src/host/tcp.h Outdated
Comment thread src/host/tcp.h
Copilot AI and others added 2 commits April 16, 2026 21:42

Copilot AI commented Apr 16, 2026

Contributor

@copilot build, run the tests, summarize failures and take them into account to apply a tentative fix.

Done in 1ae1a18.

I ran local checks and reviewed CI failures:

  • Local build currently fails in this environment with fatal error: 'uv.h' file not found in src/tasks/task_system.cpp (pre-existing environment/dependency issue).
  • CI failure logs (run 22936601579) showed common_ipv6 failing to start node 0, causing e2e_logging and e2e_logging_http2 failures.

Tentative fix applied in src/host/tcp.h:

  • avoid repeated uv_tcp_open on already-open handles during reconnect attempts,
  • apply TCP_USER_TIMEOUT on client-bind sockets too,
  • close pre-libuv sockets on error paths to avoid leaks.

Comment thread src/host/tcp.h Outdated
Comment thread src/host/tcp.h Outdated
@achamayou

Member Author

@copilot add to the description of the PR mermaid flowcharts that show the connection flow before and after your changes.

Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/0cf0011c-18a2-4003-951c-55a903a0b6a5

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
@achamayou

Member Author

@copilot add to the description of the PR mermaid flowcharts that show the connection flow before and after your changes.

Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/2caf1046-8ae3-4198-98b2-d97b68196b3f

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot AI commented Apr 17, 2026

Contributor

@copilot add to the description of the PR mermaid flowcharts that show the connection flow before and after your changes.

Updated the PR description with Mermaid flowcharts showing the connection flow before and after the TCP IPv6 retry/timeout changes (update pushed with 262f563).

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 1 comment.

Comment thread src/ds/cli_helper.h
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated no new comments.

@achamayou achamayou removed the run-long-test Run Long Test job label Jun 22, 2026
@achamayou

Member Author

@copilot formatting

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot AI commented Jun 22, 2026

Contributor

@copilot formatting

Done: src/ds/cli_helper.h had a long condition line that clang-format wanted split; fixed in the latest commit.

Comment thread include/ccf/service/node_info_network.h
@achamayou achamayou requested a review from cjen1-msft June 22, 2026 12:17
@achamayou achamayou merged commit ed415a0 into main Jun 22, 2026
19 checks passed
@achamayou achamayou deleted the ipv6_experiment branch June 22, 2026 13:11
6 participants