fix: data race on AddrInfo.Addrs in queryPeer#1244
Merged
Conversation
findProvidersAsyncRoutine hands the same []*peer.AddrInfo to two places: routing.QueryEvent.Responses (read by any consumer of RegisterForQueryEvents, e.g. kubo's findprovs HTTP handler marshaling events) and the return value, which queryPeer then mutates via next.Addrs = append(next.Addrs, ...) in query.go. The two goroutines race on AddrInfo.Addrs. Under -race the detector reports the conflict; without -race the torn slice header can crash inside go-multiaddr. Refs: ipfs/kubo#11287, ipfs/kubo#11116.
Before queryPeer runs, the queryFn closures in routing.go publish the returned []*peer.AddrInfo as routing.QueryEvent.Responses. Consumers (e.g. kubo's findprovs HTTP handler) then read those structs concurrently while queryPeer did next.Addrs = append(next.Addrs, curInfo.Addrs...) racing on AddrInfo.Addrs. Build the combined address slice locally and leave next untouched. Covers GetValue, FindProviders, and FindPeer closures since all three share queryPeer as the worker. Refs: ipfs/kubo#11287, ipfs/kubo#11116.
This was referenced Apr 19, 2026
Closed
guillaumemichel
approved these changes
Apr 20, 2026
Collaborator
guillaumemichel
left a comment
There was a problem hiding this comment.
Great finding @lidel! Thanks for the fix and informative regression test!
lidel
added a commit
to ipfs/kubo
that referenced
this pull request
Apr 20, 2026
bump go-libp2p-kad-dht to master (248f53b7) to pick up the queryPeer AddrInfo.Addrs race fix from libp2p/go-libp2p-kad-dht#1244, which resolves the random daemon crashes reported in #11287 and #11116.
This was referenced Apr 20, 2026
Merged
lidel
added a commit
to ipfs/kubo
that referenced
this pull request
Apr 20, 2026
bump go-libp2p-kad-dht to v0.39.1 to pick up the queryPeer AddrInfo.Addrs race fix from libp2p/go-libp2p-kad-dht#1244, which resolves the random daemon crashes reported in #11287 and #11116.
lidel
added a commit
to ipfs/kubo
that referenced
this pull request
Apr 20, 2026
bump go-libp2p-kad-dht to v0.39.1 to pick up the queryPeer AddrInfo.Addrs race fix from libp2p/go-libp2p-kad-dht#1244, which resolves the random daemon crashes reported in #11287 and #11116.
lidel
added a commit
to ipfs/kubo
that referenced
this pull request
Apr 20, 2026
bump go-libp2p-kad-dht to v0.39.1 to pick up the queryPeer AddrInfo.Addrs race fix from libp2p/go-libp2p-kad-dht#1244, which resolves the random daemon crashes reported in #11287 and #11116. (cherry picked from commit eb2cfac)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The race
findProvidersAsyncRoutine(and theGetValue/FindPeer) hand the same[]*peer.AddrInfoto two places:routing.QueryEvent.Responses, observed by anyRegisterForQueryEventssubscriber (e.g. handlers thatjson.Marshalthe event).queryPeerinquery.go, which didnext.Addrs = append(next.Addrs, ...).Two goroutines end up writing and reading the same
AddrInfo.Addrsslice header. Under-racethe detector flags the conflict; without-race, a torn slice header can crash insidego-multiaddrwhen the consumer walksAddrs.Refs: ipfs/kubo#11287, ipfs/kubo#11116
The fix
Note
This class of errors could be fixed by upstream libp2p/go-libp2p#3490, but I also believe we should do this PR (belt-and-suspenders) so we don't wait for go-libp2p release or are hit by a future refactor there.
In
queryPeer, build a fresh slice withslices.Concat(next.Addrs, curInfo.Addrs)instead of mutatingnext.Addrs. The combined addresses are passed toqueryPeerFilterandmaybeAddAddrs; the published*peer.AddrInfois no longer touched.slices.Concatis preferred overappend: with spare capacity,appendwould reusenext.Addrs's backing array, which the consumer still references.slices.Concatalways allocates a fresh array, so no memory is shared with the published event.Also documents the read-only contract on the
queryFntype so LLMs and humans avoid introducing regressions that mutate.Tests
TestFindProvidersAsyncQueryEventResponsesRacedrives realFindProvidersAsyncagainst an interceptedprotoMessenger.masterunder-race,-race.