Skip to content

fix: go unwinding stops at systemstacks#1313

Open
wehzzz wants to merge 12 commits into
open-telemetry:mainfrom
wehzzz:fix-go-unwinding-stops-at-systemstacks
Open

fix: go unwinding stops at systemstacks#1313
wehzzz wants to merge 12 commits into
open-telemetry:mainfrom
wehzzz:fix-go-unwinding-stops-at-systemstacks

Conversation

@wehzzz
Copy link
Copy Markdown
Contributor

@wehzzz wehzzz commented Apr 1, 2026

Unwinding Through Go Stack Switches

Reference: #1275, Based on @florianl's work #1279.

systemstack

Locating the user goroutine

During systemstack, m.curg always points to the frozen user goroutine (systemstack does not call dropg). The profiler reads curg.sched.sp which points into the user stack where the frame pointer prologue saved FP and LR/RA.

gobuf.pc contains systemstack_switch+8 (a synthetic UNDEF marker for Go's stack scanner) and is intentionally ignored.

AMD64 recovery

The Go linker injects PUSH RBP; MOV RSP, RBP into systemstack (obj6.go#L637). gosave_systemstack_switch then saves gobuf.sp = LEAQ 8(SP), which skips the useless return address from CALL gosave and points to the saved RBP:

caller_fp = *(sched_sp);       // saved RBP
caller_pc = *(sched_sp + 8);   // return address
caller_sp = sched_sp + 16;

ARM64 recovery

Go's ARM64 assembler (cmd/internal/obj/arm64, function preprocess in obj7.go#L691) injects a MOVD.W prologue instead of the standard STP. Unlike STP X29, X30 which places R29 at the lower address and LR at the higher address, the MOVD.W sequence places LR at SP+0 and R29 below SP at SP-8.

Verified by disassembling runtime.systemstack (go tool objdump -s 'runtime\.systemstack' ./binary):

TEXT runtime.systemstack.abi0(SB)
  asm_arm64.s:255  0x83d70  f81f0ffe  MOVD.W R30, -16(RSP)    // *(SP+0) = LR
  asm_arm64.s:255  0x83d74  f81f83fd  MOVD R29, -8(RSP)       // *(SP-8) = R29
  asm_arm64.s:255  0x83d78  d10023fd  SUB $8, RSP, R29

The profiler reads:

caller_pc = *(sched_sp);       // LR at SP+0
caller_fp = *(sched_sp - 8);   // R29 at SP-8
caller_sp = sched_sp + 16;

mcall

Locating the user goroutine

mcall's callee (park_m, goexit0, etc.) calls dropg() which sets m.curg = nil and g.m = nil. In most samples, m.curg is nil. The profiler falls back to reading the goroutine pointer from the g0 stack at *(g0.sched.sp - 8).

AMD64: mcall does PUSHQ AX (old_g) before calling fn. This writes old_g to g0.sched.sp - 8 deterministically. This is hand-written assembly, stable across all Go versions.

Verified by disassembling runtime.mcall (go tool objdump -s 'runtime\.mcall' ./binary):

TEXT runtime.mcall(SB)
  asm_amd64.s:450  0x46f514  MOVQ R14, AX                  // AX = old_g
  asm_amd64.s:451  0x46f517  MOVQ SI, R14                  // R14 = g0
  asm_amd64.s:453  0x46f51a  MOVQ R14, FS:0xfffffff8       // TLS = g0
  asm_amd64.s:454  0x46f523  MOVQ 0x38(R14), SP            // SP = g0.sched.sp
  asm_amd64.s:455  0x46f527  MOVQ $0x0, BP
  asm_amd64.s:456  0x46f52e  PUSHQ AX                      // *(g0.sched.sp - 8) = old_g
  asm_amd64.s:458  0x46f532  CALL R12                      // fn(old_g)

ARM64: mcall passes old_g in R0 (no push). The goroutine pointer is at g0.sched.sp - 8 only if fn spills R0 to its ABIInternal arg area. This depends on the compiler's spill decisions, not hand-written assembly. Stable for the current ABI; a future ABI change could alter the spill offset. The DWARF DW_OP_fbreg 8 should be used as the source of truth.

Verified by disassembling runtime.mcall on ARM64:

TEXT runtime.mcall(SB)
  asm_arm64.s:234  0x83d24  MOVD 56(R28), R0              // R0 = g0.sched.sp
  asm_arm64.s:235  0x83d28  MOVD R0, RSP                  // RSP = g0.sched.sp
  asm_arm64.s:236  0x83d2c  MOVD ZR, R29
  asm_arm64.s:237  0x83d30  MOVD R3, R0                   // R0 = old_g
  asm_arm64.s:238  0x83d34  MOVD ZR, -16(RSP) 
  asm_arm64.s:239  0x83d38  SUB $16, RSP, RSP             // allocate 16-byte arg space
  asm_arm64.s:241  0x83d40  CALL (R4)                     // fn(old_g)

fn's prologue then spills R0 to entry_SP + 8 = g0.sched.sp - 16 + 8 = g0.sched.sp - 8. This is confirmed by both disassembly and DWARF.

park_m spills (go tool objdump -s 'runtime\.park_m' ./binary):

TEXT runtime.park_m(SB)
  proc.go:4229  0x55f6c  MOVD.W R30, -96(RSP)             // frame = 96 bytes
  proc.go:4229  0x55f70  MOVD R29, -8(RSP)
  proc.go:4229  0x55f74  SUB $8, RSP, R29
  proc.go:4229  0x55f78  MOVD R0, 104(RSP)                // spill gp: RSP+104 = CFA+8 = g0.sched.sp - 8

DWARF confirms (dwarfdump -i -S match=runtime.park_m -Wc ./binary):

DW_TAG_formal_parameter
  DW_AT_name   gp
  DW_AT_location:
    [0x55f60, 0x55fa0): DW_OP_reg0          <- gp in R0 (before spill)
    [0x55fa0, 0x56230): DW_OP_fbreg 8       <- gp at CFA+8 = g0.sched.sp - 8

ARM64: functions that don't spill

goexit0 does NOT spill - passes R0 directly to gdestroy (go tool objdump -s 'runtime\.goexit0' ./binary):

TEXT runtime.goexit0(SB)
  proc.go:4447  0x56b7c  MOVD.W R30, -32(RSP) 
  proc.go:4447  0x56b80  MOVD R29, -8(RSP)
  proc.go:4447  0x56b84  SUB $8, RSP, R29
  proc.go:4448  0x56b88  CALL runtime.gdestroy(SB)         // R0 passed directly, no spill

DWARF confirms - location list only contains DW_OP_reg0, no DW_OP_fbreg:

DW_TAG_formal_parameter
  DW_AT_name   gp
  DW_AT_location:
    [0x56b70, 0x56b8c): DW_OP_reg0          <- gp stays in R0, never spilled
    end-of-list

All mcall callees can be found with:

grep -n 'mcall(' /usr/local/go/src/runtime/*.go | grep -v '//' | grep -v 'func mcall'

Each callee's spill behavior can be verified with DWARF:

dwarfdump -i -S match=runtime.<callee> -Wc ./binary

If the gp parameter's location list contains DW_OP_fbreg 8, it spills to g0.sched.sp - 8. If it only contains DW_OP_reg0 followed by end-of-list, it does not spill.

When the profiler cannot resolve the user goroutine through mcall (non-spilling function on ARM64, or goroutine already rescheduled on another M), it stops unwinding at the mcall frame (*stop = true) without crossing to the user stack. The g0 frames (park_m, schedule, findRunnable, etc.) are still captured. This is the same behavior as before this change (UNWIND_COMMAND_STOP). There is no regression.

Callee Spills R0? DWARF
park_m Yes DW_OP_fbreg 8
preemptPark Yes DW_OP_fbreg 8
exitsyscall0 Yes DW_OP_fbreg 8
goyield_m Yes DW_OP_fbreg 8
goschedguarded_m Yes DW_OP_fbreg 8
goexit0 No DW_OP_reg0 only
gosched_m No DW_OP_reg0 only
gopreempt_m No DWARF abstract (inlined), verified via objdump

Stale goroutine detection

The goroutine at the g0 slot may have been rescheduled on another M since the mcall. Its gobuf could then contain values from systemstack on that other thread (observed in testing: systemstack_switch frames from a different M). The profiler validates candidate.m == nil (parked, gobuf reliable) vs candidate.m != nil (rescheduled, STOP).

Recovery

Unlike systemstack, mcall saves the caller's actual registers into gobuf (not a synthetic marker):

state->pc = *(curg + sched_pc);    // gobuf.pc = real return address
state->sp = *(curg + sched_sp);    // gobuf.sp = real SP
state->fp = *(curg + sched_bp);    // gobuf.bp = real FP

systemstack test

amd64
image

arm64
image

mcall test

amd64
image

arm64
image

@florianl
Copy link
Copy Markdown
Member

florianl commented Apr 1, 2026

I also do have something in progress, based on #1279, but need to land #1310 first. During KubeCon last week, I didn't find time to work on it.

@wehzzz
Copy link
Copy Markdown
Contributor Author

wehzzz commented Apr 2, 2026

I also do have something in progress, based on #1279, but need to land #1310 first. During KubeCon last week, I didn't find time to work on it.

Hope KubeCon went well!

I ended up getting nerd-sniped by this problem and wanted to dig into it since the previous PR was closed. No worries at all if you already have a fix in the works based on #1279. I currently have something working, but if you think your approach is the better way to tackle this, we can absolutely close this PR. Just let me know how you'd like to proceed!

@wehzzz wehzzz changed the title [WIP] fix: go unwinding stops at systemstacks fix: go unwinding stops at systemstacks Apr 2, 2026
@wehzzz wehzzz marked this pull request as ready for review April 8, 2026 08:01
@wehzzz wehzzz requested review from a team as code owners April 8, 2026 08:01
@wehzzz wehzzz force-pushed the fix-go-unwinding-stops-at-systemstacks branch from dfb9247 to 6961a92 Compare April 8, 2026 08:16
Copy link
Copy Markdown
Contributor

@fabled fabled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome stuff! Added first round of comments and questions.

Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
// synthetic marker for Go's stack scanner and scheduler, not a real return address.
//
// https://github.com/golang/go/blob/917949cc1d16c652cb09ba369718f45e5d814d8f/src/runtime/asm_amd64.s#L886
GoLabelsOffsets *go_offs = go_get_go_offsets();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go_get_go_offsets is used in many places, and this is inside a rolled loop. I'm wondering if it'd make sense to read this data in the native unwinder beginning to the PerCPURecord? The Go plugins also could use it and avoid the lookup later on.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to copy GoLabelsOffsets into PerCPURecord to avoid reading the same values again. I'm not sure what would be the best place for the first read - collect_trace seems to initialize the struct, so it should work there, but from my understanding it would mean calling this for every non-Go process, adding one bpf_read for each of them (at the moment it's only for go processes).

Not sure if we want to do this here or in a follow-up.

Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
Comment thread nativeunwind/elfunwindinfo/elfgopclntab.go
Copy link
Copy Markdown
Member

@florianl florianl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor comments. Overall I think a coredump test would be great here.

Comment on lines +132 to +133
// to perform Go stack unwinding.
interpreterLoaders = append(interpreterLoaders, golabels.Loader)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Load this interpreter only if it is configured.

Suggested change
// to perform Go stack unwinding.
interpreterLoaders = append(interpreterLoaders, golabels.Loader)
// to perform Go stack unwinding.
if includeTracers.Has(types.Labels) || includeTracers.Has(types.GoTracer) {
interpreterLoaders = append(interpreterLoaders, golabels.Loader)
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to tie loading Go labels (and thus unwinding) to symbolication (GoTracer), which is why I made sure to always load the labels. Since the use of the Go tracer is configurable, if a user decides to disable it in favor of a method other than local symbolication, it will cause an issue because they won't have the offsets for unwinding.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't have a strong preference, there is a clear use case for users who need native unwinding and want to disable all other interpreters, including labels. This is often done to minimize memory usage.

For approaches that are specifically focused (e.g., only on Go), it is reasonable to expect users will enable GoTracer if they are interested in unwinding past runtime.mcall. Therefore, the question is: Are there many scenarios where users require unwinding beyond runtime.mcall but without Go symbolization?

Copy link
Copy Markdown
Contributor Author

@wehzzz wehzzz Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there many scenarios where users require unwinding beyond runtime.mcall but without Go symbolization?

Yes. When Go symbolization is done on the backend, we want to disable local symbolization (noticeable memory overhead on Go-heavy nodes) while still unwinding past runtime.mcall. golabels is cheap enough that we don't mind enabling it for the offsets.

That said, golabels now serves two purposes: extracting goroutine labels, and exposing the runtime offsets the native unwinder needs to cross runtime.mcall / runtime.systemstack (and runtime.asmcgocall later). We don't see a realistic case where a user wants native unwinding but not to cross these boundaries, so gating on Labels feels like the wrong abstraction.

What do you think about splitting the offsets out into a dedicated interpreter that's always loaded for Go binaries (e.g. a new GoRuntime or GoOffsets interpreter)? This would allow GoTracer to focus on symbolization and keep Labels for label extraction only. We can do that in a follow-up PR to keep this one focused on the unwinder fix, unless you'd rather see it

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to follow up here. There's now three separate Go support things: unwinding, labels and symbolization. Depending on configuration all three might want to be managed separately. It would also reduce memory overhead if all three shared the same ebpf per-pid structures.

It might make even sense to preload the data to UnwindState or PerCPURecord on the tracer entry function. This would reduce runtime overhead of looking up the data potentially multiple times through out the unwind.

But agreed, worth a separate follow up PR / ticket.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me!
How do you usually track this kind of follow-up?

Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
@wehzzz wehzzz force-pushed the fix-go-unwinding-stops-at-systemstacks branch 2 times, most recently from ea8cc5b to c92c28e Compare April 20, 2026 07:35
@wehzzz
Copy link
Copy Markdown
Contributor Author

wehzzz commented Apr 20, 2026

I have included coredump tests. The relevant coredump files (go-*) can be accessed via the link below.
https://drive.google.com/drive/folders/1zbNEXH8zL8waaqUtHWBP0tlu7btGqFCm?usp=drive_link

@wehzzz wehzzz requested a review from florianl April 20, 2026 09:43
Comment thread tools/coredump/testdata/arm64/go-mcall.json Outdated
@florianl
Copy link
Copy Markdown
Member

@fabled friendly ping for review

Copy link
Copy Markdown
Contributor

@fabled fabled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good to me. Some nits and potential optimization opportunities pointed out in comments.

Comment thread support/ebpf/go_runtime.h Outdated
Comment thread support/ebpf/go_runtime.h Outdated
Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
Comment on lines +132 to +133
// to perform Go stack unwinding.
interpreterLoaders = append(interpreterLoaders, golabels.Loader)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to follow up here. There's now three separate Go support things: unwinding, labels and symbolization. Depending on configuration all three might want to be managed separately. It would also reduce memory overhead if all three shared the same ebpf per-pid structures.

It might make even sense to preload the data to UnwindState or PerCPURecord on the tracer entry function. This would reduce runtime overhead of looking up the data potentially multiple times through out the unwind.

But agreed, worth a separate follow up PR / ticket.

Comment thread support/ebpf/native_stack_trace.ebpf.c Outdated
@wehzzz wehzzz force-pushed the fix-go-unwinding-stops-at-systemstacks branch from 380a486 to e0bdb10 Compare May 4, 2026 09:25
@wehzzz wehzzz requested a review from fabled May 4, 2026 09:32
@wehzzz wehzzz force-pushed the fix-go-unwinding-stops-at-systemstacks branch from e0bdb10 to d8ebcaf Compare May 5, 2026 08:17
@florianl
Copy link
Copy Markdown
Member

@fabled could you have another look?

"lwp": 10910,
"frames": [
"linux-vdso.1.so+0xcfd",
"runtime.nanotime1+0 in /home/ubuntu/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/runtime/sys_linux_amd64.s:265",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems nanotime1 needs special unwind handling also.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this! I had already started looking on my end to follow up on asmcgocall, but I can add this one to the list too.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1331 for asmcgocall support. Though it should probably be converted to do frame pointer function unwind override.

"runtime.schedule+0 in /home/ubuntu/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/runtime/proc.go:4138",
"runtime.goschedImpl+0 in /home/ubuntu/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/runtime/proc.go:4301",
"runtime.gosched_m+0 in /home/ubuntu/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/runtime/proc.go:4306",
"runtime.mcall+0 in /home/ubuntu/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.0.linux-amd64/src/runtime/asm_amd64.s:462"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, and other instances of mcall seem to be terminating the unwind. Just double checking if this is intended? And if yes, where's the stack trace that this testcase is expected to verify?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This testcase is mostly meant to exercise the hard-to-catch mcall states after dropg(), where we recover a candidate goroutine from the g0 stack and then have to validate candidate.g.m before using its gobuf.

The main case I wanted to preserve is when candidate.g.m is non-nil, meaning the goroutine has already been rescheduled on another M. In that case, continuing would risk reading a stale/overwritten gobuf and producing a wrong stack trace, so stopping at runtime.mcall is the expected safe behavior.

This was the scenario I had trouble catching reliably in deployments, so I considered this safe-stop regression coverage more valuable here than only adding success cases, which are usually much easier to notice when they break.

That said, I agree the intent is not obvious from the expected stack trace alone. This probably deserves either more documentation in the testsource, or an additional more targeted test.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding might also be wrong since it's my first time working on unwinding issues, so please feel free to challenge what I'm saying.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdyt ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gentle ping on ^ @fabled

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So would it then make sense add a second mcall test case that actually shows unwind over mcall?

Copy link
Copy Markdown
Contributor Author

@wehzzz wehzzz May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While trying to add and adapt the existing tests, I ran into 3 problems:

1. Go offsets weren't loaded in the coredump tests

files: ebpfhelpers.go and ebpfmaps.go

I fixed that, and then realized my existing mcall tests weren't actually testing what I wanted - the result was biased by the fact that unwinding was failing in every case.

2. TLS reads return 0 on arm64

file: go_runtime.h

After the fix, running the coredump analysis:

./coredump analyze --case testdata/arm64/go-mcall.json --debug-ebpf
tsd_base at 0x0, g offset: 16
tsd_base at 0x0, g offset: 16
tsd_base at 0x0, g offset: 16
... (every Go-runtime thread)

Looking into it, this is expected for programs that don't explicitly import "C", because the Go runtime gates TLS on runtime detection via runtime.iscgo. The problem on our side is that we only gate on the buildInfo. That means a program compiled with CGO_ENABLED=1 but with no actual import "C" slips through, and the existing R28 fallback (which only fires when tls_offset == 0) never triggers. I modified that path so it falls back correctly when the TLS read fails or returns 0.

3. Spill-fallback path is broken on arm64

After fixing the other two, I noticed we had an unwinding issue on the argument-spill fallback path on arm64. gosched_m isn't supposed to spill, so we shouldn't be able to unwind past it - but here we do.

# LWP 22277
runtime.osyield
runtime.goschedImpl
runtime.gosched_m
runtime.mcall+0 
runtime.gopark
goroutine
runtime.block
main.main
runtime.main
runtime.goexit

# LWP 22281
runtime.lock2
runtime.findRunnable
runtime.schedule
runtime.goschedImpl
runtime.gosched_m
runtime.mcall+0
runtime.gopark
runtime.bgsweep
runtime.gcenable.gowrap1
runtime.goexit

# LWP 22280
runtime.futex
runtime.semasleep
runtime.lock2
runtime.goschedImpl
runtime.gosched_m
runtime.mcall+0
runtime.gopark
runtime.(*scavengerState).park
runtime.bgscavenge
runtime.gcenable.gowrap2
runtime.goexit

On amd64, runtime.mcall deterministically pushes old_g to (g0.sched.sp - 8) via PUSHQ AX. Arm64 has no equivalent - only certain callees (park_m, preemptPark, exitsyscall0, goyield_m, goschedguarded_m) spill R0 there. When the callee doesn't spill (gosched_m, goexit0, gopreempt_m), the slot keeps whatever was there last - typically old_g from a previous mcall(park_m) on the same M. Our BPF checks only prove that the value still looks like a parked goroutine (m == nil, sched.sp != 0); they do not prove that it is the goroutine passed to the current mcall.

I don't see an obvious quick fix off the top of my head, so I suggest that for now we don't enable the fallback on arm64. This implies we'll have more unwinding stops than on amd64. Unless you see a way to fix this, I think the best path is to move forward with this PR and leave it for a follow-up.

Next steps

I suggest the following:

  • Don't fall back on argument-spill on arm64 for mcall.
  • Regenerate the mcall test data on arm64.
  • Open a follow-up for arm64 mcall unwinding.
  • Open a follow-up to split Go runtime offsets from Go labels.

Open question

On arm64, we currently decide whether to extract a TLS g offset from build info
(CGO_ENABLED) rather than from the runtime value of runtime.iscgo. This can
produce tls_offset != 0 for binaries built with CGO_ENABLED=1 but with no
actual import "C", while the Go runtime itself takes the non-cgo path and keeps
g in R28. Is that distinction intentional or a constraint for our use case, or should we try
to match Go's runtime behavior more closely and gate this on runtime.iscgo
instead?

Thanks @fabled for pushing on this. Sorry, I should have paid closer attention to the arm64 runtime path earlier.

Also, small ping @florianl since you already approved this. I’d appreciate your thoughts on the points above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gentle bump @florianl @fabled

Copy link
Copy Markdown
Member

@florianl florianl May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wehzzz can you please resolve the merge conflict?

So would it then make sense add a second mcall test case that actually shows unwind over mcall?

For me it would be fine to do this as a follow up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will do so!

For me it would be fine to do this as a follow up.

Regarding this point, it was mostly about this comment: #1313 (comment). When I tried to add a second mcall test, I encountered several issues. I fixed some of them, but right now the main blocker is that the spill fallback I implemented for arm64 can't work, and we'll need to find a workaround.

Before I dive into anything, I wanted to make sure the steps documented in the previous comment works for you, so we can try to get this merged as quickly as possible.

@wehzzz wehzzz requested a review from fabled May 13, 2026 09:22
@wehzzz wehzzz force-pushed the fix-go-unwinding-stops-at-systemstacks branch from 7ac1a24 to eac0e41 Compare May 15, 2026 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants