restorer: unmap vDSO target before ARCH_MAP_VDSO #3012
Pull request overview
This PR fixes an x86 restore-path issue where using ARCH_MAP_VDSO_* could successfully map a new vDSO/VVAR pair at an address different from CRIU’s requested vdso_rt_parked_at when the reserved bootstrap tail is still mapped there, leaving CRIU’s runtime vDSO bookkeeping inconsistent with the actual restored address space.
Changes:
- Unmaps the reserved “runtime vDSO tail” region at vdso_rt_parked_at before calling map_vdso() when args->can_map_vdso is enabled.
- Adds an explanatory comment describing why the tail must be freed for ARCH_MAP_VDSO_* to honor the requested address.
94b4f3a to 89a7df1
Pull request overview
Fixes a restore-path correctness issue in restorer when using ARCH_MAP_VDSO_*: the reserved “runtime vDSO tail” at vdso_rt_parked_at must be unmapped before requesting the kernel to map a new vDSO/VVAR there, otherwise the kernel may place it at a different address and CRIU’s bookkeeping becomes inconsistent.
Changes:
- Unmap the reserved runtime vDSO parking range (vdso_rt_parked_at .. +vdso_rt_size) immediately before calling map_vdso() when args->can_map_vdso is set.
- Add error handling/logging to abort restore if the unmap fails.
@cjolivier01 thank you for working on this. please write a detailed commit message.
89a7df1 to 9b40bc6
done
@cjolivier01 the DCO check is failing, you need a Signed-off-by line in your commit.
e26141c to 5b57c35
When the kernel supports ARCH_MAP_VDSO, the restorer asks the kernel to map a fresh runtime vDSO/VVAR pair at task_args->vdso_rt_parked_at. That address is also the tail of the bootstrap mapping reserved for the runtime vDSO when the restorer cannot use ARCH_MAP_VDSO. The bootstrap unmap helper keeps this tail mapped so a parked runtime vDSO remains reachable.

On the ARCH_MAP_VDSO path, leaving the tail mapped means the requested address is still occupied before arch_prctl(). Kernels that use MAP_FIXED_NOREPLACE for ARCH_MAP_VDSO fail with EEXIST there, and any alternate placement would leave CRIU's runtime vDSO bookkeeping pointing at the wrong address.

Unmap the reserved tail before calling map_vdso(). Check the sys_munmap() return value and abort restore with a clear error if the parking area cannot be released, instead of continuing into map_vdso() with inconsistent assumptions.

Validated with:
- make -j$(nproc) criu/criu
- sudo env PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python ./test/zdtm.py run -t zdtm/static/vdso00 -t zdtm/static/vdso01 -t zdtm/static/vdso02 -t zdtm/static/vdso-proxy --criu-bin ./criu/criu --pycriu-search-path ./lib --keep-going --ignore-taint -k failed

Signed-off-by: Chris Olivier <colivier@tesla.com>
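The ordering described in the commit message (release the parking range, check the result, and only then request the new mapping) can be sketched in ordinary user-space C. This is an illustration under stated assumptions, not the restorer code: request_mapping() is a hypothetical stand-in for map_vdso() built on a hinted mmap(), release_then_map() is a made-up helper name, and libc munmap() takes the place of sys_munmap().

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for map_vdso(): a hinted, non-fixed mapping request. */
static void *request_mapping(void *hint, size_t len)
{
	return mmap(hint, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/*
 * Release the parking range first and only then ask for the new mapping.
 * Returns the address of the new mapping, or MAP_FAILED if the range could
 * not be released or the mapping request itself failed. The caller is
 * expected to abort in either case.
 */
static void *release_then_map(void *parked_at, size_t size)
{
	assert(size > 0);

	if (munmap(parked_at, size)) {
		fprintf(stderr, "can't unmap parking area %p-%p: %s\n",
			parked_at, (void *)((char *)parked_at + size),
			strerror(errno));
		return MAP_FAILED;
	}

	return request_mapping(parked_at, size);
}
```

A caller would treat MAP_FAILED as fatal, and could additionally compare the returned address with parked_at before recording it anywhere.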
5b57c35 to 644f7b9
fixed
Normally I don't resolve my own PR's comments, so unresolve if that's not the procedure. I figured since none were resolved that maybe I was supposed to do it.
Pull request overview
This PR fixes a restore-time vDSO placement inconsistency when the restorer uses ARCH_MAP_VDSO_*: it ensures the reserved runtime vDSO “parking” tail of the bootstrap mapping is unmapped before requesting the kernel to map a fresh vDSO/VVAR pair at that address.
Changes:
- Unmap the reserved runtime vDSO tail (vdso_rt_parked_at / vdso_rt_size) before calling map_vdso() when args->can_map_vdso is true.
- Add an error path that aborts restore if the unmap of the reserved area fails, preventing map_vdso() from proceeding with an occupied target.
The bootstrap region is allocated by restorer_get_vma_hint. restorer_get_vma_hint looks for an unused region in the current address space, which means we don't expect to see any unexpected mappings in the bootstrap area. @cjolivier01 could you give more details about where this issue was triggered? What environment was it, what CRIU version was used, etc.?
@cjolivier01 friendly ping. We need to understand what can be mapped there. I think your change can hide a deeper issue.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## criu-dev #3012 +/- ##
============================================
+ Coverage 57.22% 57.25% +0.02%
============================================
Files 154 154
Lines 40440 40440
Branches 8863 8863
============================================
+ Hits 23142 23154 +12
+ Misses 17034 17022 -12
  Partials 264 264
I’ll try to set up a repro this week.
(Reply sent by email to Andrei Vagin's comment of Tue, May 5, 2026, 5:52 PM.)
@cjolivier01 thank you. While I was reviewing your patch, I found and fixed one issue #3017. It might be related to your issue.
@cjolivier01 any update?
Summary
When the restorer uses ARCH_MAP_VDSO_* instead of parking CRIU's own runtime vDSO, unmap the reserved runtime vDSO tail before asking the kernel to map a fresh vDSO there.

CRIU already reserves vdso_rt_size bytes at the end of the bootstrap mapping and records that address in vdso_rt_parked_at. The non-ARCH_MAP_VDSO path moves CRIU's current vDSO/VVAR pair into that reserved range with vdso_do_park(). The ARCH_MAP_VDSO path skips parking and asks the kernel to create a new vDSO/VVAR pair at vdso_rt_parked_at instead.

The problem is that the bootstrap tail is still mapped at that point. If the requested address is occupied, the kernel does not necessarily install the vDSO/VVAR pair at the requested address. On current x86 kernels the syscall can still succeed, but the mapping is placed elsewhere. CRIU then continues with map_vdso() recording vdso_rt_parked_at as the runtime vDSO location, even though the actual vDSO/VVAR pair is somewhere else.

That leaves vdso_maps_rt inconsistent with the restored address space and can break later vDSO handling, including gettimeofday setup and vDSO proxification decisions.

Root cause
unmap_old_vmas() keeps the bootstrap mapping intact and excludes the runtime vDSO tail from the range it unmaps. That is correct for the parking path, because vdso_do_park() needs a reserved destination range.

For the ARCH_MAP_VDSO path, however, the same reserved range has to be free before calling arch_map_vdso(). The kernel treats the supplied address as a requested mapping location; if it is already occupied, the call can succeed while choosing a different address. CRIU does not observe that actual address and instead assumes the requested address was used.

This patch frees the reserved tail immediately before map_vdso() when args->can_map_vdso is true.

Reproducer
This reproducer models the CRIU state at the failing point:
- vdso_rt_parked_at
- arch_prctl(ARCH_MAP_VDSO_64, vdso_rt_parked_at) while that address is still occupied

On an Ubuntu 22.04 x86_64 kernel (5.19.0-45-generic) the occupied case succeeds but maps VVAR/VDSO elsewhere:

A minimal standalone reproducer is:
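As an analogy only (this is not the PR's reproducer, and it does not call arch_prctl()), the same address-as-hint behaviour can be observed with a plain mmap() that omits MAP_FIXED: a request aimed at an occupied range succeeds, but lands somewhere else. The sketch below assumes Linux; map_at_hint() and occupied_hint_is_relocated() are hypothetical names chosen for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Request an anonymous mapping at `hint` WITHOUT MAP_FIXED, so the address
 * is only a hint and the kernel is free to pick another location.
 */
static void *map_at_hint(void *hint, size_t len)
{
	return mmap(hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/*
 * Returns 1 if a hinted request aimed at an occupied range was placed
 * elsewhere, 0 if it landed on the requested address, -1 on mmap failure.
 */
static int occupied_hint_is_relocated(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	void *parked, *got;
	int relocated;

	assert(page > 0);
	len = 4 * (size_t)page;

	/* Placeholder standing in for the still-mapped bootstrap tail. */
	parked = map_at_hint(NULL, len);
	if (parked == MAP_FAILED)
		return -1;

	/* Ask for the very same address while it is still occupied. */
	got = map_at_hint(parked, len);
	if (got == MAP_FAILED) {
		munmap(parked, len);
		return -1;
	}

	printf("requested %p, got %p\n", parked, got);
	relocated = (got != parked);

	munmap(got, len);
	munmap(parked, len);
	return relocated;
}
```

Code that records the requested address instead of the returned one ends up with the same kind of stale bookkeeping that the description attributes to vdso_maps_rt.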
User-visible impact
The bug is only on the restore path where CRIU uses ARCH_MAP_VDSO_* instead of parking the runtime vDSO/VVAR mapping. It is architecture and kernel behavior dependent, but when it triggers, CRIU's runtime vDSO bookkeeping no longer describes the actual mappings in the restored process.

The fix keeps the existing allocation model and only frees the already-reserved runtime vDSO tail at the point where it must become the destination for ARCH_MAP_VDSO_*.

Validation
git diff --check