
drm/i915: fix FreeBSD Arc A770 load blockers#468

Open
ryanfahy314 wants to merge 2 commits into freebsd:master from ryanfahy314:fix/i915-freebsd-lmem-register

ryanfahy314 commented Jun 5, 2026

Related to #315.

Summary

This PR contains two i915 fixes found while debugging Intel Arc A770 / DG2 support on FreeBSD 15.1-STABLE.

Changes

  1. Avoid double-incrementing i915 scatterlist length on FreeBSD.

In the LinuxKPI scatterlist.h, there is a preprocessor macro that aliases sg_dma_len(sg) as (sg)->length.

In linuxkpi/common/include/linux/scatterlist.h:

#define sg_dma_address(sg)      (sg)->dma_address
#define sg_dma_len(sg)          (sg)->length

However, in drivers/gpu/drm/i915/i915_scatterlist.c, there are two instances with the pattern:

sg->length += len;
sg_dma_len(sg) += len;

On FreeBSD, both statements update the same field, so sg->length was incremented twice, which led to a kernel panic on kldload i915kms.
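A minimal userspace sketch of the aliasing problem, assuming a cut-down `struct scatterlist` with only the fields needed here. The helper names `grow_both()` and `grow_once()` are hypothetical and are not taken from the patch; only the `sg_dma_len()` definition mirrors the LinuxKPI header quoted above.

```c
/* Cut-down stand-in for the scatterlist entry (assumption: real struct has more fields). */
struct scatterlist {
	unsigned int length;
	unsigned long dma_address;
};

/* Same aliasing as the LinuxKPI header: sg_dma_len() is just sg->length. */
#define sg_dma_len(sg)	(sg)->length

/* Upstream pattern: under the alias, both statements modify the same field. */
static void
grow_both(struct scatterlist *sg, unsigned int len)
{
	sg->length += len;
	sg_dma_len(sg) += len;
}

/* Hypothetical guarded variant: a single update when the macro aliases sg->length. */
static void
grow_once(struct scatterlist *sg, unsigned int len)
{
	sg_dma_len(sg) += len;
}
```

Starting from a zeroed entry and growing by 0x10000, `grow_both()` leaves `length` at 0x20000 and `grow_once()` leaves it at 0x10000.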

  2. Register FreeBSD fictitious pages for the i915 LMEM BAR.

The existing i915 FreeBSD fbdev handling registers a small framebuffer range.

On DG2, the CPU-visible aperture is larger than the fbdev range, so mappings can touch PFNs that do not have corresponding vm_page_t metadata.
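To illustrate the coverage gap, here is a small sketch of a PFN range check. It assumes 4 KiB pages; `struct phys_range` and `range_covers_pfn()` are hypothetical names used for illustration only and are not FreeBSD or drm-kmod APIs.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_4K	12	/* assumption: 4 KiB pages */

/* Hypothetical description of a registered physical range. */
struct phys_range {
	uint64_t start;	/* physical address of the first byte */
	uint64_t size;	/* length in bytes */
};

/* Returns true if the page frame number lies inside the range. */
static bool
range_covers_pfn(const struct phys_range *r, uint64_t pfn)
{
	uint64_t first = r->start >> PAGE_SHIFT_4K;
	uint64_t count = r->size >> PAGE_SHIFT_4K;

	return (pfn >= first && pfn - first < count);
}
```

With an 8 MiB range and a larger aperture that share a base address, a PFN 16 MiB past the base is covered by the aperture range but not by the 8 MiB one.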

This patch registers the CPU-visible LMEM BAR aperture during LMEM initialization, and also preserves fbdev registration as a fallback. LMEM and fbdev now each track whether they own the fictitious range, and cleanup only unregisters from the path that registered it.

I tried to follow the precedent set by the amdgpu driver for calling the register_fictitious_range() and unregister_fictitious_range() helpers. The i915 case needed some additional ownership handling logic because the fbdev path can register a small framebuffer range (on my system it was about 8 MiB).
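The ownership handling can be sketched roughly as follows. This is a simplified model, not the patch itself: the three structs are hypothetical stand-ins that carry only the `fictitious_range_registered` flag, and `mock_register()` / `mock_unregister()` are placeholders for the real register_fictitious_range() and unregister_fictitious_range() helpers, whose actual signatures are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real structs carry many more fields. */
struct dev_state   { bool fictitious_range_registered; };
struct lmem_state  { bool fictitious_range_registered; };
struct fbdev_state { bool fictitious_range_registered; };

/* Placeholder for registration: sets the device flag only on success. */
static int
mock_register(struct dev_state *ddev, bool fail)
{
	if (fail)
		return (-1);
	ddev->fictitious_range_registered = true;
	return (0);
}

/* Placeholder for unregistration: clears the device flag. */
static void
mock_unregister(struct dev_state *ddev)
{
	ddev->fictitious_range_registered = false;
}

/* LMEM init: register the aperture if nothing is registered yet. */
static void
lmem_init(struct dev_state *ddev, struct lmem_state *mem, bool fail)
{
	if (!ddev->fictitious_range_registered &&
	    mock_register(ddev, fail) == 0)
		mem->fictitious_range_registered = true;
}

/* fbdev init: fallback, only registers if LMEM did not. */
static void
fbdev_init(struct dev_state *ddev, struct fbdev_state *fb)
{
	if (!ddev->fictitious_range_registered &&
	    mock_register(ddev, false) == 0)
		fb->fictitious_range_registered = true;
}

/* Cleanup paths: each one unregisters only what it registered. */
static void
fbdev_fini(struct dev_state *ddev, struct fbdev_state *fb)
{
	if (fb->fictitious_range_registered) {
		mock_unregister(ddev);
		fb->fictitious_range_registered = false;
	}
}

static void
lmem_fini(struct dev_state *ddev, struct lmem_state *mem)
{
	if (mem->fictitious_range_registered) {
		mock_unregister(ddev);
		mem->fictitious_range_registered = false;
	}
}
```

In this model, fbdev cleanup leaves an LMEM-owned range in place, and fbdev takes ownership only when LMEM registration did not happen.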

Testing

  • Built drm-kmod from source on FreeBSD 15.1-STABLE.
  • Tested on Intel Arc A770 / DG2.
  • i915kms loads.
  • DMC and GuC firmware load with hw.i915kms.enable_guc=1.
  • /dev/dri/card0 and /dev/dri/renderD128 are created.
  • Xorg starts with the modesetting driver.
  • SDDM/Plasma starts and drives two monitors with AccelMethod "none".

Known Limitations

These changes appear to be stable on my system with AccelMethod "none".

Hardware acceleration works initially, but I experienced two crashes with it enabled, one after about an hour of uptime and the other after about 5 minutes of uptime. After the crash, dmesg reports the following:

drmn0: [drm] *ERROR* GT0: GUC: CT: Unsolicited response message: len 1, data 0xe0000104 (fence 43258, last 43258)
drmn0: [drm] *ERROR* GT0: GUC: CT: Failed to handle HXG message (0xffffffffffffff82e) fffff8059f5e8b18h
drmn0: [drm] *ERROR* GT0: GUC: CT: Failed to process CT message (0xffffffffffffff82e) fffff8059f5e8b14h
drmn0: [drm] *ERROR* GT0: GUC: Bad context sched_state 0x6, ctx_id 4111
drmn0: [drm] *ERROR* GT0: GUC: CT: Failed to process request 1002 (0xffffffffffffffa4e)
drmn0: [drm] *ERROR* GT0: GUC: CT: Failed to process CT message (0xffffffffffffffa4e) fffff80526ca8cd4h

I'm not yet sure what is causing the instability, but my current assumption is that the failure lies further down the hardware acceleration pipeline. I thought it was best to disclose it in case it is relevant to review.

Waking from sleep also appears to fail at the moment.

Contributor Note

For transparency, AI assistance was used in a limited scope during discovery and debugging. This included debugging, source navigation guidance, log/crashdump analysis, and temporary instrumentation of certain functions to gather information (all temporary instrumentation was reverted). Any behavioral code was written manually. All code changes were written, reviewed, built, and tested by me. Documentation, PR body and commit text, and code comments were all written manually by me with AI review for accuracy.

Ryan Fahy added 2 commits June 5, 2026 01:12
Found that in the LinuxKPI scatterlist.h, there is a preprocessor
macro that aliases sg_dma_len(sg) to sg->length.
In i915_scatterlist.c, there are two instances of the pattern:

sg->length += len
sg_dma_len(sg) += len

This causes the field to be incremented twice on FreeBSD and makes
the i915kms driver panic the kernel on load. In the observed crash,
sg->length was 0x20000 while the related object size was 0x10000.

Update i915_rsgt_from_mm_node() and i915_rsgt_from_buddy_resource()
so that on FreeBSD, only sg_dma_len(sg) is incremented to avoid
the double increment.

Built and tested on FreeBSD 15.1-STABLE. This change allows
the i915kms driver to load with enable_guc=1.

Signed-off-by: Ryan Fahy <ryan@rfahy.com>
On FreeBSD, CPU-visible memory needs fictitious vm_page_t metadata
registered for it so PFN lookups can resolve.

The current i915 implementation only registers the fbdev framebuffer range.
On Intel Arc DG2 GPUs, mappings outside of the fbdev range can touch PFNs
that do not have corresponding metadata because the CPU-visible LMEM BAR
aperture exceeds the registered framebuffer range.

This patch registers the full CPU-visible LMEM aperture range during LMEM
initialization using mem->io.start and resource_size(&mem->io).

Because the i915 driver already included logic to initialize the framebuffer range,
preserve that behavior as a fallback. Add a fictitious_range_registered bool to
both the intel_memory_region struct and the intel_fbdev struct. Each bool tracks
whether a fictitious range was registered by LMEM or by fbdev.

The intended behavior is:
1) LMEM initialization -> If ddev->fictitious_range_registered is false,
register the CPU-visible LMEM aperture and set
intel_memory_region->fictitious_range_registered to true.
2) FBDEV initialization -> If ddev->fictitious_range_registered is false,
meaning LMEM didn't register anything, register just the small framebuffer
memory region and set intel_fbdev->fictitious_range_registered to true.
3) On FBDEV cleanup, check intel_fbdev->fictitious_range_registered to
see whether fbdev was the one to register the range. If so, call the unregister function.
4) On LMEM cleanup, check intel_memory_region->fictitious_range_registered
to establish whether LMEM was the one to register the range. If so, call the unregister function.

Also modify drm_os_freebsd so that unregister_fictitious_range() clears
ddev->fictitious_range_registered.
register_fictitious_range() now sets ddev->fictitious_range_registered only if the range was
registered successfully. If registration fails, clean up with vt_unfreeze_main_vd() and return
the error status.

Tested on FreeBSD 15.1-STABLE with an Intel Arc A770/DG2. i915kms loads, DMC and GuC load,
/dev/dri/card0 and /dev/dri/renderD128 appear, Xorg is able to use the modesetting driver,
and SDDM and Plasma work with AccelMethod none. Hardware acceleration still appears to
be unstable on DG2, but the instability does not appear to be caused by this patch.

Signed-off-by: Ryan Fahy <ryan@rfahy.com>