Update to Linux 6.9 drivers #361
|
Hi @dumbbell, thank you for doing this work! I tested your changes with the latest upstream firmware, but I was unsuccessful in starting sway:
Hardware info: https://bsd-hardware.info/?probe=fd3536cb97 |
|
Thank you @centromere for the test and feedback! Could you please share the entire dmesg without filtering it? Some DRM-related messages do not contain "drm". Could you also please run sway(1) inside truss(1) like this: ... and share both |
|
@dumbbell, is there anything else I can provide to be of assistance to you? |
I still haven’t had a chance to look deeply at the code to understand what it wants to do here. Thank you for sharing the log files by the way! It should be enough to investigate, though it’s unlikely I will be able to do it in the coming week. |
|
I just looked at the firmware: the Linux 6.9 drivers don’t use newer firmware files. However, I pushed a branch that updates the existing files. I don’t know if this can fix some of the bugs people saw though. Here is the branch: |
|
@centromere: Here is my current understanding: Mesa performs a series of ioctl(2) calls to the device. You can find the breakdown below, based on the sources of Mesa 24.1.7 currently available in the Ports tree, and the I’m not sure about the last I suppose that the returned size is incorrect and triggered an error from calloc(3). I would like to know the params of that last Are you able to recompile Mesa from the Ports tree after applying a patch? Here is a diff that adds an fprintf() to help me understand what the kernel returned:
--- work/mesa-24.1.7/src/intel/common/i915/intel_gem.h.orig 2025-07-09 01:06:36.275143000 +0200
+++ work/mesa-24.1.7/src/intel/common/i915/intel_gem.h 2025-07-09 01:09:28.522240000 +0200
@@ -75,6 +75,7 @@
};
int ret = intel_ioctl(fd, DRM_IOCTL_I915_QUERY, &args);
+ fprintf(stderr, "DRM_IOCTL_I915_QUERY: query_id=%lu ret=%d length=%d\n", query_id, ret, item.length);
if (ret != 0)
return -errno;
else if (item.length < 0)
You can apply it |
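To illustrate the hypothesis above, here is a minimal sketch in C of the two-pass query pattern, where the caller first asks for the required size and then allocates. This is not Mesa's code: `fake_query_item`, `fake_query_ioctl()` and `fake_query_alloc()` are hypothetical names, and the stub simply assumes the kernel rejects the item. The point is that the per-item error arrives as a negative `length` while the ioctl itself returns 0, so the length has to be checked before it reaches calloc(3).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a query item; the real uAPI struct has more fields. */
struct fake_query_item {
    int32_t length; /* in: buffer size, 0 to ask for the size; out: size or -errno */
};

/* Hypothetical kernel stub that rejects the item (assumption for this sketch). */
static int fake_query_ioctl(struct fake_query_item *item)
{
    assert(item != NULL);
    item->length = -EINVAL; /* the error travels in the item, not in the return value */
    return 0;               /* the ioctl call itself succeeds */
}

/*
 * First pass of the two-pass pattern: ask for the size, then allocate.
 * Returns a zeroed buffer, or NULL with *err set to a negative errno value.
 */
static void *fake_query_alloc(int32_t *err)
{
    struct fake_query_item item = { .length = 0 };

    *err = 0;
    if (fake_query_ioctl(&item) != 0) {
        *err = -EIO;
        return NULL;
    }
    if (item.length < 0) {
        /* Without this check, (size_t)item.length would be a huge value. */
        *err = item.length;
        return NULL;
    }
    return calloc(1, (size_t)item.length);
}
```

With the stub above, `fake_query_alloc()` returns NULL and reports `-EINVAL` instead of handing a negative size to the allocator.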
|
I have collected the requested information. |
|
Thank you! Indeed, the returned length looks like an error code (-22 =
In the i915 driver, this would be this code (
classinstance = *((struct i915_engine_class_instance *)&query_item->flags);
engine = intel_engine_lookup_user(i915, (u8)classinstance.engine_class,
(u8)classinstance.engine_instance);
if (!engine)
return -EINVAL;
if (engine->class != RENDER_CLASS)
return -EINVAL;
So the lookup fails or returns an unexpected engine, based on the query flags passed by Mesa. I can’t dig deeper right now, but I will follow that lead. |
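The two failure paths in the snippet above can be modelled with a small, self-contained C sketch. Everything here is hypothetical: `fake_engine`, `fake_lookup_user()` and `fake_query_geometry()` are simplified stand-ins, the linear search replaces whatever lookup structure the driver really uses, and the class values are placeholders. It only shows that both a missing engine and a non-render engine end up as the same `-EINVAL`, which is why the error code alone cannot tell them apart.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_RENDER_CLASS 0 /* placeholder value for this sketch */

/* Hypothetical, simplified engine record; not the driver's engine struct. */
struct fake_engine {
    uint16_t uabi_class;    /* class as seen by userspace */
    uint16_t uabi_instance; /* instance as seen by userspace */
    uint8_t  hw_class;      /* internal class checked after the lookup */
};

/* Linear search keyed by (class, instance); returns NULL when nothing matches. */
static const struct fake_engine *
fake_lookup_user(const struct fake_engine *engines, size_t count,
                 uint16_t cls, uint16_t instance)
{
    assert(engines != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (engines[i].uabi_class == cls &&
            engines[i].uabi_instance == instance)
            return &engines[i];
    }
    return NULL;
}

/* Mirrors the shape of the checks quoted above: two distinct causes, one errno. */
static int
fake_query_geometry(const struct fake_engine *engines, size_t count,
                    uint16_t cls, uint16_t instance)
{
    const struct fake_engine *engine =
        fake_lookup_user(engines, count, cls, instance);

    if (engine == NULL)
        return -EINVAL; /* lookup failed */
    if (engine->hw_class != FAKE_RENDER_CLASS)
        return -EINVAL; /* found, but not a render engine */
    return 0;
}
```

An empty or incorrectly populated engine table makes even a class 0 / instance 0 query fail, which matches the symptom seen here.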
|
@centromere: Also, I forgot to ask, is it a regression compared to previous DRM drivers? |
No. This is brand new hardware which has never successfully had graphics with FreeBSD. Graphics do work with Ubuntu 24.04, however. |
|
Do you know which version of the kernel Ubuntu 24.04 uses? |
|
|
|
I've applied the following change:
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index c3a2d15c84..5b537887d5 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -112,12 +112,17 @@ static int query_geometry_subslices(struct drm_i915_private *i915,
engine = intel_engine_lookup_user(i915, (u8)classinstance.engine_class,
(u8)classinstance.engine_instance);
+ i915_report_error(i915, "query_geometry_subslices: engine_class=%hhu, engine_instance=%hhu\n", (u8)classinstance.engine_class, (u8)classinstance.engine_instance);
- if (!engine)
+ if (!engine) {
+ i915_report_error(i915, "!engine\n");
return -EINVAL;
+ }
- if (engine->class != RENDER_CLASS)
+ if (engine->class != RENDER_CLASS) {
+ i915_report_error(i915, "engine->class = %hhu\n", engine->class);
return -EINVAL;
+ }
sseu = &engine->gt->info.sseu;
which resulted in the following |
|
Thank you! The class and instance being 0 is logical as Mesa sets the flags to 0. I looked at the diff between Mesa 24.1.7 and 24.2.8. I didn't spot anything relevant. Next, I will look at rbtree in linuxkpi and fixes committed to 6.8.x and 6.9.x in Linux. |
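For reference, here is a small C sketch of why flags set to 0 decode to class 0 and instance 0. It assumes the 32-bit flags value is reinterpreted as two 16-bit fields, as the cast in the driver snippet suggests; `fake_class_instance` is a local mirror written for this example, not the uAPI definition, and `memcpy()` is used instead of the pointer cast to keep the sketch free of aliasing concerns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the two 16-bit fields (assumption, not the uAPI header). */
struct fake_class_instance {
    uint16_t engine_class;
    uint16_t engine_instance;
};

/* The reinterpretation only makes sense if both objects have the same size. */
static_assert(sizeof(struct fake_class_instance) == sizeof(uint32_t),
              "class/instance pair must fit exactly in the 32-bit flags");

/* Reinterpret the 32-bit flags as a class/instance pair. */
static struct fake_class_instance fake_decode_flags(uint32_t flags)
{
    struct fake_class_instance ci;

    memcpy(&ci, &flags, sizeof(ci));
    return ci;
}

/* Inverse operation: pack a class/instance pair into 32-bit flags. */
static uint32_t fake_encode_flags(struct fake_class_instance ci)
{
    uint32_t flags;

    memcpy(&flags, &ci, sizeof(flags));
    return flags;
}
```

Since all-zero bits decode to all-zero fields regardless of byte order, the values printed by the debug patch are exactly what flags = 0 should produce, so the failure has to come from the lookup itself.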
|
I've discovered that
|
|
The driver populates debug information through lindebugfs. In particular, we can get details about the engines in |
dee696a to
09683dc
|
I pushed two fixes to my freebsd-src branch, as well as a small patch to drm-kmod. It fixes debugfs support for me. Could you please try to mount |
|
Ubuntu 24.04 |
|
Sorry for the delay, @centromere, it’s been a couple of busy weeks. I fixed the implementation of I pushed everything to my freebsd-src branch. Could you please give it another try? That said, the panic you report seems unrelated because it happens further down the call stack. I noticed in your message that you are running Linux 6.11, not 6.8. So perhaps 6.10/6.11 contains fixes or improvements that we are missing? |
09683dc to
11774df
|
Tracked down the cause of the class/uabi_class lookup failure today. It happens due to It looks like |
|
Thank you @amshafer. Is there anything I can do to be of service at this time? |
|
I think one good test would be confirming that the same Linux 6.9 version actually supports this chip. I think we have only confirmed 6.11 so far. Given that there's an issue with requests timing out, it could be a common issue that got fixed in 6.10 or 6.11. |
|
It works on Ubuntu 24.04, which runs kernel version 6.8. |
[Why]
During DP tunnel creation, CM preallocates BW and reduces estimated BW of other DPIA. CM release preallocation only when allocation is complete. Display mode validation logic validates timings based on bw available per host router. In multi display setup, this causes bw allocation failure when allocation greater than estimated bw.
[How]
Do zero alloc to make the CM to release preallocation and update estimated BW correctly for all DPIAs per host router.
Reviewed-by: PeiChen Huang <peichen.huang@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move
on same heap. The basic problem here is that after the move the old
location is simply not available any more.
Some fixes were suggested, but essentially we should call the move
notification before actually moving things because only this way we have
the correct order for DMA-buf and VM move notifications as well.
Also rework the statistic handling so that we don't update the eviction
counter before the move.
v2: add missing NULL check
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 94aeb4117343 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Add VCO speed parameters in the bounding box array.
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why && How]
Screen flickering saw on 4K@60 eDP with high refresh rate external monitor when booting up in DC mode. DC Mode Capping is disabled which caused wrong UCLK being used.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Leo Ma <hanghong.ma@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
preOS will not support display mode programming and link training for UHBR rates.
[how]
If we detect a sink that's UHBR capable, disable seamless boot
Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds a missed handling of PL domain doorbell while
handling VRAM faults.
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Fixes: a6ff969fe9cb ("drm/amdgpu: fix visible VRAM handling during faults")
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Panel replay was enabled by default in commit 5950efe25ee0
("drm/amd/display: Enable Panel Replay for static screen use case"), but
it isn't working properly at least on some BOE and AUO panels. Instead
of being static the screen is solid black when active. As it's a new
feature that was just introduced that regressed VRR disable it for now
so that problem can be properly root caused.
Cc: Tom Chung <chiahsuan.chung@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344
Fixes: 5950efe25ee0 ("drm/amd/display: Enable Panel Replay for static screen use case")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We missed setting the CCS mode during resume and engine resets.
Create a workaround to be added in the engine's workaround list.
This workaround sets the XEHP_CCS_MODE value at every reset.
The issue can be reproduced by running:
$ clpeak --kernel-latency
Without resetting the CCS mode, we encounter a fence timeout:
Fence expiration time out i915-0000:03:00.0:clpeak[2387]:2!
Fixes: 6db31251bb26 ("drm/i915/gt: Enable only one CCS for compute workload")
Reported-by: Gnattu OC <gnattuoc@me.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Tested-by: Gnattu OC <gnattuoc@me.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426000723.229296-1-andi.shyti@linux.intel.com
(cherry picked from commit 4cfca03f76413db115c3cc18f4370debb1b81b2b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Intel hardware is capable of programming the Maud/Naud SDPs on its own based on real-time clocks. While doing so, it takes care of any deviations from the theoretical values. Programming the registers explicitly with static values can interfere with this logic. Therefore, let the HW decide the Maud and Naud SDPs on it's own.
Cc: stable@vger.kernel.org # v5.17
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8097
Co-developed-by: Kai Vehmanen <kai.vehmanen@intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@intel.com>
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430091825.733499-1-chaitanya.kumar.borah@intel.com
(cherry picked from commit 8e056b50d92ae7f4d6895d1c97a69a2a953cf97b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Starting BDB version 239, hdr_dpcd_refresh_timeout is introduced to
backlight BDB data. Commit 700034566d68 ("drm/i915/bios: Define more BDB
contents") updated the backlight BDB data accordingly. This broke the
parsing of backlight BDB data in VBT for versions 236 - 238 (both
inclusive) and hence the backlight controls are not responding on units
with the concerned BDB version.
backlight_control information has been present in backlight BDB data
from at least BDB version 191 onwards, if not before. Hence this patch
extracts the backlight_control information for BDB version 191 or newer.
Tested on Chromebooks using Jasperlake SoC (reports bdb->version = 236).
Tested on Chromebooks using Raptorlake SoC (reports bdb->version = 251).
v2: removed checking the block size of the backlight BDB data
[vsyrjala: this is completely safe thanks to commit e163cfb4c96d
("drm/i915/bios: Make copies of VBT data blocks")]
Fixes: 700034566d68 ("drm/i915/bios: Define more BDB contents")
Cc: stable@vger.kernel.org
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Karthikeyan Ramasubramanian <kramasub@chromium.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221180622.v2.1.I0690aa3e96a83a43b3fc33f50395d334b2981826@changeid
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit c286f6a973c66c0d993ecab9f7162c790e7064c8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The debug print clearly lacks a \n at the end. Add it.
Fixes: 8f86c82aba8b ("drm/connector: demote connector force-probes for non-master clients")
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502153234.1.I2052f01c8d209d9ae9c300b87c6e4f60bd3cc99e@changeid
[Why]
Underflow occurs when running Netflix in a 4k144 eDP + 4k60 HDMI FRL setup. It is caused by latency varying based on the DCFCLK/FCLK state.
[How]
Enable urgent latency adjustment and match the reference to existing ASIC that also see increased latency at low FCLK.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Susanto <nicholas.susanto@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
This fixes a bug introduced by commit c53655545141 ("drm/amd/display: dsc
mst re-compute pbn for changes on hub").
The change caused light-up issues with a second display that required
DSC on some MST docks.
[How]
Use Virtual DPCD for DSC caps in MST case.
[Limitations]
This change only affects MST DSC devices that follow specifications
additional changes are required to check for old MST DSC devices such as
ones which do not check for Virtual DPCD registers.
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
…ual eDP
[Why]
Idle optimizations are blocked if there's more than one eDP connector on the board - blocking S0i3 and IPS2 for static screen.
[How]
Fix the checks to correctly detect number of active eDP. Also restrict the eDP support to panels that have correct feature support.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Some older MST hubs do not report DPCD registers according to
specification.
[How]
This change re-applies commit c53655545141 ("drm/amd/display: dsc mst
re-compute pbn for changes on hub").
With an additional check for these older MST devices.
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
….11 users
Limit the workaround introduced by commit 31729e8c21ec ("drm/amd/pm: fixes
a random hang in S4 for SMU v13.0.4/11") to only run in the s4 path.
Cc: Tim Huang <Tim.Huang@amd.com>
Fixes: 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3351
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It incorrectly claimed a resource isn't CPU visible if it's located at
the very end of CPU visible VRAM.
Fixes: a6ff969fe9cb ("drm/amdgpu: fix visible VRAM handling during faults")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3343
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reported-and-Tested-by: Jeremy Day <jsday@noreason.ca>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
…reeBSD]
`hexdump()` was already redefined earlier in the source file.
Sponsored by: The FreeBSD Foundation
It is implemented in FreeBSD as of ...
Sponsored by: The FreeBSD Foundation
b230476 to
2b36b05
|
@dumbbell Where is the best place to continue troubleshooting? Should I open a new issue here on GitHub? |
|
The problem doesn't seem to be related to any version update, so it might be easier to track it in a dedicated issue indeed. |
Try switching to a console vty and back and see if it goes away. There are reports of various sorts of corruption in the discussion in #332. On 6.10 I still have corrupted colours at startup sometimes, which are gone after a switch to console vty and back. |
|
Ah, I should have mentioned - even the console vty is broken with only green kernel messages being visible, and even that only partially. While at it, I tried to pkg upgrade my framework 16" (amd), since it's on pkgbase with snapshot releases. Currently at ~15alpha4, and with drm-66 or drm-latest built from ports it panics (and I have dumps to prove it :). Where should I drop those core.txt files and whatnot? |
I have the same problem locally. I started to look at this issue.
You can share here if that’s ok with you? |



This is the backport of the DRM drivers from Linux 6.9.
Progress:

Changes in Linux 6.9
You can read this Phoronix article to learn about the changes in the DRM drivers in Linux 6.9:
https://www.phoronix.com/news/Linux-6.9-DRM
Patches to linuxkpi
This update depends on the following patches to linuxkpi in FreeBSD.
These patches are maintained in the following repository and branch:
https://github.com/dumbbell/freebsd-src/tree/drm-related-linuxkpi-changes
Patches were submitted for review:
Firmware updates
There is no associated firmware update for now (to be checked).
How to test
You need to run a recent FreeBSD 15-CURRENT to test it.
Here are some instructions:
You need to check out the FreeBSD src branch I mentioned,
drm-related-linuxkpi-changes, and compile a kernel from that branch:
You need to check out the branch referenced in this pull request and compile it:
Load the relevant driver(s) as you usually do.