x11bridge: tolerate redundant GEM close for multi-plane DRI3 buffers #231
Open
aford173 wants to merge 1 commit into
teohhanhui reviewed Jun 29, 2026
fc278f8 to f70ae41
teohhanhui reviewed Jun 30, 2026
teohhanhui reviewed Jun 30, 2026
teohhanhui reviewed Jun 30, 2026
f70ae41 to e333416
teohhanhui reviewed Jun 30, 2026
e333416 to ea44604
A DRI3 PixmapFromBuffers for a multi-plane buffer (or any buffer whose planes share one backing object) passes several dma-buf fds that all resolve to the same virtgpu GEM handle, since drm_prime_fd_to_handle deduplicates imports of the same object. vgpu_id_from_prime creates one GemHandleFinalizer per fd, so the bridge calls drm_gem_close on that handle once per fd. The first close frees it; each redundant close returns EINVAL (ENOENT on some drivers), which propagated out of process_socket and tore down the whole X11 client.

Refcount the live finalizers per handle: vgpu_id_from_prime increments on import and finalize decrements, closing only when the last one runs. This makes the redundant close impossible rather than swallowing its error, so genuine drm_gem_close errors are still propagated.

The Intel iris native context hands DRI3 a multi-plane CCS-aux buffer and was affected: every GL client (glmark2 and others) died with "XIO: fatal IO error 95" on the first frame; now they run to completion.

Tested on an Intel Raptor Lake-U iGPU (8086:a721), host iGPU bound to both i915 and xe. AMD (Strix Point) and Mali (MediaTek MT8196) do not pass multi-plane buffers this way and were unaffected. The fix is host-driver-independent, as the offending close is against the guest virtio-gpu device.

Signed-off-by: Adam Ford <adam.ford@anodize.com>
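The per-handle refcount described in the commit message could look roughly like the sketch below. It is not the code in this PR: `GemRefs`, `acquire`, and `release` are hypothetical names, and the real bridge would call drm_gem_close where `release` returns `true`.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical bookkeeping: number of live finalizers per GEM handle.
#[derive(Default)]
pub struct GemRefs {
    live: HashMap<u32, usize>,
}

impl GemRefs {
    /// Called once per imported fd, even when several fds map to one handle.
    pub fn acquire(&mut self, handle: u32) {
        *self.live.entry(handle).or_insert(0) += 1;
    }

    /// Called once per finalizer. Returns `true` only for the last
    /// finalizer of a handle, i.e. when the caller should close it.
    pub fn release(&mut self, handle: u32) -> bool {
        match self.live.entry(handle) {
            Entry::Occupied(mut entry) => {
                if *entry.get() > 1 {
                    *entry.get_mut() -= 1;
                    false
                } else {
                    entry.remove();
                    true
                }
            }
            // Unknown handle: nothing to close.
            Entry::Vacant(_) => false,
        }
    }
}
```

With two fds importing handle 7, the first `release(7)` returns `false` and the second returns `true`, so the handle is closed exactly once and any error from that single close can still be reported.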
ea44604 to 6530142
teohhanhui approved these changes Jun 30, 2026
A DRI3 PixmapFromBuffers request for a multi-plane buffer (or any buffer whose planes share one backing object) passes several dma-buf fds that all resolve to the same virtgpu GEM handle, because drm_prime_fd_to_handle deduplicates imports of the same underlying object. vgpu_id_from_prime creates one GemHandleFinalizer per fd, so once the request has been forwarded the bridge calls drm_gem_close on that handle once per fd. The first close frees the handle; each subsequent close of the now-invalid handle returns EINVAL (ENOENT on some drivers), which propagated out of process_socket and tore down the entire X11 client connection.
This is easy to hit with the Intel iris native context, which hands DRI3 a multi-plane (CCS-aux) buffer: glmark2 and other GL clients died with "XIO: fatal IO error 95 ... after 38 requests" the moment the first frame's pixmap was finalized.
Treat EINVAL/ENOENT from the redundant close as success: the handle is already gone, which is exactly the intended outcome.
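As a rough illustration of treating the redundant close as success, the sketch below filters the result of a close call. `tolerate_redundant_close` is a hypothetical helper, not a function from the bridge, and the errno constants are the Linux values written out by hand to keep the example dependency-free.

```rust
use std::io;

// Linux errno values, hard-coded for this sketch.
const ENOENT: i32 = 2;
const EINVAL: i32 = 22;

/// Hypothetical helper: map EINVAL/ENOENT from a GEM close to success
/// and pass every other result through unchanged.
pub fn tolerate_redundant_close(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(e) if matches!(e.raw_os_error(), Some(EINVAL) | Some(ENOENT)) => Ok(()),
        other => other,
    }
}
```

One trade-off of this shape is that an EINVAL or ENOENT from a close that was not redundant is silenced as well.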
Tested on an Intel Raptor Lake-U iGPU (8086:a721) with the iris native context, with the host iGPU bound to both the i915 and the xe kernel driver: before this change every GL client died with the XIO 95 above; after it, glmark2 runs to completion on both drivers (GL_RENDERER "Mesa Intel(R) Graphics (RPL-U)", clean exit, no finalizer/XIO errors). The fix is host-driver-independent, as the offending close is against the guest virtio-gpu device.