Rendering artifact when using ANGLE's vulkan backend with MoltenVK #2418

dboyan · 2025-01-08T09:26:28Z

I'm using qemu+virgl for a 3D-accelerated Linux VM with akihikodaki's patches (details available here). The host-side virgl renderer is backed by ANGLE (which itself can use different backends). I recently tried different ANGLE backends and found that the Vulkan backend produces heavy rendering artifacts with MoltenVK, while ANGLE's default OpenGL backend renders fine. Here are some more details.

Environment

macOS 15.2 on Apple M1 Max
Vulkan SDK 1.3.296
ANGLE: https://chromium.googlesource.com/angle/angle/+/81a41e4f10f960a290d89147c7a152fcd48f09a4
Other components are the same as described here.

Issue description

First, to provide a little more background, the patched qemu on macOS provides 3D acceleration to the guest in the following way:

Guest GL --(virgl)-> Host GLES on EGL --(ANGLE)-> Backends (OpenGL / Metal / Vulkan)
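
For reference, the backend at the end of this chain is normally selected through the attribute list handed to `eglGetPlatformDisplayEXT` when the EGL display is created. The sketch below only builds that attribute list. The enum values are the ones published in the `EGL_ANGLE_platform_angle` extension family (worth double-checking against the headers in your ANGLE checkout), and `angle_display_attribs` is a hypothetical helper, not something taken from qemu, virglrenderer or ANGLE.

```python
# Enum values from the EGL_ANGLE_platform_angle* extensions.
# Assumption: verify them against your ANGLE headers before relying on them.
EGL_NONE = 0x3038
EGL_PLATFORM_ANGLE_ANGLE = 0x3202       # platform argument of eglGetPlatformDisplayEXT
EGL_PLATFORM_ANGLE_TYPE_ANGLE = 0x3203  # attribute key that selects the backend

ANGLE_BACKENDS = {
    "opengl": 0x320D,  # EGL_PLATFORM_ANGLE_TYPE_OPENGL_ANGLE
    "vulkan": 0x3450,  # EGL_PLATFORM_ANGLE_TYPE_VULKAN_ANGLE
    "metal": 0x3489,   # EGL_PLATFORM_ANGLE_TYPE_METAL_ANGLE
}


def angle_display_attribs(backend):
    """Hypothetical helper: build the EGL_NONE-terminated attribute list
    that asks ANGLE for the given backend."""
    if backend not in ANGLE_BACKENDS:
        raise ValueError(f"unknown ANGLE backend: {backend!r}")
    return [EGL_PLATFORM_ANGLE_TYPE_ANGLE, ANGLE_BACKENDS[backend], EGL_NONE]


if __name__ == "__main__":
    # The list a C caller would pass alongside EGL_PLATFORM_ANGLE_ANGLE.
    print([hex(v) for v in angle_display_attribs("vulkan")])
```

Switching between the working and the broken configuration then amounts to changing a single attribute value, which is why the rest of the chain can be held constant while comparing backends.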

When I set ANGLE's backend to Vulkan (through the configuration passed when initializing EGL), the rendering contains heavy artifacts once the VM boots into the operating system. For example, here is a screenshot of the gdm login screen:

while a correct rendering should look like the following:

Preliminary analysis

I'm aware that the rendering process involves a chain of multiple steps and that a mistake in any step can make the result incorrect. Specifically, there are the following possibilities:

1. virgl generates incorrect GLES commands on the host (unlikely, as discussed below)
2. ANGLE's Vulkan backend translates the GLES input incorrectly
3. MoltenVK translates Vulkan to Metal incorrectly

I'm not 100% sure for now, but I'm sharing my data to provide some clarification; it may serve as a pointer for diagnosing the issue further.

I used two capture tools to record the command stream at different stages of the chain. First, I recorded the OpenGL ES command stream at the input of ANGLE (with a heavily adapted version of apitrace). Meanwhile, I used gfxreconstruct to record the Vulkan command stream generated by ANGLE. I attached both files, recorded at the same time from a single VM launch:

GLES+EGL trace from apitrace, can be replayed on linux
Vulkan trace from gfxreconstruct

The first trace renders correctly when replayed directly on Linux. In fact, the second screenshot above was taken from there. This means that step 1 (virgl) is most likely correct. However, the Vulkan trace does not help to an equal degree because it is apparently not portable across platforms (and the desktop doesn't even render when replayed on macOS; I'm not sure whether the issue is with recording or with replay). So for now I cannot tell whether this is ANGLE's or MoltenVK's fault.

Next, I can try to build ANGLE with Vulkan on Linux and check the result of native Vulkan rendering from the same GLES input, but I cannot promise to get it done very soon. Please let me know if I can provide any clarification or other necessary information.


dboyan · 2025-01-11T23:19:20Z

I was able to get some details that might shed some light on the issue, and to narrow it down to a small set of rendering commands. Here is a trimmed apitrace file that reproduces the issue; the rendering commands are in the second frame when viewed with qapitrace.

The incorrect rendering can happen when composing a series of textures onto a larger texture, which I guess is the basis of 2D acceleration. Each iteration of rendering calls in OpenGL ES looks like the following:

For each iteration, only the texture (used by the sampler) and the beginning vertex index (of glDrawArraysInstanced) change; both are circled in the image above. The coordinates of the textures to be composed are passed into the vertex shader through vertex attribute arrays.

It turns out that on (ANGLE over) MoltenVK, some textures are composed at incorrect locations in a series of draw calls, seemingly occupying the place intended for other textures. I captured the intermediate states of one texture composition by using apitrace to read back the offscreen texture right after each relevant API call. For each step, I list the composed image rendered from:

ANGLE over Vulkan (NVIDIA GPU on Linux); I obtain an identical result when replaying the trace on native GLES on Linux
ANGLE over MoltenVK

For easier viewing, I applied a black background and vertically flipped each resulting texture:

Step 1 (call 30306):

Step 2 (call 30328): a "black" texture is rendered in this step, so the effect is invisible

Step 3 (call 30350): the results start to diverge; it seems the "black" texture is misplaced and covers the whole area in MoltenVK

Step 4 (call 30372):

Step 5 (call 30394):

Step 6 (call 30416):

I'm guessing that the vertex indices and textures for multiple draw calls somehow become mismatched when translated into Metal. I might be able to create some minimal GLES or Vulkan sample code to reproduce the problem. Meanwhile, please let me know if there is more information I can share.
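
To illustrate how such a mismatch could produce the symptom above, here is a minimal sketch. It assumes each draw is a 4-vertex triangle fan (Metal has no fan primitive, so a fan has to be expanded into a triangle list somewhere in the translation) and that all quads sit back to back in one shared vertex array. The function names and the quad layout are hypothetical, and this is not MoltenVK's code; it only shows that dropping the per-draw first-vertex offset makes every draw land on the position of quad 0.

```python
VERTS_PER_QUAD = 4  # assumption: every composed texture is drawn as one quad


def fan_to_list(first, count):
    """Expand a triangle fan that starts at vertex `first` into triangle-list
    indices (GL ordering: every triangle shares the fan's first vertex)."""
    return [(first, first + i, first + i + 1) for i in range(1, count - 1)]


def fan_to_list_missing_offset(first, count):
    """Deliberately broken variant: the expansion ignores `first`."""
    return fan_to_list(0, count)


def quads_touched(triangles):
    """Return which quads the given indices read their positions from."""
    return {v // VERTS_PER_QUAD for tri in triangles for v in tri}


if __name__ == "__main__":
    for quad in range(3):
        first = quad * VERTS_PER_QUAD
        good = quads_touched(fan_to_list(first, VERTS_PER_QUAD))
        bad = quads_touched(fan_to_list_missing_offset(first, VERTS_PER_QUAD))
        print(f"draw {quad}: correct -> quad {good}, broken -> quad {bad}")
```

With the broken variant, each draw still binds its own texture but reads the coordinates of the first quad, which would match textures "occupying the place intended for other textures".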

dboyan mentioned this issue Jan 13, 2025

MVKCmdDraw: Fix indirect index for triangle fan topology #2419

Merged

cdavis5e closed this as completed in #2419 Jan 13, 2025
