If nothing goes wrong, Mesa 23.1 will ship in the next few hours, which means that at last everyone will have a new zink release.
And it’s a big one.
Since I’m expecting lots of people will be testing zink for the first time now that it should be usable for most things, I thought it would be useful to have a post about debugging zink. Or maybe not debugging but issue reporting. Yeah that sounds right.
Zink is a complex driver. It has many components:
- DRI frontend
- Mesa’s GL API
- Gallium state tracker
- zink driver
- ntv compiler
- Vulkan driver underneath
There are many systems at play when zink is running, and malfunctions in any of them may lead to bugs. If you encounter a bug, there are a number of steps you can take to try mitigating its effect or diagnosing it.
Let’s dig in.
Step 1: Threads
Zink tries to use a lot of threads. Sometimes they can cause problems. One of the first steps I take when I encounter an issue is to disable them:
mesa_glthread=false GALLIUM_THREAD=0 ZINK_DEBUG=flushsync MESA_LOADER_DRIVER_OVERRIDE=zink <command>
mesa_glthread=falsedisables glthread, which very occasionally causes issues related to vertex buffers
GALLIUM_THREAD=0disables threaded context, which very occasionally causes issues in everything
ZINK_DEBUG=flushsyncdisables threaded queue submission, which has historically never been an issue but may remove some timing discrepancies to enable bugs to shine through more readily
If none of the above affects your problem, it’s time to move on to the next step.
Step 2: Optimizations
Zink tries a lot of optimizations. Sometimes they can cause problems. One of the first steps I take when I encounter an issue that isn’t fixed by the above is to disable them:
mesa_glthread=false GALLIUM_THREAD=0 ZINK_DEBUG=flushsync,noreorder,norp MESA_LOADER_DRIVER_OVERRIDE=zink <command>
noreorderdisables command reordering, which will probably fix any issue
norpdisables renderpass optimization, which is only enabled by default on tiling GPUs so it’s definitely not impacting anyone reading this
If none of the above affects your problem, then
you’re probably screwed it’s time to move on to the next step.
Step 3: Synchronization
Zink tries to obey Vulkan synchronization rules. Sometimes these rules are difficult to understand and confusing to implement. One of the first steps I take when I encounter an issue that isn’t fixed by the above is to
cry a lot check out this blog post the synchronization. Manually inspecting this is impossible hard, so there’s only a big hammer:
mesa_glthread=false GALLIUM_THREAD=0 ZINK_DEBUG=flushsync,noreorder,norp,sync MESA_LOADER_DRIVER_OVERRIDE=zink <command>
syncforces full memory synchronization before every command, which will solve the problem of extant FPS
If none of the above affects your problem, then I totally have more ideas, and I’m gonna blog about them right here, but I gotta go, um, get something. From my…refrigerator. It’s just down the street I’ll berightback.
Step 4: Pray
Zink tries to conform to Vulkan API rules. Sometimes these rules are obfuscated by a trillion Valid Usage statements that nobody has time to read or the mental capacity to comprehend in their totality.
And when a problem appears that cannot be mitigated with any of the above, we must pray to our lord and savior the Vulkan Validation Layer for guidance:
mesa_glthread=false GALLIUM_THREAD=0 ZINK_DEBUG=flushsync,noreorder,norp,sync,validation MESA_LOADER_DRIVER_OVERRIDE=zink <command>
validationenables VVL at runtime, which will (hopefully) explain how I fucked up
If this doesn’t fix your problem,
Step 5: Acceptance
You should file a ticket if you find an issue. But if any of the above provided some additional info, providing that can reduce the time it takes to resolve that issue.