Two posts in one month is a record for May of 2022. Might even shoot for three at this rate.
Yesterday I posted a hasty roundup of what’s been going on with zink.
It was not comprehensive.
What else have I been working on, you might ask.
Introducing A New Zink-enabled Platform
We all know I’m champing at the bit to further my goal of world domination with zink on all platforms, so it was only a matter of time before I branched out from the big two-point-five of desktop GPU vendors (NVIDIA, AMD, and also maybe Intel sometimes).
Thus it was that a glorious bounty descended from on high at Valve: a shiny new A630 Coachz Chromebook that runs on the open source Qualcomm driver, Turnip.
How did the initial testing go?
My initial combined, all-the-tests-at-once CTS run (KHR46, 4.6 confidential, all dEQP) yielded over 1500 crashes and over 3000 failures.
I then accidentally bricked my kernel by foolishly allowing my distro to upgrade it for me, at which point I defenestrated the device.
There were many problems, but two were glaring:
- Maximum of 4 descriptor sets allowed
- No 64-bit support
It’s tough to say which issue is a bigger problem. As zinkologists know, the preferred minimum number of descriptor sets is six:
Thus, almost all the crashes I encountered were from the tests which attempted to use shader images, as they access an out-of-bounds set.
On the other hand, while there was an amount of time in which I had to use zink without 64-bit support on my Intel machine before emulation was added at the driver level, Intel’s driver has always been helpfully tolerant of my continuing to jam 64-bit i/o into shaders without adverse effects since the backend compiler supported such operations. This was not the case with Turnip, and all I got for my laziness was crashing and failing.
Both problems were varying degrees of excruciating to fix, so I chose the one I thought would be worse to start off: descriptors.
We all know zink has various modes for its descriptor management:
- caching without templates
The default is currently the caching mode, which calculates hash values for all the descriptors being used for a draw/dispatch and then stores the sets for reuse. My plan was to collapse the like sets: merging UBOs and SSBOs into one set as well as Samplers and Images into another set. This would use a total of four sets including the bindless one, opening up all the features.
The project actually turniped out to be easier than expected. All I needed to do was add indirection to the existing set indexing, then modify the indirect indices on startup based on how many sets were available. In this way, accessing e.g.,
descriptor_set[SSBO] would actually access
descriptor_set[UBO] in the new compact mode, and everything else would work the same.
It was then that I remembered the tricky part of doing anything descriptor-related: updating the caching mechanism.
In time, I finally managed to settle on just merging the hash values from the merged sets just as carefully as the fusion dance must be performed, and things seemed to be working, bringing results down to just over 1000 total CTS fails.
The harder problem was actually the easier one.
Which means I’m gonna need another post.