Queries

The next problem

With xfb defeated, it’s time to move on to the next big issue: improving query handling.

The existing implementation of queries in Zink has a few problems:

They’re limited to 100 queries (50 for xfb) before it’s necessary to get the result of a query before hitting an assert()
There’s a number of issues related to API correctness, so running in a validator will generate tons of errors (though mostly they’re harmless in the sense that the code still works as expected)

Ideally, queries shouldn’t trigger abort() due to pool depletion, and they shouldn’t trigger validator errors, however, so this needs some work.

Step one: understanding the problems

Let’s check out some of the existing code.

static bool
zink_begin_query(struct pipe_context *pctx,
                 struct pipe_query *q)
{
   struct zink_context *ctx = zink_context(pctx);
   struct zink_query *query = (struct zink_query *)q;

   /* ignore begin_query for timestamps */
   if (query->type == PIPE_QUERY_TIMESTAMP)
      return true;

   /* TODO: resetting on begin isn't ideal, as it forces render-pass exit...
    * should instead reset on creation (if possible?)... Or perhaps maintain
    * the pool in the batch instead?
    */
   struct zink_batch *batch = zink_batch_no_rp(zink_context(pctx));
   vkCmdResetQueryPool(batch->cmdbuf, query->query_pool, 0, MIN2(query->curr_query + 1, query->num_queries));
   query->curr_query = 0;

   begin_query(batch, query);
   list_addtail(&query->active_list, &ctx->active_queries);

   return true;
}

Here’s a function that’s called any time a new query is begun by an application.

Notice here that vkCmdResetQueryPool is being called here rather than alongside vkCreateQueryPool. According to Vulkan spec, each query must be reset before being used (which is why the call is here), but with this usage, any time a query is stopped without its results being returned, those results are lost because the query has been reset.

Also, as noted in the comment, because there’s a reset here, the current renderpass gets flushed in order to comply with Vulkan spec, which requires that reset be called outside of a renderpass. This causes a slowdown, which isn’t optimal.

static void
end_query(struct zink_batch *batch, struct zink_query *q)
{
   assert(q->type != PIPE_QUERY_TIMESTAMP);
   vkCmdEndQuery(batch->cmdbuf, q->query_pool, q->curr_query);
   if (++q->curr_query == q->num_queries) {
      assert(0);
      /* need to reset pool! */
   }
}

This is a function that’s called any time a query is ended (more on the multiple meanings of this in a future post) either internally or by an application.

The idea here is that the query is ended, so the “current” query id is incremented in preparation for the next query, as all queries are tied to an id value. If the incremented query id reaches the size of the query pool, then the code triggers an assert() since vkCmdResetQueryPool needs to be emitted, but that can’t happen without discarding the existing query results (as already happens above).

Step two: devising solutions

Ideally, what needs to happen here is:

Handle pool overflows by triggering a reset without discarding existing query results
Move reset commands out of zink_begin_query()

This turns out to be significantly more difficult than it sounds due to a number of other constraints in the driver, however. Stay tuned for more query-related problem solving!

Written on June 12, 2020