Opengl 3.1

Long Weekend

Not really, but I didn’t get around to blogging on Friday because I was working until pretty late on something that’s Kind Of A Big Deal.

Not really, but it’s probably more interesting than my posts about unhandled ALUs.

ARB_uniform_buffer_object support

This extension is one of the last remaining items (along with GL_NV_primitive_restart, which is likely to be done soon as well, and some fixups for GLSL-1.40) required for OpenGL 3.1 support, so I decided to take a break from fixing corner case piglit tests to try doing something useful.

At a very basic level, this extension provides shaders with the ability to declare a “struct” type uniform containing explicitly-defined members that can be referenced normally. Here’s a quick vertex shader example from piglit’s rendering test from the arb_uniform_buffer_object extension tests:

#extension GL_ARB_uniform_buffer_object : require

layout(std140) uniform;
uniform ub_pos_size { vec2 pos; float size; };
uniform ub_rot {float rotation; };

void main()
   mat2 m;
   m[0][0] = m[1][1] = cos(rotation); 
   m[0][1] = sin(rotation); 
   m[1][0] = -m[0][1]; 
   gl_Position.xy = m * gl_Vertex.xy * vec2(size) + pos; = vec2(0, 1);

Seen here, there’s two UBOs passed as inputs, and the shader’s main() function directly references their members to perform a rotation on the passed vertex.

What does this actually mean?

That was my first question. In essence, what it means is that once PIPE_SHADER_CAP_INDIRECT_CONST_ADDR is enabled for the driver, shaders are going to start being compiled that contain instructions to perform UBO loads with offsets, as the “struct” member access is really just loading memory from a buffer at an offset from the base.

There’s two types of indexing that need to be handled:

  • constant - this is like array[1], where the index is explicitly defined
  • dynamic - this is array[i], where the index has been computed by the shader

This type of indexing applies to both uniform block indexing, which determines which UBO is being accessed by the instruction, and uniform block offset, which is the precise region in that UBO being accessed.

At present, only constant block indexing is going to be discussed, though both types of addressing need to be handled for block offsets.

Evaluating the existing code

Let’s check out the core of the implementation:

static void
emit_load_ubo(struct ntv_context *ctx, nir_intrinsic_instr *intr)
   nir_const_value *const_block_index = nir_src_as_const_value(intr->src[0]);
   assert(const_block_index); // no dynamic indexing for now
   assert(const_block_index->u32 == 0); // we only support the default UBO for now

   nir_const_value *const_offset = nir_src_as_const_value(intr->src[1]);
   if (const_offset) {
      SpvId uvec4_type = get_uvec_type(ctx, 32, 4);
      SpvId pointer_type = spirv_builder_type_pointer(&ctx->builder,

      unsigned idx = const_offset->u32;
      SpvId member = emit_uint_const(ctx, 32, 0);
      SpvId offset = emit_uint_const(ctx, 32, idx);
      SpvId offsets[] = { member, offset };
      SpvId ptr = spirv_builder_emit_access_chain(&ctx->builder, pointer_type,
                                                  ctx->ubos[0], offsets,
      SpvId result = spirv_builder_emit_load(&ctx->builder, uvec4_type, ptr);

      SpvId type = get_dest_uvec_type(ctx, &intr->dest);
      unsigned num_components = nir_dest_num_components(intr->dest);
      if (num_components == 1) {
         uint32_t components[] = { 0 };
         result = spirv_builder_emit_composite_extract(&ctx->builder,
                                                       result, components,
      } else if (num_components < 4) {
         SpvId constituents[num_components];
         SpvId uint_type = spirv_builder_type_uint(&ctx->builder, 32);
         for (uint32_t i = 0; i < num_components; ++i)
            constituents[i] = spirv_builder_emit_composite_extract(&ctx->builder,
                                                                   result, &i,

         result = spirv_builder_emit_composite_construct(&ctx->builder,

      if (nir_dest_bit_size(intr->dest) == 1)
         result = uvec_to_bvec(ctx, result, num_components);

      store_dest(ctx, &intr->dest, result, nir_type_uint);
   } else
      unreachable("uniform-addressing not yet supported");

This is the handler for the load_ubo instruction in ntv. It performs a load operation on a previously-emitted UBO variable, using the first parameter (intr->src[0]) as the block index and the second parameter (intr->src[1]) as the block offset, and storing the resulting data that was loaded into the destination (intr->dest).

In this implementation, which is what’s currently in the repo, there’s some assert()s which verify that both of the parameters passed are constant rather than dynamic; as previously-mentioned, this is going to need to change, at least for the block offset. Additionally, the block index is restricted to 0, which I’ll explain a bit later, but it’s a problem.

Work items

So at a minimum, these are the following changes that need to be made:

  • Enable nonzero block indexing so that more than one UBO can be accessed
  • Handle block access using dynamic offsets

As with all projects I decide to tackle, however, these are not going to be the only changes required, as this is going to uncover a small tangle if I try to fix it directly by just removing the assert().

Stay tuned as this saga progresses.

Written on June 29, 2020