When Maths Get Weird

Shader ALUs

Today let’s talk about ALUs a little.

During shader compilation, GLSL gets serialized into SSA form, which is what ntv operates on when translating it into SPIR-V. An ALU in the context of Zink (specifically ntv) is an algebraic operation which takes a varying number of inputs and generates an output. This is represented in NIR by a struct, nir_alu_instr, which contains the operation type, the inputs, and the output.

When writing GLSL, there’s the general assumption that writing something like 1 + 2 will yield 3, but this is contingent on the driver being able to correctly compile the NIR form of the shader into instructions that the physical hardware runs in order to get that result. In Zink, there’s the need to translate all these NIR instructions into SPIR-V, which is sometimes made trickier by both different semantics between similar GLSL and SPIR-V operations as well as aggressive NIR optimizations.

A deep dive into isnan()

The isnan function checks whether the input is a number. It’s a simple enough functionality to describe, but the implementation and transit through the GLSL->NIR->SPIR-V->NIR pipeline is fraught with perils.

In mesa, isnan(x) is serialized to NIR as fne(x, x), where fne is the operation for float-not-equal, which compares two floats to determine whether they are equal. As such, there’s never actually a case where isnan gets passed through ntv. Let’s see what this looks like in practice with this failing shader test:

// from piglit's fs-isnan-vec2.shader_test for GLSL 1.30
#version 130
uniform vec2 numerator;
uniform vec2 denominator;

void main()
{
  gl_FragColor = vec4(isnan(numerator/denominator), 0.0, 1.0);
}

In Zink, this yields:

shader: MESA_SHADER_FRAGMENT
inputs: 0
outputs: 0
uniforms: 0
ubos: 1
shared: 0
decl_var ubo INTERP_MODE_NONE struct_uniform_0 uniform_0 (~0, 0, 640)
decl_var shader_out INTERP_MODE_NONE vec4 gl_FragColor (FRAG_RESULT_COLOR.xyzw, 4, 0)
decl_function main (0 params)

impl main {
	block block_0:
	/* preds: */
	vec4 32 ssa_1 = load_const (0x00000000 /* 0.000000 */, 0x00000000 /* 0.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /* 1.000000 */)
	vec1 32 ssa_2 = load_const (0x00000000 /* 0.000000 */)
	intrinsic store_output (ssa_1, ssa_2) (8, 15, 0, 160) /* base=8 */ /* wrmask=xyzw */ /* component=0 */ /* type=float32 */	/* gl_FragColor */
	/* succs: block_1 */
	block block_1:
}

As with yesterday’s shader adventure, here’s IRIS as a control:

shader: MESA_SHADER_FRAGMENT
name: GLSL3
inputs: 0
outputs: 1
uniforms: 0
ubos: 1
shared: 0
decl_var uniform INTERP_MODE_NONE vec2 numerator (0, 0, 0)
decl_var uniform INTERP_MODE_NONE vec2 denominator (1, 2, 0)
decl_var ubo INTERP_MODE_NONE vec4[1] uniform_0 (0, 0, 0)
decl_var shader_out INTERP_MODE_NONE vec4 gl_FragColor (FRAG_RESULT_COLOR.xyzw, 4, 0)
decl_function main (0 params)

impl main {
	block block_0:
	/* preds: */
	vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */)
	vec1 32 ssa_1 = load_const (0x3f800000 /* 1.000000 */)
	vec1 32 ssa_2 = load_const (0x00000001 /* 0.000000 */)
	vec4 32 ssa_3 = intrinsic load_ubo (ssa_2, ssa_0) (0, 4, 0) /* access=0 */ /* align_mul=4 */ /* align_offset=0 */
	vec1 32 ssa_6 = frcp ssa_3.z
	vec1 32 ssa_7 = frcp ssa_3.w
	vec1 32 ssa_8 = fmul ssa_3.x, ssa_6
	vec1 32 ssa_9 = fmul ssa_3.y, ssa_7
	vec1 32 ssa_10 = fne32 ssa_8, ssa_8
	vec1 32 ssa_12 = b2f32 ssa_10
	vec1 32 ssa_11 = fne32 ssa_9, ssa_9
	vec1 32 ssa_13 = b2f32 ssa_11
	vec4 32 ssa_14 = vec4 ssa_12, ssa_13, ssa_0, ssa_1
	intrinsic store_output (ssa_14, ssa_0) (4, 15, 0, 160) /* base=4 */ /* wrmask=xyzw */ /* component=0 */ /* type=float32 */	/* gl_FragColor */
	/* succs: block_1 */
	block block_1:
}

This is clearly much different. In particular, note that IRIS retains its fne instructions, but Zink has lost them along the way.

Why is this?

The problem comes from how SPIR-V is translated back to NIR. When emitting fne(a, a) into SPIR-V with OpFOrdNotEqual, the result is that the NaN-ness is ignored, and the NaN value is compared against itself, managing to be equivalent somehow, which breaks the test. This is due to how OpFOrdNotEqual is explicitly used for ordered (numeric).

Using OpFUnordNotEqual for this case has no such issue, as this op always return false if either of the inputs are unordered (NaN).

Written on June 24, 2020