cascadelake

The plots show the relative difference in runtime, (LoopVectorization.jl - libxsmm) / libxsmm, for every (m, n, k) triplet. Negative values (red) mean LoopVectorization.jl was faster; positive values (blue) mean libxsmm was faster. The Q₁, Q₂, and Q₃ values listed under each plot are the quartiles of this relative difference.
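As a rough illustration of how the relative difference and the Q₁, Q₂, Q₃ summaries could be computed, here is a minimal Python sketch. The runtimes are made-up placeholder numbers, not measurements from these benchmarks, and the helper names `relative_difference` and `quartiles` are hypothetical. The quantile interpolation method behind the reported values is not stated here, so the "inclusive" method below is an assumption.

```python
from statistics import quantiles


def relative_difference(t_lv, t_xsmm):
    """Return (LoopVectorization.jl - libxsmm) / libxsmm for one (m, n, k) triplet."""
    return (t_lv - t_xsmm) / t_xsmm


def quartiles(values):
    """Return (Q1, Q2, Q3); the 'inclusive' interpolation is an assumption."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


# Hypothetical runtimes keyed by (m, n, k): (LoopVectorization.jl, libxsmm).
# These are placeholder values for illustration only, not benchmark data.
runtimes = {
    (2, 2, 2): (10.0, 20.0),
    (4, 4, 4): (30.0, 40.0),
    (8, 8, 8): (90.0, 100.0),
    (16, 16, 16): (120.0, 100.0),
    (32, 32, 32): (300.0, 200.0),
}

# Negative entries mean LoopVectorization.jl was faster for that triplet.
rel = [relative_difference(lv, xsmm) for lv, xsmm in runtimes.values()]
q1, q2, q3 = quartiles(rel)
print(f"Q1 = {q1:.3f}. Q2 = {q2:.3f}. Q3 = {q3:.3f}")
```

Read this way, a negative Q₃ means that at least three quarters of the triplets ran faster with LoopVectorization.jl, while a positive Q₃ means libxsmm was faster for at least a quarter of them.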

1

Q₁ = -0.580. Q₂ = -0.320. Q₃ = 0.154

2

Q₁ = -0.709. Q₂ = -0.504. Q₃ = -0.350

4

Q₁ = -0.701. Q₂ = -0.533. Q₃ = -0.346

8

Q₁ = -0.620. Q₂ = -0.377. Q₃ = -0.177

16

Q₁ = -0.709. Q₂ = -0.513. Q₃ = -0.330