Doesn't work, because a 1x2 jagged mapping is required to achieve throughput when storing the larger C matrix (2x4, vs. 2x2 in A).
9.2 KiB