Divide into first half and last half for warpgroups 0 and 1, and allocate Q/K and P/V in different banks for parallel access.
38 KiB
38 KiB