Rescaling Oi has a data dependency on the previous Oi, which is produced
by the PV GEMM, so the rescale and the GEMM must either share a single
pipeline stage or incur a stall. Instead, compute only the rescale
factor in the online softmax stage and apply the rescaling right
before the PV GEMM.
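
The split can be illustrated with a minimal single-row sketch in plain
Python. It is not the kernel code: the function names, the `alpha`
variable, and the `block` parameter are hypothetical and chosen only for
illustration. The softmax stage updates the running max and denominator
and emits `alpha`, never touching the accumulator; the PV stage applies
`alpha` to the previous Oi immediately before accumulating P*V.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_reference(q, K, V):
    """Plain softmax(q K^T) V for one query row, computed in one pass."""
    s = [dot(q, k) for k in K]
    m = max(s)
    p = [math.exp(x - m) for x in s]
    l = sum(p)
    return [sum(p[j] * V[j][c] for j in range(len(V))) / l
            for c in range(len(V[0]))]


def attention_deferred_rescale(q, K, V, block=2):
    """Online softmax; the rescale of Oi is deferred to the PV stage."""
    d = len(V[0])
    m = -math.inf   # running row max
    l = 0.0         # running softmax denominator
    o = [0.0] * d   # running output accumulator (Oi)
    for start in range(0, len(K), block):
        Kb = K[start:start + block]
        Vb = V[start:start + block]

        # Softmax stage: reads and writes m and l only, never o.
        s = [dot(q, k) for k in Kb]
        m_new = max(m, max(s))
        alpha = math.exp(m - m_new)  # rescale factor; 0.0 on the first block
        p = [math.exp(x - m_new) for x in s]
        l = alpha * l + sum(p)
        m = m_new

        # PV stage: apply the deferred rescale, then accumulate P*V.
        o = [alpha * o[c] + sum(p[j] * Vb[j][c] for j in range(len(Vb)))
             for c in range(d)]
    return [x / l for x in o]
```

Because only the PV stage reads and writes the accumulator, the softmax
stage for the next block does not have to wait for the previous PV
result; it only hands over `alpha` and the probabilities.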