... so that you don't have to evaluate the (warpgroup_id == 0) condition on every loop iteration, which is expensive because of the vx_split/join it incurs.
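To illustrate the point, here is a minimal host-side sketch of hoisting the check out of the loop. It is an assumption about the shape of the code under discussion, not the actual kernel: `producer_step`, `consumer_step`, and both `run_*` functions are hypothetical names, and no Vortex intrinsics are used. On the device, the branch inside the loop is what would pay the vx_split/join cost each iteration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the per-iteration work of each warpgroup.
static inline int producer_step(int x) { return x * 2; }
static inline int consumer_step(int x) { return x + 1; }

// Before: the warpgroup check is re-evaluated on every iteration.
int run_branch_inside(int warpgroup_id, const std::vector<int>& data) {
    int acc = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (warpgroup_id == 0) {
            acc += producer_step(data[i]);
        } else {
            acc += consumer_step(data[i]);
        }
    }
    return acc;
}

// After: the check runs once, and each loop body is branch-free.
int run_branch_hoisted(int warpgroup_id, const std::vector<int>& data) {
    int acc = 0;
    if (warpgroup_id == 0) {
        for (std::size_t i = 0; i < data.size(); ++i) {
            acc += producer_step(data[i]);
        }
    } else {
        for (std::size_t i = 0; i < data.size(); ++i) {
            acc += consumer_step(data[i]);
        }
    }
    return acc;
}
```

Both versions compute the same result; the hoisted one duplicates the loop but evaluates the condition only once per call.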