unifolm-world-model-action/scripts
olivame a2cd34dd51 1. einsum('b i d, b j d -> b i j') → torch.bmm(q, k.transpose(-1,-2)), which maps directly to a rocBLAS batched GEMM
2. baddbmm fuses the scale into the GEMM, saving one kernel launch
3. The second einsum is likewise replaced with torch.bmm
Each round runs 1 to 2 seconds faster
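
The three changes in this commit message can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names `attention_einsum` and `attention_bmm`, the softmax between the two products, and the second einsum pattern `'b i j, b j d -> b i d'` are assumptions, since the commit message only names the first pattern.

```python
import torch


def attention_einsum(q, k, v, scale):
    # Before (assumed form): two einsum contractions and a separate scale multiply.
    sim = torch.einsum('b i d, b j d -> b i j', q, k) * scale
    attn = sim.softmax(dim=-1)
    # The second einsum pattern is an assumption; the commit does not spell it out.
    return torch.einsum('b i j, b j d -> b i d', attn, v)


def attention_bmm(q, k, v, scale):
    # After: batched matrix multiplies that map onto a batched GEMM.
    b, i, _ = q.shape
    j = k.shape[1]
    # With beta=0 the contents of this buffer are ignored; it only supplies shape/dtype.
    placeholder = torch.empty(b, i, j, dtype=q.dtype, device=q.device)
    # baddbmm computes beta * placeholder + alpha * (q @ k^T), so the scale
    # is applied inside the GEMM instead of in a separate elementwise kernel.
    sim = torch.baddbmm(placeholder, q, k.transpose(-1, -2), beta=0, alpha=scale)
    attn = sim.softmax(dim=-1)
    # Second product: (b, i, j) @ (b, j, d) -> (b, i, d)
    return torch.bmm(attn, v)


if __name__ == "__main__":
    torch.manual_seed(0)
    q = torch.randn(2, 5, 8)
    k = torch.randn(2, 7, 8)
    v = torch.randn(2, 7, 8)
    scale = 8 ** -0.5
    out_old = attention_einsum(q, k, v, scale)
    out_new = attention_bmm(q, k, v, scale)
    print(torch.allclose(out_old, out_new, atol=1e-5))
```

Both functions expect 3-D tensors with the batch (or batch times heads) dimension first. Any speedup depends on the backend and tensor shapes; the figure quoted above is the commit author's measurement.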
2026-02-08 18:54:48 +00:00