wu-arch/kernels
068d48534efa7a352c00270806006cf8679e7071
kernels/tests/regression/flash_attention
Hansung Kim
068d48534e
flash: Swap S1/S0 to avoid GEMM II - softmax bank conflict
+ remove spurious fences to better overlap GEMM I and DMA
2024-09-11 00:55:36 -07:00
File                Last commit                                                                 Date
.gitignore          flash: gitignore                                                            2024-08-15 21:04:59 -07:00
common.h            flash: Change kernel arg to contain qkv; strip stimulus gen from host code  2024-08-15 21:03:02 -07:00
flash_impl.hpp      flash: Add flag in SMEM for dependency check on O                           2024-09-10 13:42:47 -07:00
half.hpp            Add flash attention kernel skeleton                                         2024-08-14 20:46:09 -07:00
kernel.cpp          flash: Add Gemmini-accelerated kernel                                       2024-09-07 22:40:58 -07:00
kernel.gemmini.cpp  flash: Swap S1/S0 to avoid GEMM II - softmax bank conflict                  2024-09-11 00:55:36 -07:00
main.cpp            flash: Fix load addr for V tile; test with seqlen=128                       2024-08-20 14:34:09 -07:00
Makefile            flash: Fix online softmax for DMA layout                                    2024-09-07 23:21:28 -07:00