Commit Graph

13 Commits

Author | SHA1 | Message | Date
Hansung Kim | 793779aa6c | sgemm_wg: 128x128 config | 2024-04-24 21:10:21 -07:00
Hansung Kim | 6cbfbfb856 | sgemm_wg: Output CPU data to binary | 2024-04-24 21:10:21 -07:00
Hansung Kim | 37a60b1141 | sgemm_wg: Output C result to binary | 2024-04-14 12:36:06 -07:00
Hansung Kim | 3383b70732 | sgemm_wg: Hardcode device address | 2024-04-14 12:36:00 -07:00
Hansung Kim | 510a834db5 | sgemm_wg: Implement software barrier for inter-core synchronization | 2024-03-12 15:34:42 -07:00
Hansung Kim | 6f4dfe5a0e | sgemm_wg: Implement 2D threadtiling | 2024-02-29 14:40:54 -08:00
Hansung Kim | f1e7407d3a | sgemm_wg: Run multiple threadblock per core | 2024-02-27 15:44:04 -08:00
Hansung Kim | d2da0d3394 | sgemm_wg: Parameterize threadblock dimensions | 2024-02-17 18:05:59 -08:00
Hansung Kim | 301f1ca260 | sgemm_wg: Implement blocking over k-dimension | 2024-02-16 16:20:57 -08:00
Hansung Kim | 5f79e8a3f1 | sgemm_wg: reference matmul in cpu | 2024-02-12 22:29:38 -08:00
Hansung Kim | 6b420aceb6 | sgemm_wg: write simple C=A*A matmul | 2024-02-12 22:22:28 -08:00
Hansung Kim | 6a1a506b64 | sgemm_wg: save args and input bin | 2024-02-12 20:49:08 -08:00
Hansung Kim | ad8bf9b223 | Add sgemm_wg C kernel | 2024-02-07 21:31:08 -08:00