8 Commits

Author SHA1 Message Date
Hansung Kim
28f6cd59b5 tensor: Improve commit efficiency by decoupling dpu with fifo
2024-05-26 22:00:25 -07:00

Hansung Kim
864265bda5 tensor: Fix consecutive commits to write to same warp
... by splitting the pending_uops queue across warps.
2024-05-25 20:04:31 -07:00

Hansung Kim
5034d8d14b tensor: Add buffer to hide 2cyc commit latency
Since operand and commit throughput are the same (2 cycles), it is
unnecessary to stall the dpu during the multi-cycle commit.
This enables the dpu to operate at full throughput of 1 operand every 2
cycles.
2024-05-16 20:09:08 -07:00

Hansung Kim
89e7d65926 tensor: Add ready signal to enforce 1 warp occupancy
Currently disabled as the timing behavior is already ~accurate
2024-05-16 15:34:54 -07:00

Richard Yan
d624b3e50a store fencing, large smem, fix tensor core for firesim
2024-05-15 21:45:48 -07:00

joshua
5bd25985c6 i kinda forgot most of changes
2024-05-04 23:01:47 -07:00

joshua
b254281295 initial tcore impl
2024-03-21 01:29:38 -07:00

joshua
978dd3bdfe seemingly working fp32 implementation
2024-03-19 17:56:59 -07:00