tensor: Add buffer to hide 2cyc commit latency

Operand and commit throughput are the same (one every 2 cycles), so
there is no need to stall the dpu during the multi-cycle commit.
The added buffer lets the dpu run at its full throughput of 1 operand
every 2 cycles.
Hansung Kim
2024-05-16 20:07:30 -07:00
parent 317695a8d0
commit 5034d8d14b
3 changed files with 25 additions and 4 deletions


@@ -391,7 +391,7 @@
// Tensor Core Latency
`ifndef LATENCY_HMMA
-`define LATENCY_HMMA 2
+`define LATENCY_HMMA 8
`endif
// Icache Configurable Knobs //////////////////////////////////////////////////