1. Ensure FatBank prioritizes read Acks over write Acks to the downstream
coalescer
2. Between FatBank and L2, use the new sourceGenerator so that both Read and
Write Reqs share the same pool of available src_ids
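The read-over-write Ack priority in item 1 can be pictured with a small behavioral model. This is only a sketch in Python, not the Chisel implementation; the class name `AckArbiter` and its methods are hypothetical, and it assumes at most one Ack is sent downstream per cycle.

```python
from collections import deque


class AckArbiter:
    """Hypothetical fixed-priority model: read Acks always win over write Acks."""

    def __init__(self):
        self.read_acks = deque()   # pending read Acks, oldest first
        self.write_acks = deque()  # pending write Acks, oldest first

    def push_read(self, ack):
        self.read_acks.append(ack)

    def push_write(self, ack):
        self.write_acks.append(ack)

    def pop(self):
        """Return the Ack to send this cycle, or None if nothing is pending."""
        if self.read_acks:
            return self.read_acks.popleft()
        if self.write_acks:
            return self.write_acks.popleft()
        return None


arb = AckArbiter()
arb.push_write("W0")
arb.push_read("R0")
arb.push_read("R1")
order = [arb.pop(), arb.pop(), arb.pop(), arb.pop()]
```

A write Ack is only sent once no read Ack is pending, so a steady stream of reads can starve writes in this model; whether that matters depends on the traffic pattern.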
With sourceWidth = 1, we hit an unsynchronized vx_wspawn bug, where the
previously spawned warps get killed and overridden by a new vx_wspawn
call before all the warps complete execution. Setting sourceWidth = 1
somehow slows down the progress of the spawned warps relative to warp
0 (presumably because of fetch stalls, though it is unclear why they would
slow down more than warp 0) and triggers this bug. sourceWidth = 4 seems
to work for vecadd.
Previously VortexBundle was being instantiated using the parameters of
the TileLink bundle from VortexTile. This results in tight coupling
between Vortex interface parameters and downstream TileLink parameters.
This change adds a standalone Bundle that is used by the VortexCore wrapper
and is instantiated independently of the TL params, e.g. with different
source widths. Ideally we want to move away from using TL-like
structures for VortexBundle and handle adapter logic completely
outside the core blackbox.
Issues addressed:
1. FatBank ack to downstream coalescer with the correct size on ChannelD
2. FatBank ack to downstream coalescer immediately after W Req
3. FatBank generates unique ID for W Req to L2
4. Allows coalescer to config max Coal to L1 ReadSize at compile time
Ongoing issues:
1. Magic Number
2. Verification
3. Multi-Bank Integration
This fixes a sourceId collision that occurs when naively reusing the tag bits
of a Vortex dmem request as the TL source, which happens because the Vortex core
does not allocate a new LSU entry for writes.
The `VortexSourceGen` module acts as a Vortex tag <-> new TL source ID
converter: it allocates a new ID for every new Vortex request and
restores the original tag bits from the metadata embedded in the
SourceGenerator module.
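The tag <-> source ID conversion described above can be illustrated with a behavioral model. This is a sketch in Python under stated assumptions, not the actual Chisel module: the class `SourceGenModel`, its method names, and the free-list policy are hypothetical. It shows a read and a write that carry the same Vortex tag receiving distinct source IDs, with the original tag restored on the response.

```python
class SourceGenModel:
    """Hypothetical model of a tag <-> source ID converter."""

    def __init__(self, source_width):
        # All 2^source_width IDs start out free.
        self.free_ids = list(range(1 << source_width))
        # Metadata kept per in-flight ID: the original Vortex tag.
        self.meta = {}

    def allocate(self, tag):
        """Allocate a fresh source ID for a request; None means stall."""
        if not self.free_ids:
            return None
        src = self.free_ids.pop(0)
        self.meta[src] = tag
        return src

    def reclaim(self, src):
        """On response, free the ID and return the original tag."""
        tag = self.meta.pop(src)
        self.free_ids.append(src)
        return tag


gen = SourceGenModel(source_width=2)
read_src = gen.allocate(tag=5)   # read request with Vortex tag 5
write_src = gen.allocate(tag=5)  # write reusing the same tag
restored = (gen.reclaim(write_src), gen.reclaim(read_src))
```

Because reads and writes draw from one pool, a small pool stalls new requests sooner: with `source_width=1` in this model, a third outstanding request cannot get an ID until one is reclaimed.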
TODO:
- Decouple sourceWidth of downstream TL nodes from Vortex's tag bit
width; they are set to be the same for convenience as of now
- Apply this to imem requests as well
This makes it trickier to hook the driver up to sbus since we need to
assert its io.start. We still need io.finished coming out of it to tell
when the trace has finished.