When commit stage is stalled, LSU ready is deasserted for mem writes
since stores commit immediately; however, the same was not applied to
valid, creating duplicate memory write requests. Fix by guarding both
ready and valid properly.
Setting mem_req_mask to all-zero triggers an assertion error in
mem_scheduler. Instead, disallow initialize-by-x in instruction decode
which is the source of x-propagation. Since this seems to only happen
in VCS, define-gate it accordingly.
This reverts commit a15f4fd483.
Fence instructions have address field set to X's which propagates to
cache_req_ready, causing issue stalls. Fix this by setting req_mask to all-zero
so that they can be handled unaffected by x-propagation.
Setting req_valid to 0 does not fix the problem because the LSU only commits
instructions when they have a matching response coming back.
smem_unit stays inside the core, and the two separate buses to dcache
and smem are exposed at VX_core.
Currently core_wrapper ties req valid to 1'b0, stalling kernels that
reads from sharedmem.
VCS requires the output of the dpi calls to be of a type that can come
at the LHS of a procedural assignment, i.e. reg type. Seems to be a
different requirement from Verilator.
addResource() calls in Chisel BlackBox does not preserve order of the
files being included; the actual compile order for these files are
re-arranged to be in alphabetical order.
Therefore, while VX_gpu_pkg.sv has to be compiled before all the other
modules because it holds the top-level package definition, that order
cannot be ensured from Chisel. As a hacky workaround, simply `include
this file in some of the sv files whose name starts earlier than
VX_gpu_pkg in lexicographical order.