Fence instructions have their address field set to X's, which propagates to
cache_req_ready and causes issue stalls. Fix this by setting req_mask to all
zeros so that fences are handled without being affected by X propagation.
Setting req_valid to 0 does not fix the problem, because the LSU only commits
an instruction when a matching response comes back for it.
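
The mask gating described above can be sketched roughly as follows. This is an
illustrative SystemVerilog fragment, not the actual LSU code: the module name,
NUM_LANES, is_fence, and lane_mask are hypothetical, and only req_mask is a
name taken from the message.

```systemverilog
// Hypothetical sketch: every name except req_mask is illustrative only.
module fence_mask_gate #(
    parameter int NUM_LANES = 4              // assumed lane count
) (
    input  logic                 is_fence,   // assumed decode flag for fence ops
    input  logic [NUM_LANES-1:0] lane_mask,  // assumed per-lane active mask
    output logic [NUM_LANES-1:0] req_mask
);
    // A fence carries no meaningful address, so no lane is marked active.
    // Non-fence requests pass their lane mask through unchanged.
    assign req_mask = is_fence ? '0 : lane_mask;
endmodule
```

This only helps if downstream logic qualifies the address bits with req_mask,
which is assumed here.
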
smem_unit stays inside the core, and the two separate buses to the dcache
and smem are exposed at VX_core.
Currently core_wrapper ties req valid to 1'b0, stalling kernels that
read from shared memory.
The core's response ready signal depends on response valid, but the core does
not accept write ACKs. We therefore need to manually assert ready whenever a
valid write response comes in, regardless of the core's ready state (which
would be 0).
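
A minimal sketch of the ready override described above, in SystemVerilog. All
module and signal names are hypothetical and chosen for illustration; in
particular, rsp_is_write stands in for however the wrapper identifies a write
ACK.

```systemverilog
// Hypothetical sketch: all names here are illustrative only.
module write_ack_ready (
    input  logic rsp_valid,       // response valid from the memory side
    input  logic rsp_is_write,    // assumed flag: this response is a write ACK
    input  logic core_rsp_ready,  // ready as driven by the core
    output logic rsp_ready        // ready presented to the memory side
);
    // Accept a valid write ACK unconditionally; for every other response,
    // forward the core's own ready signal.
    assign rsp_ready = (rsp_valid && rsp_is_write) ? 1'b1 : core_rsp_ready;
endmodule
```
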
It seems that many of the initial arch/uarch states, including the GPR, are
uninitialized in the VCS simulation, which results in functional errors caused
by propagated X's. In this particular case, a dcache request was not fired
because the rs1 data for an lw instruction was X, which caused the smem_unit to
arbitrate the request incorrectly.
A workaround for this issue is to stop the X propagation by using the ===
operator instead of == in the GPR unit, which had been the main source of X
propagation into the raddr port of the GPR.
Also, we run the simulation with GSR_RESET set to 1 so that the contents of the
GPR are initialized at the beginning of the simulation (however, this alone does
not prevent reading in X's, hence this fix).
FIXME: This is a slight deviation from the upstream code; ideally, we want to do
a clean and full initialization of microarchitectural states.
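
For reference, the difference between == and === on an X-valued operand can be
seen in a small standalone testbench. This is a generic illustration and not
the GPR code; the module name and the 5-bit raddr width are assumptions.

```systemverilog
// Standalone illustration; not taken from the GPR unit.
module eq_vs_case_eq;
    logic [4:0] raddr;  // assumed 5-bit register read address

    initial begin
        raddr = 'x;  // model an uninitialized read address (all bits X)

        // Logical equality: X bits in an operand make the result unknown.
        $display("raddr ==  0 -> %b", raddr == 5'd0);

        // Case equality: X bits are compared literally, so the result is
        // always a known 0 or 1.
        $display("raddr === 0 -> %b", raddr === 5'd0);
    end
endmodule
```

The first comparison evaluates to x and the second to 0, so a mux or enable
driven by the === result stays at a known value even when raddr is X.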