diff --git a/docs/Customization/Dsptools-Blocks.rst b/docs/Customization/Dsptools-Blocks.rst
index 8f8d2d2d..bdf68fc1 100644
--- a/docs/Customization/Dsptools-Blocks.rst
+++ b/docs/Customization/Dsptools-Blocks.rst
@@ -6,7 +6,7 @@ Dsptools is a Chisel library that aids in writing custom signal processing accel
* Structures for packaging DSP blocks and integrating them into a rocketchip-based SoC.
* Test harnesses for testing DSP circuits, as well as VIP-style drivers and monitors for DSP blocks.
-The `Dsptools `_ repository has more documentation.
+The `Dsptools repository `_ has more documentation.
Dsptools Blocks
@@ -16,11 +16,16 @@ It has a AXI4-stream interface and an optional memory interface.
The idea is that these ``DspBlocks`` can be easily designed, unit tested, and assembled lego-style to build complex functionality.
A ``DspChain`` is one example of how to assemble ``DspBlocks``, in which case the streaming interfaces are connected serially into a pipeline, and a bus is instantiated and connected to every block with a memory interface.
-Chipyard has example designs that integrate a ``DspBlock`` to a rocketchip-based SoC as an MMIO peripheral. The custom ``DspBlock`` has a ``ReadQueue`` before it and a ``WriteQueue`` after it, which allow memory mapped access to the streaming interfaces so the rocket core can interact with the ``DspBlock``. This section will primarily focus on designing Tilelink-based peripherals. However, through the resources provided in Dsptools, one could also define an AXI4-based peripheral by following similar steps. Furthermore, the examples here are simple, but can be extended to implement more complex accelerators, for example an `OFDM baseband `_ or a `spectrometer `_.
+Chipyard has example designs that integrate a ``DspBlock`` into a rocketchip-based SoC as an MMIO peripheral. The custom ``DspBlock`` has a ``ReadQueue`` before it and a ``WriteQueue`` after it, which allow memory-mapped access to the streaming interfaces so the rocket core can interact with the ``DspBlock`` [#]_. This section will primarily focus on designing TileLink-based peripherals. However, through the resources provided in Dsptools, one could also define an AXI4-based peripheral by following similar steps. Furthermore, the examples here are simple, but can be extended to implement more complex accelerators, for example an `OFDM baseband `_ or a `spectrometer `_.
-For this example, we will show you how to connect a simple FIR filter created using Dsptools as an MMIO peripheral. The full code can be found in ``generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala``. That being said, one could substitute any module with a ready valid interface in the place of the FIR and achieve the same results. As long as the read and valid signals of the module are attached to those of a corresponding ``DSPBlock`` wrapper, and that wrapper is placed in a chain with a ``ReadQueue`` and a ``WriteQueue``, following the general outline establised by these steps will allow you to interact with that block as a memory mapped IO.
+.. figure:: ../_static/images/fir-block-diagram.svg
+ :align: center
+ :alt: Block diagram showing how FIR is integrated with rocket.
+ :width: 400px
-The module ``GenericFIR`` is the overall wrapper of our FIR module. This module links together a variable number of ``GenericFIRDirectCell`` submodules, each of which performs the computations for one coefficient in a FIR direct form architecture. It is important to note that both modules are type generic, which means that they can be instantiated for any datatype that implements ``Ring`` operations per the specifications on ``T``.
+For this example, we will show you how to connect a simple FIR filter created using Dsptools as an MMIO peripheral as shown in the figure above. The full code can be found in ``generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala``. That being said, one could substitute any module with a ready-valid interface in the place of the FIR and achieve the same results. As long as the ready and valid signals of the module are attached to those of a corresponding ``DspBlock`` wrapper, and that wrapper is placed in a chain with a ``ReadQueue`` and a ``WriteQueue``, following the general outline established by these steps will allow you to interact with that block as memory-mapped IO.
+
+The module ``GenericFIR`` is the overall wrapper of our FIR module. This module links together a variable number of ``GenericFIRDirectCell`` submodules, each of which performs the computations for one coefficient in a FIR direct form architecture. It is important to note that both modules are type-generic, which means that they can be instantiated for any datatype ``T`` that implements ``Ring`` operations (e.g. addition, multiplication, identities).
.. literalinclude:: ../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala
:language: scala
@@ -32,10 +37,10 @@ The module ``GenericFIR`` is the overall wrapper of our FIR module. This module
:start-after: DOC include start: GenericFIRDirectCell chisel
:end-before: DOC include end: GenericFIRDirectCell chisel
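To make the type-generic idea concrete, the following is a minimal behavioral sketch in plain Scala, not Chisel and not the ``GenericFIR`` source. It assumes the standard library's ``Numeric[T]`` as a stand-in for the ``Ring`` constraint, and the names ``FirModel`` and ``fir`` are hypothetical, chosen only for illustration.

.. code-block:: scala

    // Plain-Scala behavioral model of a direct-form FIR (illustration only).
    object FirModel {
      // y[n] = sum over k of coeffs(k) * x[n - k], treating x[n] as zero for n < 0.
      // Numeric[T] stands in for the Ring operations (zero, plus, times).
      def fir[T](coeffs: Seq[T], xs: Seq[T])(implicit num: Numeric[T]): Seq[T] =
        xs.indices.map { n =>
          coeffs.zipWithIndex.foldLeft(num.zero) { case (acc, (c, k)) =>
            // Each coefficient contributes one multiply-accumulate term,
            // mirroring the role of one cell per coefficient.
            if (n - k >= 0) num.plus(acc, num.times(c, xs(n - k))) else acc
          }
        }
    }

The same function works for ``Int``, ``Double``, or any other type with a ``Numeric`` instance, for example ``FirModel.fir(Seq(1, 2, 3), Seq(1, 0, 0, 0))`` or ``FirModel.fir(Seq(0.5, 0.5), Seq(2.0, 4.0))``. A model like this can also serve as a software reference when checking the output of the hardware filter.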
-Creating a DspBlock Extension
------------------------------
+Creating a DspBlock
+-------------------
-The first step in attaching the FIR filter as a MMIO peripheral is to create an abstract extension of ``DspBlock`` the wraps around the ``GenericFIR`` module. The main steps of this process are as follows.
+The first step in attaching the FIR filter as an MMIO peripheral is to create an abstract subclass of ``DspBlock`` that wraps around the ``GenericFIR`` module. Streaming outputs and inputs are packed and unpacked into ``UInt``\ s. If there were control signals, this is where they'd go from raw IOs to memory mapped. The main steps of this process are as follows.
1. Instantiate a ``GenericFIR`` within ``GenericFIRBlock``.
2. Attach the ready and valid signals from the in and out connections.
@@ -47,6 +52,8 @@ The first step in attaching the FIR filter as a MMIO peripheral is to create an
:start-after: DOC include start: GenericFIRBlock chisel
:end-before: DOC include end: GenericFIRBlock chisel
+Note that at this point the ``GenericFIRBlock`` does not have a type of memory interface specified. This abstract class can be used to create different flavors that use AXI4, TileLink, AHB, or whatever other memory interface you like.
+
Connecting DspBlock by TileLink
-------------------------------
With these classes implemented, you can begin to construct the chain by extending ``GenericFIRBlock`` while using the ``TLDspBlock`` trait via mixin.
@@ -56,7 +63,7 @@ With these classes implemented, you can begin to construct the chain by extendin
:start-after: DOC include start: TLGenericFIRBlock chisel
:end-before: DOC include end: TLGenericFIRBlock chisel
-We can then construct the final chain by utilizing the ``TLWriteQueue`` and ``TLReadeQueue`` modules found in ``generators/chipyard/src/main/scala/example/dsptools/DspBlocks.scala``. Inside our chain, we construct an instance of each queue as well as our ``TLGenericFIRBlock``. We then take the ``steamnode`` from each module and wire them all together to link the chain.
+We can then construct the final chain by utilizing the ``TLWriteQueue`` and ``TLReadQueue`` modules found in ``generators/chipyard/src/main/scala/example/dsptools/DspBlocks.scala``. The chain is created by passing a list of factory functions to the constructor of ``TLChain``. The constructor then automatically instantiates these ``DspBlocks``, connects their stream nodes in order, creates a bus, and connects any ``DspBlocks`` that have memory interfaces to the bus.
.. literalinclude:: ../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala
:language: scala
@@ -72,9 +79,7 @@ As in the previous MMIO example, we use a cake pattern to hook up our module to
:start-after: DOC include start: CanHavePeripheryStreamingFIR chisel
:end-before: DOC include end: CanHavePeripheryStreamingFIR chisel
-Note that this is the point at which we decide the datatype for our FIR.
-
-Our module does not need to be connected to concrete IOs or wires, so we do not need to create a concrete trait.
+Note that this is the point at which we decide the datatype for our FIR. You could create different configs that use different types for the FIR, for example a config that instantiates a complex-valued FIR filter.
Constructing the Top and Config
-------------------------------
@@ -116,3 +121,5 @@ Now we can run our simulation.
cd sims/verilator
make CONFIG=StreamingFIRRocketConfig BINARY=../../tests/streaming-fir.riscv run-binary
+
+.. [#] ``ReadQueue`` and ``WriteQueue`` are good illustrations of how to write a ``DspBlock`` and how they can be integrated into rocket, but in a real design a DMA engine would be preferred. ``ReadQueue`` will stall the processor if you try to read an empty queue, and ``WriteQueue`` will stall if you try to write to a full queue, which a DMA engine can more elegantly avoid. Furthermore, a DMA engine can do the work of moving data, freeing the processor to do other useful work (or sleep).
diff --git a/docs/_static/images/fir-block-diagram.svg b/docs/_static/images/fir-block-diagram.svg
new file mode 100644
index 00000000..c56379e5
--- /dev/null
+++ b/docs/_static/images/fir-block-diagram.svg
@@ -0,0 +1 @@
+
\ No newline at end of file