Revamp the config system for Top/Harness (#347)

* Refactor how Configs parameterize the Top and TestHarnesses

* Bump sha3, testchipip, icenet, firesim
This commit is contained in:
Jerry Zhao
2020-01-21 20:44:54 -08:00
committed by GitHub
parent 1786b9a7f4
commit ac5235e5ed
35 changed files with 1087 additions and 858 deletions

View File

@@ -1,261 +0,0 @@
.. _adding-an-accelerator:
Adding an Accelerator/Device
===============================
Accelerators or custom IO devices can be added to your SoC in several ways:
* MMIO Peripheral (a.k.a TileLink-Attached Accelerator)
* Tightly-Coupled RoCC Accelerator
These approaches differ in the method of the communication between the processor and the custom block.
With the TileLink-Attached approach, the processor communicates with MMIO peripherals through memory-mapped registers.
In contrast, the processor communicates with a RoCC accelerators through a custom protocol and custom non-standard ISA instructions reserved in the RISC-V ISA encoding space.
Each core can have up to four accelerators that are controlled by custom instructions and share resources with the CPU.
RoCC coprocessor instructions have the following form.
.. code-block:: none
customX rd, rs1, rs2, funct
The X will be a number 0-3, and determines the opcode of the instruction, which controls which accelerator an instruction will be routed to.
The ``rd``, ``rs1``, and ``rs2`` fields are the register numbers of the destination register and two source registers.
The ``funct`` field is a 7-bit integer that the accelerator can use to distinguish different instructions from each other.
Note that communication through a RoCC interface requires a custom software toolchain, whereas MMIO peripherals can use that standard toolchain with appropriate driver support.
Integrating into the Generator Build System
-------------------------------------------
While developing, you want to include Chisel code in a submodule so that it can be shared by different projects.
To add a submodule to the Chipyard framework, make sure that your project is organized as follows.
.. code-block:: none
yourproject/
build.sbt
src/main/scala/
YourFile.scala
Put this in a git repository and make it accessible.
Then add it as a submodule to under the following directory hierarchy: ``generators/yourproject``.
.. code-block:: shell
cd generators/
git submodule add https://git-repository.com/yourproject.git
Then add ``yourproject`` to the Chipyard top-level build.sbt file.
.. code-block:: scala
lazy val yourproject = (project in file("generators/yourproject")).settings(commonSettings).dependsOn(rocketchip)
You can then import the classes defined in the submodule in a new project if
you add it as a dependency. For instance, if you want to use this code in
the ``example`` project, change the final line in build.sbt to the following.
.. code-block:: scala
lazy val example = (project in file(".")).settings(commonSettings).dependsOn(testchipip, yourproject)
MMIO Peripheral
------------------
The easiest way to create a TileLink peripheral is to use the ``TLRegisterRouter``, which abstracts away the details of handling the TileLink protocol and provides a convenient interface for specifying memory-mapped registers.
To create a RegisterRouter-based peripheral, you will need to specify a parameter case class for the configuration settings, a bundle trait with the extra top-level ports, and a module implementation containing the actual RTL.
In this case we use a submodule ``PWMBase`` to actually perform the pulse-width modulation. The ``PWMModule`` class only creates the registers and hooks them
up using ``regmap``.
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
:language: scala
:start-after: DOC include start: PWM generic traits
:end-before: DOC include end: PWM generic traits
Once you have these classes, you can construct the final peripheral by extending the ``TLRegisterRouter`` and passing the proper arguments.
The first set of arguments determines where the register router will be placed in the global address map and what information will be put in its device tree entry.
The second set of arguments is the IO bundle constructor, which we create by extending ``TLRegBundle`` with our bundle trait.
The final set of arguments is the module constructor, which we create by extends ``TLRegModule`` with our module trait.
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
:language: scala
:start-after: DOC include start: PWMTL
:end-before: DOC include end: PWMTL
The full module code can be found in ``generators/example/src/main/scala/PWM.scala``.
After creating the module, we need to hook it up to our SoC.
Rocket Chip accomplishes this using the cake pattern.
This basically involves placing code inside traits.
In the Rocket Chip cake, there are two kinds of traits: a ``LazyModule`` trait and a module implementation trait.
The ``LazyModule`` trait runs setup code that must execute before all the hardware gets elaborated.
For a simple memory-mapped peripheral, this just involves connecting the peripheral's TileLink node to the MMIO crossbar.
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
:language: scala
:start-after: DOC include start: HasPeripheryPWMTL
:end-before: DOC include end: HasPeripheryPWMTL
Note that the ``PWMTL`` class we created from the register router is itself a ``LazyModule``.
Register routers have a TileLink node simply named "node", which we can hook up to the Rocket Chip bus.
This will automatically add address map and device tree entries for the peripheral.
The module implementation trait is where we instantiate our PWM module and connect it to the rest of the SoC.
Since this module has an extra `pwmout` output, we declare that in this trait, using Chisel's multi-IO functionality.
We then connect the ``PWMTL``'s pwmout to the pwmout we declared.
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
:language: scala
:start-after: DOC include start: HasPeripheryPWMTLModuleImp
:end-before: DOC include end: HasPeripheryPWMTLModuleImp
Now we want to mix our traits into the system as a whole.
This code is from ``generators/example/src/main/scala/Top.scala``.
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
:language: scala
:start-after: DOC include start: TopWithPWMTL
:end-before: DOC include end: TopWithPWMTL
Just as we need separate traits for ``LazyModule`` and module implementation, we need two classes to build the system.
The ``Top`` classes already have the basic peripherals included for us, so we will just extend those.
The ``Top`` class includes the pre-elaboration code and also a ``lazy val`` to produce the module implementation (hence ``LazyModule``).
The ``TopModule`` class is the actual RTL that gets synthesized.
Next, we need to add a configuration mixin in ``generators/example/src/main/scala/ConfigMixins.scala`` that tells the ``TestHarness`` to instantiate ``TopWithPWMTL`` instead of the default ``Top``.
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: WithPWMTop
:end-before: DOC include end: WithPWMTop
And finally, we create a configuration class in ``generators/example/src/main/scala/Configs.scala`` that uses this mixin.
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: PWMRocketConfig
:end-before: DOC include end: PWMRocketConfig
Now we can test that the PWM is working. The test program is in ``tests/pwm.c``.
.. literalinclude:: ../../tests/pwm.c
:language: c
This just writes out to the registers we defined earlier.
The base of the module's MMIO region is at 0x2000.
This will be printed out in the address map portion when you generated the verilog code.
Compiling this program with make produces a ``pwm.riscv`` executable.
Now with all of that done, we can go ahead and run our simulation.
.. code-block:: shell
cd sims/verilator
make CONFIG=PWMRocketConfig TOP=TopWithPWMTL
./simulator-example-PWMRocketConfig ../../tests/pwm.riscv
Adding a RoCC Accelerator
----------------------------
RoCC accelerators are lazy modules that extend the ``LazyRoCC`` class.
Their implementation should extends the ``LazyRoCCModule`` class.
.. code-block:: scala
class CustomAccelerator(opcodes: OpcodeSet)
(implicit p: Parameters) extends LazyRoCC(opcodes) {
override lazy val module = new CustomAcceleratorModule(this)
}
class CustomAcceleratorModule(outer: CustomAccelerator)
extends LazyRoCCModuleImp(outer) {
val cmd = Queue(io.cmd)
// The parts of the command are as follows
// inst - the parts of the instruction itself
// opcode
// rd - destination register number
// rs1 - first source register number
// rs2 - second source register number
// funct
// xd - is the destination register being used?
// xs1 - is the first source register being used?
// xs2 - is the second source register being used?
// rs1 - the value of source register 1
// rs2 - the value of source register 2
...
}
The ``opcodes`` parameter for ``LazyRoCC`` is the set of custom opcodes that will map to this accelerator.
More on this in the next subsection.
The ``LazyRoCC`` class contains two TLOutputNode instances, ``atlNode`` and ``tlNode``.
The former connects into a tile-local arbiter along with the backside of the L1 instruction cache.
The latter connects directly to the L1-L2 crossbar.
The corresponding Tilelink ports in the module implementation's IO bundle are ``atl`` and ``tl``, respectively.
The other interfaces available to the accelerator are ``mem``, which provides access to the L1 cache;
``ptw`` which provides access to the page-table walker;
the ``busy`` signal, which indicates when the accelerator is still handling an instruction;
and the ``interrupt`` signal, which can be used to interrupt the CPU.
Look at the examples in ``generators/rocket-chip/src/main/scala/tile/LazyRocc.scala`` for detailed information on the different IOs.
Adding RoCC accelerator to Config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RoCC accelerators can be added to a core by overriding the ``BuildRoCC`` parameter in the configuration.
This takes a sequence of functions producing ``LazyRoCC`` objects, one for each accelerator you wish to add.
For instance, if we wanted to add the previously defined accelerator and route custom0 and custom1 instructions to it, we could do the following.
.. code-block:: scala
class WithCustomAccelerator extends Config((site, here, up) => {
case BuildRoCC => Seq((p: Parameters) => LazyModule(
new CustomAccelerator(OpcodeSet.custom0 | OpcodeSet.custom1)(p)))
})
class CustomAcceleratorConfig extends Config(
new WithCustomAccelerator ++ new RocketConfig)
To add RoCC instructions in your program, use the RoCC C macros provided in ``tests/rocc.h``. You can find examples in the files ``tests/accum.c`` and ``charcount.c``.
Adding a DMA port
-------------------
For IO devices or accelerators (like a disk or network driver), instead of
having the CPU poll data from the device, we may want to have the device write
directly to the coherent memory system instead. For example, here is a device
that writes zeros to the memory at a configured address.
.. literalinclude:: ../../generators/example/src/main/scala/InitZero.scala
:language: scala
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
:language: scala
:start-after: DOC include start: TopWithInitZero
:end-before: DOC include end: TopWithInitZero
We use ``TLHelper.makeClientNode`` to create a TileLink client node for us.
We then connect the client node to the memory system through the front bus (fbus).
For more info on creating TileLink client nodes, take a look at :ref:`Client Node`.
Once we've created our top-level module including the DMA widget, we can create a configuration for it as we did before.
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: WithInitZero
:end-before: DOC include end: WithInitZero
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: InitZeroRocketConfig
:end-before: DOC include end: InitZeroRocketConfig

View File

@@ -0,0 +1,59 @@
.. _custom_chisel:
Integrating Custom Chisel Projects into the Generator Build System
==================================================================
.. warning::
This section assumes integration of custom Chisel through git submodules.
While it is possible to directly commit custom Chisel into the Chipyard framework,
we heavily recommend managing custom code through git submodules. Using submodules decouples
development of custom features from development on the Chipyard framework.
While developing, you want to include Chisel code in a submodule so that it can be shared by different projects.
To add a submodule to the Chipyard framework, make sure that your project is organized as follows.
.. code-block:: none
yourproject/
build.sbt
src/main/scala/
YourFile.scala
Put this in a git repository and make it accessible.
Then add it as a submodule to under the following directory hierarchy: ``generators/yourproject``.
The ``build.sbt`` is a minimal file which describes metadata for a Chisel project.
For a simple project, the ``build.sbt`` can even be empty, but below we provide an example
build.sbt.
.. code-block:: scala
organization := "edu.berkeley.cs"
version := "1.0"
name := "yourproject"
scalaVersion := "2.12.4"
.. code-block:: shell
cd generators/
git submodule add https://git-repository.com/yourproject.git
Then add ``yourproject`` to the Chipyard top-level build.sbt file.
.. code-block:: scala
lazy val yourproject = (project in file("generators/yourproject")).settings(commonSettings).dependsOn(rocketchip)
You can then import the classes defined in the submodule in a new project if
you add it as a dependency. For instance, if you want to use this code in
the ``example`` project, change the final line in build.sbt to the following.
.. code-block:: scala
lazy val example = (project in file(".")).settings(commonSettings).dependsOn(testchipip, yourproject)

View File

@@ -0,0 +1,39 @@
.. _dma-devices:
Adding a DMA Device
===================
DMA devices are Tilelink widgets which act as masters. In other words,
DMA devices can send their own read and write requests to the chip's memory
system.
For IO devices or accelerators (like a disk or network driver), instead of
having the CPU poll data from the device, we may want to have the device write
directly to the coherent memory system instead. For example, here is a device
that writes zeros to the memory at a configured address.
.. literalinclude:: ../../generators/example/src/main/scala/InitZero.scala
:language: scala
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
:language: scala
:start-after: DOC include start: Top
:end-before: DOC include end: Top
We use ``TLHelper.makeClientNode`` to create a TileLink client node for us.
We then connect the client node to the memory system through the front bus (fbus).
For more info on creating TileLink client nodes, take a look at :ref:`Client Node`.
Once we've created our top-level module including the DMA widget, we can create a configuration for it as we did before.
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: WithInitZero
:end-before: DOC include end: WithInitZero
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: InitZeroRocketConfig
:end-before: DOC include end: InitZeroRocketConfig

View File

@@ -8,8 +8,7 @@ design flows. Fortunately, both Chisel and Chipyard provide extensive
support for Verilog integration.
Here, we will examine the process of incorporating an MMIO peripheral
(similar to the PWM example from the previous section) that uses a
Verilog implementation of Greatest Common Denominator (GCD)
that uses a Verilog implementation of Greatest Common Denominator (GCD)
algorithm. There are a few steps to adding a Verilog peripheral:
* Adding a Verilog resource file to the project
@@ -58,7 +57,7 @@ and Verilog sources follow the prescribed directory layout.
build.sbt
src/main/
scala/
GCDMMIOBlackBox.scala
GCD.scala
resources/
vsrc/
GCDMMIOBlackBox.v
@@ -89,7 +88,7 @@ as the bitwidth of the GCD calculation does in this example.
**Chisel BlackBox Definition**
.. literalinclude:: ../../generators/example/src/main/scala/GCDMMIOBlackBox.scala
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD blackbox
:end-before: DOC include end: GCD blackbox
@@ -102,54 +101,32 @@ diplomatic memory mapping on the system bus, we still have to
integrate the peripheral at the Chisel level by mixing
peripheral-specific traits into a ``TLRegisterRouter``. The ``params``
member and ``HasRegMap`` base trait should look familiar from the
previous memory-mapped PWM device example.
previous memory-mapped GCD device example.
.. literalinclude:: ../../generators/example/src/main/scala/GCDMMIOBlackBox.scala
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD instance regmap
:end-before: DOC include end: GCD instance regmap
Advanced Features of RegField Entries
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
One significant difference from the PWM example is in the peripheral's
memory map. ``RegField`` exposes polymorphic ``r`` and ``w`` methods
that allow read- and write-only memory-mapped registers to be
interfaced to hardware in multiple ways.
* ``RegField.r(2, status)`` is used to create a 2-bit, read-only register that captures the current value of the ``status`` signal when read.
* ``RegField.r(params.width, gcd)`` "connects" the decoupled handshaking interface ``gcd`` to a read-only memory-mapped register. When this register is read via MMIO, the ``ready`` signal is asserted. This is in turn connected to ``output_ready`` on the Verilog blackbox through the glue logic.
* ``RegField.w(params.width, x)`` exposes a plain register (much like those in the PWM example) via MMIO, but makes it write-only.
* ``RegField.w(params.width, y)`` associates the decoupled interface signal ``y`` with a write-only memory-mapped register, causing ``y.valid`` to be asserted when the register is written.
Since the ready/valid signals of ``y`` are connected to the
``input_ready`` and ``input_valid`` signals of the blackbox,
respectively, this register map and glue logic has the effect of
triggering the GCD algorithm when ``y`` is written. Therefore, the
algorithm is set up by first writing ``x`` and then performing a
triggering write to ``y``. Polling can be used for status checks.
Defining a Chip with a GCD Peripheral
Defining a Chip with a BlackBox
---------------------------------------
As with the PWM example, a few more pieces are needed to tie the system together.
Since we've parameterized the GCD instantiation to choose between the
Chisel and the verilog module, creating a config is easy.
**Composing traits into a complete cake pattern peripheral**
.. literalinclude:: ../../generators/example/src/main/scala/GCDMMIOBlackBox.scala
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: GCD cake
:end-before: DOC include end: GCD cake
:start-after: DOC include start: GCDAXI4BlackBoxRocketConfig
:end-before: DOC include end: GCDAXI4BlackBoxRocketConfig
Note the differences arising due to the fact that this peripheral has
no top-level IO. To build a complete system, a new ``Top`` and new
``Config`` objects are added in a manner exactly analogous to the PWM
example.
You can play with the parameterization of the mixin to choose a TL/AXI4, BlackBox/Chisel
version of the GCD.
Software Testing
----------------
The GCD module has a slightly more complex interface, so polling is
The GCD module has a more complex interface, so polling is
used to check the status of the device before each triggering read or
write.

View File

@@ -0,0 +1,106 @@
.. _keys-traits-configs:
Keys, Traits, and Configs
=========================
You have probably seen snippets of Chisel referencing Keys, Traits, and Configs by this point.
This section aims to elucidate the interactions between these Chisel/Scala components, and provide
best practices for how these should be used to create a parameterized design and configure it.
We will continue to use the GCD example.
Keys
----
Keys specify some parameter which controls some custom widget. Keys should typically be implemented as **Option types**, with a default value of ``None`` that means no change in the system. In other words, the default behavior when the user does not explicitly set the key should be a no-op.
Keys should be defined and documented in sub-projects, since they generally deal with some specific block, and not system-level integration. (We make an exception for the example GCD widget).
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD key
:end-before: DOC include end: GCD key
The object within a key is typically a ``case class XXXParams``, which defines a set of parameters which some block accepts. For example, the GCD widget's ``GCDParams`` parameterizes its address, operand widths, whether the widget should be connected by Tilelink or AXI4, and whether the widget should use the blackbox-verilog implementation, or the Chisel implementation.
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD params
:end-before: DOC include end: GCD params
Accessing the value stored in the key is easy in Chisel, as long as the ``implicit p: Parameters`` object is being passed through to the relevant module. For example, ``p(GCDKey).get.address`` returns the address field of ``GCDParams``. Note this only works if ``GCDKey`` was not set to ``None``, so your Chisel should check for that case!
Traits
------
Typically, most custom blocks will need to modify the behavior of some pre-existing block. For example, the GCD widget needs the ``Top`` module to instantiate and connect the widget via Tilelink, generate a top-level ``gcd_busy`` port, and connect that to the module as well. Traits let us do this without modifying the existing code for the ``Top``, and enables compartmentalization of code for different custom blocks.
Top-level traits specify that the ``Top`` has been parameterized to read some custom Key and optionally instantiate and connect a widget defined by that Key. Traits **should not** mandate the instantiation of custom logic. In other words, traits should be written with ``CanHave`` semantics, where the default behavior when the Key is unset is a no-op.
Top-level traits should be defined and documented in subprojects, alongside their corresponding Keys. The traits should then be added to the ``Top`` being used by Chipyard.
Below we see the traits for the GCD example. The Lazy trait connects the GCD module to the Diplomacy graph, while the Implementation trait causes the ``Top`` to instantiate an additional port and concretely connect it to the GCD module.
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD lazy trait
:end-before: DOC include end: GCD imp trait
These traits are added to the default ``Top`` in Chipyard.
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
:language: scala
:start-after: DOC include start: Top
:end-before: DOC include end: Top
Mixins
------
Mixins set the keys to a non-default value. Together, the collection of Mixins which define a configuration generate the values for all the keys used by the generator.
For example, the ``WithGCDMixin`` is parameterized by the type of GCD widget you want to instantiate. When this mixin is added to a config, the ``GCDKey`` is set to a instance of ``GCDParams``, informing the previously mentioned traits to instantiate and connect the GCD widget appropriately.
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: GCD mixin
:end-before: DOC include end: GCD mixin
We can use this mixin when composing our configs.
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: GCDTLRocketConfig
:end-before: DOC include end: GCDTLRocketConfig
BuildTop
--------
The ``BuildTop`` key is special, because sometimes, we need to instantiate ``TestHarness`` modules to interface with a custom widget. The ``BuildTop`` key provides a function which can call some method of the Top to instantiate these ``TestHarness`` modules. Since the ``BuildTop`` key is called from the ``TestHarness``, these modules will appear in the ``TestHarness``. The config system also lets the ``BuildTop`` key look recursively into previous definitions of itself. This enables composability of the ``Top`` configurations.
For example, conside a config that contains the mixins ``WithGPIO ++ WithTSI``. We need to instantiate the TSI serial adapter, and connect it to the ``success`` signal of our ``TestHarness``. We also need to instantiate the GPIO pins, and tie their inputs to 0 in the ``TestHarness``, since we currently cannot drive the GPIOs in simulation.
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: tsi mixin
:end-before: DOC include end: tsi mixin
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: gpio mixin
:end-before: DOC include end: gpio mixin
When ``WithGPIO ++ WithTSI`` is evaluated right to left, the call to ``up(BuildTop, site)`` in ``WithGPIO`` will reference the function defined in the ``BuildTop`` key of ``WithTSI``. Thus, at elaboration time, when the ``BuildTop`` function is called by the ``TestHarness``, first the ``BuildTop`` function in ``WithTSI`` will be evaluated. This connects the ``success`` signal of the ``TestHarness`` to the ``SerialAdapter`` enabled by ``WithTSI``. Then, the rest of the code in the ``BuildTop`` function of ``WithGPIO`` will execute, tieing off the top-level GPIO input pins. Thus the evaluation of the ``BuildTop`` functions in a completed config is "right-to-left", matching how the evaluation of the mixins at compile-time is also "right-to-left".
.. warning::
Note that in some cases, the ordering and duplication of mixins which extend ``BuildTop`` will have unintended consequences.
For example, ``WithTSI ++ WithTSI`` will attempt to generate and connect two ``SimSerial`` widgets in the ``TestHarness``,
which will likely break the simulation.
In general, you should avoid attaching multiple mixins which interface to the same top-level ports.
.. note::
Readers who want more information on the configuration system may be interested in reading :ref:`cdes`.

View File

@@ -0,0 +1,142 @@
.. _mmio-accelerators:
MMIO Peripherals
==================
The easiest way to create a MMIO peripheral is to use the ``TLRegisterRouter`` or ``AXI4RegisterRouter`` widgets, which abstracts away the details of handling the interconnect protocols and provides a convenient interface for specifying memory-mapped registers. Since Chipyard and Rocket Chip SoCs primarily use Tilelink as the on-chip interconnect protocol, this section will primarily focus on designing Tilelink-based peripherals. However, see ``generators/example/src/main/scala/GCD.scala`` for how an example AXI4 based peripheral is defined and connected to the Tilelink graph through converters.
To create a RegisterRouter-based peripheral, you will need to specify a parameter case class for the configuration settings, a bundle trait with the extra top-level ports, and a module implementation containing the actual RTL.
For this example, we will show how to connect a MMIO peripheral which computes the GCD.
The full code can be found in ``generators/example/src/main/scala/GCD.scala``.
In this case we use a submodule ``GCDMMIOChiselModule`` to actually perform the GCD. The ``GCDModule`` class only creates the registers and hooks them up using ``regmap``.
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD chisel
:end-before: DOC include end: GCD chisel
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD instance regmap
:end-before: DOC include end: GCD instance regmap
Advanced Features of RegField Entries
-------------------------------------
``RegField`` exposes polymorphic ``r`` and ``w`` methods
that allow read- and write-only memory-mapped registers to be
interfaced to hardware in multiple ways.
* ``RegField.r(2, status)`` is used to create a 2-bit, read-only register that captures the current value of the ``status`` signal when read.
* ``RegField.r(params.width, gcd)`` "connects" the decoupled handshaking interface ``gcd`` to a read-only memory-mapped register. When this register is read via MMIO, the ``ready`` signal is asserted. This is in turn connected to ``output_ready`` on the GCD module through the glue logic.
* ``RegField.w(params.width, x)`` exposes a plain register via MMIO, but makes it write-only.
* ``RegField.w(params.width, y)`` associates the decoupled interface signal ``y`` with a write-only memory-mapped register, causing ``y.valid`` to be asserted when the register is written.
Since the ready/valid signals of ``y`` are connected to the
``input_ready`` and ``input_valid`` signals of the GCD module,
respectively, this register map and glue logic has the effect of
triggering the GCD algorithm when ``y`` is written. Therefore, the
algorithm is set up by first writing ``x`` and then performing a
triggering write to ``y``. Polling can be used for status checks.
Connecting by TileLink
----------------------
Once you have these classes, you can construct the final peripheral by extending the ``TLRegisterRouter`` and passing the proper arguments.
The first set of arguments determines where the register router will be placed in the global address map and what information will be put in its device tree entry.
The second set of arguments is the IO bundle constructor, which we create by extending ``TLRegBundle`` with our bundle trait.
The final set of arguments is the module constructor, which we create by extends ``TLRegModule`` with our module trait.
Notice how we can create an analogous AXI4 version of our peripheral.
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD router
:end-before: DOC include end: GCD router
Top-level Traits
----------------
After creating the module, we need to hook it up to our SoC.
Rocket Chip accomplishes this using the cake pattern.
This basically involves placing code inside traits.
In the Rocket Chip cake, there are two kinds of traits: a ``LazyModule`` trait and a module implementation trait.
The ``LazyModule`` trait runs setup code that must execute before all the hardware gets elaborated.
For a simple memory-mapped peripheral, this just involves connecting the peripheral's TileLink node to the MMIO crossbar.
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD lazy trait
:end-before: DOC include end: GCD lazy trait
Note that the ``GCDTL`` class we created from the register router is itself a ``LazyModule``.
Register routers have a TileLink node simply named "node", which we can hook up to the Rocket Chip bus.
This will automatically add address map and device tree entries for the peripheral.
Also observe how we have to place additional AXI4 buffers and converters for the AXI4 version of this peripheral.
For peripherals which instantiate a concrete module, or which need to be connected to concrete IOs or wires, a matching concrete trait is necessary. We will make our GCD example output a ``gcd_busy`` signal as a top-level port to demonstrate. In the concrete module implementation trait, we instantiate the top level IO (a concrete object) and wire it to the IO of our lazy module.
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
:language: scala
:start-after: DOC include start: GCD imp trait
:end-before: DOC include end: GCD imp trait
Constructing the Top and Config
-------------------------------
Now we want to mix our traits into the system as a whole.
This code is from ``generators/example/src/main/scala/Top.scala``.
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
:language: scala
:start-after: DOC include start: Top
:end-before: DOC include end: Top
Just as we need separate traits for ``LazyModule`` and module implementation, we need two classes to build the system.
The ``Top`` class contains the set of traits which parameterize and define the ``Top``. Typically these traits will optionally add IOs or peripherals to the ``Top``.
The ``Top`` class includes the pre-elaboration code and also a ``lazy val`` to produce the module implementation (hence ``LazyModule``).
The ``TopModule`` class is the actual RTL that gets synthesized.
And finally, we create a configuration class in ``generators/example/src/main/scala/Configs.scala`` that uses the ``WithGCD`` mixin defined earlier.
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: GCD mixin
:end-before: DOC include end: GCD mixin
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: GCDTLRocketConfig
:end-before: DOC include end: GCDTLRocketConfig
Testing
-------
Now we can test that the GCD is working. The test program is in ``tests/gcd.c``.
.. literalinclude:: ../../tests/gcd.c
:language: c
This just writes out to the registers we defined earlier.
The base of the module's MMIO region is at 0x2000 by default.
This will be printed out in the address map portion when you generate the verilog code.
You can also see how this changes the emitted ``.json`` addressmap files in ``generated-src``.
Compiling this program with ``make`` produces a ``gcd.riscv`` executable.
Now with all of that done, we can go ahead and run our simulation.
.. code-block:: shell
cd sims/verilator
make CONFIG=GCDTLRocketConfig BINARY=../../tests/gcd.riscv run-binary

View File

@@ -0,0 +1,70 @@
.. _rocc-accelerators:
Adding a RoCC Accelerator
----------------------------
RoCC accelerators are lazy modules that extend the ``LazyRoCC`` class.
Their implementation should extends the ``LazyRoCCModule`` class.
.. code-block:: scala
class CustomAccelerator(opcodes: OpcodeSet)
(implicit p: Parameters) extends LazyRoCC(opcodes) {
override lazy val module = new CustomAcceleratorModule(this)
}
class CustomAcceleratorModule(outer: CustomAccelerator)
extends LazyRoCCModuleImp(outer) {
val cmd = Queue(io.cmd)
// The parts of the command are as follows
// inst - the parts of the instruction itself
// opcode
// rd - destination register number
// rs1 - first source register number
// rs2 - second source register number
// funct
// xd - is the destination register being used?
// xs1 - is the first source register being used?
// xs2 - is the second source register being used?
// rs1 - the value of source register 1
// rs2 - the value of source register 2
...
}
The ``opcodes`` parameter for ``LazyRoCC`` is the set of custom opcodes that will map to this accelerator.
More on this in the next subsection.
The ``LazyRoCC`` class contains two TLOutputNode instances, ``atlNode`` and ``tlNode``.
The former connects into a tile-local arbiter along with the backside of the L1 instruction cache.
The latter connects directly to the L1-L2 crossbar.
The corresponding Tilelink ports in the module implementation's IO bundle are ``atl`` and ``tl``, respectively.
The other interfaces available to the accelerator are ``mem``, which provides access to the L1 cache;
``ptw`` which provides access to the page-table walker;
the ``busy`` signal, which indicates when the accelerator is still handling an instruction;
and the ``interrupt`` signal, which can be used to interrupt the CPU.
Look at the examples in ``generators/rocket-chip/src/main/scala/tile/LazyRocc.scala`` for detailed information on the different IOs.
Adding RoCC accelerator to Config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RoCC accelerators can be added to a core by overriding the ``BuildRoCC`` parameter in the configuration.
This takes a sequence of functions producing ``LazyRoCC`` objects, one for each accelerator you wish to add.
For instance, if we wanted to add the previously defined accelerator and route custom0 and custom1 instructions to it, we could do the following.
.. code-block:: scala
class WithCustomAccelerator extends Config((site, here, up) => {
case BuildRoCC => Seq((p: Parameters) => LazyModule(
new CustomAccelerator(OpcodeSet.custom0 | OpcodeSet.custom1)(p)))
})
class CustomAcceleratorConfig extends Config(
new WithCustomAccelerator ++
new RocketConfig)
To add RoCC instructions in your program, use the RoCC C macros provided in ``tests/rocc.h``. You can find examples in the files ``tests/accum.c`` and ``charcount.c``.

View File

@@ -0,0 +1,27 @@
.. _rocc-vs-mmio:
RoCC vs MMIO
------------
Accelerators or custom IO devices can be added to your SoC in several ways:
* MMIO Peripheral (a.k.a TileLink-Attached Accelerator)
* Tightly-Coupled RoCC Accelerator
These approaches differ in the method of the communication between the processor and the custom block.
With the TileLink-Attached approach, the processor communicates with MMIO peripherals through memory-mapped registers.
In contrast, the processor communicates with a RoCC accelerators through a custom protocol and custom non-standard ISA instructions reserved in the RISC-V ISA encoding space.
Each core can have up to four accelerators that are controlled by custom instructions and share resources with the CPU.
RoCC coprocessor instructions have the following form.
.. code-block:: none
customX rd, rs1, rs2, funct
The X will be a number 0-3, and determines the opcode of the instruction, which controls which accelerator an instruction will be routed to.
The ``rd``, ``rs1``, and ``rs2`` fields are the register numbers of the destination register and two source registers.
The ``funct`` field is a 7-bit integer that the accelerator can use to distinguish different instructions from each other.
Note that communication through a RoCC interface requires a custom software toolchain, whereas MMIO peripherals can use that standard toolchain with appropriate driver support.

View File

@@ -3,18 +3,41 @@ Customization
These guides will walk you through customization of your system-on-chip:
- Contructing heterogenous systems-on-chip using the Chipyard generators and configuration system.
- Contructing heterogenous systems-on-chip using the existing Chipyard generators and configuration system.
- Adding custom accelerators to your system-on-chip.
- How to include your custom Chisel sources in the Chipyard build system
Hit next to get started!
- Adding custom RoCC accelerators to an existing Chipyard core (BOOM or Rocket)
- Adding custom MMIO widgets to the Chipyard memory system by Tilelink or AXI4, with custom Top-level IOs
- Standard practices for using Keys, Traits, and Configs to parameterize your design
- Customizing the memory hierarchy
- Connect widgets which act as TileLink masters
- Adding custom blackboxed verilog to a Chipyard design
We also provide information on:
- The boot process for Chipyard SoCs
- Examples of FIRRTL transforms used in Chipyard, and where they are specified
We recommend reading all these pages in order. Hit next to get started!
.. toctree::
:maxdepth: 2
:caption: Customization:
Heterogeneous-SoCs
Adding-An-Accelerator
Custom-Chisel
RoCC-or-MMIO
RoCC-Accelerators
MMIO-Peripherals
Keys-Traits-Configs
DMA-Devices
Incorporating-Verilog-Blocks
Memory-Hierarchy
Boot-Process