Revamp the config system for Top/Harness (#347)
* Refactor how Configs parameterize the Top and TestHarnesses
* Bump sha3, testchipip, icenet, firesim
@@ -1,261 +0,0 @@
|
||||
.. _adding-an-accelerator:
|
||||
|
||||
Adding an Accelerator/Device
|
||||
===============================
|
||||
|
||||
Accelerators or custom IO devices can be added to your SoC in several ways:
|
||||
|
||||
* MMIO Peripheral (a.k.a. TileLink-Attached Accelerator)
|
||||
* Tightly-Coupled RoCC Accelerator
|
||||
|
||||
These approaches differ in how the processor communicates with the custom block.
|
||||
|
||||
With the TileLink-Attached approach, the processor communicates with MMIO peripherals through memory-mapped registers.
|
||||
|
||||
In contrast, the processor communicates with RoCC accelerators through a custom protocol and custom non-standard instructions reserved in the RISC-V ISA encoding space.
|
||||
Each core can have up to four accelerators that are controlled by custom instructions and share resources with the CPU.
|
||||
RoCC coprocessor instructions have the following form.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
customX rd, rs1, rs2, funct
|
||||
|
||||
The ``X`` is a number from 0 to 3 that selects the opcode of the instruction, which in turn determines which accelerator the instruction is routed to.
|
||||
The ``rd``, ``rs1``, and ``rs2`` fields are the register numbers of the destination register and two source registers.
|
||||
The ``funct`` field is a 7-bit integer that the accelerator can use to distinguish different instructions from each other.
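The field layout can be made concrete with a small decoder. The following is a minimal sketch in plain Scala (no Chisel), assuming the usual RoCC bit layout: ``opcode`` in bits 6:0, ``rd`` in 11:7, the ``xs2``/``xs1``/``xd`` flags in bits 12/13/14, ``rs1`` in 19:15, ``rs2`` in 24:20, and ``funct`` in 31:25. The names ``RoccInst`` and ``RoccDecode`` are hypothetical and exist only for this illustration.

.. code-block:: scala

    // Hypothetical helper, not part of Rocket Chip: models one 32-bit RoCC instruction word.
    final case class RoccInst(
      opcode: Int, rd: Int, xs2: Boolean, xs1: Boolean, xd: Boolean,
      rs1: Int, rs2: Int, funct: Int)

    object RoccDecode {
      // Major opcodes of custom0..custom3 in the RISC-V encoding space.
      val customOpcodes: Seq[Int] = Seq(0x0B, 0x2B, 0x5B, 0x7B)

      // Extract bits [hi:lo] of a 32-bit word as a non-negative Int.
      private def bits(word: Int, hi: Int, lo: Int): Int =
        (word >>> lo) & ((1 << (hi - lo + 1)) - 1)

      private def flag(b: Boolean): Int = if (b) 1 else 0

      // Split an instruction word into its RoCC fields.
      def decode(word: Int): RoccInst = RoccInst(
        opcode = bits(word, 6, 0),
        rd     = bits(word, 11, 7),
        xs2    = bits(word, 12, 12) == 1,
        xs1    = bits(word, 13, 13) == 1,
        xd     = bits(word, 14, 14) == 1,
        rs1    = bits(word, 19, 15),
        rs2    = bits(word, 24, 20),
        funct  = bits(word, 31, 25))

      // Pack the fields back into an instruction word (inverse of decode).
      def encode(i: RoccInst): Int =
        (i.funct << 25) | (i.rs2 << 20) | (i.rs1 << 15) |
          (flag(i.xd) << 14) | (flag(i.xs1) << 13) | (flag(i.xs2) << 12) |
          (i.rd << 7) | i.opcode

      // Returns Some(X) when the opcode is customX, otherwise None.
      def customIndex(inst: RoccInst): Option[Int] = {
        val index = customOpcodes.indexOf(inst.opcode)
        if (index >= 0) Some(index) else None
      }
    }

Here ``customIndex`` plays the role of the routing decision: an instruction whose opcode is ``custom1`` would be sent to whichever accelerator was registered for that opcode.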
|
||||
|
||||
Note that communication through a RoCC interface requires a custom software toolchain, whereas MMIO peripherals can use the standard toolchain with appropriate driver support.
|
||||
|
||||
Integrating into the Generator Build System
|
||||
-------------------------------------------
|
||||
|
||||
While developing, you want to include Chisel code in a submodule so that it can be shared by different projects.
|
||||
To add a submodule to the Chipyard framework, make sure that your project is organized as follows.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
yourproject/
|
||||
build.sbt
|
||||
src/main/scala/
|
||||
YourFile.scala
|
||||
|
||||
Put this in a git repository and make it accessible.
|
||||
Then add it as a submodule under the following directory hierarchy: ``generators/yourproject``.
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
cd generators/
|
||||
git submodule add https://git-repository.com/yourproject.git
|
||||
|
||||
Then add ``yourproject`` to the Chipyard top-level build.sbt file.
|
||||
|
||||
.. code-block:: scala
|
||||
|
||||
lazy val yourproject = (project in file("generators/yourproject")).settings(commonSettings).dependsOn(rocketchip)
|
||||
|
||||
You can then import the classes defined in the submodule in a new project if
|
||||
you add it as a dependency. For instance, if you want to use this code in
|
||||
the ``example`` project, change the final line in build.sbt to the following.
|
||||
|
||||
.. code-block:: scala
|
||||
|
||||
lazy val example = (project in file(".")).settings(commonSettings).dependsOn(testchipip, yourproject)
|
||||
|
||||
MMIO Peripheral
|
||||
------------------
|
||||
|
||||
The easiest way to create a TileLink peripheral is to use the ``TLRegisterRouter``, which abstracts away the details of handling the TileLink protocol and provides a convenient interface for specifying memory-mapped registers.
|
||||
To create a RegisterRouter-based peripheral, you will need to specify a parameter case class for the configuration settings, a bundle trait with the extra top-level ports, and a module implementation containing the actual RTL.
|
||||
In this case we use a submodule ``PWMBase`` to actually perform the pulse-width modulation. The ``PWMModule`` class only creates the registers and hooks them
|
||||
up using ``regmap``.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: PWM generic traits
|
||||
:end-before: DOC include end: PWM generic traits
|
||||
|
||||
Once you have these classes, you can construct the final peripheral by extending the ``TLRegisterRouter`` and passing the proper arguments.
|
||||
The first set of arguments determines where the register router will be placed in the global address map and what information will be put in its device tree entry.
|
||||
The second set of arguments is the IO bundle constructor, which we create by extending ``TLRegBundle`` with our bundle trait.
|
||||
The final set of arguments is the module constructor, which we create by extending ``TLRegModule`` with our module trait.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: PWMTL
|
||||
:end-before: DOC include end: PWMTL
|
||||
|
||||
The full module code can be found in ``generators/example/src/main/scala/PWM.scala``.
|
||||
|
||||
After creating the module, we need to hook it up to our SoC.
|
||||
Rocket Chip accomplishes this using the cake pattern.
|
||||
This basically involves placing code inside traits.
|
||||
In the Rocket Chip cake, there are two kinds of traits: a ``LazyModule`` trait and a module implementation trait.
|
||||
|
||||
The ``LazyModule`` trait runs setup code that must execute before all the hardware gets elaborated.
|
||||
For a simple memory-mapped peripheral, this just involves connecting the peripheral's TileLink node to the MMIO crossbar.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: HasPeripheryPWMTL
|
||||
:end-before: DOC include end: HasPeripheryPWMTL
|
||||
|
||||
Note that the ``PWMTL`` class we created from the register router is itself a ``LazyModule``.
|
||||
Register routers have a TileLink node simply named "node", which we can hook up to the Rocket Chip bus.
|
||||
This will automatically add address map and device tree entries for the peripheral.
|
||||
|
||||
The module implementation trait is where we instantiate our PWM module and connect it to the rest of the SoC.
|
||||
Since this module has an extra ``pwmout`` output, we declare that in this trait, using Chisel's multi-IO functionality.
|
||||
We then connect the ``PWMTL``'s ``pwmout`` to the ``pwmout`` we declared.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: HasPeripheryPWMTLModuleImp
|
||||
:end-before: DOC include end: HasPeripheryPWMTLModuleImp
|
||||
|
||||
Now we want to mix our traits into the system as a whole.
|
||||
This code is from ``generators/example/src/main/scala/Top.scala``.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: TopWithPWMTL
|
||||
:end-before: DOC include end: TopWithPWMTL
|
||||
|
||||
Just as we need separate traits for ``LazyModule`` and module implementation, we need two classes to build the system.
|
||||
The ``Top`` classes already have the basic peripherals included for us, so we will just extend those.
|
||||
|
||||
The ``Top`` class includes the pre-elaboration code and also a ``lazy val`` to produce the module implementation (hence ``LazyModule``).
|
||||
The ``TopModule`` class is the actual RTL that gets synthesized.
|
||||
|
||||
Next, we need to add a configuration mixin in ``generators/example/src/main/scala/ConfigMixins.scala`` that tells the ``TestHarness`` to instantiate ``TopWithPWMTL`` instead of the default ``Top``.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: WithPWMTop
|
||||
:end-before: DOC include end: WithPWMTop
|
||||
|
||||
And finally, we create a configuration class in ``generators/example/src/main/scala/RocketConfigs.scala`` that uses this mixin.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: PWMRocketConfig
|
||||
:end-before: DOC include end: PWMRocketConfig
|
||||
|
||||
Now we can test that the PWM is working. The test program is in ``tests/pwm.c``.
|
||||
|
||||
.. literalinclude:: ../../tests/pwm.c
|
||||
:language: c
|
||||
|
||||
This just writes out to the registers we defined earlier.
|
||||
The base of the module's MMIO region is at 0x2000.
|
||||
This address is printed in the address map portion of the output when you generate the Verilog code.
|
||||
|
||||
Compiling this program with make produces a ``pwm.riscv`` executable.
|
||||
|
||||
Now with all of that done, we can go ahead and run our simulation.
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
cd sims/verilator
|
||||
make CONFIG=PWMRocketConfig TOP=TopWithPWMTL
|
||||
./simulator-example-PWMRocketConfig ../../tests/pwm.riscv
|
||||
|
||||
Adding a RoCC Accelerator
|
||||
----------------------------
|
||||
|
||||
RoCC accelerators are lazy modules that extend the ``LazyRoCC`` class.
|
||||
Their implementation should extend the ``LazyRoCCModuleImp`` class.
|
||||
|
||||
.. code-block:: scala
|
||||
|
||||
class CustomAccelerator(opcodes: OpcodeSet)
|
||||
(implicit p: Parameters) extends LazyRoCC(opcodes) {
|
||||
override lazy val module = new CustomAcceleratorModule(this)
|
||||
}
|
||||
|
||||
class CustomAcceleratorModule(outer: CustomAccelerator)
|
||||
extends LazyRoCCModuleImp(outer) {
|
||||
val cmd = Queue(io.cmd)
|
||||
// The parts of the command are as follows
|
||||
// inst - the parts of the instruction itself
|
||||
// opcode
|
||||
// rd - destination register number
|
||||
// rs1 - first source register number
|
||||
// rs2 - second source register number
|
||||
// funct
|
||||
// xd - is the destination register being used?
|
||||
// xs1 - is the first source register being used?
|
||||
// xs2 - is the second source register being used?
|
||||
// rs1 - the value of source register 1
|
||||
// rs2 - the value of source register 2
|
||||
...
|
||||
}
|
||||
|
||||
|
||||
The ``opcodes`` parameter for ``LazyRoCC`` is the set of custom opcodes that will map to this accelerator.
|
||||
More on this in the next subsection.
|
||||
|
||||
The ``LazyRoCC`` class contains two TLOutputNode instances, ``atlNode`` and ``tlNode``.
|
||||
The former connects into a tile-local arbiter along with the backside of the L1 instruction cache.
|
||||
The latter connects directly to the L1-L2 crossbar.
|
||||
The corresponding TileLink ports in the module implementation's IO bundle are ``atl`` and ``tl``, respectively.
|
||||
|
||||
The other interfaces available to the accelerator are ``mem``, which provides access to the L1 cache;
|
||||
``ptw`` which provides access to the page-table walker;
|
||||
the ``busy`` signal, which indicates when the accelerator is still handling an instruction;
|
||||
and the ``interrupt`` signal, which can be used to interrupt the CPU.
|
||||
|
||||
Look at the examples in ``generators/rocket-chip/src/main/scala/tile/LazyRoCC.scala`` for detailed information on the different IOs.
|
||||
|
||||
Adding RoCC accelerator to Config
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
RoCC accelerators can be added to a core by overriding the ``BuildRoCC`` parameter in the configuration.
|
||||
This takes a sequence of functions producing ``LazyRoCC`` objects, one for each accelerator you wish to add.
|
||||
|
||||
For instance, if we wanted to add the previously defined accelerator and route custom0 and custom1 instructions to it, we could do the following.
|
||||
|
||||
.. code-block:: scala
|
||||
|
||||
class WithCustomAccelerator extends Config((site, here, up) => {
|
||||
case BuildRoCC => Seq((p: Parameters) => LazyModule(
|
||||
new CustomAccelerator(OpcodeSet.custom0 | OpcodeSet.custom1)(p)))
|
||||
})
|
||||
|
||||
class CustomAcceleratorConfig extends Config(
|
||||
new WithCustomAccelerator ++ new RocketConfig)
|
||||
|
||||
To add RoCC instructions in your program, use the RoCC C macros provided in ``tests/rocc.h``. You can find examples in the files ``tests/accum.c`` and ``tests/charcount.c``.
|
||||
|
||||
Adding a DMA port
|
||||
-------------------
|
||||
|
||||
For IO devices or accelerators (like a disk or network driver), instead of
|
||||
having the CPU poll data from the device, we may want to have the device write
|
||||
directly to the coherent memory system instead. For example, here is a device
|
||||
that writes zeros to the memory at a configured address.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/InitZero.scala
|
||||
:language: scala
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: TopWithInitZero
|
||||
:end-before: DOC include end: TopWithInitZero
|
||||
|
||||
We use ``TLHelper.makeClientNode`` to create a TileLink client node for us.
|
||||
We then connect the client node to the memory system through the front bus (fbus).
|
||||
For more info on creating TileLink client nodes, take a look at :ref:`Client Node`.
|
||||
|
||||
Once we've created our top-level module including the DMA widget, we can create a configuration for it as we did before.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: WithInitZero
|
||||
:end-before: DOC include end: WithInitZero
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: InitZeroRocketConfig
|
||||
:end-before: DOC include end: InitZeroRocketConfig
|
||||
|
||||
|
||||
59
docs/Customization/Custom-Chisel.rst
Normal file
@@ -0,0 +1,59 @@
|
||||
.. _custom_chisel:
|
||||
|
||||
Integrating Custom Chisel Projects into the Generator Build System
|
||||
==================================================================
|
||||
|
||||
.. warning::
|
||||
This section assumes integration of custom Chisel through git submodules.
|
||||
While it is possible to directly commit custom Chisel into the Chipyard framework,
|
||||
we heavily recommend managing custom code through git submodules. Using submodules decouples
|
||||
development of custom features from development on the Chipyard framework.
|
||||
|
||||
|
||||
While developing, you want to include Chisel code in a submodule so that it can be shared by different projects.
|
||||
To add a submodule to the Chipyard framework, make sure that your project is organized as follows.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
yourproject/
|
||||
build.sbt
|
||||
src/main/scala/
|
||||
YourFile.scala
|
||||
|
||||
Put this in a git repository and make it accessible.
|
||||
Then add it as a submodule under the following directory hierarchy: ``generators/yourproject``.
|
||||
|
||||
The ``build.sbt`` is a minimal file which describes metadata for a Chisel project.
|
||||
For a simple project, the ``build.sbt`` can even be empty, but below we provide an example
|
||||
build.sbt.
|
||||
|
||||
.. code-block:: scala
|
||||
|
||||
organization := "edu.berkeley.cs"
|
||||
|
||||
version := "1.0"
|
||||
|
||||
name := "yourproject"
|
||||
|
||||
scalaVersion := "2.12.4"
|
||||
|
||||
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
cd generators/
|
||||
git submodule add https://git-repository.com/yourproject.git
|
||||
|
||||
Then add ``yourproject`` to the Chipyard top-level build.sbt file.
|
||||
|
||||
.. code-block:: scala
|
||||
|
||||
lazy val yourproject = (project in file("generators/yourproject")).settings(commonSettings).dependsOn(rocketchip)
|
||||
|
||||
You can then import the classes defined in the submodule in a new project if
|
||||
you add it as a dependency. For instance, if you want to use this code in
|
||||
the ``example`` project, change the final line in build.sbt to the following.
|
||||
|
||||
.. code-block:: scala
|
||||
|
||||
lazy val example = (project in file(".")).settings(commonSettings).dependsOn(testchipip, yourproject)
|
||||
39
docs/Customization/DMA-Devices.rst
Normal file
@@ -0,0 +1,39 @@
|
||||
.. _dma-devices:
|
||||
|
||||
Adding a DMA Device
|
||||
===================
|
||||
|
||||
DMA devices are TileLink widgets which act as masters. In other words,
|
||||
DMA devices can send their own read and write requests to the chip's memory
|
||||
system.
|
||||
|
||||
For IO devices or accelerators (like a disk or network driver), instead of
|
||||
having the CPU poll data from the device, we may want to have the device write
|
||||
directly to the coherent memory system instead. For example, here is a device
|
||||
that writes zeros to the memory at a configured address.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/InitZero.scala
|
||||
:language: scala
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: Top
|
||||
:end-before: DOC include end: Top
|
||||
|
||||
We use ``TLHelper.makeClientNode`` to create a TileLink client node for us.
|
||||
We then connect the client node to the memory system through the front bus (fbus).
|
||||
For more info on creating TileLink client nodes, take a look at :ref:`Client Node`.
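The write pattern of such a device can be modeled in a few lines. The following is a rough behavioral sketch in plain Scala, not the contents of ``InitZero.scala``: it assumes the device walks the configured region one cache block at a time and that the region size is a multiple of the block size. The name ``InitZeroModel`` and its parameters are hypothetical.

.. code-block:: scala

    // Hypothetical behavioral model of a zero-writing DMA device; plain Scala only.
    object InitZeroModel {
      // Returns the base address of every block the device would write,
      // in the order the writes would be issued.
      def blockAddresses(base: BigInt, size: BigInt, blockBytes: Int): Seq[BigInt] = {
        require(blockBytes > 0, "blockBytes must be positive")
        require(size % BigInt(blockBytes) == 0, "size must be a multiple of blockBytes")
        val nBlocks = (size / BigInt(blockBytes)).toInt
        Seq.tabulate(nBlocks)(i => base + BigInt(i) * BigInt(blockBytes))
      }
    }

Each address in the returned sequence corresponds to one write request that the client node would send toward the memory system through the front bus.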
|
||||
|
||||
Once we've created our top-level module including the DMA widget, we can create a configuration for it as we did before.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: WithInitZero
|
||||
:end-before: DOC include end: WithInitZero
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: InitZeroRocketConfig
|
||||
:end-before: DOC include end: InitZeroRocketConfig
|
||||
|
||||
|
||||
@@ -8,8 +8,7 @@ design flows. Fortunately, both Chisel and Chipyard provide extensive
|
||||
support for Verilog integration.
|
||||
|
||||
Here, we will examine the process of incorporating an MMIO peripheral
|
||||
(similar to the PWM example from the previous section) that uses a
|
||||
Verilog implementation of Greatest Common Denominator (GCD)
|
||||
that uses a Verilog implementation of the Greatest Common Divisor (GCD)
|
||||
algorithm. There are a few steps to adding a Verilog peripheral:
|
||||
|
||||
* Adding a Verilog resource file to the project
|
||||
@@ -58,7 +57,7 @@ and Verilog sources follow the prescribed directory layout.
|
||||
build.sbt
|
||||
src/main/
|
||||
scala/
|
||||
GCDMMIOBlackBox.scala
|
||||
GCD.scala
|
||||
resources/
|
||||
vsrc/
|
||||
GCDMMIOBlackBox.v
|
||||
@@ -89,7 +88,7 @@ as the bitwidth of the GCD calculation does in this example.
|
||||
|
||||
**Chisel BlackBox Definition**
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCDMMIOBlackBox.scala
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD blackbox
|
||||
:end-before: DOC include end: GCD blackbox
|
||||
@@ -102,54 +101,32 @@ diplomatic memory mapping on the system bus, we still have to
|
||||
integrate the peripheral at the Chisel level by mixing
|
||||
peripheral-specific traits into a ``TLRegisterRouter``. The ``params``
|
||||
member and ``HasRegMap`` base trait should look familiar from the
|
||||
previous memory-mapped PWM device example.
|
||||
previous memory-mapped GCD device example.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCDMMIOBlackBox.scala
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD instance regmap
|
||||
:end-before: DOC include end: GCD instance regmap
|
||||
|
||||
Advanced Features of RegField Entries
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
One significant difference from the PWM example is in the peripheral's
|
||||
memory map. ``RegField`` exposes polymorphic ``r`` and ``w`` methods
|
||||
that allow read- and write-only memory-mapped registers to be
|
||||
interfaced to hardware in multiple ways.
|
||||
|
||||
* ``RegField.r(2, status)`` is used to create a 2-bit, read-only register that captures the current value of the ``status`` signal when read.
|
||||
* ``RegField.r(params.width, gcd)`` "connects" the decoupled handshaking interface ``gcd`` to a read-only memory-mapped register. When this register is read via MMIO, the ``ready`` signal is asserted. This is in turn connected to ``output_ready`` on the Verilog blackbox through the glue logic.
|
||||
* ``RegField.w(params.width, x)`` exposes a plain register (much like those in the PWM example) via MMIO, but makes it write-only.
|
||||
* ``RegField.w(params.width, y)`` associates the decoupled interface signal ``y`` with a write-only memory-mapped register, causing ``y.valid`` to be asserted when the register is written.
|
||||
|
||||
Since the ready/valid signals of ``y`` are connected to the
|
||||
``input_ready`` and ``input_valid`` signals of the blackbox,
|
||||
respectively, this register map and glue logic has the effect of
|
||||
triggering the GCD algorithm when ``y`` is written. Therefore, the
|
||||
algorithm is set up by first writing ``x`` and then performing a
|
||||
triggering write to ``y``. Polling can be used for status checks.
|
||||
|
||||
Defining a Chip with a GCD Peripheral
|
||||
Defining a Chip with a BlackBox
|
||||
---------------------------------------
|
||||
|
||||
As with the PWM example, a few more pieces are needed to tie the system together.
|
||||
Since we've parameterized the GCD instantiation to choose between the
|
||||
Chisel and the Verilog module, creating a config is easy.
|
||||
|
||||
**Composing traits into a complete cake pattern peripheral**
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCDMMIOBlackBox.scala
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD cake
|
||||
:end-before: DOC include end: GCD cake
|
||||
:start-after: DOC include start: GCDAXI4BlackBoxRocketConfig
|
||||
:end-before: DOC include end: GCDAXI4BlackBoxRocketConfig
|
||||
|
||||
Note the differences arising due to the fact that this peripheral has
|
||||
no top-level IO. To build a complete system, a new ``Top`` and new
|
||||
``Config`` objects are added in a manner exactly analogous to the PWM
|
||||
example.
|
||||
You can play with the parameterization of the mixin to choose a TL/AXI4, BlackBox/Chisel
|
||||
version of the GCD.
|
||||
|
||||
Software Testing
|
||||
----------------
|
||||
|
||||
The GCD module has a slightly more complex interface, so polling is
|
||||
The GCD module has a more complex interface, so polling is
|
||||
used to check the status of the device before each triggering read or
|
||||
write.
|
||||
|
||||
|
||||
106
docs/Customization/Keys-Traits-Configs.rst
Normal file
@@ -0,0 +1,106 @@
|
||||
.. _keys-traits-configs:
|
||||
|
||||
Keys, Traits, and Configs
|
||||
=========================
|
||||
|
||||
You have probably seen snippets of Chisel referencing Keys, Traits, and Configs by this point.
|
||||
This section aims to elucidate the interactions between these Chisel/Scala components, and provide
|
||||
best practices for how these should be used to create a parameterized design and configure it.
|
||||
|
||||
We will continue to use the GCD example.
|
||||
|
||||
Keys
|
||||
----
|
||||
|
||||
A Key names a parameter that controls a custom widget. Keys should typically be implemented as **Option types**, with a default value of ``None`` that means no change in the system. In other words, the default behavior when the user does not explicitly set the key should be a no-op.
|
||||
|
||||
Keys should be defined and documented in sub-projects, since they generally deal with some specific block, and not system-level integration. (We make an exception for the example GCD widget.)
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD key
|
||||
:end-before: DOC include end: GCD key
|
||||
|
||||
The object within a key is typically a ``case class XXXParams``, which defines the set of parameters that a block accepts. For example, the GCD widget's ``GCDParams`` parameterizes its address, operand widths, whether the widget should be connected by TileLink or AXI4, and whether the widget should use the blackbox Verilog implementation or the Chisel implementation.
|
||||
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD params
|
||||
:end-before: DOC include end: GCD params
|
||||
|
||||
Accessing the value stored in the key is easy in Chisel, as long as the ``implicit p: Parameters`` object is being passed through to the relevant module. For example, ``p(GCDKey).get.address`` returns the address field of ``GCDParams``. Note this only works if ``GCDKey`` was not set to ``None``, so your Chisel should check for that case!
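The ``None`` check can be written without calling ``.get`` at all. The following is a minimal sketch in plain Scala, where an ``Option`` value stands in for the result of ``p(GCDKey)``. The names ``DemoGCDParams`` and ``KeyAccessDemo`` and the default field values are hypothetical and chosen only for illustration.

.. code-block:: scala

    // Hypothetical stand-in for GCDParams; the defaults are illustrative only.
    final case class DemoGCDParams(address: BigInt = BigInt(0x2000), width: Int = 32)

    object KeyAccessDemo {
      // `key` models the value returned by p(GCDKey): None means "not requested".
      def describe(key: Option[DemoGCDParams]): String = key match {
        case Some(params) => s"GCD at 0x${params.address.toString(16)}, width ${params.width}"
        case None         => "no GCD configured"
      }

      // Safe counterpart of p(GCDKey).get.address: yields None instead of throwing.
      def addressOf(key: Option[DemoGCDParams]): Option[BigInt] = key.map(_.address)
    }

Pattern matching or ``map`` makes the unset case explicit, whereas ``.get`` on ``None`` throws a ``NoSuchElementException`` during elaboration.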
|
||||
|
||||
Traits
|
||||
------
|
||||
|
||||
Typically, most custom blocks will need to modify the behavior of some pre-existing block. For example, the GCD widget needs the ``Top`` module to instantiate and connect the widget via TileLink, generate a top-level ``gcd_busy`` port, and connect that to the module as well. Traits let us do this without modifying the existing code for the ``Top``, and enable compartmentalization of code for different custom blocks.
|
||||
|
||||
Top-level traits specify that the ``Top`` has been parameterized to read some custom Key and optionally instantiate and connect a widget defined by that Key. Traits **should not** mandate the instantiation of custom logic. In other words, traits should be written with ``CanHave`` semantics, where the default behavior when the Key is unset is a no-op.
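The ``CanHave`` idea does not depend on Chisel and can be sketched in plain Scala. In the sketch below, every name (``DemoTop``, ``CanHaveWidget``, ``WidgetParams``, ``TopWithKey``) is hypothetical, and a list of strings stands in for the hardware that a real trait would instantiate.

.. code-block:: scala

    import scala.collection.mutable.ListBuffer

    // Hypothetical parameter class that a key would hold.
    final case class WidgetParams(name: String)

    // Stand-in for a Top: it only records which blocks were instantiated.
    trait DemoTop {
      protected val blocks: ListBuffer[String] = ListBuffer("core")
      def instantiated: List[String] = blocks.toList
    }

    // "CanHave" semantics: the widget is added only when the key is set.
    trait CanHaveWidget extends DemoTop {
      // Stands in for p(WidgetKey); the concrete Top supplies the value.
      def widgetKey: Option[WidgetParams]

      // foreach on None does nothing, so an unset key is a no-op.
      def attachWidget(): Unit = widgetKey.foreach(params => blocks += params.name)
    }

    class TopWithKey(key: Option[WidgetParams]) extends DemoTop with CanHaveWidget {
      def widgetKey: Option[WidgetParams] = key
      attachWidget()
    }

Constructing ``new TopWithKey(None)`` leaves the design unchanged, which is the property that allows such a trait to be mixed into a default ``Top`` without affecting configurations that never set the key.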
|
||||
|
||||
Top-level traits should be defined and documented in subprojects, alongside their corresponding Keys. The traits should then be added to the ``Top`` being used by Chipyard.
|
||||
|
||||
Below we see the traits for the GCD example. The Lazy trait connects the GCD module to the Diplomacy graph, while the Implementation trait causes the ``Top`` to instantiate an additional port and concretely connect it to the GCD module.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD lazy trait
|
||||
:end-before: DOC include end: GCD imp trait
|
||||
|
||||
These traits are added to the default ``Top`` in Chipyard.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: Top
|
||||
:end-before: DOC include end: Top
|
||||
|
||||
Mixins
|
||||
------
|
||||
|
||||
Mixins set the keys to a non-default value. Together, the collection of Mixins which define a configuration generate the values for all the keys used by the generator.
|
||||
|
||||
For example, the ``WithGCDMixin`` is parameterized by the type of GCD widget you want to instantiate. When this mixin is added to a config, the ``GCDKey`` is set to an instance of ``GCDParams``, informing the previously mentioned traits to instantiate and connect the GCD widget appropriately.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD mixin
|
||||
:end-before: DOC include end: GCD mixin
|
||||
|
||||
We can use this mixin when composing our configs.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCDTLRocketConfig
|
||||
:end-before: DOC include end: GCDTLRocketConfig
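
The way a mixin overrides a default can be illustrated with a toy model. The following sketch is plain Scala and deliberately much simpler than the real ``Config`` class: keys are strings, values are untyped, and ``++`` is modeled so that entries on the left take priority over entries on the right. ``ToyConfig`` and ``ToyConfigs`` are hypothetical names.

.. code-block:: scala

    // Toy model of config composition; the real Config class is richer than this.
    final case class ToyConfig(values: Map[String, Any]) {
      // Keys set on the left take priority; the right side supplies fallbacks.
      def ++(fallback: ToyConfig): ToyConfig = ToyConfig(fallback.values ++ values)

      // Look up the value that the composed config assigns to a key.
      def apply(key: String): Any = values(key)
    }

    object ToyConfigs {
      // Base config: the GCD key keeps its default of None.
      val base: ToyConfig = ToyConfig(Map[String, Any]("GCDKey" -> None, "NCores" -> 1))

      // Models a parameterized mixin that sets the key to a non-default value.
      def withGCD(useBlackBox: Boolean): ToyConfig =
        ToyConfig(Map[String, Any]("GCDKey" -> Some(s"GCDParams(useBlackBox=$useBlackBox)")))

      val gcdConfig: ToyConfig = withGCD(useBlackBox = false) ++ base
    }

In this model ``gcdConfig`` keeps ``NCores`` from the base config while the mixin's ``GCDKey`` replaces the default ``None``.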
|
||||
|
||||
|
||||
BuildTop
|
||||
--------
|
||||
|
||||
The ``BuildTop`` key is special, because sometimes, we need to instantiate ``TestHarness`` modules to interface with a custom widget. The ``BuildTop`` key provides a function which can call some method of the Top to instantiate these ``TestHarness`` modules. Since the ``BuildTop`` key is called from the ``TestHarness``, these modules will appear in the ``TestHarness``. The config system also lets the ``BuildTop`` key look recursively into previous definitions of itself. This enables composability of the ``Top`` configurations.
|
||||
|
||||
For example, consider a config that contains the mixins ``WithGPIO ++ WithTSI``. We need to instantiate the TSI serial adapter, and connect it to the ``success`` signal of our ``TestHarness``. We also need to instantiate the GPIO pins, and tie their inputs to 0 in the ``TestHarness``, since we currently cannot drive the GPIOs in simulation.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: tsi mixin
|
||||
:end-before: DOC include end: tsi mixin
|
||||
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: gpio mixin
|
||||
:end-before: DOC include end: gpio mixin
|
||||
|
||||
When ``WithGPIO ++ WithTSI`` is evaluated right to left, the call to ``up(BuildTop, site)`` in ``WithGPIO`` will reference the function defined in the ``BuildTop`` key of ``WithTSI``. Thus, at elaboration time, when the ``BuildTop`` function is called by the ``TestHarness``, first the ``BuildTop`` function in ``WithTSI`` will be evaluated. This connects the ``success`` signal of the ``TestHarness`` to the ``SerialAdapter`` enabled by ``WithTSI``. Then, the rest of the code in the ``BuildTop`` function of ``WithGPIO`` will execute, tying off the top-level GPIO input pins. Thus the evaluation of the ``BuildTop`` functions in a completed config is "right-to-left", matching how the evaluation of the mixins at compile-time is also "right-to-left".
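The ordering described above can be reproduced with ordinary function composition. The following is a simplified model in plain Scala, not the Chipyard implementation: a ``BuildTop`` function appends harness-side actions to a log, and each mixin wraps whatever ``BuildTop`` it finds further up the chain. All names in ``BuildTopDemo`` are hypothetical.

.. code-block:: scala

    import scala.collection.mutable.ListBuffer

    object BuildTopDemo {
      // A BuildTop function records the harness-side actions it performs.
      type BuildTop = ListBuffer[String] => Unit

      // A mixin maps the BuildTop it finds "up" the chain to a new BuildTop.
      type Mixin = BuildTop => BuildTop

      val baseBuildTop: BuildTop = log => log += "instantiate Top"

      val withTSI: Mixin = up => log => {
        up(log) // models the call to up(BuildTop, site)
        log += "connect SerialAdapter to success"
      }

      val withGPIO: Mixin = up => log => {
        up(log) // runs everything defined to the right first
        log += "tie GPIO inputs to 0"
      }

      // compose(a, b) models `a ++ b`: b wraps the base, then a wraps b.
      def compose(mixins: Mixin*): BuildTop =
        mixins.foldRight(baseBuildTop)((mixin, up) => mixin(up))

      // Runs a BuildTop and returns the actions in the order they happened.
      def run(buildTop: BuildTop): List[String] = {
        val log = ListBuffer.empty[String]
        buildTop(log)
        log.toList
      }
    }

Running ``BuildTopDemo.run(BuildTopDemo.compose(BuildTopDemo.withGPIO, BuildTopDemo.withTSI))`` yields the TSI action before the GPIO action, mirroring the right-to-left evaluation of ``WithGPIO ++ WithTSI``.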
|
||||
|
||||
.. warning::
|
||||
Note that in some cases, the ordering and duplication of mixins which extend ``BuildTop`` will have unintended consequences.
|
||||
For example, ``WithTSI ++ WithTSI`` will attempt to generate and connect two ``SimSerial`` widgets in the ``TestHarness``,
|
||||
which will likely break the simulation.
|
||||
In general, you should avoid attaching multiple mixins which interface to the same top-level ports.
|
||||
|
||||
.. note::
|
||||
Readers who want more information on the configuration system may be interested in reading :ref:`cdes`.
|
||||
|
||||
|
||||
142
docs/Customization/MMIO-Peripherals.rst
Normal file
@@ -0,0 +1,142 @@
|
||||
.. _mmio-accelerators:
|
||||
|
||||
MMIO Peripherals
|
||||
==================
|
||||
|
||||
The easiest way to create an MMIO peripheral is to use the ``TLRegisterRouter`` or ``AXI4RegisterRouter`` widgets, which abstract away the details of handling the interconnect protocols and provide a convenient interface for specifying memory-mapped registers. Since Chipyard and Rocket Chip SoCs primarily use TileLink as the on-chip interconnect protocol, this section will primarily focus on designing TileLink-based peripherals. However, see ``generators/example/src/main/scala/GCD.scala`` for how an example AXI4-based peripheral is defined and connected to the TileLink graph through converters.
|
||||
|
||||
To create a RegisterRouter-based peripheral, you will need to specify a parameter case class for the configuration settings, a bundle trait with the extra top-level ports, and a module implementation containing the actual RTL.
|
||||
|
||||
For this example, we will show how to connect a MMIO peripheral which computes the GCD.
|
||||
The full code can be found in ``generators/example/src/main/scala/GCD.scala``.
|
||||
|
||||
In this case we use a submodule ``GCDMMIOChiselModule`` to actually perform the GCD. The ``GCDModule`` class only creates the registers and hooks them up using ``regmap``.
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD chisel
|
||||
:end-before: DOC include end: GCD chisel
|
||||
|
||||
.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
|
||||
:language: scala
|
||||
:start-after: DOC include start: GCD instance regmap
|
||||
:end-before: DOC include end: GCD instance regmap
|
||||
Advanced Features of RegField Entries
-------------------------------------

``RegField`` exposes polymorphic ``r`` and ``w`` methods
that allow read- and write-only memory-mapped registers to be
interfaced to hardware in multiple ways.

* ``RegField.r(2, status)`` is used to create a 2-bit, read-only register that captures the current value of the ``status`` signal when read.
* ``RegField.r(params.width, gcd)`` "connects" the decoupled handshaking interface ``gcd`` to a read-only memory-mapped register. When this register is read via MMIO, the ``ready`` signal is asserted. This is in turn connected to ``output_ready`` on the GCD module through the glue logic.
* ``RegField.w(params.width, x)`` exposes a plain register via MMIO, but makes it write-only.
* ``RegField.w(params.width, y)`` associates the decoupled interface signal ``y`` with a write-only memory-mapped register, causing ``y.valid`` to be asserted when the register is written.

Since the ready/valid signals of ``y`` are connected to the
``input_ready`` and ``input_valid`` signals of the GCD module,
respectively, this register map and glue logic has the effect of
triggering the GCD algorithm when ``y`` is written. Therefore, the
algorithm is set up by first writing ``x`` and then performing a
triggering write to ``y``. Polling can be used for status checks.
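
The write-``x``, write-``y``, poll, read sequence can be sketched from the software side. The block below is a host-runnable illustration under stated assumptions, not the contents of ``tests/gcd.c``: the register offsets, the status bit positions, and the ``gcd_model`` stand-in for the peripheral are all hypothetical and should be checked against the ``regmap`` in ``GCD.scala``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register offsets and status bits; verify against the real regmap. */
enum { GCD_STATUS = 0x00, GCD_X = 0x04, GCD_Y = 0x08, GCD_GCD = 0x0C };
enum { STATUS_OUTPUT_VALID = 0x1, STATUS_INPUT_READY = 0x2 };

/* Hypothetical host-side stand-in for the peripheral (not the RTL). */
typedef struct {
    uint32_t x;        /* last value written to the x register */
    uint32_t result;   /* computed GCD, valid while output_valid is set */
    int output_valid;  /* set by a write to y, cleared by reading the result */
} gcd_model;

static uint32_t euclid(uint32_t a, uint32_t b) {
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

static void reg_write(gcd_model *dev, uint32_t offset, uint32_t value) {
    if (offset == GCD_X) {
        dev->x = value;                        /* plain write-only register */
    } else if (offset == GCD_Y) {
        dev->result = euclid(dev->x, value);   /* writing y triggers the GCD */
        dev->output_valid = 1;
    }
}

static uint32_t reg_read(gcd_model *dev, uint32_t offset) {
    if (offset == GCD_STATUS) {
        return dev->output_valid ? STATUS_OUTPUT_VALID : STATUS_INPUT_READY;
    }
    if (offset == GCD_GCD) {
        dev->output_valid = 0;                 /* the read completes the handshake */
        return dev->result;
    }
    return 0;
}

/* Driver sequence: wait until ready, write x, write y, poll, read result. */
static uint32_t gcd_via_mmio(gcd_model *dev, uint32_t x, uint32_t y) {
    assert(dev != NULL);
    while ((reg_read(dev, GCD_STATUS) & STATUS_INPUT_READY) == 0) { }
    reg_write(dev, GCD_X, x);
    reg_write(dev, GCD_Y, y);
    while ((reg_read(dev, GCD_STATUS) & STATUS_OUTPUT_VALID) == 0) { }
    return reg_read(dev, GCD_GCD);
}
```

On real hardware, ``reg_read`` and ``reg_write`` would presumably be volatile loads and stores to the peripheral's base address plus the offset. The model keeps the sequence testable without a simulator.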

Connecting by TileLink
----------------------

Once you have these classes, you can construct the final peripheral by extending the ``TLRegisterRouter`` and passing the proper arguments.
The first set of arguments determines where the register router will be placed in the global address map and what information will be put in its device tree entry.
The second set of arguments is the IO bundle constructor, which we create by extending ``TLRegBundle`` with our bundle trait.
The final set of arguments is the module constructor, which we create by extending ``TLRegModule`` with our module trait.
Notice how we can create an analogous AXI4 version of our peripheral.

.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
    :language: scala
    :start-after: DOC include start: GCD router
    :end-before: DOC include end: GCD router

Top-level Traits
----------------

After creating the module, we need to hook it up to our SoC.
Rocket Chip accomplishes this using the cake pattern.
This basically involves placing code inside traits.
In the Rocket Chip cake, there are two kinds of traits: a ``LazyModule`` trait and a module implementation trait.

The ``LazyModule`` trait runs setup code that must execute before all the hardware gets elaborated.
For a simple memory-mapped peripheral, this just involves connecting the peripheral's TileLink node to the MMIO crossbar.

.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
    :language: scala
    :start-after: DOC include start: GCD lazy trait
    :end-before: DOC include end: GCD lazy trait

Note that the ``GCDTL`` class we created from the register router is itself a ``LazyModule``.
Register routers have a TileLink node simply named "node", which we can hook up to the Rocket Chip bus.
This will automatically add address map and device tree entries for the peripheral.
Also observe how we have to place additional AXI4 buffers and converters for the AXI4 version of this peripheral.

For peripherals which instantiate a concrete module, or which need to be connected to concrete IOs or wires, a matching concrete trait is necessary. We will make our GCD example output a ``gcd_busy`` signal as a top-level port to demonstrate. In the concrete module implementation trait, we instantiate the top-level IO (a concrete object) and wire it to the IO of our lazy module.

.. literalinclude:: ../../generators/example/src/main/scala/GCD.scala
    :language: scala
    :start-after: DOC include start: GCD imp trait
    :end-before: DOC include end: GCD imp trait

Constructing the Top and Config
-------------------------------

Now we want to mix our traits into the system as a whole.
This code is from ``generators/example/src/main/scala/Top.scala``.

.. literalinclude:: ../../generators/example/src/main/scala/Top.scala
    :language: scala
    :start-after: DOC include start: Top
    :end-before: DOC include end: Top

Just as we need separate traits for ``LazyModule`` and module implementation, we need two classes to build the system.
The ``Top`` class contains the set of traits which parameterize and define the ``Top``. Typically these traits will optionally add IOs or peripherals to the ``Top``.
The ``Top`` class includes the pre-elaboration code and also a ``lazy val`` to produce the module implementation (hence ``LazyModule``).
The ``TopModule`` class is the actual RTL that gets synthesized.

Finally, we create a configuration class in ``generators/example/src/main/scala/RocketConfigs.scala`` that uses the ``WithGCD`` mixin from ``generators/example/src/main/scala/ConfigMixins.scala``.

.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
    :language: scala
    :start-after: DOC include start: GCD mixin
    :end-before: DOC include end: GCD mixin

.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
    :language: scala
    :start-after: DOC include start: GCDTLRocketConfig
    :end-before: DOC include end: GCDTLRocketConfig

Testing
-------

Now we can test that the GCD is working. The test program is in ``tests/gcd.c``.

.. literalinclude:: ../../tests/gcd.c
    :language: c

This just writes out to the registers we defined earlier.
The base of the module's MMIO region is at 0x2000 by default.
This will be printed out in the address map portion when you generate the Verilog code.
You can also see how this changes the emitted ``.json`` addressmap files in ``generated-src``.

Compiling this program with ``make`` produces a ``gcd.riscv`` executable.

Now with all of that done, we can go ahead and run our simulation.

.. code-block:: shell

    cd sims/verilator
    make CONFIG=GCDTLRocketConfig BINARY=../../tests/gcd.riscv run-binary

70
docs/Customization/RoCC-Accelerators.rst
Normal file
@@ -0,0 +1,70 @@
.. _rocc-accelerators:

Adding a RoCC Accelerator
----------------------------

RoCC accelerators are lazy modules that extend the ``LazyRoCC`` class.
Their implementation should extend the ``LazyRoCCModuleImp`` class.

.. code-block:: scala

    class CustomAccelerator(opcodes: OpcodeSet)
        (implicit p: Parameters) extends LazyRoCC(opcodes) {
      override lazy val module = new CustomAcceleratorModule(this)
    }

    class CustomAcceleratorModule(outer: CustomAccelerator)
        extends LazyRoCCModuleImp(outer) {
      val cmd = Queue(io.cmd)
      // The parts of the command are as follows
      // inst - the parts of the instruction itself
      //   opcode
      //   rd - destination register number
      //   rs1 - first source register number
      //   rs2 - second source register number
      //   funct
      //   xd - is the destination register being used?
      //   xs1 - is the first source register being used?
      //   xs2 - is the second source register being used?
      // rs1 - the value of source register 1
      // rs2 - the value of source register 2
      ...
    }
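
The ``inst`` fields listed in the comments above correspond to bit fields of the 32-bit instruction word. The following C sketch shows one way to pull them out in software, for example when inspecting a trace. It assumes the usual RoCC layout (``funct`` in bits 31:25, ``rs2`` in 24:20, ``rs1`` in 19:15, ``xd``/``xs1``/``xs2`` in bits 14/13/12, ``rd`` in 11:7, ``opcode`` in 6:0); the ``rocc_inst`` struct and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software view of the instruction fields. */
typedef struct {
    uint8_t opcode;  /* bits 6:0 */
    uint8_t rd;      /* bits 11:7, destination register number */
    uint8_t xs2;     /* bit 12, is the second source register being used? */
    uint8_t xs1;     /* bit 13, is the first source register being used? */
    uint8_t xd;      /* bit 14, is the destination register being used? */
    uint8_t rs1;     /* bits 19:15, first source register number */
    uint8_t rs2;     /* bits 24:20, second source register number */
    uint8_t funct;   /* bits 31:25 */
} rocc_inst;

/* Extract bits hi..lo (inclusive) of a 32-bit word. */
static uint32_t bits(uint32_t word, unsigned hi, unsigned lo) {
    assert(hi >= lo && hi < 32);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (word >> lo) & mask;
}

/* Split an instruction word into its fields, under the assumed layout. */
static rocc_inst rocc_decode(uint32_t word) {
    rocc_inst i;
    i.opcode = (uint8_t)bits(word, 6, 0);
    i.rd     = (uint8_t)bits(word, 11, 7);
    i.xs2    = (uint8_t)bits(word, 12, 12);
    i.xs1    = (uint8_t)bits(word, 13, 13);
    i.xd     = (uint8_t)bits(word, 14, 14);
    i.rs1    = (uint8_t)bits(word, 19, 15);
    i.rs2    = (uint8_t)bits(word, 24, 20);
    i.funct  = (uint8_t)bits(word, 31, 25);
    return i;
}
```

In hardware these fields arrive already separated on ``io.cmd``, so a decoder like this is only useful on the software or debugging side.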

The ``opcodes`` parameter for ``LazyRoCC`` is the set of custom opcodes that will map to this accelerator.
More on this in the next subsection.

The ``LazyRoCC`` class contains two ``TLOutputNode`` instances, ``atlNode`` and ``tlNode``.
The former connects into a tile-local arbiter along with the backside of the L1 instruction cache.
The latter connects directly to the L1-L2 crossbar.
The corresponding TileLink ports in the module implementation's IO bundle are ``atl`` and ``tl``, respectively.

The other interfaces available to the accelerator are ``mem``, which provides access to the L1 cache;
``ptw``, which provides access to the page-table walker;
the ``busy`` signal, which indicates when the accelerator is still handling an instruction;
and the ``interrupt`` signal, which can be used to interrupt the CPU.

Look at the examples in ``generators/rocket-chip/src/main/scala/tile/LazyRoCC.scala`` for detailed information on the different IOs.

Adding RoCC accelerator to Config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

RoCC accelerators can be added to a core by overriding the ``BuildRoCC`` parameter in the configuration.
This takes a sequence of functions producing ``LazyRoCC`` objects, one for each accelerator you wish to add.

For instance, if we wanted to add the previously defined accelerator and route custom0 and custom1 instructions to it, we could do the following.

.. code-block:: scala

    class WithCustomAccelerator extends Config((site, here, up) => {
      case BuildRoCC => Seq((p: Parameters) => LazyModule(
        new CustomAccelerator(OpcodeSet.custom0 | OpcodeSet.custom1)(p)))
    })

    class CustomAcceleratorConfig extends Config(
      new WithCustomAccelerator ++
      new RocketConfig)

To add RoCC instructions in your program, use the RoCC C macros provided in ``tests/rocc.h``. You can find examples in the files ``tests/accum.c`` and ``tests/charcount.c``.

27
docs/Customization/RoCC-or-MMIO.rst
Normal file
@@ -0,0 +1,27 @@
.. _rocc-vs-mmio:

RoCC vs MMIO
------------

Accelerators or custom IO devices can be added to your SoC in several ways:

* MMIO Peripheral (a.k.a. TileLink-Attached Accelerator)
* Tightly-Coupled RoCC Accelerator

These approaches differ in the method of communication between the processor and the custom block.

With the TileLink-Attached approach, the processor communicates with MMIO peripherals through memory-mapped registers.

In contrast, the processor communicates with RoCC accelerators through a custom protocol and custom non-standard ISA instructions reserved in the RISC-V ISA encoding space.
Each core can have up to four accelerators that are controlled by custom instructions and share resources with the CPU.
RoCC coprocessor instructions have the following form.

.. code-block:: none

    customX rd, rs1, rs2, funct

The X will be a number 0-3, and determines the opcode of the instruction, which controls which accelerator an instruction will be routed to.
The ``rd``, ``rs1``, and ``rs2`` fields are the register numbers of the destination register and two source registers.
The ``funct`` field is a 7-bit integer that the accelerator can use to distinguish different instructions from each other.

Note that communication through a RoCC interface requires a custom software toolchain, whereas MMIO peripherals can use the standard toolchain with appropriate driver support.
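
To make the instruction form concrete, the sketch below packs a ``customX rd, rs1, rs2, funct`` instruction into a 32-bit word. It is an illustration under stated assumptions rather than a toolchain component: it assumes the RISC-V custom-0 through custom-3 major opcodes (``0x0B``, ``0x2B``, ``0x5B``, ``0x7B``) and the usual RoCC layout with ``xd``, ``xs1``, and ``xs2`` flag bits at positions 14, 13, and 12. The function and table names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Major opcodes for custom0..custom3 (assumed from the RISC-V opcode map). */
static const uint8_t CUSTOM_OPCODES[4] = { 0x0B, 0x2B, 0x5B, 0x7B };

/* Pack one customX instruction; x selects custom0..custom3. The xd, xs1 and
 * xs2 flags say whether rd, rs1 and rs2 are actually used. */
static uint32_t rocc_encode(unsigned x, unsigned rd, unsigned rs1, unsigned rs2,
                            unsigned funct, int xd, int xs1, int xs2) {
    assert(x < 4 && rd < 32 && rs1 < 32 && rs2 < 32 && funct < 128);
    return ((uint32_t)funct << 25)        /* 7-bit funct field */
         | ((uint32_t)rs2 << 20)          /* second source register number */
         | ((uint32_t)rs1 << 15)          /* first source register number */
         | ((uint32_t)(xd != 0) << 14)
         | ((uint32_t)(xs1 != 0) << 13)
         | ((uint32_t)(xs2 != 0) << 12)
         | ((uint32_t)rd << 7)            /* destination register number */
         | CUSTOM_OPCODES[x];             /* selects the target accelerator */
}
```

Under these assumptions, the low seven bits of the word carry the opcode, which is what determines the accelerator an instruction is routed to.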
@@ -3,18 +3,41 @@ Customization

These guides will walk you through customization of your system-on-chip:

- Constructing heterogeneous systems-on-chip using the Chipyard generators and configuration system.
- Constructing heterogeneous systems-on-chip using the existing Chipyard generators and configuration system.

- Adding custom accelerators to your system-on-chip.
- How to include your custom Chisel sources in the Chipyard build system

Hit next to get started!
- Adding custom RoCC accelerators to an existing Chipyard core (BOOM or Rocket)

- Adding custom MMIO widgets to the Chipyard memory system by TileLink or AXI4, with custom Top-level IOs

- Standard practices for using Keys, Traits, and Configs to parameterize your design

- Customizing the memory hierarchy

- Connecting widgets which act as TileLink masters

- Adding custom blackboxed Verilog to a Chipyard design

We also provide information on:

- The boot process for Chipyard SoCs

- Examples of FIRRTL transforms used in Chipyard, and where they are specified

We recommend reading all these pages in order. Hit next to get started!

.. toctree::
    :maxdepth: 2
    :caption: Customization:

    Heterogeneous-SoCs
    Adding-An-Accelerator
    Custom-Chisel
    RoCC-or-MMIO
    RoCC-Accelerators
    MMIO-Peripherals
    Keys-Traits-Configs
    DMA-Devices
    Incorporating-Verilog-Blocks
    Memory-Hierarchy
    Boot-Process
