Merge branch 'dev' into howie-docs
@@ -2,7 +2,14 @@ Hwacha
====================================

The Hwacha project is developing a new vector architecture for future computer systems that are constrained in their power and energy consumption.
Inspired by traditional vector machines from the 70s and 80s, and lessons learned from our previous vector-thread architectures Scale and Maven, we are bringing back elegant, performant, and energy-efficient aspects of vector processing to modern data-parallel architectures.
We propose a new vector-fetch architectural paradigm, which focuses on the following aspects for higher performance, better energy efficiency, and lower complexity.
The Hwacha project is inspired by traditional vector machines from the 70s and 80s, and lessons learned from our previous vector-thread architectures such as Scale and Maven.
The Hwacha project includes the Hwacha microarchitecture generator, as well as the ``XHwacha`` non-standard RISC-V extension. Hwacha does not implement the RISC-V standard vector extension proposal.

For more information, please visit the `Hwacha website <http://hwacha.org/>`__.
For more information on the Hwacha project, please visit the `Hwacha website <http://hwacha.org/>`__.

To add the Hwacha vector unit to an SoC, you should add the ``hwacha.DefaultHwachaConfig`` config mixin to the SoC configurations. The Hwacha vector unit uses the RoCC port of a Rocket or BOOM `tile`, and by default connects to the memory system through the `System Bus` (i.e., directly to the L2 cache).

To change the configuration of the Hwacha vector unit, you can write a custom configuration to replace the ``DefaultHwachaConfig``. You can view the ``DefaultHwachaConfig`` under `generators/hwacha/src/main/scala/configs.scala <https://github.com/ucb-bar/hwacha/blob/master/src/main/scala/configs.scala>`__ to see the possible configuration parameters.
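
The idea of a custom configuration that replaces or overrides a default one can be pictured as layered parameter lookup. The Python sketch below is only an analogy and is not Chipyard or Hwacha code: real configs are written in Scala, and the parameter names used here (``n_lanes``, ``n_banks``, ``use_l2``) are hypothetical placeholders, not actual Hwacha parameters. It also assumes that a fragment listed first shadows the ones listed after it.

```python
from collections import ChainMap

# Hypothetical default parameters; NOT the real Hwacha parameter names.
DEFAULT_VECTOR_UNIT = {"n_lanes": 1, "n_banks": 4, "use_l2": True}

# A custom fragment only lists the parameters it wants to change.
CUSTOM_VECTOR_UNIT = {"n_lanes": 2}


def build_config(*fragments):
    """Layer fragments so that earlier ones shadow later ones."""
    return ChainMap(*fragments)


config = build_config(CUSTOM_VECTOR_UNIT, DEFAULT_VECTOR_UNIT)
print(config["n_lanes"])  # taken from the custom fragment
print(config["n_banks"])  # falls through to the default
```

For the real parameter names and defaults, ``configs.scala`` in the Hwacha generator remains the reference.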

Since Hwacha implements a non-standard RISC-V extension, it requires a unique software toolchain to be able to compile and assemble its vector instructions.
To install the Hwacha toolchain, run the ``./scripts/build-toolchains.sh esp-tools`` command within the root Chipyard directory. This may take a while, and it will install the ``esp-tools-install`` directory within your Chipyard root directory. ``esp-tools`` is a fork of ``riscv-tools`` (formerly a collection of relevant RISC-V software tools) that was enhanced with additional non-standard vector instructions. However, due to the upstreaming of the equivalent RISC-V toolchains, ``esp-tools`` may not be up-to-date with the latest mainline version of the tools included in it.
@@ -1,15 +1,15 @@
RocketChip
==========
Rocket Chip
===========

RocketChip is an SoC generator developed at Berkeley and now supported by
SiFive. Chipyard uses RocketChip as the basis for producing a RISC-V SoC.
The Rocket Chip generator is an SoC generator developed at Berkeley and now supported by
SiFive. Chipyard uses the Rocket Chip generator as the basis for producing a RISC-V SoC.

RocketChip is distinct from Rocket, the in-order RISC-V CPU generator.
RocketChip includes many parts of the SoC besides the CPU. Though RocketChip
uses Rocket CPUs by default, it can also be configured to use the BOOM
`Rocket Chip` is distinct from `Rocket core`, the in-order RISC-V CPU generator.
Rocket Chip includes many parts of the SoC besides the CPU. Though Rocket Chip
uses Rocket core CPUs by default, it can also be configured to use the BOOM
out-of-order core generator or some other custom CPU generator instead.

A detailed diagram of a typical RocketChip system is shown below.
A detailed diagram of a typical Rocket Chip system is shown below.

.. image:: ../_static/images/rocketchip-diagram.png
@@ -1,8 +1,9 @@
Rocket
Rocket Core
====================================

`Rocket <https://github.com/freechipsproject/rocket-chip>`__ is a 5-stage in-order scalar core generator that is supported by `SiFive <https://www.sifive.com/>`__.
It supports the open source RV64GC RISC-V instruction set and is written in the Chisel hardware construction language.
`Rocket <https://github.com/freechipsproject/rocket-chip>`__ is a 5-stage in-order scalar processor core generator, originally developed at UC Berkeley and currently supported by `SiFive <https://www.sifive.com/>`__. The `Rocket core` is used as a component within the `Rocket Chip SoC generator`. A Rocket core combined with L1 caches (data and instruction caches) forms a `Rocket tile`. The `Rocket tile` is the replicable component of the `Rocket Chip SoC generator`.

The Rocket core supports the open-source RV64GC RISC-V instruction set and is written in the Chisel hardware construction language.
It has an MMU that supports page-based virtual memory, a non-blocking data cache, and a front-end with branch prediction.
Branch prediction is configurable and provided by a branch target buffer (BTB), branch history table (BHT), and a return address stack (RAS).
For floating-point, Rocket makes use of Berkeley’s Chisel implementations of floating-point units.
80
docs/Generators/SHA3.rst
Normal file
@@ -0,0 +1,80 @@
SHA3 RoCC Accelerator
===================================
The SHA3 accelerator is a basic RoCC accelerator for the SHA3 hashing algorithm.
We like using SHA3 in Chipyard tutorial content because it is a self-contained, simple
example of integrating a custom accelerator into Chipyard.


Introduction
-----------------------------------
Secure hashing algorithms represent a class of hashing functions that provide four attributes: ease
of hash computation, inability to generate the message from the hash (one-way property), inability
to change the message and not the hash (weakly collision-free property), and inability to find
two messages with the same hash (strongly collision-free property). The National Institute of
Standards and Technology (NIST) recently held a competition for a new algorithm to be added to
its set of Secure Hashing Algorithms (SHA). In 2012 the winner was determined to be the Keccak
hashing function and a rough specification for SHA3 was established. The algorithm operates on
variable-length messages with a sponge function, and thus alternates between absorbing chunks of
the message into a set of state bits and permuting the state. The absorbing is a simple bitwise
XOR while the permutation is a more complex function composed of several operations, χ, θ, ρ,
π, ι, that all perform various bitwise operations, including rotations, parity calculations, XORs,
etc. The Keccak hashing function is parameterized for different sizes of state and message chunks
but for this accelerator we will only support the Keccak-256 variant with 1600 bits of state and
1088-bit message chunks. A diagram of the SHA3 accelerator is shown below.

.. image:: ../_static/images/sha3.png
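
The absorb-then-permute structure described in the introduction can be sketched in software. The following Python model is an illustration only and is not derived from the accelerator's RTL: it implements the Keccak-f[1600] permutation and a sponge with 1088-bit chunks, and checks itself against ``hashlib.sha3_256``. The padding suffix is an assumption: ``0x06`` is the suffix of standardized SHA3-256, while ``0x01`` gives the original Keccak-256 padding; this document does not say which one the accelerator uses.

```python
import hashlib

MASK64 = (1 << 64) - 1
STATE_BYTES = 1600 // 8   # 200 bytes of state
RATE_BYTES = 1088 // 8    # 136-byte message chunks


def rol64(value, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((value << n) | (value >> (64 - n))) & MASK64


def keccak_f1600(lanes):
    """Apply 24 rounds to a 5x5 grid lanes[x][y] of 64-bit integers."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each lane with the parity of two neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR a round constant (generated by an LFSR) into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def permute(state):
    """Permute a 200-byte state (little-endian lanes, index x + 5*y)."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = keccak_f1600(lanes)
    out = bytearray(STATE_BYTES)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def sponge_hash(message, suffix=0x06, out_len=32):
    """Absorb the padded message chunk by chunk, then squeeze out_len bytes."""
    padded = bytearray(message)
    padded += bytes(RATE_BYTES - len(padded) % RATE_BYTES)
    padded[len(message)] ^= suffix   # first padding byte (domain suffix)
    padded[-1] ^= 0x80               # last padding bit
    state = bytearray(STATE_BYTES)
    for offset in range(0, len(padded), RATE_BYTES):
        for i in range(RATE_BYTES):  # absorb: bitwise XOR into the state
            state[i] ^= padded[offset + i]
        state = permute(state)       # permute after every chunk
    return bytes(state[:out_len])


msg = b"Chipyard" * 40  # 320 bytes, i.e. three chunks after padding
assert sponge_hash(msg) == hashlib.sha3_256(msg).digest()
print(sponge_hash(msg).hex())
```

Only the first 1088 bits of the state are touched by the XOR; the remaining 512 bits are changed solely by the permutation.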

Technical Details
------------------------------------
The accelerator is designed around three subsystems: an
interface with the processor, an interface with memory, and
the actual hashing computation system. The interface
with the processor is designed using the RoCC interface for
coprocessors integrating with the RISC-V Rocket/BOOM
processor. It includes the ability to transfer two 64-bit
words to the coprocessor, the request for a return value,
and a small field for the function requested. The accelerator
receives these requests using a ready/valid interface. The
RoCC instruction is parsed and the needed information is
stored into an execution context. The execution context contains
the memory address of the message being hashed, the memory address
to store the resulting hash in, the length of the message, and
several other control fields.
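
The processor-side fields mentioned above (two source operands, a flag requesting a return value, and a small function field) map onto the bit layout of a RoCC instruction word. The Python sketch below packs and unpacks that layout. Two things are assumptions rather than facts from this document: the use of the ``custom-0`` major opcode, and the example function code ``FUNCT_SET_LENGTH``, which is a hypothetical value and not the accelerator's actual command encoding.

```python
CUSTOM0_OPCODE = 0b0001011   # RISC-V custom-0 major opcode (assumed here)
FUNCT_SET_LENGTH = 1         # hypothetical function code, for illustration


def encode_rocc(funct7, rs1, rs2, rd, xd, xs1, xs2, opcode=CUSTOM0_OPCODE):
    """Pack RoCC instruction fields into a 32-bit instruction word."""
    assert 0 <= funct7 < 128 and 0 <= opcode < 128
    assert all(0 <= reg < 32 for reg in (rs1, rs2, rd))
    return (funct7 << 25 | rs2 << 20 | rs1 << 15
            | int(xd) << 14 | int(xs1) << 13 | int(xs2) << 12
            | rd << 7 | opcode)


def decode_rocc(word):
    """Unpack a 32-bit instruction word into its RoCC fields."""
    return {
        "funct7": (word >> 25) & 0x7F,   # function requested
        "rs2": (word >> 20) & 0x1F,      # second source register number
        "rs1": (word >> 15) & 0x1F,      # first source register number
        "xd": bool((word >> 14) & 1),    # core expects a return value in rd
        "xs1": bool((word >> 13) & 1),   # rs1 value is sent to the accelerator
        "xs2": bool((word >> 12) & 1),   # rs2 value is sent to the accelerator
        "rd": (word >> 7) & 0x1F,        # destination register number
        "opcode": word & 0x7F,
    }


# Send two 64-bit operands (registers x10 and x11) and request no return value.
word = encode_rocc(FUNCT_SET_LENGTH, rs1=10, rs2=11, rd=0,
                   xd=False, xs1=True, xs2=True)
print(hex(word))
print(decode_rocc(word))
```

The instruction word only carries register numbers; the two 64-bit operand values themselves travel alongside it on the ready/valid command interface.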

Once the execution context is valid, the memory subsystem
begins to fetch chunks of the message. The memory
subsystem is fully decoupled from the other subsystems
and maintains a single full-round memory buffer.
The accelerator's memory interface can provide a
maximum of one 64-bit word per cycle, which corresponds
to 17 requests needed to fill a buffer (the size is dictated by
the SHA3 algorithm). Memory requests to fill this buffer
are sent out as rapidly as the memory interface can handle,
with a tag field set to allow the different memory buffer
requests to be distinguished, as they may be returned out of
order. Once the memory subsystem has filled a buffer, the
control unit absorbs the buffer into the execution
context, at which point the execution context is free to
begin permutation, and the memory buffer is free to send
more memory requests.
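
The figure of 17 requests follows from dividing the 1088-bit chunk by the 64-bit word size. The Python sketch below models that arithmetic together with the role of the tag field: each request carries a tag, responses may arrive in any order, and the tag selects the buffer slot. It is a behavioural illustration under stated assumptions, not a model of the actual RTL; in particular, using the word index as the tag is an assumption.

```python
import random

CHUNK_BITS = 1088
WORD_BITS = 64
WORDS_PER_BUFFER = CHUNK_BITS // WORD_BITS   # 17 requests per buffer


def fill_buffer(memory, base_word, seed=0):
    """Fetch one chunk from `memory`, tolerating out-of-order responses."""
    # One request per 64-bit word; the tag is assumed to be the word index.
    requests = [(tag, base_word + tag) for tag in range(WORDS_PER_BUFFER)]
    # The memory system answers every request, but in an arbitrary order.
    responses = [(tag, memory[address]) for tag, address in requests]
    random.Random(seed).shuffle(responses)
    # The tag tells the buffer where each returning word belongs.
    buffer = [None] * WORDS_PER_BUFFER
    for tag, data in responses:
        buffer[tag] = data
    return buffer


memory = [word * 0x0101 for word in range(64)]   # toy memory of 64-bit words
chunk = fill_buffer(memory, base_word=17)
print(WORDS_PER_BUFFER, chunk == memory[17:34])
```

Because the slot is chosen by tag rather than by arrival order, the buffer contents are the same for every response ordering.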

After the buffer is absorbed, the hashing computation
subsystem begins the permutation operations. Once
the message is fully hashed, the hash is written to memory
with a simple state machine.


Using a SHA3 Accelerator
------------------------
Since the SHA3 accelerator is designed as a RoCC accelerator,
it can be mixed into a Rocket or BOOM core by overriding the
``BuildRoCC`` key. The configuration mixin is defined in the SHA3
generator. An example configuration highlighting the use of
this mixin is shown here:

.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
    :language: scala
    :start-after: DOC include start: Sha3Rocket
    :end-before: DOC include end: Sha3Rocket
45
docs/Generators/SiFive-Generators.rst
Normal file
@@ -0,0 +1,45 @@
SiFive Generators
==================

Chipyard includes several open-source generators developed and maintained by `SiFive <https://www.sifive.com/>`__.
These are currently organized within two submodules named ``sifive-blocks`` and ``sifive-cache``.

Last-Level Cache Generator
-----------------------------

``sifive-cache`` includes a last-level cache generator. The Chipyard framework uses this last-level cache as an L2 cache. To use this L2 cache, you should add the ``freechips.rocketchip.subsystem.WithInclusiveCache`` mixin to your SoC configuration.
To learn more about configuring this L2 cache, please refer to the :ref:`memory-hierarchy` section.


Peripheral Devices
-------------------
``sifive-blocks`` includes multiple peripheral device generators, such as UART, SPI, PWM, JTAG, GPIO and more.

These peripheral devices usually affect the memory map of the SoC, and its top-level IO as well.
To integrate one of these devices in your SoC, you will need to define a custom mixin with the appropriate address for the device using the Rocket Chip parameter system. As an example, for a GPIO device you could add the following mixin to set the GPIO address to ``0x10012000``. This address is the start address for the GPIO configuration registers.

.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
    :language: scala
    :start-after: DOC include start: WithGPIO
    :end-before: DOC include end: WithGPIO

Additionally, if the device requires top-level IOs, you will need to define a mixin to change the top-level configuration of your SoC.
When adding a top-level IO, you should also be aware of whether it interacts with the test-harness.
For example, a GPIO device would require a GPIO pin, and therefore we would write a mixin to augment the top level as follows:

.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
    :language: scala
    :start-after: DOC include start: WithGPIOTop
    :end-before: DOC include end: WithGPIOTop

This example instantiates a top-level module that includes GPIO ports (``TopWithGPIO``), and then ties off the GPIO port inputs to 0 (``false.B``).


Finally, you add the relevant config mixin to the SoC config. For example:

.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
    :language: scala
    :start-after: DOC include start: GPIORocketConfig
    :end-before: DOC include end: GPIORocketConfig

Some of the devices in ``sifive-blocks`` (such as GPIO) may already have pre-defined mixins within the Chipyard example project. You may be able to use these config mixins directly, but you should be aware of their addresses within the SoC address map.
@@ -1,19 +1,28 @@
.. _generator-index:

Generators
============================

Generator can be thought of as a generalized RTL design, written using a mix of meta-programming and standard RTL.
A Generator can be thought of as a generalized RTL design, written using a mix of meta-programming and standard RTL.
This type of meta-programming is enabled by the Chisel hardware description language (see :ref:`Chisel`).
A standard RTL design is essentially just a single instance of a design coming from a generator.
However, by using meta-programming and parameter systems, generators can allow for integration of complex hardware designs in automated ways.
The following pages introduce the generators integrated with the Chipyard framework.

Chipyard bundles the source code for the generators under the ``generators`` directory.
It builds them from source each time (although the build system will cache results if they have not changed),
so changes to the generators themselves will automatically be used when building with Chipyard and propagate to software simulation, FPGA-accelerated simulation, and VLSI flows.


.. toctree::
    :maxdepth: 2
    :caption: Generators:

    Rocket-Chip
    Rocket
    BOOM
    Hwacha
    RocketChip
    IceNet
    TestChipIP
    SiFive-Generators
    SHA3