Merge branch 'dev' into docs-generators-change

This commit is contained in:
Brendan Sweeney
2019-09-25 20:03:05 -07:00
committed by GitHub
99 changed files with 3850 additions and 1040 deletions

View File

@@ -2,7 +2,7 @@ Berkeley Out-of-Order Machine (BOOM)
==============================================
The `Berkeley Out-of-Order Machine (BOOM) <https://boom-core.org/>`__ is a synthesizable and parameterizable open source RV64GC RISC-V core written in the Chisel hardware construction language.
It serves as a drop-in replacement to the Rocket core given by Rocket Chip.
It serves as a drop-in replacement to the Rocket core given by Rocket Chip (replaces the RocketTile with a BoomTile).
BOOM is heavily inspired by the MIPS R10k and the Alpha 21264 out-of-order processors.
Like the R10k and the 21264, BOOM is a unified physical register file design (also known as “explicit register renaming”).
Conceptually, BOOM is broken up into 10 stages: Fetch, Decode, Register Rename, Dispatch, Issue, Register Read, Execute, Memory, Writeback and Commit.

View File

@@ -2,7 +2,14 @@ Hwacha
====================================
The Hwacha project is developing a new vector architecture for future computer systems that are constrained in their power and energy consumption.
Inspired by traditional vector machines from the 70s and 80s, and lessons learned from our previous vector-thread architectures Scale and Maven, we are bringing back elegant, performant, and energy-efficient aspects of vector processing to modern data-parallel architectures.
We propose a new vector-fetch architectural paradigm, which focuses on the following aspects for higher performance, better energy efficiency, and lower complexity.
The Hwacha project is inspired by traditional vector machines from the 70s and 80s, and lessons learned from our previous vector-thread architectures such as Scale and Maven
The Hwacha project includes the Hwacha microarchitecture generator, as well as the ``XHwacha`` non-standard RISC-V extension. Hwacha does not implement the RISC-V standard vector extension proposal.
For more information, please visit the `Hwacha website <http://hwacha.org/>`__.
For more information on the Hwacha project, please visit the `Hwacha website <http://hwacha.org/>`__.
To add the Hwacha vector unit to an SoC, you should add the ``hwacha.DefaultHwachaConfig`` config mixin to the SoC configurations. The Hwacha vector unit uses the RoCC port of a Rocket or BOOM `tile`, and by default connects to the memory system through the `System Bus` (i.e., directly to the L2 cache).
To change the configuration of the Hwacha vector unit, you can write a custom configuration to replace the ``DefaultHwachaConfig``. You can view the ``DefaultHwachaConfig`` under `generators/hwacha/src/main/scala/configs.scala <https://github.com/ucb-bar/hwacha/blob/master/src/main/scala/configs.scala>`__ to see the possible configuration parameters.
Since Hwacha implements a non-standard RISC-V extension, it requires a unique software toolchain to be able to compile and asseble its vector instructions.
To install the Hwacha toolchain, run the ``./scripts/build-toolchains.sh esp-tools`` command within the root Chipyard directory. This may take a while, and it will install the ``esp-tools-install`` directory within your Chipyard root directory. ``esp-tools`` is a fork of ``riscv-tools`` (formerly a collection of relevant software RISC-V tools) that was enhanced with additional non-standard vector instructions. However, due to the upstreaming of the equivalent RISC-V toolchains, ``esp-tools`` may not be up-to-date with the latest mainline version of the tools included in it.

View File

@@ -0,0 +1,72 @@
Rocket Chip
===========
Rocket Chip generator is an SoC generator developed at Berkeley and now supported by
SiFive. Chipyard uses the Rocket Chip generator as the basis for producing a RISC-V SoC.
`Rocket Chip` is distinct from `Rocket core`, the in-order RISC-V CPU generator.
Rocket Chip includes many parts of the SoC besides the CPU. Though Rocket Chip
uses Rocket core CPUs by default, it can also be configured to use the BOOM
out-of-order core generator or some other custom CPU generator instead.
A detailed diagram of a typical Rocket Chip system is shown below.
.. image:: ../_static/images/rocketchip-diagram.png
Tiles
-----
The diagram shows a dual-core ``Rocket`` system. Each ``Rocket`` core is
grouped with a page-table walker, L1 instruction cache, and L1 data cache into
a ``RocketTile``.
The ``Rocket`` core can also be swapped for a ``BOOM`` core. Each tile can
also be configured with a RoCC accelerator that connects to the core as a
coprocessor.
Memory System
-------------
The tiles connect to the ``SystemBus``, which connect it to the L2 cache banks.
The L2 cache banks then connect to the ``MemoryBus``, which connects to the
DRAM controller through a TileLink to AXI converter.
To learn more about the memory hierarchy, see :ref:`Memory Hierarchy`.
MMIO
----
For MMIO peripherals, the ``SystemBus`` connects to the ``ControlBus`` and ``PeripheryBus``.
The ``ControlBus`` attaches standard peripherals like the BootROM, the
Platform-Level Interrupt Controller (PLIC), the core-local interrupts (CLINT),
and the Debug Unit.
The BootROM contains the first stage bootloader, the first instructions to run
when the system comes out of reset. It also contains the Device Tree, which is
used by Linux to determine what other peripherals are attached.
The PLIC aggregates and masks device interrupts and external interrupts.
The core-local interrupts include software interrupts and timer interrupts for
each CPU.
The Debug Unit is used to control the chip externally. It can be used to load
data and instructions to memory or pull data from memory. It can be controlled
through a custom DMI or standard JTAG protocol.
The ``PeripheryBus`` attaches additional peripherals like the NIC and Block Device.
It can also optionally expose an external AXI4 port, which can be attached to
vendor-supplied AXI4 IP.
To learn more about adding MMIO peripherals, check out the :ref:`MMIO Peripheral`
section of :ref:`Adding an Accelerator/Device`.
DMA
---
You can also add DMA devices that read and write directly from the memory
system. These are attached to the ``FrontendBus``. The ``FrontendBus`` can also
connect vendor-supplied AXI4 DMA devices through an AXI4 to TileLink converter.
To learn more about adding DMA devices, see the :ref:`Adding a DMA port` section
of :ref:`Adding an Accelerator/Device`.

View File

@@ -1,8 +1,9 @@
Rocket
Rocket Core
====================================
`Rocket <https://github.com/freechipsproject/rocket-chip>`__ is a 5-stage in-order scalar core generator that is supported by `SiFive <https://www.sifive.com/>`__.
It supports the open source RV64GC RISC-V instruction set and is written in the Chisel hardware construction language.
`Rocket <https://github.com/freechipsproject/rocket-chip>`__ is a 5-stage in-order scalar processor core generator, originally developed at UC Berkeley and currently supported by `SiFive <https://www.sifive.com/>`__. The `Rocket core` is used as a component within the `Rocket Chip SoC generator`. A Rocket core combined with L1 caches (data and instruction caches) form a `Rocket tile`. The `Rocket tile` is the replicable component of the `Rocket Chip SoC generator`.
The Rocket core supports the open-source RV64GC RISC-V instruction set and is written in the Chisel hardware construction language.
It has an MMU that supports page-based virtual memory, a non-blocking data cache, and a front-end with branch prediction.
Branch prediction is configurable and provided by a branch target buffer (BTB), branch history table (BHT), and a return address stack (RAS).
For floating-point, Rocket makes use of Berkeleys Chisel implementations of floating-point units.

80
docs/Generators/SHA3.rst Normal file
View File

@@ -0,0 +1,80 @@
SHA3 RoCC Accelerator
===================================
The SHA3 accelerator is a basic RoCC accelerator for the SHA3 hashing algorithm.
We like using SHA3 in Chipyard tutorial content because it is a self-contained, simple
example of integrating a custom accelerator into Chipyard.
Introduction
-----------------------------------
Secure hashing algorithms represent a class of hashing functions that provide four attributes: ease
of hash computation, inability to generate the message from the hash (one-way property), inability
to change the message and not the hash (weakly collision free property), and inability to find
two messages with the same hash (strongly collision free property). The National Institute of
Standards and Technology (NIST) recently held a competition for a new algorithm to be added to
its set of Secure Hashing Algorithms (SHA). In 2012 the winner was determined to be the Keccak
hashing function and a rough specification for SHA3 was established. The algorithm operates on
variable length messages with a sponge function, and thus alternates between absorbing chunks of
the message into a set of state bits and permuting the state. The absorbing is a simple bitwise
XOR while the permutation is a more complex function composed of several operations, χ, θ, ρ,
π, ι, that all perform various bitwise operations, including rotations, parity calculations, XORs,
etc. The Keccak hashing function is parameterized for different sizes of state and message chunks
but for this accelerator we will only support the Keccak-256 variant with 1600 bits of state and
1088 bit message chunks. A diagram of the SHA3 accelerator is shown below.
.. image:: ../_static/images/sha3.png
Technical Details
------------------------------------
The accelerator is designed around three sub-systems, an
interface with the processor, an interface with memory, and
the actual hashing computation system. The interface
with the processor is designed using the ROCC interface for
coprocessors integrating with the RISC-V Rocket/BOOM
processor. It includes the ability to transfer two 64 bit
words to the co-processor, the request for a return value,
and a small field for the function requested. The accelerator
receives these requests using a ready/valid interface. The
ROCC instruction is parsed and the needed information is
stored into a execution context. The execution context contains
the memory address of the message being hashed, the memory address
to store the resulting hash in, the length of the message, and
several other control fields.
Once the execution context is valid the memory subsystem
then begins to fetch chunks of the message. The memory
subsystem is fully decoupled from the other subsystems
and maintains a single full round memory buffers.
The accelerators memory interface can provide a
maximum of one 64 bit word per cycle which corresponds
to 17 requests needed to fill a buffer (the size is dictated by
the SHA3 algorithm). Memory requests to fill these buffers
are sent out as rapidly as the memory interface can handle,
with a tag field set to allow the different memory buffers
requests to be distinguished, as they may be returned out of
order. Once the memory subsystem has filled a buffer the
control unit absorbs the buffer into the execution
context, at which point the execution context is free to
begin permutation, and the memory buffer is free to send
more memory requests.
After the buffer is absorbed, the hashing computation
subsystem begins the permutation operations. Once
the message is fully hashed, the hash is written to memory
with a simple state machine.
Using a SHA3 Accelerator
------------------------
Since the SHA3 accelerator is designed as a RoCC accelerator,
it can be mixed into a Rocket or BOOM core by overriding the
BuildRoCC key. The configuration mixin is defined in the SHA3
generator. An example configuration highlighting the use of
this mixin is shown here:
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: Sha3Rocket
:end-before: DOC include end: Sha3Rocket

View File

@@ -0,0 +1,45 @@
SiFive Generators
==================
Chipyard includes several open-source generators developed and maintained by `SiFive <https://www.sifive.com/>`__.
These are currently organized within two submodules named ``sifive-blocks`` and ``sifive-cache``.
Last-Level Cache Generator
-----------------------------
``sifive-cache`` includes last-level cache geneator. The Chipyard framework uses this last-level cache as an L2 cache. To use this L2 cache, you should add the ``freechips.rocketchip.subsystem.WithInclusiveCache`` mixin to your SoC configuration.
To learn more about configuring this L2 cache, please refer to the :ref:`memory-hierarchy` section.
Peripheral Devices
-------------------
``sifive-blocks`` includes multiple peripheral device generators, such as UART, SPI, PWM, JTAG, GPIO and more.
These peripheral devices usually affect the memory map of the SoC, and its top-level IO as well.
To integrate one of these devices in your SoC, you will need to define a custom mixin with the approriate address for the device using the Rocket Chip parameter system. As an example, for a GPIO device you could add the following mixin to set the GPIO address to ``0x10012000``. This address is the start address for the GPIO configuration registers.
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: WithGPIO
:end-before: DOC include end: WithGPIO
Additionally, if the device requires top-level IOs, you will need to define a mixin to change the top-level configuration of your SoC.
When adding a top-level IO, you should also be aware of whether it interacts with the test-harness.
For example, a GPIO device would require a GPIO pin, and therefore we would write a mixin to augment the top level as follows:
.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala
:language: scala
:start-after: DOC include start: WithGPIOTop
:end-before: DOC include end: WithGPIOTop
This example instantiates a top-level module with include GPIO ports (``TopWithGPIO``), and then ties-off the GPIO port inputs to 0 (``false.B``).
Finally, you add the relevant config mixin to the SoC config. For example:
.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala
:language: scala
:start-after: DOC include start: GPIORocketConfig
:end-before: DOC include end: GPIORocketConfig
Some of the devices in ``sifive-blocks`` (such as GPIO) may already have pre-defined mixins within the Chipyard example project. You may be able to use these config mixins directly, but you should be aware of their addresses within the SoC address map.

View File

@@ -18,7 +18,9 @@ so changes to the generators themselves will automatically be used when building
:maxdepth: 2
:caption: Generators:
Rocket-Chip
Rocket
BOOM
Hwacha
SiFive-Generators
SHA3