CSAIL Research Abstract

The UNUM Framework: Microprocessor Components with Guarded Interfaces for Architectural Exploration

Nirav Dave, Michael Pellauer & Arvind

Introduction

We present UNUM: the UNified Universal Microprocessor framework. UNUM's primary benefits are the generality and reusability of components, which facilitate architectural exploration. UNUM is currently being used to model PowerPC processors, with the goal of extending the model to a multiprocessor environment.

Motivation

Given the complexity of modern microprocessors, architects often turn to simulation to aid in design analysis. Ideally, these simulations should offer fast and accurate processor models and allow easy extension to a multiprocessor environment. They should provide a general, extensible framework for architectural exploration, and allow for component generalization and reuse.

For these models it is often desirable to use register-transfer level (RTL) descriptions. RTL code increases the designer's faith in the results and has a wide range of tools available for area and timing estimation. RTL designs can be mapped onto FPGAs for rapid simulation. Most importantly, the resulting RTL code can serve as a golden model for equivalence checking as project development progresses towards synthesis.

The disadvantage of RTL is that it can be time-consuming to develop and debug. The high level of detail makes it difficult to swap modules with different characteristics. For example, the simple act of replacing an unpipelined adder with a pipelined ALU capable of two addition operations per clock cycle can result in massive alterations to the surrounding control logic and interconnects.

The UNUM Framework

We have developed UNUM, the UNified Universal Microprocessor framework, a system for rapidly designing, modelling and testing processors in Bluespec SystemVerilog [1]. UNUM is a collection of microprocessor components which communicate through guarded interfaces. Here is an overview of UNUM's high-level design:

The front-end consists of IMem, Fetch, and Decode modules. These feed into the Computation Control Unit (CCU). The CCU is an abstraction of processor issue logic, scoreboarding, and writeback logic. UNUM itself places no restrictions on what scheme is used, or on whether instructions are issued in-order or out-of-order.
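As a rough illustration of what a guarded interface between two front-end stages might look like, the following Python sketch models a bounded queue whose methods carry explicit readiness conditions (guards), and a rule that fires only when every guard it depends on holds. This is not UNUM or Bluespec code: the names `GuardedFifo`, `try_fire`, and `transfer_rule`, and the queue depths, are assumptions made purely for illustration.

```python
class GuardedFifo:
    """Bounded queue whose methods expose explicit guards (readiness conditions)."""

    def __init__(self, depth):
        self.depth = depth
        self.items = []

    def can_enq(self):
        # Guard for enq: there must be a free slot.
        return len(self.items) < self.depth

    def can_deq(self):
        # Guard for deq: there must be at least one item.
        return len(self.items) > 0

    def enq(self, item):
        assert self.can_enq(), "enq called while its guard is false"
        self.items.append(item)

    def deq(self):
        assert self.can_deq(), "deq called while its guard is false"
        return self.items.pop(0)


def try_fire(guards, action):
    """Run the action only if every guard holds; report whether it fired."""
    if all(guard() for guard in guards):
        action()
        return True
    return False


def transfer_rule(src, dst):
    """Hypothetical rule: move one item from src to dst when both are ready."""
    return try_fire([src.can_deq, dst.can_enq], lambda: dst.enq(src.deq()))


# Hypothetical Fetch -> Decode hand-off with a one-entry decode queue.
fetch_q = GuardedFifo(depth=2)
decode_q = GuardedFifo(depth=1)
fetch_q.enq("add")
fetch_q.enq("lwz")
fired = [transfer_rule(fetch_q, decode_q) for _ in range(2)]
```

In this sketch the second attempt does not fire because the decode queue is full, and the caller needs no extra stall logic: the guards alone decide whether the transfer happens.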

The CCU issues instructions to three kinds of back-end units: the Load-Store Unit (LSU), the Branch Unit (BRU), and the Functional Units (FUs). The last is a catch-all for modules such as arithmetic and logic units. UNUM makes no assumptions about the number and capability of functional units, as this area is open to architectural exploration. UNUM also does not restrict the placement of the register file. This allows the designer to organize it as part of the CCU and do operand lookup on issue, or to place it as a datapath element accessed directly by the functional units.
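The organization described above can be sketched in a few lines of Python. The sketch assumes one particular issue policy (strictly in-order, stalling behind the first instruction whose unit cannot accept it), which is only one of the schemes UNUM permits. The class names, unit capacities, and the tiny mnemonic table are hypothetical and chosen only to keep the example small.

```python
from collections import deque

# Illustrative subset of PowerPC mnemonics; anything else is sent to an FU.
LOAD_STORE = {"lwz", "stw"}
BRANCH = {"b", "bc"}


def classify(mnemonic):
    """Pick the back-end unit class for an instruction mnemonic."""
    if mnemonic in LOAD_STORE:
        return "LSU"
    if mnemonic in BRANCH:
        return "BRU"
    return "FU"


class Unit:
    """Back-end unit with a guarded accept method."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.queue = []

    def can_accept(self):
        return len(self.queue) < self.capacity

    def accept(self, instr):
        assert self.can_accept(), "accept called while its guard is false"
        self.queue.append(instr)


class InOrderCCU:
    """Hypothetical CCU that issues strictly in program order."""

    def __init__(self, units):
        self.units = units          # maps "LSU" / "BRU" / "FU" to a Unit
        self.pending = deque()

    def enqueue(self, mnemonic):
        self.pending.append(mnemonic)

    def issue_cycle(self, width):
        """Issue up to `width` instructions, stopping at the first blocked one."""
        issued = []
        while self.pending and len(issued) < width:
            unit = self.units[classify(self.pending[0])]
            if not unit.can_accept():
                break               # in-order policy: younger instructions wait
            instr = self.pending.popleft()
            unit.accept(instr)
            issued.append((instr, unit.name))
        return issued


units = {"LSU": Unit("LSU", 1), "BRU": Unit("BRU", 1), "FU": Unit("FU", 2)}
ccu = InOrderCCU(units)
for m in ["add", "lwz", "stw", "bc"]:
    ccu.enqueue(m)
issued = ccu.issue_cycle(width=3)
```

Here the `stw` cannot issue because the single-entry LSU already holds the `lwz`, so the branch behind it also waits. An out-of-order CCU would replace only `issue_cycle`, leaving the units untouched.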

By adhering to UNUM's interfaces, the hardware designer can seamlessly swap modules with different capabilities, timing characteristics, or instruction widths, and the compiler will infer all changes to control logic [2]. The Bluespec compiler can generate either cycle-accurate C or Verilog RTL, allowing the designer to perform rapid architectural exploration that ultimately results in a usable RTL golden model.
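To make the module-swapping idea concrete, here is a small Python sketch in which an unpipelined and a pipelined adder expose the same guarded methods, so the driver that feeds them is identical in both cases. It only mimics the effect by hand: in Bluespec the scheduling logic is derived by the compiler, whereas here the driver checks the guards explicitly. All class and function names, and the two-cycle latency, are assumptions for illustration.

```python
class PipelinedAdder:
    """Adder that can start a new addition every cycle."""

    def __init__(self, latency):
        self.latency = latency      # cycles from start to result, at least 1
        self.in_flight = []         # [cycles_remaining, result] per operation
        self.done = []

    def can_start(self):
        return True

    def start(self, a, b):
        assert self.can_start(), "start called while its guard is false"
        self.in_flight.append([self.latency, a + b])

    def tick(self):
        """Advance one clock cycle and collect finished operations."""
        for op in self.in_flight:
            op[0] -= 1
        self.done += [result for left, result in self.in_flight if left == 0]
        self.in_flight = [op for op in self.in_flight if op[0] > 0]

    def can_take(self):
        return bool(self.done)

    def take(self):
        assert self.can_take(), "take called while its guard is false"
        return self.done.pop(0)


class UnpipelinedAdder(PipelinedAdder):
    """Same interface, but busy until the current addition finishes."""

    def can_start(self):
        return not self.in_flight


def run(adder, operands):
    """Driver written only against the guards; returns (results, cycles)."""
    todo = list(operands)
    results = []
    cycles = 0
    while len(results) < len(operands):
        if todo and adder.can_start():
            adder.start(*todo.pop(0))
        adder.tick()
        cycles += 1
        while adder.can_take():
            results.append(adder.take())
    return results, cycles


pairs = [(1, 2), (3, 4), (5, 6)]
slow_results, slow_cycles = run(UnpipelinedAdder(latency=2), pairs)
fast_results, fast_cycles = run(PipelinedAdder(latency=2), pairs)
```

Both runs produce the same sums, but the unpipelined unit takes six cycles and the pipelined one four. The driver did not change, which is the property the guarded interfaces are meant to provide.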

Modelling PowerPC Processors

Currently, UNUM is being used to model PowerPC processors. We have developed a reusable library of components such as the Decoder and ALU, including an ROB generalizable to any superscalar width. We have combined these components into the Librum microprocessor. Librum does not represent any specific real-world processor, but enables us to perform datapath and library component verification. We demonstrate using UNUM to perform architectural exploration, including showing how the composition of guarded atomic actions can have a direct effect on circuit timing.
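The idea of a reorder buffer (ROB) that is generic in superscalar width can be sketched as follows. This is a minimal Python model, not the UNUM library component: it assumes that `width` bounds both allocation and retirement per cycle, and the names `ReorderBuffer`, `allocate`, `complete`, and `retire` are hypothetical.

```python
from collections import deque


class ReorderBuffer:
    """In-order allocate and retire, out-of-order completion."""

    def __init__(self, size, width):
        self.size = size            # total number of entries
        self.width = width          # max entries allocated or retired per cycle
        self.entries = deque()      # [tag, done] pairs, oldest first
        self.next_tag = 0

    def free_slots(self):
        return self.size - len(self.entries)

    def allocate(self, count):
        """Allocate up to `count` entries, limited by width and free space."""
        granted = min(count, self.width, self.free_slots())
        tags = []
        for _ in range(granted):
            self.entries.append([self.next_tag, False])
            tags.append(self.next_tag)
            self.next_tag += 1
        return tags

    def complete(self, tag):
        """Mark an in-flight entry as finished; completion order is free."""
        for entry in self.entries:
            if entry[0] == tag:
                entry[1] = True
                return
        raise KeyError(tag)

    def retire(self):
        """Retire finished entries from the head, in order, up to width."""
        retired = []
        while self.entries and self.entries[0][1] and len(retired) < self.width:
            retired.append(self.entries.popleft()[0])
        return retired


rob = ReorderBuffer(size=4, width=2)
tags = rob.allocate(3)          # only two are granted in a 2-wide machine
rob.complete(1)                 # the younger instruction finishes first
early = rob.retire()            # nothing retires: the head is not done
rob.complete(0)
late = rob.retire()             # both retire, in program order
```

Changing `width` is the only edit needed to model a wider or narrower machine in this sketch, which mirrors the kind of generality the library component is described as having.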

There is a close correspondence between UNUM's organization and that of an out-of-order ROB-based processor. For example, the PowerPC 603e is an out-of-order superscalar processor capable of issuing 3 instructions per cycle. Its organization can be modelled in UNUM by using the CCU as the Dispatch and Completion units.

The framework can also be used to model in-order pipelines, although there is less correspondence. Initial results show that an embedded in-order pipeline such as the PowerPC 405 can be modelled with 85% code reuse.

In the future we wish to extend UNUM to support multiprocessor environments and cache coherence protocols. We plan to leverage UNUM's RTL generation to use FPGAs for rapid architectural exploration. We are also looking for ways to add support for formal verification.

Research Support

Funding for this work has been provided by IBM under agreement number W0133890 as part of DARPA's PERCS project.

References

[1] Hoe and Arvind. Operation-Centric Hardware Description and Synthesis. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, September 2004.

[2] Arvind, Nikhil, Rosenband and Dave. High-level Synthesis: An Essential Ingredient for Designing Complex ASICs. In Proceedings of ICCAD, 2004.
