Abstracts - 2006
Software Restart Markers
Mark Hampton & Krste Asanovic
As designers continue to push for higher performance in modern processors, exception handling mechanisms are increasingly becoming an obstacle to future progress. The precise exception model--which has been overwhelmingly popular due to its conceptual simplicity--has led to the use of various hardware structures that may constrain cycle time, significantly increase energy consumption, and limit processor scalability. The situation is exacerbated in highly parallel machines with large quantities of programmer-visible state, such as VLIW or vector processors. Because of the difficulty of supporting precise exceptions in these types of designs, they have not been widely used in general-purpose processing, as precise exception mechanisms have traditionally been needed to enable virtual memory. In this project, we are developing an exception framework designed to alleviate the hardware requirements typically associated with precise exceptions.
Implementing a precise exception model effectively requires instructions to be committed in program order, which is equivalent to placing a trap barrier on each instruction. Our approach to handling exceptions is to allow software to explicitly mark points in the instruction stream where restart is required rather than having an implicit restart point on every instruction (Figure 1). We encode restart points by marking the last instruction in a restart region as a barrier instruction. This trap barrier will commit and irrevocably update machine state only once it is guaranteed that neither it nor any preceding instruction will raise an exception. The barrier instruction also ensures that if an exception does occur before it commits, the effects of following instructions will not be visible. After handling a trap, the operating system resumes execution at the beginning of the restart region containing the instruction that raised the exception--it is the compiler's responsibility to ensure that livelock does not occur.
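The restart behavior described above can be sketched with a small software model. This is only an illustration, not the hardware mechanism itself: the `ToyMachine` class, its `load`/`add`/`store` operations, the four-word page size, and the page-fault handler that simply maps the missing page are all hypothetical. The sketch also assumes the region is safe to re-execute from its first instruction (it never overwrites its own inputs), a property the abstract leaves to the compiler.

```python
class PageFault(Exception):
    """Raised by the toy machine when an access touches an unmapped page."""

    def __init__(self, page):
        super().__init__(f"page {page} is not mapped")
        self.page = page


class ToyMachine:
    PAGE_SIZE = 4  # words per page (arbitrary value chosen for illustration)

    def __init__(self, memory, mapped_pages):
        self.memory = list(memory)
        self.mapped = set(mapped_pages)
        self.regs = {}
        self.restarts = 0  # how many times a region was re-executed

    def _check(self, addr):
        page = addr // self.PAGE_SIZE
        if page not in self.mapped:
            raise PageFault(page)

    def load(self, dst, addr):
        self._check(addr)
        self.regs[dst] = self.memory[addr]

    def add(self, dst, a, b):
        self.regs[dst] = self.regs[a] + self.regs[b]

    def store(self, addr, src):
        self._check(addr)
        self.memory[addr] = self.regs[src]

    def run_region(self, region):
        """Execute one restart region; on a trap, resume at its first instruction."""
        while True:
            try:
                for op, *args in region:
                    getattr(self, op)(*args)
                return  # the last instruction (the barrier) completed without a trap
            except PageFault as fault:
                self.mapped.add(fault.page)  # stand-in for the OS trap handler
                self.restarts += 1           # restart from the top of the region


def run_example():
    # 12 words of memory (3 pages); only page 0 is mapped initially.
    machine = ToyMachine(memory=range(10, 22), mapped_pages={0})
    region = [
        ("load", "r1", 1),          # page 0: mapped
        ("load", "r2", 5),          # page 1: faults on the first attempt
        ("add", "r3", "r1", "r2"),
        ("store", 9, "r3"),         # page 2: faults; last instruction, i.e. the barrier
    ]
    machine.run_region(region)
    return machine
```

In this model the region is executed three times in total: once up to the faulting load, once up to the faulting store, and once to completion. No per-instruction restart point is ever recorded; the only resume address is the start of the region.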
One advantage of using software restart markers is that instructions within a single restart region can be committed to architectural state in any order that preserves program correctness. This means that if regions are sufficiently large, the processor can simply execute instructions from one region at a time without needing to buffer the results produced. This can reduce or eliminate the need for structures such as reorder buffers or store buffers.
A second advantage of our approach is that it introduces a new class of temporary machine state that is only visible within a restart region, and thus does not need to be preserved across an exception. This category of state makes it possible to expose a large amount of state to the compiler without requiring additional hardware support for exception management--no access paths are needed to save and restore temporary state--and without the need to modify the exception handler. For example, internal machine pipeline state can be exposed to the compiler by mapping it to temporary state. This can enable a wide variety of performance and energy optimizations without complicating the hardware, such as making pipeline bypass latches visible to the compiler in order to reduce register file accesses. Alternatively, programmer-visible state that is mapped to temporary state--such as a vector register file--can be added to the processor without compromising the ability to support features such as virtual memory, because the vector register values do not have to be saved and restored across an exception.
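A minimal sketch of why temporary state need not be saved, under stated assumptions: the dictionaries `arch` and `temp`, the `Trap` exception, and the three step functions are hypothetical stand-ins for architectural state, region-local temporary state (for example bypass latches or vector registers), and the instructions of one restart region. The trap handler here snapshots only `arch`; `temp` is discarded and rebuilt when the region restarts.

```python
class Trap(Exception):
    """Stand-in for any exception raised while a region is executing."""


def step_mul(arch, temp):
    temp["t0"] = arch["a"] * arch["b"]   # result lives only in temporary state


def step_add(arch, temp):
    temp["t1"] = temp["t0"] + arch["c"]  # consumes and produces temporary state


def step_commit(arch, temp):
    arch["d"] = temp["t1"]               # last instruction: updates architectural state


def execute_region(region, arch, fault_steps):
    """Run a region to completion; each index in fault_steps traps exactly once.

    Returns the list of snapshots the (toy) handler saved, one per trap.
    """
    pending = set(fault_steps)
    saved_contexts = []
    while True:
        temp = {}  # temporary state is undefined at every (re)start of the region
        try:
            for index, step in enumerate(region):
                if index in pending:
                    pending.discard(index)
                    raise Trap(index)
                step(arch, temp)
            return saved_contexts
        except Trap:
            # The handler saves architectural state only; temp is simply dropped.
            saved_contexts.append(dict(arch))


def run_example():
    arch = {"a": 3, "b": 4, "c": 5}
    region = [step_mul, step_add, step_commit]
    # Trap just before the final instruction, after both temporaries were written.
    saved = execute_region(region, arch, fault_steps={2})
    return arch, saved
```

The snapshot taken by the handler never contains `t0` or `t1`, yet the final value of `d` is still correct because the restarted region recomputes both temporaries before using them.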
Progress and Future Work
We have implemented compiler analyses that can automatically insert software restart markers at different code granularities, ranging from an entire function down to a basic block. We have also developed schemes to alleviate the performance impact of potentially executing the same instructions in a restart region multiple times.
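As a rough illustration of marker insertion at two of the granularities mentioned, the sketch below tags the last instruction of each region as a barrier, following the encoding described earlier. It is not the authors' compiler analysis: the `Instr` record, the `insert_restart_markers` function, and the granularity names are hypothetical, and the sketch performs none of the checks a real pass would need (for example, that a region can safely be re-executed or that livelock cannot occur).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instr:
    text: str              # assembly text, for display only
    function: str          # name of the enclosing function
    block: int             # basic-block number within the function
    barrier: bool = False  # True if this instruction ends a restart region


def insert_restart_markers(instrs, granularity):
    """Return a copy of instrs with the last instruction of each region marked.

    granularity is "function" or "basic_block" (hypothetical option names).
    """
    if granularity not in ("function", "basic_block"):
        raise ValueError(f"unknown granularity: {granularity}")

    def region_key(ins):
        if granularity == "function":
            return ins.function
        return (ins.function, ins.block)

    marked = []
    for i, ins in enumerate(instrs):
        # An instruction ends its region if it is the last one overall or the
        # next instruction belongs to a different region.
        ends_region = (i + 1 == len(instrs)
                       or region_key(instrs[i + 1]) != region_key(ins))
        marked.append(replace(ins, barrier=ends_region))
    return marked


PROGRAM = [
    Instr("load  r1, 0(r9)", "f", 0),
    Instr("add   r2, r1, r1", "f", 0),
    Instr("mul   r3, r2, r2", "f", 1),
    Instr("store r3, 8(r9)", "f", 1),
]

per_block = insert_restart_markers(PROGRAM, "basic_block")
per_function = insert_restart_markers(PROGRAM, "function")
```

With basic-block granularity the `add` and the `store` become barriers; with function granularity only the `store` does, giving one larger region.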
Additionally, we have implemented virtual memory in a vector processor by using our approach. Although we are still in the process of developing a vectorizing compiler for our target architecture, we currently have vectorized assembly code for a variety of EEMBC benchmarks. We added software restart markers to these programs by manually implementing our compiler algorithms. This resulted in less than a 1% performance reduction on average, as shown in Figure 2.
We are currently extending our exception model to other processing paradigms, such as VLIW processing and multithreaded execution. Additionally, we are developing compiler analyses that will target the various features of the vector-thread architecture, which uses software restart markers to reduce the hardware overhead of managing exceptions.
This work was supported by an NTT graduate fellowship, DARPA PAC/C award F30602-00-2-0562, NSF CAREER award CCR-0093354, the Cambridge-MIT Institute, and a donation from Infineon.
K. Asanovic, M. Hampton, R. Krashinsky, and E. Witchel. Energy-Exposed Instruction Sets. In Power-Aware Computing, R. Graybill and R. Melhem, editors. Kluwer/Plenum Publishing, 2002.
R. Krashinsky, C. Batten, M. Hampton, S. Gerding, B. Pharris, J. Casper, and K. Asanovic. The Vector-Thread Architecture. In 31st International Symposium on Computer Architecture, Munich, Germany, June 2004.