Research Abstracts - 2006
A Reconfigurable Architecture for Load-Balanced Rendering

Jiawen Chen, Michael I. Gordon, William Thies, Matthias Zwicker, Kari Pulli & Frédo Durand

What: Our goal is to design a load-balanced programmable architecture for real-time 3D graphics. We would like it to use the available hardware resources efficiently to achieve the best performance across different input data.

Why: Commodity graphics hardware has become increasingly programmable in recent years. Programmability has amplified the classic problem of dealing with a variable number of pixels per triangle. The programmer is now free to write sophisticated shader programs that can perform a diverse set of operations on the data stream. Hardware designers typically allocate a fixed amount of resources, in the form of specialized functional units, to each stage of the graphics pipeline based on expectations about the input. Because the input is variable, fixed allocations suffer from load imbalances in many scenarios.

Graphics hardware is fast because it exploits parallelism in the computation and uses specialized units. However, more and more parts of the pipeline are becoming programmable. We would like to explore what happens when the entire pipeline is parallel, and how general-purpose computation can solve the load-balancing problem.

How: We take advantage of the MIT Raw Processor [1], a scalable tile-based parallel processor, to design our load-balanced graphics pipeline. Raw features a 2D array of programmable units, which we can use for different stages of the graphics pipeline. Rendering has been characterized as a stream operation [2], and we use StreamIt [3] to express our design as a stream computation.
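Because the tiles are interchangeable programmable units, the number of tiles given to each pipeline stage becomes a design choice. The following is a minimal sketch of how such a choice could be made from profiled per-stage costs. It is not the project's implementation: the stage names, the cost numbers, the greedy strategy, and the assumption that a stage's work divides evenly across its tiles are all illustrative.

```python
from typing import Dict


def allocate_tiles(stage_cost: Dict[str, float], num_tiles: int) -> Dict[str, int]:
    """Greedy static allocation (illustrative, not the authors' method).

    Every stage starts with one tile; each remaining tile goes to the
    stage that is currently the bottleneck (highest cost per tile).
    """
    if num_tiles < len(stage_cost):
        raise ValueError("need at least one tile per stage")
    tiles = {name: 1 for name in stage_cost}
    for _ in range(num_tiles - len(stage_cost)):
        bottleneck = max(stage_cost, key=lambda s: stage_cost[s] / tiles[s])
        tiles[bottleneck] += 1
    return tiles


def bottleneck_time(stage_cost: Dict[str, float], tiles: Dict[str, int]) -> float:
    """Time per frame of the slowest stage, assuming work splits evenly."""
    return max(stage_cost[s] / tiles[s] for s in stage_cost)


def utilization(stage_cost: Dict[str, float], tiles: Dict[str, int]) -> float:
    """Fraction of total tile time spent on useful work."""
    total_work = sum(stage_cost.values())
    return total_work / (sum(tiles.values()) * bottleneck_time(stage_cost, tiles))


if __name__ == "__main__":
    # Hypothetical profile of a pixel-heavy workload (arbitrary work units).
    profile = {"vertex": 2.0, "rasterize": 2.0, "pixel": 12.0}
    fixed = {"vertex": 3, "rasterize": 2, "pixel": 3}  # hypothetical fixed design
    balanced = allocate_tiles(profile, num_tiles=8)
    print("fixed   :", fixed, utilization(profile, fixed))
    print("balanced:", balanced, utilization(profile, balanced))
```

In this toy profile the fixed split leaves most tiles waiting on the pixel stage, while the profile-driven split shifts tiles toward it. Real stages rarely parallelize perfectly, so the numbers only indicate the direction of the effect.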
We have implemented a full graphics pipeline on Raw using StreamIt and studied a number of algorithms and their performance bottlenecks. We propose a static load-balancing scheme, where the programmer profiles the application ahead of time and designs a number of profiles for various stages of the application. The load is statically balanced with respect to each stage. We demonstrated the technique and the resulting increase in performance on several applications.

Figure 1: Utilization comparison for a per-pixel lighting application.

Figure 2: Some sample scenes rendered using our graphics pipeline.

Progress: We have achieved increases of over 150% in throughput and over 100% in utilization in some scenarios.

Future Work: Our current system allows for static load-balancing. It would be interesting to consider dynamic load-balancing, where the hardware, given a history of past inputs, is able to dynamically reallocate processors to adapt to incoming data.

Acknowledgements: We thank Rodric Rabbah, Eric Chan, and Mike Doggett for all their help on the project.

Research Support: Raw and StreamIt are supported by DARPA grant PCA F29601-03-2-0065, NSF award EIA-0071841, and the MIT Oxygen Alliance. In addition, StreamIt is supported by DARPA grant HPCA/PERCS W0133890 and NSF award CNS-0305453.

References:

[1] Michael B. Taylor et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General Purpose Programs. In IEEE Micro, 2002.

[2] John D. Owens et al. Polygon Rendering on a Stream Architecture. In SIGGRAPH/Eurographics Workshop on Graphics Hardware, 2000.

[3] William Thies et al. StreamIt: A Language for Streaming Applications. In International Conference on Compiler Construction, 2002.