CSAIL Research Abstracts - 2005

Efficient, Verifiable Binary Sandboxing for a CISC Architecture

Stephen McCamant & Greg Morrisett*

*Division of Engineering and Applied Sciences, Harvard University

Introduction

A key requirement for many kinds of secure systems is the ability to execute code from an untrusted source while enforcing a policy that constrains the code's actions. The code might come directly from a malicious author, or it might have bugs that allow its execution to be subverted by maliciously chosen inputs. Typically, the system designer chooses some set of legal interfaces for interaction with the code, and the challenge is to ensure that the code's interaction with the rest of the system is limited to those interfaces. The most common technique for isolating untrusted code is hardware virtual memory protection in the form of an operating system process, but for convenience and efficiency, most large applications, servers, and operating system kernels are constructed to run in a single address space, where process-based isolation does not apply. Another technique is to require that the untrusted code be written in a type-safe language such as Java, but this is not possible for the large amount of existing code written in unsafe languages such as C and C++.

We are investigating a code isolation technique that enforces a security policy similar to an operating system's, but with ahead-of-time code verification more like a type system's: it rewrites the machine instructions of code after compilation to directly enforce limits on memory access and control flow.

Approach

Our approach belongs to a class of techniques known as "software-based fault isolation" (SFI for short) or "sandboxing". However, previous SFI techniques were applicable only to RISC architectures [2], or gave faulty, incomplete, or undisclosed attention to key security issues. By contrast, our technique applies to CISC architectures like the Intel IA-32 (x86), and can be separately verified so as to minimize the amount of code that must be trusted to assure security.

The key challenge is that the x86 has variable-length instructions that might start at any byte: execution starting at an address that does not correspond to an intended instruction boundary will instead execute other, usually nonsensical, instructions. To avoid this, our tool artificially enforces its own alignment constraints: it divides memory into 16-byte chunks, inserts no-op instructions as padding so that no instruction crosses a chunk boundary, and ensures that the low 4 bits of jump targets are always zero, so that a jump can only land at the start of a chunk (Figure 1; a sketch of the masking follows the figure). The technique is further optimized by treating certain registers specially, leaving guard pages around the constrained data region, choosing the virtual locations of the sandboxed regions to minimize the number of enforcement instructions required, and giving special treatment to procedure returns.

[Figure 1: alignment of variable-length instructions]

Figure 1. Illustration of the instruction alignment enforced by our technique. Black filled rectangles represent instructions of various lengths present in the original program. Gray outlined rectangles represent added no-op instructions. Call instructions (dark blue) go at the end of chunks, so that the corresponding return addresses are aligned.
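
As a concrete illustration, the following C sketch models the effect of the mask inserted before a computed jump. The region constants and the function name here are hypothetical, chosen only for illustration (the actual tool emits x86 instructions, not C): clearing the low 4 bits forces the target to a chunk boundary, and forcing a fixed tag in the high bits keeps the target inside the sandbox's code region.

    #include <stdint.h>

    #define CHUNK_BITS 4               /* 16-byte chunks */
    #define TAG_MASK   0xF0000000u     /* hypothetical: high 4 bits select a region */
    #define CODE_TAG   0x10000000u     /* hypothetical tag of the sandboxed code region */

    /* Models the masking applied to a computed jump target: afterwards the
     * target is chunk-aligned (low 4 bits zero) and lies in the sandboxed
     * code region, so it can only be the start of a chunk that the
     * rewriter laid out.
     * e.g. sandbox_jump_target(0x2345ABCD) == 0x1345ABC0 */
    static inline uint32_t sandbox_jump_target(uint32_t target)
    {
        target &= ~((1u << CHUNK_BITS) - 1);       /* align to a 16-byte chunk */
        target = (target & ~TAG_MASK) | CODE_TAG;  /* force the code-region tag */
        return target;
    }

When the region tag and the alignment can be expressed as a single bit pattern, both constraints collapse into one masking instruction with a constant operand, which is one reason the choice of virtual locations matters for keeping the number of enforcement instructions small.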

Another important aspect of the technique is that its implementation is divided into two tools, one that rewrites instructions to add checks, and a second that verifies that correct checks have been added. Only the separate verifier needs to be trusted to ensure the technique's security: in particular, the rewriting can be performed by the untrusted code producer, and the verification performed by the eventual code user. Furthermore, the verifier can be very simple; it performs no optimization, and can be implemented in a single pass over the modified code with minimal internal state. To further increase our confidence in the security of the verifier, we have constructed a formal model of the key aspects of the technique and the verification algorithm, and proved a safety result for the model using the ACL2 theorem-proving environment.
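
To make this single-pass structure concrete, here is a much-simplified C sketch of such a verifier. The decoder is a stub standing in for a real x86 instruction decoder, only the computed-jump check is shown, and all names and instruction kinds are ours, for illustration, rather than the actual tool's:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define CHUNK 16   /* chunk size enforced by the rewriter */

    enum kind {
        K_ORDINARY,      /* no safety obligations */
        K_MASK_TARGET,   /* the "and" that sandboxes a jump target */
        K_JMP_INDIRECT,  /* computed jump through a register */
    };

    /* Stub: a real verifier would decode one x86 instruction at p,
     * reporting its length and coarse kind.  Here every byte is treated
     * as a 1-byte ordinary instruction so that the sketch compiles. */
    static bool decode(const uint8_t *p, size_t avail, size_t *len, enum kind *k)
    {
        (void)p;
        if (avail == 0)
            return false;
        *len = 1;
        *k = K_ORDINARY;
        return true;
    }

    /* One pass, minimal state: remember only the kind of the previous
     * instruction, and forget it at chunk boundaries.  A mask is only
     * trustworthy if execution cannot enter between it and the jump it
     * guards, which holds exactly when both sit in the same chunk. */
    static bool verify(const uint8_t *code, size_t n)
    {
        enum kind prev = K_ORDINARY;
        size_t off = 0;
        while (off < n) {
            size_t len;
            enum kind k;
            if (!decode(code + off, n - off, &len, &k))
                return false;                            /* undecodable byte */
            if (off / CHUNK != (off + len - 1) / CHUNK)
                return false;                            /* crosses a chunk boundary */
            if (k == K_JMP_INDIRECT && prev != K_MASK_TARGET)
                return false;                            /* unguarded computed jump */
            off += len;
            prev = (off % CHUNK == 0) ? K_ORDINARY : k;  /* state resets each chunk */
        }
        return true;
    }

    int main(void)
    {
        uint8_t chunk_of_noops[CHUNK] = {0};
        return verify(chunk_of_noops, sizeof chunk_of_noops) ? 0 : 1;
    }

A fuller verifier would also check that stores are guarded against the constrained data region in the same way, but the state it carries still never needs to extend beyond the current chunk.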

Experience

We have implemented our technique in a prototype tool known as PittSFIeld. Though simple, PittSFIeld uses the separate verification architecture we advocate, and includes a few simple but fruitful optimizations. To test the technique's scalability and performance, we have used PittSFIeld to fault-isolate small benchmarks and several widely used, CPU-intensive Linux applications. The results (Figure 2) show PittSFIeld to be competitive in performance with previous x86 tools, which were unsound [1], and robust enough to scale to the complexities of realistic systems (for instance, GCC implements its own garbage collector).

Name     Description            Lines of code   Unmodified runtime   PittSFIeld time ratio   PittSFIeld size ratio   Compressed size ratio
bzip2    Lossless compression   6753            21.68 s              1.28                    1.96                    1.24
bc       Calculator             10212           12.41 s              1.45                    1.81                    1.19
oggenc   Audio compression      58372           35.22 s              1.41                    1.14                    1.08
gcc      C compiler             752986          167.75 s             1.30                    1.84                    1.10

Figure 2. Results of applying PittSFIeld to various open-source applications. Lines of code include comments and blank lines. For ratios, the unmodified program is 1.0. The compressed size ratio column gives the ratio of the compressed size of the PittSFIeld-isolated program to the compressed size of the original program; it is less than the regular size ratio because many of the added instructions are very similar, and hence easily compressible.

Future

The current prototype verifier was not designed for efficiency or trustworthiness; we would like to construct a more realistic one that could be used for performance testing and carefully audited. We would like to extend the machine-checked safety proof to cover the remaining important optimizations of the technique, and to base it on an independent specification of the processor's behavior. More ambitiously, we would be interested in generating both the verifier and a machine-checked safety proof from a single specification. Finally, we would like to apply the technique to realistic security-critical system components, such as device drivers or modules in network-accessible servers.

Research Support

This research is supported in part by a National Defense Science and Engineering Graduate Fellowship.

References:

[1] Christopher Small and Margo Seltzer. MiSFIT: A Tool for Constructing Safe Extensible C++ Systems. In Proceedings of the Third USENIX Conference on Object-Oriented Technologies, Portland, Oregon, USA, June 1997.

[2] Robert Wahbe, Steven Lucco, Thomas E. Anderson, and Susan L. Graham. Efficient Software-Based Fault Isolation. In Proceedings of the 14th Symposium on Operating Systems Principles, pp. 203-216, Asheville, North Carolina, USA, December 1993.
