CSAIL Research Abstracts - 2005

Cryptography and Coding Theory

Silvio Micali, Chris Peikert, Adam Smith, Madhu Sudan & David Wilson

This project examines the connections between "coding theory", the field that studies methods to protect information from errors, and "cryptography", the field that studies methods to protect protocols from malicious acts. Despite the seeming similarity in objectives, the two fields have developed quite differently, and in this project we explore the possibility of applying the tools and techniques of each field to the other. The results below demonstrate a few sample cases where this exploration has already led to success.

Cryptographic Contributions to Coding Theory

One of the fundamental questions of interest to coding theory is what fraction of errors can be corrected, as a function of the amount of redundancy introduced during encoding. As is well known, the exact answer depends on the nature of the errors and on the notion of recovery from errors. Typical error-correcting codes are constructed under one of two extreme assumptions: (1) the errors are probabilistic (and thus oblivious to the encoding/decoding process), or (2) the errors are malicious, introduced by an all-knowing, computationally unbounded adversary. The results obtained in the two settings are dramatically different. In the former model one can, for instance, correct an error rate of 49% when the channel transmits bits. In the latter model, however, the maximum error rate that can be corrected unambiguously is 25%. (Some of these limitations can be overcome by accepting a weaker notion of recovery, in which the decoder is allowed to output a small list of candidate codewords; see the related project on List-Decoding. However, in settings where a list is unusable, such as real-time communication, list decoding is unsatisfactory.)
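
To make the 25% figure concrete, the short Python fragment below (an illustrative toy of ours, not taken from the work described here) shows how an adversary using only d/2 bit flips can leave the received word equally close to two codewords of a distance-d code, so that no unique decoder can succeed; since binary codes of positive rate have relative distance below 1/2, worst-case unique decoding is therefore limited to error rates below 25%.

    # Toy illustration: adversarial errors reaching half the minimum distance
    # defeat unique decoding, because the adversary can push a codeword
    # exactly halfway toward another codeword.

    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    # Two codewords of a hypothetical binary code at Hamming distance d = 6.
    c1 = [0, 0, 0, 0, 0, 0, 1, 1]
    c2 = [1, 1, 1, 1, 1, 1, 1, 1]
    d = hamming(c1, c2)                                   # 6

    # The adversary corrupts c1 in d/2 of the positions where c1 and c2 differ.
    received = list(c1)
    flipped = 0
    for i in range(len(c1)):
        if c1[i] != c2[i] and flipped < d // 2:
            received[i] = c2[i]
            flipped += 1

    print(hamming(received, c1), hamming(received, c2))   # 3 3, a tie
    # The decoder cannot tell whether c1 or c2 was sent, so worst-case unique
    # decoding stops at (d-1)/2 errors.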

To bridge the two models, we consider a channel that is adversarial but limited to feasible computation when introducing errors. In this model, we construct appealingly simple and efficient cryptographic encoding and unique-decoding schemes whose error-correction capability is much greater than what is classically possible [1]. In particular:

  • For binary alphabets, we construct positive-rate coding schemes that are uniquely decodable from a 1/2 - γ error rate for any constant γ > 0.
  • For large alphabets, we construct coding schemes that are uniquely decodable from a 1 - √R error rate for any information rate R > 0.

Our results are qualitatively stronger than related work: the construction works in the public-key model (requiring no shared secret key or joint local state) and allows the channel to know everything that the receiver knows. In addition, our techniques can potentially be used to construct coding schemes that have information rates approaching the Shannon limit. Finally, our construction is optimal for some choice of parameters: we show that unique decoding under high error rates is impossible in several natural relaxations of our model.
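
To convey the flavour of such schemes, the toy Python sketch below illustrates the generic pattern of list-decoding and then letting an unforgeable tag pick out the unique valid candidate. It is only a sketch under simplifying assumptions of ours: a block-repetition "code", a truncated HMAC, and a much weaker adversary stand in for the genuinely list-decodable code, the public-key signature, and the computationally bounded channel of the actual construction [1] (in the public-key model the receiver would verify with the sender's public key; a MAC keeps the toy short), and the names encode, decode and adversarial_channel are ours.

    # Toy sketch of "list-decode, then let a cryptographic tag disambiguate".
    import hmac, hashlib, random

    SECRET = b"toy-signing-key"     # stands in for the sender's signing key
    COPIES = 5                      # each payload is transmitted in 5 copies

    def tag(message):
        # toy stand-in for a digital signature (truncated HMAC)
        return hmac.new(SECRET, message, hashlib.sha256).digest()[:8]

    def encode(message):
        payload = message + tag(message)
        return [payload] * COPIES   # "codeword" = repeated copies

    def adversarial_channel(codeword):
        # the adversary replaces all but one copy with values of its choice
        corrupted = list(codeword)
        for i in range(COPIES - 1):
            corrupted[i] = bytes(random.randrange(256) for _ in range(len(codeword[i])))
        return corrupted

    def decode(received):
        # "list decoding": every received copy is a candidate payload; the tag
        # then sieves the list down to (at most) one valid message.
        for candidate in received:
            message, t = candidate[:-8], candidate[-8:]
            if hmac.compare_digest(t, tag(message)):
                return message
        return None

    print(decode(adversarial_channel(encode(b"attack at dawn"))))  # b'attack at dawn'

The point of the pattern is that a short tag the channel cannot forge lets the receiver collapse a list of candidate decodings to a unique message, even against an adversarial channel.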

Future work will focus on developing codes that are optimal for any given error rate on any channel (and not just when the error rate is close to 50% on binary channels).

Coding Theoretic Contributions to Cryptography
Fuzzy Extractors

The choice of reliable keys in cryptography poses a major challenge. We would like keys to be unique to us, non-trivially long, and yet easy to remember; in practice these requirements tend to be contradictory. An alternative approach is to let one's biometric properties (retina scan, fingerprint, etc.) define one's key. However, this choice raises its own difficulty: cryptographic keys must be reproduced exactly, while biometric readings are subject to measurement errors. This project uses the tools of error correction (from coding theory) to understand, analyze, and resolve some of the basic questions in using biometric (or other fuzzy) data for keys [2].

We provide formal definitions and efficient secure techniques for:

  • turning biometric information into keys usable for any cryptographic application, and
  • reliably and securely authenticating biometric data.

Our techniques apply not just to biometric information, but to any keying material that, unlike traditional cryptographic keys, is (1) not reproducible precisely and (2) not distributed uniformly. We propose two primitives:

  • A fuzzy extractor extracts nearly uniform randomness R from its biometric input. The extraction is error-tolerant in the sense that R will be the same even if the input changes, as long as it remains reasonably close to the original. Thus, R can be used as a key in any cryptographic application.
  • A secure sketch produces public information about its biometric input w that does not reveal w, and yet allows exact recovery of w given another value that is close to w. Thus, it can be used to reliably reproduce error-prone biometric inputs without incurring the security risk inherent in storing them.

In addition to formally introducing our new primitives, we provide nearly optimal constructions of both primitives for various measures of "closeness" of input data, such as Hamming distance, edit distance, and set difference.
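
To illustrate the flavour of such constructions for the Hamming metric, the Python sketch below implements a toy code-offset style secure sketch and derives a key from it. The 3-fold repetition code, the hash-based "extractor", and all parameters are stand-ins chosen by us for brevity; real instantiations from [2] use stronger codes and extractors and must account for the entropy the public sketch leaks.

    # Toy secure sketch + fuzzy extractor for Hamming distance.
    import hashlib, secrets

    K, REP = 16, 3                 # 16 data bits, each repeated 3 times
    N = K * REP                    # length of the biometric bit-string w

    def enc(bits):                 # repetition-code encoder
        return [b for b in bits for _ in range(REP)]

    def dec(bits):                 # majority vote (corrects 1 flip per block)
        return [int(sum(bits[i*REP:(i+1)*REP]) > REP // 2) for i in range(K)]

    def xor(a, b):
        return [x ^ y for x, y in zip(a, b)]

    def sketch(w):
        # SS(w) = w XOR C(x) for a uniformly random codeword C(x)
        c = enc([secrets.randbelow(2) for _ in range(K)])
        return xor(w, c)

    def recover(w_prime, s):
        # Rec(w', s): decode w' XOR s back to a codeword, then unmask
        c = enc(dec(xor(w_prime, s)))
        return xor(s, c)

    def extract(w, seed):
        # toy stand-in for a strong extractor / universal hash
        return hashlib.sha256(seed + bytes(w)).hexdigest()[:32]

    # Enrollment: derive key R and public helper data (s, seed) from reading w.
    w = [secrets.randbelow(2) for _ in range(N)]
    s, seed = sketch(w), secrets.token_bytes(16)
    R = extract(w, seed)

    # A later reading w' differs from w in a few positions (here, 1 per block).
    w_prime = list(w)
    w_prime[0] ^= 1
    w_prime[10] ^= 1
    assert recover(w_prime, s) == w
    assert extract(recover(w_prime, s), seed) == R    # the same key R returns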

Error Correction in the Exponent

Given a corrupted word w = (w_1, ..., w_n) from a Reed-Solomon code of distance d, there are many ways to efficiently find and correct its errors. But what if we are instead given (g^{w_1}, ..., g^{w_n}) for some generator g of a large cyclic group: can we still correct the errors? This problem is called error correction in the exponent, and though it arises naturally in many areas of cryptography, it has received scant attention [3].
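
For concreteness, the toy Python instance below (with tiny parameters of our choosing) sets up the problem: a Reed-Solomon codeword over F_11 is published only in the exponent of a group of order 11. Linear parity checks of the code can still be evaluated on the exponentiated values, but locating and correcting errors in the hidden w_i, as a standard decoder would, is exactly the problem whose hardness is studied here.

    # Toy instance of error correction in the exponent.
    import math

    p, q, g = 23, 11, 2            # <g> has prime order q = 11 inside Z_23*
    alphas = [1, 2, 3, 4, 5]       # evaluation points in F_q
    k = 3                          # messages are polynomials of degree < k

    def rs_encode(coeffs):
        # Reed-Solomon: evaluate the message polynomial at the points alphas
        return [sum(c * pow(a, i, q) for i, c in enumerate(coeffs)) % q for a in alphas]

    # One parity check of the code: sum_i lam[i]*f(alphas[i]) = 0 mod q whenever
    # deg f <= n-2, with lam[i] = prod over j != i of (alphas[i]-alphas[j])^(-1).
    lam = [pow(math.prod(ai - aj for aj in alphas if aj != ai) % q, -1, q)
           for ai in alphas]

    w = rs_encode([7, 2, 5])                 # a codeword, which the decoder
    E = [pow(g, wi, p) for wi in w]          # only ever sees "in the exponent"

    def check_in_exponent(E):
        # prod_i E_i^lam_i = g^(sum_i lam_i*w_i): the parity check is testable
        # without ever learning the w_i themselves.
        acc = 1
        for Ei, li in zip(E, lam):
            acc = acc * pow(Ei, li, p) % p
        return acc == 1

    print(check_in_exponent(E))              # True: consistent with a codeword
    E_bad = list(E)
    E_bad[2] = E_bad[2] * g % p              # the channel perturbs one entry
    print(check_in_exponent(E_bad))          # False: an error is detected, but
    # correcting it given only the E_i is the decoding-in-the-exponent problem.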

We first show that unique decoding in the exponent, when the number of errors e < d/2, is no harder than the computational Diffie-Hellman (CDH) problem in the same group. Our remaining results are negative:

  • Under mild assumptions on the parameters, we show that bounded-distance decoding in the exponent, under e=d-c errors for any constant c, is as hard as the discrete logarithm problem in the same group.
  • For generic algorithms that only perform group operations in a black-box manner, we show lower bounds for decoding that exactly match known algorithms.

Our generic lower bounds also extend to decisional variants of the decoding problem, and to groups in which the decisional Diffie-Hellman (DDH) problem is easy. This suggests that hardness of decoding in the exponent is a qualitatively new assumption that lies "between" the DDH and CDH assumptions, and may serve as a basis for new kinds of cryptographic schemes or improved constructions of old ones.

References:

[1] Silvio Micali, Chris Peikert, Madhu Sudan and David Wilson. Optimal Error Correction Against Computationally Bounded Noise. In Proceedings of the Second Theory of Cryptography Conference (TCC 2005), pp. 1-16, Cambridge, MA, USA, February 2005.

[2] Yevgeniy Dodis, Leonid Reyzin and Adam Smith. Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data. In Proceedings of Eurocrypt 2004, pp. 523-540, Interlaken, Switzerland, May 2004.

[3] Chris Peikert. On Error Correction in the Exponent. In submission.
