
Research
Abstracts  2007 
Locally Decodable Codes and Private Information Retrieval SchemesSergey YekhaninProblem OverviewA line of research [BFLS,Sud,CGKS,KT] originating in early 1990s has led to discovery of two related beautiful notions that are of central importance to modern complexity theory and cryptography, namely that of Locally Decodable Codes (LDCs) and Private Information Retrieval schemes (PIRs). LDCs [KT] are a special kind of errorcorrecting codes. Errorcorrecting codes are used to ensure reliable transmission of information over noisy channels as well as to ensure reliable storage information on a medium that may be partially corrupted over time (or whose reading device is subject to errors). In both of these applications the message is typically partitioned into small blocks and then each block is encoded separately. Such encoding strategy allows efficient randomaccess retrieval of the information, since one needs to decode only the portion of data one is interested in. Unfortunately, this strategy yields very poor noise resilience, since in case even a single block (out of possibly tens of thousands) is completely corrupted some information is lost. It would seem preferable to encode the whole message into a single codeword of an errorcorrecting code. Such solution clearly improves the robustness to noise, but is also hardly satisfactory, since one now needs to look at the whole codeword in order to recover any particular bit of the message (at least in the case when classical errorcorrecting codes are used). Such decoding complexity is prohibitive for modern massive datasets. Locally decodable codes are errorcorrecting codes that avoid the problem mentioned above by having extremely efficient sublineartime decoding algorithms. More formally, a qquery locally decodable code C encodes nbit messages x in such a way that one can probabilistically recover any bit x_i of the message by querying only q bits of the (possibly corrupted) codeword C(x), where q can be as small as 2. Apart from the natural application to data transmission and storage [KT] locally decodable codes have found numerous applications within complexity theory in the context of worstcase to averagecase reductions and probabilistically checkable proofs [T_survey]. The major goal of LDC related research is to establish the optimal tradeoff between length and query complexity of such codes. Private Information Retrieval schemes [CGKS] are cryptographic protocols developed in order to protect the privacy of the user accessing a public database. In such schemes a database (modelled by an nbit string x) is replicated between q noncommunicating servers. The user holds an index i and is interested in obtaining the value of the bit x_i. To achieve this goal, the user queries each of the servers and gets replies from which the desired bit x_i can be computed. The query to each server is distributed independently of i and therefore each server gets no information about what the user is after. Private information retrieval schemes are exploited for solving an array of specific tasks (such as secure keyword search, private approximations, private statistics, etc.). The main parameter of interest in a PIR scheme is its communication complexity, namely the number of bits exchanged by the user accessing an nbit database and the servers. Our contributionsIn [Y_nicePir] we present a construction of locally decodable codes whose length is vastly shorter than that of previous constructions. Specifically, we show that every Mersenne prime p = 2^t  1 yields a family of three query locally decodable codes of length Exp(n^{1/t}). The largest Mersenne prime known currently has t>10^7. Moreover a classical numbertheoretic conjecture states that in fact the number of Mersenne primes is infinite. Using this assumption my constructions yield the first family of three query locally decodable codes of subexponential length. Analogous improvements follow for private information retrieval schemes. In particular, under the assumption mentioned above we obtain the first family of three server PIR schemes with subpolynomial communication. The results above were not expected by the community. After a decade of effort, many researchers in the area were pessimistic and believed that locally decodable codes with constant query complexity and subexponential length or private information retrieval schemes with constant number of servers and subpolynomial communication do not exist. In particular, such conjectures were published explicitly in [Go,Ga]. Our results in [Y_nicePir] yield tremendous improvements for private information retrieval schemes involving three or more servers, and provide no insights on the two server case. This raises a natural question regarding whether the two server case is truly intrinsically different. Our results in [RY] suggest that this may well be the case. We employ modular representation theory of finite groups to establish the optimality of the currently best known two server schemes [CGKS,BIKR,WY] in a restricted although fairly broad model. In [WY] we present a lucid algebraic approach to private information retrieval revealing the clean interpolation idea that makes nontrivial PIRs possible. We also simplify some of the earlier results of [BIKR] and obtain improvements for various generalized settings of the problem. The main bottleneck that prevents the currently best known PIR schemes from wide practical deployment is the overwhelming amount of computation the servers need to perform in order to response to users' queries. We plan to address this issue in our future research as well as to continue working on communication complexity of PIRs. References:[BFLS] L. Babai, L. Fortnow, L. Levin, and M. Szegedy, Checking computations in polylogarithmic time. In Proc. of the 23th ACM Sym. on Theory of Computing (STOC), pp. 2131, 1991. [BIKR] A. Beimel, Y. Ishai, E. Kushilevitz, and J. F. Raymond. Breaking the O(n^{1/(2k1)}) Barrier for InformationTheoretic Private Information Retrieval, In Proc. of the 43rd IEEE Symposium on Foundations of Computer Science (FOCS), pp. 261270, 2002. [CGKS] B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan. Private information retrieval, In Proc. of the 36rd IEEE Symposium on Foundations of Computer Science (FOCS), pp. 4150, 1995. Also, in Journal of the ACM, 45, 1998. [Ga] W. Gasarch, A survey on private information retrieval, The Bulletin of the EATCS, 82:72107, 2004. [Go] O. Goldreich, Short Locally Testable Codes and Proofs, Technical Report TR05014, Electronic Colloquium on Computational Complexity (ECCC), 2005. [KT] J. Katz and L. Trevisan, On the efficiency of local decoding procedures for errorcorrecting codes, In Proc. of the 32th ACM Sym. on Theory of Computing (STOC), pp. 8086, 2000. [RY] A. Razborov and S. Yekhanin An Omega(n^{1/3}) Lower Bound for Bilinear Group Based Private Information Retrieval, In Proc. of the 47rd IEEE Symposium on Foundations of Computer Science (FOCS), 2006. [Sud] M. Sudan, Efficient Checking of Polynomials and Proofs and the Hardness of Approximation Problems. PhD thesis, University of California at Berkeley, 1992. [T_survey] L. Trevisan, Some Applications of Coding Theory in Computational Complexity, Quaderni di Matematica, 13:347424, 2004. [WY] D. Woodruff and S. Yekhanin, A Geometric Approach to Information Theoretic Private Information Retrieval, In Proc. of the 20th IEEE Computational Complexity Conference (CCC), pp. 275284, 2005. [Y_nicePir] S. Yekhanin New Locally Decodable Codes and Private Information Retrieval Schemes, Electronic Colloquium on Computational Complexity (ECCC), TR06127. 

