We introduce a new cryptographic tool: multiset hash functions. Unlike standard
hash functions which take strings as input, multiset hash functions operate on multisets (or
sets). They map multisets of arbitrary finite size to strings (hashes) of fixed length. They are
incremental in that, when new members are added to the multiset, the hash can be updated
in time proportional to the change. The functions may be multiset-collision resistant in that
it is diącult to find two multisets which produce the same hash, or just set-collision resistant
in that it is diącult to find a set and a multiset which produce the same hash.
In particular, we introduce four multiset hash functions, each with its own advantages.
MSet-XOR-Hash uses the XOR operation and is very eącient; however, it uses a secret key
and is only set-collision resistant. MSet-Add-Hash uses addition modulo a large integer and,
thus, is slightly less eącient than MSet-XOR-Hash; MSet-Add-Hash also uses a secret key but it
is multiset-collision resistant. MSet-Mu-Hash uses finite field arithmetic and is not as eącient
as the other two hash functions; however, MSet-Mu-Hash is multiset-collision resistant, and
unlike the other two hash functions, does not require a secret key. MSet-VAdd-Hash is more
eącient than MSet-Mu-Hash; it is also multiset-collision resistant, and does not use a secret
key, but the hashes it produces are significantly longer than the hashes of the other functions.
The proven security of MSet-XOR-Hash and MSet-Add-Hash is quantitative. We reduce the
hardness of finding collisions to the hardness of breaking the underlying pseudorandom
functions. The proven security of MSet-Mu-Hash is in the random oracle model and is based
on the hardness of the discrete logarithm problem. The proven security of MSet-VAdd-Hash
is also in the random oracle model and is based on the hardness of the worst-case shortest
vector problem.
We demonstrate how set-collision resistant multiset hash functions make an existing o˛ine
memory integrity checker secure against active adversaries. We improve on this checker such
that it can use smaller time stamps without increasing the frequency of checks. The improved
checker uses multiset-collision resistant hash functions |