The two most important areas of cryptography are encryption and authentication - making sure that no one except the legitimate receiver reads the message and making sure that nobody except the legitimate sender writes or modifies it. A typical encryption scenario is Alice wanting to send Bob a secret message, but instead of sending the message directly she sends something that Bob can transform to the real message but which means nothing to anyone else. A typical authentication scenario is Alice sending Bob a message which Bob wants to be sure originates from Alice and nobody else. Alice sends the message as-is but also attaches a tag5.1, a few bytes large, which depends on the message. Typically only one tag is valid for each possible message and nobody except Alice and Bob5.2, knows beforehand which one. When Bob has received both the message and the tag he can verify that the tag is correct and conclude that the message really originated from Alice, or at least someone with access to Alice's key, and has not been tampered with on the way to him.
Whenever encryption is explained the one-time pad encryption, commonly referred to as
simply OTP, almost always serves as
an enlightening example. OTP was co-invented in 1917 by Gilbert
Vernam and Major Joseph Mauborgne (see e.g. [11]) and and in the
40's Claude Shannon proved both that it was unbreakable and that
any unbreakable encryption is essentially equivalent to OTP. The
encryption is very simple. For Alice to send Bob a message
they need to
share a secret completely random key
, the one-time pad, which needs
to be at least as long as the message and must never be reused.
Alice simply sends
XOR
and Bob
calculates
XOR
XOR
.5.3 OTP is almost never used in practice.
There exists many other encryption schemes that require smaller
keys, don't require a key known by both Alice and Bob and permits
reusing of keys. They are all theoretically breakable given
enough computation power or maybe good enough algorithms, but are
by most regarded secure enough. OTP serves mainly as an
example.
Even though OTP is universally known for providing unbreakable encryption, few know that something similar exists for authentication. It was invented in the late 70's by J. Lawrence Carter and Mark N. Wegman who published their discoveries in [12] and [13]. It is commonly referred to as Wegman-Carter (type) authentication. One can only speculate why it is almost completely unheard of in the popular science , but contributing factors are surely that it is much newer than OTP, that it is much more complicated to explain and that authentication itself is more complicated and often regarded as less interesting. Furthermore, one example of unbreakability is maybe enough for most purposes. On top of that, if something is encrypted with OTP it is impossible to extract any information at all about the message, but with any authentication scheme it is always possible for Eve to produce a random tag and hope it is the correct one for the message she wants to make Bob believe Alice has sent. The best that can be done for authentication is to make the probability that Eve succeeds arbitrarily small, and that is exactly what Wegman-Carter authentication does.
The main problem with OTP is that the required key needs to be at least as long as the message to be encrypted. Wegman-Carter authentication does not share this problem. The keys can be much shorter, in the order of a few bytes. The fact that the required keys can be much shorter than the message to be authenticated is essential for QKG. Each round of a QKG protocol generates a certain amount of shared secret key and requires far more communication which needs to be authenticated. If the key consumed by the authentication process is larger than the generated key we don't have Quantum Key Growing but Quantum Key Shrinking which would be quite pointless.
Other authentication methods exist where instead of just one tag being sent from Alice to Bob a dialogue is held with several messages going back and forth. They can be more effective in terms of consumed key but are not necessary for QKG and are beyond the scope of this work. This chapter describes unconditionally secure message authentication in the theoretical scenario where Alice and Bob share a completely secret key and wish to transmit an unmodified message through a channel completely controlled by Eve, not necessarily as part of a QKG system. In the next chapter the authentication is made more realistic for a QKG scenario by assuming that the key is not completely secret, and in Chapter 7 everything is put into a QKG context.
A very useful tool in cryptography is the concept of cryptographically secure hash functions. Unfortunately, like most cryptography used in the real world they are only secure against what is believed to be practical attacks and can be broken with enough computation power or, if they exist, good enough algorithms. It is impossible to construct an unbreakable cryptographically secure hash function. See e.g. [14] for definitions and proofs.
Cryptographically secure hash functions can be used for many things, one of them being message authentication. Although hash functions cannot be unbreakable, message authentication can. A word of warning is in place here regarding terminology. The fundamental building block of the unbreakable Wegman-Carter authentication is called universal families5.4 of hash functions, but those hash functions are quite different from the cryptographically secure hash functions mentioned above. They have similarities and both deserve to be called hash functions, but the individual hash functions of Wegman-Carter are not, and need not be, cryptographically secure in the classical sense.
Families of hash functions can be used for many things and
many different requirements can be put on them. A system has
evolved to express the requirements a family fulfills. For
authentication a definition of -almost
strongly-universal
(
-ASU
) is sufficient. Wegman
and Carter began with a stronger requirement in [12] but the keys needed to be far
too big for authentication to be practical. In [13] they showed that by loosening
the requirements somewhat the authentication can still be
acceptably secure but the required length of the keys shrinks
considerably. Although they defined similar requirements and
presented an example of an
-almost strongly-universal
family, they gave it no formal
definition. The first formal definition appeared in [15].
Note that it is not possible to have an
.
The special case
was
the unnecessarily strong requirement in [12] and those families are simply
called strongly universal
(SU
).
In theory
can be as large as 1, but in practice the family will not be of
much use unless
is rather close to
. One example of
a 1-almost strongly-universal
family is the
hash functions simply
defined as
, where
is the
modulo operation from computing, the remainder after division,
rather than the modular arithmetic of algebra. The number of hash
functions is equal to the number of tags, but a message/tag pair
uniquely identifies the hash function which makes the family
unsuitable for use in authentication.
Note also that the number of hash functions in the family must
be at least
, so the
key needed to specify a member of the family must be larger than
the generated tag.
A strongly universal family is in computer science often called
pairwise independent family of hash
functions. When hashing two distinct messages using the
same random hash function, the two resulting tags are
statistically independent. This is the significance of the number
2 in the subscript. Note that even though a set of random
variables are pairwise independent, they are not necessarily
mutually independent or even 3-wise independent. In general, a
strongly universal
family is the same thing a
-wise independent family and means that all sets
of
distinct
messages are mapped to statistically independent tags, which is a
stronger condition for higher
.
Wegman and Carter proposed several strongly
universal families
in [12] and one
-almost
strongly universal
in
[13]. The hash families of
Wegman-Carter are by no means unique or most effective, but since
they are the original ones they are both interesting from a
historical point of view and are often referenced. Furthermore,
they are quite easy to understand and their performance is,
although not optimal, good enough for many examples and
applications.
One of the families they described is a simple strongly
universal family
they called
.
Actually, the family is not really strongly universal
. It is however ``close'' (their
quotation marks) and they treat it as if it was strongly
universal
. We will
do the same.
An implementation of the family is available as function
H1 in hashfunctions.py line
85 . In
words, the family of hash functions mapping a message
to a tag
needs a key consisting of
three parts. The first part is any prime number
and
need not be secret. The other two parts are two secret integers
and
. The hash function defined by
this key simply maps a message
to a hash value
. As Wegman and Carter
write in [13], this can be
generalized using all polynomials of degree5.5 less
than 2 over any Galois field, and if the size of the Galois field
is divisible with the number of possible tags the family is
strongly universal
.
If not, the mapping from the field to a tag,
above, will favor the
lower tags somewhat. In our case the size of the field is always
a prime number and larger than the highest message, so unless the
tags can be larger than the message the mapping can not be
perfect, but will be quite close.
The secret key needed to select a hash function from this family needs to be very big, roughly twice as big as the message to be hashed. A QKG system could never work using this family as authentication. The traffic that needs authentication each round is much larger than the generated key, so the shared secret key would shrink. The next family is not quite as secure but the probability of guessing a tag is at most doubled and the required key size grows much slower than the message size.
![]() |
This
-almost
strongly universal
family works by picking several hash functions from a much
smaller but strongly universal
family and applying them in a hierarchical
manner. Let the smaller family consist of hash functions mapping
bit strings of length
to bit strings of length
, where
is slightly larger than the
length of the tag we want to produce. Divide the message into
substrings of length
, padding the last substring with zeroes if
necessary. Pick a hash function from the small family, apply that
function to each of the substrings and concatenate the results.
Repeat until only one substring of length
is left, using a new hash
function each repetition. Discard the most significant bits that
won't fit into the tag. What is left is the final tag.
One round of hashing halves the length of the message, regardless of its size, but uses only one hash function, and only one small key to pick that hash function. The total key length needed therefore grows with approximately the logarithm of the message length. This means a QKG system can always be designed with large enough rounds to make the key used for authentication acceptably small in comparison to the created shared secret.
For the full details of this family, see either [13] or the Python implementation
function Hprime in hashfunctions.py line 123
. For an implementation using that function together with the
strongly universal
from the previous example, see function Hprime_H1 in
hashfunctions.py line 186
. A functionally equivalent but more compact implementation of
the same function that might be easier to get an overview of, at
the expense of not following the Wegman-Carter papers as closely,
is available at function Hprime_H1_compact in
hashfunctions.py line
214 .
Now suppose Eve has control over the channel between Alice and
Bob and wants Bob to accept a faked message
.
To her the secret key is a random variable
uniform over its whole range
. If the key is a
random variable, so is the correct tag
. The
first condition of definition 1
says that if
is uniform over its whole range, so is
. She can take a guess, but any
guess has probability
to be
correct.
She may also wait until Alice tries to send an authenticated
message to Bob, pick up the message and the tag, and make sure
Bob never see them. With both and
at
her disposal she can, given enough computing power, rule out all
keys that do not match and be left with just
of the keys to
guess from. However, the second condition of
definition 1 says that even
with this knowledge she has, with
uniform over its whole range, at best the
probability
to guess the correct tag
for any
.
is never
smaller than
so
is clearly an upper
limit on the probability that Eve makes the right guess and
manages to fool Bob into accepting a fake message, at least if
Eve knows nothing about the key beforehand.
If the same key is used twice for authentication, the
definition of -almost strongly-universal
families makes no guarantees
about how hard it is to guess the correct tag corresponding to a
third message. The keys must therefore never be reused. For each
authenticated message, Alice and Bob must sacrifice
bits of their shared secret. Wegman and Carter describes in
chapter 4 in [13] a method
of sacrificing only
bits for each message.
Begin by choosing a hash function
randomly from an
-almost strongly-universal
family. This hash function will
be used for all messages, but the tag is calculated as
XOR
. In other words,
the tag is one-time pad encrypted using the one-time pad
the same size as
the tag.
If
is
-almost
strongly-universal
,
so is
XOR
. The
key, secret or not, merely reorders the tags which has no effect
on definition 1. Eve's chance
to guess the tag is therefore still limited by
. The one-time pad
encryption makes sure no information about the hash function
leaks to Eve, so the hash function can be safely reused an
arbitrary number of times as long as new one-time pads are used
each time.
To authenticate the first message both a hash function and a one-time pad needs to be chosen so the required key is larger than in the authentication described above. However, each message after the first needs just a key of the same size as the tag, so the average sacrificed key length per message will approach the size of the tag.