Toeplitz hashing#
Even though the quantum leftover hash lemma has solved our main problem by proving that families of two-universal functions are quantum-proof randomness extractors, we still need to construct these families. Before we mentioned that for any integers \(0\leq m\leq n\) it is possible to find them. In this section we will present two particular families of two-universal functions that are quite relevant for QKD: Toeplitz hashing and modified Toeplitz hashing.
Complexity & seed length
Toeplitz hashing can be computed very efficiently. Its complexity is \(O(n\log n)\). However, it requires seeds that grow linearly with the sum of the input and output lengths. Modified Toeplitz hashing only depends on the input length but the growth is still linear.
Toeplitz matrices#
Toeplitz1They are named after Otto Toeplitz (1881 - 1940), a German mathematician that worked mainly in functional analysis. matrices are matrices fully characterized by their first row and first column. In other words, they are diagonal-constant matrices.
Note
Toeplitz matrices don’t need to be square matrices. The only condition is that their descending diagonals, going left to right, are constant.
In other words, \(T\) is a Toeplitz matrix if it satisfies the following condition
Once we fix the dimensions of a Toeplitz matrix, let’s say \(m\times n\), the set of all possible binary Toeplitz matrices is two-universal. The mapping or hashing from \(\{0,1\}^n\) to \(\{0,1\}^m\) is done with a matrix-vector multiplication. The input \(x\) is shaped as \(n\times 1\) column vector and multiplied on the right with the Toeplitz matrix to obtain the \(m\times 1\) column vector \(z\)
Using the language of randomness extractors, the function \(\text{Ext}(X, Y)\) is constructed by defining a Toeplitz matrix \(T\) using the uniform seed \(Y\), which is used to populate the first row and column. The remaining matrix elements are determined by the relation \(T_{i,j} = T_{i-1, j-1}\). Then, the mapping is defined by the matrix-vector multiplication.
A convenient notation, which will be useful for later embedding this matrix in a circulant one, is to use the following mapping to reshape the vector seed into the matrix
Using this convention, we obtain
Efficient hashing with Fast Fourier Transform#
One of the main advantages of Toeplitz hashing is that it has a complexity \(O(n\log n)\). Note that this is more efficient that the usual matrix-vector multiplication complexity, which is \(O(nm)\). The reason for this is that the hashing can be computed using the Fast Fourier Transform (FFT).
In order to do this, our general Toeplitz matrix has first to be transformed into a particular kind of Toeplitz matrix: a square circulant matrix. This is a matrix in which all columns are composed of the same elements and each column is displaced one element down relative to the preceding column. Mathematically, an \(n\times n\) circulant matrix \(C\) has the form
In practice, we can convert our generic \(m\times n\) Toeplitz matrix \(T\) into an \((m+n-1) \times (m+n-1)\) circulant matrix \(\hat{T}\) by adding \(n-1\) additional rows and \(m-1\) columns to our matrix in the following way
A circulant matrix can be diagonalized using the \(n\times n\) square Fourier matrix \(F_n\), whose matrix elements are determined by
In particular, we can write
where \(q=m+n-1\) and \(y\) is the vector with the seed, or equivalently, the first column of the circulant matrix.
Finally, we can explicitly write the randomness extractor as
where the symbol \(\odot\) is to emphasize that the multiplication is the element-wise multiplication of the two vectors, \(\hat{x}\) is the input vector \(x\) from the weak randomness padded with \(m-1\) zeros, and \(\big|_{0\,\dots\,m-1}\) means that we only consider the first \(m\) bits of the vector after doing the inverse fast fourier transform.
Modified Toeplitz hashing#
We have seen that it is very efficient to compute the Toeplitz hashing. However, it requires a rather long seed. In particular, if we want to extract \(m\) bits from a \(n\)-bit string coming from a weak source, we need a uniform seed of length \(m+n-1\). We can reduce this requirement to only \(n-1\) bits by using a different family of two-universal functions: the modified Toeplitz hashing.
The trick is to define a new matrix \(H\) as the concatenation of a smaller Toeplitz matrix \(T'\) and the identity matrix. The new extractor is defined as
It can be proven that the class of functions determined by such matrices is still two-universal. However, because we are concatenating the identity matrix \(\mathbb{1}_m\), the Toeplitz matrix is of size \(m\times(n-m)\), and therefore we only need a seed of length \(n-1\).
The trick of embedding this matrix into a circulant one and use the fast fourier transform to compute the matrix-vector multiplication still applies to this new matrix \(H\).