API Reference

Contents

API Reference#

Factory class#

class randextract.RandomnessExtractor#

A factory class to create seeded randomness extractors. It also serves as an abstract class with the minimum methods and properties that any implementation class should have.

classmethod register_subclass(extractor_type)#

Decorator to register an implementation class.

Parameters:

extractor_type (str) – unique label of the extractor

Example

@RandomnessExtractor.register_subclass('toeplitz')
class ToeplitzHashing(RandomnessExtractor):
    # ...
classmethod create(extractor_type, *args, **kwargs)#

It creates a randomness extractor object.

Parameters:
  • extractor_type (str) – the label of the desired extractor type. For available types check the usage section.

  • *args – additional parameters that are passed to the implementation class.

  • **kwargs – additional keyword arguments that are passed to the implementation class.

Examples

The following code creates a Toeplitz Hashing extractor and stores it in the variable ext. The output length is computed based on the provided bound on the min-entropy of the weak random source and the allowed error for the extractor.

import randextract
from randextract import RandomnessExtractor

ext = RandomnessExtractor.create(
        extractor_type="toeplitz",
        input_length=1000,
        relative_source_entropy=0.2,
        error_bound=1e-3)

print(f"Output length = {ext.output_length}")
print(f"Required seed length = {ext.seed_length}")

Similarly, we can create a Trevisan’s extractor using a polynomial one-bit extractor and a finite field polynomial weak design. This extractor takes as input a block of 1 MiB and outputs 500 KiB consuming a seed of roughly 26 KiB.

import randextract
from randextract import RandomnessExtractor

ext = RandomnessExtractor.create(
        extractor_type="trevisan",
        weak_design_type="finite_field",
        one_bit_extractor_type="polynomial",
        input_length=2**20,
        relative_source_entropy=0.8,
        output_length=500 * 2**10,
        error_bound=1e-3)

print(f"Output length = {ext.output_length}")
print(f"Required seed length = {ext.seed_length}")

These are examples of actual implementations. For all available parameters of the respective implementation we hereby refer to the actual implementation and its description, see respective docstrings.

abstract property input_length: int#

Returns the expected length of the input by the randomness extractor, i.e., the expected length of a bit string from a weak random source.

Returns:

Length of the input.

Return type:

int

abstract property seed_length: int#

Returns the expected length of the required seed by the randomness extractor.

Returns:

Length of the seed.

Return type:

int

abstract property output_length: int#

Returns the length of the randomness extractor’s output.

Returns:

An integer in the range [0, input_length).

Return type:

int

abstract static calculate_length(extractor_type, input_length, **kwargs) int#

For a given extractor type (i.e., quantum-proof) and a set of parameters (e.g., error_bound or relative_source_entropy), it computes the optimal output length for the extractor. In the case of one-bit extractors, it computes the optimal seed length instead. The exact number of accepted arguments depends on the particular implementation.

Parameters:
  • extractor_type (str) – Type of side information for which the extractor should remain secure. It can take two values: “classical” or “quantum”, for classical-proof and quantum-proof extractors, respectively.

  • input_length (Integral) – Length of the bit string from the weak random source.

Returns:

The optimal output or seed length for given input length and constraints.

Return type:

int

abstract extract(extractor_input, seed) GF2#

For a given input and a seed, it computes the output of the randomness extractor.

Parameters:
  • extractor_input (ndarray | GF2) – a Numpy or Galois binary array from a weak random source.

  • seed (ndarray | GF2) – a Numpy or Galois binary array with a uniform seed.

Returns:

A binary array with the randomness extractor’s output.

Return type:

GF2

Toeplitz Hashing implementation class#

class randextract.ToeplitzHashing(input_length, output_length)#

Implementation class for the Toeplitz hashing randomness extractor.

The constructor expects the input and output lengths. These uniquely define a particular family of Toeplitz hashing functions. In order to compute the optimal output length for a given input length and some assumptions, check the static method calculate_output_length().

The output of this randomness extractor is the result of doing a matrix-vector multiplication, where the matrix is a Toeplitz matrix determined by the uniform seed, and the vector is the bit string from the weak random source.

The extract() method computes the matrix-vector multiplication using the Fast Fourier Transform by embedding the Toeplitz matrix into a square circulant matrix. The Toeplitz matrix for a given seed can be obtained with the to_matrix() method (only for debug purposes).

Parameters:
  • input_length (Integral) – Length of the bit string from the weak random source. It must be a positive number.

  • output_length (Integral) – Length of the randomness extractor’s output. It must be in the range [1, input_length]. To compute the optimal output length given some constraints on the tolerated error and the weak random source use calculate_output_length().

Examples

The following creates a randomness extractor corresponding to a family of Toeplitz hashing functions that takes as input one million bits and extracts hundred thousand bits. The constructor does not select a particular function, this is done when passing a seed to the extract() method.

from randextract import ToeplitzHashing

ext = ToeplitzHashing(input_length=10**6, output_length=10**5)

The extract() method can be used to extract the randomness from the input, providing a uniform seed of the right length. The following example creates a random input and seed and pass them to the extract method.

import numpy as np
from galois import GF2

input_array = GF2.Random(ext.input_length)
seed = GF2.Random(ext.seed_length)
output_ext = ext.extract(input_array, seed)

assert output_ext.size == ext.output_length

The actual Toeplitz matrix can be exported using the to_matrix() method.

ext = ToeplitzHashing(input_length=20, output_length=8)

input_array = GF2.Random(ext.input_length)
seed = GF2.Random(ext.seed_length)

toeplitz_matrix = ext.to_matrix(seed)
print(toeplitz_matrix)

output1 = toeplitz_matrix @ input_array
output2 = ext.extract(input_array, seed)
assert np.array_equal(output1, output2)

The benefits of using the extract() method are very noticeable for large Toeplitz matrices.

ext = ToeplitzHashing(input_length=2**10, output_length=2**8)

input_array = GF2.Random(ext.input_length)
seed = GF2.Random(ext.seed_length)

%timeit (ext.extract(input_array, seed))
%timeit (ext.to_matrix(seed) @ input_array)

By default, the Toeplitz matrix is constructed from the seed vector using the standard mathematical definition, i.e., \(M[i, j] = \text{seed}[i - j]\). However, it is possible to use a different order using the seed_mode argument. It can take values “default”, “col-first”, “row-first”, or “custom”. If “custom” is used, you must provide a permutation array and pass it to seed_order kwarg.

ext = ToeplitzHashing(input_length=5, output_length=4)

seed = GF2.Zeros(ext.seed_length)
seed[0] = 1

# Default order, first element is the first element of the matrix
ext.to_matrix(seed)

# Fill the first column first bottom to top, then first row left to right
ext.to_matrix(seed, seed_mode="col-first")

# Fill the first row first left to right, then the first column top to bottom
ext.to_matrix(seed, seed_mode="row-first")

# Custom order
seed_order = np.array([1, 0, 2, 3, 4, 5, 6, 7])
ext.to_matrix(seed, seed_mode="custom", seed_order=seed_order)

See also

Theory: Toeplitz hashing.

static calculate_length(extractor_type, input_length, **kwargs) int#

For a given extractor type (i.e., quantum-proof) and a set of parameters, it computes the optimal output length for the (modified) Toeplitz hashing. The returned value can be used to choose the optimal family of extractors.

Parameters:
  • extractor_type (str) – This determines the type of side information against which the extractor should remain secure. Two values are accepted: “quantum” and “classical”. In this case, there is no difference between passing “quantum” or “classical” since the version of the leftover hash lemma with quantum side information has exactly the same form as the one with only classical side information.

  • input_length (Integral) – The length of the bit string from the weak random source.

Keyword Arguments:
  • relative_source_entropy – Lower bound on the conditional min-entropy of the weak random source, normalized by the input length. It must be a real number in the range (0, 1].

  • error_bound – Upper bound on the randomness extractor’s error, i.e., a measure of how far the output of the extractor is from the ideal (uniform) output. It must be a real number in the range (0, 1].

Return type:

int

extract(extractor_input, seed, seed_mode='default', seed_order=None) GF2#

Given input_length bits from a weak random source with at least relative_source_entropy \(\times\) input_length bits of entropy, it outputs an almost uniform binary array up to an error error_bound.

Parameters:
  • extractor_input (ndarray | GF2) – Binary array from the weak random source.

  • seed (ndarray | GF2) – Uniform seed used to populate the Toeplitz matrix.

  • seed_mode (str) – Mode to construct the Toeplitz matrix from the seed vector. Default mode uses the standard mathematical definition, i.e., Toeplitz_matrix(i, j) = seed(i - j). Other possible modes are: “col-first”, “row-first”, or “custom”. With “custom” a permutation array can be passed to seed_order.

  • seed_order (ndarray | None) – Permutation vector that determines how to construct the Toeplitz matrix from the seed vector. A valid seed_order array is a permutation of np.arange(seed.size). Using np.arange(seed.size) as the permutation is equivalent to seed_mode=”col-first”. Read the docs for examples of how to use this parameter, e.g. this unit test.

Returns:

An almost uniform (up to an error error_bound) binary array.

Return type:

GF2

to_matrix(seed, seed_mode='default', seed_order=None) GF2#

For a given seed, it outputs the corresponding Toeplitz matrix with dimensions output_length \(\times\) input_length.

Parameters:
  • seed (ndarray | GF2) – Uniform seed used to populate the Toeplitz matrix.

  • seed_mode (str) – Mode to construct the Toeplitz matrix from the seed vector. Default mode uses the standard mathematical definition, i.e., Toeplitz_matrix(i, j) = seed(i - j). Other possible modes are: “col-first”, “row-first”, or “custom”. With “custom” a permutation array can be passed to seed_order.

  • seed_order (ndarray | None) – Permutation vector that determines how to construct the Toeplitz matrix from the seed vector. A valid seed_order array is a permutation of np.arange(seed.size). Using np.arange(seed.size) as the permutation is equivalent to seed_mode=”col-first”. Read the docs for examples of how to use this parameter, e.g. this unit test.

Returns:

The corresponding Toeplitz matrix, a 2-dimensional binary array of shape (output_length, input_length).

Return type:

GF2

property input_length: int#

Returns the expected length of the input by the randomness extractor, i.e., the expected length of a bit string from a weak random source.

Returns:

Length of the input.

Return type:

int

property seed_length: int#

Returns the expected length of the required seed by the randomness extractor.

Returns:

Length of the seed.

Return type:

int

property output_length: int#

Returns the length of the randomness extractor’s output.

Returns:

An integer in the range [0, input_length).

Return type:

int

class randextract.ModifiedToeplitzHashing(input_length, output_length)#

Implementation class for the modified Toeplitz hashing randomness extractor.

The constructor expects the input and output lengths. These uniquely define a particular family of modified Toeplitz hashing functions. In order to compute the optimal output length for a given input length and some assumptions, check the static method calculate_output_length().

With this information, the output of the randomness extractor is the result of doing a matrix-vector multiplication. The matrix is the result of concatenating a Toeplitz matrix, determined by the uniform seed, together with an identity matrix of size output_length. The vector is the bit string from the weak random source.

Because the Toeplitz matrix has dimensions output_length \(\times\) input_length - output_length, the required seed is smaller than with the Toeplitz hashing.

The extract() method computes the matrix-vector multiplication using the Fast Fourier Transform by embedding the modified Toeplitz matrix into a square circulant matrix. The modified Toeplitz matrix for a given seed can be obtained with the to_matrix() method.

Parameters:
  • input_length (Integral) – Length of the bit string from the weak random source. It must be a positive number.

  • output_length (Integral) – Length of the randomness extractor’s output. It must be in the range [1, input_length]. To compute the optimal output length given some constraints on the tolerated error and the weak random source use calculate_output_length().

Examples

The following creates a randomness extractor corresponding to a family of modified Toeplitz hashing functions that takes as input one million bits and extracts hundred thousand bits. The constructor does not select a particular function, this is done when passing a seed to the extract() method.

from randextract import ModifiedToeplitzHashing

ext = ModifiedToeplitzHashing(input_length=10**6, output_length=10**5)

Notice that the required seed is smaller because it does not depend on the output length.

from randextract import ToeplitzHashing

toeplitz = ToeplitzHashing(input_length=10**6, output_length=10**5)

mod_toeplitz = ModifiedToeplitzHashing(input_length=10**6, output_length=10**5)

print(f"Toeplitz hashing requires a seed of length {toeplitz.seed_length:_}.")
print(f"While the modified Toeplitz hashing a seed of length {mod_toeplitz.seed_length:_}.")

You can check that the new matrix is a concatenation of a Toeplitz matrix with an identity matrix using the to_matrix() method.

import numpy as np
from galois import GF2

ext = ModifiedToeplitzHashing(input_length=20, output_length=8)

input_array = GF2.Random(ext.input_length)
seed = GF2.Random(ext.seed_length)

modified_toeplitz_matrix = ext.to_matrix(seed)
print(modified_toeplitz_matrix)

output1 = modified_toeplitz_matrix @ input_array
output2 = ext.extract(input_array, seed)
assert np.array_equal(output1, output2)

This saving in required seed also comes with some performance gain.

ext1  = ToeplitzHashing(input_length=2**20, output_length=2**10)
ext2  = ModifiedToeplitzHashing(input_length=2**20, output_length=2**10)

input_array = GF2.Random(ext1.input_length)
seed1 = GF2.Random(ext1.seed_length)
seed2 = GF2.Random(ext2.seed_length)

%timeit (ext1.extract(input_array, seed1))
%timeit (ext2.extract(input_array, seed2))

See also

Theory: Toeplitz hashing.

static calculate_length(extractor_type, input_length, **kwargs) int#

For a given extractor type (i.e., quantum-proof) and a set of parameters, it computes the optimal output length for the (modified) Toeplitz hashing. The returned value can be used to choose the optimal family of extractors.

Parameters:
  • extractor_type (str) – This determines the type of side information against which the extractor should remain secure. Two values are accepted: “quantum” and “classical”. In this case, there is no difference between passing “quantum” or “classical” since the version of the leftover hash lemma with quantum side information has exactly the same form as the one with only classical side information.

  • input_length (Integral) – The length of the bit string from the weak random source.

Keyword Arguments:
  • relative_source_entropy – Lower bound on the conditional min-entropy of the weak random source, normalized by the input length. It must be a real number in the range (0, 1].

  • error_bound – Upper bound on the randomness extractor’s error, i.e., a measure of how far the output of the extractor is from the ideal (uniform) output. It must be a real number in the range (0, 1].

Return type:

int

to_matrix(seed, seed_mode='default', seed_order=None) GF2#

For a given seed, it outputs the corresponding Toeplitz matrix with dimensions output_length \(\times\) input_length.

Parameters:
  • seed (ndarray | GF2 | None) – Uniform seed used to populate the Toeplitz matrix.

  • seed_mode (str) – Mode to construct the Toeplitz matrix from the seed vector. Default mode uses the standard mathematical definition, i.e., Toeplitz_matrix(i, j) = seed(i - j). Other possible modes are: “col-first”, “row-first”, or “custom”. With “custom” a permutation array can be passed to seed_order.

  • seed_order (ndarray | None) – Permutation vector that determines how to construct the Toeplitz matrix from the seed vector. A valid seed_order array is a permutation of np.arange(seed.size). Using np.arange(seed.size) as the permutation is equivalent to seed_mode=”col-first”. Read the docs for examples of how to use this parameter, e.g. this unit test.

Returns:

The corresponding Toeplitz matrix, a 2-dimensional binary array of shape (output_length, input_length).

Return type:

GF2

extract(extractor_input, seed, seed_mode='default', seed_order=None) GF2#

Given input_length bits from a weak random source with at least relative_source_entropy \(\times\) input_length bits of entropy, it outputs an almost uniform binary array up to an error error_bound.

Parameters:
  • extractor_input (ndarray | GF2) – Binary array from the weak random source.

  • seed (ndarray | GF2 | None) – Uniform seed used to populate the Toeplitz matrix.

  • seed_mode (str) – Mode to construct the Toeplitz matrix from the seed vector. Default mode uses the standard mathematical definition, i.e., Toeplitz_matrix(i, j) = seed(i - j). Other possible modes are: “col-first”, “row-first”, or “custom”. With “custom” a permutation array can be passed to seed_order.

  • seed_order (ndarray | None) – Permutation vector that determines how to construct the Toeplitz matrix from the seed vector. A valid seed_order array is a permutation of np.arange(seed.size). Using np.arange(seed.size) as the permutation is equivalent to seed_mode=”col-first”. Read the docs for examples of how to use this parameter, e.g. this unit test.

Returns:

An almost uniform (up to an error error_bound) binary array.

Return type:

GF2

Trevisan’s construction implementation class#

class randextract.TrevisanExtractor(input_length, output_length, weak_design_type, one_bit_extractor_type, one_bit_extractor_seed_length, basic_weak_design_type=None, precomputed_weak_design=None)#

Implementation class for the Trevisan’s construction randomness extractor.

The main idea is that a one-bit extractor can be called multiple times to extract one bit each time. The desired guarantee for a randomness extractor is given by a weak design, a family of sets that determines what subset of the seed has to be passed for the one-bit extractor in every call. The final output is the concatenation of all these extracted bits.

Different one-bit extractors and weak designs can be used to obtain diverse constructions, which can have distinct properties such as required seed or computational complexity.

For details about the one-bit extractors and weak designs, check their corresponding docstrings. The current available one-bit extractor and weak design implementations are:

One-bit extractors:
Weak designs:
Parameters:
  • input_length (Integral) – Length of the bit string from the weak random source.

  • output_length (Integral) – Length of the randomness extractor’s output.

  • weak_design_type (str) – The type weak design of construction. It can be either “finite_field” for the finite field polynomial design or “block” for the block weak design.

  • one_bit_extractor_type (str) – The type of the one-bit extractor. Use “xor” for the XOR one-bit extractor and “polynomial” for the polynomial hashing one-bit extractor.

  • one_bit_extractor_seed_length (Integral) – Length of the seed passed to the one-bit extractor. This, together with the type of the weak design, determines the length of the seed of the Trevisan’s construction. A value for this parameter can be obtained using the calculate_length() method of the chosen one-bit extractor.

  • basic_weak_design_type (str | None) – (Optional) The basic weak design type when weak_design_type=”block”. By default, it uses “finite_field”.

  • precomputed_weak_design (ndarray | FieldArray | None) – (Optional) A family of sets of indices given as a NumPy or Galois array with the properties of a weak design. Computing a weak design, specially for large fields, is one of the slowest computations when using a Trevisan’s extractor. It is recommended to compute it once, save it, and pass it later using this argument.

Examples

A TrevisanExtractor object can be created directly calling the constructor of this class or using the factory class RandomnessExtractor. Both methods are equivalent. For example, the following code creates a Trevisan’s construction using the polynomial one-bit extractor and the finite field polynomial weak design.

import randextract
from randextract import RandomnessExtractor, TrevisanExtractor

ext1 = RandomnessExtractor.create(
        extractor_type="trevisan",
        weak_design_type="finite_field",
        one_bit_extractor_type="polynomial",
        input_length=2**20,
        relative_source_entropy=0.8,
        output_length=2**10,
        error_bound=1e-3)

ext2 = TrevisanExtractor(
        weak_design_type="finite_field",
        one_bit_extractor_type="polynomial",
        input_length=2**20,
        relative_source_entropy=0.8,
        output_length=2**10,
        error_bound=1e-3)

assert ext1.output_length == ext2.output_length
assert ext1.seed_length == ext2.seed_length

Trevisan’s construction is just an algorithm to create randomness extractor from one-bit extractors and weak designs. The required seed length and computational complexity depends on the choice of these two pieces.

from galois import GF2

ext1 = RandomnessExtractor.create(
        extractor_type="trevisan",
        weak_design_type="finite_field",
        one_bit_extractor_type="polynomial",
        input_length=2**20,
        relative_source_entropy=0.8,
        output_length=2**10,
        error_bound=1e-3)

ext2 = TrevisanExtractor(
        weak_design_type="finite_field",
        one_bit_extractor_type="xor",
        input_length=2**20,
        relative_source_entropy=0.8,
        output_length=2**10,
        error_bound=1e-3)

print(f"Trevisan's construction with polynomial one-bit extractor requires a seed of length {ext1.seed_length}")
print(f"Trevisan's construction with XOR one-bit extractor requires a seed of length {ext2.seed_length}")

input_array = GF2.Random(ext1.input_length)
seed1 = GF2.Random(ext1.seed_length)
seed2 = GF2.Random(ext2.seed_length)

%timeit (ext1.extract(input_array, seed1))
%timeit (ext2.extract(input_array, seed2))

See also

Theory: Trevisan’s construction.

static calculate_length(extractor_type, input_length, **kwargs) list[dict[str, int]]#

For a given extractor type (i.e., quantum-proof) and a set of parameters (e.g., error_bound or relative_source_entropy), it computes the optimal output length for the extractor. In the case of one-bit extractors, it computes the optimal seed length instead. The exact number of accepted arguments depends on the particular implementation.

Parameters:
  • extractor_type (str) – Type of side information for which the extractor should remain secure. It can take two values: “classical” or “quantum”, for classical-proof and quantum-proof extractors, respectively.

  • input_length (Integral) – Length of the bit string from the weak random source.

Returns:

The optimal output or seed length for given input length and constraints.

Return type:

int

extract(extractor_input, seed) GF2#

For a given input and a uniform seed, the chosen one-bit extractor is called output_length times using as seeds the bits determined by the weak design from the given seed.

Parameters:
  • extractor_input (ndarray | GF2) – Binary array from the weak random source.

  • seed (ndarray | GF2) – Uniform seed used to call the one-bit extractor multiple times.

Returns:

An almost uniform (up to an error error_bound) binary array.

Return type:

GF2

property input_length: int#

Returns the expected length of the input by the randomness extractor, i.e., the expected length of a bit string from a weak random source.

Returns:

Length of the input.

Return type:

int

property seed_length: int#

Returns the expected length of the required seed by the randomness extractor.

Returns:

Length of the seed.

Return type:

int

property output_length: int#

Returns the length of the randomness extractor’s output.

Returns:

An integer in the range [0, input_length).

Return type:

int

class randextract.WeakDesign#

Abstract class with the minimum methods and properties that any weak design implementation class should have.

classmethod register_subclass(weak_design_type)#

Decorator to register an implementation class.

Parameters:

weak_design_type (str) – unique label of the weak design

Example

@WeakDesign.register_subclass('finite_field')
class FiniteFieldPolynomialDesign(WeakDesign):
    # ...
classmethod create(weak_design_type, *args, **kwargs)#

It creates a weak design object.

Parameters:
  • weak_design_type (str) – the label of the desired weak design type.

  • *args – additional parameters that are passed to the implementation class.

  • **kwargs – additional keyword arguments that are passed to the implementation class.

Example

The following code creates a small finite field polynomial weak design and stores it in the variable weak.

import randextract
from randextract import WeakDesign

weak = WeakDesign.create(weak_design_type="finite_field",
                         number_of_sets=20, size_of_set=5)
classmethod get_relative_overlap(weak_design_type) float#

Weak designs are characterized by an upper bound on their set’s overlap. This method returns this bound. For the mathematical definition check the corresponding section in the theory.

Parameters:

weak_design_type (str) – the label of the weak design type.

Returns:

Upper bound on the weak design’s overlap normalized by the number of sets.

Return type:

float

static is_valid(weak_design, number_of_sets, size_of_sets, relative_overlap) bool#

It checks if a NumPy array is a valid weak design.

Parameters:
  • weak_design (ndarray) – a NumPy array containing a weak design.

  • number_of_sets (Integral) – the desired number of sets that the weak design should contain.

  • size_of_sets (Integral) – the desired size of the sets.

  • relative_overlap (float) – an upper bound of the relative overlap between the sets.

Returns:

True if the NumPy array is a valid weak design with the desired parameters, False otherwise.

Return type:

bool

static save_to_file(weak_design, filename) None#

A wrapper using numpy.save() to save a weak design to a file so that it can be read and reused later.

Parameters:
  • weak_design (ndarray) – the weak design to be saved.

  • filename (str | Path) – a string with the desired filename or a Path object. If a str is passed, the file will be stored in the current working directory.

Return type:

None

static read_from_file(filename, number_of_sets, size_of_sets, range_design) ndarray#

It reads and verifies a precomputed weak design stored in a file. If the array has the correct shape and contains values in the right range, the weak design is returned as a NumPy array. Note that this method does not check if it’s a valid weak design with a certain overlap parameter. Use is_valid() for that.

Parameters:
  • filename (str | Path) – a string with the filename or a Path object to read. If a str is passed, the file is assumed to be in the current working directory.

  • number_of_sets (Integral) – the desired number of sets that the weak design should contain.

  • size_of_sets (Integral) – the desired size of the sets.

  • range_design (Integral) – the desired range of the weak design

Returns:

A NumPy array of shape number_of_sets x size_of_sets and values in [0, range).

Return type:

np.ndarray

abstract classmethod relative_overlap() float#
Returns:

Upper bound on the weak design’s overlap normalized by the number of sets.

Return type:

float

abstract property number_of_sets: int#

Returns: int: Size of the weak design, i.e., the number of sets.

abstract property size_of_sets: int#

Returns: int: The size (cardinality) of all sets.

abstract property weak_design: ndarray#

Returns: FieldArray: The computed weak design, i.e., a family of number_of_sets sets of size size_of_set.

abstract property is_computed: bool#

Returns: bool: True if the design has been computed already, False otherwise.

abstract property range_design: int#

Returns: int: All the sets that form the weak design have elements from [d] = [0, range_design).

abstract compute_design() None#

This function computes the weak design and saves it to memory.

Return type:

None

abstract get_set(index) ndarray#
Parameters:

index (Integral) – The index of a set from the family of sets, i.e. an integer in [0, m-1].

Returns:

The index-th set of the weak design.

Return type:

FieldArray

Weak design implementation classes#

class randextract.FiniteFieldPolynomialDesign(number_of_sets, size_of_sets, precomputed_weak_design=None, assume_valid=False)#

Implementation class for the finite field polynomial weak design.

The weak design is obtained by evaluating polynomials over a finite field of a size matching the desired set size. Roughly speaking, as many polynomials as the desired number of sets are computed in a degree increasing order. Each polynomial labels one set in the family of sets. Then, for each element in the finite field, the corresponding polynomial is evaluated. Finally, the pair (element, polynomial evaluation) is mapped to a larger finite field of size matching the length of the extractor’s seed. For the mathematical details and a detailed example check the corresponding section in the theory.

This particular weak design has an upper bound on its relative overlap of \(2e\).

Parameters:
  • number_of_sets (Integral) – Size of the weak design, i.e., the number of sets.

  • size_of_sets (Integral) – The size (cardinality) of all sets. It must be a prime number.

  • weak_design – A pre-computed weak design, i.e., a family of number_of_sets sets of size size_of_set. It should be given a FieldArray (or np.ndarray), or a pathlib.Path to a file saved using save_to_file().

  • assume_valid (bool | None) – A boolean that triggers computing the overlap of the provided weak design to check that it’s a valid family with an overlap below the upper bound. Use True to do the check, use False (default) to skip it.

Examples

A FiniteFieldPolynomialDesign object can be created directly calling the constructor of this class. For example, the following code creates a weak design with 1024 sets, each one with 32771 values from the finite field \([32771^2] = \{0,\dots,1073938440\}\).

import randextract
from randextract import FiniteFieldPolynomialDesign

wd = FiniteFieldPolynomialDesign(
        number_of_sets=2**10,
        size_of_set=32_771)

number_of_sets = wd.number_of_sets
size_of_set = wd.size_of_set
finite_field_size = wd.weak_design._order

The following code compute the small design presented as an example in the theory section.

wd = FiniteFieldPolynomialDesign(
        number_of_sets=6,
        size_of_set=2)

print(wd.weak_design)

See also

Theory: Trevisan’s construction.

classmethod relative_overlap() float#
Returns:

Upper bound on the weak design’s overlap normalized by the number of sets.

Return type:

float

compute_design()#

This function computes the weak design and saves it to memory.

get_set(index) ndarray | None#
Parameters:

index (Integral) – The index of a set from the family of sets, i.e. an integer in [0, m-1].

Returns:

The index-th set of the weak design.

Return type:

FieldArray

property number_of_sets: int#

Returns: int: Size of the weak design, i.e., the number of sets.

property size_of_sets: int#

Returns: int: The size (cardinality) of all sets.

property range_design: int#

Returns: int: All the sets that form the weak design have elements from [d] = [0, range_design).

property weak_design: ndarray | None#

Returns: FieldArray: The computed weak design, i.e., a family of number_of_sets sets of size size_of_set.

property is_computed: bool#

Returns: bool: True if the design has been computed already, False otherwise.

One-bit extractor implementation classes#

class randextract.XOROneBitExtractor(input_length, number_indices=None, seed_length=None)#

Implementation class for the XOR one-bit randomness extractor.

This one-bit extractor works by computing the parity of a selection of bits from the input. The selection of bits is done using a uniform seed. Parity of a binary array refers to whether it contains an odd or even number of 1-bits. The array has “odd parity” (1), if it contains odd number of 1s and has “even parity” (0) if it contains even number 1s. For a more rigorous definition and a detailed example check the corresponding section in the theory.

Parameters:
  • input_length (Integral) – Length of the bit string from the weak random source. It must be greater than 1 and smaller than the max integer that can be stored in a np.uint64 (approx. \(10^{19}\))

  • number_indices (Integral | None) – Number of bits from the input that will be XOR together. It must be an integer in [1, input_length].

  • seed_length (Integral | None) – Length of the uniform seed. It must be an integer in [1, input_length * bits_per_index]

Examples

A XOROneBitExtractor object can be created directly calling the constructor of this class or using the factory class RandomnessExtractor. Both methods are equivalent.

from randextract import RandomnessExtractor, XOROneBitExtractor

ext1 = RandomnessExtractor.create(
    extractor_type="xor", input_length=2**20, number_indices=100
)

ext2 = XOROneBitExtractor(input_length=2**20, number_indices=100)

assert ext1.output_length == ext2.output_length == 1
assert ext1.number_indices == ext2.number_indices == 100
# 20 bits are needed to store an index in [0, 2^20 - 1]
assert ext1.seed_length == ext2.seed_length == 20 * 100

Even though the output is a single bit, to be consistent with the RandomnessExtractor this extractor still outputs a 1-dim array. If you want the bit value use the item() function.

import numpy as np
from galois import GF2
from randextract import XOROneBitExtractor

ext = XOROneBitExtractor(input_length=100, number_indices=10)

rng = np.random.default_rng()
extractor_input = GF2.Random(100, seed=rng)
seed = rng.integers(low=0, high=100, size=10)

output = ext.extract(extractor_input, seed).item()

How many indices or, in other words, how long should the seed be in order to extract securely one bit from a weak randomness source can be computed using the class method calculate_length().

from randextract import XOROneBitExtractor

indices = XOROneBitExtractor.calculate_length(
    extractor_type="quantum",
    return_mode="indices",
    input_length=2**20,
    relative_source_entropy=0.7,
    error_bound=1e-6
)

ext = XOROneBitExtractor(2**20, number_indices=indices)
print(f"input length: {ext.input_length}")
print(f"seed length: {ext.seed_length}")
print(f"number indices: {ext.number_indices}")
Raises:
  • TypeError – If the passed arguments have wrong types, or if a compulsory argument is missing (e.g. missing input_length).

  • ValueError – If the passed arguments have wrong values (e.g. a negative seed_length value).

See also

Theory: XOR one-bit extractor

static calculate_length(extractor_type, input_length, **kwargs) int#

For a given extractor type (i.e., quantum-proof) and a set of parameters, it computes the optimal seed length for the XOR one-bit extractor. The returned value is, by default, in order to be consistent with other extractors the number of required bits. Alternatively, the number of indices can be obtained by passing return_mode="binary". Both values can be passed directly to the XOROneBitExtractor constructor to select the optimal family of functions.

Parameters:
  • extractor_type (str) – This determines the type of side information against which the extractor should remain secure. Two values are accepted: “quantum” and “classical”.

  • input_length (Integral) – The length of the bit string from the weak random source.

Keyword Arguments:
  • relative_source_entropy – Lower bound on the conditional min-entropy of the weak random source, normalized by the input length. It must be a real number in the range (0, 1].

  • error_bound – Upper bound on the randomness extractor’s error, i.e., a measure of how far the output of the extractor is from the ideal (uniform) output. It must be a real number in the range (0, 1].

  • return_mode (optional) – Either set with value “indices” to return the optimal number of indices or with “binary” (default) to return the optimal number of bits.

Raises:
  • TypeError – If some of the passed arguments or keyword arguments have wrong types.

  • ValueError – If some of the passed arguments or keyword arguments have wrong values, if it is not possible to find a seed length for given parameters, or if the numerical optimization failed.

Return type:

int

Returns:

Either the number of indices or the seed length (default) depending on return_mode required to extract securely one bit from the weak randomness source.

extract(extractor_input, seed, **kwargs) GF2#

Given a bitstring of length input_length, the XOR one-bit extractor returns the parity of the bits selected by the seed. This seed can take two different forms: an array of indices or another bitstring. The mode can be selected using the keyword argument seed_mode.

Parameters:
  • extractor_input (ndarray | GF2) – Binary array from the weak random source.

  • seed (ndarray | GF2 | None) – An array of indices with values in [0, ìnput_length - 1] or a binary array.

Keyword Arguments:

seed_mode (optional) – It can take two values: "indices" or "binary". The former assumes that the seed is an array with values that can be used to select any of the bits from the extractor_input. The latter assumes that each \(\lceil\log_2(\text{input\_length})\rceil\) bits represent a single index.

Return type:

GF2

Returns:

The extracted bit returned as a A 1-dim GF2 array of size 1.

property input_length: int#

Returns the expected length of the input by the randomness extractor, i.e., the expected length of a bit string from a weak random source.

Returns:

Length of the input.

Return type:

int

property seed_length: int#

Returns the expected length of the required seed by the randomness extractor.

Returns:

Length of the seed.

Return type:

int

property output_length: int#

Returns the length of the randomness extractor’s output.

Returns:

An integer in the range [0, input_length).

Return type:

int

class randextract.PolynomialOneBitExtractor(input_length, seed_length, irreducible_poly=None)#

Implementation class for the polynomial one-bit randomness extractor.

This one-bit extractor is actually a concatenation of two hash functions, each of which uses half of the given seed. The first hash is a polynomial evaluation and the second hash is the parity of a bitwise multiplication. For a more rigorous definition and a detailed example check the corresponding section in the theory.

Parameters:
  • input_length (Integral) – Length of the bit string from the weak random source. It must be an integer greater than 1.

  • seed_length (Integral) – Length of the seed. It must be an even integer in the closed interval [2, 2 * input_length].

  • irreducible_poly (Poly | None) – An irreducible polynomial used for the arithmetic calculation in the Reed Solomon hashing. It must be a galois.Poly object. It must be provided for seed_length larger than 20_001 bits.

Examples

A PolynomialOneBitExtractor object can be created directly calling the constructor of this class or using the factory class RandomnessExtractor. Both methods are equivalent.

from randextract import RandomnessExtractor, PolynomialOneBitExtractor

ext1 = RandomnessExtractor.create(
        extractor_type="polynomial",
        input_length=2**20,
        seed_length=100)

ext2 = XOROneBitExtractor(
        input_length=2**20,
        seed_length=100)

assert ext1.output_length == ext2.output_length == 1
assert ext1.seed_length == ext2.seed_length == 100

You can run the example from the documentation

from galois import GF2, Poly

ext = XOROneBitExtractor(
        input_length=25,
        seed_length=24,
        irreducible_poly=Poly.Str("x^12 + x^3 + 1"))

extractor_input = GF2([1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0])
seed = GF2([0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1])

ext.extract(extractor_input, seed)
static calculate_length(extractor_type, input_length, **kwargs) int#

For a given extractor type (i.e., quantum-proof) and a set of parameters, it computes the optimal seed length for the polynomial one-bit extractor. This extractor is a combination of two almost two-universal extractors.

Parameters:
  • extractor_type (str) – This determines the type of side information against which the extractor should remain secure. Two values are accepted: “quantum” and “classical”.

  • input_length (Integral) – The length of the bit string from the weak random source.

Keyword Arguments:
  • relative_source_entropy – Lower bound on the conditional min-entropy of the weak random source, normalized by the input length. It must be a real number in the range (0, 1].

  • error_bound – Upper bound on the randomness extractor’s error, i.e., a measure of how far the output of the extractor is from the ideal (uniform) output. It must be a real number in the range (0, 1].

Return type:

int

See also

Theory: Trevisan’s construction.

extract(extractor_input, seed) GF2#

For a given input and a seed, it computes the output of the randomness extractor.

Parameters:
  • extractor_input (ndarray | GF2) – a Numpy or Galois binary array from a weak random source.

  • seed (ndarray | GF2) – a Numpy or Galois binary array with a uniform seed.

Returns:

A binary array with the randomness extractor’s output.

Return type:

GF2

property input_length: int#

Returns the expected length of the input by the randomness extractor, i.e., the expected length of a bit string from a weak random source.

Returns:

Length of the input.

Return type:

int

property seed_length: int#

Returns the expected length of the required seed by the randomness extractor.

Returns:

Length of the seed.

Return type:

int

property output_length: int#

Returns the length of the randomness extractor’s output.

Returns:

An integer in the range [0, input_length).

Return type:

int

Validator class#

class randextract.Validator(extractor)#

The Validator class implements a way to test an arbitrary implementation of a randomness extractors against our reference library.

The constructor only expects a valid RandomnessExtractor. Implementations to be tested can be added with the method obj:add_implementation(). To validate the provided implementation(s) against our reference library use the method validate(). Finally, to check whether the added implementation(s) have passed the tests you can check the boolean validator.all_passed or simply print the validator object to get a summary print(validator).

Parameters:

extractor (RandomnessExtractor) – A valid RandomnessExtractor object

Examples

Detailed examples using this class to test other implementations can be found in the package documentation.

Simple examples can also be found on the unit tests, i.e., tests/unit/test_validator.py.

add_implementation(label, input_method, **kwargs) None#
Parameters:
  • label (str) – A name to identify the provided implementation, e.g., “rust-fft”

  • input_method (str) – This determines how the results from the provided implementation are obtained. Three methods are implemented at the moment: “stdio” (standard input/output), “read_files” and “custom”. Use “stdio” when you have a binary that uses stdin to take the parameters, seeds and input, and provides the result via stdout. Use “read_files” if the results are saved in one or more files. Use “custom” if previous methods do not fit your use case. Each method expects additional keyword arguments. Read below for details.

Keyword Arguments:
  • command (str) – (input_method="stdio") Provide the command that gives the output hash to compare with the reference implementation. You can use the following variables in your command: $INPUT_LENGTH$, $OUTPUT_LENGTH$, $SEED$, $INPUT$. By default, $INPUT_LENGTH$ and $OUTPUT_LENGTH$ are passed as strings, and $SEED$ and $INPUT$ as bit strings (without spaces). If you want any other format, check the format_dict kwarg.

  • format_dict (dict) – (input_method="stdio") A dictionary containing functions to convert the variables mentioned in the command kwarg from their usual Python representation to any arbitrary format expected by the implementation to be tested. Use the whole variable name as key for the dict, e.g. {"$SEED$": <function>}.

  • parser (dict) – (input_method="read_files") A dict containing generators to parse the files, i.e., {"input": <generator_input>, "seed": <generator_seed>, "output": <generator_output>}. Check the examples below.

  • custom_class (class) – (input_method="custom") An implementation class for the provided ValidatorCustomClassAbs abstract class. Check the documentation of the abstract class for more details about what you should implement. Check the examples below and the use cases subsection in the documentation for real scenarios using the “custom” input_method.

Return type:

None

Examples

A detailed example using input_method="stdio" can be found here. For input_method="read_files" you can check this other example. Finally, for the input_method="custom" check this complete example.

In addition, you can check the very simple examples from the unit tests in tests/unit/test_validator.py.

remove_implementation(label) None#

It removes the implementation saved with name label. If there is no implementation with passed label a UserWarning is raised.

Parameters:

label (str) – The name associated with a saved implementation included using add_implementation()

Return type:

None

replace_implementation(label, input_method='stdio', **kwargs) None#

If add_implementation() is used with a label that already exists, a UserWarning is raised. This method does exactly the same as add_implementation() but issues no warning.

Return type:

None

validate(**kwargs) None#

It validates the added implementation(s) with add_implementation() against the reference extractor with which the validator was constructed. Implementations with input_method="stdio" can be tested using two different methods: randomly or with a brute force comparison. Check the keyword arguments for more details. This method does not return anything, but it updates the keys “validated” and “valid”

Keyword Arguments:
  • mode (str) – This only affects implementations with input_method="stdio". Two modes are available: “random” and “brute-force”. Random tests take seeds and inputs uniformly at random and compares the output of the reference implementation with the added implementation(s)

  • sample_size (int) – (mode="random") The number of random inputs and seeds that will be used to validate the implementations added with input_method="stdio"

  • max_attempts (int | str) – (mode="brute-force") The max number of testing rounds. Use max_attempts="all" if you want to run an exhaustive brute-force testing trying all possible input and seeds.

  • rng (int | Generator | None) – (mode="random") Seed to initialize the NumPy RNG or, alternatively, an already initialized Generator, e.g. numpy.random.default_rng(1337). This only affects implementations with input_method="stdio"

Return type:

None

static analyze_failed_test(filename=None, func=None) tuple[GF2, GF2, GF2, GF2]#

It loads to memory the arrays from a failed test, saved in a npz file by the validate() method, and prints some useful information to help debug the wrong implementation.

Parameters:
  • filename (str | Path | None) – The name of the npz file saved by validate() on a failed test. If no filename is provided (default), the newest npz file will be loaded.

  • func (Optional[Callable]) – By default, a simple analysis of the two outputs is performed. Here you can provide an additional function to run on the arrays. It should accept for arrays as input and return nothing.

Returns:

GF2 array with the input passed to both the reference and the implementation being tested ext_seed: GF2 array with the seed passed to both the reference and the implementation being tested ref_output: GF2 array with the output of the reference implementation impl_output: GF2 array with the output from the implementation being tested

Return type:

ext_input

class randextract.ValidatorCustomClassAbs#

An abstract base class that serves as the skeleton for the custom class required to validate the implementation of an extractor using the class obj::Validator together with the input_method="custom". All the methods here represent the set of minimum functions that should be implemented to test an extractor in this mode. Feel free to implement any other required methods. If some of these methods are missing in the actual implementation class, a TypeError will be raised when trying to create an instance.

abstract get_extractor_inputs() GF2#

This method should yield the next extractor input used to validate the implementation as a GF2 array.

Yields:

GF2 – a GF2 array of length self.ref.input_length

Return type:

GF2

abstract get_extractor_seeds() GF2#

This method should yield the next extractor seed used to validate the implementation as a GF2 array.

Yields:

GF2 – a GF2 array of length self.ref.seed_length

Return type:

GF2

abstract get_extractor_output(ext_input, ext_seed) GF2#

This method should return the output of the extractor being tested as a GF2 array.

Returns:

a GF2 array of length self.ref.output_length

Return type:

GF2