Rust (CPU) Toeplitz hashing#

See also

The full source code of this example is in examples/validation_rust_toeplitz.py

In recent years, Rust has emerged as a compelling programming language for a variety of applications. Its versatility stems from its ability to produce both high-performance, low-level code (e.g., Asix PHY driver, the first Rust driver incorporated into the Linux kernel, or Asahi DRM, the first Rust GPU kernel driver) as well as sophisticated, memory-safe high-level applications (e.g., sudo-rs or uutils coreutils).

Said Aroua implemented Toeplitz hashing in Rust. In this example, we validate his implementation in two different ways using the Validator class. First, we use standard input and output to brute-force test all possible inputs and seeds for a small family of Toeplitz hash functions. Second, we validate a large family by reading files with random test cases generated directly by the Rust implementation.
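
To give a sense of why brute-force testing is only feasible for a small family, the following minimal sketch enumerates every (seed, input) pair for hypothetical lengths. It assumes the common construction in which the seed has input_length + output_length - 1 bits. The function name and the chosen sizes are illustrative and are not taken from the example script.

```python
from itertools import product


def brute_force_cases(input_length, output_length):
    """Yield every (seed, input) pair as tuples of bits."""
    # Assumption: a Toeplitz matrix with output_length rows and
    # input_length columns is defined by its first row and first column,
    # i.e. by input_length + output_length - 1 seed bits.
    seed_length = input_length + output_length - 1
    for seed in product((0, 1), repeat=seed_length):
        for bits in product((0, 1), repeat=input_length):
            yield seed, bits


n, m = 4, 2  # hypothetical sizes, chosen only to keep the count small
total = sum(1 for _ in brute_force_cases(n, m))
print(total)  # 2**5 seeds times 2**4 inputs = 512 cases
```

The number of cases doubles with every additional input or seed bit, so exhaustive testing quickly becomes impractical as the lengths grow.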

Setting up the Rust implementation#

In order to replicate this example, you first need to download and compile the Rust implementation. This can be done with the following commands:

git clone https://github.com/cryptohslu/toeplitz-rust.git
cd toeplitz-rust
cargo build --release

The relevant binary, toeplitz, can be found in the target/release directory.

Validating using stdio#

The Rust program actually contains three different implementations of the same family of Toeplitz hash functions. simple computes the hash by performing the matrix-vector multiplication explicitly. This only works for very small input and output lengths, as the required memory grows linearly with the product of these two values. fft and realfft use the fast Fourier transform trick to compute the matrix-vector multiplication, as described in this theory subsection. We brute-force test all three by adding three implementations with different labels and passing the corresponding command to each. This is done in the brute_force_stdio_test() function.
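
The difference between the explicit and the FFT-based approach can be illustrated with a short NumPy sketch. It is not the Rust code: the function names are hypothetical, and the index convention (entry (i, j) of the matrix is seed[i - j + n - 1]) is an assumption that may differ from the one used by the Rust implementation or in the theory subsection.

```python
import numpy as np


def toeplitz_hash_simple(seed, x, m):
    """Build the m x n Toeplitz matrix explicitly and multiply over GF(2)."""
    n = len(x)
    assert len(seed) == n + m - 1
    i = np.arange(m)[:, None]
    j = np.arange(n)[None, :]
    matrix = seed[i - j + n - 1]  # memory grows with m * n
    return (matrix @ x) % 2


def toeplitz_hash_fft(seed, x, m):
    """Compute the same product as a slice of a linear convolution via real FFTs."""
    n = len(x)
    size = len(seed) + n - 1  # full length of the linear convolution
    conv = np.fft.irfft(np.fft.rfft(seed, size) * np.fft.rfft(x, size), size)
    conv = np.rint(conv).astype(np.int64)  # undo floating-point rounding
    return conv[n - 1 : n - 1 + m] % 2


rng = np.random.default_rng(0)
n, m = 64, 32  # hypothetical sizes
seed = rng.integers(0, 2, n + m - 1)
x = rng.integers(0, 2, n)
assert np.array_equal(toeplitz_hash_simple(seed, x, m), toeplitz_hash_fft(seed, x, m))
```

The FFT version never materializes the matrix, so its memory use grows with the sum of the lengths rather than with their product.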

def brute_force_stdio_test():
    print("\nFirst we brute force all the possible inputs and seeds for a small family")
    print("of Toeplitz functions with:\n")
    print(f"input_length = {INPUT_LENGTH_STDIO}")
    print(f"output_length = {OUTPUT_LENGTH_STDIO}\n")
    ext = ToeplitzHashing(INPUT_LENGTH_STDIO, OUTPUT_LENGTH_STDIO)
    val = Validator(ext)

    def gf2_to_str(gf2_arr):
        return (np.array(gf2_arr) + ord("0")).tobytes().decode()

    val.add_implementation(
        label="Rust-stdio-simple",
        input_method="stdio",
        command="./toeplitz simple run-args $SEED$ $INPUT$",
        format_dict={"$SEED$": gf2_to_str, "$INPUT$": gf2_to_str},
    )
    val.add_implementation(
        label="Rust-stdio-fft",
        input_method="stdio",
        command="./toeplitz fft run-args $SEED$ $INPUT$",
        format_dict={"$SEED$": gf2_to_str, "$INPUT$": gf2_to_str},
    )
    val.add_implementation(
        label="Rust-stdio-realfft",
        input_method="stdio",
        command="./toeplitz realfft run-args $SEED$ $INPUT$",
        format_dict={"$SEED$": gf2_to_str, "$INPUT$": gf2_to_str},
    )

    print(val)
    print(f"{datetime.datetime.now():%Y-%m-%d %H:%M:%S}")
    print("Starting brute-force testing...\n")
    start = time.perf_counter()
    val.validate(mode="brute-force", max_attempts="all")
    stop = time.perf_counter()
    print(val)
    print(f"{datetime.datetime.now():%Y-%m-%d %H:%M:%S}")
    print(f"Brute-force validation finished in {round(stop - start)} seconds\n")
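
The placeholders in the command strings are replaced by bit strings such as 10110. The sketch below shows the conversion in both directions on plain NumPy arrays and how a command line could be assembled from a template. The fill_command helper is hypothetical and only illustrates the idea; it is not how Validator is implemented internally.

```python
import numpy as np


def bits_to_str(bits):
    # Same idea as gf2_to_str above. The explicit uint8 dtype ensures that
    # every bit becomes exactly one byte (the ASCII character "0" or "1").
    return (np.asarray(bits, dtype=np.uint8) + ord("0")).tobytes().decode()


def str_to_bits(text):
    # Inverse conversion: ASCII "0"/"1" characters back to an array of bits.
    return np.frombuffer(text.encode(), dtype=np.uint8) - ord("0")


def fill_command(template, values):
    # Hypothetical helper: substitute each placeholder with its bit string.
    for placeholder, bits in values.items():
        template = template.replace(placeholder, bits_to_str(bits))
    return template


seed = [1, 0, 1, 1, 0]
message = [0, 1, 1, 0]
cmd = fill_command(
    "./toeplitz simple run-args $SEED$ $INPUT$",
    {"$SEED$": seed, "$INPUT$": message},
)
print(cmd)  # ./toeplitz simple run-args 10110 0110
```

Without the explicit dtype, an integer array with a wider type would produce several bytes per bit, so the conversion relies on the array holding 8-bit values.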

Validating using files#

In the function read_files_test() we use input_method="read_files". Together with input_method="custom", which was presented in the GPU example, we have now covered all the (current) ways to pass an implementation to Validator.

def read_files_test():
    try:
        with open("toeplitz.gen", "r") as f:
            pass
    except FileNotFoundError:
        print(
            "toeplitz.gen file does not exist. You can generate it, for example, by running:"
        )
        print("./toeplitz gen 1000 toeplitz.gen 1048576 524288")
        return

    print("Now we test some cases generated by the Rust binary and saved to a file.")
    print("The Toeplitz hash family has:\n")
    print(f"input_length = {INPUT_LENGTH_FILES}")
    print(f"output_length = {OUTPUT_LENGTH_FILES}\n")
    ext = ToeplitzHashing(INPUT_LENGTH_FILES, OUTPUT_LENGTH_FILES)
    val = Validator(ext)

    def read_file(length):
        # Lazily yield, as GF2 arrays, the lines that have the expected length
        with open("toeplitz.gen", "r") as f:
            for line in f:
                line = line.strip()
                if len(line) == length:
                    yield GF2(np.frombuffer(line.encode(), dtype=np.uint8) - ord("0"))

    val.add_implementation(
        "Rust-file",
        input_method="read_files",
        parser={
            "input": read_file(ext.input_length),
            "seed": read_file(ext.seed_length),
            "output": read_file(ext.output_length),
        },
    )

    print(val)
    print(f"{datetime.datetime.now():%Y-%m-%d %H:%M:%S}")
    print("Starting read-files testing...\n")
    start = time.perf_counter()
    val.validate()
    stop = time.perf_counter()
    print(val)
    print(f"{datetime.datetime.now():%Y-%m-%d %H:%M:%S}")
    print(f"Read-files validation finished in {round(stop - start)} seconds\n")
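
The read_file() generator above tells inputs, seeds and outputs apart only by the length of each line. The following sketch reproduces that idea on a tiny in-memory file. The layout of the sample (one input, one seed and one output line per case) and its contents are assumptions made for illustration; the bits are arbitrary and are not real hash values.

```python
import io

import numpy as np


def read_bits(handle, length):
    """Yield the lines of `handle` that have exactly `length` characters."""
    for line in handle:
        line = line.strip()
        if len(line) == length:
            yield np.frombuffer(line.encode(), dtype=np.uint8) - ord("0")


# Hypothetical file with two cases: 4-bit inputs, 5-bit seeds, 2-bit outputs
sample = "0110\n10110\n01\n1111\n00001\n10\n"

inputs = list(read_bits(io.StringIO(sample), 4))
seeds = list(read_bits(io.StringIO(sample), 5))
outputs = list(read_bits(io.StringIO(sample), 2))
print(len(inputs), len(seeds), len(outputs))  # 2 2 2
```

Filtering by length only works when the input, seed and output lengths are all different; otherwise lines of different kinds would be mixed together.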

Screenshot after running this example#