Armv8.1-M: Add native Keccak x4 XORBytes and ExtractBytes #972

Draft: mkannwischer wants to merge 4 commits into main from mve-keccakx4-bitinterleaving


mkannwischer and others added 4 commits February 19, 2026 10:28
Replace test_keccakf1600x4_permute with test_keccakf1600x4_xor_permute_extract
that tests the full x4 Keccak flow (xor_bytes, permute, extract_bytes) against
the x1 C reference implementation.

Testing through the public interface rather than comparing internal state
directly allows verifying backends that use custom state representations
(e.g., bit-interleaved) without requiring state conversion functions.

The test uses random offsets and lengths for both xor_bytes and extract_bytes,
and verifies each of the 4 lanes independently against the x1 reference.
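The testing approach described above can be sketched roughly as follows. This is a minimal illustration, not the test from this PR: every name in it (`ref_xor_bytes`, `x4_xor_bytes`, `test_xor_permute_extract`, and so on) is hypothetical, the "permutation" is a trivial stand-in rather than Keccak-f[1600], and the x4 side uses a made-up lane-interleaved layout purely to show that the comparison never needs to inspect the internal state.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LANES 25                /* Keccak state: 25 lanes of 64 bits */
#define STATE_BYTES (LANES * 8) /* 200 bytes */
#define WAYS 4

/* x1 reference: XOR len bytes into the state starting at byte offset off. */
static void ref_xor_bytes(uint64_t *s, const uint8_t *in, size_t off, size_t len)
{
  for (size_t i = 0; i < len; i++)
    s[(off + i) / 8] ^= (uint64_t)in[i] << (8 * ((off + i) % 8));
}

/* x1 reference: copy len state bytes starting at byte offset off to out. */
static void ref_extract_bytes(const uint64_t *s, uint8_t *out, size_t off, size_t len)
{
  for (size_t i = 0; i < len; i++)
    out[i] = (uint8_t)(s[(off + i) / 8] >> (8 * ((off + i) % 8)));
}

/* Stand-in mixing step. This is NOT Keccak-f[1600]; it only gives the test
 * something deterministic to run between xor and extract. */
static void ref_permute(uint64_t *s)
{
  for (size_t j = 0; j < LANES; j++)
    s[j] = ((s[j] << 1) | (s[j] >> 63)) ^ s[(j + 1) % LANES];
}

/* x4 code under test, using a custom layout: lane j of way w lives at
 * s[WAYS * j + w]. The test never looks at this layout directly. */
static void x4_xor_bytes(uint64_t *s, const uint8_t *in, size_t off, size_t len)
{
  for (size_t w = 0; w < WAYS; w++)
    for (size_t i = 0; i < len; i++)
      s[WAYS * ((off + i) / 8) + w] ^=
          (uint64_t)in[w * STATE_BYTES + i] << (8 * ((off + i) % 8));
}

static void x4_extract_bytes(const uint64_t *s, uint8_t *out, size_t off, size_t len)
{
  for (size_t w = 0; w < WAYS; w++)
    for (size_t i = 0; i < len; i++)
      out[w * STATE_BYTES + i] =
          (uint8_t)(s[WAYS * ((off + i) / 8) + w] >> (8 * ((off + i) % 8)));
}

static void x4_permute(uint64_t *s)
{
  for (size_t w = 0; w < WAYS; w++)
    for (size_t j = 0; j < LANES; j++)
    {
      uint64_t v = s[WAYS * j + w];
      s[WAYS * j + w] = ((v << 1) | (v >> 63)) ^ s[WAYS * ((j + 1) % LANES) + w];
    }
}

/* One randomized round; returns 0 if every way matches the x1 reference. */
static int test_xor_permute_extract(unsigned seed)
{
  uint64_t ref[LANES], x4[WAYS * LANES] = {0};
  uint8_t in[WAYS * STATE_BYTES], out[WAYS * STATE_BYTES], expect[STATE_BYTES];
  size_t xoff, xlen, eoff, elen;

  srand(seed);
  for (size_t i = 0; i < sizeof(in); i++)
    in[i] = (uint8_t)rand();
  /* Random offsets and lengths, constrained so that off + len <= 200. */
  xoff = (size_t)rand() % STATE_BYTES;
  xlen = (size_t)rand() % (STATE_BYTES - xoff + 1);
  eoff = (size_t)rand() % STATE_BYTES;
  elen = (size_t)rand() % (STATE_BYTES - eoff + 1);

  x4_xor_bytes(x4, in, xoff, xlen);
  x4_permute(x4);
  x4_extract_bytes(x4, out, eoff, elen);

  for (size_t w = 0; w < WAYS; w++)
  {
    memset(ref, 0, sizeof(ref));
    ref_xor_bytes(ref, in + w * STATE_BYTES, xoff, xlen);
    ref_permute(ref);
    ref_extract_bytes(ref, expect, eoff, elen);
    if (memcmp(expect, out + w * STATE_BYTES, elen) != 0)
      return -1; /* way w disagrees with the reference */
  }
  return 0;
}
```

Calling `test_xor_permute_extract` in a loop over many seeds gives the randomized coverage. Because only extracted bytes are compared, the x4 layout could be swapped for a bit-interleaved one without touching the test.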

Also reduce the number of functional test iterations for the M55 bare-metal platform.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>

Extend the FIPS202 native backend API to support implementing XORBytes and
ExtractBytes steps in native code.

This is essential for backends using custom state representations (e.g.,
bit-interleaved state), where these functions handle conversion to/from
the internal format on-the-fly. In such cases, they also account for a
significant amount of processing time.

New flags:
- MLD_USE_FIPS202_X4_XOR_BYTES_NATIVE: Backend provides native XOR bytes
- MLD_USE_FIPS202_X4_EXTRACT_BYTES_NATIVE: Backend provides native extract bytes

When set, backends provide native implementations for:
- mld_keccakf1600_xor_bytes_x4_native: XOR input data into state
- mld_keccakf1600_extract_bytes_x4_native: Extract output from state

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>

Add native MVE implementations of XORBytes and ExtractBytes that perform
bit-interleaving/deinterleaving on-the-fly, enabling use of a bit-interleaved
state representation without temporary conversions in the permutation.

This improves performance by:
- Reducing the number of bit-interleaving operations
- Accelerating bit-interleaving using MVE vector instructions

The backend uses a bit-interleaved state representation in which each 64-bit
lane is split into two 32-bit halves, one holding its even-indexed bits and
one holding its odd-indexed bits, for efficient 32-bit MVE processing.
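For readers unfamiliar with bit-interleaving, the following scalar C sketch shows the idea for a single lane. It is not the MVE code from this PR; the type `lane_bi` and all function names are hypothetical. It also illustrates one commonly cited benefit of the representation, assumed here rather than taken from the PR: a 64-bit rotation becomes two 32-bit rotations, with the halves swapped when the rotation amount is odd.

```c
#include <stdint.h>

/* Hypothetical container for one bit-interleaved 64-bit lane. */
typedef struct
{
  uint32_t even; /* bits 0, 2, 4, ..., 62 of the lane */
  uint32_t odd;  /* bits 1, 3, 5, ..., 63 of the lane */
} lane_bi;

/* Compact the even-indexed bits of x into the low 32 bits. */
static uint32_t compact_even(uint64_t x)
{
  x &= 0x5555555555555555ULL;
  x = (x | (x >> 1)) & 0x3333333333333333ULL;
  x = (x | (x >> 2)) & 0x0F0F0F0F0F0F0F0FULL;
  x = (x | (x >> 4)) & 0x00FF00FF00FF00FFULL;
  x = (x | (x >> 8)) & 0x0000FFFF0000FFFFULL;
  x = (x | (x >> 16)) & 0x00000000FFFFFFFFULL;
  return (uint32_t)x;
}

/* Spread 32 bits out to the even bit positions of a 64-bit word. */
static uint64_t spread_even(uint32_t v)
{
  uint64_t x = v;
  x = (x | (x << 16)) & 0x0000FFFF0000FFFFULL;
  x = (x | (x << 8)) & 0x00FF00FF00FF00FFULL;
  x = (x | (x << 4)) & 0x0F0F0F0F0F0F0F0FULL;
  x = (x | (x << 2)) & 0x3333333333333333ULL;
  x = (x | (x << 1)) & 0x5555555555555555ULL;
  return x;
}

/* Standard 64-bit lane -> bit-interleaved halves. */
static lane_bi bi_from_u64(uint64_t lane)
{
  lane_bi r;
  r.even = compact_even(lane);
  r.odd = compact_even(lane >> 1);
  return r;
}

/* Bit-interleaved halves -> standard 64-bit lane. */
static uint64_t bi_to_u64(lane_bi l)
{
  return spread_even(l.even) | (spread_even(l.odd) << 1);
}

/* 32-bit rotate left; the masking keeps n = 0 and n = 32 well defined. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
  return (x << (n & 31)) | (x >> ((32 - n) & 31));
}

/* Rotate the lane left by n (0 <= n < 64) using only 32-bit rotations. */
static lane_bi bi_rotl(lane_bi l, unsigned n)
{
  lane_bi r;
  if (n % 2 == 0)
  {
    r.even = rotl32(l.even, n / 2);
    r.odd = rotl32(l.odd, n / 2);
  }
  else
  {
    /* Odd rotation: even bits land on odd positions and vice versa. */
    r.even = rotl32(l.odd, (n + 1) / 2);
    r.odd = rotl32(l.even, (n - 1) / 2);
  }
  return r;
}
```

Since interleaving is just a bit permutation, it commutes with XOR: `bi_from_u64(a ^ b)` equals the half-wise XOR of `bi_from_u64(a)` and `bi_from_u64(b)`. That is what lets an XORBytes routine interleave only the incoming data and XOR it into an already-interleaved state, instead of converting the whole state back and forth.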

Co-Authored-By: Brendan Moran <brendan.moran@arm.com>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>

Follow the same dispatch pattern used by mld_keccakf1600_permute and
mld_keccakf1600x4_permute: extract the C fallback into a static _c
function, have the public function dispatch via the native return code,
and mark the native wrappers with MLD_MUST_CHECK_RETURN_VALUE.
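A rough sketch of this dispatch pattern, under several assumptions: the `sketch_*` names, the function signatures, the return-code values, and the runtime switch `sketch_native_available` are all hypothetical and not taken from the PR. Only the flag `MLD_USE_FIPS202_X4_XOR_BYTES_NATIVE` and the macro name `MLD_MUST_CHECK_RETURN_VALUE` come from the commit messages, and the macro's expansion below is a guess.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the macro maps to a warn-unused-result attribute. */
#if defined(__GNUC__) || defined(__clang__)
#define MLD_MUST_CHECK_RETURN_VALUE __attribute__((warn_unused_result))
#else
#define MLD_MUST_CHECK_RETURN_VALUE
#endif

/* Hypothetical return codes for native wrappers. */
#define SKETCH_NATIVE_SUCCESS 0
#define SKETCH_NATIVE_FALLBACK (-1)

/* Pretend a backend has opted in to a native XORBytes. */
#define MLD_USE_FIPS202_X4_XOR_BYTES_NATIVE

#define SKETCH_LANES 25
#define SKETCH_WAYS 4

/* C fallback, split out as a static _c function. Plain layout: lane j of
 * way w is state[w * SKETCH_LANES + j]. */
static void sketch_xor_bytes_x4_c(uint64_t *state, const uint8_t *d0,
                                  const uint8_t *d1, const uint8_t *d2,
                                  const uint8_t *d3, size_t off, size_t len)
{
  const uint8_t *d[SKETCH_WAYS] = {d0, d1, d2, d3};
  for (size_t w = 0; w < SKETCH_WAYS; w++)
    for (size_t i = 0; i < len; i++)
      state[w * SKETCH_LANES + (off + i) / 8] ^=
          (uint64_t)d[w][i] << (8 * ((off + i) % 8));
}

#if defined(MLD_USE_FIPS202_X4_XOR_BYTES_NATIVE)
/* Hypothetical runtime switch standing in for a CPU capability check. */
static int sketch_native_available = 0;

/* Stand-in native wrapper. A real backend would call assembly here; this
 * sketch reuses the C code so that the example stays self-contained. */
MLD_MUST_CHECK_RETURN_VALUE
static int sketch_xor_bytes_x4_native(uint64_t *state, const uint8_t *d0,
                                      const uint8_t *d1, const uint8_t *d2,
                                      const uint8_t *d3, size_t off, size_t len)
{
  if (!sketch_native_available)
    return SKETCH_NATIVE_FALLBACK;
  sketch_xor_bytes_x4_c(state, d0, d1, d2, d3, off, len);
  return SKETCH_NATIVE_SUCCESS;
}
#endif

/* Public entry point: try the native path, fall back to C if it declines. */
static void sketch_xor_bytes_x4(uint64_t *state, const uint8_t *d0,
                                const uint8_t *d1, const uint8_t *d2,
                                const uint8_t *d3, size_t off, size_t len)
{
#if defined(MLD_USE_FIPS202_X4_XOR_BYTES_NATIVE)
  if (sketch_xor_bytes_x4_native(state, d0, d1, d2, d3, off, len) ==
      SKETCH_NATIVE_SUCCESS)
    return;
#endif
  sketch_xor_bytes_x4_c(state, d0, d1, d2, d3, off, len);
}
```

Marking the wrapper as must-check means a caller cannot silently ignore a fallback request. One caveat the sketch glosses over: a backend with a custom state layout would presumably have to make the same native-or-fallback decision for XORBytes, the permutation, and ExtractBytes, since the C code and the native code would otherwise disagree about the state format.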

Add CBMC contracts for the native xor_bytes and extract_bytes functions
and corresponding proofs for the native dispatch paths. The _c functions
do not have separate proofs, in line with the other FIPS-202 functions.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>

oqs-bot commented Feb 19, 2026

CBMC Results (ML-DSA-65)

Full Results (177 proofs)
Proof Current Previous Change
**TOTAL** 2388s 2286s +4.5%
polyvecl_pointwise_acc_montgomery_c 250s 226s +11%
mld_attempt_signature_generation 202s 197s +3%
sign_verify_internal 178s 177s +1%
polyvec_matrix_expand 153s 145s +6%
rej_uniform_native 148s 144s +3%
poly_pointwise_montgomery_c 147s 138s +7%
mld_invntt_layer 120s 117s +3%
mld_ct_memcmp 81s 79s +3%
polyvec_matrix_expand_serial 68s 65s +5%
sign_signature_internal 53s 50s +6%
mld_ntt_layer 49s 44s +11%
keccak_squeezeblocks_x4 43s 42s +2%
mld_compute_t0_t1_tr_from_sk_components 26s 27s -4%
rej_uniform 22s 21s +5%
polymat_permute_bitrev_to_custom 21s 18s +17%
rej_uniform_c 21s 19s +11%
fqmul 19s 18s +6%
poly_chknorm_c 17s 16s +6%
poly_uniform_4x 17s 13s +31%
poly_uniform_eta_4x 17s 17s +0%
polyveck_decompose 17s 16s +6%
keccakf1600x4_permute_native 15s 14s +7%
polyt0_unpack 15s 17s -12%
polyvec_matrix_pointwise_montgomery 15s 14s +7%
mld_polyvecl_permute_bitrev_to_custom_native 14s 14s +0%
polyveck_use_hint 14s 14s +0%
mld_check_pct 12s 9s +33%
mld_ntt_butterfly_block 12s 13s -8%
poly_invntt_tomont_c 11s 8s +38%
polyveck_power2round 11s 11s +0%
keccak_absorb_once_x4 10s 10s +0%
keccakf1600_permute 10s 9s +11%
polyveck_add 10s 9s +11%
polyveck_invntt_tomont 10s 8s +25%
polyveck_reduce 10s 9s +11%
sign 10s 9s +11%
sign_pk_from_sk 9s 8s +12%
poly_challenge 8s 4s +100%
poly_decompose_c 8s 7s +14%
polyveck_caddq 8s 8s +0%
polyveck_ntt 8s 6s +33%
keccakf1600_permute_native 7s 9s -22%
mld_compute_pack_z 7s 7s +0%
mld_sample_s1_s2_serial 7s 6s +17%
poly_use_hint_c 7s 7s +0%
polyeta_unpack 7s 7s +0%
polyveck_shiftl 7s 8s -12%
polyvecl_ntt 7s 8s -12%
sign_keypair_internal 7s 6s +17%
keccak_absorb 6s 5s +20%
keccakf1600_extract_bytes (big endian) 6s 3s +100%
mld_sample_s1_s2 6s 4s +50%
polyt0_pack 6s 4s +50%
polyveck_pointwise_poly_montgomery 6s 6s +0%
rej_eta 6s 4s +50%
shake128_init 6s 1s +500%
shake256 6s 2s +200%
shake256x4_squeezeblocks 6s 3s +100%
sign_open 6s 5s +20%
sign_verify_pre_hash_internal 6s 3s +100%
keccakf1600_xor_bytes 5s 3s +67%
keccakf1600x4_xor_bytes_native 5s - new
poly_add 5s 4s +25%
poly_caddq_native 5s 4s +25%
poly_invntt_tomont_native 5s 7s -29%
poly_uniform 5s 4s +25%
polyveck_sub 5s 7s -29%
polyvecl_uniform_gamma1 5s 5s +0%
polyz_unpack_c 5s 5s +0%
rej_eta_c 5s 3s +67%
sign_keypair 5s 3s +67%
unpack_hints 5s 5s +0%
unpack_sk 5s 5s +0%
keccak_init 4s 3s +33%
keccakf1600x4_extract_bytes 4s 1s +300%
make_hint 4s 3s +33%
mld_ct_cmask_nonzero_u32 4s 2s +100%
mld_h 4s 3s +33%
mld_value_barrier_i64 4s 2s +100%
mld_value_barrier_u8 4s 3s +33%
pack_pk 4s 3s +33%
pack_sig_c_h 4s 2s +100%
poly_decompose 4s 3s +33%
poly_decompose_native 4s 3s +33%
poly_make_hint 4s 3s +33%
poly_power2round 4s 5s -20%
poly_sub 4s 5s -20%
poly_uniform_gamma1_4x 4s 6s -33%
poly_use_hint 4s 3s +33%
poly_use_hint_native 4s 4s +0%
polyeta_pack 4s 2s +100%
polyt1_unpack 4s 6s -33%
polyveck_chknorm 4s 5s -20%
polyveck_make_hint 4s 5s -20%
polyveck_pack_w1 4s 6s -33%
polyveck_unpack_eta 4s 4s +0%
polyvecl_chknorm 4s 5s -20%
polyvecl_pack_eta 4s 4s +0%
polyvecl_unpack_eta 4s 5s -20%
polyvecl_unpack_z 4s 2s +100%
polyz_pack 4s 4s +0%
polyz_unpack 4s 2s +100%
polyz_unpack_native 4s 3s +33%
power2round 4s 2s +100%
rej_eta_native 4s 4s +0%
shake128_absorb 4s 2s +100%
shake128_squeeze 4s 2s +100%
shake256_absorb 4s 5s -20%
sign_signature 4s 4s +0%
sign_verify_pre_hash_shake256 4s 6s -33%
unpack_sig 4s 4s +0%
caddq 3s 3s +0%
fqscale 3s 4s -25%
intt_native_x86_64 3s 3s +0%
keccakf1600x4_permute 3s 1s +200%
mld_ct_abs_i32 3s 4s -25%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_ct_get_optblocker_i64 3s 1s +200%
mld_ct_get_optblocker_u32 3s 4s -25%
mld_ct_get_optblocker_u8 3s 2s +50%
mld_prepare_domain_separation_prefix 3s 5s -40%
montgomery_reduce 3s 2s +50%
ntt_native_x86_64 3s 3s +0%
poly_invntt_tomont 3s 6s -50%
poly_ntt_c 3s 2s +50%
poly_ntt_native 3s 3s +0%
poly_shiftl 3s 2s +50%
poly_uniform_eta 3s 3s +0%
poly_uniform_gamma1 3s 5s -40%
polyt1_pack 3s 2s +50%
polyveck_pack_eta 3s 4s -25%
polyveck_pack_t0 3s 3s +0%
polyveck_unpack_t0 3s 3s +0%
polyvecl_permute_bitrev_to_custom 3s 3s +0%
polyvecl_pointwise_acc_montgomery_native 3s 5s -40%
polyvecl_uniform_gamma1_serial 3s 4s -25%
polyw1_pack 3s 1s +200%
shake128_release 3s 3s +0%
shake128x4_absorb_once 3s 4s -25%
shake256_init 3s 2s +50%
shake256_release 3s 1s +200%
sign_signature_extmu 3s 4s -25%
sign_signature_pre_hash_internal 3s 6s -50%
sign_signature_pre_hash_shake256 3s 4s -25%
sign_verify_extmu 3s 4s -25%
sys_check_capability 3s 3s +0%
use_hint 3s 2s +50%
decompose 2s 5s -60%
keccak_finalize 2s 2s +0%
keccak_squeeze 2s 2s +0%
keccakf1600x4_xor_bytes 2s 2s +0%
mld_ct_cmask_neg_i32 2s 2s +0%
pack_sig_z 2s 2s +0%
pack_sk 2s 2s +0%
poly_caddq 2s 3s -33%
poly_caddq_c 2s 3s -33%
poly_caddq_native_aarch64 2s 6s -67%
poly_chknorm_native 2s 4s -50%
poly_ntt 2s 5s -60%
poly_pointwise_montgomery 2s 5s -60%
poly_pointwise_montgomery_native 2s 2s +0%
polyvecl_pointwise_acc_montgomery 2s 5s -60%
reduce32 2s 3s -33%
shake128x4_squeezeblocks 2s 1s +100%
shake256_finalize 2s 5s -60%
shake256_squeeze 2s 2s +0%
shake256x4_absorb_once 2s 4s -50%
sign_verify 2s 4s -50%
unpack_pk 2s 3s -33%
keccakf1600_xor_bytes (big endian) 1s 3s -67%
keccakf1600x4_extract_bytes_native 1s - new
mld_ct_sel_int32 1s 3s -67%
mld_keccakf1600_extract_bytes 1s 2s -50%
mld_value_barrier_u32 1s 3s -67%
poly_chknorm 1s 3s -67%
poly_reduce 1s 5s -80%
shake128_finalize 1s 3s -67%


oqs-bot commented Feb 19, 2026

CBMC Results (ML-DSA-44)

Full Results (177 proofs)
Proof Current Previous Change
**TOTAL** 2188s 2055s +6.5%
sign_verify_internal 270s 254s +6%
mld_attempt_signature_generation 244s 221s +10%
polyvecl_pointwise_acc_montgomery_c 232s 208s +12%
rej_uniform_native 150s 144s +4%
poly_pointwise_montgomery_c 147s 143s +3%
mld_ct_memcmp 86s 82s +5%
mld_invntt_layer 56s 50s +12%
mld_ntt_layer 45s 44s +2%
sign_signature_internal 44s 45s -2%
keccak_squeezeblocks_x4 43s 44s -2%
poly_invntt_tomont_c 42s 39s +8%
rej_uniform 22s 20s +10%
rej_uniform_c 22s 20s +10%
fqmul 19s 19s +0%
poly_uniform_eta_4x 19s 19s +0%
polyt0_unpack 18s 15s +20%
poly_uniform_4x 17s 13s +31%
mld_polyvecl_permute_bitrev_to_custom_native 16s 14s +14%
mld_ntt_butterfly_block 15s 13s +15%
poly_chknorm_c 15s 12s +25%
polymat_permute_bitrev_to_custom 15s 16s -6%
polyvec_matrix_expand 15s 16s -6%
keccakf1600x4_permute_native 13s 14s -7%
polyeta_unpack 13s 12s +8%
mld_compute_t0_t1_tr_from_sk_components 12s 15s -20%
polyz_unpack_c 12s 12s +0%
keccak_absorb_once_x4 11s 9s +22%
keccakf1600_permute_native 9s 6s +50%
mld_check_pct 9s 6s +50%
polyveck_reduce 9s 5s +80%
keccak_absorb 8s 8s +0%
polyveck_add 8s 7s +14%
sign 8s 5s +60%
keccakf1600_permute 7s 8s -12%
mld_prepare_domain_separation_prefix 7s 5s +40%
poly_use_hint_c 7s 5s +40%
polyvec_matrix_pointwise_montgomery 7s 6s +17%
polyveck_decompose 7s 7s +0%
polyveck_pointwise_poly_montgomery 7s 5s +40%
polyvecl_ntt 7s 6s +17%
poly_chknorm 6s 4s +50%
poly_uniform_gamma1_4x 6s 4s +50%
polyvec_matrix_expand_serial 6s 7s -14%
polyveck_invntt_tomont 6s 6s +0%
polyveck_make_hint 6s 4s +50%
polyveck_use_hint 6s 5s +20%
rej_eta_c 6s 4s +50%
sign_keypair_internal 6s 4s +50%
sign_pk_from_sk 6s 6s +0%
sign_verify_pre_hash_shake256 6s 3s +100%
keccak_squeeze 5s 5s +0%
mld_compute_pack_z 5s 6s -17%
mld_h 5s 6s -17%
mld_sample_s1_s2_serial 5s 3s +67%
pack_sig_c_h 5s 4s +25%
poly_decompose_c 5s 5s +0%
poly_ntt_native 5s 2s +150%
poly_pointwise_montgomery 5s 2s +150%
poly_pointwise_montgomery_native 5s 3s +67%
polyeta_pack 5s 3s +67%
polyveck_caddq 5s 8s -38%
polyveck_ntt 5s 4s +25%
polyveck_pack_w1 5s 4s +25%
polyveck_shiftl 5s 5s +0%
polyvecl_chknorm 5s 6s -17%
polyvecl_pointwise_acc_montgomery 5s 4s +25%
sign_keypair 5s 2s +150%
sign_signature 5s 6s -17%
sign_signature_pre_hash_internal 5s 4s +25%
unpack_pk 5s 3s +67%
unpack_sk 5s 4s +25%
fqscale 4s 1s +300%
keccakf1600x4_xor_bytes 4s 2s +100%
poly_add 4s 4s +0%
poly_caddq_c 4s 3s +33%
poly_challenge 4s 3s +33%
poly_decompose_native 4s 3s +33%
poly_invntt_tomont_native 4s 4s +0%
poly_ntt 4s 3s +33%
poly_power2round 4s 5s -20%
poly_shiftl 4s 4s +0%
poly_uniform 4s 5s -20%
poly_uniform_eta 4s 5s -20%
poly_use_hint_native 4s 3s +33%
polyt1_unpack 4s 4s +0%
polyveck_pack_eta 4s 3s +33%
polyveck_pack_t0 4s 3s +33%
polyveck_power2round 4s 6s -33%
polyvecl_pointwise_acc_montgomery_native 4s 3s +33%
polyvecl_uniform_gamma1_serial 4s 3s +33%
polyvecl_unpack_z 4s 5s -20%
rej_eta_native 4s 4s +0%
shake128_init 4s 2s +100%
shake128_release 4s 3s +33%
shake256 4s 3s +33%
sign_open 4s 6s -33%
sign_signature_pre_hash_shake256 4s 5s -20%
sign_verify_pre_hash_internal 4s 4s +0%
unpack_hints 4s 5s -20%
intt_native_x86_64 3s 3s +0%
keccakf1600_extract_bytes (big endian) 3s 3s +0%
keccakf1600_xor_bytes 3s 2s +50%
keccakf1600x4_extract_bytes 3s 3s +0%
keccakf1600x4_extract_bytes_native 3s - new
keccakf1600x4_permute 3s 3s +0%
keccakf1600x4_xor_bytes_native 3s - new
make_hint 3s 3s +0%
mld_ct_abs_i32 3s 2s +50%
mld_ct_cmask_neg_i32 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 2s +50%
mld_ct_get_optblocker_i64 3s 2s +50%
mld_ct_get_optblocker_u32 3s 1s +200%
mld_keccakf1600_extract_bytes 3s 6s -50%
mld_sample_s1_s2 3s 4s -25%
mld_value_barrier_i64 3s 4s -25%
mld_value_barrier_u32 3s 4s -25%
pack_sk 3s 3s +0%
poly_caddq_native_aarch64 3s 3s +0%
poly_chknorm_native 3s 6s -50%
poly_decompose 3s 4s -25%
poly_invntt_tomont 3s 2s +50%
poly_reduce 3s 4s -25%
poly_use_hint 3s 2s +50%
polyt0_pack 3s 4s -25%
polyt1_pack 3s 1s +200%
polyveck_sub 3s 4s -25%
polyveck_unpack_eta 3s 4s -25%
polyvecl_pack_eta 3s 3s +0%
polyvecl_permute_bitrev_to_custom 3s 2s +50%
polyvecl_unpack_eta 3s 2s +50%
polyz_unpack_native 3s 3s +0%
power2round 3s 2s +50%
rej_eta 3s 1s +200%
shake128_finalize 3s 2s +50%
shake128_squeeze 3s 2s +50%
shake128x4_absorb_once 3s 4s -25%
shake256_finalize 3s 4s -25%
shake256_init 3s 3s +0%
shake256_squeeze 3s 5s -40%
sign_signature_extmu 3s 7s -57%
sign_verify_extmu 3s 2s +50%
sys_check_capability 3s 4s -25%
unpack_sig 3s 3s +0%
use_hint 3s 2s +50%
caddq 2s 3s -33%
decompose 2s 3s -33%
keccak_finalize 2s 4s -50%
keccak_init 2s 3s -33%
keccakf1600_xor_bytes (big endian) 2s 2s +0%
mld_ct_cmask_nonzero_u8 2s 4s -50%
mld_ct_get_optblocker_u8 2s 1s +100%
mld_ct_sel_int32 2s 4s -50%
mld_value_barrier_u8 2s 1s +100%
montgomery_reduce 2s 2s +0%
ntt_native_x86_64 2s 3s -33%
pack_pk 2s 3s -33%
pack_sig_z 2s 2s +0%
poly_caddq 2s 3s -33%
poly_make_hint 2s 4s -50%
poly_ntt_c 2s 2s +0%
poly_sub 2s 3s -33%
poly_uniform_gamma1 2s 4s -50%
polyveck_chknorm 2s 3s -33%
polyveck_unpack_t0 2s 4s -50%
polyvecl_uniform_gamma1 2s 2s +0%
polyz_pack 2s 2s +0%
polyz_unpack 2s 5s -60%
reduce32 2s 3s -33%
shake128_absorb 2s 1s +100%
shake128x4_squeezeblocks 2s 2s +0%
shake256_absorb 2s 2s +0%
shake256x4_absorb_once 2s 4s -50%
shake256x4_squeezeblocks 2s 2s +0%
sign_verify 2s 3s -33%
poly_caddq_native 1s 2s -50%
polyw1_pack 1s 2s -50%
shake256_release 1s 3s -67%


oqs-bot commented Feb 19, 2026

CBMC Results (ML-DSA-87)

Full Results (177 proofs)
Proof Current Previous Change
**TOTAL** 2553s 2449s +4.2%
sign_verify_internal 367s 353s +4%
mld_attempt_signature_generation 238s 227s +5%
polyvecl_pointwise_acc_montgomery_c 178s 165s +8%
polyvec_matrix_expand 156s 153s +2%
poly_pointwise_montgomery_c 141s 128s +10%
rej_uniform_native 138s 139s -1%
mld_invntt_layer 118s 114s +4%
polyvec_matrix_expand_serial 105s 110s -5%
mld_ct_memcmp 78s 74s +5%
sign_signature_internal 48s 46s +4%
mld_ntt_layer 46s 44s +5%
keccak_squeezeblocks_x4 44s 42s +5%
mld_compute_t0_t1_tr_from_sk_components 28s 25s +12%
polymat_permute_bitrev_to_custom 26s 24s +8%
rej_uniform 22s 21s +5%
fqmul 20s 18s +11%
rej_uniform_c 19s 16s +19%
poly_chknorm_c 18s 17s +6%
poly_uniform_4x 18s 17s +6%
poly_uniform_eta_4x 17s 17s +0%
polyveck_power2round 15s 14s +7%
keccakf1600x4_permute_native 14s 12s +17%
polyt0_unpack 14s 17s -18%
polyeta_unpack 13s 13s +0%
polyvec_matrix_pointwise_montgomery 13s 12s +8%
polyveck_add 13s 13s +0%
mld_ntt_butterfly_block 11s 13s -15%
mld_sample_s1_s2 11s 6s +83%
poly_decompose_c 11s 7s +57%
polyveck_ntt 11s 8s +38%
polyveck_reduce 11s 13s -15%
polyveck_use_hint 11s 9s +22%
keccak_absorb_once_x4 10s 10s +0%
keccakf1600_permute 9s 7s +29%
keccakf1600_permute_native 9s 10s -10%
poly_invntt_tomont_c 9s 9s +0%
polyveck_chknorm 9s 6s +50%
polyveck_pointwise_poly_montgomery 9s 8s +12%
polyveck_shiftl 9s 6s +50%
polyvecl_ntt 9s 11s -18%
mld_compute_pack_z 8s 7s +14%
mld_polyvecl_permute_bitrev_to_custom_native 8s 8s +0%
poly_add 8s 4s +100%
polyveck_decompose 8s 6s +33%
sign_pk_from_sk 8s 9s -11%
mld_check_pct 7s 7s +0%
mld_sample_s1_s2_serial 7s 6s +17%
polyveck_caddq 7s 7s +0%
polyveck_invntt_tomont 7s 10s -30%
sign_verify_extmu 7s 3s +133%
poly_caddq_native 6s 2s +200%
poly_decompose 6s 3s +100%
polyveck_pack_eta 6s 2s +200%
polyveck_sub 6s 7s -14%
sign 6s 7s -14%
sign_signature_extmu 6s 5s +20%
unpack_pk 6s 6s +0%
decompose 5s 4s +25%
keccak_absorb 5s 7s -29%
ntt_native_x86_64 5s 4s +25%
poly_caddq 5s 2s +150%
poly_challenge 5s 5s +0%
poly_ntt 5s 3s +67%
poly_reduce 5s 4s +25%
poly_sub 5s 3s +67%
poly_uniform_gamma1 5s 3s +67%
poly_uniform_gamma1_4x 5s 6s -17%
polyveck_unpack_t0 5s 6s -17%
polyvecl_chknorm 5s 4s +25%
polyz_unpack_c 5s 5s +0%
rej_eta_c 5s 4s +25%
sign_keypair_internal 5s 6s -17%
unpack_hints 5s 4s +25%
intt_native_x86_64 4s 5s -20%
keccakf1600x4_xor_bytes_native 4s - new
mld_ct_sel_int32 4s 2s +100%
mld_h 4s 2s +100%
mld_prepare_domain_separation_prefix 4s 3s +33%
montgomery_reduce 4s 3s +33%
pack_pk 4s 4s +0%
pack_sig_c_h 4s 3s +33%
poly_decompose_native 4s 5s -20%
poly_ntt_c 4s 1s +300%
poly_ntt_native 4s 6s -33%
poly_power2round 4s 3s +33%
poly_shiftl 4s 2s +100%
poly_uniform_eta 4s 4s +0%
poly_use_hint_c 4s 4s +0%
polyt0_pack 4s 5s -20%
polyveck_make_hint 4s 5s -20%
polyveck_pack_w1 4s 4s +0%
polyveck_unpack_eta 4s 3s +33%
polyvecl_permute_bitrev_to_custom 4s 2s +100%
polyvecl_uniform_gamma1 4s 4s +0%
polyvecl_uniform_gamma1_serial 4s 7s -43%
polyvecl_unpack_eta 4s 3s +33%
polyvecl_unpack_z 4s 3s +33%
polyz_unpack 4s 3s +33%
power2round 4s 4s +0%
reduce32 4s 4s +0%
rej_eta 4s 2s +100%
shake128_finalize 4s 3s +33%
shake256_init 4s 2s +100%
shake256x4_absorb_once 4s 2s +100%
sign_keypair 4s 3s +33%
sign_open 4s 6s -33%
sign_signature 4s 5s -20%
sign_signature_pre_hash_shake256 4s 4s +0%
sign_verify 4s 2s +100%
sign_verify_pre_hash_internal 4s 5s -20%
unpack_sk 4s 6s -33%
fqscale 3s 5s -40%
keccak_finalize 3s 2s +50%
keccak_squeeze 3s 4s -25%
keccakf1600_xor_bytes (big endian) 3s 3s +0%
keccakf1600x4_extract_bytes 3s 2s +50%
make_hint 3s 5s -40%
mld_ct_cmask_nonzero_u32 3s 4s -25%
mld_keccakf1600_extract_bytes 3s 1s +200%
pack_sk 3s 3s +0%
poly_caddq_c 3s 3s +0%
poly_chknorm 3s 2s +50%
poly_chknorm_native 3s 2s +50%
poly_invntt_tomont 3s 2s +50%
poly_invntt_tomont_native 3s 2s +50%
poly_make_hint 3s 2s +50%
poly_pointwise_montgomery_native 3s 3s +0%
poly_uniform 3s 6s -50%
poly_use_hint 3s 3s +0%
poly_use_hint_native 3s 4s -25%
polyt1_pack 3s 2s +50%
polyveck_pack_t0 3s 2s +50%
polyvecl_pack_eta 3s 3s +0%
polyvecl_pointwise_acc_montgomery_native 3s 3s +0%
polyw1_pack 3s 2s +50%
polyz_unpack_native 3s 4s -25%
rej_eta_native 3s 5s -40%
shake128x4_squeezeblocks 3s 2s +50%
shake256_finalize 3s 3s +0%
shake256x4_squeezeblocks 3s 1s +200%
sign_signature_pre_hash_internal 3s 5s -40%
sign_verify_pre_hash_shake256 3s 4s -25%
sys_check_capability 3s 3s +0%
use_hint 3s 3s +0%
caddq 2s 4s -50%
keccakf1600_extract_bytes (big endian) 2s 3s -33%
keccakf1600x4_permute 2s 2s +0%
keccakf1600x4_xor_bytes 2s 2s +0%
mld_ct_abs_i32 2s 2s +0%
mld_ct_cmask_nonzero_u8 2s 3s -33%
mld_ct_get_optblocker_i64 2s 3s -33%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_value_barrier_u32 2s 2s +0%
mld_value_barrier_u8 2s 2s +0%
pack_sig_z 2s 3s -33%
poly_caddq_native_aarch64 2s 3s -33%
poly_pointwise_montgomery 2s 4s -50%
polyeta_pack 2s 4s -50%
polyt1_unpack 2s 3s -33%
polyvecl_pointwise_acc_montgomery 2s 4s -50%
shake128_absorb 2s 2s +0%
shake128_init 2s 2s +0%
shake128_release 2s 3s -33%
shake128_squeeze 2s 5s -60%
shake128x4_absorb_once 2s 4s -50%
shake256 2s 4s -50%
shake256_absorb 2s 3s -33%
shake256_release 2s 3s -33%
shake256_squeeze 2s 2s +0%
unpack_sig 2s 5s -60%
keccak_init 1s 3s -67%
keccakf1600_xor_bytes 1s 2s -50%
keccakf1600x4_extract_bytes_native 1s - new
mld_ct_cmask_neg_i32 1s 1s +0%
mld_ct_get_optblocker_u32 1s 2s -50%
mld_value_barrier_i64 1s 3s -67%
polyz_pack 1s 2s -50%
