author | Mattias Andrée <maandree@kth.se> | 2022-01-19 20:28:55 +0100 |
---|---|---|
committer | Mattias Andrée <maandree@kth.se> | 2022-01-19 20:28:55 +0100 |
commit | 5d77a0178349ecac6536e0374cf689500efa22bc (patch) | |
tree | f6fcb38cd39e8f4240537233a08fdbb5c0284798 /libblake_blake2b_force_update.c | |
parent | Improve portability (diff) | |
Optimisation for amd64
Increased major number as the ABI was broken
by insertion of padding into the BLAKE2
parameter structures (except for BLAKE2Xs)
Signed-off-by: Mattias Andrée <maandree@kth.se>
Diffstat (limited to 'libblake_blake2b_force_update.c')
-rw-r--r-- | libblake_blake2b_force_update.c | 24 |
1 files changed, 23 insertions, 1 deletions
diff --git a/libblake_blake2b_force_update.c b/libblake_blake2b_force_update.c
index 60b8fab..2446e16 100644
--- a/libblake_blake2b_force_update.c
+++ b/libblake_blake2b_force_update.c
@@ -8,8 +8,30 @@ libblake_blake2b_force_update(struct libblake_blake2b_state *state, const void *
 	size_t off = 0;
 	for (; len - off >= 128; off += 128) {
+		/* The following optimisations have been tested:
+		 *
+		 * 1)
+		 *     `*(__uint128_t *)state->t += 128;`
+		 *     result: slower
+		 *
+		 * 2)
+		 *     addq, adcq using `__asm__ __volatile__`
+		 *     result: slower (as 1)
+		 *
+		 * 3)
+		 *     using `__builtin_add_overflow`
+		 *     result: no difference
+		 *
+		 * These testes where preformed on amd64 with a compile-time
+		 * assumption that `UINT_LEAST64_C(0xFFFFffffFFFFffff) + 1 == 0`,
+		 * which the compiler accepted and those included the attempted
+		 * optimisations.
+		 *
+		 * UNLIKELY does not seem to make any difference, but it
+		 * does change the output, theoretically of the better.
+		 */
 		state->t[0] = (state->t[0] + 128) & UINT_LEAST64_C(0xFFFFffffFFFFffff);
-		if (state->t[0] < 128)
+		if (UNLIKELY(state->t[0] < 128))
 			state->t[1] = (state->t[1] + 1) & UINT_LEAST64_C(0xFFFFffffFFFFffff);
 		libblake_internal_blake2b_compress(state, &data[off]);