STATUS


Optimisation progress for libzahl, compared to other big integer
libraries. These comparisons are for 152-bit integers. Functions
in parentheses in the right column are functions that need
optimisation to improve the performance of the function in the
left column. Double parentheses mean there may be a better way
to do it. Inside square brackets are comments on how the
comparison holds at other bit sizes.

zset .................... fastest [always]
zseti ................... tomsfastmath is faster [always]
zsetu ................... tomsfastmath is faster [always]
zneg(a, b) .............. fastest [always]
zneg(a, a) .............. fastest [always] (shared with gmp)
zabs(a, b) .............. fastest [always]
zabs(a, a) .............. tomsfastmath is faster [always]
zadd_unsigned ........... fastest [always]
zsub_unsigned ........... fastest [always]
zadd .................... fastest [after ~100, tomsfastmath before] (shared with gmp)
zsub .................... fastest [always]
zand .................... 77 % of tomsfastmath [until ~900, alternating with gmp]
zor ..................... 65 % of tomsfastmath [until ~1750, alternating with gmp (gcc) and tomsfastmath (clang)]
zxor .................... 87 % of tomsfastmath [until ~700, alternating with gmp (gcc+clang)]
znot .................... fastest [always]
zeven ................... fastest [always]
zodd .................... fastest [always]
zeven_nonzero ........... fastest [always]
zodd_nonzero ............ fastest [always]
zzero ................... fastest [always] (shared with gmp)
zsignum ................. fastest [always] (shared with gmp)
zbits ................... fastest [always]
zlsb .................... fastest [always]
zswap ................... fastest [always]
zlsh .................... fastest [until ~1000, then gmp]
zrsh .................... fastest [almost never]
ztrunc(a, b, c) ......... fastest [always; alternating with gmp between 1400~3000 (clang)]
ztrunc(a, a, b) ......... fastest [until ~150, then 77 % of tomsfastmath; slightly slower than gmp (clang)]
zsplit .................. fastest [alternating with gmp and slightly slower than gmp]
zcmpmag ................. fastest [always]
zcmp .................... fastest [almost never]
zcmpi(a, +) ............. fastest [always]
zcmpi(a, -) ............. fastest [always]
zcmpu ................... fastest [always]
zbset(a, b, 1) .......... fastest [always]
zbset(a, a, 1) .......... fastest [always]
zbset(a, b, 0) .......... fastest [always]
zbset(a, a, 0) .......... fastest [always]
zbset(a, b, -1) ......... fastest [always]
zbset(a, a, -1) ......... fastest [always]
zbtest .................. fastest [always]
zgcd .................... 21 % of gmp (zcmpmag)
zmul .................... slowest
zsqr .................... slowest (zmul)
zmodmul(big mod) ........ slowest ((zmul, zmod))
zmodsqr(big mod) ........ slowest ((zmul, zmod))
zmodmul(tiny mod) ....... slowest ((zmul))
zmodsqr(tiny mod) ....... slowest ((zmul))
zpow .................... slowest (zmul, zsqr)
zpowu ................... slowest (zmul, zsqr)
zmodpow ................. slowest (zmul, zsqr, zmod)
zmodpowu ................ slowest (zmul, zsqr, zmod)
zsets ................... 13 % of gmp
zstr_length(a, 10) ...... gmp is faster [always] (zdiv, zsqr)
zstr(a, b, n) ........... 8 % of gmp, 59 % of hebimath
zrand(default uniform) .. 51 % of gmp
zptest .................. slowest (zrand, zmodpow, zsqr, zmod)
zsave ................... fastest [until ~250, then tomsfastmath; libtommath is suspicious]
zload ................... fastest [always]
zdiv(big denum) ......... tomsfastmath and naïve hebimath implementation are faster (zdivmod)
zmod(big denum) ......... fastest (zdivmod)
zdivmod(big denum) ...... fastest
zdiv(tiny denum) ........ slowest
zmod(tiny denum) ........ slowest
zdivmod(tiny denum) ..... slowest

Note, some corresponding functions are not implemented in
some other libraries. In such cases, they have been implemented
in the translation layers (found under bench/). Those
implementations are often suboptimal, but probably in style
with what you would write if you needed that functionality.
Note further that if, for example, you want to perform
addition and you know that your operands are non-negative,
you would choose zadd_unsigned in libzahl, but if you are
using a library that does not have the corresponding function,
you are better off with the regular addition (zadd).
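
The difference between the two kinds of addition can be sketched
as follows. This is only an illustration: struct toy_int,
toy_add_unsigned and toy_add are hypothetical names, the
magnitude is a single 64-bit word, and overflow is ignored. It
is not how libzahl or any other library is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Toy sign-magnitude integer (hypothetical, for illustration only). */
struct toy_int {
	int sign;     /* -1, 0 or +1 */
	uint64_t mag; /* magnitude; overflow is ignored in this sketch */
};

/* Addition when both operands are known to be non-negative:
 * the signs never have to be inspected, only the magnitudes
 * are added. */
struct toy_int
toy_add_unsigned(struct toy_int a, struct toy_int b)
{
	struct toy_int r;
	assert(a.sign >= 0 && b.sign >= 0);
	r.mag = a.mag + b.mag;
	r.sign = r.mag != 0;
	return r;
}

/* General signed addition: branches on the signs, and for mixed
 * signs compares the magnitudes and subtracts instead of adding. */
struct toy_int
toy_add(struct toy_int a, struct toy_int b)
{
	struct toy_int r;
	if (a.sign == 0)
		return b;
	if (b.sign == 0)
		return a;
	if (a.sign == b.sign) {
		r.mag = a.mag + b.mag;
		r.sign = a.sign;
	} else if (a.mag >= b.mag) {
		r.mag = a.mag - b.mag;
		r.sign = r.mag ? a.sign : 0;
	} else {
		r.mag = b.mag - a.mag;
		r.sign = b.sign;
	}
	return r;
}
```

With non-negative operands both functions give the same result,
but the unsigned variant skips the sign dispatch entirely.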

Also note, TomsFastMath does not support arbitrarily large
integers, which gives it a significant performance advantage.
Furthermore, no failure check is done with GMP. Additionally,
hebimath has some functions that are not working correctly;
those have been excluded from the comparison.

Also note, NOT does not mean the same thing in all libraries.
For example, in GMP it means (-x - 1); thus, znot does not
use GMP's NOT in the translation layer.
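
As a small illustration of how the meanings can differ, the
sketch below contrasts the (-x - 1) convention with a NOT that
only flips the bits of the magnitude. The function names are
hypothetical, and the magnitude-only variant is an assumption
made for contrast; it is not claimed to be the exact definition
used by znot or by any other library.

```c
#include <assert.h>
#include <stdint.h>

/* NOT in the (-x - 1) sense. For int64_t this is the same as the
 * C operator ~x. It is written as -1 - x so that it cannot
 * overflow for any input. */
int64_t
not_twos_complement(int64_t x)
{
	int64_t r = -1 - x;
	assert(r == ~x);
	return r;
}

/* Hypothetical magnitude-only NOT: flip every bit of the magnitude
 * up to and including its most significant set bit. */
uint64_t
not_magnitude(uint64_t mag)
{
	uint64_t mask = 0;
	uint64_t m;
	/* Build a mask covering all bits up to the highest set bit. */
	for (m = mag; m; m >>= 1)
		mask = (mask << 1) | 1;
	return ~mag & mask;
}
```

For x = 5 (binary 101), the first convention gives -6, whereas
flipping only the three magnitude bits gives 2 (binary 010), so
the two cannot be substituted for each other in a benchmark.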