doc/what-is-libzahl.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236

\chapter{What is libzahl?}
\label{chap:What is libzahl?}

In this chapter, it is discussed what libzahl is,
why it is called libzahl, why it exists, why
you should use it, what makes it different, and
what is its limitations.

\vspace{1cm}
\minitoc


\newpage
\section{The name and the what}
\label{sec:The name and the what}

In mathematics, the set of all integers is represented
by a bold uppercase `Z' ({\bf Z}), or sometimes              % proper symbol
double-stroked (blackboard bold) ($\mathbb{Z}$). This symbol % hand-written style, specially on whiteboards and blackboards
is derived from the german word for integers: `Zahlen'
[\textprimstress{}tsa\textlengthmark{}l\textschwa{}n],
whose singular is `Zahl' [tsa\textlengthmark{}l]. libzahl
[l\textsecstress{}\textsci{}b\textprimstress{}tsa\textlengthmark{}l]
is a C library capable of representing very large integers,
limited by the memory address space and available memory.
Whilst this is almost none of the elements in {\bf Z},
it is substantially more than available using the intrinsic
integer types in C. libzahl of course also implements
functions for performing arithmetic operations over
integers represented using libzahl. Libraries such as
libzahl are called bigint libraries, big integer libraries,
multiple precision integer libraries, arbitrary precision
integer libraries,\footnote{`Multiple precision integer'
and `arbitrary precision integer' are misnomers, precision
is only relevant for floating-point numbers.} or bignum
libraries, or any of the previous with `number' substituted
for `integer'. Some libraries that refer to themselves as bignum
libraries or any of using the word `number' support other
number types than integers. libzahl only supports integers.


\newpage
\section{Why does it exist?}
\label{sec:Why does it exist?}

libzahl's main competitors are GNU MP (gmp),\footnote{GNU
Multiple Precision Arithmetic Library} LibTomMath (ltm),
TomsFastMath (tfm) and Hebimath. All of these have problems:

\begin{itemize}
\item
GNU MP is extremely bloated, can only be complied with
GCC, and requires that you use glibc unless another C
standard library was used when GNU MP was compiled.
Additionally, whilst its performance is generally good,
it can still be improved. Furthermore, GNU MP cannot be
used for robust applications.

\item
LibTomMath is very slow, in fact performance is not its
priority, rather its simplicity is the priority. Despite
this, it is not really that simple.

\item
TomsFastMath is slow, complicated, and is not a true
big integer library and is specifically targeted at
cryptography.
\end{itemize}

libzahl is developed under the suckless.org umbrella.
As such, it attempts to follow the suckless
philosophy.\footnote{\href{http://suckless.org/philosophy}
{http://suckless.org/philosophy}} libzahl is simple,
very fast, simple to use, and can be used in robust
applications. Currently however, it does not support
multithreading, but it has better support for multiprocessing
and distributed computing than its competitors.

Lesser ``competitors'' (less known) to libzahl include
Hebimath and bsdnt.

\begin{itemize}
\item
Hebimath is far from stable, some fundamental functions
are not implemented and some functions are broken. The
author of libzahl thinks Hebimath is promising, but that
it could be better designed. Like libzahl, Hebimath aims
to follow the suckless philosophy.

\item
bsdnt has not been thoroughly investigated, but it
does not look promising.
\end{itemize}


\newpage
\section{How is it different?}
\label{sec:How is it different?}

All big number libraries have in common that both input
and output integers are parameters for the functions.
There are however two variants of this: input parameters
followed by output parameters, and output parameters
followed by input parameters. The former variant is the
conventional for C functions. The latter is more in style
with primitive operations, pseudo-code, mathematics, and
how it would look if the output was return. In libzahl, the
latter convention is used. That is, we write

\begin{alltt}
   zadd(sum, augend, addend);
\end{alltt}

\noindent
rather than

\begin{alltt}
   zadd(augend, addend, sum);
\end{alltt}

\noindent
This can be compared to

\vspace{1em}
$sum \gets augend + addend$
\vspace{1em}

\noindent
versus

\vspace{1em}
$augend + addend \rightarrow sum$.
\vspace{1em}

libzahl, GNU MP, and Hebimath use the output-first
convention.\footnote{GNU MP-style.} LibTomMath and
TomsFastMath use the input-first convention.\footnote{BSD
MP-style.}

Unlike other bignum libraries, errors in libzahl are
caught using {\tt setjmp}. This ensure that it can be
used in robust applications, catching errors does not
become a mess, and it minimises the overhead of
catching errors. Typically, errors can be checked when
they can occur and after each function return; however,
here they can be checked only when they can occur.

Additionally, libzahl tries to keep the functions'
names simple and natural rather than technical or
mathematical. The names resemble those of the standard
integer operators. For example, the left-shift, right-shift
and truncation bit-operations in libzahl are called
{\tt zlsh}, {\tt zrsh} and {\tt ztrunc}, respectively.
In GNU MP, they are called {\tt mpz\_mul\_2exp},
{\tt mpz\_tdiv\_q\_2exp} and {\tt mpz\_tdiv\_r\_2exp}.
The need of complicated names are diminished by resisting
to implement all possible variants of each operations.
Variants of a function simply append a short description
of the difference in plain text. For example, a variant of
{\tt zadd} that makes the assumption that both operands
are non-negative (or if not so, calculates the sum of
their absolute values) is called {\tt zadd\_unsigned}.
If libzahl would have had floored and ceiled variants of
{\tt zdiv} (truncated division), they would have been
called {\tt zdiv\_floor} and {\tt zdiv\_ceiling}.
{\tt zdiv} and {\tt zmod} (modulus) are variants of
{\tt zdivmod} that throw away one of the outputs. These
names can be compared to GNU MP's variants of truncated
division: {\tt mpz\_tdiv\_q}, {\tt mpz\_tdiv\_r} and
{\tt mpz\_tdiv\_qr}.


\newpage
\section{Limitations}
\label{sec:Limitations}

libzahl is not recommended for cryptographic
applications, it is not mature enough, and its
author does not have the necessary expertise.
And in particular, it does not implement constant
time operations, and it does not clear pooled
memory. Using libzahl in cryptographic application
is insecure; your application may become susceptible
attacks such as timing attacks, power-monitoring
attacks, electromagnetic attacks, acoustic
cryptanalysis, and data remanence attacks. libzahl
is known to be susceptible to timing attacks
(due to lack of constant time operations) and
data remanence attacks (due to pooling memory
for reuse without clearing the content of the
memory allocations.) Additionally, libzahl is not
thread-safe.

libzahl is also only designed for POSIX systems.
It will probably run just fine on any modern
system. But it makes some assumption that POSIX
stipulates or are unpractical to leave out from
machines that should support POSIX (or even
support modern software):

\begin{itemize}
\item
Bytes are octets.

\item
There is an integer type that is 64-bits wide.
(The compiler needs to support it, but it is not
strictly necessary for it to be an CPU-intrinsic,
but that would be favourable for performance.)

\item
Two's complement is used.
(The compiler needs to support it, but it is not
strictly necessary for it to be an CPU-intrinsic,
but that would be favourable for performance.)
\end{itemize}

Because of the prevalence of these properties
in contemporary machines, and the utilisation of
these properties in software, especially software
for POSIX and popular platforms with similar
properties, any new general-purpose machine must
have these properties, lest it be useless with
today's software. Therefore, libzahl can make
the assumption that the machine has these
properties. If the machine does not have these
properties, the compiler must compensate for
these machines deficiencies, making it generally
slower.

These limitations may be removed later. And there
is some code that does not make these assumptions
but acknowledge that it may be a case. On the other
hand, these limitations could be fixed, and agnostic
code could be rewritten to assume that these
restrictions are met.