@node Memory allocation
@chapter Memory allocation

@cpindex Memory allocation
@cpindex Allocate memory
The ability to allocate an amount of memory that is
not known at compile-time is an important feature
for most programs.

@cpindex Virtual address space
@cpindex Virtual memory-address
@cpindex Memory, virtual address space
@cpindex @sc{RAM}
@cpindex Swap space
On modern operating systems, processes do not have
direct access to the memory. Only the operating system
kernel does. Instead, each process has virtual
memory-addresses that the kernel maps to either
real @sc{RAM}, swap space (disc backed), file segments,
or zeroes.

@cpindex Forks
@cpindex Exec:s
@cpindex Process image
@cpindex Memory sharing, private memory
@cpindex Private memory sharing
@cpindex Memory deduplication
@cpindex Deduplication memory
Memory for a process is either allocated programmatically,
or when the process forks (is created) or exec:s
(changes process image). The operating system kernel is
typically configured to share as much memory between processes
as possible. For example, when a process forks, the parent
and the child will share their memory rather than duplicate
it, and the kernel will only remap the memory when the
processes' memory content diverges. It is also possible to
allocate memory in such a way that processes can share it, so
that updates from one process influence the other processes.

@cpindex Virtual address space segments
@cpindex Memory segments
@cpindex Segments, virtual address space
A process' virtual address space is divided into segments.
There are four important segments.

@table @i
@item text segment
@cpindex Segment, text
@cpindex Text segment
@cpindex @code{.text}
@cpindex Instructions
@cpindex Static constants
@cpindex Constants, static
@cpindex Literals
@cpindex Strings, literals
When a process exec:s this segment is allocated.
It contains instructions, static constants, and
literals.
@item BSS segment
@cpindex Segment, @sc{BSS}
@cpindex @sc{BSS} segment
@cpindex @code{.bss}
@cpindex Block Started by Symbol
@cpindex Uninitialised variables
@cpindex Zero variables
@cpindex Global variables
@cpindex Static variables
When a process exec:s this segment is allocated.
It contains all global and static variables that are
initialised to zero or lack explicit initialisation.
On some systems this segment is merged into the data
segment. @sc{BSS} is an acronym: `Block Started by Symbol'.
@item data segment
@cpindex Segment, data
@cpindex Data segment
@cpindex @code{.data}
@cpindex Heap
@cpindex Memory, heap
@cpindex Global variables
@cpindex Static variables
When a process exec:s this segment is allocated.
It is filled with all global and static variables that
are not covered by the @sc{BSS} segment.
This segment's lower end is fixed, and its upper end
is unfixed; it can be resized. Any part of the segment
that is a result of it being resized is referred to
as the heap.
@item stack segment
@cpindex Segment, stack
@cpindex Stack segment
@cpindex Memory, stack
@cpindex Call-stack
@cpindex Automatic variables
This segment contains a@footnote{Program stacks can be
created programmatically, hence `a' rather than `the'.}
program stack. A program stack contains information
required to return after a function call, and automatic
variables. Depending on the platform and the function
it may also contain arguments passed in function calls
and return values. It grows as the stack grows, but
it does not shrink as the stack shrinks.
@end table

The layout of the segments depends on the machine and
the operating system. However, a common@footnote{Keep
in mind that this layout cannot be assumed, and that
it does in fact differ between systems.} layout is to
have the text segment (like other fixed segments) start
at 0, and end at some positive address. The text
segment is then followed by the @sc{BSS} segment, and
the data segment. The stack segment, however, grows
downwards from the highest address (@math{-1} on two's
complement machines). The process cannot
allocate any more memory when the allocation would
require the data segment to be grown to such an
extent that it would overlap with the stack segment.

In C there are multiple ways to allocate memory.
@itemize @bullet{}
@item
@cpindex Global variables
@cpindex Static allocations
Variables that are declared outside functions are
called global variables@footnote{Even if they are
declared with @code{static}. In this context,
@code{static} is only used to hide objects from
other translation units.}. These are stored either
in the @sc{BSS} segment or in the data segment, depending
on their initialisation. Pointers are stored as
numerical values; the content of a literal they point
to is stored in the text segment. Arrays, however, are
not stored as pointers; only their content is stored,
in the data segment (or in the @sc{BSS} segment if the
elements are zeroes). These allocations are known as
@i{static allocations}.
@item
@cpindex @code{static}
@cpindex Static variables
@cpindex Static allocations
Variables that are declared inside functions, with
@code{static}, are stored just like global variables.
These are called static variables. They keep their
values between function calls, and are only changed by
statements other than the declaration statement. These
allocations are known as @i{static allocations}.
@item
@cpindex @code{auto}
@cpindex Local variables
@cpindex Automatic variables
@cpindex Automatic allocations
@cpindex Stack-allocations
Variables that are declared inside functions (these
are known as local variables), without @code{static},
but with @code{auto}, are called automatic variables.
These are stored in the stack segment, and are
deallocated when the function returns. Local variables
that are declared with neither @code{static},
@code{auto}, nor @code{register} are often thought of
as automatic; however, the compiler may choose to
treat them as if they were declared with
@code{register}. These allocations are known as
@i{automatic allocations}@footnote{Known in some other
programming languages as stack-allocations.}.
@item
@cpindex @code{register}
@cpindex Register variables
Variables that are declared with @code{register}
are not stored in any segment. They lack addresses
and are stored in CPU registers.
@item
@cpindex Dynamic allocation
@cpindex Heap-allocation
Using system calls, a heap can be created, where
new allocations are stored. These are often allocated
with the function @code{malloc}, and are known as
@i{dynamic allocations}@footnote{Known in some other
programming languages as heap-allocations.}.
@end itemize
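The following is a minimal sketch of static, automatic, and dynamic
allocation side by side. All names (@code{global_zero},
@code{global_answer}, @code{count_calls}, @code{sum_first}) are
hypothetical, and the segment placement mentioned in the comments
is only what is typical; it is not guaranteed.

```c
#include <stdlib.h>

int global_zero;         /* static allocation; typically in the BSS segment */
int global_answer = 42;  /* static allocation; typically in the data segment */

/* Returns how many times the function has been called so far. */
int count_calls(void)
{
  static int calls = 0;  /* static variable: keeps its value between calls */
  return ++calls;
}

/* Returns 1 + 2 + ... + n, or -1 if the allocation fails.
   n is assumed to be positive. */
long sum_first(size_t n)
{
  long sum = 0;                           /* automatic allocation */
  int* values = malloc(n * sizeof(int));  /* dynamic allocation */
  size_t i;

  if (values == NULL)
    return -1;
  for (i = 0; i < n; i++)
    values[i] = (int)(i + 1);
  for (i = 0; i < n; i++)
    sum += values[i];
  free(values);  /* dynamic allocations must be released explicitly */
  return sum;
}
```

The automatic variables @code{sum} and @code{i} disappear when
@code{sum_first} returns, whereas the memory behind @code{values}
would leak without the call to @code{free}.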

@cpindex Dynamic allocation
@cpindex Automatic allocations
@cpindex Stack-allocations
Some compilers, including @sc{GCC}, provide two
additional ways to allocate memory.
@table @asis
@item @i{Variable-length arrays}
@cpindex Variable-length arrays
@cpindex Arrays, variable-length
A simple example of variable-length arrays, available
with some compilers, is
@example
void function(size_t n)
@{
  int array[n];
  /* ... */
@}
@end example
Variable-length arrays have a special property:
they may be deallocated before the function returns.
In fact, they are deallocated once the variable goes
out of scope. This causes the following example to
work without ever exhausting the memory.
@example
void function(size_t n)
@{
  for (;;)
    @{
      int array[n];
      /* ... */
    @}
@}
@end example

@item @code{alloca}
@fnindex alloca
@code{alloca} is a special function that is implemented
by the compiler. It increases the stack and returns
a pointer to the beginning of the new part of the stack.
It is similar to variable-length arrays; however, the
allocation is not deallocated before the function returns.
This causes the following example to eventually exhaust
the memory.
@example
void function(size_t n)
@{
  for (;;)
    @{
      int* pointer = alloca(n);
      /* ... */
    @}
@}
@end example
@end table

These two allocation methods are both automatic
and dynamic.

@cpindex Memory exhaustion
Memory allocation functions return @code{NULL} if
the process cannot allocate more memory. However,
under typical configurations of the operating system
kernel, memory allocation functions can return
successfully, even if the system's memory is exhausted.
The process's virtual address space is not exhausted,
so it thinks it can allocate the memory, but the machine's
memory is exhausted, so once the process tries to write
to the memory, the kernel cannot map the virtual address
to a real address. When this happens, the process is
killed by a @code{SIGSEGV}@footnote{Segmentation
violation, also known as segmentation fault.} signal.
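
Because @code{NULL} is still returned when the virtual address
space itself is exhausted, the return value should always be
checked. The following is a minimal sketch; the function name
@code{allocate_and_touch} is hypothetical. Note that the check
cannot detect the situation described above, where the allocation
succeeds but a later write cannot be backed by real memory.

```c
#include <stdlib.h>
#include <string.h>

/* Allocates `size` bytes, writes to every byte, and frees the memory.
   Returns 0 on success, and -1 if malloc reported memory exhaustion.
   `size` is assumed to be positive. */
int allocate_and_touch(size_t size)
{
  char* buffer = malloc(size);

  if (buffer == NULL)
    return -1;  /* detected: the allocation itself failed */

  /* Not detectable here: under typical kernel configurations,
     this write is the point where an overcommitted system
     would terminate the process. */
  memset(buffer, 1, size);

  free(buffer);
  return 0;
}
```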


@menu
* The alloca function::                       Dynamically allocate automatically freed memory.
* Basic memory allocation::                   Basic functions for dynamic memory allocation.
* Aligned memory allocation::                 Dynamic memory allocation with alignment.
* Resizing memory allocations::               How to resize memory allocations.
* Efficient stack-based allocations::         Improving the performance using constrained allocation methods.
* Resizing the data segment::                 How to change the size of the heap.
* Memory locking::                            How to prevent pages from being swapped out.
@end menu



@node The alloca function
@section The @code{alloca} function

@cpindex Dynamic allocation
@cpindex Automatic allocations
@cpindex Stack-allocations
@fnindex alloca
@hfindex alloca.h
The function @code{void* alloca(size_t n)} appears
on multiple systems: 32V, @sc{PWB}, @sc{PWB}.2,
3@sc{BSD}, 4@sc{BSD}, and @sc{GNU}. It has been
added to @command{slibc}, without requiring any
feature-test macros, despite not being standardised
(for instance, it does not appear in @sc{POSIX}).
This function is however not portable, and will not
be made available if @code{_PORTABLE_SOURCE} or
@code{_LIBRARY_HEADER} is defined. @code{alloca}
is defined in the header file @file{<alloca.h>}.

@code{void* alloca(size_t n)} is similar to the
function @code{malloc}. It allocates a contiguous
space of usable memory of @code{n} bytes, and
returns a pointer, which points to the beginning
of the allocated memory. However, the allocation
appears on the stack rather than the heap, and
it is automatically deallocated when the function
whence the call to @code{alloca} was made returns.
Because of this, @code{alloca} is implemented as
a macro (using an intrinsic function provided
by the compiler) rather than as a function.

You must not try to free the memory explicitly,
with a function like @code{free}, or resize it
with a function like @code{realloc}. Just like
arrays and pointers to automatic variables,
memory allocated with @code{alloca} cannot be
operated on by memory management functions.
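
A minimal sketch of how @code{alloca} might be used for a
temporary buffer follows. The function name @code{is_palindrome}
is hypothetical, and the sketch assumes the input is short
enough to fit comfortably on the stack.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `string` reads the same backwards, and 0 otherwise.
   The reversed copy lives on the stack and must not be freed. */
int is_palindrome(const char* string)
{
  size_t n = strlen(string);
  char* reversed = alloca(n + 1);  /* n characters and the terminator */
  size_t i;

  for (i = 0; i < n; i++)
    reversed[i] = string[n - 1 - i];
  reversed[n] = '\0';

  /* No call to free: the memory is released when the function returns. */
  return strcmp(string, reversed) == 0;
}
```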

@cpindex Memory exhaustion
Unlike @code{malloc}, @code{alloca} does not detect
memory exhaustion, thus it will never return
@code{NULL}. However, it is likely that the process
will receive @code{SIGSEGV}@footnote{Segmentation
violation, also known as segmentation fault.} if it
tries to access memory that could not be allocated,
or, depending on the kernel's configuration, before
it returns.

On typical kernels and kernel configurations,
@code{alloca} and @code{malloc} will handle memory
exhaustion identically.

Undefined behaviour may be invoked if @code{alloca}
is called within the argument list of a function call.
The behaviour depends on the machine, the compiler, and
optimisations. You should avoid code similar to
@example
#define strdupa(string)  \
  strcpy(alloca((strlen(string) + 1) * sizeof(char)), string)
@end example
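One way to stay clear of this problem, sketched here for the
case where a modifiable copy of a string is needed, is to call
@code{alloca} in a statement of its own and copy the content
afterwards. The function name @code{count_words} is hypothetical;
it copies its input because @code{strtok} modifies the string
it is given.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Returns the number of space-separated words in `text`. */
size_t count_words(const char* text)
{
  size_t size = strlen(text) + 1;
  char* copy = alloca(size);  /* alloca is not nested in another call */
  size_t words = 0;
  char* word;

  memcpy(copy, text, size);   /* the copy is made in a separate statement */
  for (word = strtok(copy, " "); word != NULL; word = strtok(NULL, " "))
    words++;
  return words;
}
```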

@code{alloca} has its restrictions (limited lifetime,
cannot be explicitly deallocated, not growable, and
not shrinkable), but it can also be advantageous
because:
@itemize
@item
It results in cleaner code because the memory is
deallocated automatically.
@item
It does not waste any space on metainformation
required for bookkeeping.
@item
It uses a faster memory allocation mechanism.
@end itemize



@node Basic memory allocation
@section Basic memory allocation

@cpindex Dynamic allocation
@fnindex malloc
@hfindex malloc.h
@hfindex stdlib.h
@code{malloc} is the most general, and most commonly
used memory allocation facility. It is also the most
portable, and is available in all environments. There
are three important functions that were introduced in
@sc{ANSI}@tie{}C. These are defined in @file{<malloc.h>};
however, they should be included via @file{<stdlib.h>},
which includes @file{<malloc.h>}.

@table @code
@item void* malloc(size_t size)
@fnindex malloc
@cpindex Memory allocation without initialisation
@cpindex Uninitialised memory allocation
Allocates a contiguous memory block of @code{size}
bytes of usable memory, and returns a pointer
to the beginning of the usable part of the allocation.
The function allocates some extra memory for internal
bookkeeping; this part of the allocation is located
before the address the returned pointer points to.
Allocated memory is uninitialised.

If the memory cannot be allocated (due to memory
exhaustion), @code{NULL} is returned and @code{errno}
is set to @code{ENOMEM}.

It is implementation defined whether this function
returns @code{NULL} or a unique pointer if
@code{size} is zero. In @code{slibc}, @code{NULL}
is returned.

The allocation can be grown or shrunk at any time.
See @ref{Resizing memory allocations}.

In @sc{ISO}@tie{}C, @code{void*} can be implicitly
converted to any other object pointer type. Therefore
it is sufficient (and preferable) to write
@example
int* ten_integers = malloc(10 * sizeof(int));
@end example
You do not need to write
@example
int* ten_integers = (int*)malloc(10 * sizeof(int));
@end example

@item void* calloc(size_t count, size_t size)
@fnindex calloc
@cpindex Memory allocation with initialisation
@cpindex Initialised memory allocation
This function is similar to @code{malloc}, but
it initialises the memory to zeroes. The only
other difference is that it has two parameters
rather than one.

If @code{count * size} would overflow, @code{NULL}
is returned and @code{errno} is set to @code{ENOMEM}.
Otherwise, @code{p = calloc(a, b)} is identical to
@example
p = malloc(a * b);
if (p != NULL)  /* memset must not be given NULL */
  memset(p, 0, a * b);
@end example

@item void free(void* ptr)
@fnindex free
@cpindex Deallocate memory
@cpindex Memory, deallocation
Memory that is dynamically allocated (without
automatic deallocation) can be deallocated with
this function. It is required that @code{ptr}
is the pointer returned by the function that
allocated it. It must not have been incremented.
Undefined behaviour is invoked otherwise. One
exception is @code{NULL}: if @code{ptr} is
@code{NULL}, nothing happens.

Do not try to free a pointer twice; doing so invokes
undefined behaviour. Neither should you pass
pointers to variables to this function, nor
pointers to functions, arrays, @code{main}'s
parameters, memory-mapped I/O, or memory allocated
with @code{alloca}. Function-like macros that allocate
memory with @code{alloca}@footnote{There are no functions
that return pointers allocated with @code{alloca},
because the memory would be deallocated at the
return. Thus, such facilities are implemented as
function-like macros.} are documented as doing so, and
their names typically end with an `a', which is not too
common for other functions.

Be also cautious of whether functions return
statically allocated arrays, which must not
be deallocated, or dynamically allocated
memory, which should be deallocated to avoid
memory leaks.
@end table
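The overflow check described for @code{calloc} can be sketched
as follows. The name @code{checked_calloc} is hypothetical and
this is not necessarily how @command{slibc} implements
@code{calloc}; it only illustrates why @code{count * size}
must be validated before it is passed to @code{malloc}.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocates zero-initialised memory for `count` elements of `size`
   bytes each. Returns NULL and sets errno to ENOMEM if the product
   count * size would overflow or if malloc fails. */
void* checked_calloc(size_t count, size_t size)
{
  void* memory;

  /* count * size overflows exactly when count > SIZE_MAX / size. */
  if (size != 0 && count > SIZE_MAX / size)
    {
      errno = ENOMEM;
      return NULL;
    }

  memory = malloc(count * size);
  if (memory != NULL)
    memset(memory, 0, count * size);
  return memory;
}
```

A caller releases the memory with @code{free}, exactly as for
memory returned by @code{malloc}.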

@file{<malloc.h>} also declares three non-portable
functions.
@table @code
@item void* zalloc(size_t size)
@fnindex zalloc
This function is similar to @code{calloc}, but it
only has one parameter. Assuming @code{a} and
@code{b} are two @code{size_t}:s, and @code{a * b}
does not overflow, @code{calloc(a, b)} is identical
to @code{zalloc(a * b)}. @code{zalloc} is a
@command{klibc} extension.

@item void cfree(void* ptr, ...)
@fnindex cfree
This function is an obsolete function provided
by a number of C standard library implementations.
It has been replaced by @code{free}. @code{cfree}
is identical to @code{free}.

Different implementations of @command{libc} define
@code{cfree} with different numbers of parameters;
therefore @code{slibc} declares @code{cfree} as
a variadic function, and ignores all but the first
argument.

@item size_t malloc_usable_size(void* ptr)
@fnindex malloc_usable_size
@cpindex Retrieve allocation size
@cpindex Allocation size, retrieve
This function returns the number of usable bytes
for a pointer. If @code{ptr} is @code{NULL}, zero
is returned. It has the same restrictions as the
function @code{free}.

@code{malloc_usable_size(malloc(n))} returns
@code{n} (and allocates memory in the process.)

This function is a @sc{GNU} extension and requires
@code{_GNU_SOURCE}.
@end table
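Since @code{zalloc} is not available in most C libraries, a
program that wants the same behaviour elsewhere could provide
a stand-in built on @code{calloc}. The following is a minimal
sketch; the name @code{portable_zalloc} is hypothetical.

```c
#include <stdlib.h>

/* Allocates `size` bytes of zero-initialised memory, or returns
   NULL on failure. With a count of 1, the product count * size
   in calloc cannot overflow. */
void* portable_zalloc(size_t size)
{
  return calloc(1, size);
}
```

The returned pointer is released with @code{free}, like any
other pointer obtained from @code{calloc}.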



@node Aligned memory allocation
@section Aligned memory allocation

TODO



@node Resizing memory allocations
@section Resizing memory allocations

TODO



@node Efficient stack-based allocations
@section Efficient stack-based allocations

TODO obstack.h



@node Resizing the data segment
@section Resizing the data segment

TODO brk, sbrk



@node Memory locking
@section Memory locking

TODO mlock, munlock, mlockall, munlockall