Malloc
malloc is a function for performing dynamic memory allocation in the C programming language.
Rationale
The C programming language normally manages memory either statically or automatically. Static variables exist for the whole run of the program, while automatic variables live on the stack: space for such a variable is typically created when the function is entered and is automatically reclaimed when the function returns. However, stack-based allocation is somewhat limited: before C99 introduced variable-length arrays, the size of the allocation had to be a compile-time constant, and in every version of C the lifetime of the allocation is limited to the current function call. This makes it awkward to keep stack-allocated data alive across multiple function calls. Consider getting input from a user and storing it in a string: with a fixed-size array, the size of the string must be known at compile time. Thus if the programmer allocates space for eleven characters and the user types fifteen, five characters will be lost (one byte is reserved for the terminating null byte).
This problem is alleviated by allocating memory elsewhere, such as on the heap. The heap is a memory area reserved for allocating memory dynamically. Generally, one uses malloc to allocate a block of memory on the heap, and a pointer to this block of memory is maintained.
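As a minimal sketch of the difference in lifetime, the following function returns a copy of a string whose size is only known at run time, and the copy stays valid after the function has returned. The name copy_to_heap is hypothetical and chosen only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of `text`,
 * or NULL if the allocation fails. The caller must free() the result. */
char *copy_to_heap(const char *text)
{
    size_t len = strlen(text) + 1;   /* +1 for the terminating null byte */
    char *copy = malloc(len);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, text, len);
    return copy;                     /* still valid after this function returns */
}
```

A local array declared inside the function could not be returned this way, because its storage is reclaimed when the function exits.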
Dynamic memory allocation in C
The malloc function is the basic function used to allocate memory on the heap in C. Its prototype is
void *malloc(size_t size);
which allocates size bytes of memory. If the allocation succeeds, a pointer to the block of memory is returned; otherwise a null pointer is returned. malloc returns a void pointer (void *), which indicates that it is a pointer to a region of unknown data type. In C this pointer is converted implicitly to any other object pointer type on assignment, so an explicit cast is not required, although some programmers write one (and C++ requires it).
Memory allocated via malloc is persistent: it will continue to exist until the program terminates or the memory is explicitly deallocated by the programmer (that is, the block is said to be "freed"). This is achieved by use of the free function. Its prototype is
void free(void *pointer);
which releases the block of memory pointed to by pointer.
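The pairing of malloc and free can be sketched as follows. The function name sum_of_squares is an assumption made for this example; it allocates a block, uses it, and releases it before returning.

```c
#include <stdlib.h>

/* Hypothetical helper: stores the squares 0, 1, 4, ... in a heap block of
 * `count` ints, sums them, and frees the block before returning.
 * Returns -1 if the allocation fails. */
long sum_of_squares(size_t count)
{
    if (count == 0) {
        return 0;                    /* avoid malloc(0), whose result may be NULL */
    }
    int *values = malloc(count * sizeof *values);
    if (values == NULL) {
        return -1;
    }
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        values[i] = (int)(i * i);
        total += values[i];
    }
    free(values);                    /* the block must not be used after this point */
    return total;
}
```

Writing the size as count * sizeof *values keeps the expression correct even if the element type of values is later changed.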
Usage example
The standard method of creating an array of ten integers on the stack:
int array[10];
But if we want to allocate the array dynamically we should write:
#include <stdlib.h>

/* Allocates space for an array with 10 elements of type int */
int *ptr = malloc(sizeof(int) * 10);
if (ptr == NULL) {
    exit(1); /* We couldn't allocate any memory, so exit */
}
/* allocation succeeded */
Related functions
malloc returns a block of memory that is allocated for the programmer to use, but is uninitialised. The memory is usually initialized by hand if necessary, either via the memset function or by one or more assignment statements that dereference the pointer. An alternative is to use the calloc function, which allocates memory and then initializes it. Its prototype is
void *calloc(size_t nelements, size_t bytes);
which allocates a region of memory large enough to hold nelements elements of size bytes each. The allocated region is initialized to zero.
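The two ways of obtaining zeroed memory can be compared in a short sketch. Both function names are hypothetical; the two functions are intended to produce blocks with identical contents.

```c
#include <stdlib.h>
#include <string.h>

/* Zero-filled array obtained with malloc followed by memset. */
int *zeroed_with_malloc(size_t nelements)
{
    int *block = malloc(nelements * sizeof *block);
    if (block != NULL) {
        memset(block, 0, nelements * sizeof *block);
    }
    return block;
}

/* The same result obtained with a single calloc call. */
int *zeroed_with_calloc(size_t nelements)
{
    return calloc(nelements, sizeof(int));
}
```

One practical difference is that calloc receives the element count and element size separately, so modern implementations can generally detect when their product would overflow, whereas the multiplication passed to malloc is the caller's responsibility.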
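As in the motivating example of reading user input, it is often useful to grow or shrink a block of memory. One can allocate a new block, copy the contents across, and then free the old block; however, the realloc function is often used instead, since it can sometimes resize the block in place and so optimise this operation. Its prototype is
void *realloc(void *pointer, size_t bytes);
If the new size is greater than the old size, the block is grown; otherwise it is shrunk. The contents are preserved up to the smaller of the two sizes, and the returned pointer may differ from pointer. If realloc fails, it returns NULL and leaves the original block allocated.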
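A common use of realloc is a buffer that grows as data arrives. The sketch below assumes a hypothetical struct int_buffer and a doubling growth policy; neither comes from the C standard. It stores the result of realloc in a temporary so that the original block is not lost if the call fails.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints; the names are illustrative only. */
struct int_buffer {
    int *data;
    size_t length;    /* elements in use */
    size_t capacity;  /* elements allocated */
};

/* Appends `value`, doubling the capacity when the buffer is full.
 * Returns 0 on success, or -1 if realloc fails (the buffer is left intact). */
int int_buffer_push(struct int_buffer *buf, int value)
{
    if (buf->length == buf->capacity) {
        size_t new_capacity = buf->capacity ? buf->capacity * 2 : 4;
        /* realloc(NULL, n) behaves like malloc(n), so an empty buffer works too. */
        int *grown = realloc(buf->data, new_capacity * sizeof *grown);
        if (grown == NULL) {
            return -1;               /* buf->data is still valid and still owned */
        }
        buf->data = grown;
        buf->capacity = new_capacity;
    }
    buf->data[buf->length++] = value;
    return 0;
}
```

Writing buf->data = realloc(buf->data, ...) directly would overwrite the only pointer to the old block with NULL on failure, which leaks it.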
Common errors
Improper use of malloc and related functions is a frequent source of bugs in C programs.
Allocation failure
malloc is not guaranteed to succeed: if there is no virtual memory available, or the program has exceeded the amount of virtual memory it is allowed to reference, malloc will return a NULL pointer. Depending on the design of the operating system and the C standard library, this may happen with some frequency in production environments. Therefore, a well-written program should handle this situation. Unfortunately, many programs do not, and will crash if malloc fails, typically when they dereference the null pointer. One reason why malloc failures are often ignored is that recovering from the error can be difficult.
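Part of the difficulty is that a failure may arrive after earlier allocations have already succeeded, and those must then be undone. The following is a sketch under assumed names (matrix_create and matrix_destroy are hypothetical) of a function that rolls back its partial work and reports the failure to its caller.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `rows` zero-filled arrays of `cols` doubles.
 * Both counts are assumed to be nonzero. On any failure it frees whatever
 * was already allocated and returns NULL. */
double **matrix_create(size_t rows, size_t cols)
{
    double **m = calloc(rows, sizeof *m);
    if (m == NULL) {
        return NULL;
    }
    for (size_t r = 0; r < rows; r++) {
        m[r] = calloc(cols, sizeof **m);
        if (m[r] == NULL) {
            while (r > 0) {          /* undo the rows allocated so far */
                free(m[--r]);
            }
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Releases a matrix returned by matrix_create. */
void matrix_destroy(double **m, size_t rows)
{
    if (m == NULL) {
        return;
    }
    for (size_t r = 0; r < rows; r++) {
        free(m[r]);
    }
    free(m);
}
```

The caller still has to check for NULL and decide what to do, but the function itself neither crashes nor leaks when memory runs out partway through.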
Memory leaks
The return value of malloc, calloc, and realloc must eventually be passed to the free function so that the memory can be released. If this is not done, the allocated memory is not released until the process exits; in other words, a memory leak will occur.
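Leaks often come from a return path that skips the call to free. One way to avoid this, sketched here with a hypothetical function is_palindrome, is to route every path through a single point where the block is released.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: returns 1 if `text` reads the same in both
 * directions, 0 if it does not, and -1 if memory could not be allocated. */
int is_palindrome(const char *text)
{
    size_t len = strlen(text);
    int result = -1;
    char *reversed = malloc(len + 1);
    if (reversed != NULL) {
        for (size_t i = 0; i < len; i++) {
            reversed[i] = text[len - 1 - i];
        }
        reversed[len] = '\0';
        /* An early "return" here, without freeing reversed, would leak the block. */
        result = (strcmp(text, reversed) == 0);
    }
    free(reversed);                  /* free(NULL) is defined to do nothing */
    return result;
}
```

The temporary copy is not strictly needed to test for a palindrome; it is there only to give the function a block that must be released on every path.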
Double free
When a pointer has been passed to free, the pointer variable still holds the old address, but it now references a region of memory with undefined content, which may no longer be available for use. Accessing the memory through the pointer is an error. For example:
int *ptr = malloc(sizeof(int));
free(ptr);
*ptr = 0; /* undefined behavior */
Problems of this kind can result in unpredictable program behavior — after the memory has been freed, the system may reuse that memory region for storage of unrelated data. So writing through a pointer to a deallocated region of memory may result in overwriting another piece of data somewhere else in the program — which may cause data corruption, or crash the program at some future point in time.
A particularly bad instance of this problem is passing the same pointer to free twice, known as a double free, which can corrupt the allocator's internal bookkeeping.
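A common defensive habit is to reset a pointer to NULL as soon as its block has been freed, since free(NULL) is defined to do nothing. The helper release_int below is hypothetical and only sketches the idea.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the block that *slot points to and resets
 * *slot to NULL, so a repeated call on the same variable is harmless. */
void release_int(int **slot)
{
    free(*slot);
    *slot = NULL;
}
```

This protects only the variable that is passed in. Any other copy of the pointer elsewhere in the program still holds the stale address and must not be used or freed.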
Implementations
The implementation of memory management depends greatly upon operating system and architecture. Some operating systems supply an allocator for malloc, while others supply functions to control certain regions of data.
The same dynamic memory allocator is often used to implement both malloc and operator new in C++. Hence, we will call this the allocator rather than malloc.
Heap-based
Implementation of the allocator on IA-32 architectures is commonly done using the heap, or data segment. The allocator will usually expand and contract the heap to fulfill allocation requests.
The heap method suffers from a few inherent flaws, stemming entirely from fragmentation. Like any method of memory allocation, the heap will become fragmented; that is, there will be sections of used and unused memory in the allocated space on the heap. A good allocator will attempt to find an unused area of already allocated memory to use before resorting to expanding the heap. However, the time such a search takes is hard to bound, so a general-purpose allocator may be unsuitable for a real-time system, where a memory pool is often used instead.
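A memory pool avoids the search by handing out blocks of one fixed size from a preallocated region. The sketch below is one simple way to build such a pool, not a description of any particular library; the names, the block size of 32 bytes, and the count of 8 blocks are all assumptions made for illustration.

```c
#include <stddef.h>

#define POOL_BLOCKS 8      /* assumed number of blocks */
#define BLOCK_SIZE  32     /* assumed block size in bytes */

/* While a block is free, its first bytes hold a link to the next free block. */
union pool_block {
    union pool_block *next;
    unsigned char bytes[BLOCK_SIZE];
};

struct pool {
    union pool_block storage[POOL_BLOCKS];
    union pool_block *free_list;
};

/* Links every block into the free list. */
void pool_init(struct pool *p)
{
    p->free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
}

/* Constant time: pops the first free block, or returns NULL when exhausted. */
void *pool_alloc(struct pool *p)
{
    union pool_block *block = p->free_list;
    if (block != NULL) {
        p->free_list = block->next;
    }
    return block;
}

/* Constant time: pushes a block obtained from pool_alloc back onto the list. */
void pool_free(struct pool *p, void *ptr)
{
    union pool_block *block = ptr;
    block->next = p->free_list;
    p->free_list = block;
}
```

Because every block has the same size, the pool cannot fragment and both operations take a fixed number of steps. The price is that requests larger than BLOCK_SIZE cannot be served, and blocks are only aligned as strictly as a pointer.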
The major problem with this method is that the heap is a single contiguous region with only two significant attributes: base, or the beginning of the heap in virtual memory space; and length, or its size. The heap requires enough system memory to fill its entire length, and its base can never change, so the heap can only shrink from its end. Thus, any large areas of unused memory in the middle are wasted. The heap can get "stuck" in this position if a small used segment exists at the end of the heap, which can waste anywhere from a few megabytes to a few hundred megabytes of system RAM.
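The "stuck" heap can be illustrated with a toy model, which is not a real allocator: the heap is treated as a row of equal-sized blocks, and the hypothetical function heap_min_length computes the shortest length the heap can be trimmed to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: used[i] tells whether block i of the heap is in use.
 * Since the heap can only shrink from its end, the smallest possible
 * length is one past the last used block. */
size_t heap_min_length(const bool *used, size_t length)
{
    while (length > 0 && !used[length - 1]) {
        length--;
    }
    return length;
}
```

With eight blocks of which only the last is in use, the function returns 8: the seven free blocks below it cannot be given back. If only the first block is in use, it returns 1.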
The glibc allocator
The GNU C library, glibc, uses both brk and mmap on the Linux operating system. The brk system call changes the size of the heap, making it larger or smaller as needed, while the mmap system call is used when extremely large segments are allocated. The heap method suffers the same flaws as any other, while the mmap method may avert the problem of huge buffers that, once freed, leave a small allocation trapped at the end of the heap.
The mmap method has its own flaws. It always allocates a segment by mapping whole pages, and each allocated segment has its own set of mapped pages. Mapping a single byte therefore uses an entire page, usually 4096 bytes on IA-32. Huge pages are 4 MiB, 1024 times larger, so the waste could be particularly severe if userspace used only huge pages. The advantage of the mmap method is that when the segment is freed, the memory is returned to the system immediately.
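The cost of page granularity is simple arithmetic, sketched here with a hypothetical function mapped_bytes. It ignores any bookkeeping header a real allocator adds to the request and assumes the sum size + page_size does not overflow.

```c
#include <stddef.h>

/* Bytes actually mapped for a request of `size` bytes when memory is
 * handed out in whole pages of `page_size` bytes (page_size must be nonzero). */
size_t mapped_bytes(size_t size, size_t page_size)
{
    size_t pages = (size + page_size - 1) / page_size;   /* round up */
    return pages * page_size;
}
```

For a one-byte request this gives 4096 bytes with 4096-byte pages and 4194304 bytes with 4 MiB huge pages, while a request of 4097 bytes occupies two ordinary pages, or 8192 bytes.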
External links
- Definition of malloc in IEEE Std 1003.1 standard (http://www.opengroup.org/onlinepubs/009695399/functions/malloc.html)
- The design of the basis of the glibc allocator (http://gee.cs.oswego.edu/dl/html/malloc.html) by Doug Lea (http://gee.cs.oswego.edu/dl)
- The Hoard memory allocator, by Emery Berger (http://www.cs.umass.edu/~emery)
- "Scalable Lock-Free Dynamic Memory Allocation (http://www.research.ibm.com/people/m/michael/pldi-2004.pdf)" by Maged M. Michael
- "Inside memory management - The choices, tradeoffs, and implementations of dynamic allocation (http://www-106.ibm.com/developerworks/linux/library/l-memory/)" by Jonathan Bartlett