Providing dynamic memory allocation with good performance is an important aspect for most modern software designs. The implementation of the ANSI-compliant malloc, free and related allocation functions in the system's C library is therefore an essential ingredient for overall system efficiency, considering that most applications ultimately make use of these basic functions.
This paper describes the evolution of the malloc implementation in recent releases of the Linux C library and the new GNU C library. Most of the features discussed are those of the public domain code by Doug Lea, which has been incorporated into Linux libc since version 5.3.x. An extension of this code was developed to handle concurrent malloc/free use from multiple threads, and this is now available as part of the new GNU C library release, which is expected to be in wide use at the time of this conference. Comparisons are made against other popular free malloc implementations, among them the BSD-, `CSRI'- and earlier GNU versions.
For evaluating and comparing malloc performance, as large a variety of allocation patterns as possible should be considered, because real programs vary greatly in their kind of memory usage. It is always possible to generate long pseudo-random allocation sequences and measure the time and space used by the allocator for such a sequence. However, in order to accurately assess the behaviour of real applications, especially with respect to memory efficiency, a small tracing facility was developed. This consists of a `dummy' malloc implementation that records all malloc, free, etc. calls in a file while the program runs, and a simulator, which can read the trace file and re-perform the original allocation pattern, while measuring memory usage and CPU time. The simulator can be linked against different malloc implementations for comparisons.
Several performance results from a typical i586-Linux system are presented, for memory-intensive applications such as compiling large C++ sources with gcc/gas/ld, the X server, and a WWW browser. Doug Lea's malloc is consistently among the fastest and most space-efficient implementations; in many cases it particularily excels at saving memory, where other allocators suffer from fragmentation.
Finally, recently added malloc features in Linux libc and GNU libc are discussed. `Hooks' were introduced which make it possible to change the malloc implementation at runtime. For example, truly fault-tolerant wrappers are part of the library code and can be enabled by setting an environment variable, at a noticeable but moderate performance cost. The implementation has been made thread-safe, but without using the single global lock approach found on many other systems. On multiprocessor machines, this promises large performance gains of heavily threaded programs.