Identifying Memory Management Bugs Within Applications
Using the libumem Library
by Robert Benson
(June, 2003)
The following article may contain actual software programs in source code form.
This source code is made available for developers to use as needed, pursuant to the
terms and conditions of this license.
Table of Contents
Overview
Introduction
Debugging Infrastructure
Debugging Methodology
Examples
References
Overview
This article will introduce the new user space slab allocator,
libumem, shipped in the Solaris 9 Operating System (Solaris 9 OS),
Update 3. Of particular interest is the
debugging infrastructure provided by the libumem library.
This paper will focus on the application developer's use of the
debugging features provided by the libumem library to find and fix
memory management bugs efficiently within existing code.
First, we will introduce the libumem library and briefly
describe some of the advantages of using the slab allocator for
application memory management. Next, we'll describe
the details of the debugging infrastructure provided by the
libumem library, and the tools to take advantage of the infrastructure.
Finally, we will walk through a few examples using the libumem
library and the Solaris OS Modular Debugger (MDB) to illustrate
the ease of finding memory management bugs (that is, memory
corruption and leaks).
Introduction
The creation of the user space slab allocator was inspired
by the kernel space slab allocator introduced in SunOS 5.4
[1].
The kernel slab allocator was created by engineers investigating
system memory management in an effort to find new ways to make the
virtual memory (VM) subsystem faster, more efficient, and
scalable. The slab allocator provides faster and more efficient
memory allocation by using an object caching strategy. Object
caching is a strategy by which memory that is frequently allocated and
freed will be cached, so that the overhead of creating the same data
structure is decreased. This strategy has proven to be very efficient
due to the large amount of variable reuse within most code. The
scalability of the slab allocator has been improved upon over the last few
years by using a per-CPU set of caches. This addition allowed for
a far less contentious locking scheme when requesting memory from the
system and thus, has created a more scalable memory allocator.
Once the slab allocator was found to work well within kernel space, the
next step was to port those wins to user space. Hence, the user space
slab allocation library, libumem, was created. Beginning with the
Solaris 9 OS, Update 3, the libumem library will be a standard
part of the Solaris OS.
The user space slab allocator is based upon a set of umem caches whose
size is determined before the first allocation. The umem caches
are built using slabs of memory from the system. The slab nomenclature
denotes one or more contiguous virtual memory (VM) pages which are
split into equal size chunks called buffers. The buffer contains
the user's data and in addition can, depending on the environment
settings, contain the debug information that will help the application
developer find and repair memory management bugs.
For a detailed discussion about the structure and theory
behind the slab allocator, please refer to
The Slab Allocator: An Object-Caching Kernel Memory Allocator,
by Jeff Bonwick, and
Magazines and Vmem: Extending the Slab Allocator to Many CPUs and
Arbitrary Resources, by Jeff Bonwick and Jonathan Adams.
Debugging Infrastructure
Here we will describe the sections of the buffer created when
an application requests memory resources. In addition, the
meanings of the boundary values seen within this buffer will be
explained. This is intended to provide the developer with an understanding of
the way in which the libumem library sets up the infrastructure by
which the application's memory transactions can be scrutinized for
validity.
Anatomy of the Buffer
The buffer is divided into four sections, as seen in Figure 1.
| Metadata section | User data section | Redzone section | Debug metadata section |
Figure 1. The structure of a buffer created
by the libumem library
The first section is devoted to storing 8 bytes of metadata with which
we will not concern ourselves in this article. The second
section contains the memory that the application will use to
store its data. The third section is called the
redzone, and purposely separates the user data and debug
metadata sections. In addition, the redzone section contains a
value that can be used to determine the size of the application's
memory request. The fourth, and final, section is used to store the
debug metadata which the developer can use to determine the
state and history of the buffer.
User Data Section
The user data section is the portion of the buffer which is
reserved for the application's data. To understand the
functionality of this section of the buffer we must understand the
basic building blocks of the slab allocator, the umem caches. The
slab allocator is based upon umem caches that consist of buffers with
predetermined sizes. Thus, when an application requests memory
from the system, the system will allocate memory from the umem cache
that has a user data section of equal or greater size than the request.
The size of the user data section will typically be larger than
the amount of memory requested by the application, as seen in
Figure 2.
| Memory available to the application | 0xbb | Memory not available to the application |
Figure 2. The structure of the user data
section
Each umem cache consists of a set of buffers of one predetermined size
in order to facilitate object reuse and to minimize memory
fragmentation within the system. Therefore, most of the memory allocation
requests by an application do not require the full amount of space
provided by the buffer's user data section.
The memory requested by the application begins at the start of the
user data section and ends at the boundary value of 0xbb. The
0xbb boundary value is placed just after the last byte of memory
requested by the application. The memory between the 0xbb value and the
start of the redzone section, the next section within the buffer,
is not to be used by the application. In the following output
from MDB, the 0xbb value is written just after the tenth byte in the
user data section, as is appropriate for a 10-byte application
allocation request.
> 0x49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe
baddbbfe baddcafe feedface 11a7
50000 a115c8ed
Note: The previous hexadecimal dump
is the output of a MDB command. This particular command displays
ten 4 byte hexadecimal values starting at address 0x49fc0. This
output represents an entire libumem buffer starting from the address
0x49fc0. Please refer to the documentation at http://docs.sun.com for more details
about MDB.
If the application requests an amount of memory which happens to be
exactly the same size as the user data section predetermined by the size of
the umem cache, the 0xbb value will occupy the first byte of the
redzone section.
Please note that the value 0xbaddcafe is written to all of
the uninitialized memory segments within the buffer's user data
section. This is a feature of the debugging infrastructure
provided by the libumem library in order to determine when an
application is accessing data that has not been previously initialized.
Redzone Section
This section of the buffer is 8 bytes in size and is used to
differentiate between the user data section and the debug
metadata section within the buffer. The boundary value 0xfeedface
indicates the beginning of the redzone section, as can be seen
below.
> 0x49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe
baddbbfe baddcafe feedface 11a7
50000 a115c8ed
As was noted previously, if the application requests an amount of
memory which happens to be exactly the same size as the entire user data
section predetermined by the size of the umem cache, the 0xbb
value will
occupy the first byte of the redzone section. Thus, the redzone
will not start with 0xfeedface but with 0xbbedface.
The redzone boundary value can be verified to determine
whether a buffer overflow has taken place. In addition,
the last 4 bytes of the redzone section, 0x11a7 in the previous dump,
can be used to verify the amount of memory requested by the
application. As can be seen in the /usr/include/umem_impl.h
header file, this value has been encoded by the following macro:
#define UMEM_SIZE_ENCODING(x) ( 251 * (x) + 1 )
where the value x is the size of the application's memory request, plus
8 bytes. Thus, we'll use the previous dump to verify this
behavior.
> 0x11a7=D
4519
By subtracting 1 from the decimal value of 4519, dividing the result by
251, and then subtracting 8, we find that the application requested 10
bytes of memory from the system.
Debug Metadata
This section of the buffer contains 8 bytes that consist of a 4
byte pointer to a umem_bufctl_audit structure and a 4 byte checksum.
The umem_bufctl_audit structure, as seen within the
/usr/include/umem_impl.h header file, contains the following:
typedef struct umem_bufctl_audit {
    struct umem_bufctl *bc_next;     /* next bufctl struct */
    void *bc_addr;                   /* address of buffer */
    struct umem_slab *bc_slab;       /* controlling slab */
    umem_cache_t *bc_cache;          /* controlling cache */
    hrtime_t bc_timestamp;           /* transaction time */
    thread_t bc_thread;              /* thread doing transaction */
    struct umem_bufctl *bc_lastlog;  /* last log entry */
    void *bc_contents;               /* contents at last free */
    int bc_depth;                    /* stack depth */
    uintptr_t bc_stack[1];           /* pc stack */
} umem_bufctl_audit_t;
Of particular interest is the pointer to the stack trace for the
last thread that allocated or freed the buffer. The second
4 byte value within the debug metadata section, called the bxstat
value, is a checksum that can be used to verify that the buffer is
in a known state. The value of the pointer to the umem_bufctl_audit
structure XOR'ed to the value of the bxstat checksum should result in
0xa110c8ed for an allocated buffer
(as seen below) or 0xf4eef4ee for a freed buffer. If this is not the
case, the buffer has become corrupt.
> 0x49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe
baddbbfe baddcafe feedface 11a7
50000 a115c8ed
> 50000^a115c8ed=K
a110c8ed
Debugging Methodology
Many applications manage memory through the standard malloc() and
free() methods, which keeps them independent of any particular
memory management programming interface. This section will outline
the steps needed to take advantage of the libumem library to debug
such an application's memory transactions.
Library Interposition and libumem Flags
If the libumem library is interposed (by setting the
LD_PRELOAD environment variable) when executing an application, the
malloc() and free() methods defined within the
libumem library
will be used whenever the application calls malloc() or
free(). In order to take advantage of the debugging
infrastructure of the libumem library, one needs to set the
UMEM_DEBUG and the UMEM_LOGGING flags in the environment where the
application is being executed. The most common values for
these flags are as follows: UMEM_DEBUG=default and
UMEM_LOGGING=transaction. With these settings, a thread ID,
high-resolution time stamp, and stack trace are recorded
for each memory transaction initiated by the application.
In addition, the libumem library will:
- Fill all the allocated and freed memory segments
within the buffer with special patterns to help detect the use
of uninitialized data (0xbaddcafe) and previously
freed buffers (0xdeadbeef).
- Create a redzone section after the user data section which is
checked for integrity when the buffer is allocated and freed.
- Create a debug metadata section after the redzone section which
consists of a pointer to a umem_bufctl_audit structure and a
bxstat checksum.
The following are examples of the commands used to set the
appropriate debug flags and interpose the libumem library when
executing an application.
(csh)
%(setenv UMEM_DEBUG default; setenv UMEM_LOGGING transaction; \
setenv LD_PRELOAD libumem.so.1; ./a.out)
or
(bash)
bash-2.04$ UMEM_DEBUG=default UMEM_LOGGING=transaction \
LD_PRELOAD=libumem.so.1 ./a.out
More details about the debug flags (UMEM_DEBUG and
UMEM_LOGGING) can be found in the umem_debug(3MALLOC) man page.
MDB Commands
The developer can view the debug information pertaining to an
application's memory management transactions by using MDB. The
following commands within MDB can be used to provide a great deal of
information about the memory transactions that took place during
the execution of the application.
::umem_status
- Prints the status of umem, indicating whether the
logging features have been turned on or off
> ::umem_status
Status: ready and active
Concurrency: 1
Logs: transaction=64k
Message buffer:
::findleaks
- Prints a summary of the memory leaks found within
the application
> ::findleaks
CACHE LEAKED BUFCTL CALLER
0003d888 1 00050000 main+0xc
----------------------------------
Total 1 buffer, 24 bytes
::umalog
- Prints the memory transactions initiated by the application and
the correlated stack traces
> ::umalog
T-0.000000000 addr=55fb8 umem_alloc_32
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x44
libumem.so.1`malloc+0x2c
main+0x18
_start+0x108
T-0.000457800 addr=49fc0 umem_alloc_24
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x44
libumem.so.1`malloc+0x2c
main+0xc
_start+0x108
::umem_cache
- Prints the details about each of the umem caches
> ::umem_cache
ADDR NAME FLAG CFLAG BUFSIZE BUFTOTL
0003c008 umem_magazine_1 000e 80080000 8 0
0003c1c8 umem_magazine_3 000e 80080000 16 0
0003c388 umem_magazine_7 000e 80080000 32 0
0003c548 umem_magazine_15 000e 80080000 64 0
0003c708 umem_magazine_31 000e 80080000 128 0
0003c8c8 umem_magazine_47 000e 80080000 192 0
0003ca88 umem_magazine_63 000e 80080000 256 0
0003cc48 umem_magazine_95 000e 80080000 384 0
0003ce08 umem_magazine_143 000e 80080000 576 0
0003cfc8 umem_slab_cache 000e 80080000 28 170
0003d188 umem_bufctl_cache 000e 80080000 12 0
0003d348 umem_bufctl_audit_cache 000e 80080000 100 408
0003d508 umem_alloc_8 020f 80000000 8 0
0003d6c8 umem_alloc_16 020f 80000000 16 0
0003d888 umem_alloc_24 020f 80000000 24 204
0003da48 umem_alloc_32 020f 80000000 32 170
...snip...
[address]::umem_log
- Prints the umem transaction log for the application
> ::umem_log
CPU ADDR BUFADDR TIMESTAMP THREAD
0 0002e064 00055fb8 10475e3dd1c98 00000001
0 0002e000 00049fc0 10475e3d62050 00000001
0003483c 00000000 0 00000000
000348a0 00000000 0 00000000
00034904 00000000 0 00000000
... snip ...
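[address]::umem_verify
- Verifies the integrity of the umem caches, or of the single cache
at the given address (see the memory corruption example below)
address$<bufctl_audit
- Prints the contents of the umem_bufctl_audit structure as
defined in the /usr/include/umem_impl.h header file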
> 50000$<bufctl_audit
0x50000: next addr slab
0 49fc0 4bfb0
0x5000c: cache timestamp thread
3d888 23764722653000 1
0x5001c: lastlog contents stackdepth
2e000 0 5
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x44
libumem.so.1`malloc+0x2c
main+4
_start+0x108
Examples
The following basic examples will show how to use MDB in
conjunction with the libumem library to examine the history of an
application's memory transactions.
Traditional Memory Leak
In order to examine if an application has a memory leak, one can
execute the following steps to narrow down the section of the code
which is causing the leak.
- Verify the Solaris OS release. The libumem library is only
available on systems which are running the Solaris 9 OS, Update 3
and above.
%uname -a
SunOS fountainhead 5.9 Generic_112233-05
- Execute the application with the
libumem library interposed
and the appropriate debug flags set.
%(setenv UMEM_DEBUG default; setenv UMEM_LOGGING transaction; \
setenv LD_PRELOAD libumem.so.1; ./a.out)
- Use the gcore(1) command to get an application core to
analyze the application's memory transactions.
%ps -ef | grep a.out
user1 970 714 0 10:42:42 pts/4 0:00 ./a.out
%gcore 970
gcore: core.970 dumped
- Use MDB to analyze the core for memory leaks using the
commands described in the previous section.
%mdb core.970
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> ::umem_log
CPU ADDR BUFADDR TIMESTAMP THREAD
0 0002e0c8 00055fb8 159d27e121a0 00000001
0 0002e064 00055fb8 159d27e0fce8 00000001
0 0002e000 00049fc0 159d27da1748 00000001
00034904 00000000 0 00000000
00034968 00000000 0 00000000
... snip ...
Here we can see that there have been three transactions by thread #1 on
CPU #0.
> ::umalog
T-0.000000000 addr=55fb8 umem_alloc_32
libumem.so.1`umem_cache_free+0x4c
libumem.so.1`process_free+0x68
libumem.so.1`free+0x38
main+0x18
_start+0x108
T-0.000009400 addr=55fb8 umem_alloc_32
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x44
libumem.so.1`malloc+0x2c
main+0x10
_start+0x108
T-0.000461400 addr=49fc0 umem_alloc_24
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x44
libumem.so.1`malloc+0x2c
main+4
_start+0x108
The three transactions consist of one allocation from the 24 byte umem
cache, and one memory allocation and release from the 32 byte umem
cache. Note that the high-resolution timestamp output in the upper
left hand corner is relative to the last memory transaction initiated
by the application.
> ::findleaks
CACHE LEAKED BUFCTL CALLER
0003d888 1 00050000 libumem.so.1`malloc+0x0
----------------------------------------------------------------------
Total 1 buffer, 24 bytes
This shows that there is one 24 byte buffer which has been leaked.
> 00050000$<bufctl_audit
0x50000: next addr slab
0 49fc0 4bfb0
0x5000c: cache timestamp thread
3d888 23764722653000 1
0x5001c: lastlog contents stackdepth
2e000 0 5
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x44
libumem.so.1`malloc+0x2c
main+4
_start+0x108
We can find the stack trace for the allocation which resulted in the
memory leak by dumping the bufctl structure. The address of this
structure can be gathered from the previous ::findleaks output.
> 49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe
baddbbfe baddcafe feedface 11a7
50000 a115c8ed
Looking at the values within the buffer, we see that the size of the
allocation was 10 bytes. This can be calculated by subtracting 1 from
the redzone value of 0x11a7 (4519 decimal), dividing the result by 251,
and then subtracting 8 bytes.
%cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void test_sleep(int);

int main(void){
    int *test;
    test = malloc(10); // This is the memory allocation that is never freed!!
    test = malloc(20);
    free(test);
    test_sleep(24);
    return 0;
}

void test_sleep( int interval ){
    printf("Starting to sleep for %d seconds...\n", interval);
    sleep(interval);
    printf("Stopped sleeping...\n\n");
}
Once we look at the code of the executable, we can use the function in
the stack trace and the size of the allocation to determine the piece
of memory which has leaked.
Traditional Memory Corruption
The following example will list the steps used to examine an
application core for a memory corruption bug.
- Follow the first two steps listed above in the memory leak
example.
- Either analyze the core dump created by the application if it
aborted, or use gcore as seen above.
- Use MDB to analyze the application core for the memory
corruption using the MDB commands listed in a previous section.
%mdb core.1095
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> ::umem_verify
Cache Name Addr Cache Integrity
umem_magazine_1 3c008 clean
umem_magazine_3 3c1c8 clean
umem_magazine_7 3c388 clean
umem_magazine_15 3c548 clean
umem_magazine_31 3c708 clean
umem_magazine_47 3c8c8 clean
umem_magazine_63 3ca88 clean
umem_magazine_95 3cc48 clean
umem_magazine_143 3ce08 clean
umem_slab_cache 3cfc8 clean
umem_bufctl_cache 3d188 clean
umem_bufctl_audit_cache 3d348 clean
umem_alloc_8 3d508 clean
umem_alloc_16 3d6c8 clean
umem_alloc_24 3d888 1 corrupt buffer
... snip ...
Using the umem_verify command we can see that one of the umem caches
has a corrupted buffer.
> 3d888::umem_verify
Summary for cache 'umem_alloc_24'
buffer 49fc0 (allocated) has a corrupt redzone size encoding
This provides more detail about the type of corruption that has taken
place within the 24 byte umem cache.
> 49fc0/10X
0x49fc0: 18 3a10bfe8 0 1
2 3 4 1789
50000 a115c8ed
When we dump out the buffer, we can see that the size of the original
allocation was 16 bytes. This can be calculated by decoding the redzone
value of 0x1789 (6025 decimal): subtract 1, divide the result by 251,
and then subtract 8 bytes. Once we know the size of the allocation, we
can look for the 0xfeedface boundary 16 bytes from where the user data
section starts. Scrutinizing the buffer above reveals that the values
0 through 4 have been written starting at the beginning of the user
data section, and there is no redzone boundary tag (that is,
0xfeedface). We find the value 4 where the redzone value should be!
> 50000$<bufctl_audit
0x50000: next addr slab
0 49fc0 4bfb0
0x5000c: cache timestamp thread
3d888 31080154190400 1
0x5001c: lastlog contents stackdepth
2e000 0 5
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x44
libumem.so.1`malloc+0x2c
main+4
_start+0x108
Getting the stack trace for the last memory transaction will allow the
developer to narrow down where in the code the memory corruption is
taking place.
%cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void test_sleep(int);

int main(void){
    int *test;
    test = malloc(16);
    for(int i = 0; i <= 4; i++){ // CORRUPTION! The for loop should only
        test[i] = i;             // be traversed four times instead
    }                            // of five due to the buffer size
    test_sleep(24);
    return 0;
}

void test_sleep( int interval ){
    printf("Starting to sleep for %d seconds...\n", interval);
    sleep(interval);
    printf("Stopped sleeping...\n\n");
}
After looking at the code, it is clear that the memory corruption is
due to the condition in the for loop: with i <= 4 the loop writes a
fifth 4 byte element past the end of the 16 byte buffer.
References
1. The Slab Allocator: An Object-Caching Kernel Memory Allocator,
by Jeff Bonwick
2. Magazines and Vmem: Extending the Slab Allocator to Many CPUs and
Arbitrary Resources, by Jeff Bonwick and Jonathan Adams
DOC ID# 1833