Techniques for Debugging Memory Leaks in Android Native Processes

红狼博客

The code discussed here is based on Android 2.3.x.

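Android provides convenient memory leak information and tools (such as MAT) for Java programs, which makes leaks easy to find. For native processes written purely in C/C++, however, finding leaks is not so easy. Traditional C/C++ programs can use valgrind or certain code-checking tools. Fortunately, Google's bionic library provides a very useful API for hunting leaks: get_malloc_leak_info. With it, we can readily locate suspected leaks by obtaining backtraces.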

How the code works

We can set the memory debug level (debug_level) with adb shell setprop libc.debug.malloc 1. The levels are explained in more detail in the comments in bionic/libc/bionic/malloc_debug_common.c:

/* Handle to shared library where actual memory allocation is implemented.
 * This library is loaded and memory allocation calls are redirected there
 * when libc.debug.malloc environment variable contains value other than
 * zero:
 * 1  - For memory leak detections.
 * 5  - For filling allocated / freed memory with patterns defined by
 *      CHK_SENTINEL_VALUE, and CHK_FILL_FREE macros.
 * 10 - For adding pre-, and post- allocation stubs in order to detect
 *      buffer overruns.
 * Note that emulator's memory allocation instrumentation is not controlled by
 * libc.debug.malloc value, but rather by emulator, started with -memcheck
 * option. Note also, that if emulator has started with -memcheck option,
 * emulator's instrumented memory allocation will take over value saved in
 * libc.debug.malloc. In other words, if emulator has started with -memcheck
 * option, libc.debug.malloc value is ignored.
 * Actual functionality for debug levels 1-10 is implemented in
 * libc_malloc_debug_leak.so, while functionality for emulator's instrumented
 * allocations is implemented in libc_malloc_debug_qemu.so and can be run inside
 * the emulator only.
 */

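Depending on the debug level, the handle for the memory allocation functions points to a different set of implementations. As a result, allocation and deallocation use different function versions at different debug levels.
The process in detail: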

As the source comments explain, the __libc_init routine calls malloc_debug_init for initialization, which in turn calls malloc_init_impl (pthread_once guarantees that it runs only once per process).

malloc_init_impl opens the corresponding debug library, resolves the symbol malloc_debug_initialize (see line 366), and calls it (line 373).

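When debug_level is set to 1, 5, or 10, the library "/system/lib/libc_malloc_debug_leak.so" is opened. malloc_debug_initialize is implemented in bionic/libc/bionic/malloc_debug_leak.c, but it is just an empty function that returns 0. If the level is 20, the library opened is "/system/lib/libc_malloc_debug_qemu.so" instead.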

Next, a different implementation of the memory functions malloc/free/calloc/realloc/memalign is resolved for each debug_level.
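The selection step can be pictured as choosing a table of function pointers once at start-up. The following is a minimal sketch of that idea, not bionic's actual code: `DemoDispatch`, `demo_select_dispatch` and the `demo_*` allocators are hypothetical names, the standard `malloc`/`free` stand in for dlmalloc, and the "debug" allocator only zero-fills so that the two tables are distinguishable.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical dispatch table: one slot per allocator entry point. */
typedef struct {
    void *(*malloc_fn)(size_t);
    void  (*free_fn)(void *);
    const char *name;
} DemoDispatch;

static void *demo_plain_malloc(size_t n) { return malloc(n); }
static void  demo_plain_free(void *p)    { free(p); }

/* Stand-in for a debug allocator; in this sketch it only zero-fills. */
static void *demo_debug_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL) {
        memset(p, 0, n);
    }
    return p;
}
static void demo_debug_free(void *p) { free(p); }

static const DemoDispatch demo_default_table = {
    demo_plain_malloc, demo_plain_free, "default"
};
static const DemoDispatch demo_debug_table = {
    demo_debug_malloc, demo_debug_free, "debug"
};

/* Levels 1, 5 and 10 select the debug table; anything else the default. */
const DemoDispatch *demo_select_dispatch(int debug_level)
{
    switch (debug_level) {
    case 1:
    case 5:
    case 10:
        return &demo_debug_table;
    default:
        return &demo_default_table;
    }
}
```

Callers then allocate through the selected table (for example `demo_select_dispatch(1)->malloc_fn(8)`), so the rest of the program does not need to know which variant is active.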

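For debug_level 1, 5, and 10, the various versions of malloc/free/calloc/realloc/memalign are implemented in bionic/libc/bionic/malloc_debug_leak.c. For example, at debug_level 5, malloc fills newly allocated memory with 0xeb, and free fills the memory with 0xef before releasing it.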
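The fill behaviour at level 5 can be illustrated with a small stand-alone sketch. This is an approximation under stated assumptions rather than the bionic source: `demo_fill_malloc`, `demo_fill_free` and `DemoFillHeader` are hypothetical names, and because portable C has no way to ask for a block's usable size, the sketch stores the requested size in a small header of its own.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_FILL_ALLOC 0xeb  /* pattern written into fresh allocations */
#define DEMO_FILL_FREE  0xef  /* pattern written just before release */

/* Header that remembers the payload size; the union keeps the payload
 * that follows it suitably aligned (an assumption of this sketch). */
typedef union {
    size_t size;
    long double align_ld;
    void *align_ptr;
} DemoFillHeader;

void *demo_fill_malloc(size_t bytes)
{
    DemoFillHeader *h = (DemoFillHeader *)malloc(sizeof(DemoFillHeader) + bytes);
    if (h == NULL) {
        return NULL;
    }
    h->size = bytes;
    /* Reads of uninitialized memory now see 0xeb instead of random data. */
    memset(h + 1, DEMO_FILL_ALLOC, bytes);
    return h + 1;
}

void demo_fill_free(void *mem)
{
    if (mem == NULL) {
        return;
    }
    DemoFillHeader *h = (DemoFillHeader *)mem - 1;
    /* Stale pointers into the block will observe 0xef after the free. */
    memset(mem, DEMO_FILL_FREE, h->size);
    free(h);
}
```

Recognisable patterns such as 0xebebebeb or 0xefefefef in a crash dump then hint at use of uninitialized or already freed memory.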

When debug_level is 1 (memory leak debugging), the implementation records a backtrace for every allocation:

void* leak_malloc(size_t bytes)
{
    // allocate enough space infront of the allocation to store the pointer for
    // the alloc structure. This will making free'ing the structer really fast!

    // 1. allocate enough memory and include our header
    // 2. set the base pointer to be right after our header

    void* base = dlmalloc(bytes + sizeof(AllocationEntry));
    if (base != NULL) {
        pthread_mutex_lock(&gAllocationsMutex);

        intptr_t backtrace[BACKTRACE_SIZE];
        size_t numEntries = get_backtrace(backtrace, BACKTRACE_SIZE);

        AllocationEntry* header = (AllocationEntry*)base;
        header->entry = record_backtrace(backtrace, numEntries, bytes);
        header->guard = GUARD;

        // now increment base to point to after our header.
        // this should just work since our header is 8 bytes.
        base = (AllocationEntry*)base + 1;

        pthread_mutex_unlock(&gAllocationsMutex);
    }

    return base;
}

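This version of malloc allocates extra space in front of the requested bytes to hold an AllocationEntry. After the allocation succeeds, it declares an array of 32 elements (BACKTRACE_SIZE) to hold the call stack and calls get_backtrace to capture it, that is, to store the address of each frame in the backtrace array. It then calls record_backtrace to record that call stack and points the entry member of the AllocationEntry at the result. record_backtrace looks the stack up by hash value in the global call-stack table gHashTable; if no matching entry is found, it creates one and adds it to the table. Finally, base is advanced past the header, and the resulting address is what the caller receives as the allocated memory.
In short, this malloc additionally records the call stack of each allocation: the header placed in front of each block stores the entry used to find the call-stack information in the hash table.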
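The header-in-front-of-the-payload technique can be shown in isolation. The sketch below is a simplified illustration, not the bionic implementation: `DemoHeader`, `DEMO_GUARD` and the `demo_*` functions are hypothetical, `malloc`/`free` replace dlmalloc, and the `entry` pointer is an opaque value supplied by the caller instead of a real backtrace record.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_GUARD 0x48616c6cu  /* arbitrary marker value for this sketch */

typedef struct {
    void    *entry;  /* would point at the shared backtrace record */
    uint32_t guard;  /* marks the block as one allocated by this wrapper */
} DemoHeader;

/* Allocate header + payload in one block and return the payload address. */
void *demo_tracked_malloc(size_t bytes, void *entry)
{
    DemoHeader *h = (DemoHeader *)malloc(sizeof(DemoHeader) + bytes);
    if (h == NULL) {
        return NULL;
    }
    h->entry = entry;
    h->guard = DEMO_GUARD;
    return h + 1;  /* the payload starts right after the header */
}

/* Step back from the payload to its header; NULL if the guard is wrong. */
DemoHeader *demo_header_of(void *mem)
{
    DemoHeader *h = (DemoHeader *)mem - 1;
    return (h->guard == DEMO_GUARD) ? h : NULL;
}

void demo_tracked_free(void *mem)
{
    if (mem == NULL) {
        return;
    }
    DemoHeader *h = demo_header_of(mem);
    if (h != NULL) {
        free(h);  /* free the whole block, starting at the header */
    }
}
```

Because the header sits at a fixed offset before the payload, the free path finds the bookkeeping record with one pointer subtraction and no table lookup.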

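Next, look at record_backtrace. Before analyzing its code, consider these structures (from malloc_debug_common.h):
typedef struct HashEntry HashEntry;
struct HashEntry {
    size_t slot;           // index into the slots array of HashTable
    HashEntry* prev;       // previous entry in the same slot's list
    HashEntry* next;       // next entry in the same slot's list (new entries are inserted at the head)
    size_t numEntries;     // number of addresses in the call stack
    // fields above "size" are NOT sent to the host
    size_t size;           // number of bytes requested by this malloc call
    size_t allocations;    // number of live allocations made from this call site
    intptr_t backtrace[0]; // the call stack
};

typedef struct HashTable HashTable;
struct HashTable {
    size_t count;
    HashEntry* slots[HASHTABLE_SIZE]; // HASHTABLE_SIZE = 1543
};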
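Each process has one global variable, gHashTable, of type HashTable, which records the call stacks that ended in a malloc call. It contains a slots array of HASHTABLE_SIZE (1543) pointers, each the head of a list of HashEntry items, and a count member that records how many HashEntry items the table currently holds. A hash value is computed from backtrace and numEntries, and taking that hash modulo HASHTABLE_SIZE gives the index into slots, so an entry can be located quickly from its own contents.
The other structure is HashEntry. Because it has prev and next pointers, entries form a doubly linked list: entries whose hashes map to the same slot are chained together, with each new entry inserted at the head of the list. The first member, slot, is the entry's own index in the slots array, derived from the hash. The last member, backtrace[0], holds the call-stack addresses; how many there are is recorded in numEntries. size is the size of the allocation, and allocations counts how many live allocations share the same call path.
The relationship between the two structures is shown in the figure below:
[Figure: relationship between HashTable and HashEntry]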

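When leak_malloc calls record_backtrace to record the stack, a hash value is first computed from backtrace and numEntries, and the modulo operation then gives the slot index in gHashTable. Next it checks whether a matching entry already exists, that is, a HashEntry with the same allocation size, the same call path, and the same number of recorded addresses. If one exists, its allocations counter is simply incremented. Otherwise a new entry is created: memory is allocated for the HashEntry (see line 151; note that the space for the last member, backtrace, depends on numEntries), and the call stack is copied into that member. Finally, the table's entry count is incremented.
record_backtrace thus adds the backtrace information to the global table, either by creating a new HashEntry or by incrementing the allocations count of an existing one.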

static HashEntry* record_backtrace(intptr_t* backtrace, size_t numEntries, size_t size)
{
    size_t hash = get_hash(backtrace, numEntries); // hash of the call stack
    size_t slot = hash % HASHTABLE_SIZE;           // modulo gives the index into HashTable.slots

    if (size & SIZE_FLAG_MASK) {
        debug_log("malloc_debug: allocation %zx exceeds bit width\n", size);
        abort();
    }

    if (gMallocLeakZygoteChild)
        size |= SIZE_FLAG_ZYGOTE_CHILD;

    // look the entry up in the global table, i.e. check whether this
    // call path has already allocated memory of this size
    HashEntry* entry = find_entry(&gHashTable, slot, backtrace, numEntries, size);
    if (entry != NULL) {
        entry->allocations++; // seen before: just bump the counter
    } else {
        // create a new entry
        // backtrace[0] is a variable-length trailing array, hence the
        // extra numEntries * sizeof(intptr_t) bytes
        entry = (HashEntry*)dlmalloc(sizeof(HashEntry) + numEntries*sizeof(intptr_t));
        if (!entry)
            return NULL;
        entry->allocations = 1;
        entry->slot = slot;
        entry->prev = NULL;
        entry->next = gHashTable.slots[slot];
        entry->numEntries = numEntries;
        entry->size = size;

        // copy the call stack into the memory that follows the struct
        memcpy(entry->backtrace, backtrace, numEntries * sizeof(intptr_t));

        // link the new entry in at the head of this slot's list
        gHashTable.slots[slot] = entry;

        if (entry->next != NULL) {
            entry->next->prev = entry;
        }

        // we just added an entry, increase the size of the hashtable
        gHashTable.count++;
    }

    return entry;
}
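The two helpers that record_backtrace relies on, get_hash and find_entry, are not listed in this post. The sketch below shows one plausible shape for them so the lookup is easier to follow. It is an assumption for illustration only: `demo_hash`, `demo_find` and `DemoEntry` are hypothetical names, the multiply-by-33 hash is simply a common choice and is not claimed to be bionic's, and the backtrace is stored in a fixed array of 32 to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_TABLE_SIZE 1543

typedef struct DemoEntry {
    struct DemoEntry *next;
    size_t numEntries;
    size_t size;
    size_t allocations;
    intptr_t backtrace[32];
} DemoEntry;

/* Fold all frame addresses into one value (assumed hash function). */
size_t demo_hash(const intptr_t *backtrace, size_t numEntries)
{
    size_t hash = 0;
    for (size_t i = 0; i < numEntries; i++) {
        hash = hash * 33 + ((size_t)backtrace[i] >> 2);
    }
    return hash;
}

/* Walk one slot's chain looking for the same size, depth and frames. */
DemoEntry *demo_find(DemoEntry **slots, size_t slot,
                     const intptr_t *backtrace, size_t numEntries, size_t size)
{
    for (DemoEntry *e = slots[slot]; e != NULL; e = e->next) {
        if (e->size == size && e->numEntries == numEntries &&
            memcmp(e->backtrace, backtrace, numEntries * sizeof(intptr_t)) == 0) {
            return e;
        }
    }
    return NULL;
}
```

Comparing size as well as the frames means that two allocations of different sizes from the same call path end up in separate entries, which matches the description above.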

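leak_free decrements the allocation count of the matching call-stack entry in the global hash table and removes the entry once the count reaches zero (see line 550):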

void leak_free(void* mem)
{
    if (mem != NULL) {
        pthread_mutex_lock(&gAllocationsMutex);

        // check the guard to make sure it is valid
        AllocationEntry* header = (AllocationEntry*)mem - 1;

        if (header->guard != GUARD) {
            // could be a memaligned block
            if (((void**)mem)[-1] == MEMALIGN_GUARD) {
                mem = ((void**)mem)[-2];
                header = (AllocationEntry*)mem - 1;
            }
        }

        if (header->guard == GUARD || is_valid_entry(header->entry)) {
            // decrement the allocations
            HashEntry* entry = header->entry;
            entry->allocations--;
            if (entry->allocations <= 0) {
                remove_entry(entry);
                dlfree(entry);
            }

            // now free the memory!
            dlfree(header);
        } else {
            debug_log("WARNING bad header guard: '0x%x'! and invalid entry: %p\n",
                    header->guard, header->entry);
        }

        pthread_mutex_unlock(&gAllocationsMutex);
    }
}

Therefore, the entries that remain in the global table are the call stacks of malloc calls whose memory has been allocated but not yet freed.
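The remove_entry call used by leak_free is not listed here. As a rough sketch of what unlinking an entry from a per-slot doubly linked list involves, consider the following; `DemoNode`, `DemoTable`, `demo_insert_head` and `demo_remove` are hypothetical names, and the table is shrunk to 8 slots for brevity.

```c
#include <stddef.h>

#define DEMO_SLOTS 8

typedef struct DemoNode {
    size_t slot;
    struct DemoNode *prev;
    struct DemoNode *next;
} DemoNode;

typedef struct {
    size_t count;
    DemoNode *slots[DEMO_SLOTS];
} DemoTable;

/* Insert at the head of the slot's list, as record_backtrace does. */
void demo_insert_head(DemoTable *t, DemoNode *n, size_t slot)
{
    n->slot = slot;
    n->prev = NULL;
    n->next = t->slots[slot];
    if (n->next != NULL) {
        n->next->prev = n;
    }
    t->slots[slot] = n;
    t->count++;
}

/* Unlink a node; if it was the head, the slot must point at its successor. */
void demo_remove(DemoTable *t, DemoNode *n)
{
    if (n->prev != NULL) {
        n->prev->next = n->next;
    } else {
        t->slots[n->slot] = n->next;
    }
    if (n->next != NULL) {
        n->next->prev = n->prev;
    }
    t->count--;
}
```

Storing the slot index inside each node is what lets removal fix up the list head without recomputing the hash.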

get_malloc_leak_info

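get_malloc_leak_info retrieves the memory leak information. Call stacks are recorded at allocation time and removed when the memory is freed, so whatever remains is very likely the source of a leak. How do we get at the global call-stack hash table? The file malloc_debug_common.c provides get_malloc_leak_info for exactly this purpose.
get_malloc_leak_info takes five parameters, each the address of a variable that the function fills in before returning. As its code comments explain:
*info points to a single block of memory allocated inside the function, whose size is overallSize;
the block consists of a number of records, each infoSize bytes long. A record has the same layout as HashEntry from the size member onward: the first member is the size requested from malloc, the second is allocations (the number of allocations with the same call stack), and the last is backtrace, with room for 32 (BACKTRACE_SIZE) addresses. The block that *info points to therefore contains overallSize/infoSize records. Note that the backtrace array in HashEntry is sized to the actual number of frames, whereas here every record reserves 32 slots, and unused slots are set to 0;
totalMemory is the total size of all memory allocated by malloc and not yet freed;
the last parameter, backtraceSize, is 32 (BACKTRACE_SIZE).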

get_malloc_leak_info first checks that the arguments are valid and whether the global table contains any entries:
void get_malloc_leak_info(uint8_t** info, size_t* overallSize,
        size_t* infoSize, size_t* totalMemory, size_t* backtraceSize)
{
    // don't do anything if we have invalid arguments
    if (info == NULL || overallSize == NULL || infoSize == NULL ||
            totalMemory == NULL || backtraceSize == NULL) {
        return;
    }
    *totalMemory = 0;

    pthread_mutex_lock(&gAllocationsMutex);

    if (gHashTable.count == 0) {
        *info = NULL;
        *overallSize = 0;
        *infoSize = 0;
        *backtraceSize = 0;
        goto done;
    }

It then allocates an array of pointers sized to the number of entries in the global table and fills it with pointers to every HashEntry in gHashTable, accumulating along the way the total amount of allocated-but-unfreed memory, totalMemory, to return to the caller. The last output parameter, backtraceSize, is the number of addresses per call stack, which is always BACKTRACE_SIZE, i.e. 32.
    void** list = (void**)dlmalloc(sizeof(void*) * gHashTable.count);

    // get the entries into an array to be sorted
    int index = 0;
    int i;
    for (i = 0 ; i < HASHTABLE_SIZE ; i++) { // walk every slot of gHashTable
        HashEntry* entry = gHashTable.slots[i];
        while (entry != NULL) { // store each live entry in list
            list[index] = entry;
            *totalMemory = *totalMemory + // accumulate the total allocated memory
                ((entry->size & ~SIZE_FLAG_MASK) * entry->allocations);
            index++;
            entry = entry->next; // next entry in the same slot
        }
    } // after this loop, list holds pointers to all entries in the global table

    // XXX: the protocol doesn't allow variable size for the stack trace (yet)
    // each record is two size_t values (size and allocations from HashEntry)
    // followed by BACKTRACE_SIZE (32) backtrace addresses
    *infoSize = (sizeof(size_t) * 2) + (sizeof(intptr_t) * BACKTRACE_SIZE);
    *overallSize = *infoSize * gHashTable.count; // memory needed for all records
    *backtraceSize = BACKTRACE_SIZE;

Finally, memory is allocated for all the call-stack records, which is what *info points to, and the call-stack information in gHashTable (that is, each HashEntry in list from its size member onward) is copied into it.

    // now get A byte array big enough for this
    *info = (uint8_t*)dlmalloc(*overallSize); // room for all records, including the two size_t values of each

    if (*info == NULL) { // allocation failed: out of memory
        *overallSize = 0;
        goto out_nomem_info;
    }

    qsort((void*)list, gHashTable.count, sizeof(void*), hash_entry_compare); // sort the entries

    uint8_t* head = *info;
    const int count = gHashTable.count;
    for (i = 0 ; i < count ; i++) {
        HashEntry* entry = list[i];
        size_t entrySize = (sizeof(size_t) * 2) + (sizeof(intptr_t) * entry->numEntries);
        if (entrySize < *infoSize) {
            /* we're writing less than a full entry, clear out the rest */
            memset(head + entrySize, 0, *infoSize - entrySize); // zero the unused backtrace slots
        } else {
            /* make sure the amount we're copying doesn't exceed the limit */
            entrySize = *infoSize;
        }
        // copy size, allocations and the recorded backtrace addresses into this record
        memcpy(head, &(entry->size), entrySize);
        head += *infoSize; // advance to the next record
    }

out_nomem_info:
    dlfree(list);

done:
    pthread_mutex_unlock(&gAllocationsMutex);
}

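When a program finishes, all of its memory should normally have been freed, so at that point we can call get_malloc_leak_info to obtain the call-stack entries that have not been released. In principle, these are the leaks. In practice, however, some memory may legitimately still be in use at the moment get_malloc_leak_info runs.
In addition, the process under test is sometimes a daemon that never exits, so some memory is meant to stay allocated for its whole lifetime. In that case we can compare snapshots taken in the same state at two points in time: call get_malloc_leak_info once when the process has just entered its idle state, perform some operations, and call it again after the process has returned to idle. Comparing the two results, the call-stack entries that appear only in the second snapshot are the suspected leaks.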
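The snapshot comparison can also be done in code instead of with a text diff. The following is a small sketch under stated assumptions: `DemoRecord` mirrors the record layout described above with a fixed 32-slot backtrace, `demo_diff_snapshots` is a hypothetical helper, and in addition to brand-new call stacks it also reports call stacks whose live allocation count has grown, which is a design choice of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BT 32

typedef struct {
    size_t size;                /* bytes per allocation */
    size_t dups;                /* live allocations from this call stack */
    intptr_t backtrace[DEMO_BT];
} DemoRecord;

static int demo_same_site(const DemoRecord *a, const DemoRecord *b)
{
    return a->size == b->size &&
           memcmp(a->backtrace, b->backtrace, sizeof(a->backtrace)) == 0;
}

/* Write to out every record of `after` that is new or has grown since
 * `before`, with dups set to the growth. out needs room for nAfter
 * records. Returns the number of records written. */
size_t demo_diff_snapshots(const DemoRecord *before, size_t nBefore,
                           const DemoRecord *after, size_t nAfter,
                           DemoRecord *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nAfter; i++) {
        size_t baseline = 0;
        for (size_t j = 0; j < nBefore; j++) {
            if (demo_same_site(&before[j], &after[i])) {
                baseline = before[j].dups;
                break;
            }
        }
        if (after[i].dups > baseline) {
            out[n] = after[i];
            out[n].dups = after[i].dups - baseline;
            n++;
        }
    }
    return n;
}
```

The nested loop is quadratic, which is acceptable for the few thousand records a single process typically produces; sorting both snapshots first would be the obvious refinement.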
Sample code that uses get_malloc_leak_info:

typedef struct {
    size_t size;          // bytes allocated
    size_t dups;          // number of allocations sharing this call stack
    intptr_t * backtrace; // call-stack addresses
} AllocEntry;

// buffer, SIZE and result belong to the enclosing function, which is not shown here
uint8_t *info = NULL;
size_t overallSize = 0;
size_t infoSize = 0;
size_t totalMemory = 0;
size_t backtraceSize = 0;

get_malloc_leak_info(&info, &overallSize, &infoSize, &totalMemory, &backtraceSize);
LOGI("returned from get_malloc_leak_info, info=0x%x, overallSize=%d, infoSize=%d, totalMemory=%d, backtraceSize=%d", (int)info, overallSize, infoSize, totalMemory, backtraceSize);
if (info) {
    uint8_t *ptr = info;
    size_t count = overallSize / infoSize;

    snprintf(buffer, SIZE, " Allocation count %i\n", count);
    result.append(buffer);
    snprintf(buffer, SIZE, " Total memory %i\n", totalMemory);
    result.append(buffer);

    AllocEntry * entries = new AllocEntry[count]; // one element per record

    for (size_t i = 0; i < count; i++) { // unpack each record into the AllocEntry array
        // Each entry should be size_t, size_t, intptr_t[backtraceSize]
        AllocEntry *e = &entries[i];

        e->size = *reinterpret_cast<size_t *>(ptr);
        ptr += sizeof(size_t);

        e->dups = *reinterpret_cast<size_t *>(ptr);
        ptr += sizeof(size_t);

        e->backtrace = reinterpret_cast<intptr_t *>(ptr);
        ptr += sizeof(intptr_t) * backtraceSize;
    }

    // ... use entries here (for example, print them or write them to a file) ...

    // e->backtrace points into info, so release info only after entries is no longer needed
    delete[] entries;
    free_malloc_leak_info(info);
}

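Debugging steps:
See http://freepine.blogspot.com/2010/02/analyze-memory-leak-of-android-native.html
Download its patch package and its Python tool package.
Apply the code patch under frameworks/base in the Android source tree, rebuild the image, and flash it to the device. A binary named memorydumper will then be present in /system/bin/. The patch adds a service to the mediaserver process, which the binary uses through Binder IPC; the service calls get_malloc_leak_info to obtain information about unfreed memory.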

Step 1: Set the debug level and restart the mediaserver process.
adb shell setprop libc.debug.malloc 1
adb shell ps mediaserver
adb shell kill <mediaserver_pid>

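The goal is to make mediaserver use the leak_malloc versions. After the debug level is set, kill the mediaserver process; Android restarts it automatically. On restart it initializes libc again, and the allocation functions are routed through the handle to leak_malloc and leak_free.
Step 2: In some initial state, for example before using the Camera application, run memorydumper to record the memory that is unfreed at that moment: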
$ adb shell /system/bin/memorydumper
$ adb pull /data/memstatus_<mediaserver_pid>.0 .

Step 3: Perform some operations, such as taking photos, recording video, or playing a few songs, then exit those applications.

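Step 4: Run memorydumper again to record the memory that is unfreed at this point, and use a diff tool to compare the result with the one from Step 2. The differences are where leaks are suspected. The entries in the first dump are probably memory that was legitimately still in use at that moment, and the comparison filters them out.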
$ adb pull /data/memstatus_<mediaserver_pid>.1 .
$ diff memstatus_<mediaserver_pid>.0 memstatus_<mediaserver_pid>.1 >diff_0_1

Step 5: Get the maps file. It gives the address range occupied by each .so library, which is needed to resolve the call-stack addresses to function symbols.
$ adb pull /proc/<mediaserver_pid>/maps your_path

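Step 6: Run the Python script from the reference link:
./addr2func.py --root-dir=~/u8500-android-2.3_v4.30 --maps-file=maps --product=u8500 diff_0_1 > memleak.backtrace
The script parses the maps file to find the address range that each library occupies, determines which library each call-stack address belongs to, and then runs the following command to obtain the compiler-mangled function name, the source file, and the line number:
[root-dir]/prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin/arm-eabi-addr2line -f -e [root-dir]/out/target/product/[product]/symbols/[libname] callstack_address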

It then uses [root-dir]/prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin/arm-eabi-c++filt to demangle the function names, so that they match the names in the source code and are easier to recognize.
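To make the address-resolution step concrete, here is a small sketch of how one line of the maps file can be parsed and a call-stack address turned into a library-relative address for addr2line. It is not taken from addr2func.py: `DemoMapping`, `demo_parse_maps_line` and `demo_relative_address` are hypothetical names, and the sketch assumes the library's link-time base address is 0, which may not hold for prelinked libraries.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned long start;   /* first address of the mapping */
    unsigned long end;     /* one past the last address */
    unsigned long offset;  /* file offset the mapping starts at */
    char perms[5];         /* e.g. "r-xp" */
    char path[256];        /* backing file, e.g. /system/lib/libc.so */
} DemoMapping;

/* Parse one line of /proc/<pid>/maps. Returns 1 on success, 0 when the
 * line is malformed or the mapping has no backing file. */
int demo_parse_maps_line(const char *line, DemoMapping *m)
{
    int n = sscanf(line, "%lx-%lx %4s %lx %*s %*s %255s",
                   &m->start, &m->end, m->perms, &m->offset, m->path);
    return n == 5;
}

/* If addr lies inside the mapping, store the library-relative address
 * in *rel and return 1; otherwise return 0. */
int demo_relative_address(const DemoMapping *m, unsigned long addr,
                          unsigned long *rel)
{
    if (addr < m->start || addr >= m->end) {
        return 0;
    }
    *rel = addr - m->start + m->offset;
    return 1;
}
```

For a given backtrace address, the mapping whose range contains it names the library, and the relative address is the value to pass to arm-eabi-addr2line together with the matching file under the symbols directory.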

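Snapshots from an example run:
The screenshot below shows the call-stack addresses from the first memorydumper run:

The screenshot below shows the call-stack addresses from the second memorydumper run:

The differences between the two after running diff:

The call stacks after running addr2func: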

Permalink: http://www.redwolf-blog.com/?p=1233
