
A look at Linux memory management: memory error detection techniques


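Common memory access errors on Linux:
  1. Out-of-bounds access
  2. Use after free
  3. Double free
  4. Memory leak
  5. Stack overflow
  • Different tools focus on different problems; this chapter covers three of them: slub_debug, kmemleak, and kasan.
  • kmemleak focuses on finding memory leaks.
  • slub_debug and kasan overlap to some extent. Some slub_debug problems only surface once slabinfo is run; kasan is faster and reports every problem on its own, but it needs a newer GCC (gcc 4.9.2 or gcc 5.0).
Preparing the test environment
  • Update the kernel to v4.4, then build it: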

git clone https://github.com/arnoldlu/linux.git -b running_kernel_4.4

export ARCH=arm64

export CROSS_COMPILE=aarch64-linux-gnu-

make defconfig

make Image -j4 ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-

slub_debug
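  • Keywords: Red Zone, Padding, Object Layout.
  • In the Linux kernel, small allocations go largely through the slab/slub allocator; slub_debug adds lightweight memory checking to it.
  • The places where memory use most often goes wrong are:
  1. Use after free
  2. Out-of-bounds access
  3. Double free
Building a kernel with slub_debug support
  • First enable General setup -> Enable SLUB debugging support, then select Kernel hacking -> Memory Debugging -> SLUB debugging on by default.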

CONFIG_SLUB=y

CONFIG_SLUB_DEBUG=y

CONFIG_SLUB_DEBUG_ON=y

CONFIG_SLUB_STATS=y

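Test environment: slabinfo, slub.ko
  • slub.ko simulates invalid memory accesses. Some are reported directly; others only show up after running slabinfo -v.
  • In the tools/vm directory, run the following command to build the slabinfo executable. Put it in the _install directory so that it is packed into the zImage.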

make slabinfo CFLAGS=-static ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-

  • Copy the built slabinfo into sbin.
  • The three test modules used below are at: https://github.com/arnoldlu/linux/tree/running_kernel_4.4/test_code/slub_debug
  • Run make.sh in the test_code/slub_debug directory and copy slub.ko/slub2.ko/slub3.ko into data.
Running the tests
  • Start QEMU:

qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -smp 2 -m 2048 -kernel arch/arm64/boot/Image --append "rdinit=/linuxrc console=ttyAMA0 loglevel=8 slub_debug=UFPZ" -nographic

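F: runs sanity checks when an object is freed.

Z: stands for Red Zone.

P: stands for Poison.

U: records user information for each slab object; when enabled, the stack backtraces of the allocation and of the free are shown.

  • With the SLAB_STORE_USER option turned on in slub_debug, the backtrace of the offending code is clearly visible.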
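The four option letters can be read as a small bit mask. The following is a minimal user-space sketch of that idea, not the kernel's parser: the `DBG_*` names and the function `parse_debug_flags` are hypothetical and exist only for this illustration (the kernel uses its own `SLAB_*` flag values).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's SLAB_* values. */
enum dbg_flag {
    DBG_SANITY     = 1 << 0, /* F: sanity checks, e.g. at free time */
    DBG_RED_ZONE   = 1 << 1, /* Z: guard bytes after each object */
    DBG_POISON     = 1 << 2, /* P: fill free objects with a known pattern */
    DBG_STORE_USER = 1 << 3, /* U: record allocation and free call sites */
};

/*
 * Translate an option string such as "UFPZ" into a bit mask.
 * Returns -1 if a letter is not one of the four described above.
 */
int parse_debug_flags(const char *opts)
{
    int flags = 0;

    assert(opts != NULL);
    for (; *opts != '\0'; opts++) {
        switch (*opts) {
        case 'F': flags |= DBG_SANITY; break;
        case 'Z': flags |= DBG_RED_ZONE; break;
        case 'P': flags |= DBG_POISON; break;
        case 'U': flags |= DBG_STORE_USER; break;
        default:  return -1;
        }
    }
    return flags;
}
```

With this sketch, `parse_debug_flags("UFPZ")` sets all four bits, which corresponds to the `slub_debug=UFPZ` string passed on the QEMU command line.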
Test results
  • Out-of-bounds accesses are reported as Redzone overwritten or Object padding overwritten.
  • A double free is reported as Object already free. A use after free is reported as Poison overwritten.

Redzone overwritten

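  • Run insmod data/slub.ko, then check the result with slabinfo -v.

static void create_slub_error(void)
{
    buf = kmalloc(32, GFP_KERNEL);
    if (buf) {
        /*
         * Only 32 bytes are requested, but the object comes from the
         * 64-byte cache, so 80 bytes are written to trigger the error.
         * All 80 bytes starting at buf are still overwritten.
         */
        memset(buf, 0x55, 80);
    }
}

  • Although kmalloc requests a 32-byte buffer, the kernel serves it from kmalloc-64. A memset of, say, 36 bytes is therefore not reported; the length has to exceed 64, as the 80 used above does.
  • A slub debug report consists of four main parts: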

=============================================================================
BUG kmalloc-64 (Tainted: G O ): Redzone overwritten -------- 1. Problem description: the slab cache is kmalloc-64 and the error is Redzone overwritten.
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: 0xeddb3640-0xeddb3643. First byte 0x55 instead of 0xcc -------- 1.1 Start and end address of the corruption, 4 bytes in total.
INFO: Allocated in 0x55555555 age=1766 cpu=0 pid=771 -------- 1.2 Allocation backtrace of the slab object.
    0x55555555
    0xbf002014
    do_one_initcall+0x90/0x1d8
    do_init_module+0x60/0x38c
    load_module+0x1bac/0x1e94
    SyS_init_module+0x14c/0x15c
    ret_fast_syscall+0x0/0x3c
INFO: Freed in do_one_initcall+0x78/0x1d8 age=1766 cpu=0 pid=771 -------- 1.3 Free backtrace of the slab object.
    do_one_initcall+0x78/0x1d8
    do_init_module+0x60/0x38c
    load_module+0x1bac/0x1e94
    SyS_init_module+0x14c/0x15c
    ret_fast_syscall+0x0/0x3c
INFO: Slab 0xefdb5660 objects=16 used=14 fp=0xeddb3700 flags=0x0081 -------- 1.4 Address of the slab and other information.
INFO: Object 0xeddb3600 @offset=1536 fp=0x55555555 -------- 1.5 Start of the current object and related information.
Bytes b4 eddb35f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ -------- 2. Contents of the faulty slab object. 2.1 Some bytes preceding the object.
Object eddb3600: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU -------- 2.2 Object contents, all 0x55.
Object eddb3610: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object eddb3620: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object eddb3630: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Redzone eddb3640: 55 55 55 55 UUUU -------- 2.3 Red zone contents; this is where the problem lies.
Padding eddb36e8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ -------- 2.4 Padding contents, added to align the object.
Padding eddb36f8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
CPU: 2 PID: 773 Comm: slabinfo Tainted: G B O 4.4.0 #93 -------- 3. Backtrace of the check that found the problem; here it was triggered by slabinfo.
Hardware name: ARM-Versatile Express
[<c0016588>] (unwind_backtrace) from [<c0013070>] (show_stack+0x10/0x14)
[<c0013070>] (show_stack) from [<c0244130>] (dump_stack+0x78/0x88)
[<c0244130>] (dump_stack) from [<c00e1874>] (check_bytes_and_report+0xd0/0x10c)
[<c00e1874>] (check_bytes_and_report) from [<c00e1a14>] (check_object+0x164/0x234)
[<c00e1a14>] (check_object) from [<c00e29bc>] (validate_slab_slab+0x198/0x1bc)
[<c00e29bc>] (validate_slab_slab) from [<c00e578c>] (validate_store+0xac/0x190)
[<c00e578c>] (validate_store) from [<c0146780>] (kernfs_fop_write+0xb8/0x1b4)
[<c0146780>] (kernfs_fop_write) from [<c00ebfc4>] (__vfs_write+0x1c/0xd8)
[<c00ebfc4>] (__vfs_write) from [<c00ec808>] (vfs_write+0x90/0x170)
[<c00ec808>] (vfs_write) from [<c00ed008>] (SyS_write+0x3c/0x90)
[<c00ed008>] (SyS_write) from [<c000f3c0>] (ret_fast_syscall+0x0/0x3c)
FIX kmalloc-64: Restoring 0xeddb3640-0xeddb3643=0xcc -------- 4. How the problem is repaired: the 4 bytes are restored to 0xcc.
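The "First byte 0x55 instead of 0xcc" line and the reported address range can be pictured with a small user-space model. This is only a sketch under simplifying assumptions: the layout (64-byte object, 4-byte red zone, 12 further bytes standing in for metadata) and the names `check_bytes` and `simulate_overrun` are made up for illustration; the kernel's own check lives in functions such as check_bytes_and_report seen in the backtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RED_ACTIVE 0xcc /* red zone fill value seen in the report */

/* Result of scanning a region for bytes that differ from the expected fill. */
struct corruption {
    int found;           /* 1 if any byte differs */
    size_t first;        /* offset of the first bad byte */
    size_t last;         /* offset of the last bad byte */
    unsigned char value; /* value found at 'first' */
};

/* Scan region[0..len) and record the span of bytes that are not 'expected'. */
struct corruption check_bytes(const unsigned char *region, size_t len,
                              unsigned char expected)
{
    struct corruption c = {0, 0, 0, 0};
    size_t i;

    assert(region != NULL);
    for (i = 0; i < len; i++) {
        if (region[i] == expected)
            continue;
        if (!c.found) {
            c.found = 1;
            c.first = i;
            c.value = region[i];
        }
        c.last = i;
    }
    return c;
}

/*
 * Model of the test module: a 64-byte object followed by a 4-byte red zone
 * and 12 bytes of stand-in metadata. Writes write_len bytes of 0x55 from the
 * start of the object and returns the verdict on the red zone.
 */
struct corruption simulate_overrun(size_t write_len)
{
    unsigned char slot[64 + 4 + 12];

    assert(write_len <= sizeof(slot));
    memset(slot, 0, sizeof(slot));
    memset(slot + 64, RED_ACTIVE, 4);
    memset(slot, 0x55, write_len); /* the buggy memset */
    return check_bytes(slot + 64, 4, RED_ACTIVE);
}
```

In this model a 36-byte write leaves the red zone intact, while an 80-byte write turns all four red zone bytes into 0x55, which matches the 4-byte range in the report.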

Object padding overwritten

void create_slub_error(void)
{
    buf = kmalloc(32, GFP_KERNEL);
    if (buf) {
        buf[-1] = 0x55; /* out-of-bounds write to the left of the object */
        kfree(buf);
    }
}

  • Run insmod data/slub4.ko; the result is as follows.
  • This overrun differs from the previous one in that it goes to the left, before the start of the buffer, and lands in the Padding area.

al: slub error test init
=============================================================================
BUG kmalloc-128 (Tainted: G O ): Object padding overwritten -------- The Padding area has been overwritten.
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: 0xffff80007767e9ff-0xffff80007767e9ff. First byte 0x55 instead of 0x5a
INFO: Allocated in call_usermodehelper_setup+0x44/0xb8 age=1 cpu=1 pid=789
    alloc_debug_processing+0x17c/0x188
    ___slab_alloc.constprop.30+0x3f8/0x440
    __slab_alloc.isra.27.constprop.29+0x24/0x38
    kmem_cache_alloc+0x1ec/0x260
    call_usermodehelper_setup+0x44/0xb8
/ # kobject_uevent_env+0x494/0x500
    kobject_uevent+0x10/0x18
    load_module+0x18cc/0x1d78
    SyS_init_module+0x150/0x178
    el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dd9f80 objects=16 used=9 fp=0xffff80007767ea00 flags=0x4081
INFO: Object 0xffff80007767e800 @offset=2048 fp=0xffff80007767ea00
Bytes b4 ffff80007767e7f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff80007767e800: 00 01 00 00 00 00 00 00 08 e8 67 77 00 80 ff ff ..........gw....
Object ffff80007767e810: 08 e8 67 77 00 80 ff ff f8 83 0c 00 00 80 ff ff ..gw............
Object ffff80007767e820: 00 00 00 00 00 00 00 00 00 6e aa 00 00 80 ff ff .........n......
Object ffff80007767e830: 00 23 67 78 00 80 ff ff 18 23 67 78 00 80 ff ff .#gx.....#gx....
Object ffff80007767e840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff80007767e850: b8 8e 32 00 00 80 ff ff 00 23 67 78 00 80 ff ff ..2......#gx....
Object ffff80007767e860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff80007767e870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Redzone ffff80007767e880: cc cc cc cc cc cc cc cc ........
Padding ffff80007767e9c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 55 ZZZZZZZZZZZZZZZU
CPU: 0 PID: 790 Comm: mdev Tainted: G B O 4.4.0 #116
Hardware name: linux dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000196848>] check_bytes_and_report+0xd8/0x118
[<ffff800000196928>] check_object+0xa0/0x240
[<ffff8000001987e0>] free_debug_processing+0x128/0x380
[<ffff80000019a1cc>] __slab_free+0x344/0x4a0
[<ffff80000019ab94>] kfree+0x1ec/0x220
[<ffff8000000c8278>] umh_complete+0x58/0x68
[<ffff8000000c83d8>] call_usermodehelper_exec_async+0x150/0x170
[<ffff800000085c50>] ret_from_fork+0x10/0x40
FIX kmalloc-128: Restoring 0xffff80007767e9ff-0xffff80007767e9ff=0x5a -------- The repair restores the affected byte to 0x5a.

Object already free

void create_slub_error(void)
{
    buf = kmalloc(32, GFP_KERNEL);
    if (buf) {
        memset(buf, 0x55, 32);
        kfree(buf);
        printk("al: Object already freed");
        kfree(buf); /* second free of the same object */
    }
}

  • The free path in the kernel is:

kfree ->slab_free ->__slab_free ->kmem_cache_debug ->free_debug_processing ->on_freelist
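The last step of that chain, on_freelist, is what catches a double free: before an object is put back, the free list is searched for it. The toy model below sketches this idea in user space; `struct toy_slab` and the `toy_*` functions are hypothetical names for illustration and do not mirror the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>

#define NOBJ 4

/* A toy slab: each free object stores the index of the next free one. */
struct toy_slab {
    int next_free[NOBJ]; /* meaningful only while the object is free */
    int freelist;        /* index of the first free object, -1 if none */
};

/* All objects start out allocated, so the free list is empty. */
void toy_init(struct toy_slab *s)
{
    int i;

    assert(s != NULL);
    s->freelist = -1;
    for (i = 0; i < NOBJ; i++)
        s->next_free[i] = -1;
}

/* Walk the free list looking for obj. Returns 1 if it is already on it. */
int toy_on_freelist(const struct toy_slab *s, int obj)
{
    int cur, steps = 0;

    for (cur = s->freelist; cur != -1 && steps < NOBJ;
         cur = s->next_free[cur], steps++)
        if (cur == obj)
            return 1;
    return 0;
}

/* Returns 0 on success, -1 if obj is already free (the list is untouched). */
int toy_free(struct toy_slab *s, int obj)
{
    assert(s != NULL && obj >= 0 && obj < NOBJ);
    if (toy_on_freelist(s, obj))
        return -1; /* "Object already free": refuse the second free */
    s->next_free[obj] = s->freelist;
    s->freelist = obj;
    return 0;
}
```

Refusing the second free and leaving the list untouched corresponds to the "not freed" outcome that the report for this test ends with.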

  • Run insmod data/slub2.ko; the result is as follows.

al: slub error test init
al: Object already freed
=============================================================================
BUG kmalloc-128 (Tainted: G B O ): Object already free -------- On the 64-bit system the 32-byte kmalloc is served from kmalloc-128; the error type is Object already free, i.e. a double free.
-----------------------------------------------------------------------------
INFO: Allocated in create_slub_error+0x20/0x80 [slub2] age=0 cpu=1 pid=791 -------- Backtrace of the allocation.
    alloc_debug_processing+0x17c/0x188
    ___slab_alloc.constprop.30+0x3f8/0x440
    __slab_alloc.isra.27.constprop.29+0x24/0x38
    kmem_cache_alloc+0x1ec/0x260
    create_slub_error+0x20/0x80 [slub2]
    my_test_init+0x14/0x28 [slub2]
    do_one_initcall+0x90/0x1a0
    do_init_module+0x60/0x1cc
    load_module+0x18dc/0x1d78
    SyS_init_module+0x150/0x178
    el0_svc_naked+0x24/0x28
INFO: Freed in create_slub_error+0x50/0x80 [slub2] age=0 cpu=1 pid=791 -------- Backtrace of the free.
    free_debug_processing+0x17c/0x380
    __slab_free+0x344/0x4a0
    kfree+0x1ec/0x220
    create_slub_error+0x50/0x80 [slub2]
    my_test_init+0x14/0x28 [slub2]
    do_one_initcall+0x90/0x1a0
    do_init_module+0x60/0x1cc
    load_module+0x18dc/0x1d78
    SyS_init_module+0x150/0x178
    el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dda800 objects=16 used=7 fp=0xffff8000776a0800 flags=0x4081
INFO: Object 0xffff8000776a0800 @offset=2048 fp=0xffff8000776a0a00
Bytes b4 ffff8000776a07f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff8000776a0800: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk -------- Dump of the object contents, 128 bytes in total.
Object ffff8000776a0810: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0820: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0830: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0840: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0850: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0860: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0870: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Redzone ffff8000776a0880: bb bb bb bb bb bb bb bb ........
Padding ffff8000776a09c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
CPU: 1 PID: 791 Comm: insmod Tainted: G B O 4.4.0 #116 -------- The problem is caught during insmod, so the process that detects it is insmod.
Hardware name: linux dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000198954>] free_debug_processing+0x29c/0x380
[<ffff80000019a1cc>] __slab_free+0x344/0x4a0
[<ffff80000019ab94>] kfree+0x1ec/0x220
[<ffff7ffffc008060>] create_slub_error+0x60/0x80 [slub2]
[<ffff7ffffc00a014>] my_test_init+0x14/0x28 [slub2]
[<ffff800000082930>] do_one_initcall+0x90/0x1a0
[<ffff80000014647c>] do_init_module+0x60/0x1cc
[<ffff800000120704>] load_module+0x18dc/0x1d78
[<ffff800000120cf0>] SyS_init_module+0x150/0x178
[<ffff800000085cb0>] el0_svc_naked+0x24/0x28
FIX kmalloc-128: Object at 0xffff8000776a0800 not freed -------- Outcome: the slab object is not freed a second time.
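The reports so far show the same 32-byte request landing in kmalloc-64 on the 32-bit ARM run and in kmalloc-128 on the ARM64 run. A rough way to picture this is rounding the request up to a power-of-two cache size with a platform-dependent minimum. The sketch below assumes exactly that and nothing more: the function names are hypothetical, the minimum sizes are simply read off the reports, and non-power-of-two caches are ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round a request up to a power-of-two cache size, never below min_size.
 * Simplification: min_size is assumed to be a power of two, and caches
 * with other sizes are not modelled.
 */
size_t toy_kmalloc_cache_size(size_t request, size_t min_size)
{
    size_t size = min_size;

    assert(request > 0 && min_size > 0);
    while (size < request)
        size <<= 1;
    return size;
}

/*
 * A write of write_len bytes from the start of the object can only reach
 * the red zone if it is longer than the cache's object size.
 */
int toy_overrun_reaches_redzone(size_t request, size_t min_size,
                                size_t write_len)
{
    return write_len > toy_kmalloc_cache_size(request, min_size);
}
```

Under these assumptions a 36-byte write into a 32-byte request goes unnoticed with a 64-byte minimum, while an 80-byte write is caught, which is the behaviour described for slub.ko.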

Poison overwritten

  • The test code for a use after free is:

static void create_slub_error(void)
{
    buf = kmalloc(32, GFP_KERNEL); /* at this point buf is filled with 0x6b */
    if (buf) {
        kfree(buf);
        printk("al: Access after free");
        /* buf has been freed, yet the memset still takes effect: the bytes become 0x55 */
        memset(buf, 0x55, 32);
    }
}
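Poisoning works by filling a freed object with a known pattern and verifying the pattern later. The user-space sketch below models this with the two byte values visible in the reports in this article (0x6b fill, 0xa5 in the last byte); the function names `toy_poison` and `toy_check_poison` are hypothetical and the kernel's own check is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b /* fill byte for a freed object */
#define POISON_END  0xa5 /* marker stored in the last byte */

/* Fill a freed object with the poison pattern. */
void toy_poison(unsigned char *obj, size_t size)
{
    assert(obj != NULL && size > 0);
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/*
 * Verify the pattern. Returns -1 if it is intact, otherwise the offset of
 * the first byte that was modified after the free.
 */
long toy_check_poison(const unsigned char *obj, size_t size)
{
    size_t i;

    assert(obj != NULL && size > 0);
    for (i = 0; i + 1 < size; i++)
        if (obj[i] != POISON_FREE)
            return (long)i;
    return obj[size - 1] == POISON_END ? -1 : (long)(size - 1);
}
```

Because the check only runs when something inspects the object again, a write after free is not reported at the moment it happens, which is why this test needs slabinfo -v (or a later allocation) to surface the problem.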

  • Run insmod data/slub3.ko, then check the result with slabinfo -v.

=============================================================================
BUG kmalloc-128 (Tainted: G B O ): Poison overwritten -------- The slab cache is kmalloc-128 and the error type is Poison overwritten, i.e. a use after free.
-----------------------------------------------------------------------------
INFO: 0xffff800077692800-0xffff80007769281f. First byte 0x55 instead of 0x6b
INFO: Allocated in create_slub_error+0x28/0xf0 [slub3] age=1089 cpu=1 pid=793 -------- Backtrace of the allocation.
    alloc_debug_processing+0x17c/0x188
    ___slab_alloc.constprop.30+0x3f8/0x440
    __slab_alloc.isra.27.constprop.29+0x24/0x38
    kmem_cache_alloc+0x1ec/0x260
    create_slub_error+0x28/0xf0 [slub3]
    0xffff7ffffc00e014
    do_one_initcall+0x90/0x1a0
    do_init_module+0x60/0x1cc
    load_module+0x18dc/0x1d78
    SyS_init_module+0x150/0x178
    el0_svc_naked+0x24/0x28
INFO: Freed in create_slub_error+0x80/0xf0 [slub3] age=1089 cpu=1 pid=793 -------- Backtrace of the free.
    free_debug_processing+0x17c/0x380
    __slab_free+0x344/0x4a0
    kfree+0x1ec/0x220
    create_slub_error+0x80/0xf0 [slub3]
    0xffff7ffffc00e014
    do_one_initcall+0x90/0x1a0
    do_init_module+0x60/0x1cc
    load_module+0x18dc/0x1d78
    SyS_init_module+0x150/0x178
    el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dda480 objects=16 used=16 fp=0x (null) flags=0x4080
INFO: Object 0xffff800077692800 @offset=2048 fp=0xffff800077692400
Bytes b4 ffff8000776927f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff800077692800: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU -------- The first 32 bytes were still modified.
Object ffff800077692810: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object ffff800077692820: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692830: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692840: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692850: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692860: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692870: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Redzone ffff800077692880: bb bb bb bb bb bb bb bb ........
Padding ffff8000776929c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
CPU: 0 PID: 795 Comm: slabinfo Tainted: G B O 4.4.0 #116
Hardware name: linux dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000196848>] check_bytes_and_report+0xd8/0x118
[<ffff800000196a54>] check_object+0x1cc/0x240
[<ffff800000197920>] alloc_debug_processing+0x108/0x188
[<ffff800000199670>] ___slab_alloc.constprop.30+0x3f8/0x440
[<ffff8000001996dc>] __slab_alloc.isra.27.constprop.29+0x24/0x38
[<ffff8000001998dc>] kmem_cache_alloc+0x1ec/0x260
[<ffff8000001d42fc>] seq_open+0x34/0x90
[<ffff80000022059c>] kernfs_fop_open+0x194/0x370
[<ffff8000001afb04>] do_dentry_open+0x214/0x318
[<ffff8000001b0dc8>] vfs_open+0x58/0x68
[<ffff8000001bf338>] path_openat+0x460/0xdf0
[<ffff8000001c0ff0>] do_filp_open+0x60/0xe0
[<ffff8000001b117c>] do_sys_open+0x12c/0x218
[<ffff8000001fd53c>] compat_SyS_open+0x1c/0x28
[<ffff800000085cb0>] el0_svc_naked+0x24/0x28
FIX kmalloc-128: Restoring 0xffff800077692800-0xffff80007769281f=0x6b
FIX kmalloc-128: Marking all objects used
SLUB: kmalloc-128 210 slabs counted but counter=211
slabinfo (795) used greatest stack depth: 12976 bytes left

kmemleak

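  • kmemleak is a memory leak detector provided by the kernel. It starts a kernel thread that scans memory and prints the number of newly found unreferenced objects.
Kernel options for kmemleak
  • To use kmemleak, enable the following kernel options.
  • Kernel hacking->Memory Debugging->Kernel memory leak detector:

CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
# CONFIG_DEBUG_KMEMLEAK_TEST is not set
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y -------- Alternatively, turn this option off; kmemleak=on is then not needed on the command line.

Setting up the test environment
  • With the configuration above, kmemleak=on must also be added to the kernel command line.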

qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -smp 2 -m 2048 -kernel arch/arm64/boot/Image --append "rdinit=/linuxrc console=ttyAMA0 loglevel=8 kmemleak=on" -nographic

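static char *buf;

void create_kmemleak(void)
{
    buf = kmalloc(120, GFP_KERNEL);
    buf = vmalloc(4096); /* overwrites the only pointer to the kmalloc'ed block */
}

Running the test

  • Before testing with kmemleak, write scan to its debugfs node to trigger a scan.
  • Then read the kmemleak node to get the results.
  1. Trigger a kmemleak scan: echo scan > /sys/kernel/debug/kmemleak
  2. Load the faulty module: insmod data/kmemleak.ko
  3. Wait for the problem to be reported: kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
  4. View the kmemleak results: cat /sys/kernel/debug/kmemleak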
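The idea behind "unreferenced objects" can be sketched as a scan over memory words: any tracked allocation that no scanned word points into is a suspected leak. The model below is a heavily simplified user-space illustration with hypothetical names (`struct tracked`, `toy_scan`); it scans a single flat array of words and does not follow references from one object to another, so it should not be read as an explanation of the exact report shown in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One tracked allocation: its address range plus a "referenced" mark. */
struct tracked {
    uintptr_t start;
    size_t size;
    int referenced;
};

/*
 * Scan nwords pointer-sized words. Any value that falls inside a tracked
 * object marks that object as referenced. Returns the number of objects
 * left unreferenced, i.e. the suspected leaks.
 */
size_t toy_scan(struct tracked *objs, size_t nobjs,
                const uintptr_t *words, size_t nwords)
{
    size_t i, j, leaks = 0;

    assert(objs != NULL && (words != NULL || nwords == 0));
    for (i = 0; i < nobjs; i++)
        objs[i].referenced = 0;
    for (j = 0; j < nwords; j++)
        for (i = 0; i < nobjs; i++)
            if (words[j] >= objs[i].start &&
                words[j] - objs[i].start < objs[i].size)
                objs[i].referenced = 1;
    for (i = 0; i < nobjs; i++)
        if (!objs[i].referenced)
            leaks++;
    return leaks;
}
```

In this toy model, overwriting the only word that pointed at the first allocation is enough to make it show up as unreferenced on the next scan.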
Analysing the test results
  • For each leak the report gives the leaked address and size, the related process information, a dump of the memory contents, and a backtrace.
  • For each suspected leak, kmemleak shows the call stack, the size of the object, the function that allocated it, and a hex dump.

unreferenced object 0xede22dc0 (size 128): -------- First suspected leak, 128 bytes.
  comm "insmod", pid 765, jiffies 4294941257 (age 104.920s) -------- Related process information.
  hex dump (first 32 bytes): -------- Hex dump.
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  backtrace: -------- Stack backtrace.
    [<bf002014>] 0xbf002014
    [<c000973c>] do_one_initcall+0x90/0x1d8
    [<c00a71f4>] do_init_module+0x60/0x38c
    [<c0086898>] load_module+0x1bac/0x1e94
    [<c0086ccc>] SyS_init_module+0x14c/0x15c
    [<c000f3c0>] ret_fast_syscall+0x0/0x3c
    [<ffffffff>] 0xffffffff
unreferenced object 0xf12ba000 (size 4096):
  comm "insmod", pid 765, jiffies 4294941257 (age 104.920s)
  hex dump (first 32 bytes):
    d8 21 00 00 02 18 00 00 e4 21 00 00 02 18 00 00 .!.......!......
    46 22 00 00 02 18 00 00 52 22 00 00 02 18 00 00 F"......R"......
  backtrace:
    [<c00d77c8>] vmalloc+0x2c/0x34
    [<bf002014>] 0xbf002014
    [<c000973c>] do_one_initcall+0x90/0x1d8
    [<c00a71f4>] do_init_module+0x60/0x38c
    [<c0086898>] load_module+0x1bac/0x1e94
    [<c0086ccc>] SyS_init_module+0x14c/0x15c
    [<c000f3c0>] ret_fast_syscall+0x0/0x3c
    [<ffffffff>] 0xffffffff

kasan

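  • In this kernel version kasan does not yet support 32-bit ARM; it supports ARM64 and x86.
  • kasan is a dynamic memory error detector. It can catch out-of-bounds accesses, use after free, double free, and stack overflow.
Enabling kasan
  • To use kasan, CONFIG_KASAN must be enabled.
  • Kernel hacking->Memory debugging->KASan: runtime memory debugger

CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
CONFIG_TEST_KASAN=m

Code analysis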

kasan_report ->kasan_report_error ->print_error_description ->print_address_description ->print_shadow_for_address

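Test cases and analysis
  • kasan ships a test program, test_kasan.c. Build it as a module and load it into the kernel to simulate many kinds of memory errors.
  • kasan can detect out-of-bounds accesses, use after free, double free, and similar errors; double-free detection relies on slub_debug.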

insmod data/kasan.ko

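  • Out-of-bounds accesses cover slab, stack, and global-variable overruns; accesses to freed memory are reported as use-after-free; double frees are recognised by slub_debug.

slab-out-of-bounds

static noinline void __init kmalloc_oob_right(void)
{
    char *ptr;
    size_t size = 123;

    pr_info("out-of-bounds to right\n");
    ptr = kmalloc(size, GFP_KERNEL);
    if (!ptr) {
        pr_err("Allocation failed\n");
        return;
    }

    ptr[size] = 'x'; /* one byte past the end of the 123-byte object */
    kfree(ptr);
}

  • This error type is an out-of-bounds access to a slab object: off the left end, off the right end, or after the object has been grown or shrunk. Besides array assignment, it also covers memset, pointer dereferences, and so on.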

al: kasan error test init
kasan test: kmalloc_oob_right out-of-bounds to right
==================================================================
BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa4/0xe0 [kasan] at addr ffff800066539c7b -------- The error type is slab-out-of-bounds, raised in kmalloc_oob_right.
Write of size 1 by task insmod/788
=============================================================================
BUG kmalloc-128 (Tainted: G O ): kasan: bad access detected -------- Invalid access to a slab object.
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in kmalloc_oob_right+0x54/0xe0 [kasan] age=0 cpu=1 pid=788 -------- Allocation backtrace for the object touched in kmalloc_oob_right.
    alloc_debug_processing+0x17c/0x188
    ___slab_alloc.constprop.30+0x3f8/0x440
    __slab_alloc.isra.27.constprop.29+0x24/0x38
    kmem_cache_alloc+0x220/0x280
    kmalloc_oob_right+0x54/0xe0 [kasan]
    kmalloc_tests_init+0x18/0x70 [kasan]
    do_one_initcall+0x11c/0x310
    do_init_module+0x1cc/0x588
    load_module+0x48cc/0x5dc0
    SyS_init_module+0x1a8/0x1e0
    el0_svc_naked+0x24/0x28
INFO: Freed in do_one_initcall+0x10c/0x310 age=0 cpu=1 pid=788
    free_debug_processing+0x17c/0x368
    __slab_free+0x344/0x4a0
    kfree+0x21c/0x250
    do_one_initcall+0x10c/0x310
    do_init_module+0x1cc/0x588
    load_module+0x48cc/0x5dc0
    SyS_init_module+0x1a8/0x1e0
    el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2994e00 objects=16 used=2 fp=0xffff800066539e00 flags=0x4080
INFO: Object 0xffff800066539c00 @offset=7168 fp=0xffff800066538200
Bytes b4 ffff800066539bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ -------- Memory dump.
Object ffff800066539c00: 00 82 53 66 00 80 ff ff 74 65 73 74 73 5f 69 6e ..Sf....tests_in
Object ffff800066539c10: 69 74 20 5b 6b 61 73 61 6e 5d 00 00 00 00 00 00 it [kasan]......
Object ffff800066539c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0 #108 -------- Backtrace of the code that printed this report.
Hardware name: linux dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000318f60>] print_trailer+0xf8/0x160
[<ffff80000031ea8c>] object_err+0x3c/0x50
[<ffff8000003209a0>] kasan_report_error+0x240/0x558
[<ffff800000320e90>] __asan_report_store1_noabort+0x48/0x50
[<ffff7ffffc008324>] kmalloc_oob_right+0xa4/0xe0 [kasan]
[<ffff7ffffc009070>] kmalloc_tests_init+0x18/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
 ffff800066539b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff800066539b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff800066539c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03
                                                                ^
 ffff800066539c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff800066539d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
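The "Memory state around the buggy address" rows are shadow bytes, each describing 8 bytes of real memory. The marked row (fifteen 00 bytes followed by 03) fits a 123-byte object: 15 fully accessible 8-byte groups plus a last group in which only 3 bytes are valid. The sketch below models that encoding in user space; the function names are hypothetical, only 1-byte accesses are checked, and the caller is expected to keep the offset inside the shadowed range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE 8           /* bytes of memory described by one shadow byte */
#define SHADOW_REDZONE 0xfc /* value shown around the object in the report */

/*
 * Fill shadow[] for an object of obj_size bytes placed at the start of a
 * slot of slot_size bytes. shadow must hold slot_size / GRANULE entries.
 */
void toy_fill_shadow(uint8_t *shadow, size_t obj_size, size_t slot_size)
{
    size_t i, n = slot_size / GRANULE;

    assert(shadow != NULL);
    assert(obj_size <= slot_size && slot_size % GRANULE == 0);
    for (i = 0; i < n; i++) {
        size_t start = i * GRANULE;

        if (start + GRANULE <= obj_size)
            shadow[i] = 0;                           /* all 8 bytes valid */
        else if (start < obj_size)
            shadow[i] = (uint8_t)(obj_size - start); /* first k bytes valid */
        else
            shadow[i] = SHADOW_REDZONE;              /* nothing valid */
    }
}

/* Returns 1 if a 1-byte access at 'offset' from the slot start is allowed. */
int toy_access_ok(const uint8_t *shadow, size_t offset)
{
    uint8_t s = shadow[offset / GRANULE];

    if (s == 0)
        return 1;
    if (s >= GRANULE)
        return 0; /* poisoned: red zone, freed memory, and so on */
    return (offset % GRANULE) < s;
}
```

For the test above, offset 122 is the last valid byte and offset 123 (the `ptr[size]` write, address ...c7b) falls on the part of the final group that the shadow value 03 marks as off limits.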

use-after-free

  • use-after-free means that memory is used after it has been freed.

static noinline void __init kmalloc_uaf(void)
{
    char *ptr;
    size_t size = 10;

    pr_info("use-after-free\n");
    ptr = kmalloc(size, GFP_KERNEL);
    if (!ptr) {
        pr_err("Allocation failed\n");
        return;
    }

    kfree(ptr);
    *(ptr + 8) = 'x'; /* write into the object after it has been freed */
}

The test result is as follows:

kasan test: kmalloc_uaf use-after-free
==================================================================
BUG: KASAN: use-after-free in kmalloc_uaf+0xac/0xe0 [kasan] at addr ffff800066539e08
Write of size 1 by task insmod/788
=============================================================================
BUG kmalloc-128 (Tainted: G B O ): kasan: bad access detected
-----------------------------------------------------------------------------
INFO: Allocated in kmalloc_uaf+0x54/0xe0 [kasan] age=0 cpu=1 pid=788
    alloc_debug_processing+0x17c/0x188
    ___slab_alloc.constprop.30+0x3f8/0x440
    __slab_alloc.isra.27.constprop.29+0x24/0x38
    kmem_cache_alloc+0x220/0x280
    kmalloc_uaf+0x54/0xe0 [kasan]
    kmalloc_tests_init+0x48/0x70 [kasan]
    do_one_initcall+0x11c/0x310
    do_init_module+0x1cc/0x588
    load_module+0x48cc/0x5dc0
    SyS_init_module+0x1a8/0x1e0
    el0_svc_naked+0x24/0x28
INFO: Freed in kmalloc_uaf+0x84/0xe0 [kasan] age=0 cpu=1 pid=788
    free_debug_processing+0x17c/0x368
    __slab_free+0x344/0x4a0
    kfree+0x21c/0x250
    kmalloc_uaf+0x84/0xe0 [kasan]
    kmalloc_tests_init+0x48/0x70 [kasan]
    do_one_initcall+0x11c/0x310
    do_init_module+0x1cc/0x588
    load_module+0x48cc/0x5dc0
    SyS_init_module+0x1a8/0x1e0
    el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2994e00 objects=16 used=1 fp=0xffff800066539e00 flags=0x4080
INFO: Object 0xffff800066539e00 @offset=7680 fp=0xffff800066539800
Bytes b4 ffff800066539df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e00: 00 98 53 66 00 80 ff ff 00 00 00 00 00 00 00 00 ..Sf............
Object ffff800066539e10: 00 9e 53 66 00 80 ff ff d0 51 12 00 00 80 ff ff ..Sf.....Q......
Object ffff800066539e20: 00 00 00 00 00 00 00 00 e0 14 6d 01 00 80 ff ff ..........m.....
Object ffff800066539e30: 00 69 a3 66 00 80 ff ff 18 69 a3 66 00 80 ff ff .i.f.....i.f....
Object ffff800066539e40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e50: 30 da 73 00 00 80 ff ff 00 69 a3 66 00 80 ff ff 0.s......i.f....
Object ffff800066539e60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0 #108
Hardware name: linux dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000318f60>] print_trailer+0xf8/0x160
[<ffff80000031ea8c>] object_err+0x3c/0x50
[<ffff8000003209a0>] kasan_report_error+0x240/0x558
[<ffff800000320e90>] __asan_report_store1_noabort+0x48/0x50
[<ffff7ffffc00874c>] kmalloc_uaf+0xac/0xe0 [kasan]
[<ffff7ffffc0090a0>] kmalloc_tests_init+0x48/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
 ffff800066539d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff800066539d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff800066539e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff800066539e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff800066539f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

stack-out-of-bounds

  • A stack out-of-bounds access is an array overrun inside a function. It happens often in real projects and is hard to track down.

static noinline void __init kasan_stack_oob(void)
{
    char stack_array[10];
    volatile int i = 0;
    char *p = &stack_array[ARRAY_SIZE(stack_array) + i];

    pr_info("out-of-bounds on stack\n");
    *(volatile char *)p; /* read one element past the end of stack_array */
}

kasan test: kasan_stack_oob out-of-bounds on stack
==================================================================
BUG: KASAN: stack-out-of-bounds in kasan_stack_oob+0xa8/0xf0 [kasan] at addr ffff800066acb95a
Read of size 1 by task insmod/788
page:ffff7bffc29ab2c0 count:0 mapcount:0 mapping: (null) index:0x0
flags: 0x0()
page dumped because: kasan: bad access detected
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0 #108
Hardware name: linux dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000320c90>] kasan_report_error+0x530/0x558
[<ffff800000320d00>] __asan_report_load1_noabort+0x48/0x50
[<ffff7ffffc0080a8>] kasan_stack_oob+0xa8/0xf0 [kasan]
[<ffff7ffffc0090b0>] kmalloc_tests_init+0x58/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
 ffff800066acb800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff800066acb880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
>ffff800066acb900: f1 f1 04 f4 f4 f4 f2 f2 f2 f2 00 02 f4 f4 f3 f3
                                                    ^
 ffff800066acb980: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
 ffff800066acba00: f1 f1 00 00 00 00 00 00 00 00 f3 f3 f3 f3 00 00
==================================================================
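The shadow rows in the kasan reports use a handful of marker values: fc and fb around slab objects, f1 to f4 on the stack, and fa next to global variables. The decoder below is a small sketch for reading such rows. The value-to-meaning mapping is stated here as an assumption based on the v4.4 kasan definitions as best understood and should be checked against mm/kasan/kasan.h for the kernel in use; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a shadow byte for this sketch. */
enum shadow_kind {
    SH_ACCESSIBLE,     /* 00: all 8 bytes valid */
    SH_PARTIAL,        /* 01..07: only the first k bytes valid */
    SH_STACK_LEFT,     /* f1 */
    SH_STACK_MID,      /* f2 */
    SH_STACK_RIGHT,    /* f3 */
    SH_STACK_PARTIAL,  /* f4 */
    SH_GLOBAL_REDZONE, /* fa */
    SH_SLAB_FREED,     /* fb */
    SH_SLAB_REDZONE,   /* fc */
    SH_OTHER           /* anything else */
};

/* Classify one shadow byte (mapping assumed, see the note above). */
enum shadow_kind toy_shadow_kind(uint8_t v)
{
    if (v == 0x00)
        return SH_ACCESSIBLE;
    if (v < 8)
        return SH_PARTIAL;
    switch (v) {
    case 0xf1: return SH_STACK_LEFT;
    case 0xf2: return SH_STACK_MID;
    case 0xf3: return SH_STACK_RIGHT;
    case 0xf4: return SH_STACK_PARTIAL;
    case 0xfa: return SH_GLOBAL_REDZONE;
    case 0xfb: return SH_SLAB_FREED;
    case 0xfc: return SH_SLAB_REDZONE;
    default:   return SH_OTHER;
    }
}

/* Count the shadow bytes in a row that mark fully inaccessible memory. */
size_t toy_count_poisoned(const uint8_t *row, size_t n)
{
    size_t i, count = 0;

    assert(row != NULL);
    for (i = 0; i < n; i++)
        if (row[i] >= 8)
            count++;
    return count;
}
```

Applied to the marked row of the stack report, the pair 00 02 reads as the 10-byte stack_array (8 valid bytes, then 2), and the bytes around it as stack red zones.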

global-out-of-bounds

static char global_array[10];

static noinline void __init kasan_global_oob(void)
{
    volatile int i = 3;
    char *p = &global_array[ARRAY_SIZE(global_array) + i];

    pr_info("out-of-bounds global variable\n");
    *(volatile char *)p; /* read 3 bytes past the end of global_array */
}

The test result is as follows:

kasan test: kasan_global_oob out-of-bounds global variable
==================================================================
BUG: KASAN: global-out-of-bounds in kasan_global_oob+0x9c/0xe8 [kasan] at addr ffff7ffffc001c8d
Read of size 1 by task insmod/788
Address belongs to variable global_array+0xd/0xffffffffffffe3f8 [kasan]
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0 #108
Hardware name: linux dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000320c90>] kasan_report_error+0x530/0x558
[<ffff800000320d00>] __asan_report_load1_noabort+0x48/0x50
[<ffff7ffffc00818c>] kasan_global_oob+0x9c/0xe8 [kasan]
[<ffff7ffffc0090b4>] kmalloc_tests_init+0x5c/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
 ffff7ffffc001b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff7ffffc001c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff7ffffc001c80: 00 02 fa fa fa fa fa fa 00 00 00 00 00 00 00 00
                      ^
 ffff7ffffc001d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff7ffffc001d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

Summary

  • Detecting memory leaks is kmemleak's speciality, which gives it a niche of its own. Its scope is narrow, though: it deals with memory leaks only.
  • On platforms other than ARM64/x86, slub_debug is the only option for analysing memory problems. kasan is more efficient, but it needs a newer kernel and a newer GCC.
