fix: double free socket buffer #127

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

ianchen0119
Collaborator

@andy89923 found a reproducible kernel panic.
The following steps reproduce it:

  • create online charging PDU Session
  • ping the specific Data Network

The kernel panic only happens if the gtp5g version is greater than v0.8.x.

panic log

[  +0.000002] kernel BUG at mm/slub.c:307!
[  +0.000109] invalid opcode: 0000 [#1] SMP PTI
[  +0.000056] CPU: 3 PID: 191301 Comm: nrf Tainted: G           OE     5.4.0-131-generic #147-Ubuntu
[  +0.000068] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[   +0.000047] RIP: 0010:kfree (/usr/src/linux-source-5.4.0/mm/slub.c:307 /usr/src/linux-source-5.4.0/mm/slub.c:302 /usr/src/linux-source-5.4.0/mm/slub.c:3035 /usr/src/linux-source-5.4.0/mm/slub.c:3060 /usr/src/linux-source-5.4.0/mm/slub.c:4027) 
[ +0.000048] Code: e7 e8 9e 71 fd ff e9 ef fe ff ff 4d 89 f1 41 b8 01 00 00 00 48 89 d9 48 89 da 4c 89 e6 4c 89 ef e8 6f fa ff ff e9 d0 fe ff ff <0f> 0b 48 8b 05 d1 51 77 01 e9 ff fd ff ff 66 66 2e 0f 1f 84 00 00
All code
========
   0:	e7 e8                	out    %eax,$0xe8
   2:	9e                   	sahf   
   3:	71 fd                	jno    0x2
   5:	ff                   	(bad)  
   6:	e9 ef fe ff ff       	jmpq   0xfffffffffffffefa
   b:	4d 89 f1             	mov    %r14,%r9
   e:	41 b8 01 00 00 00    	mov    $0x1,%r8d
  14:	48 89 d9             	mov    %rbx,%rcx
  17:	48 89 da             	mov    %rbx,%rdx
  1a:	4c 89 e6             	mov    %r12,%rsi
  1d:	4c 89 ef             	mov    %r13,%rdi
  20:	e8 6f fa ff ff       	callq  0xfffffffffffffa94
  25:	e9 d0 fe ff ff       	jmpq   0xfffffffffffffefa
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	48 8b 05 d1 51 77 01 	mov    0x17751d1(%rip),%rax        # 0x1775204
  33:	e9 ff fd ff ff       	jmpq   0xfffffffffffffe37
  38:	66                   	data16
  39:	66                   	data16
  3a:	2e                   	cs
  3b:	0f                   	.byte 0xf
  3c:	1f                   	(bad)  
  3d:	84 00                	test   %al,(%rax)
	...

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	48 8b 05 d1 51 77 01 	mov    0x17751d1(%rip),%rax        # 0x17751da
   9:	e9 ff fd ff ff       	jmpq   0xfffffffffffffe0d
   e:	66                   	data16
   f:	66                   	data16
  10:	2e                   	cs
  11:	0f                   	.byte 0xf
  12:	1f                   	(bad)  
  13:	84 00                	test   %al,(%rax)
	...
[  +0.000108] RSP: 0000:ffffa104c015c7f0 EFLAGS: 00010246
[  +0.000018] RAX: ffff93e58bc98000 RBX: ffff93e58bc98000 RCX: ffff93e58bc98000
[  +0.000017] RDX: 0000000000039962 RSI: bdd6aff4c23d967a RDI: ffff93e58bc98000
[  +0.000017] RBP: ffffa104c015c810 R08: ffff93e58bc98000 R09: ffffa104c015c8d8
[  +0.000018] R10: ffff93e5d302c680 R11: 0000000000000001 R12: fffffc7d8c2f2600
[  +0.000018] R13: ffff93e6adc06bc0 R14: ffffffff99edcf25 R15: ffff93e565a70600
[  +0.000017] FS:  000000c000580090(0000) GS:ffff93e6afac0000(0000) knlGS:0000000000000000
[  +0.000020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000014] CR2: 00007fac6fecf160 CR3: 000000034857c001 CR4: 0000000000760ee0
[  +0.000026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0.000018] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  +0.000021] PKRU: 55555554
[  +0.000007] Call Trace:
[  +0.000014]  <IRQ>
[   +0.000031] skb_free_head (/usr/src/linux-source-5.4.0/net/core/skbuff.c:602) 
[   +0.000018] skb_release_data (/usr/src/linux-source-5.4.0/net/core/skbuff.c:622) 
[   +0.000015] skb_release_all (/usr/src/linux-source-5.4.0/net/core/skbuff.c:676) 
[   +0.000015] consume_skb (/usr/src/linux-source-5.4.0/net/core/skbuff.c:690 /usr/src/linux-source-5.4.0/net/core/skbuff.c:848) 
[   +0.000046] gtp5g_dev_xmit (/home/ianchen0119/gtp5g/src/gtpu/dev.c:136) gtp5g
[   +0.000016] ? update_load_avg (/usr/src/linux-source-5.4.0/kernel/sched/fair.c:3388 /usr/src/linux-source-5.4.0/kernel/sched/fair.c:3602) 
[   +0.000017] dev_hard_start_xmit (/usr/src/linux-source-5.4.0/./include/linux/prandom.h:58 /usr/src/linux-source-5.4.0/net/core/dev.c:3216 /usr/src/linux-source-5.4.0/net/core/dev.c:3234) 
[   +0.000019] __dev_queue_xmit (/usr/src/linux-source-5.4.0/./include/net/sch_generic.h:179 /usr/src/linux-source-5.4.0/net/core/dev.c:3453 /usr/src/linux-source-5.4.0/net/core/dev.c:3765) 
[   +0.000016] ? nfnetlink_has_listeners+0x15/0x20 nfnetlink
[   +0.000016] dev_queue_xmit (/usr/src/linux-source-5.4.0/net/core/dev.c:3834) 
[   +0.000014] neigh_direct_output (/usr/src/linux-source-5.4.0/net/core/neighbour.c:1548) 
[   +0.000019] ip_finish_output2 (/usr/src/linux-source-5.4.0/./include/net/neighbour.h:510 /usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:236) 
[   +0.000016] __ip_finish_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:317) 
[   +0.000017] ip_finish_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:326) 
[   +0.000018] ip_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:444) 
[   +0.000010] ? __ip_finish_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:320) 
[   +0.000013] ip_forward_finish (/usr/src/linux-source-5.4.0/net/ipv4/ip_forward.c:84) 
[   +0.000012] ip_forward (/usr/src/linux-source-5.4.0/./include/linux/netfilter.h:300 /usr/src/linux-source-5.4.0/net/ipv4/ip_forward.c:157) 
[   +0.000010] ? ip4_key_hashfn (/usr/src/linux-source-5.4.0/net/ipv4/ip_forward.c:66) 
[   +0.000012] ip_sublist_rcv_finish (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:539) 
[   +0.000021] ip_sublist_rcv (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:588) 
[   +0.000956] ? ip_rcv_finish_core.isra.0 (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:407) 
[   +0.000637] ip_list_rcv (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:622) 
[   +0.000678] __netif_receive_skb_list_core (/usr/src/linux-source-5.4.0/net/core/dev.c:5014 /usr/src/linux-source-5.4.0/net/core/dev.c:5062) 
[   +0.000576] netif_receive_skb_list_internal (/usr/src/linux-source-5.4.0/net/core/dev.c:5116 /usr/src/linux-source-5.4.0/net/core/dev.c:5209) 
[   +0.000572] gro_normal_list.part.0 (/usr/src/linux-source-5.4.0/./include/linux/compiler.h:295 /usr/src/linux-source-5.4.0/./include/linux/list.h:28 /usr/src/linux-source-5.4.0/net/core/dev.c:5321) 
[   +0.000524] napi_complete_done (/usr/src/linux-source-5.4.0/net/core/dev.c:6063 (discriminator 1) /usr/src/linux-source-5.4.0/net/core/dev.c:6051 (discriminator 1)) 
[   +0.000557] virtnet_poll+0x30d/0x450 virtio_net
[   +0.000558] net_rx_action (/usr/src/linux-source-5.4.0/net/core/dev.c:6366 /usr/src/linux-source-5.4.0/net/core/dev.c:6436) 
[   +0.000598] __do_softirq (/usr/src/linux-source-5.4.0/./arch/x86/include/asm/jump_label.h:25 /usr/src/linux-source-5.4.0/./include/linux/jump_label.h:200 /usr/src/linux-source-5.4.0/./include/trace/events/irq.h:142 /usr/src/linux-source-5.4.0/kernel/softirq.c:293) 
[   +0.000559] irq_exit (/usr/src/linux-source-5.4.0/kernel/softirq.c:373 /usr/src/linux-source-5.4.0/kernel/softirq.c:413) 
[   +0.000500] do_IRQ (/usr/src/linux-source-5.4.0/arch/x86/kernel/irq.c:267 (discriminator 42)) 
[   +0.000504] common_interrupt (/usr/src/linux-source-5.4.0/arch/x86/entry/entry_64.S:613) 
[  +0.000488]  </IRQ>

root cause

If the PDU session is an online charging session, the FAR action is changed to PKT_DROP after the first uplink packet is sent to the data network, and it stays that way until the UPF gets quota from the SMF. As a result, the downlink packet that responds to the first uplink packet is freed twice before the UPF gets the new quota.
I believe the issue became visible because of #101, which made the packet counting more accurate.
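
For readers unfamiliar with this class of bug, here is a minimal user-space sketch of how a drop path and its caller can both release the same buffer. It is not the gtp5g code: `fake_skb`, `fake_free`, `handle_packet`, `xmit_buggy` and `xmit_fixed` are hypothetical names, and a counter stands in for the slab allocator that would hit the `BUG` at `mm/slub.c:307` on the second free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; free_count records releases. */
struct fake_skb {
    int free_count;
};

/* Hypothetical stand-in for kfree_skb()/consume_skb().
 * In the kernel, the second call on the same buffer is the double free. */
static void fake_free(struct fake_skb *skb)
{
    assert(skb != NULL);
    skb->free_count++;
}

enum far_action { FAR_FORWARD, FAR_DROP };

/* Hypothetical packet handler: on DROP it releases the buffer itself
 * and reports that it took ownership through its return value. */
static bool handle_packet(struct fake_skb *skb, enum far_action action)
{
    if (action == FAR_DROP) {
        fake_free(skb);
        return true;    /* buffer already released here */
    }
    return false;       /* caller still owns the buffer */
}

/* Buggy caller: frees unconditionally, so a dropped packet is freed twice. */
static int xmit_buggy(struct fake_skb *skb, enum far_action action)
{
    handle_packet(skb, action);
    fake_free(skb);
    return skb->free_count;
}

/* Fixed caller: frees only when the handler did not take ownership. */
static int xmit_fixed(struct fake_skb *skb, enum far_action action)
{
    if (!handle_packet(skb, action))
        fake_free(skb);
    return skb->free_count;
}
```

With a zero-initialised `struct fake_skb`, `xmit_buggy(&skb, FAR_DROP)` leaves the counter at 2, while `xmit_fixed` leaves it at 1 for both actions. The point is only that exactly one code path should own the release of each buffer. How the merged commit achieves this is not shown in this conversation.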

@tim-ywliu tim-ywliu merged commit 97ef91a into master Dec 12, 2024
@tim-ywliu tim-ywliu deleted the fix/double-free-skb branch December 12, 2024 10:09