Debian (9.1) machine freeze after running "sudo ./a.out" #23

Closed
CiraciNicolo opened this issue Oct 5, 2017 · 20 comments

Comments

@CiraciNicolo

CiraciNicolo commented Oct 5, 2017

Type of this issue (please specify)

This is a support matter (i.e. your own modified tree). I've removed CPU_DYING because it is not supported anymore.

System information

  1. CPU: Intel (Codename: i5-4278U)
  2. Kernel: Linux
  3. Kernel version: 4.9.0-3-amd64

Issue description

After running sudo ./a.out the machine freezes. I was able to determine that the "ioctl" call causes the freeze.
Since the machine just becomes unresponsive, there are no logs.

@asamy
Owner

asamy commented Oct 5, 2017 via email

@CiraciNicolo
Author

I'm still experiencing the problem. Also, when compiling I get this warning:

ksm/exit.c: In function ‘vcpu_sync_idt’:
ksm/exit.c:2099:1: warning: the frame size of 4112 bytes is larger than 2048 bytes [-Wframe-larger-than=]
 }

@asamy
Owner

asamy commented Oct 6, 2017

This warning is a false positive. The host stack is two 4-KByte pages (8 KB) in size.
I am not entirely sure what the issue could be; it might be related to #22.

Are you testing on a VM or on bare metal? How much RAM do you have?

@CiraciNicolo
Author

I'm testing on bare metal, with 8 GB of RAM. Looking at #22, could esoterix's comment lead somewhere?

@asamy
Owner

asamy commented Oct 6, 2017

No, commit 85a228d fixes what he pointed out.

@asamy
Owner

asamy commented Oct 6, 2017

Someone reported that disabling some VMCS controls fixes the freeze, but he hasn't pointed out which one is faulty.

Since I can't reproduce this freeze myself at all, can you disable them one by one and let me know here which one is faulty?

He pointed out that removing all bits in req_cpuctl fixes it, but the faulty control could be one of the secondary controls, since the primary CPU control is what enables the secondary controls.

@CiraciNicolo
Author

I've been able to capture the stack trace of the crash via a tty: we get a GPF. I've attached an image because I can't get a text log (sorry for the quality).
img_0473

@asamy
Owner

asamy commented Oct 6, 2017

Upload ksmlinux.ko

@CiraciNicolo
Author

Here we go!
ksmlinux.ko.zip

@CiraciNicolo
Author

I found out that after a while the machine unfreezes. I don't understand what is happening.

@asamy
Owner

asamy commented Oct 10, 2017

I had this issue before with ept_memory_type; I just decided to comment it out without actually looking into it. In the binary you provided it's not commented out like I suggested before, so please do so and let me know here, just to confirm.

Have it like this:

#if 0
	int i;
	struct mtrr_range *range;
	u8 type = 0xff;

	for (i = 0; i < k->mtrr_count; ++i) {
		range = &k->mtrr_ranges[i];
		if (!in_bounds(gpa, range->start, range->end))
			continue;

		if (range->fixed || range->type == EPT_MT_UNCACHABLE)
			return range->type;

		if (range->type == EPT_MT_WRITETHROUGH && type == EPT_MT_WRITEBACK)
			type = EPT_MT_WRITETHROUGH;
		else
			type = range->type;
	}

	if (type == 0xff)
		type = k->mtrr_def;

	return type;
#else
	return EPT_MT_WRITEBACK;
#endif

@hmkawakami

I was having the same issue, and I just found the problem: MAX_RANGES in mm.h was too small. I had 10 physical memory regions, and the default value of MAX_RANGES is 8.

@CiraciNicolo
Author

CiraciNicolo commented Oct 11, 2017

Actually, I had commented it out, but since that didn't fix the issue I uncommented it again. I tested it again just now, and um.c doesn't freeze anymore, but I get:

subvert: Invalid argument
ret: 0xFFFFFFFF

EDIT: I reran a.out and got another freeze.

@CiraciNicolo
Author

I ran it again, but this time the machine did not freeze, and I was able to get this from dmesg:

[  106.265407] ksm: CPU 0: ksm_init: EPT/VPID caps: 0x00000F0106134141
[  106.265418] ksm: CPU 0: ksm_init: 2 physical memory ranges
[  106.265419] ksm: CPU 0: ksm_init: Range: 0x0000000000001000 -> 0x000000000009FBFF
[  106.265419] ksm: CPU 0: ksm_init: Range: 0x0000000000100000 -> 0x000000003FFEBFFF
[  106.265434] ksm: CPU 0: ksm_init: 41 MTRR ranges (0 default type)
[  106.265434] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000000000 -> 0x000000000000FFFF fixed: 1 type: 6
[  106.265435] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000010000 -> 0x000000000001FFFF fixed: 1 type: 6
[  106.265436] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000030000 -> 0x000000000003FFFF fixed: 1 type: 6
[  106.265436] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000060000 -> 0x000000000006FFFF fixed: 1 type: 6
[  106.265437] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000A0000 -> 0x00000000000AFFFF fixed: 1 type: 6
[  106.265437] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F0000 -> 0x00000000000FFFFF fixed: 1 type: 6
[  106.265438] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000150000 -> 0x000000000015FFFF fixed: 1 type: 6
[  106.265438] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000001C0000 -> 0x00000000001CFFFF fixed: 1 type: 6
[  106.265439] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000080000 -> 0x0000000000083FFF fixed: 1 type: 6
[  106.265439] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000084000 -> 0x0000000000087FFF fixed: 1 type: 6
[  106.265440] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000008C000 -> 0x000000000008FFFF fixed: 1 type: 6
[  106.265440] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000098000 -> 0x000000000009BFFF fixed: 1 type: 6
[  106.265442] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000A8000 -> 0x00000000000ABFFF fixed: 1 type: 6
[  106.265443] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000BC000 -> 0x00000000000BFFFF fixed: 1 type: 6
[  106.265443] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D4000 -> 0x00000000000D7FFF fixed: 1 type: 6
[  106.265444] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F0000 -> 0x00000000000F3FFF fixed: 1 type: 6
[  106.265444] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C0000 -> 0x00000000000C0FFF fixed: 1 type: 5
[  106.265445] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C1000 -> 0x00000000000C1FFF fixed: 1 type: 5
[  106.265445] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C3000 -> 0x00000000000C3FFF fixed: 1 type: 5
[  106.265446] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C6000 -> 0x00000000000C6FFF fixed: 1 type: 5
[  106.265446] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000CA000 -> 0x00000000000CAFFF fixed: 1 type: 5
[  106.265447] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000CF000 -> 0x00000000000CFFFF fixed: 1 type: 5
[  106.265447] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D5000 -> 0x00000000000D5FFF fixed: 1 type: 5
[  106.265448] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000DC000 -> 0x00000000000DCFFF fixed: 1 type: 5
[  106.265448] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C0000 -> 0x00000000000C0FFF fixed: 1 type: 5
[  106.265449] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C9000 -> 0x00000000000C9FFF fixed: 1 type: 5
[  106.265449] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D3000 -> 0x00000000000D3FFF fixed: 1 type: 5
[  106.265450] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000DE000 -> 0x00000000000DEFFF fixed: 1 type: 5
[  106.265450] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000EA000 -> 0x00000000000EAFFF fixed: 1 type: 5
[  106.265451] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F7000 -> 0x00000000000F7FFF fixed: 1 type: 5
[  106.265451] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000105000 -> 0x0000000000105FFF fixed: 1 type: 5
[  106.265452] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000114000 -> 0x0000000000114FFF fixed: 1 type: 5
[  106.265452] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C0000 -> 0x00000000000C0FFF fixed: 1 type: 5
[  106.265453] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D1000 -> 0x00000000000D1FFF fixed: 1 type: 5
[  106.265453] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000E3000 -> 0x00000000000E3FFF fixed: 1 type: 5
[  106.265454] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F6000 -> 0x00000000000F6FFF fixed: 1 type: 5
[  106.265454] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000010A000 -> 0x000000000010AFFF fixed: 1 type: 5
[  106.265455] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000011F000 -> 0x000000000011FFFF fixed: 1 type: 5
[  106.265455] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000135000 -> 0x0000000000135FFF fixed: 1 type: 5
[  106.265456] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000014C000 -> 0x000000000014CFFF fixed: 1 type: 5
[  106.265456] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000000000 -> 0x000000003FFFFFFF fixed: 0 type: 0
[  106.265470] ksm: CPU 0: ksm_start: Major: 248
[  106.266087] ksm: CPU 0: ksm_start: ready
[  113.046346] ksm: CPU 0: ksm_open: open() from a.out
[  113.046349] ksm: CPU 0: ksm_ioctl: ioctl from a.out: cmd(0x00004B02)
[  113.053502] ksm: CPU 0: vcpu_run: 1: something went wrong: 12
[  113.053524] BUG: unable to handle kernel paging request at fffffffffffffff3
[  113.053565] IP: [<ffffffffc04d571c>] __vmx_vminit+0x4a/0x52 [ksmlinux]
[  113.053592] PGD 27a0a067 
[  113.053599] PUD 27a0c067 
[  113.053615] PMD 0 

[  113.053630] Oops: 0000 [#1] SMP
[  113.053649] Modules linked in: ksmlinux(O) fuse usblp prl_fs_freeze(PO) prl_fs(PO) prl_eth(PO) x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf uvcvideo videobuf2_vmalloc evdev videobuf2_memops snd_intel8x0 serio_raw videobuf2_v4l2 snd_ac97_codec pcspkr ac97_bus videobuf2_core snd_pcm videodev snd_timer snd soundcore media lpc_ich sg mfd_core shpchp pvpanic prl_tg(PO) virtio_balloon sbs sbshc binfmt_misc battery ac button parport_pc ppdev lp parport ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sd_mod sr_mod cdrom ata_generic crc32c_intel virtio_net aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd ata_piix ahci libahci psmouse i2c_i801 i2c_smbus libata scsi_mod xhci_pci xhci_hcd
[  113.053934]  uhci_hcd ehci_pci ehci_hcd usbcore usb_common virtio_pci virtio_ring virtio
[  113.053965] CPU: 0 PID: 2689 Comm: a.out Tainted: P           O    4.9.0-4-amd64 #1 Debian 4.9.51-1
[  113.054001] Hardware name: Parallels Software International Inc. Parallels Virtual Platform/Parallels Virtual Platform, BIOS 12.2.0 (41591) 04/03/2017
[  113.054044] task: ffff9c037ae7f040 task.stack: ffffac4400b70000
[  113.054064] RIP: 0010:[<ffffffffc04d571c>]  [<ffffffffc04d571c>] __vmx_vminit+0x4a/0x52 [ksmlinux]
[  113.054101] RSP: 0018:ffffac4400b73d90  EFLAGS: 00010046
[  113.054121] RAX: 0000000000000000 RBX: 000000000000003a RCX: 0000000000000000
[  113.054144] RDX: 0000000000000000 RSI: 00000000fee00037 RDI: ffff9c034aa01000
[  113.054166] RBP: 0000000000000000 R08: 00000000fee00030 R09: 0000fffffffff000
[  113.054189] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9c034aa00000
[  113.054211] R13: ffff9c034aa01000 R14: 0000000000000000 R15: 0000000000000000
[  113.054234] FS:  00007f23714c5700(0000) GS:ffff9c037de00000(0000) knlGS:0000000000000000
[  113.054258] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  113.054278] CR2: fffffffffffffff3 CR3: 00000000399ac000 CR4: 00000000001426f0
[  113.054301] Stack:
[  113.054314]  ffffffffc04d1a4b 0000000000000202 ffffac4400b73e08 ffff9c034aa00000
[  113.054341]  0000000000000000 0000000000000000 ffffffffc04d1b2a ffffffffbd6f96e8
[  113.054368]  ffffac4400b73e58 ffff9c037ae7f680 0000000000000000 ffffffffc04d1b20
[  113.054395] Call Trace:
[  113.054413]  [<ffffffffc04d1a4b>] ? __ksm_init_cpu+0xab/0x180 [ksmlinux]
[  113.054439]  [<ffffffffc04d1b2a>] ? __percpu___call_init+0xa/0x20 [ksmlinux]
[  113.054466]  [<ffffffffbd6f96e8>] ? generic_exec_single+0x98/0x100
[  113.054488]  [<ffffffffc04d1b20>] ? __ksm_init_cpu+0x180/0x180 [ksmlinux]
[  113.054511]  [<ffffffffbd6f9818>] ? smp_call_function_single+0xc8/0x130
[  113.054534]  [<ffffffffbd77af2e>] ? printk+0x57/0x73
[  113.054554]  [<ffffffffc04d1b7a>] ? ksm_subvert+0x3a/0x70 [ksmlinux]
[  113.054577]  [<ffffffffc04d54b3>] ? ksm_ioctl+0x2c3/0x40e [ksmlinux]
[  113.054601]  [<ffffffffbd816f1f>] ? do_vfs_ioctl+0x9f/0x600
[  113.054621]  [<ffffffffbd8174f4>] ? SyS_ioctl+0x74/0x80
[  113.055119]  [<ffffffffbdc085bb>] ? system_call_fast_compare_end+0xc/0x9b
[  113.055592] Code: 48 ba 24 57 4d c0 ff ff ff ff e8 00 e2 ff ff 58 59 5a 5b 48 83 c4 08 5d 5e 5f 41 58 41 59 41 5a 41 5b 41 5c 41 5d 41 5e 41 5f 9d <8b> 04 25 f3 ff ff ff c3 58 59 5a 5b 48 83 c4 08 5d 5e 5f 41 58 
[  113.057140] RIP  [<ffffffffc04d571c>] __vmx_vminit+0x4a/0x52 [ksmlinux]
[  113.058172]  RSP <ffffac4400b73d90>
[  113.059476] CR2: fffffffffffffff3
[  113.060388] ---[ end trace a4e8d77f6429cfff ]---
[  113.063077] ksm: CPU 0: ksm_release: release() from a.out

@asamy
Owner

asamy commented Oct 11, 2017

Set EPTP_INIT_USED (ksm.h) to 1, see if that fixes it.

@CiraciNicolo
Author

Still freezes

@asamy
Owner

asamy commented Oct 11, 2017

That physical memory range output is weird, given that you said you have 8 GB of RAM. Are you using a VM now or something?

Maybe it's not pre-allocating physical RAM like it should, so it's getting a lot of EPT violations to allocate them and that causes the freeze. Maybe the code that gets the physical memory ranges is faulty...

Regardless, assuming those are the physical memory ranges you have (i.e. the output matches the actual ranges), then those are not enough.

@CiraciNicolo
Author

Yeah, now I'm using a VM so I don't have to reboot every time the machine freezes.

@asamy
Owner

asamy commented Oct 22, 2017

So, I tested today and I haven't been able to reproduce it, on both a VM and bare metal (both Windows 10 and Linux 4.13.8-1). The only difference is that my CPU is an i7-5550U (Broadwell).

Have you been able to find any clue other than the double crash? Can you disable features until you find something out of the ordinary?

@CrazyHarb

Good morning sir, I've hit the same 'freeze' issue on 'Ubuntu 16.04.1' (kernel version '4.15.0-29-generic'): when I try to run 'sudo ./a.out', the VM freezes.

BUT, I've found something interesting out of the blue:

  1. Other code also freezes.
  2. The code can enter VMX-host, but not always.

So I guess that maybe the code has been swapped out to disk. My VM has 2 GB of memory.

@CiraciNicolo closed this as not planned on Jul 21, 2022