Assertion `t->regs().syscall_result_signed() == -syscall_state.expect_errno' failed to hold. #1577

Closed
ghost opened this issue Nov 4, 2015 · 17 comments


@ghost

ghost commented Nov 4, 2015

Seeing this assertion when attempting to debug VMware Workstation:

ALSA lib conf.c:3782:(snd_config_update_r) cannot access file /usr/share/alsa/alsa.conf
[FATAL /home/roc/rr/rr/src/record_syscall.cc:3001:rec_process_syscall_arch() errno: 0 'Success'](task 28326 (rec:28326) at time 136608)
-> Assertion `t->regs().syscall_result_signed() == -syscall_state.expect_errno' failed to hold. Expected EINVAL for 'ioctl' but got result 0; Unknown ioctl(0x81785501): type:0x55 nr:0x1 dir:0x2 size:376 addr:0x7fffdfe99ac0

The Googles have told me this is the SNDRV_CTL_IOCTL_CARD_INFO ioctl from ALSA.
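
For reference, the fields rr prints in the assertion message (type, nr, dir, size) can be recovered from the raw request number by hand. The following is a minimal sketch, assuming the generic Linux ioctl bit layout used on x86 (8 bits nr, 8 bits type, 14 bits size, 2 bits direction); some other architectures use different widths. The helper names `decode_ioctl` and `encode_ioctl` are made up for illustration and are not part of rr.

```python
# Bit widths of the generic Linux ioctl encoding (as used on x86).
# Assumption: architectures such as PowerPC or MIPS use a different split.
IOC_NRBITS = 8
IOC_TYPEBITS = 8
IOC_SIZEBITS = 14
IOC_DIRBITS = 2

IOC_NRSHIFT = 0
IOC_TYPESHIFT = IOC_NRSHIFT + IOC_NRBITS
IOC_SIZESHIFT = IOC_TYPESHIFT + IOC_TYPEBITS
IOC_DIRSHIFT = IOC_SIZESHIFT + IOC_SIZEBITS

# Direction values in the generic layout.
IOC_NONE = 0
IOC_WRITE = 1
IOC_READ = 2


def decode_ioctl(request):
    """Split a raw ioctl request number into its four fields (hypothetical helper)."""
    return {
        "dir": (request >> IOC_DIRSHIFT) & ((1 << IOC_DIRBITS) - 1),
        "type": (request >> IOC_TYPESHIFT) & ((1 << IOC_TYPEBITS) - 1),
        "nr": (request >> IOC_NRSHIFT) & ((1 << IOC_NRBITS) - 1),
        "size": (request >> IOC_SIZESHIFT) & ((1 << IOC_SIZEBITS) - 1),
    }


def encode_ioctl(direction, type_, nr, size):
    """Inverse of decode_ioctl: build a request number from its fields."""
    return (
        (direction << IOC_DIRSHIFT)
        | (size << IOC_SIZESHIFT)
        | (type_ << IOC_TYPESHIFT)
        | (nr << IOC_NRSHIFT)
    )


if __name__ == "__main__":
    fields = decode_ioctl(0x81785501)
    print({name: hex(value) for name, value in fields.items()})
    print("type as ASCII:", chr(fields["type"]))
```

Decoding 0x81785501 this way gives the same values as the log line: type 0x55 (ASCII 'U'), nr 0x1, dir 0x2 (read, i.e. the kernel writes data back to user space) and size 376. That is consistent with a read-direction ioctl in the 'U' family, which matches the ALSA control ioctl named above.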

@rocallahan
Collaborator

Ugh.

Feel like figuring out the ALSA ioctls and adding them to prepare_ioctl in record_syscall.cc?

@rocallahan
Collaborator

Actually, finding any documentation at all for these ioctls would be nice...

@ghost
Author

ghost commented Nov 4, 2015

I took a stab at this here: https://github.com/awalton/rr/commit/559804fe324d78183e193f6503eb9082050cea4e

I admit I have no idea if it's right, but it runs Workstation!

@rocallahan
Collaborator

> I admit I have no idea if it's right, but it runs Workstation!

Whoa, really? Cool! ... Hey, any chance you could help with http://robert.ocallahan.org/2014/09/vmware-cpuid-conditional-branch.html ? :-)

I left a couple of minor comments in your commit. The only other thing this needs before landing is a test.

Thanks!!!

@ghost
Author

ghost commented Nov 4, 2015

No problem. I filed an internal bug on VMware's bugzilla and forwarded the post on to the monitor team so hopefully either I or someone from that team will have something to tell you at some point.

Until then, I hit another issue around the iopl syscall being unimplemented and I'm poking around to see what it will take to get around that. I'll respin and post an updated patch later today.

@rocallahan
Collaborator

FWIW, if you're calling iopl to enable use of the x86 `in` instructions, rr is not going to work, because we have no mechanism to record and replay the results of those instructions. (That could probably be fixed by making rr drop the iopl call and then trap, emulate and record every `in` instruction, with significant overhead of course.)

@rocallahan
Collaborator

There may be other exotic things that VMware does that rr can't handle, e.g. ioctls that you use to communicate with your host kernel drivers. If you share memory between user space and the kernel drivers, that could create additional issues. I'm happy to help you work through them, just setting expectations :-).

@ghost
Author

ghost commented Nov 4, 2015

Sure, and I expect there are probably a bunch of gotchas. Right now I'm trying to hunt down a bug in the UI though, so a lot of those are likely avoidable. (In fact, I may not even need to implement iopl at all - I think I may be able to get around this for the time being.)

One of our internal team members has already commented on the monitor bug:

"Could you ask him to try this in the VM's .vmx file:

monitor_control.disable_hvsim_clusters = true
"

@rocallahan
Collaborator

That fixes it! Thanks, at the very least we now have a workaround!

@rocallahan
Collaborator

@awalton is it OK for us to include that advice in rr's message when it detects the VMware bug? Just want to make sure adding that setting won't cause any harm.

@ghost
Author

ghost commented Nov 5, 2015

I round-tripped it through our monitor team again and they said it's okay to use that setting, and that it's likely the best workaround you'll get for the time being. The reason, as you correctly deduced, is to try to reduce hardware virtualization exits in order to improve performance (quite significantly). So the worst this setting does for you is slow things down a bit, which might make certain kinds of bugs harder to debug, but otherwise the impact should be pretty low. I think.

The advice they gave me to give to you is to make sure to read and cite their paper on this specific topic (http://dl.acm.org/citation.cfm?id=2342856) and to use the given workaround above. It was certainly educational for me - I work with VMs every day and even I didn't know some of the stuff they're doing!

I'll look into bumping the patch later this week - I got caught up actually debugging my problem in Workstation and forgot to rev the patch today.

@rocallahan
Collaborator

Thanks a ton!

@ghost
Author

ghost commented Nov 7, 2015

I updated my tree here:
https://github.com/awalton/rr/tree/alsa-fixes

However, when running the tests I ran into a whole slew of failed tests, including the one I just added, so it probably still needs more review:

10 - alsa_ioctl-no-syscallbuf (Failed)
418 - blocked_bad_ip-no-syscallbuf (Failed)
.... (skip a few hundred lines)
1360 - when-32-no-syscallbuf (Failed)

It's probably a case of me doing something wrong, but I don't know the tool well enough yet to tell what I've done...

@rocallahan
Collaborator

Interesting. I assume those failures occur without your patch? Would be worth investigating if you're interested.

I squashed your patches, reworked the test somewhat and merged: 89374ab

@rocallahan
Collaborator

And thanks!

@folkertvanheusden

Please note that I also get that error without VMware.

@rocallahan
Collaborator

Please file a new issue with the details.
