
System freezes after loading kvm module #1

Closed
jasonbking opened this issue Aug 27, 2011 · 6 comments


@jasonbking

CPU is a Core i5-2400 (Sandy Bridge).

Once, prior to a freeze, I saw 'kvm: NOTICE: unhandled wrmsr: 0x0 data 3000000018' on the console, but I have not seen it since. I tried setting a breakpoint in kvm_set_msr_common, and it does not appear to be reached in subsequent lockups.

Disabling kvm leaves the system stable; running 'rem_drv kvm; add_drv kvm' causes it to lock up shortly thereafter.

This is on a stock illumos debug build (source as of 8/26).

I also experienced similar issues with SmartOS live (though I was never able to narrow them down).

@bcantrill
Contributor

Interesting. What guest? (Or does it hang without any guest at all?) Do you have a dump? And can you do this on the running system:

echo "vmcs_config::print" | mdb -k

@jasonbking
Author

No guests running, just a regular boot. It doesn't generate a dump, and I cannot drop to kmdb. I also tried "dtrace -wn 'tick-1m { panic(); }'".

If I boot with -B disable-kvm=true, things are stable. However, when I run 'rem_drv kvm; add_drv kvm' it freezes shortly thereafter (just like when I boot the BE normally), and I cannot drop to kmdb (this is also a DEBUG kernel).

Because of all that, I set a breakpoint in setup_vmcs_config; the output below is from immediately before it returns (hopefully this is sufficient; if not, let me know another point where it would be more useful to print the value):

{
size = 0x400
order = 0
revision_id = 0x10
pin_based_exec_ctrl = 0x3f
cpu_based_exec_ctrl = 0xb6a065fa
cpu_based_2nd_exec_ctrl = 0xeb
vmexit_ctrl = 0xf6fff
vmentry_ctrl = 0x51ff
}
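
For readers who want to interpret the control words in this dump, here is a minimal Python sketch that lists which bits are set in cpu_based_exec_ctrl and cpu_based_2nd_exec_ctrl. The bit names are an assumption taken from the Intel SDM tables for primary and secondary processor-based VM-execution controls, not from the kvm driver source; the `decode` helper and both tables are hypothetical names introduced only for illustration. Bits without a name in the table are reported as reserved/unnamed.

```python
# Bit names assumed from the Intel SDM VM-execution control tables.
PRIMARY_CTLS = {
    2: "interrupt-window exiting",
    3: "use TSC offsetting",
    7: "HLT exiting",
    9: "INVLPG exiting",
    10: "MWAIT exiting",
    11: "RDPMC exiting",
    12: "RDTSC exiting",
    15: "CR3-load exiting",
    16: "CR3-store exiting",
    19: "CR8-load exiting",
    20: "CR8-store exiting",
    21: "use TPR shadow",
    22: "NMI-window exiting",
    23: "MOV-DR exiting",
    24: "unconditional I/O exiting",
    25: "use I/O bitmaps",
    27: "monitor trap flag",
    28: "use MSR bitmaps",
    29: "MONITOR exiting",
    30: "PAUSE exiting",
    31: "activate secondary controls",
}

SECONDARY_CTLS = {
    0: "virtualize APIC accesses",
    1: "enable EPT",
    2: "descriptor-table exiting",
    3: "enable RDTSCP",
    4: "virtualize x2APIC mode",
    5: "enable VPID",
    6: "WBINVD exiting",
    7: "unrestricted guest",
}


def decode(value, names):
    """Return (bit, name) pairs for every set bit in a 32-bit control word."""
    return [
        (bit, names.get(bit, "reserved/unnamed"))
        for bit in range(32)
        if (value >> bit) & 1
    ]


if __name__ == "__main__":
    # Values copied from the vmcs_config dump above.
    for label, value, names in (
        ("cpu_based_exec_ctrl", 0xB6A065FA, PRIMARY_CTLS),
        ("cpu_based_2nd_exec_ctrl", 0xEB, SECONDARY_CTLS),
    ):
        print(f"{label} = {value:#x}")
        for bit, name in decode(value, names):
            print(f"  bit {bit:2d}: {name}")
```

Under those assumed bit names, the dump would indicate that secondary controls are active, with EPT, VPID, and unrestricted guest among the enabled features.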

@jasonbking
Author

Additional data points: I set breakpoints on kvm`kvm_{open,close,ioctl,devmap,segmap}. None are hit before the system locks up. I also set a breakpoint on kvm`kvm_attach; that succeeds without any issue.

@jasonbking
Author

... and it appears that the system tries to unload the kvm module during boot: a breakpoint set on kvm_detach gets hit.

I stepped over each instruction, and after (or perhaps during) the call to kvm_arch_hardware_unsetup, kmdb reports 'single-step stop on miscellaneous trap' with the pc within xc_serv. ::stack shows it was called as xc_serv(0, 0). Doing :c drops it back into xc_serv with the same message; after doing this several times, it drops back into the OS.

At this point, the system no longer locks up. Uneducated guess: is the lockup perhaps a nasty interrupt deadlock triggered by kvm_arch_hardware_unsetup?

richlowe referenced this issue in richlowe/illumos-kvm Sep 11, 2011
@rmustacc
Contributor

rmustacc commented Nov 3, 2011

We finally have a box on hand to test this against. Our investigation shows that while the kvm driver is inducing it, there is a problem much deeper in the system. Basically, the act of taking a spin lock in cross-call context can lead to the behavior you're seeing. As a workaround on a Sandy Bridge system, consider setting apix_enable=0 in /etc/system or via mdb -kd. The issue is likely in the apix module, which was taken in a not-quite-refined state when the source closed. We're going to be doing further work to determine what's going on there, but it'll be some time before we get there.

@rmustacc
Contributor

This has been resolved in illumos-joyent. See TritonDataCenter/illumos-joyent@4d86fb7 for the fix.
