Re: [Rdo-list] Nested RDO Icehouse nova-compute KVM / QEMU issues due to -cpu host

Wednesday, 2 July 2014

On Tue, Jul 01, 2014 at 04:57:02PM +1200, Steven Ellis wrote:
...
 So I'm having issues nesting RDO on my T440s laptop (Intel(R) Core(TM)
 i7-4600U CPU @ 2.10GHz), and I'm hoping someone on the list can help

 My Physical Host (L0) is Fedora 19 running 3.14.4-100.fc19.x86_64 with
 nesting turned on 
If you can, I'd strongly suggest using the latest F20 kernels (for L0 &
L1), as nested KVM issues are frequently fixed upstream, and those fixes
are available in Fedora Rawhide.

The thing with nested virtualization is the explosion of the test matrix
(different kernels + distributions on L0, L1, L2) :-(

I'm running F20 (L0) -> F20 (L1) -> F20 (L2), with current Fedora
Rawhide kernels (and -cpu host on for L1 & L2), and I don't see this
issue.

...
 My OpenStack Host is RHEL 6.5 or RHEL 7 (L1)
 My Guest is Cirros (L2) 
[. . .]

...
 The issue appears to be running with "-cpu host" with this nesting
 combination.

 Now if I run the qemu command directly on RHEL7 (L1) I get this error
    KVM: entry failed, hardware error 0x7

 Under RHEL 6.5 (L1) it is similar but not identical
     kvm: unhandled exit 7

 In both cases on my Fedora physical host (L0) I see
    nested_vmx_run: VMCS MSR_{LOAD,STORE} unsupported 
IIRC, that's because your CPU just doesn't support VMCS shadowing
(unless you're using Intel Haswell or above). I suspect the command
below returns 'N' on your host:

    $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs 
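
If you'd rather script that check, here is a minimal sketch in Python. It
assumes the kvm_intel module is loaded on L0. The helper name
`kvm_intel_params` is made up for illustration, and `nested` is the
kvm_intel parameter that controls nesting (I've added it here; it isn't
part of the command above):

```python
from pathlib import Path

# Default sysfs location of the kvm_intel module parameters on the L0 host.
PARAM_DIR = Path("/sys/module/kvm_intel/parameters")

def kvm_intel_params(names=("nested", "enable_shadow_vmcs"), base=PARAM_DIR):
    """Return {name: value}; value is None when the parameter file is absent."""
    result = {}
    for name in names:
        path = Path(base) / name
        # Each parameter is a one-line text file, typically 'Y' or 'N'.
        result[name] = path.read_text().strip() if path.is_file() else None
    return result

if __name__ == "__main__":
    for name, value in kvm_intel_params().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```

A value of None just means the file isn't there (module not loaded, or a
kernel that doesn't have that parameter).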

...
 There does appear to be a Red Hat bugzilla for RHEL7 relating to this
 but not for RHEL6
  - https://bugzilla.redhat.com/show_bug.cgi?id=1038427 
I recall that bug. Marcelo's suggestion to not use host-passthrough
(-cpu host) for L2 is reasonable for now, I guess. In my testing I
haven't seen any significant performance benefit from host-passthrough
at both levels; instead, I try to expose just the 'vmx' extension (more
on that below).

...
 I can reproduce this issue using both RHEL 6.5 and RHEL 7 as my
 OpenStack Host (L1). Has anyone else hit this issue?

 Next I tried a work around of editing the /etc/nova/nova.conf file and
 forcing the CPU type for my guests under OpenStack

 #cpu_mode=none
 cpu_mode=custom

 # Set to a named libvirt CPU model (see names listed in
 # /usr/share/libvirt/cpu_map.xml). Only has effect if
 # cpu_mode="custom" and virt_type="kvm|qemu" (string value)
 # Deprecated group;name - DEFAULT;libvirt_cpu_model
 #cpu_model=<None>
 cpu_model=Conroe 
To see if it's working (only for testing), you can enforce the CPU model
in your CirrOS guest XML and see if the guest starts with `virsh start
instance-foo`.
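
Separately, it may be worth confirming which section of nova.conf those
two options ended up in. The "Deprecated group;name" comment in your
snippet suggests that cpu_mode/cpu_model now belong to a [libvirt] group,
with the old libvirt_* names living in [DEFAULT]; that reading is my
assumption, not something I've verified against your install. A rough
sketch in Python (the function name `find_cpu_options` is hypothetical)
that reports where the options are set:

```python
import configparser

def find_cpu_options(conf_text):
    """Map each section name to the cpu_mode/cpu_model options set in it."""
    parser = configparser.ConfigParser(
        strict=False,                # tolerate repeated sections/options
        interpolation=None,          # values may contain a literal '%'
        default_section="__none__",  # treat [DEFAULT] as an ordinary section
    )
    parser.read_string(conf_text)
    found = {}
    for section in parser.sections():
        hits = {key: value for key, value in parser.items(section)
                if key in ("cpu_mode", "cpu_model")}
        if hits:
            found[section] = hits
    return found

if __name__ == "__main__":
    # Illustrative input only, not taken from a real nova.conf.
    sample = "[DEFAULT]\ncpu_mode=custom\n\n[libvirt]\ncpu_model=Conroe\n"
    print(find_cpu_options(sample))
```

To run it against the real file, pass in the contents of
/etc/nova/nova.conf. If the options show up only under DEFAULT, that
would be one possible explanation for nova not picking them up.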

...
 Problem is qemu is still run with "-cpu host,+kvmclock"

...

 So am I hitting a secondary bug with nova-compute or is there another
 way to force OpenStack to select a particular CPU subset for Nova? 
Can you try editing your L1 guest XML to ensure you expose just the
'vmx' extension, which is necessary for exposing KVM (the /dev/kvm
character device) inside your L1:

    <cpu match='exact'>
      <model>SandyBridge</model>
      <feature policy='require' name='vmx'/>
    </cpu>
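
Normally you'd make that change by hand with `virsh edit`. If you want to
script it (say, on the output of `virsh dumpxml`), something along these
lines should work. It is only a sketch: the function name `set_cpu_model`
and the tiny domain XML in the example are made up for illustration.

```python
import xml.etree.ElementTree as ET

def set_cpu_model(domain_xml, model="SandyBridge", features=("vmx",)):
    """Return domain XML whose <cpu> element requires `model` plus `features`."""
    root = ET.fromstring(domain_xml)
    # Drop any existing top-level <cpu> element (e.g. mode='host-passthrough').
    for old in root.findall("cpu"):
        root.remove(old)
    cpu = ET.SubElement(root, "cpu", {"match": "exact"})
    ET.SubElement(cpu, "model").text = model
    for name in features:
        ET.SubElement(cpu, "feature", {"policy": "require", "name": name})
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    # Illustrative domain XML, far smaller than a real guest definition.
    example = "<domain type='kvm'><name>l1-guest</name></domain>"
    print(set_cpu_model(example))
```

The result can be saved to a file and loaded back with `virsh define`.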

Alternatively, you can also try exposing the CPU element values from the
below command on your L0 & L1 and see if you can reproduce the errors:

    $ virsh capabilities | virsh cpu-baseline /dev/stdin
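
To compare what that command reports on L0 and L1, a small helper can
reduce each <cpu> element to a model name and a feature list. This is a
sketch only: `cpu_summary` is a hypothetical name, and the XML in the
example is illustrative, not real output from your machine.

```python
import xml.etree.ElementTree as ET

def cpu_summary(xml_text):
    """Return (model, sorted feature names) from the first <cpu> element."""
    root = ET.fromstring(xml_text)
    # Accept either a bare <cpu> document or one that nests <cpu> deeper.
    cpu = root if root.tag == "cpu" else root.find(".//cpu")
    if cpu is None:
        raise ValueError("no <cpu> element found")
    model = cpu.findtext("model")
    features = sorted(f.get("name") for f in cpu.findall("feature"))
    return model, features

if __name__ == "__main__":
    # Illustrative input only.
    example = (
        "<cpu mode='custom' match='exact'>"
        "<model fallback='allow'>SandyBridge</model>"
        "<feature policy='require' name='vmx'/>"
        "</cpu>"
    )
    print(cpu_summary(example))
```

Running it on the saved output from both levels and diffing the two
feature lists shows which extensions L1 is missing compared to L0.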

-- 
/kashyap
