[rdo-users] [rdo][ussuri][TripleO][nova][kvm] libvirt.libvirtError: internal error: process exited while connecting to monitor

Wed Jul 1 12:19:51 UTC 2020

Hi all!

Here we go, we are in the second part of this interesting troubleshooting!

1) I have LogTool set up. Thank you, Arkady.

2) I have used OSP to create an instance, and I have also used virsh to
create an instance.
2.1) The OSP way fails whether the instance is volume-based or
image-based. [1] and [2]
2.2) When I create it with virsh from the CLI, the VM runs: [0] [3]

Any ideas what could be wrong? Which options should I choose?
I have one network/VLAN for the whole cloud. I am doing a proof of concept of
remote booting, so I do not have br-ex set up, and I do not have br-provider.

Here are my compute [5] and controller [6] YAML files. Please help: how
should they look so that br-ex and br-int are connected? Right now br-int
is in the UNKNOWN state, and br-ex does not exist.
As I understand it, when a role in the roles data YAML has the external tag,
it should create br-ex? Or am I wrong?

[0] http://paste.openstack.org/show/Rdou7nvEWMxpGECfQHVm/  VM is running.
[1] http://paste.openstack.org/show/tp8P0NUYNFcl4E0QR9IM/ < compute logs
[2] http://paste.openstack.org/show/795431/ < controller logs
[3] http://paste.openstack.org/show/HExQgBo4MDxItAEPNaRR/
[4] http://paste.openstack.org/show/795433/ < xml file for
[5]
https://github.com/qw3r3wq/homelab/blob/master/overcloud/net-config/computeS01.yaml
[6]
https://github.com/qw3r3wq/homelab/blob/master/overcloud/net-config/controller.yaml

On Tue, 30 Jun 2020 at 16:02, Arkady Shtempler <ashtempl at redhat.com> wrote:

> Hi all!
>
> I was able to analyze the attached log files and I hope that the results
> may help you understand what's going wrong with instance creation.
> You can find *Log_Tool's unique exported Error blocks* here:
> http://paste.openstack.org/show/795356/
>
> *Some statistics and problematical messages:*
> ##### Statistics - Number of Errors/Warnings per Standard OSP log since:
> 2020-06-30 12:30:00 #####
> Total_Number_Of_Errors --> 9
> /home/ashtempl/Ruslanas/controller/neutron/server.log --> 1
> /home/ashtempl/Ruslanas/compute/stdouts/ovn_controller.log --> 1
> /home/ashtempl/Ruslanas/compute/nova/nova-compute.log --> 7
>
> *nova-compute.log*
> *default default] Error launching a defined domain with XML: <domain
> type='kvm'>*
> 368-2020-06-30 12:30:10.815 7 *ERROR* nova.compute.manager
> [req-87bef18f-ad3d-4147-a1b3-196b5b64b688 7bdb8c3bf8004f98aae1b16d938ac09b
> 69134106b56941698e58c61...
> 70dc50f] Instance *failed* to spawn: *libvirt.libvirtError*: internal
> *error*: qemu unexpectedly closed the monitor:
> 2020-06-30T10:30:10.182675Z qemu-kvm: *error*: failed to set MSR 0...
> he monitor: 2020-06-30T10:30:10.182675Z *qemu-kvm: error: failed to set
> MSR 0x48e to 0xfff9fffe04006172*
> _msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' *failed*.
>  [instance: 128f372c-cb2e-47d9-b1bf-ce17270dc50f] *Traceback* (most
> recent call last):
> 375-2020-06-30 12:30:10.815 7* ERROR* nova.compute.manager [instance:
> 128f372c-cb2e-47d9-b1bf-ce17270dc50f]   File
> "/usr/lib/python3.6/site-packages/nova/vir...
>
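To make the MSR value in the nova-compute.log excerpt above easier to read, here is a minimal sketch that splits it into its two 32-bit halves. It rests on an assumption that is not in the logs: as far as I know, index 0x48e is IA32_VMX_TRUE_PROCBASED_CTLS in the Intel SDM, where bits 31:0 mark controls that must be 1 and bits 63:32 mark controls that may be 1. The function name split_vmx_ctls is hypothetical, and this only decodes the number; it does not explain why KVM refused it.

```python
# Values copied from the qemu-kvm error line in nova-compute.log.
MSR_INDEX = 0x48E
MSR_VALUE = 0xFFF9FFFE04006172


def split_vmx_ctls(value):
    """Split a 64-bit VMX capability MSR value into two 32-bit halves.

    Assumption (Intel SDM layout, not stated in the logs):
    low half  = allowed 0-settings (a set bit means the control must be 1)
    high half = allowed 1-settings (a clear bit means the control must be 0)
    """
    allowed0 = value & 0xFFFFFFFF
    allowed1 = (value >> 32) & 0xFFFFFFFF
    return allowed0, allowed1


allowed0, allowed1 = split_vmx_ctls(MSR_VALUE)

# Control bits that are forced to 1 / forced to 0 under the assumed layout.
must_be_one = [bit for bit in range(32) if (allowed0 >> bit) & 1]
must_be_zero = [bit for bit in range(32) if not (allowed1 >> bit) & 1]

print(f"MSR {MSR_INDEX:#x}: allowed0={allowed0:#010x} allowed1={allowed1:#010x}")
print("must be 1:", must_be_one)
print("must be 0:", must_be_zero)
```

If the register really is a VMX capability MSR, the failure is about the virtualization features exposed to the guest CPU, which fits the nested virtualization question quoted further down.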
> *server.log *
> 5821c815-d213-498d-9394-fe25c6849918', 'status': 'failed', *'code': 422}
> returned with failed status*
>
> *ovn_controller.log*
> 272-2020-06-30T12:30:10.126079625+02:00 stderr F
> 2020-06-30T10:30:10Z|00247|patch|WARN|*Bridge 'br-ex' not found for
> network 'datacentre'*
>
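The ovn_controller.log warning above names both the missing bridge and the network mapped to it. A minimal sketch for pulling those pairs out of a log, assuming every such warning has the same shape as the one line quoted here (the function name missing_bridges and the sample list are hypothetical):

```python
import re

# Matches the warning format seen in the excerpt, e.g.
# ...|patch|WARN|Bridge 'br-ex' not found for network 'datacentre'
WARNING_RE = re.compile(
    r"\|WARN\|Bridge '(?P<bridge>[^']+)' not found for network '(?P<network>[^']+)'"
)


def missing_bridges(lines):
    """Return sorted unique (bridge, network) pairs reported as missing."""
    found = set()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            found.add((match.group("bridge"), match.group("network")))
    return sorted(found)


sample = [
    "2020-06-30T10:30:10Z|00247|patch|WARN|Bridge 'br-ex' not found "
    "for network 'datacentre'",
    "an unrelated line that should be ignored",
]
result = missing_bridges(sample)
print(result)
```

On the sample this reports the single pair br-ex / datacentre, which matches the fact that br-ex does not exist on the compute node.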
> Thanks!
>
>>>>>>> Are the compute nodes baremetal or virtualized? I've seen similar bug reports
>>>>>>> when using nested virtualization in other OSes.
>>>>>>>
>>>>>> Baremetal. Dell R630, to be VERY precise.
>>>>>
>>>>> Thank you, I will try. I also modified a file, and it looked like the
>>>>> podman container was relaunched once the config was changed. Either way, if I
>>>>> understand the Linux config correctly, the default value for user and group is
>>>>> root when they are commented out:
>>>>> #user = "root"
>>>>> #group = "root"
>>>>>
>>>>> Also, in some logs I saw that it detected that this is not an AMD CPU :)
>>>>> and it really is not an AMD CPU.
>>>>>
>>>>>
>>>>> Just for fun (it might be important), here is how my node info looks:
>>>>>   ComputeS01Parameters:
>>>>>     NovaReservedHostMemory: 16384
>>>>>     KernelArgs: "crashkernel=no rhgb"
>>>>>   ComputeS01ExtraConfig:
>>>>>     nova::cpu_allocation_ratio: 4.0
>>>>>     nova::compute::libvirt::rx_queue_size: 1024
>>>>>     nova::compute::libvirt::tx_queue_size: 1024
>>>>>     nova::compute::resume_guests_state_on_host_boot: true