[rdo-users] [rdo][ussuri][TripleO][nova][kvm] libvirt.libvirtError: internal error: process exited while connecting to monitor

Alfredo Moralejo Alonso amoralej at redhat.com
Thu Jul 2 15:35:07 UTC 2020


On Thu, Jul 2, 2020 at 4:38 PM Ruslanas Gžibovskis <ruslanas at lpic.lt> wrote:

> it is, i have image build failing. i can modify yaml used to create image.
> can you remind me which files it would be?
>
>
Right, from the log I see that the package is being installed from the
delorean repos, so the patch must not be working for CentOS. I guess it
needs an entry to cover the CentOS 8 case (I'm checking with the opstools
maintainer).

As a workaround, I'd suggest using the package from:

https://trunk.rdoproject.org/centos8-ussuri/component/cloudops/current-tripleo/

or, alternatively, applying a local patch to tripleo-puppet-elements.
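
For the first option, here is a minimal sketch of a dnf repo definition that
points at that URL. Only the baseurl comes from this message. The repo id
"local-cloudops", the helper name render_repo, and the choice to limit the
repo to the single package with includepkgs and to disable gpgcheck are
assumptions for illustration, so adjust them to your image build.

```python
import configparser
import io

# Base URL taken from the message above; every other value is an assumption.
BASEURL = (
    "https://trunk.rdoproject.org/centos8-ussuri/"
    "component/cloudops/current-tripleo/"
)


def render_repo(repo_id, baseurl, package):
    """Return the text of a .repo stanza limited to a single package."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg[repo_id] = {
        "name": "Local copy of the cloudops component repo (hypothetical)",
        "baseurl": baseurl,
        "enabled": "1",
        # Assumption: trunk packages are not GPG-signed.
        "gpgcheck": "0",
        # Only pull the one package that is missing from the CentOS 8 repos.
        "includepkgs": package,
    }
    buf = io.StringIO()
    # Write key=value without surrounding spaces, as .repo files usually do.
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_repo("local-cloudops", BASEURL,
                      "osops-tools-monitoring-oschecks"))
```

The printed stanza could be saved as a .repo file and made available to the
image build. Restricting the repo with includepkgs keeps it from supplying
anything other than the one missing package.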


> and your question, "how it can impact kvm":
>
> in the image most of the packages get deployed from delorean repos. I believe
> part is from centos repos and part of the packages in
> overcloud-full.qcow2 are from delorean. so it might have a slightly different
> minor version that might be incompatible... at least it has happened to me
> previously with the train release, so i used tested ci fully from the
> beginning...
> I might well be wrong.
>

Delorean repos contain only OpenStack packages (nova and so on), not KVM
or anything else shipped in the CentOS repos. KVM is always installed, and
it should come from the "Advanced Virtualization" repository. Could you
check which versions of qemu-kvm and libvirt got installed into the
overcloud-full image? They should match the versions in:

http://mirror.centos.org/centos/8/virt/x86_64/advanced-virtualization/Packages/q/

like qemu-kvm-4.2.0-19.el8.x86_64.rpm and libvirt-6.0.0-17.el8.x86_64.rpm
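
The comparison can be sketched as below, assuming you have collected the
output of something like "rpm -q qemu-kvm libvirt" from inside the image.
The helper names split_nvra and find_mismatches are hypothetical, and the
expected versions are just the two examples named above, not an
authoritative list.

```python
# Expected version-release strings, taken from the two example RPMs above.
EXPECTED = {
    "qemu-kvm": "4.2.0-19.el8",
    "libvirt": "6.0.0-17.el8",
}


def split_nvra(pkg):
    """Split 'name-version-release.arch' into (name, 'version-release')."""
    nvr, _, _arch = pkg.rpartition(".")
    name, version, release = nvr.rsplit("-", 2)
    return name, f"{version}-{release}"


def find_mismatches(installed_lines, expected):
    """Return {name: (installed, expected)} for packages that differ.

    A package that is expected but not listed is reported with
    installed=None.
    """
    installed = dict(
        split_nvra(line.strip()) for line in installed_lines if line.strip()
    )
    return {
        name: (installed.get(name), want)
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    # Sample input in the format printed by "rpm -q"; replace with real output.
    sample = [
        "qemu-kvm-4.2.0-19.el8.x86_64",
        "libvirt-6.0.0-17.el8.x86_64",
    ]
    print(find_mismatches(sample, EXPECTED) or "versions match")
```

An empty result means both packages match. Any entry in the result names a
package whose installed version differs from the expected one, which would
suggest it did not come from the Advanced Virtualization repository.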


>
> On Thu, 2 Jul 2020, 17:18 Alfredo Moralejo Alonso, <amoralej at redhat.com>
> wrote:
>
>>
>>
>> On Thu, Jul 2, 2020 at 3:59 PM Ruslanas Gžibovskis <ruslanas at lpic.lt>
>> wrote:
>>
>>> by the way in CentOS8, here is an error message I receive when searching
>>> around
>>>
>>> [stack@rdo-u ~]$ dnf list --enablerepo="*" --disablerepo
>>> "c8-media-BaseOS,c8-media-AppStream" | grep osops-tools-monitoring-oschecks
>>> Errors during downloading metadata for repository
>>> 'rdo-trunk-ussuri-tested':
>>>   - Status code: 403 for
>>> https://trunk.rdoproject.org/centos8-ussuri/current-passed-ci/repodata/repomd.xml
>>> (IP: 3.87.151.16)
>>> Error: Failed to download metadata for repo 'rdo-trunk-ussuri-tested':
>>> Cannot download repomd.xml: Cannot download repodata/repomd.xml: All
>>> mirrors were tried
>>> [stack@rdo-u ~]$
>>>
>>>
>> Yep, the rdo-trunk-ussuri-tested repo included in the release rpm is
>> disabled by default and no longer usable (I'll send a patch to retire it);
>> don't enable it.
>>
>> Sorry, I'm not sure how adding osops-tools-monitoring-oschecks may lead
>> to installing CentOS8 maintained kvm. BTW, I think that package should not
>> be required in CentOS8:
>>
>>
>> https://opendev.org/openstack/tripleo-puppet-elements/commit/2d2bc4d8b20304d0939ac0cebedac7bda3398def
>>
>>
>>
>>
>>> On Thu, 2 Jul 2020 at 15:56, Ruslanas Gžibovskis <ruslanas at lpic.lt>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have one idea, why it might be the issue.
>>>>
>>>> during the image creation step, I had missing packages:
>>>> pacemaker-remote osops-tools-monitoring-oschecks pacemaker pcs
>>>> PCS can be found in the HA repo, so I enabled it, but
>>>> "osops-tools-monitoring-oschecks" is ONLY in delorean for CentOS8...
>>>>
>>>> I believe that is a case...
>>>> so it installed non CentOS8 maintained kvm or some dependent
>>>> packages....
>>>>
>>>> How can I get  osops-tools-monitoring-oschecks from centos repos? it is
>>>> last seen in CentOS7 repos....
>>>>
>>>> $ yum list --enablerepo=* --disablerepo "c7-media" | grep
>>>> osops-tools-monitoring-oschecks -A2
>>>> osops-tools-monitoring-oschecks.noarch
>>>> 0.0.1-0.20191202171903.bafe3f0.el7
>>>>
>>>> rdo-trunk-train-tested
>>>> ostree-debuginfo.x86_64                    2019.1-2.el7
>>>> base-debuginfo
>>>> (undercloud) [stack@ironic-poc ~]$
>>>>
>>>> can I somehow not include that package in image creation? OR if it is
>>>> essential, can I create a different repo for that one?
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, 1 Jul 2020 at 14:19, Ruslanas Gžibovskis <ruslanas at lpic.lt>
>>>> wrote:
>>>>
>>>>> Hi all!
>>>>>
>>>>> Here we go, we are in the second part of this interesting
>>>>> troubleshooting!
>>>>>
>>>>> 1) I have LogTool set up. Thank you, Arkady.
>>>>>
>>>>> 2) I have used OSP to create an instance, and I have used virsh to
>>>>> create an instance.
>>>>> 2.1) The OSP way fails whether the instance is volume-based or
>>>>> image-based. [1] and [2]
>>>>> 2.2) when I create it using CLI: [0] [3]
>>>>>
>>>>> any ideas what can be wrong? What options I should choose?
>>>>> I have one network/vlan for whole cloud. I am doing proof of concept
>>>>> of remote booting, so I do not have br-ex setup. and I do not have
>>>>> br-provider.
>>>>>
>>>>> There is my compute[5] and controller[6] yaml files, Please help, how
>>>>> it should look like so it would have br-ex and br-int connected? as
>>>>> br-int now is in UNKNOWN state. And br-ex do not exist.
>>>>> As I understand, in roles data yaml, when we have tag external it
>>>>> should create br-ex? or am I wrong?
>>>>>
>>>>> [0] http://paste.openstack.org/show/Rdou7nvEWMxpGECfQHVm/  VM is
>>>>> running.
>>>>> [1] http://paste.openstack.org/show/tp8P0NUYNFcl4E0QR9IM/ < compute
>>>>> logs
>>>>> [2] http://paste.openstack.org/show/795431/ < controller logs
>>>>> [3] http://paste.openstack.org/show/HExQgBo4MDxItAEPNaRR/
>>>>> [4] http://paste.openstack.org/show/795433/ < xml file for
>>>>> [5]
>>>>> https://github.com/qw3r3wq/homelab/blob/master/overcloud/net-config/computeS01.yaml
>>>>> [6]
>>>>> https://github.com/qw3r3wq/homelab/blob/master/overcloud/net-config/controller.yaml
>>>>>
>>>>>
>>>>> On Tue, 30 Jun 2020 at 16:02, Arkady Shtempler <ashtempl at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all!
>>>>>>
>>>>>> I was able to analyze the attached log files and I hope that the
>>>>>> results may help you understand what's going wrong with instance creation.
>>>>>> You can find *Log_Tool's unique exported Error blocks* here:
>>>>>> http://paste.openstack.org/show/795356/
>>>>>>
>>>>>> *Some statistics and problematical messages:*
>>>>>> ##### Statistics - Number of Errors/Warnings per Standard OSP log
>>>>>> since: 2020-06-30 12:30:00 #####
>>>>>> Total_Number_Of_Errors --> 9
>>>>>> /home/ashtempl/Ruslanas/controller/neutron/server.log --> 1
>>>>>> /home/ashtempl/Ruslanas/compute/stdouts/ovn_controller.log --> 1
>>>>>> /home/ashtempl/Ruslanas/compute/nova/nova-compute.log --> 7
>>>>>>
>>>>>> *nova-compute.log*
>>>>>> *default default] Error launching a defined domain with XML: <domain
>>>>>> type='kvm'>*
>>>>>> 368-2020-06-30 12:30:10.815 7 *ERROR* nova.compute.manager
>>>>>> [req-87bef18f-ad3d-4147-a1b3-196b5b64b688 7bdb8c3bf8004f98aae1b16d938ac09b
>>>>>> 69134106b56941698e58c61...
>>>>>> 70dc50f] Instance *failed* to spawn: *libvirt.libvirtError*:
>>>>>> internal *error*: qemu unexpectedly closed the monitor:
>>>>>> 2020-06-30T10:30:10.182675Z qemu-kvm: *error*: failed to set MSR 0...
>>>>>> he monitor: 2020-06-30T10:30:10.182675Z *qemu-kvm: error: failed to
>>>>>> set MSR 0x48e to 0xfff9fffe04006172*
>>>>>> _msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' *failed*.
>>>>>>  [instance: 128f372c-cb2e-47d9-b1bf-ce17270dc50f] *Traceback* (most
>>>>>> recent call last):
>>>>>> 375-2020-06-30 12:30:10.815 7* ERROR* nova.compute.manager
>>>>>> [instance: 128f372c-cb2e-47d9-b1bf-ce17270dc50f]   File
>>>>>> "/usr/lib/python3.6/site-packages/nova/vir...
>>>>>>
>>>>>> *server.log *
>>>>>> 5821c815-d213-498d-9394-fe25c6849918', 'status': 'failed', *'code':
>>>>>> 422} returned with failed status*
>>>>>>
>>>>>> *ovn_controller.log*
>>>>>> 272-2020-06-30T12:30:10.126079625+02:00 stderr F
>>>>>> 2020-06-30T10:30:10Z|00247|patch|WARN|*Bridge 'br-ex' not found for
>>>>>> network 'datacentre'*
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>>>>>>> Compute nodes are baremetal or virtualized?, I've seen similar bug
>>>>>>>>>>>> reports when using nested virtualization in other OSes.
>>>>>>>>>>>>
>>>>>>>>>>> baremetal. Dell R630 if to be VERY precise.
>>>>>>>>>>
>>>>>>>>>> Thank you, I will try. I also modified a file, and it looked like
>>>>>>>>>> it relaunched podman container once config was changed. Either way, if I
>>>>>>>>>> understand Linux config correctly, the default value for user and group is
>>>>>>>>>> root, if commented out:
>>>>>>>>>> #user = "root"
>>>>>>>>>> #group = "root"
>>>>>>>>>>
>>>>>>>>>> also in some logs, I saw, that it detected, that it is not AMD
>>>>>>>>>> CPU :) and it is really not AMD CPU.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just for fun, it might be important, here is how my node info
>>>>>>>>>> looks.
>>>>>>>>>>   ComputeS01Parameters:
>>>>>>>>>>     NovaReservedHostMemory: 16384
>>>>>>>>>>     KernelArgs: "crashkernel=no rhgb"
>>>>>>>>>>   ComputeS01ExtraConfig:
>>>>>>>>>>     nova::cpu_allocation_ratio: 4.0
>>>>>>>>>>     nova::compute::libvirt::rx_queue_size: 1024
>>>>>>>>>>     nova::compute::libvirt::tx_queue_size: 1024
>>>>>>>>>>     nova::compute::resume_guests_state_on_host_boot: true
>>>>>>>>>>
>>>>>>>>>>
>>>>>
>>>>
>>>> --
>>>> Ruslanas Gžibovskis
>>>> +370 6030 7030
>>>>
>>>
>>>
>>> --
>>> Ruslanas Gžibovskis
>>> +370 6030 7030
>>> _______________________________________________
>>> users mailing list
>>> users at lists.rdoproject.org
>>> http://lists.rdoproject.org/mailman/listinfo/users
>>>
>>> To unsubscribe: users-unsubscribe at lists.rdoproject.org
>>>
>>