Ok, I updated puppet-tripleo as suggested with virt-customize and
verified that the patch was indeed updated (using guestmount).
I am now redeploying.
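
For the record, the check was along these lines (image name and mount
point are illustrative):

  mkdir -p /mnt/overcloud
  guestmount -a overcloud-full.qcow2 -i --ro /mnt/overcloud
  grep mysql_short_node_names \
    /mnt/overcloud/usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
  guestunmount /mnt/overcloud
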
Charles
On 12/10/2016 10:56, Marius Cornea wrote:
I see that in that repo puppet-tripleo got updated on Friday,
2016-10-07. If the repo file is present inside the image you could
update puppet-tripleo with virt-customize, then upload the image to
the undercloud Glance with 'openstack overcloud image upload
--update-existing' and redeploy.
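
A minimal sketch of the above, assuming the image is overcloud-full.qcow2
in the current directory and is the one already registered in Glance:

  virt-customize -a overcloud-full.qcow2 --run-command 'yum -y update puppet-tripleo'
  openstack overcloud image upload --update-existing
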
On Wed, Oct 12, 2016 at 11:45 AM, Charles Short <cems(a)ebi.ac.uk> wrote:
> Just checked what I did to build the image previously.
>
> I used
>
> export DELOREAN_TRUNK_REPO="http://trunk.rdoproject.org/centos7-newton/curr...
>
> Which is the same repo I used to install the Undercloud.
> Maybe I need to do a yum update in the image with libguestfs-tools prior to
> building the image?
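>
> i.e. something like this (untested; --selinux-relabel because the
> image is SELinux-enabled):
>
>   virt-customize -a overcloud-full.qcow2 --update --selinux-relabel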
>
> I will rebuild again anyway in case I made an error
>
> C
>
>
>
> On 12/10/2016 10:28, Charles Short wrote:
>> Hi,
>>
>> Ok, I will rebuild using the undercloud repo and report back.
>>
>> Thanks for your help
>>
>> Charles
>>
>> On 12/10/2016 09:44, Marius Cornea wrote:
>>> Oh, that explains it. It looks like the overcloud image doesn't
>>> contain the patch.
>>>
>>> I'm not familiar with the image build process, but according to the
>>> docs[1] I think the packages get installed from the repo specified by
>>> export DELOREAN_TRUNK_REPO, so maybe you should try to rebuild the
>>> image using the same repo as the one set on the undercloud.
>>>
>>> [1]
>>> http://docs.openstack.org/developer/tripleo-docs/basic_deployment/basic_d...
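>>>
>>> From memory, the build boils down to something like this (see [1]
>>> for the exact steps; the repo should match whatever the undercloud
>>> uses):
>>>
>>>   export USE_DELOREAN_TRUNK=1
>>>   export DELOREAN_TRUNK_REPO="http://trunk.rdoproject.org/centos7-newton/current/"
>>>   export DELOREAN_REPO_FILE="delorean.repo"
>>>   openstack overcloud image build --all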
>>>
>>> On Wed, Oct 12, 2016 at 10:30 AM, Charles Short <cems(a)ebi.ac.uk> wrote:
>>>> Hi,
>>>>
>>>> We noticed something that may be the reason for this patch not
>>>> working, which may be related to the way the image was built on the
>>>> undercloud:
>>>>
>>>> This is the difference between the puppet files on the undercloud
>>>> and the puppet files in the image:
>>>>
>>>> [stack@hh-extcl05-undercloud ~]$ grep mysql_short_node_names /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>> $galera_node_names_lookup = hiera('mysql_short_node_names',
>>>> hiera('mysql_node_names', $::hostname))
>>>>
>>>> [root@overcloud-controller-1 puppet]# grep mysql_short_node_names /etc/puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>> (nothing found)
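>>>>
>>>> The image itself can also be checked without booting it, e.g.
>>>> something like:
>>>>
>>>>   virt-cat -a overcloud-full.qcow2 \
>>>>     /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp \
>>>>     | grep mysql_short_node_names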
>>>>
>>>>
>>>> On 12/10/2016 09:09, Marius Cornea wrote:
>>>>> That's odd. I encountered the same issue and it was caused by
>>>>> missing this patch. What do you get if you do 'sudo hiera
>>>>> mysql_short_node_names' on the controller node?
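>>>>>
>>>>> With the patch in place I'd expect something along these lines
>>>>> (node names taken from your pcs output):
>>>>>
>>>>>   $ sudo hiera mysql_short_node_names
>>>>>   ["overcloud-controller-0", "overcloud-controller-1", "overcloud-controller-2"]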
>>>>>
>>>>> On Wed, Oct 12, 2016 at 12:48 AM, Charles Short <cems(a)ebi.ac.uk> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> puppet-tripleo-5.2.0-0.20161007035759.f32e484.el7.centos.noarch
>>>>>>
>>>>>> The patch seems to be already present -
>>>>>>
>>>>>> grep short /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>>>>
>>>>>> # short name which is already registered in pacemaker until we get around
>>>>>> $galera_node_names_lookup = hiera('mysql_short_node_names',
>>>>>> hiera('mysql_node_names', $::hostname))
>>>>>>
>>>>>> Charles
>>>>>>
>>>>>> On 11/10/2016 19:39, Marius Cornea wrote:
>>>>>>> I think the issue is caused by the addresses in
>>>>>>> wsrep_cluster_address not matching the pacemaker node names:
>>>>>>>
>>>>>>> wsrep_cluster_address=gcomm://overcloud-controller-0.internalapi.localdomain,overcloud-controller-1.internalapi.localdomain,overcloud-controller-2.internalapi.localdomain
>>>>>>>
>>>>>>> Could you please confirm what version of puppet-tripleo you've
>>>>>>> got installed on the overcloud nodes and if it contains the
>>>>>>> following patch:
>>>>>>>
>>>>>>> https://review.openstack.org/#/c/382883/1/manifests/profile/pacemaker/dat...
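>>>>>>>
>>>>>>> A quick way to compare the two on a controller (the galera.cnf
>>>>>>> path is from memory):
>>>>>>>
>>>>>>>   sudo rpm -q puppet-tripleo
>>>>>>>   sudo crm_node -n   # the name pacemaker knows the node by
>>>>>>>   sudo grep wsrep_cluster_address /etc/my.cnf.d/galera.cnf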
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Marius
>>>>>>>
>>>>>>> On Tue, Oct 11, 2016 at 7:42 PM, Charles Short <cems(a)ebi.ac.uk> wrote:
>>>>>>>> Ok, the install finished with the same error.
>>>>>>>> The latest pcs status etc.:
>>>>>>>>
>>>>>>>> http://pastebin.com/ZK683gZe
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/10/2016 17:35, Charles Short wrote:
>>>>>>>>> Deployment almost finished... so:
>>>>>>>>>
>>>>>>>>> http://pastebin.com/zE9B19XB
>>>>>>>>>
>>>>>>>>> This shows the pcs status as the deployment nears the end, and
>>>>>>>>> 'pcs resource show galera'.
>>>>>>>>>
>>>>>>>>> Charles
>>>>>>>>>
>>>>>>>>> On 11/10/2016 16:59, Marius Cornea wrote:
>>>>>>>>>> Great, thanks for checking this.
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 11, 2016 at 5:58 PM, Charles Short <cems(a)ebi.ac.uk> wrote:
>>>>>>>>>>> Currently having more generic deployment issues (no valid
>>>>>>>>>>> host found etc.). I can work around/solve these.
>>>>>>>>>>> I don't yet have another stack to analyse, but will do soon.
>>>>>>>>>>>
>>>>>>>>>>> Charles
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11/10/2016 16:35, Marius Cornea wrote:
>>>>>>>>>>>> Did it succeed in bringing the Galera nodes to Master? You
>>>>>>>>>>>> can ssh to the nodes and run 'pcs resource show galera' even
>>>>>>>>>>>> though the deployment hasn't finished. I'm interested to see
>>>>>>>>>>>> how the wsrep_cluster_address is set, to see if it's affected
>>>>>>>>>>>> by the resource agent issue described in
>>>>>>>>>>>> https://bugs.launchpad.net/tripleo/+bug/1628521
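>>>>>>>>>>>>
>>>>>>>>>>>> e.g. from the undercloud ('heat-admin' is the default image
>>>>>>>>>>>> user; the IP here is illustrative, get it from 'nova list'):
>>>>>>>>>>>>
>>>>>>>>>>>>   ssh heat-admin@192.0.2.10
>>>>>>>>>>>>   sudo pcs resource show galera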
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 11, 2016 at 5:18 PM, Charles Short <cems(a)ebi.ac.uk> wrote:
>>>>>>>>>>>>> Looks similar to this bug (still waiting on deployment to
>>>>>>>>>>>>> finish):
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1368214
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/10/2016 15:25, Charles Short wrote:
>>>>>>>>>>>>>> Sorry for the delay.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just redeploying to make sure I can repeat the same error.
>>>>>>>>>>>>>> Should not be long.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/10/2016 14:24, Marius Cornea wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Could you also please paste the output of 'pcs resource
>>>>>>>>>>>>>>> show galera'? It looks like all the Galera nodes show up
>>>>>>>>>>>>>>> as slaves:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Master/Slave Set: galera-master [galera]
>>>>>>>>>>>>>>>     Slaves: [ overcloud-controller-0 overcloud-controller-1
>>>>>>>>>>>>>>>       overcloud-controller-2 ]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 2:16 PM, Charles Short <cems(a)ebi.ac.uk> wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here you are -
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Heat stack error - http://pastebin.com/E8KZa2vE
>>>>>>>>>>>>>>>> - PCS status - http://pastebin.com/z34gSLq6
>>>>>>>>>>>>>>>> - mariadb.log - http://pastebin.com/APFXPBLc
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 11/10/2016 12:07, Marius Cornea wrote:
>>>>>>>>>>>>>>>>> Hi Charles,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Could you please paste the output of 'pcs status'? The
>>>>>>>>>>>>>>>>> log in /var/log/mariadb/mariadb.log might also be a good
>>>>>>>>>>>>>>>>> indicator.
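>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> i.e. on one of the controllers, something like:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   sudo pcs status
>>>>>>>>>>>>>>>>>   sudo tail -n 200 /var/log/mariadb/mariadb.log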
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 11:16 AM, Charles Short <cems(a)ebi.ac.uk> wrote:
>>>>>>>>>>>>>>>>>> To add, I built my own image from
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> as the images in
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/d...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> caused sporadic ramdisk loading errors (hung at x% loaded on boot).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Does my image now need to be customised in any way for
>>>>>>>>>>>>>>>>>> HA to work?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 11/10/2016 09:55, Charles Short wrote:
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am installing Newton with TripleO on baremetal HP
>>>>>>>>>>>>>>>>>>> blades. I can deploy a single-controller overcloud
>>>>>>>>>>>>>>>>>>> stack no problem; however, when I choose three
>>>>>>>>>>>>>>>>>>> controllers the deployment fails (including
>>>>>>>>>>>>>>>>>>> /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml).
>>>>>>>>>>>>>>>>>>>
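>>>>>>>>>>>>>>>>>>> For reference, a three-controller deploy with that
>>>>>>>>>>>>>>>>>>> environment looks something like this (flags other than
>>>>>>>>>>>>>>>>>>> the pacemaker environment file are illustrative):
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> openstack overcloud deploy --templates \
>>>>>>>>>>>>>>>>>>>   -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
>>>>>>>>>>>>>>>>>>>   --control-scale 3 --compute-scale 1
>>>>>>>>>>>>>>>>>>>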
>>>>>>>>>>>>>>>>>>> The heat stack error first complains "Dependency
>>>>>>>>>>>>>>>>>>> Exec[galera-ready] has failures", which in turn causes
>>>>>>>>>>>>>>>>>>> lots of other errors.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have deployed Liberty and Mitaka successfully in the
>>>>>>>>>>>>>>>>>>> past on baremetal with three controllers, and this is
>>>>>>>>>>>>>>>>>>> the first time I have seen this error.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>> rdo-list mailing list
>>>>>>>>>>>>>>>>>> rdo-list(a)redhat.com
>>>>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> To unsubscribe: rdo-list-unsubscribe(a)redhat.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Charles Short
>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>
>>>>>>>> --
>>>>>>>> Charles Short
>>>>>>>> Cloud Engineer
>>>>>>>> Virtualization and Cloud Team
>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>
>>>>>> --
>>>>>> Charles Short
>>>>>> Cloud Engineer
>>>>>> Virtualization and Cloud Team
>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>> Tel: +44 (0)1223 494205
>>>>>>
>>>> --
>>>> Charles Short
>>>> Cloud Engineer
>>>> Virtualization and Cloud Team
>>>> European Bioinformatics Institute (EMBL-EBI)
>>>> Tel: +44 (0)1223 494205
>>>>
> --
> Charles Short
> Cloud Engineer
> Virtualization and Cloud Team
> European Bioinformatics Institute (EMBL-EBI)
> Tel: +44 (0)1223 494205
>
--
Charles Short
Cloud Engineer
Virtualization and Cloud Team
European Bioinformatics Institute (EMBL-EBI)
Tel: +44 (0)1223 494205