[rdo-list] Newton HA galera-ready dependency error

Marius Cornea marius at remote-lab.net
Wed Oct 12 09:56:13 UTC 2016


I see that in that repo puppet-tripleo got updated on Friday,
2016-10-07. If the repo file is present inside the image you could
update puppet-tripleo with virt-customize, then upload the image to
the undercloud Glance with 'openstack overcloud image upload
--update-existing' and redeploy.
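
Something along these lines should do it (untested, so adjust the
image name and path for your environment):

  virt-customize -a overcloud-full.qcow2 \
      --run-command 'yum -y update puppet-tripleo' \
      --selinux-relabel
  openstack overcloud image upload --update-existing \
      --image-path /home/stack/images

The --selinux-relabel is there because modifying files inside the
image can leave them with wrong SELinux labels.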

On Wed, Oct 12, 2016 at 11:45 AM, Charles Short <cems at ebi.ac.uk> wrote:
> Just checked what I did to build the image previously.
>
> I used
>  export
> DELOREAN_TRUNK_REPO="http://trunk.rdoproject.org/centos7-newton/current/"
>
> Which is the same repo I used to install the Undercloud.
> Maybe I need to do a yum update in the image with libguestfs-tools
> prior to building the image?
>
> I will rebuild again anyway in case I made an error.
>
> C
>
>
>
> On 12/10/2016 10:28, Charles Short wrote:
>>
>> Hi,
>>
>> Ok, I will rebuild using the undercloud repo and report back.
>>
>> Thanks for your help
>>
>> Charles
>>
>> On 12/10/2016 09:44, Marius Cornea wrote:
>>>
>>> Oh, that explains it. It looks like the overcloud image doesn't
>>> contain the patch.
>>>
>>> I'm not familiar with the image build process, but according to the
>>> docs[1] I think the packages get installed from the repo specified by
>>> export DELOREAN_TRUNK_REPO, so maybe you should try to rebuild the
>>> image using the same repo as the one set on the undercloud.
>>>
>>> [1]
>>> http://docs.openstack.org/developer/tripleo-docs/basic_deployment/basic_deployment_cli.html
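>>>
>>> From memory the relevant step in [1] looks roughly like this (the
>>> exact variable names may have changed, so double-check the doc):
>>>
>>>   export DELOREAN_TRUNK_REPO="http://trunk.rdoproject.org/centos7-newton/current/"
>>>   export DELOREAN_REPO_FILE="delorean.repo"
>>>   openstack overcloud image build
>>>
>>> so whichever repo DELOREAN_TRUNK_REPO points at when the image is
>>> built is what ends up baked into overcloud-full.qcow2.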
>>>
>>> On Wed, Oct 12, 2016 at 10:30 AM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>
>>>> Hi,
>>>>
>>>> We noticed something that may be the reason this patch is not taking
>>>> effect, which may be related to the way the Undercloud built the image.
>>>>
>>>> This is the difference between the puppet files on the undercloud and
>>>> the puppet files in the image:
>>>>
>>>> [stack at hh-extcl05-undercloud ~]$ grep mysql_short_node_names
>>>>
>>>> /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>>
>>>> $galera_node_names_lookup = hiera('mysql_short_node_names',
>>>> hiera('mysql_node_names', $::hostname))
>>>>
>>>> [root at overcloud-controller-1 puppet]# grep mysql_short_node_names
>>>>
>>>> /etc/puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>>
>>>> (nothing found)
>>>>
>>>>
>>>> On 12/10/2016 09:09, Marius Cornea wrote:
>>>>>
>>>>> That's odd. I encountered the same issue and it was caused by this
>>>>> patch being missing. What do you get if you run 'sudo hiera
>>>>> mysql_short_node_names' on the controller node?
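>>>>>
>>>>> On a healthy HA deployment I'd expect something along the lines of
>>>>> (output format from memory):
>>>>>
>>>>>   $ sudo hiera mysql_short_node_names
>>>>>   ["overcloud-controller-0", "overcloud-controller-1", "overcloud-controller-2"]
>>>>>
>>>>> i.e. the short hostnames matching the pacemaker node names; 'nil'
>>>>> would mean the key isn't being set at all.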
>>>>>
>>>>> On Wed, Oct 12, 2016 at 12:48 AM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> puppet-tripleo-5.2.0-0.20161007035759.f32e484.el7.centos.noarch
>>>>>>
>>>>>> The patch seems to be already present -
>>>>>>
>>>>>>     grep short
>>>>>>
>>>>>>
>>>>>> /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>>>>
>>>>>>     # short name which is already registered in pacemaker until we get
>>>>>> around
>>>>>>     $galera_node_names_lookup = hiera('mysql_short_node_names',
>>>>>> hiera('mysql_node_names', $::hostname))
>>>>>>
>>>>>> Charles
>>>>>>
>>>>>> On 11/10/2016 19:39, Marius Cornea wrote:
>>>>>>>
>>>>>>> I think the issue is caused by the addresses in wsrep_cluster_address
>>>>>>> not matching the pacemaker node names:
>>>>>>>
>>>>>>> wsrep_cluster_address=gcomm://overcloud-controller-0.internalapi.localdomain,overcloud-controller-1.internalapi.localdomain,overcloud-controller-2.internalapi.localdomain
>>>>>>>
>>>>>>> Could you please confirm what version of puppet-tripleo you've got
>>>>>>> installed on the overcloud nodes and if it contains the following
>>>>>>> patch:
>>>>>>>
>>>>>>>
>>>>>>> https://review.openstack.org/#/c/382883/1/manifests/profile/pacemaker/database/mysql.pp
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Marius
>>>>>>>
>>>>>>> On Tue, Oct 11, 2016 at 7:42 PM, Charles Short <cems at ebi.ac.uk>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Ok, the install finished with the same error.
>>>>>>>> The latest pcs status etc.:
>>>>>>>>
>>>>>>>> http://pastebin.com/ZK683gZe
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/10/2016 17:35, Charles Short wrote:
>>>>>>>>>
>>>>>>>>> Deployment almost finished... so:
>>>>>>>>>
>>>>>>>>> http://pastebin.com/zE9B19XB
>>>>>>>>>
>>>>>>>>> This shows the pcs status as the deployment nears the end, and pcs
>>>>>>>>> resource show galera
>>>>>>>>>
>>>>>>>>> Charles
>>>>>>>>>
>>>>>>>>> On 11/10/2016 16:59, Marius Cornea wrote:
>>>>>>>>>>
>>>>>>>>>> Great, thanks for checking this.
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 11, 2016 at 5:58 PM, Charles Short <cems at ebi.ac.uk>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Currently having more generic deployment issues (no valid host
>>>>>>>>>>> found, etc.). I can work around/solve these.
>>>>>>>>>>> I don't yet have another stack to analyse, but will do soon.
>>>>>>>>>>>
>>>>>>>>>>> Charles
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11/10/2016 16:35, Marius Cornea wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Did it succeed in bringing the Galera nodes to Master? You can ssh to
>>>>>>>>>>>> the nodes and run 'pcs resource show galera' even though the
>>>>>>>>>>>> deployment hasn't finished. I'm interested in how
>>>>>>>>>>>> wsrep_cluster_address is set, to see if it's affected by the resource
>>>>>>>>>>>> agent issue described in
>>>>>>>>>>>> https://bugs.launchpad.net/tripleo/+bug/1628521
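>>>>>>>>>>>>
>>>>>>>>>>>> For example, on one of the controllers:
>>>>>>>>>>>>
>>>>>>>>>>>>   sudo pcs resource show galera
>>>>>>>>>>>>
>>>>>>>>>>>> wsrep_cluster_address should show up in that output; if the gcomm://
>>>>>>>>>>>> list holds the internalapi FQDNs rather than the short pacemaker node
>>>>>>>>>>>> names, it's likely the issue from the bug above.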
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 11, 2016 at 5:18 PM, Charles Short <cems at ebi.ac.uk>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looks similar to this bug (still waiting on deployment to
>>>>>>>>>>>>> finish)
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1368214
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/10/2016 15:25, Charles Short wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sorry for the delay.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just redeploying to make sure I can repeat the same error. Should
>>>>>>>>>>>>>> not be long.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/10/2016 14:24, Marius Cornea wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Could you also please paste the output of 'pcs resource show
>>>>>>>>>>>>>>> galera'? It looks like all the galera nodes show up as slaves:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>        Master/Slave Set: galera-master [galera]
>>>>>>>>>>>>>>>            Slaves: [ overcloud-controller-0
>>>>>>>>>>>>>>> overcloud-controller-1
>>>>>>>>>>>>>>> overcloud-controller-2 ]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 2:16 PM, Charles Short
>>>>>>>>>>>>>>> <cems at ebi.ac.uk>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here you are -
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>        - Heat stack error  - http://pastebin.com/E8KZa2vE
>>>>>>>>>>>>>>>>        - PCS status  - http://pastebin.com/z34gSLq6
>>>>>>>>>>>>>>>>        - mariadb.log - http://pastebin.com/APFXPBLc
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 11/10/2016 12:07, Marius Cornea wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Charles,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Could you please paste the output of 'pcs status'? The log in
>>>>>>>>>>>>>>>>> /var/log/mariadb/mariadb.log might also be a good indicator.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 11:16 AM, Charles Short
>>>>>>>>>>>>>>>>> <cems at ebi.ac.uk>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> To add, I built my own image from
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> as the images in
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/delorean/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> caused sporadic ramdisk loading errors (hung at x% loaded on boot).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Does my image now need to be customised in any way for HA to work?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 11/10/2016 09:55, Charles Short wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am installing Newton with TripleO on baremetal HP blades.
>>>>>>>>>>>>>>>>>>> I can deploy a single-controller overcloud stack no problem; however,
>>>>>>>>>>>>>>>>>>> when I choose three controllers the deployment fails (including
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The heat stack error first complains "Dependency Exec[galera-ready]
>>>>>>>>>>>>>>>>>>> has failures", which in turn causes lots of other errors.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have deployed Liberty and Mitaka successfully in the
>>>>>>>>>>>>>>>>>>> past
>>>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>> baremetal
>>>>>>>>>>>>>>>>>>> with three controllers, and this is the first time I have
>>>>>>>>>>>>>>>>>>> seen
>>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Charles Short
>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>
>>>>>>>> --
>>>>>>>> Charles Short
>>>>>>>> Cloud Engineer
>>>>>>>> Virtualization and Cloud Team
>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>
>>>>>> --
>>>>>> Charles Short
>>>>>> Cloud Engineer
>>>>>> Virtualization and Cloud Team
>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>> Tel: +44 (0)1223 494205
>>>>>>
>>>> --
>>>> Charles Short
>>>> Cloud Engineer
>>>> Virtualization and Cloud Team
>>>> European Bioinformatics Institute (EMBL-EBI)
>>>> Tel: +44 (0)1223 494205
>>>>
>>
>
> --
> Charles Short
> Cloud Engineer
> Virtualization and Cloud Team
> European Bioinformatics Institute (EMBL-EBI)
> Tel: +44 (0)1223 494205
>



