[rdo-list] Newton HA galera-ready dependency error
Marius Cornea
marius at remote-lab.net
Wed Oct 12 08:44:50 UTC 2016
Oh, that explains it. It looks like the overcloud image doesn't
contain the patch.
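
For anyone skimming the thread: the patch matters because the host names in
wsrep_cluster_address did not match the pacemaker node names (see the quoted
messages below). Here is a minimal Python sketch of that comparison, using the
values quoted in this thread. The helper names cluster_hosts and
unmatched_hosts are hypothetical, not part of any TripleO or Galera tooling,
and the sketch assumes the address carries no :port suffixes.

```python
def cluster_hosts(wsrep_cluster_address):
    """Return the host names listed in a gcomm:// cluster address."""
    prefix = "gcomm://"
    if not wsrep_cluster_address.startswith(prefix):
        raise ValueError("expected a gcomm:// address")
    hosts = wsrep_cluster_address[len(prefix):].split(",")
    # Ignore empty entries and surrounding whitespace.
    return [h.strip() for h in hosts if h.strip()]


def unmatched_hosts(wsrep_cluster_address, pacemaker_nodes):
    """Hosts in the cluster address that are not pacemaker node names."""
    known = set(pacemaker_nodes)
    return [h for h in cluster_hosts(wsrep_cluster_address) if h not in known]


# Values copied from the quoted messages in this thread.
address = ("gcomm://overcloud-controller-0.internalapi.localdomain,"
           "overcloud-controller-1.internalapi.localdomain,"
           "overcloud-controller-2.internalapi.localdomain")
nodes = ["overcloud-controller-0",
         "overcloud-controller-1",
         "overcloud-controller-2"]

# With the FQDNs above, every entry is reported as unmatched.
mismatches = unmatched_hosts(address, nodes)
print(mismatches)
```

An empty list would mean that every address entry is also a pacemaker node
name, which is what the short-name lookup in the patch is meant to achieve.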
I'm not familiar with the image build process, but according to the
docs [1] I think the packages get installed from the repo specified by
export DELOREAN_TRUNK_REPO, so maybe you should try to rebuild the
image and use the same repo as the one set on the undercloud.
[1] http://docs.openstack.org/developer/tripleo-docs/basic_deployment/basic_deployment_cli.html
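
To check whether a rebuilt image carries the patch, one option is to look for
the mysql_short_node_names lookup in the manifest, which is what the grep in
the quoted message does. This is a rough Python sketch. The names
has_short_name_lookup and galera_node_names are hypothetical, and
galera_node_names only mimics the fallback order of the nested hiera() call
with a plain dict; it does not talk to hiera.

```python
import re

# The patched manifest contains a lookup of this key (quoted in the thread).
PATCHED_LOOKUP = re.compile(r"hiera\(\s*'mysql_short_node_names'")


def has_short_name_lookup(manifest_text):
    """True if the manifest text contains the mysql_short_node_names lookup."""
    return PATCHED_LOOKUP.search(manifest_text) is not None


def galera_node_names(hiera_data, hostname):
    """Mimic the nested lookup's fallback order with a plain dict.

    Order: mysql_short_node_names, then mysql_node_names, then the hostname.
    """
    fallback = hiera_data.get("mysql_node_names", hostname)
    return hiera_data.get("mysql_short_node_names", fallback)


# The manifest line as quoted from the undercloud copy of mysql.pp.
patched = ("$galera_node_names_lookup = hiera('mysql_short_node_names',\n"
           "hiera('mysql_node_names', $::hostname))")

print(has_short_name_lookup(patched))
```

Running the same check against the copy of mysql.pp inside the overcloud image
should show whether the image was built from a repo that includes the patch.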
On Wed, Oct 12, 2016 at 10:30 AM, Charles Short <cems at ebi.ac.uk> wrote:
> Hi,
>
> We noticed something that may be the reason this patch is not working, which
> may be related to the way the undercloud built the image.
>
> This is the difference between the puppet files on the undercloud and
> the puppet files in the image:
>
> [stack@hh-extcl05-undercloud ~]$ grep mysql_short_node_names /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>
> $galera_node_names_lookup = hiera('mysql_short_node_names',
> hiera('mysql_node_names', $::hostname))
>
> [root@overcloud-controller-1 puppet]# grep mysql_short_node_names /etc/puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>
> (nothing found)
>
>
> On 12/10/2016 09:09, Marius Cornea wrote:
>>
>> That's odd. I encountered the same issue and it was caused by this
>> patch being missing. What do you get if you run sudo hiera
>> mysql_short_node_names on the controller node?
>>
>> On Wed, Oct 12, 2016 at 12:48 AM, Charles Short <cems at ebi.ac.uk> wrote:
>>>
>>> Hi,
>>>
>>> puppet-tripleo-5.2.0-0.20161007035759.f32e484.el7.centos.noarch
>>>
>>> The patch seems to be already present -
>>>
>>> grep short /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>>>
>>> # short name which is already registered in pacemaker until we get around
>>> $galera_node_names_lookup = hiera('mysql_short_node_names',
>>> hiera('mysql_node_names', $::hostname))
>>>
>>> Charles
>>>
>>> On 11/10/2016 19:39, Marius Cornea wrote:
>>>>
>>>> I think the issue is caused by the addresses in wsrep_cluster_address
>>>> not matching the pacemaker node names:
>>>>
>>>>
>>>> wsrep_cluster_address=gcomm://overcloud-controller-0.internalapi.localdomain,overcloud-controller-1.internalapi.localdomain,overcloud-controller-2.internalapi.localdomain
>>>>
>>>> Could you please confirm what version of puppet-tripleo you've got
>>>> installed on the overcloud nodes and if it contains the following
>>>> patch:
>>>>
>>>> https://review.openstack.org/#/c/382883/1/manifests/profile/pacemaker/database/mysql.pp
>>>>
>>>> Thanks,
>>>> Marius
>>>>
>>>> On Tue, Oct 11, 2016 at 7:42 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>
>>>>> OK, the install finished with the same error.
>>>>> The latest pcs status etc.:
>>>>>
>>>>> http://pastebin.com/ZK683gZe
>>>>>
>>>>>
>>>>> On 11/10/2016 17:35, Charles Short wrote:
>>>>>>
>>>>>> Deployment almost finished...so
>>>>>>
>>>>>>
>>>>>> http://pastebin.com/zE9B19XB
>>>>>>
>>>>>> This shows the pcs status as the deployment nears the end, and pcs
>>>>>> resource show galera
>>>>>>
>>>>>> Charles
>>>>>>
>>>>>> On 11/10/2016 16:59, Marius Cornea wrote:
>>>>>>>
>>>>>>> Great, thanks for checking this.
>>>>>>>
>>>>>>> On Tue, Oct 11, 2016 at 5:58 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>
>>>>>>>> Currently having more generic deployment issues (no valid host found
>>>>>>>> etc).
>>>>>>>> I can work around/solve these.
>>>>>>>> I don't yet have another stack to analyse, but will do soon.
>>>>>>>>
>>>>>>>> Charles
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/10/2016 16:35, Marius Cornea wrote:
>>>>>>>>>
>>>>>>>>> Did it succeed in bringing the Galera nodes to Master? You can ssh to
>>>>>>>>> the nodes and run 'pcs resource show galera' even though the
>>>>>>>>> deployment hasn't finished. I'm interested to see how the
>>>>>>>>> wsrep_cluster_address is set to see if it's affected by the resource
>>>>>>>>> agent issue described in
>>>>>>>>> https://bugs.launchpad.net/tripleo/+bug/1628521
>>>>>>>>>
>>>>>>>>> On Tue, Oct 11, 2016 at 5:18 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>>>
>>>>>>>>>> Looks similar to this bug (still waiting on deployment to finish)
>>>>>>>>>>
>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1368214
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/10/2016 15:25, Charles Short wrote:
>>>>>>>>>>>
>>>>>>>>>>> Sorry for the delay.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Just redeploying to make sure I can repeat the same error. Should
>>>>>>>>>>> not be long.
>>>>>>>>>>>
>>>>>>>>>>> Charles
>>>>>>>>>>>
>>>>>>>>>>> On 11/10/2016 14:24, Marius Cornea wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Could you also please paste the output of 'pcs resource show
>>>>>>>>>>>> galera'? It looks like all the galera nodes show up as slaves:
>>>>>>>>>>>>
>>>>>>>>>>>> Master/Slave Set: galera-master [galera]
>>>>>>>>>>>> Slaves: [ overcloud-controller-0
>>>>>>>>>>>> overcloud-controller-1
>>>>>>>>>>>> overcloud-controller-2 ]
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 11, 2016 at 2:16 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here you are -
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Heat stack error - http://pastebin.com/E8KZa2vE
>>>>>>>>>>>>> - PCS status - http://pastebin.com/z34gSLq6
>>>>>>>>>>>>> - mariadb.log - http://pastebin.com/APFXPBLc
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/10/2016 12:07, Marius Cornea wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Charles,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you please paste the output of 'pcs status' ? The log in
>>>>>>>>>>>>>> /var/log/mariadb/mariadb.log might also be a good indicator.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Oct 11, 2016 at 11:16 AM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To add, I built my own image from
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> as the images in
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/delorean/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> caused sporadic ramdisk loading errors (hung at x% loaded on boot).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does my image now need to be customised in any way for HA to
>>>>>>>>>>>>>>> work?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 11/10/2016 09:55, Charles Short wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am installing Newton with TripleO on baremetal HP blades.
>>>>>>>>>>>>>>>> I can deploy a single-controller overcloud stack with no problem;
>>>>>>>>>>>>>>>> however, when I choose three controllers the deployment fails
>>>>>>>>>>>>>>>> (including
>>>>>>>>>>>>>>>> /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The heat stack error first complains "Dependency
>>>>>>>>>>>>>>>> Exec[galera-ready] has failures", which in turn causes lots of
>>>>>>>>>>>>>>>> other errors.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have deployed Liberty and Mitaka successfully in the past on
>>>>>>>>>>>>>>>> baremetal with three controllers, and this is the first time I
>>>>>>>>>>>>>>>> have seen this error.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> rdo-list mailing list
>>>>>>>>>>>>>>> rdo-list at redhat.com
>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To unsubscribe: rdo-list-unsubscribe at redhat.com
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>
>