[rdo-list] Newton HA galera-ready dependency error

Marius Cornea marius at remote-lab.net
Wed Oct 12 08:09:26 UTC 2016


That's odd. I encountered the same issue, and in my case it was caused
by this patch being missing. What do you get if you run
'sudo hiera mysql_short_node_names' on the controller node?
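
For context, the mismatch I suspect can be checked mechanically. Below
is a minimal sketch in Python (not part of puppet-tripleo or the galera
resource agent; the function names are made up for illustration). It
takes the wsrep_cluster_address value quoted further down and the node
names that pcs status reports, and lists the addresses pacemaker would
not recognise:

```python
def gcomm_hosts(wsrep_cluster_address):
    """Return the list of hosts in a gcomm:// cluster address."""
    prefix = "gcomm://"
    if not wsrep_cluster_address.startswith(prefix):
        raise ValueError("expected an address starting with gcomm://")
    # Drop the scheme and any trailing '?option=value' part.
    body = wsrep_cluster_address[len(prefix):].split("?", 1)[0]
    return [host.strip() for host in body.split(",") if host.strip()]


def unmatched_hosts(wsrep_cluster_address, pacemaker_nodes):
    """Return the cluster-address hosts that are not pacemaker node names."""
    known = set(pacemaker_nodes)
    return [h for h in gcomm_hosts(wsrep_cluster_address) if h not in known]


# Values taken from this thread.
address = (
    "gcomm://overcloud-controller-0.internalapi.localdomain,"
    "overcloud-controller-1.internalapi.localdomain,"
    "overcloud-controller-2.internalapi.localdomain"
)
nodes = [
    "overcloud-controller-0",
    "overcloud-controller-1",
    "overcloud-controller-2",
]

for host in unmatched_hosts(address, nodes):
    print("not a pacemaker node name:", host)
```

With the values from this thread, all three entries are reported, since
pacemaker knows the nodes by their short names only.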

On Wed, Oct 12, 2016 at 12:48 AM, Charles Short <cems at ebi.ac.uk> wrote:
> Hi,
>
> puppet-tripleo-5.2.0-0.20161007035759.f32e484.el7.centos.noarch
>
> The patch seems to be already present -
>
>   grep short /usr/share/openstack-puppet/modules/tripleo/manifests/profile/pacemaker/database/mysql.pp
>
>   # short name which is already registered in pacemaker until we get around
>   $galera_node_names_lookup = hiera('mysql_short_node_names', hiera('mysql_node_names', $::hostname))
>
> Charles
>
> On 11/10/2016 19:39, Marius Cornea wrote:
>>
>> I think the issue is caused by the addresses in wsrep_cluster_address
>> not matching the pacemaker node names:
>>
>> wsrep_cluster_address=gcomm://overcloud-controller-0.internalapi.localdomain,overcloud-controller-1.internalapi.localdomain,overcloud-controller-2.internalapi.localdomain
>>
>> Could you please confirm what version of puppet-tripleo you've got
>> installed on the overcloud nodes and if it contains the following
>> patch:
>> https://review.openstack.org/#/c/382883/1/manifests/profile/pacemaker/database/mysql.pp
>>
>> Thanks,
>> Marius
>>
>> On Tue, Oct 11, 2016 at 7:42 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>
>>> OK, the install finished with the same error.
>>> The latest pcs status etc.:
>>>
>>> http://pastebin.com/ZK683gZe
>>>
>>>
>>> On 11/10/2016 17:35, Charles Short wrote:
>>>>
>>>> Deployment almost finished...so
>>>>
>>>>
>>>> http://pastebin.com/zE9B19XB
>>>>
>>>> This shows the pcs status as the deployment nears the end, and pcs
>>>> resource show galera
>>>>
>>>> Charles
>>>>
>>>> On 11/10/2016 16:59, Marius Cornea wrote:
>>>>>
>>>>> Great, thanks for checking this.
>>>>>
>>>>> On Tue, Oct 11, 2016 at 5:58 PM, Charles Short <cems at ebi.ac.uk> wrote:
>>>>>>
>>>>>> Currently having more generic deployment issues (no valid host found
>>>>>> etc).
>>>>>> I can work around/solve these.
>>>>>> I don't yet have another stack to analyse, but will do soon.
>>>>>>
>>>>>> Charles
>>>>>>
>>>>>>
>>>>>> On 11/10/2016 16:35, Marius Cornea wrote:
>>>>>>>
>>>>>>> Did it succeed in bringing the Galera nodes to Master? You can ssh to
>>>>>>> the nodes and run 'pcs resource show galera' even though the
>>>>>>> deployment hasn't finished. I'm interested to see how the
>>>>>>> wsrep_cluster_address is set to see if it's affected by the resource
>>>>>>> agent issue described in
>>>>>>> https://bugs.launchpad.net/tripleo/+bug/1628521
>>>>>>>
>>>>>>> On Tue, Oct 11, 2016 at 5:18 PM, Charles Short <cems at ebi.ac.uk>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Looks similar to this bug (still waiting on deployment to finish)
>>>>>>>>
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1368214
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/10/2016 15:25, Charles Short wrote:
>>>>>>>>>
>>>>>>>>> Sorry for the delay.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Just redeploying to make sure I can repeat the same error. Should
>>>>>>>>> not be long.
>>>>>>>>>
>>>>>>>>> Charles
>>>>>>>>>
>>>>>>>>> On 11/10/2016 14:24, Marius Cornea wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Could you also please paste the output of 'pcs resource show
>>>>>>>>>> galera'? It looks like all the galera nodes show up as slaves:
>>>>>>>>>>
>>>>>>>>>>      Master/Slave Set: galera-master [galera]
>>>>>>>>>>          Slaves: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 11, 2016 at 2:16 PM, Charles Short <cems at ebi.ac.uk>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Here you are -
>>>>>>>>>>>
>>>>>>>>>>>      - Heat stack error  - http://pastebin.com/E8KZa2vE
>>>>>>>>>>>      - PCS status  - http://pastebin.com/z34gSLq6
>>>>>>>>>>>      - mariadb.log - http://pastebin.com/APFXPBLc
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> Charles
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11/10/2016 12:07, Marius Cornea wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Charles,
>>>>>>>>>>>>
>>>>>>>>>>>> Could you please paste the output of 'pcs status' ? The log in
>>>>>>>>>>>> /var/log/mariadb/mariadb.log might also be a good indicator.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 11, 2016 at 11:16 AM, Charles Short <cems at ebi.ac.uk>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> To add, I built my own image from
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2
>>>>>>>>>>>>>
>>>>>>>>>>>>> as the images in
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://buildlogs.centos.org/centos/7/cloud/x86_64/tripleo_images/newton/delorean/
>>>>>>>>>>>>>
>>>>>>>>>>>>> caused sporadic ramdisk loading errors (hung at x% loaded on
>>>>>>>>>>>>> boot).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does my image now need to be customised in any way for HA to
>>>>>>>>>>>>> work?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/10/2016 09:55, Charles Short wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am installing Newton with TripleO on baremetal HP blades.
>>>>>>>>>>>>>> I can deploy a single-controller overcloud stack with no problem;
>>>>>>>>>>>>>> however, when I choose three controllers (including
>>>>>>>>>>>>>> /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml)
>>>>>>>>>>>>>> the deployment fails.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The heat stack error first complains "Dependency
>>>>>>>>>>>>>> Exec[galera-ready] has failures", which in turn causes lots of
>>>>>>>>>>>>>> other errors.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have deployed Liberty and Mitaka successfully in the past on
>>>>>>>>>>>>>> baremetal with three controllers, and this is the first time I
>>>>>>>>>>>>>> have seen this error.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Charles
>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Charles Short
>>>>>>>>>>>>> Cloud Engineer
>>>>>>>>>>>>> Virtualization and Cloud Team
>>>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>>>> Tel: +44 (0)1223 494205
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> rdo-list mailing list
>>>>>>>>>>>>> rdo-list at redhat.com
>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>>>>>>>>>>>
>>>>>>>>>>>>> To unsubscribe: rdo-list-unsubscribe at redhat.com
>>>>>>>>>>>



