[rdo-list] [tripleo] Troubles deploying Liberty with HA setup

Michele Baldessari michele at acksyn.org
Wed Aug 3 04:36:25 UTC 2016


Hi Luca,

On Tue, Aug 02, 2016 at 05:59:00PM +0200, Luca 'remix_tj' Lorenzetto wrote:
> If i go in depth with heat deployment-show i see that all resources
> report this deploy_stderr:
> 
> Error: Could not prefetch mysql_user provider 'mysql': Execution of
> '/usr/bin/mysql -NBe SELECT CONCAT(User, '@',Host) AS User FROM
> mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local
> MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
> Error: Could not prefetch mysql_database provider 'mysql': Execution
> of '/usr/bin/mysql -NBe show databases' returned 1: ERROR 2002
> (HY000): Can't connect to local MySQL server through socket
> '/var/lib/mysql/mysql.sock' (2)\
> Error: Command exceeded timeout
> Error: /Stage[main]/Pacemaker::Corosync/Exec[auth-successful-across-all-nodes]/returns:
> change from notrun to 0 failed: Command exceeded timeout
> Warning: /Stage[main]/Pacemaker::Corosync/Exec[Create Cluster
> tripleo_cluster]: Skipping because of failed dependencies
> Warning: /Stage[main]/Pacemaker::Corosync/Exec[Start Cluster
> tripleo_cluster]: Skipping because of failed dependencies
> Warning: /Stage[main]/Pacemaker::Corosync/Exec[wait-for-settle]:
> Skipping because of failed dependencies
> Warning: /Stage[main]/Pacemaker::Corosync/Notify[pacemaker settled]:
> Skipping because of failed dependencies
> Warning: /Stage[main]/Pacemaker::Stonith/Exec[Disable STONITH]:
> Skipping because of failed dependencies",
> 
> I see nodes are stuck on puppet step
> "auth-successful-across-all-nodes" (defined in
> /etc/puppet/modules/pacemaker/manifests/corosync.pp)
> 
> /usr/bin/python2 /usr/sbin/pcs cluster auth opsctrl0 opsctrl1 opsctrl2
> -u hacluster -p PASSWORD --force
> 
> I suppose that the problem is due to corosync service not yet started.
> But as far as i can see corosync will never start because
> /etc/corosync/corosync.conf file is missing.

What happens at this step is that "pcs cluster auth opsctrl0 opsctrl1
opsctrl2..." sets up a secret key between the three nodes, then
configures corosync (/etc/corosync/corosync.conf) and pacemaker, and
finally starts both services on all three nodes. What you need to find
out is why this command is stuck. The likely causes are networking,
DNS or firewall issues. A quick way to check is to strace the pcs
process and see which network connections it is waiting on for replies
that never arrive.
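
If you want to rule out the three suspects (DNS, networking, firewall)
without strace, something along these lines may help. It is only a rough
sketch, not part of pcs or tripleo: the function names are made up for
illustration, and it assumes pcsd listens on its default TCP port 2224,
so adjust the port if your setup differs.

```python
import socket

# Assumption: pcsd listens on its default TCP port; change if yours differs.
PCSD_PORT = 2224


def check_node(host, port=PCSD_PORT, timeout=5.0):
    """Return 'ok', 'dns-failure', 'timeout' or 'unreachable: <reason>'."""
    try:
        # Step 1: name resolution. A failure here points at DNS or /etc/hosts.
        addrinfo = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"

    last_error = "unreachable"
    # Step 2: try a plain TCP connect to every address the name resolved to.
    for family, socktype, proto, _canonname, sockaddr in addrinfo:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except socket.timeout:
            # No answer at all within the timeout.
            last_error = "timeout"
        except OSError as exc:
            # Connection refused, no route to host, and similar errors.
            last_error = "unreachable: %s" % exc
    return last_error


def check_nodes(hosts, port=PCSD_PORT, timeout=5.0):
    """Run check_node() for every host and return a {host: status} dict."""
    return {host: check_node(host, port, timeout) for host in hosts}
```

Running check_nodes(["opsctrl0", "opsctrl1", "opsctrl2"]) on each
controller in turn should show which pairs of nodes cannot talk to each
other. As a rule of thumb, a "timeout" often means packets are being
dropped somewhere (typically a firewall), while a refused connection
more likely means nothing is listening on that port on the target node.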

cheers,
Michele
-- 
Michele Baldessari            <michele at acksyn.org>
C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D



