Hi Luca,
On Tue, Aug 02, 2016 at 05:59:00PM +0200, Luca 'remix_tj' Lorenzetto wrote:
If I go in depth with heat deployment-show, I see that all resources
report this deploy_stderr:
Error: Could not prefetch mysql_user provider 'mysql': Execution of
'/usr/bin/mysql -NBe SELECT CONCAT(User, '@',Host) AS User FROM
mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local
MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
Error: Could not prefetch mysql_database provider 'mysql': Execution
of '/usr/bin/mysql -NBe show databases' returned 1: ERROR 2002
(HY000): Can't connect to local MySQL server through socket
'/var/lib/mysql/mysql.sock' (2)
Error: Command exceeded timeout
Error: /Stage[main]/Pacemaker::Corosync/Exec[auth-successful-across-all-nodes]/returns:
change from notrun to 0 failed: Command exceeded timeout
Warning: /Stage[main]/Pacemaker::Corosync/Exec[Create Cluster
tripleo_cluster]: Skipping because of failed dependencies
Warning: /Stage[main]/Pacemaker::Corosync/Exec[Start Cluster
tripleo_cluster]: Skipping because of failed dependencies
Warning: /Stage[main]/Pacemaker::Corosync/Exec[wait-for-settle]:
Skipping because of failed dependencies
Warning: /Stage[main]/Pacemaker::Corosync/Notify[pacemaker settled]:
Skipping because of failed dependencies
Warning: /Stage[main]/Pacemaker::Stonith/Exec[Disable STONITH]:
Skipping because of failed dependencies",
I see the nodes are stuck on the puppet step
"auth-successful-across-all-nodes" (defined in
/etc/puppet/modules/pacemaker/manifests/corosync.pp), which runs:
/usr/bin/python2 /usr/sbin/pcs cluster auth opsctrl0 opsctrl1 opsctrl2
-u hacluster -p PASSWORD --force
I suppose the problem is that the corosync service has not started yet.
But as far as I can see, corosync will never start because the
/etc/corosync/corosync.conf file is missing.
What happens at this step is that "pcs cluster auth opsctrl0 opsctrl1
opsctrl2..." sets up a secret key between the three nodes; corosync
(/etc/corosync/corosync.conf) and pacemaker are configured afterwards,
and then both services are started on all three nodes. So the missing
corosync.conf is a consequence of the hang rather than its cause. What
you need to verify is why this command is stuck. It is most likely a
networking, DNS, or firewall issue. A quick check is to strace the pcs
process and see which network connections it is waiting on for replies
that never arrive.
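
The DNS and firewall possibilities can also be checked directly from each
controller. The following is a minimal Python 3 sketch, not part of the
TripleO or pcs tooling: it resolves each node name and then attempts a TCP
connection to pcsd. The node names come from the pcs command above; port
2224 is the pcsd default and is an assumption about this deployment, and
check_node and report are hypothetical helper names.

```python
import socket

# Node names taken from the pcs command in this thread.
NODES = ["opsctrl0", "opsctrl1", "opsctrl2"]
# Assumption: pcsd listens on its default TCP port.
PCSD_PORT = 2224


def check_node(host, port=PCSD_PORT, timeout=5.0):
    """Return (status, detail) where status is 'dns', 'tcp', or 'ok'."""
    try:
        # Name resolution failure points at DNS or /etc/hosts.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return "dns", str(exc)
    addr = infos[0][4][0]
    try:
        # A refused or timed-out connect points at pcsd not running,
        # routing, or a firewall rule.
        with socket.create_connection((addr, port), timeout=timeout):
            return "ok", addr
    except OSError as exc:
        return "tcp", "%s: %s" % (addr, exc)


def report(nodes, port=PCSD_PORT, timeout=5.0):
    """Return one formatted line per node."""
    lines = []
    for node in nodes:
        status, detail = check_node(node, port, timeout)
        lines.append("%-10s %-4s %s" % (node, status, detail))
    return lines
```

Calling report(NODES) on every controller and printing the lines gives one
result per peer. A "dns" status suggests a name resolution problem, while a
"tcp" status suggests pcsd is unreachable on that node, which is where a
firewall rule or a wrong network would show up.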
cheers,
Michele
--
Michele Baldessari <michele(a)acksyn.org>
C2A5 9DA3 9961 4FFB E01B D0BC DDD4 DCCB 7515 5C6D