[rdo-list] [tripleo] Troubles deploying Liberty with HA setup
Luca 'remix_tj' Lorenzetto
lorenzetto.luca at gmail.com
Wed Aug 3 08:17:47 UTC 2016
On Wed, Aug 3, 2016 at 6:36 AM, Michele Baldessari <michele at acksyn.org> wrote:
> Hi Luca,
[cut]
Hi Michele,
> What happens at this step is that "pcs cluster auth opsctrl0 opsctrl1
> opsctrl2..." will set up a secret key between the three nodes and then
> configure corosync (/etc/corosync/corosync.conf) and pacemaker and then
> start both services on all three nodes.
Thank you for the explanation.
> What you need to verify is why
> this command is stuck. It is likely either due to networking issues, dns
> issues or firewalling issues. You can quickly try and strace the pcs
> process and see on which network connections it waits for replies that
> never arrive.
I see this on netstat:
[heat-admin@opsctrl0 ~]$ sudo netstat -alptn | grep 2224
tcp    0    0 172.25.122.13:44378 172.25.122.12:2224  ESTABLISHED 26473/ruby
tcp    0    0 172.25.122.13:55286 172.25.122.13:2224  ESTABLISHED 26473/ruby
tcp    0    0 172.25.122.13:43808 172.25.122.14:2224  ESTABLISHED 26473/ruby
tcp6   2    0 :::2224             :::*                LISTEN      26276/ruby
tcp6 200    0 172.25.122.13:2224  172.25.122.13:55286 ESTABLISHED -
tcp6 200    0 172.25.122.13:2224  172.25.122.12:59155 ESTABLISHED -
tcp6   0 1457 172.25.122.13:2224  172.25.122.14:39918 ESTABLISHED 26276/ruby
(similar on the other 3 nodes)
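A minimal sketch, assuming each netstat row is on a single line, of how the rows with a non-zero Recv-Q or Send-Q could be picked out of the output above. The helper names (parse_netstat, queued) are made up for illustration. As far as I understand the Linux netstat columns, Recv-Q on an ESTABLISHED socket is data the owning program has not read yet, and a "-" in the PID column means no process owns the socket yet, so such rows would be consistent with pcsd not picking up incoming connections; treat that reading as an assumption, not a diagnosis.

```python
from typing import List, NamedTuple


class Conn(NamedTuple):
    proto: str
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str
    owner: str


def parse_netstat(text: str) -> List[Conn]:
    """Parse 'netstat -alptn' TCP rows; assumes 7 whitespace-separated fields."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and rows whose program name contains spaces.
        if len(fields) != 7 or not fields[0].startswith("tcp"):
            continue
        proto, recv_q, send_q, local, remote, state, owner = fields
        conns.append(Conn(proto, int(recv_q), int(send_q), local, remote, state, owner))
    return conns


def queued(conns: List[Conn]) -> List[Conn]:
    """Return the rows that have data (or connections) sitting in a queue."""
    return [c for c in conns if c.recv_q > 0 or c.send_q > 0]


# The rows reported in this mail, joined back into one line each.
SAMPLE = """\
tcp 0 0 172.25.122.13:44378 172.25.122.12:2224 ESTABLISHED 26473/ruby
tcp 0 0 172.25.122.13:55286 172.25.122.13:2224 ESTABLISHED 26473/ruby
tcp 0 0 172.25.122.13:43808 172.25.122.14:2224 ESTABLISHED 26473/ruby
tcp6 2 0 :::2224 :::* LISTEN 26276/ruby
tcp6 200 0 172.25.122.13:2224 172.25.122.13:55286 ESTABLISHED -
tcp6 200 0 172.25.122.13:2224 172.25.122.12:59155 ESTABLISHED -
tcp6 0 1457 172.25.122.13:2224 172.25.122.14:39918 ESTABLISHED 26276/ruby
"""

stuck = queued(parse_netstat(SAMPLE))
for c in stuck:
    print(c.state, c.local, c.remote, "recv_q=%d send_q=%d" % (c.recv_q, c.send_q), c.owner)
```

Running the same filter on the output from the other controllers would show whether the queues build up on every node or only on this one.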
PID 26276 is:
  /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
PID 26473 is:
  /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth
which is a child process of:
  /usr/bin/python2 /usr/sbin/pcs cluster auth opsctrl0 opsctrl1 opsctrl2 -u hacluster -p password --force
When I strace PID 26276, I keep getting:
select(11, [10], NULL, NULL, {2, 0}) = 0 (Timeout)
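For reference, a small sketch of what that strace line means: select() is asked to watch one descriptor for readability with a 2 second timeout, and a return value of 0 means the timeout expired with nothing to read. The example below reproduces the same pattern on a throwaway loopback listener; the function name wait_for_client and the shorter timeouts are only illustrative, and this says nothing about how pcsd itself is implemented.

```python
import select
import socket


def wait_for_client(listener: socket.socket, timeout: float) -> bool:
    """Return True if the listening socket became readable before the timeout."""
    readable, _, _ = select.select([listener], [], [], timeout)
    return bool(readable)


# Throwaway listener on an ephemeral loopback port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)

# No client has connected yet, so select() times out (the "= 0 (Timeout)" case).
idle = wait_for_client(listener, 0.2)

# Once a client connects, the listener is reported readable and accept() would not block.
client = socket.create_connection(listener.getsockname(), timeout=2)
busy = wait_for_client(listener, 2.0)

client.close()
listener.close()
print("idle:", idle, "busy:", busy)
```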
With lsof I see that these fds are open:
ruby 26276 root 0r CHR 1,3 0t0 1028 /dev/null
ruby 26276 root 1u REG 8,2 32181 20889837 /var/log/pcsd/pcsd.log
ruby 26276 root 2u REG 8,2 32181 20889837 /var/log/pcsd/pcsd.log
ruby 26276 root 3r FIFO 0,8 0t0 72146 pipe
ruby 26276 root 4w FIFO 0,8 0t0 72146 pipe
ruby 26276 root 5r FIFO 0,8 0t0 72147 pipe
ruby 26276 root 6w FIFO 0,8 0t0 72147 pipe
ruby 26276 root 7u sock 0,6 0t0 64861 protocol: TCPv6
ruby 26276 root 9w REG 8,2 32181 20889837 /var/log/pcsd/pcsd.log
ruby 26276 root 10u IPv6 23096 0t0 TCP *:efi-mg (LISTEN)
I suppose the one involved is the last one, since fd 10 is the one
passed to select().
Additionally, since I've seen that a log file is open, I can report
that every minute the log gets these lines:
I, [2016-08-03T04:14:05.213788 #26276] INFO -- : Running: /usr/sbin/corosync-cmapctl totem.cluster_name
I, [2016-08-03T04:14:05.214140 #26276] INFO -- : CIB USER: hacluster, groups:
I, [2016-08-03T04:14:05.219656 #26276] INFO -- : Return Value: 1
W, [2016-08-03T04:14:05.219834 #26276] WARN -- : Cannot read config 'corosync.conf' from '/etc/corosync/corosync.conf': No such file or directory - /etc/corosync/corosync.conf
I'm not able to understand what's happening.
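A rough sketch, under the assumption that the three node names from the pcs command resolve from each controller, of a quick probe that separates the DNS, firewall and "port closed" cases Michele mentions. The names probe, report and NODES are hypothetical, and 2224 is simply the port seen in the netstat output. Note that "open" only proves the kernel completed the TCP handshake, not that pcsd actually answers requests.

```python
import socket
from typing import Dict, Iterable

# Node names taken from the pcs command line in this mail.
NODES = ("opsctrl0", "opsctrl1", "opsctrl2")


def probe(host: str, port: int = 2224, timeout: float = 3.0) -> str:
    """Try a TCP connection and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        return "dns failure"   # the name did not resolve
    except socket.timeout:
        return "timeout"       # packets silently dropped, e.g. by a firewall
    except ConnectionRefusedError:
        return "refused"       # host reachable, nothing listening on the port
    except OSError as exc:
        return "error: %s" % exc


def report(nodes: Iterable[str]) -> Dict[str, str]:
    """Probe every node and return a mapping of node name to result."""
    return {node: probe(node) for node in nodes}
```

Calling report(NODES) on each of the three controllers and comparing the results would show whether any pair of nodes cannot reach each other on that port.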
--
"It is absurd to employ men of excellent intelligence to perform
calculations that could be entrusted to anyone if machines were
used"
Gottfried Wilhelm von Leibnitz, philosopher and mathematician (1646-1716)
"The Internet is the largest library in the world.
But the problem is that the books are all scattered on the floor"
John Allen Paulos, mathematician (1945-living)
Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , <lorenzetto.luca at gmail.com>