[Rdo-list] Openstack Icehouse and Qpid behind HAproxy causes services down
Chris
contact at progbau.de
Mon Oct 26 07:18:03 UTC 2015
Hello,
We have an OpenStack Icehouse cluster set up with two management nodes (Nova,
Neutron, Horizon, etc.) as well as qpidd (version 0.18) for the message
queue. Everything sits behind an HAProxy setup which round-robins
requests across both nodes.
It works fine for a while, but after a couple of days all the agents on
the compute nodes (Nova, Neutron) show as down in the Horizon web
interface. An "openstack-services restart" on both management nodes normally
fixes it, and the agents are shown as up again.
In the Nova logs on the compute nodes I see a lot of messages like the ones
below; it seems like the connection to the message queue is lost:
ERROR nova.openstack.common.periodic_task [-] Error during ComputeManager.update_available_resource: Timed out waiting for a reply to message ID b28ae4098c31453c83d963c2a9d6c1ee
[.]
TRACE nova.openstack.common.periodic_task reply, ending = self._poll_connection(msg_id, timeout)
TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.6/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 217, in _poll_connection
TRACE nova.openstack.common.periodic_task % msg_id)
TRACE nova.openstack.common.periodic_task MessagingTimeout: Timed out waiting for a reply to message ID b28ae4098c31453c83d963c2a9d6c1ee
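A quick way to check from a compute node whether the proxied broker address still accepts TCP connections while these timeouts occur is sketched below. It is only a rough probe under stated assumptions: it is written for Python 3 (not the Python 2.6 seen in the trace), the function name broker_reachable is made up for illustration, and the host and port in the usage comment are placeholders. A successful connect only shows that the listener accepts connections; it does not prove that qpidd behind it is healthy.

```python
import socket


def broker_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only; it performs no AMQP handshake.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    conn.close()
    return True


# Example call (placeholder address, replace with the HAProxy bind address):
# print(broker_reachable("192.0.2.10", 5672))
```

Running it periodically and noting when it starts returning False could help tell a dead listener apart from a connection that is still open but no longer passing messages.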
Here is the HAProxy configuration for the qpidd port:
listen qpid_message_broker
    bind 10.xxx.xxx.xxx:5672
    timeout server 1h
    timeout client 1h
    timeout connect 240s
    server xx-xxxxx-x001 10.xxx.xxx.xx1:5672 check inter 10s rise 9999999 fall 5
    server xx-xxxxx-x002 10.xxx.xxx.xx2:5672 check backup
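One variant of this section that might be worth testing is sketched below. It is untested and rests on assumptions not confirmed anywhere above: that idle broker connections are being dropped somewhere along the path, and that "mode tcp" is not already set in a defaults section. "option tcpka" only enables kernel-level TCP keepalive probes; the probe interval comes from the operating system settings, and the probes do not reset HAProxy's own client and server timeouts.

```
listen qpid_message_broker
    bind 10.xxx.xxx.xxx:5672
    mode tcp          # AMQP is not HTTP; assumed not already set in defaults
    option tcpka      # OS-level TCP keepalive on both client and server side
    timeout server 1h
    timeout client 1h
    timeout connect 240s
    server xx-xxxxx-x001 10.xxx.xxx.xx1:5672 check inter 10s rise 9999999 fall 5
    server xx-xxxxx-x002 10.xxx.xxx.xx2:5672 check backup
```

Everything except the two commented lines is unchanged from the section above.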
Does anyone have ideas, or experience with setting up HAProxy for qpidd? Any
help is appreciated!
Cheers,
Chris