From: John Trowbridge <trown@redhat.com>
Sent: Friday, June 3, 2016 5:43 PM
To: Boris Derzhavets; Lars Kellogg-Stedman
Cc: rdo-list
Subject: Re: [rdo-list] Tripleo QuickStart HA deployment attempts constantly crash
On 06/03/2016 04:53 PM, John Trowbridge wrote:
I just did an HA deploy locally on master, and I see the same thing with
respect to telemetry services being down due to the failed redis import.
That could be a packaging bug (something should depend on python-redis,
maybe python-tooz?). That said, it does not appear fatal in my case. Is
there some issue other than the telemetry services being down that you are
seeing? That is certainly something we should fix, but I wouldn't
characterize it as the deployment constantly crashing.
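A quick way to confirm this particular failure mode on a controller (a hedged sketch; it just mirrors what the tooz redis driver does at import time, nothing tripleo-specific):

```shell
# Illustrative check, not part of the deployment: the tooz redis driver
# runs "import redis" when the module loads, so a missing python-redis
# package only surfaces once the coordination backend is first used.
if python -c 'import redis' 2>/dev/null; then
    echo "python-redis is installed"
else
    echo "python-redis is MISSING (telemetry coordination will fail)"
fi
```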
That was said by me in regard to comment #3 in
https://bugzilla.redhat.com/show_bug.cgi?id=1340865
Of course, the issue with the telemetry services is not "constantly crashing".
Confirmed that installing python-redis fixes the telemetry issue by
doing the following from the undercloud:

sudo LIBGUESTFS_BACKEND=direct virt-customize -a overcloud-full.qcow2 --install python-redis
openstack overcloud image upload --update-existing

Then deleting the failed overcloud stack, and re-running
overcloud-deploy.sh.
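Put together, the workaround reads roughly like this (a sketch, assuming the quickstart defaults: the image in /home/stack, a stack named overcloud, and the generated overcloud-deploy.sh; the guard makes it a no-op anywhere else):

```shell
# Sketch of the workaround above; the paths and stack name are assumptions.
if command -v openstack >/dev/null 2>&1 && [ -f /home/stack/overcloud-full.qcow2 ]; then
    cd /home/stack
    # Bake python-redis into the overcloud image, then replace it in glance.
    sudo LIBGUESTFS_BACKEND=direct virt-customize -a overcloud-full.qcow2 \
        --install python-redis
    openstack overcloud image upload --update-existing
    # Remove the failed stack and redeploy.
    heat stack-delete overcloud
    ./overcloud-deploy.sh
else
    echo "Not on an undercloud; nothing to do."
fi
```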
Doesn't work for me. Re-running fails to recreate overcloud stack.
> On 06/03/2016 11:30 AM, Boris Derzhavets wrote:
>> 1. Attempting to address your concern (if I understood you correctly):
>
>> First log :-
>
>> [root@overcloud-controller-0 ceilometer]# cat central.log | grep ERROR
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service [req-4db5f172-0bf0-4200-9cf4-174859cdc00b admin - - - -] Error starting thread.
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service Traceback (most recent call last):
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 680, in run_service
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     service.start()
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ceilometer/agent/manager.py", line 384, in start
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     self.partition_coordinator.start()
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ceilometer/coordination.py", line 84, in start
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     backend_url, self._my_id)
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 539, in get_coordinator
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     invoke_args=(member_id, parsed_url, options)).driver
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/driver.py", line 46, in __init__
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     verify_requirements=verify_requirements,
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 55, in __init__
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     verify_requirements)
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 171, in _load_plugins
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     self._on_load_failure_callback(self, ep, err)
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 163, in _load_plugins
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     verify_requirements,
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 123, in _load_one_plugin
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     verify_requirements,
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 186, in _load_one_plugin
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     plugin = ep.load(require=verify_requirements)
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 2260, in load
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     entry = __import__(self.module_name, globals(),globals(), ['__name__'])
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 27, in <module>
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service     import redis
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service ImportError: No module named redis
>> 2016-06-03 08:50:04.405 17503 ERROR oslo_service.service
>
>> Second log :-
>
>> [root@overcloud-controller-0 ceilometer]# cd -
>> /var/log/aodh
>> [root@overcloud-controller-0 aodh]# cat evaluator.log | grep ERROR
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service [-] Error starting thread.
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service Traceback (most recent call last):
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 680, in run_service
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     service.start()
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/aodh/evaluator/__init__.py", line 229, in start
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     self.partition_coordinator.start()
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/aodh/coordination.py", line 133, in start
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     self.backend_url, self._my_id)
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 539, in get_coordinator
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     invoke_args=(member_id, parsed_url, options)).driver
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/driver.py", line 46, in __init__
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     verify_requirements=verify_requirements,
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 55, in __init__
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     verify_requirements)
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 171, in _load_plugins
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     self._on_load_failure_callback(self, ep, err)
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 163, in _load_plugins
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     verify_requirements,
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 123, in _load_one_plugin
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     verify_requirements,
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 186, in _load_one_plugin
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     plugin = ep.load(require=verify_requirements)
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 2260, in load
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     entry = __import__(self.module_name, globals(),globals(), ['__name__'])
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 27, in <module>
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service     import redis
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service ImportError: No module named redis
>> 2016-06-03 08:46:20.552 32101 ERROR oslo_service.service
>
>> 2. The DDR3 memory DIMMs (Kingston HyperX 1600 MHz) are not a problem.
>> My ASUS Z97-P board cannot support more than 32 GB. So ....
>
>> 3. The i7 4790 surprised me doing deployments with TripleO Quickstart,
>> in particular Controller + 2x Compute ( --compute-scale 2 ).
>
>> Thank you
>> Boris.
>> ________________________________________
>> From: John Trowbridge <trown@redhat.com>
>> Sent: Friday, June 3, 2016 8:43 AM
>> To: Boris Derzhavets; John Trowbridge; Lars Kellogg-Stedman
>> Cc: rdo-list
>> Subject: Re: [rdo-list] Tripleo QuickStart HA deployment attempts constantly crash
>
>> So this last one looks like the telemetry services went down. You could
>> check the logs on the controllers to see if they were OOM-killed. My bet
>> would be that this is what is happening.
>
>> The reason that HA is not the default for tripleo-quickstart is exactly
>> this type of issue. It is pretty difficult to fit a full HA deployment
>> of TripleO on a 32G virthost. I think there is a near 100% chance that
>> the default HA config will crash when trying to do anything on the
>> deployed overcloud, due to running out of memory.
>
>> I have had some success in my local test setup using KSM [1] on the
>> virthost, and then changing the HA config to give the controllers more
>> memory. This results in overcommitting, but KSM can handle overcommitting
>> without going into swap. It might even be possible to set up KSM
>> in the environment setup part of quickstart. I would certainly accept an
>> RFE/patch for this [2,3].
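For reference, turning KSM on is just a couple of sysfs writes on the virthost (a sketch; the paths are the standard Linux KSM interface, but the pages_to_scan value is an arbitrary assumption, not a tested tuning):

```shell
# Sketch: enable KSM on the virthost. Requires root and a kernel built
# with CONFIG_KSM; libvirt/qemu guests are marked mergeable automatically.
KSM=/sys/kernel/mm/ksm
if [ -w "$KSM/run" ]; then
    echo 1 > "$KSM/run"               # start the ksmd merge daemon
    echo 1000 > "$KSM/pages_to_scan"  # pages per wake-up (assumed value)
    cat "$KSM/pages_sharing"          # grows above 0 once pages are merged
else
    echo "KSM sysfs not writable; run as root on a kernel with CONFIG_KSM"
fi
```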
>
>> If you have a larger virthost than 32G, you could similarly bump the
>> memory for the controllers, which should lead to a much higher success rate.
>
>> There is also a feature coming in TripleO [4] that will allow choosing
>> what services get deployed in each role, which will allow us to tweak
>> the tripleo-quickstart HA config to deploy a minimal service layout in
>> order to reduce memory requirements.
>
>> Thanks a ton for giving tripleo-quickstart a go!
>
>> [1] https://en.wikipedia.org/wiki/Kernel_same-page_merging
>> [2] https://bugs.launchpad.net/tripleo-quickstart
>> [3] https://review.openstack.org/#/q/project:openstack/tripleo-quickstart
>> [4] https://blueprints.launchpad.net/tripleo/+spec/composable-services-within...
>
>> On 06/03/2016 06:20 AM, Boris Derzhavets wrote:
>>> =====================================
>>> Fresh HA deployment attempt
>>> =====================================
>>>
>>> [stack@undercloud ~]$ date
>>> Fri Jun 3 10:05:35 UTC 2016
>>> [stack@undercloud ~]$ heat stack-list
>>> +--------------------------------------+------------+-----------------+---------------------+--------------+
>>> | id                                   | stack_name | stack_status    | creation_time       | updated_time |
>>> +--------------------------------------+------------+-----------------+---------------------+--------------+
>>> | 0c6b8205-be86-4a24-be36-fd4ece956c6d | overcloud  | CREATE_COMPLETE | 2016-06-03T08:14:19 | None         |
>>> +--------------------------------------+------------+-----------------+---------------------+--------------+
>>> [stack@undercloud ~]$ nova list
>>> +--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
>>> | ID                                   | Name                    | Status | Task State | Power State | Networks            |
>>> +--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
>>> | 6a38b7be-3743-4339-970b-6121e687741d | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
>>> | 9222dc1b-5974-495b-8b98-b8176ac742f4 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
>>> | 76adbb27-220f-42ef-9691-94729ee28749 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
>>> | 8f57f7b6-a2d8-4b7b-b435-1c675e63ea84 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.8  |
>>> +--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
>>> [stack@undercloud ~]$ ssh heat-admin@192.0.2.10
>>> Last login: Fri Jun 3 10:01:44 2016 from gateway
>>> [heat-admin@overcloud-controller-0 ~]$ sudo su -
>>> Last login: Fri Jun 3 10:01:49 UTC 2016 on pts/0
>>> [root@overcloud-controller-0 ~]# . keystonerc_admin
>>
>>> [root@overcloud-controller-0 ~]# pcs status
>>> Cluster name: tripleo_cluster
>>> Last updated: Fri Jun 3 10:07:22 2016    Last change: Fri Jun 3 08:50:59 2016 by root via cibadmin on overcloud-controller-0
>>> Stack: corosync
>>> Current DC: overcloud-controller-0 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
>>> 3 nodes and 123 resources configured
>>
>>> Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>>
>>> Full list of resources:
>>>
>>> ip-192.0.2.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
>>> Clone Set: haproxy-clone [haproxy]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> ip-192.0.2.7 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
>>> Master/Slave Set: galera-master [galera]
>>>     Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: memcached-clone [memcached]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: rabbitmq-clone [rabbitmq]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-core-clone [openstack-core]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Master/Slave Set: redis-master [redis]
>>>     Masters: [ overcloud-controller-1 ]
>>>     Slaves: [ overcloud-controller-0 overcloud-controller-2 ]
>>> Clone Set: mongod-clone [mongod]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
>>>     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> openstack-cinder-volume (systemd:openstack-cinder-volume): Started overcloud-controller-2
>>> Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api]
>>>     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
>>>     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
>>>     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-heat-api-clone [openstack-heat-api]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
>>>     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-glance-api-clone [openstack-glance-api]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-nova-api-clone [openstack-nova-api]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-sahara-api-clone [openstack-sahara-api]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: delay-clone [delay]
>>>     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: neutron-server-clone [neutron-server]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
>>>     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: httpd-clone [httpd]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>> Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
>>>     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
>>
>>> Failed Actions:
>>> * openstack-aodh-evaluator_monitor_60000 on overcloud-controller-1 'not running' (7): call=76, status=complete, exitreason='none',
>>>     last-rc-change='Fri Jun 3 08:47:22 2016', queued=0ms, exec=0ms
>>> * openstack-ceilometer-central_start_0 on overcloud-controller-1 'not running' (7): call=290, status=complete, exitreason='none',
>>>     last-rc-change='Fri Jun 3 08:51:18 2016', queued=0ms, exec=2132ms
>>> * openstack-aodh-evaluator_monitor_60000 on overcloud-controller-2 'not running' (7): call=76, status=complete, exitreason='none',
>>>     last-rc-change='Fri Jun 3 08:47:16 2016', queued=0ms, exec=0ms
>>> * openstack-ceilometer-central_start_0 on overcloud-controller-2 'not running' (7): call=292, status=complete, exitreason='none',
>>>     last-rc-change='Fri Jun 3 08:51:31 2016', queued=0ms, exec=2102ms
>>> * openstack-aodh-evaluator_monitor_60000 on overcloud-controller-0 'not running' (7): call=77, status=complete, exitreason='none',
>>>     last-rc-change='Fri Jun 3 08:47:19 2016', queued=0ms, exec=0ms
>>> * openstack-ceilometer-central_start_0 on overcloud-controller-0 'not running' (7): call=270, status=complete, exitreason='none',
>>>     last-rc-change='Fri Jun 3 08:50:02 2016', queued=0ms, exec=2199ms
>>
>>
>>> PCSD Status:
>>> overcloud-controller-0: Online
>>> overcloud-controller-1: Online
>>> overcloud-controller-2: Online
>>
>>> Daemon Status:
>>> corosync: active/enabled
>>> pacemaker: active/enabled
>>> pcsd: active/enabled
>>
>>
>>> ________________________________
>>> From: rdo-list-bounces@redhat.com <rdo-list-bounces@redhat.com> on behalf of Boris Derzhavets <bderzhavets@hotmail.com>
>>> Sent: Monday, May 30, 2016 4:56 AM
>>> To: John Trowbridge; Lars Kellogg-Stedman
>>> Cc: rdo-list
>>> Subject: Re: [rdo-list] Tripleo QuickStart HA deployment attempts constantly crash
>>
>>
>>> Done one more time :-
>>
>>
>>> [stack@undercloud ~]$ heat deployment-show 9cc8087a-6d82-4261-8a13-ee8c46e3a02d
>>
>>> Uploaded here :-
>>
>>> http://textuploader.com/5bm5v
>>> ________________________________
>>> From: rdo-list-bounces@redhat.com <rdo-list-bounces@redhat.com> on behalf of Boris Derzhavets <bderzhavets@hotmail.com>
>>> Sent: Sunday, May 29, 2016 3:39 AM
>>> To: John Trowbridge; Lars Kellogg-Stedman
>>> Cc: rdo-list
>>> Subject: [rdo-list] Tripleo QuickStart HA deployment attempts constantly crash
>>
>>
>>> Error every time is the same :-
>>
>>
>>> 2016-05-29 07:20:17 [0]: CREATE_FAILED Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
>>> 2016-05-29 07:20:18 [0]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:20:18 [overcloud-ControllerNodesPostDeployment-dzawjmjyaidt-ControllerServicesBaseDeployment_Step2-ufz2ccs5egd7]: CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
>>> 2016-05-29 07:20:18 [0]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:20:19 [ControllerServicesBaseDeployment_Step2]: CREATE_FAILED Error: resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
>>> 2016-05-29 07:20:19 [0]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:20:19 [0]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:20:20 [ControllerDeployment]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:20:20 [overcloud-ControllerNodesPostDeployment-dzawjmjyaidt]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
>>> 2016-05-29 07:20:21 [ControllerNodesPostDeployment]: CREATE_FAILED Error: resources.ControllerNodesPostDeployment.resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
>>> 2016-05-29 07:20:21 [0]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:20:22 [NetworkDeployment]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:20:22 [0]: SIGNAL_COMPLETE Unknown
>>> 2016-05-29 07:24:22 [ComputeNodesPostDeployment]: CREATE_FAILED CREATE aborted
>>> 2016-05-29 07:24:22 [overcloud]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerNodesPostDeployment.resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
>>> Stack overcloud CREATE_FAILED
>>> Deployment failed: Heat Stack create failed.
>>> + heat stack-list
>>> + grep -q CREATE_FAILED
>>> + deploy_status=1
>>> ++ heat resource-list --nested-depth 5 overcloud
>>> ++ grep FAILED
>>> ++ grep 'StructuredDeployment '
>>> ++ cut -d '|' -f3
>>> + for failed in '$(heat resource-list --nested-depth 5 overcloud | grep FAILED | grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
>>> + heat deployment-show 66bd3fbe-296b-4f88-87a7-5ceafd05c1d3
>>> + exit 1
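Untangled, the failure-reporting loop from the script trace above looks like this (a sketch of what the trace shows, assuming an undercloud with the heat client; the guard makes it harmless elsewhere):

```shell
# Show details for every failed StructuredDeployment in the overcloud stack,
# mirroring the overcloud-deploy.sh trace above.
if command -v heat >/dev/null 2>&1; then
    for failed in $(heat resource-list --nested-depth 5 overcloud \
                    | grep FAILED | grep 'StructuredDeployment ' \
                    | cut -d '|' -f3); do
        heat deployment-show "$failed"
    done
else
    echo "heat client not installed; run this on the undercloud"
fi
```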
>>
>>
>>> Minimal configuration deployments run with no errors and build a completely functional environment.
>>
>>
>>> However, template :-
>>
>>
>>> #################################
>>> # Test Controller + 2*Compute nodes
>>> #################################
>>> control_memory: 6144
>>> compute_memory: 6144
>>> undercloud_memory: 8192
>>>
>>> # Giving the undercloud additional CPUs can greatly improve heat's
>>> # performance (and result in a shorter deploy time).
>>> undercloud_vcpu: 4
>>>
>>> # We set introspection to true and use only the minimal amount of nodes
>>> # for this job, but test all defaults otherwise.
>>> step_introspect: true
>>>
>>> # Define the controller node and the compute nodes.
>>> overcloud_nodes:
>>>   - name: control_0
>>>     flavor: control
>>>   - name: compute_0
>>>     flavor: compute
>>>   - name: compute_1
>>>     flavor: compute
>>>
>>> # Tell tripleo how we want things done.
>>> extra_args: >-
>>>   --neutron-network-type vxlan
>>>   --neutron-tunnel-types vxlan
>>>   --ntp-server pool.ntp.org
>>>
>>> network_isolation: true
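One thing worth ruling out (an assumption on my part, not something confirmed in this thread): the deploy command may also need the compute count passed explicitly, since --compute-scale is the tripleoclient flag that controls how many compute nodes Heat actually creates. If quickstart passes extra_args through to `openstack overcloud deploy`, the scale could be requested there, for example:

```yaml
# Hypothetical variant of the extra_args above; --compute-scale 2 is an
# assumption about what is missing, not a confirmed fix.
extra_args: >-
  --compute-scale 2
  --neutron-network-type vxlan
  --neutron-tunnel-types vxlan
  --ntp-server pool.ntp.org
```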
>>
>>
>>> Picks up the new memory settings but doesn't create the second Compute node.
>>>
>>> Every time it is just the Controller and one Compute.
>>
>>
>>> HW - i7 4790, 32 GB RAM
>>
>>
>>> Thanks.
>>
>>> Boris
>>
>>> ________________________________
>>
>>
>>
>>
>>> _______________________________________________
>>> rdo-list mailing list
>>> rdo-list@redhat.com
>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>> To unsubscribe: rdo-list-unsubscribe@redhat.com
>>
>