So this last one looks like the telemetry services went down. You could
check the logs on the controllers to see whether they were OOM-killed.
My bet is that this is what is happening.
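
As a rough sketch of that check (not a confirmed diagnosis): the Linux
OOM killer normally leaves messages in the kernel log, so filtering for
them on each controller should show whether it fired. The helper name
oom_lines is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel-log lines that usually
# indicate the OOM killer fired. Reads log text on stdin.
oom_lines() {
    # "|| true" keeps the exit status clean when nothing matches.
    grep -iE 'out of memory|oom-killer|killed process' || true
}

# Typical use on a controller (needs root to read the kernel log):
#   dmesg | oom_lines
#   journalctl -k | oom_lines
```

If this prints lines naming ceilometer, aodh, or mongod processes around
the time the resources stopped, memory pressure is the likely cause.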

The reason HA is not the default for tripleo-quickstart is exactly
this type of issue. It is pretty difficult to fit a full HA deployment
of TripleO on a 32G virthost. I think there is a near-100% chance that
the default HA config will crash when trying to do anything on the
deployed overcloud, due to running out of memory.

I have had some success in my local test setup using KSM [1] on the
virthost, and then changing the HA config to give the controllers more
memory. This results in overcommitting, but KSM can handle overcommitting
without going into swap. It might even be possible to set up KSM in the
environment setup part of quickstart. I would certainly accept an
RFE/patch for this [2,3].
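
For anyone who wants to try KSM by hand first, here is a minimal sketch
using the kernel's sysfs controls. The function names and the KSM_DIR
variable are made up for illustration, and this is not how quickstart
does (or would) set it up.

```shell
#!/bin/sh
# Directory holding the kernel's KSM controls; overridable for testing.
KSM_DIR=${KSM_DIR:-/sys/kernel/mm/ksm}

# Turn KSM on (must be run as root on the virthost).
ksm_enable() {
    echo 1 > "$KSM_DIR/run"
}

# Print roughly how many MiB KSM is currently saving.
# pages_sharing counts the pages that were deduplicated away.
ksm_saved_mib() {
    pages=$(cat "$KSM_DIR/pages_sharing")
    page_size=$(getconf PAGESIZE)
    echo $(( pages * page_size / 1024 / 1024 ))
}
```

Running ksm_enable as root and then calling ksm_saved_mib a few minutes
later gives a feel for how much headroom merging identical guest pages
buys on a particular host.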

If you have a virthost larger than 32G, you could similarly bump the
memory for the controllers, which should lead to a much higher success rate.
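
To see why 32G is so tight, it helps to add up what the VMs ask for.
The sketch below does that arithmetic for three controllers, one
compute node, and the undercloud. The per-VM sizes are placeholders
borrowed from the template quoted further down in this thread, not the
actual HA defaults, and vm_memory_total is a made-up helper name.

```shell
#!/bin/sh
# Sum the memory (MiB) requested by all VMs on the virthost.
# Arguments: controller count, MiB per controller,
#            compute count, MiB per compute, undercloud MiB.
vm_memory_total() {
    echo $(( $1 * $2 + $3 * $4 + $5 ))
}

# Placeholder sizes (assumptions, not the real HA defaults).
host_mib=$(( 32 * 1024 ))
total=$(vm_memory_total 3 6144 1 6144 8192)
echo "VMs request ${total} MiB of ${host_mib} MiB on the host"
```

With these placeholder numbers the VMs alone claim the whole 32G, which
leaves nothing for the host itself. Giving the controllers more memory
only works with a larger host or with overcommit.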

There is also a feature coming in TripleO [4] that will allow choosing
which services get deployed in each role. That will let us tweak the
tripleo-quickstart HA config to deploy a minimal service layout and so
reduce memory requirements.

Thanks a ton for giving tripleo-quickstart a go!

[1] 
 =====================================
 
 Fresh HA deployment attempt
 
 =====================================
 
 [stack@undercloud ~]$ date
 Fri Jun  3 10:05:35 UTC 2016
 [stack@undercloud ~]$ heat stack-list
 +--------------------------------------+------------+-----------------+---------------------+--------------+
 | id                                   | stack_name | stack_status    | creation_time       | updated_time |
 +--------------------------------------+------------+-----------------+---------------------+--------------+
 | 0c6b8205-be86-4a24-be36-fd4ece956c6d | overcloud  | CREATE_COMPLETE | 2016-06-03T08:14:19 | None         |
 +--------------------------------------+------------+-----------------+---------------------+--------------+
 [stack@undercloud ~]$ nova list
 +--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
 | ID                                   | Name                    | Status | Task State | Power State | Networks            |
 +--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
 | 6a38b7be-3743-4339-970b-6121e687741d | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
 | 9222dc1b-5974-495b-8b98-b8176ac742f4 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
 | 76adbb27-220f-42ef-9691-94729ee28749 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
 | 8f57f7b6-a2d8-4b7b-b435-1c675e63ea84 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.8  |
 +--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
 [stack@undercloud ~]$ ssh heat-admin@192.0.2.10
 Last login: Fri Jun  3 10:01:44 2016 from gateway
 [heat-admin@overcloud-controller-0 ~]$ sudo su -
 Last login: Fri Jun  3 10:01:49 UTC 2016 on pts/0
 [root@overcloud-controller-0 ~]# .  keystonerc_admin
 
 [root@overcloud-controller-0 ~]# pcs status
 Cluster name: tripleo_cluster
 Last updated: Fri Jun  3 10:07:22 2016        Last change: Fri Jun  3 08:50:59 2016 by root via cibadmin on overcloud-controller-0
 Stack: corosync
 Current DC: overcloud-controller-0 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
 3 nodes and 123 resources configured
 
 Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 
 Full list of resources:
 
  ip-192.0.2.6    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-0
  Clone Set: haproxy-clone [haproxy]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  ip-192.0.2.7    (ocf::heartbeat:IPaddr2):    Started overcloud-controller-1
  Master/Slave Set: galera-master [galera]
      Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: memcached-clone [memcached]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: rabbitmq-clone [rabbitmq]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-core-clone [openstack-core]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Master/Slave Set: redis-master [redis]
      Masters: [ overcloud-controller-1 ]
      Slaves: [ overcloud-controller-0 overcloud-controller-2 ]
  Clone Set: mongod-clone [mongod]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
      Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  openstack-cinder-volume    (systemd:openstack-cinder-volume):    Started overcloud-controller-2
  Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api]
      Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
      Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
      Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-heat-api-clone [openstack-heat-api]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
      Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-glance-api-clone [openstack-glance-api]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-nova-api-clone [openstack-nova-api]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-sahara-api-clone [openstack-sahara-api]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: delay-clone [delay]
      Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: neutron-server-clone [neutron-server]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
      Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: httpd-clone [httpd]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
  Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
      Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 
 Failed Actions:
 * openstack-aodh-evaluator_monitor_60000 on overcloud-controller-1 'not running' (7): call=76, status=complete, exitreason='none',
      last-rc-change='Fri Jun  3 08:47:22 2016', queued=0ms, exec=0ms
 * openstack-ceilometer-central_start_0 on overcloud-controller-1 'not running' (7): call=290, status=complete, exitreason='none',
      last-rc-change='Fri Jun  3 08:51:18 2016', queued=0ms, exec=2132ms
 * openstack-aodh-evaluator_monitor_60000 on overcloud-controller-2 'not running' (7): call=76, status=complete, exitreason='none',
      last-rc-change='Fri Jun  3 08:47:16 2016', queued=0ms, exec=0ms
 * openstack-ceilometer-central_start_0 on overcloud-controller-2 'not running' (7): call=292, status=complete, exitreason='none',
      last-rc-change='Fri Jun  3 08:51:31 2016', queued=0ms, exec=2102ms
 * openstack-aodh-evaluator_monitor_60000 on overcloud-controller-0 'not running' (7): call=77, status=complete, exitreason='none',
      last-rc-change='Fri Jun  3 08:47:19 2016', queued=0ms, exec=0ms
 * openstack-ceilometer-central_start_0 on overcloud-controller-0 'not running' (7): call=270, status=complete, exitreason='none',
      last-rc-change='Fri Jun  3 08:50:02 2016', queued=0ms, exec=2199ms
 
 
 PCSD Status:
   overcloud-controller-0: Online
   overcloud-controller-1: Online
   overcloud-controller-2: Online
 
 Daemon Status:
   corosync: active/enabled
   pacemaker: active/enabled
   pcsd: active/enabled
 
 
 ________________________________
 From: rdo-list-bounces(a)redhat.com <rdo-list-bounces(a)redhat.com> on behalf of Boris
Derzhavets <bderzhavets(a)hotmail.com>
 Sent: Monday, May 30, 2016 4:56 AM
 To: John Trowbridge; Lars Kellogg-Stedman
 Cc: rdo-list
 Subject: Re: [rdo-list] Tripleo QuickStart HA deployment attempts constantly crash
 
 
 Done one more time :-
 
 
 [stack@undercloud ~]$ heat deployment-show 9cc8087a-6d82-4261-8a13-ee8c46e3a02d
 
 Uploaded here :-
 
 
http://textuploader.com/5bm5v
 ________________________________
 From: rdo-list-bounces(a)redhat.com <rdo-list-bounces(a)redhat.com> on behalf of Boris
Derzhavets <bderzhavets(a)hotmail.com>
 Sent: Sunday, May 29, 2016 3:39 AM
 To: John Trowbridge; Lars Kellogg-Stedman
 Cc: rdo-list
 Subject: [rdo-list] Tripleo QuickStart HA deployment attempts constantly crash
 
 
 The error is the same every time:
 
 
 2016-05-29 07:20:17 [0]: CREATE_FAILED Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
 2016-05-29 07:20:18 [0]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:20:18 [overcloud-ControllerNodesPostDeployment-dzawjmjyaidt-ControllerServicesBaseDeployment_Step2-ufz2ccs5egd7]: CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
 2016-05-29 07:20:18 [0]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:20:19 [ControllerServicesBaseDeployment_Step2]: CREATE_FAILED Error: resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
 2016-05-29 07:20:19 [0]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:20:19 [0]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:20:20 [ControllerDeployment]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:20:20 [overcloud-ControllerNodesPostDeployment-dzawjmjyaidt]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
 2016-05-29 07:20:21 [ControllerNodesPostDeployment]: CREATE_FAILED Error: resources.ControllerNodesPostDeployment.resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
 2016-05-29 07:20:21 [0]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:20:22 [NetworkDeployment]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:20:22 [0]: SIGNAL_COMPLETE Unknown
 2016-05-29 07:24:22 [ComputeNodesPostDeployment]: CREATE_FAILED CREATE aborted
 2016-05-29 07:24:22 [overcloud]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerNodesPostDeployment.resources.ControllerServicesBaseDeployment_Step2.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
 Stack overcloud CREATE_FAILED
 Deployment failed:  Heat Stack create failed.
 + heat stack-list
 + grep -q CREATE_FAILED
 + deploy_status=1
 ++ heat resource-list --nested-depth 5 overcloud
 ++ grep FAILED
 ++ grep 'StructuredDeployment '
 ++ cut -d '|' -f3
 + for failed in '$(heat resource-list         --nested-depth 5 overcloud | grep FAILED |
          grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
 + heat deployment-show 66bd3fbe-296b-4f88-87a7-5ceafd05c1d3
 + exit 1
 
 
 Minimal configuration deployments run with no errors and build a completely
 functional environment.
 
 
 However, this template:
 
 
 #################################
 # Test Controller + 2*Compute nodes
 #################################
 control_memory: 6144
 compute_memory: 6144
 
 undercloud_memory: 8192
 
 # Giving the undercloud additional CPUs can greatly improve heat's
 # performance (and result in a shorter deploy time).
 undercloud_vcpu: 4
 
 # We set introspection to true and use only the minimal amount of nodes
 # for this job, but test all defaults otherwise.
 step_introspect: true
 
 # Define a single controller node and a single compute node.
 overcloud_nodes:
   - name: control_0
     flavor: control
 
   - name: compute_0
     flavor: compute
 
   - name: compute_1
     flavor: compute
 
 # Tell tripleo how we want things done.
 extra_args: >-
   --neutron-network-type vxlan
   --neutron-tunnel-types vxlan
   --ntp-server pool.ntp.org
 
 network_isolation: true
 
 
 picks up the new memory settings but doesn't create the second compute node.

 Every time I get just one controller and one compute node.
 
 
 HW: i7-4790, 32 GB RAM
 
 
 Thanks.
 
 Boris
 