From: rdo-list-bounces(a)redhat.com <rdo-list-bounces(a)redhat.com> on behalf of Dan
Sneddon <dsneddon(a)redhat.com>
Sent: Wednesday, June 29, 2016 1:46 PM
To: rdo-list(a)redhat.com
Subject: Re: [rdo-list] HA overcloud-deploy.sh crashes again (
ControllerOvercloudServicesDeployment_Step4 )
On 06/29/2016 10:42 AM, Dan Sneddon wrote:
On 06/29/2016 07:03 AM, Boris Derzhavets wrote:
> Boris Derzhavets has shared a OneDrive file with you. To view it, click
> the link below.
>
> HeatCrash2.txt 1.gz <
https://1drv.ms/u/s!AqjiDzRpwaKogSHAekH8ZluOaclk>
> [HeatCrash2.txt 1.gz]
>
> Reattach gzip archive via One Drive
>
>
>
> -----------------------------------------------------------------------
> *From:* rdo-list-bounces(a)redhat.com <rdo-list-bounces(a)redhat.com> on
> behalf of Boris Derzhavets <bderzhavets(a)hotmail.com>
> *Sent:* Wednesday, June 29, 2016 9:36 AM
> *To:* John Trowbridge; shardy(a)redhat.com
> *Cc:* rdo-list(a)redhat.com
> *Subject:* [rdo-list] HA overcloud-deploy.sh crashes again (
> ControllerOvercloudServicesDeployment_Step4 )
>
>
> I attempted to follow the steps suggested in
> http://hardysteven.blogspot.ru/2016/06/tripleo-partial-stack-updates.html
>
> ./deploy-overstack crashes:
>
>
> 2016-06-29 12:42:41 [overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk-ControllerOvercloudServicesDeployment_Step4-nzdoizlgrmx2]: CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
> 2016-06-29 12:42:42 [ControllerOvercloudServicesDeployment_Step4]: CREATE_FAILED Error: resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
> 2016-06-29 12:42:43 [overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
> 2016-06-29 12:42:44 [ControllerNodesPostDeployment]: CREATE_FAILED Error: resources.ControllerNodesPostDeployment.resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
> 2016-06-29 12:42:44 [2]: SIGNAL_COMPLETE Unknown
> 2016-06-29 12:42:45 [2]: SIGNAL_COMPLETE Unknown
> 2016-06-29 12:42:45 [2]: SIGNAL_COMPLETE Unknown
> 2016-06-29 12:42:46 [overcloud]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerNodesPostDeployment.resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
> 2016-06-29 12:42:46 [2]: SIGNAL_COMPLETE Unknown
> 2016-06-29 12:42:47 [2]: SIGNAL_COMPLETE Unknown
> 2016-06-29 12:42:47 [ControllerDeployment]: SIGNAL_COMPLETE Unknown
> 2016-06-29 12:42:48 [NetworkDeployment]: SIGNAL_COMPLETE Unknown
> 2016-06-29 12:42:48 [2]: SIGNAL_COMPLETE Unknown
> Stack overcloud CREATE_FAILED
> Deployment failed: Heat Stack create failed.
> + heat stack-list
> + grep -q CREATE_FAILED
> + deploy_status=1
> ++ heat resource-list --nested-depth 5 overcloud
> ++ grep FAILED
> ++ grep 'StructuredDeployment '
> ++ cut -d '|' -f3
> + for failed in '$(heat resource-list --nested-depth 5 overcloud | grep FAILED | grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
> + heat deployment-show 655c77fc-6a78-4cca-b4b7-a153a3f4ad52
> + for failed in '$(heat resource-list --nested-depth 5 overcloud | grep FAILED | grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
> + heat deployment-show 1fe5153c-e017-4ee5-823a-3d1524430c1d
> + for failed in '$(heat resource-list --nested-depth 5 overcloud | grep FAILED | grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
> + heat deployment-show bf6f25f4-d812-41e9-a7a8-122de619a624
> + exit 1
>
> *****************************
> Troubleshooting steps :-
> *****************************
>
> [stack@undercloud ~]$ . stackrc
> [stack@undercloud ~]$ heat resource-list overcloud | grep ControllerNodesPost
> | ControllerNodesPostDeployment | f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 | OS::TripleO::ControllerPostDeployment | CREATE_FAILED | 2016-06-29T12:11:21 |
>
>
> [stack@undercloud ~]$ heat stack-list -n | grep "^| f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3"
> | f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | CREATE_FAILED | 2016-06-29T12:31:11 | None | 17f82f6e-e0ca-44c6-9058-de82c00d4f79 |
>
>
>
> [stack@undercloud ~]$ heat event-list -m f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk
>
> | resource_name | id | resource_status_reason | resource_status | event_time |
> | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | 10ec0cf9-b3c9-4191-9966-3f4d47f27e2a | Stack CREATE started | CREATE_IN_PROGRESS | 2016-06-29T12:31:14 |
> . . . . . . . . . . . . . . . . .
> Step1,2,3 succeeded
> . . . . . . . . . . . . . . . . .
> | ControllerPuppetConfig | a2a1df33-5106-425c-b16d-8d2df709b19f | state changed | CREATE_COMPLETE | 2016-06-29T12:35:02 |
> | ControllerOvercloudServicesDeployment_Step4 | 1e151333-4de5-4e7b-907c-ea0f42d31a47 | state changed | CREATE_IN_PROGRESS | 2016-06-29T12:35:03 |
> | ControllerOvercloudServicesDeployment_Step4 | 7bf36334-3d92-4554-b6c0-41294a072ab6 | Error: resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6 | CREATE_FAILED | 2016-06-29T12:42:42 |
> | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | e72fb6f4-c2aa-4fe8-9bd1-5f5ad152685c | Resource CREATE failed: Error: resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6 | CREATE_FAILED | 2016-06-29T12:42:43 |
>
> [stack@undercloud ~]$ heat stack-show overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | grep NodeConfigIdentifiers
> | | "NodeConfigIdentifiers": "{u'deployment_identifier': 1467202276, u'controller_config': {u'1': u'os-apply-config deployment 796df02a-7550-414b-a084-8b591a13e6db completed,Root CA cert injection not enabled.,TLS not enabled.,None,', u'0': u'os-apply-config deployment 613ec889-d852-470a-8e4c-6e243e1d2033 completed,Root CA cert injection not enabled.,TLS not enabled.,None,', u'2': u'os-apply-config deployment c8b099d0-3af4-4ba0-a056-a0ce60f40e2d completed,Root CA cert injection not enabled.,TLS not enabled.,None,'}, u'allnodes_extra': u'none'}" |
>
> However, once stack creation has crashed, an update doesn't help:
>
> [stack@undercloud ~]$ heat stack-update -x overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk -e update_env.yaml
> ERROR: PATCH update to non-COMPLETE stack is not supported.
>
> DUE TO :-
>
> [stack@undercloud ~]$ heat stack-list
>
> +--------------------------------------+------------+---------------+---------------------+--------------+
> | id                                   | stack_name | stack_status  | creation_time       | updated_time |
> +--------------------------------------+------------+---------------+---------------------+--------------+
> | 17f82f6e-e0ca-44c6-9058-de82c00d4f79 | overcloud  | CREATE_FAILED | 2016-06-29T12:11:20 | None         |
> +--------------------------------------+------------+---------------+---------------------+--------------+
>
>
> The complete error output of `heat deployment-show
> 655c77fc-6a78-4cca-b4b7-a153a3f4ad52` is attached as a gzip archive.
>
>
> Thanks.
>
> Boris.
>
>
>
> _______________________________________________
> rdo-list mailing list
> rdo-list(a)redhat.com
>
https://www.redhat.com/mailman/listinfo/rdo-list
>
> To unsubscribe: rdo-list-unsubscribe(a)redhat.com
>
The failure occurred during the post-deployment, which means that the
initial deployment succeeded, but then the steps that are done to the
completed overcloud failed.
This is most commonly attributable to network problems between the
Undercloud and the Overcloud Public API. The Undercloud needs to reach
the Public API in order to do some of the post-configuration steps. If
this API isn't reachable, you end up with the error you saw above.
You can test this connectivity by pinging the Public API VIP from the
Undercloud. Starting with the failed deployment, run "neutron
port-list" against the Undercloud and look for the IP on the port
named "public_virtual_ip". You should be able to ping this address from
the Undercloud. If you can't reach that IP, then you need to check the
connectivity/routing between the Undercloud and the External network on
the Overcloud.
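
As a rough sketch of the check described above (not an official tool), the helper below pulls the address of the "public_virtual_ip" port out of "neutron port-list" output and pings it. It assumes the legacy neutron CLI table format, where the fixed_ips column carries an "ip_address" field, and that stackrc has already been sourced on the Undercloud. The function names are made up for illustration.

```shell
#!/bin/sh
# Read `neutron port-list` output on stdin and print the IP address of
# the first port whose row mentions "public_virtual_ip".
# Assumption: fixed_ips is rendered as {"subnet_id": "...", "ip_address": "..."}.
extract_public_vip() {
    sed -n '/public_virtual_ip/{s/.*"ip_address": *"\([^"]*\)".*/\1/p;q;}'
}

# Look up the Public API VIP and ping it from the Undercloud.
# Assumption: stackrc has been sourced, so the neutron CLI can authenticate.
check_public_vip() {
    vip=$(neutron port-list | extract_public_vip)
    if [ -z "$vip" ]; then
        echo "no public_virtual_ip port found" >&2
        return 1
    fi
    ping -c 3 "$vip"
}
```

Running check_public_vip after a failed deployment should either show ping replies from the VIP or point at the connectivity/routing problem described above.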
I should also mention common causes of this problem:
* Incorrect value for ExternalInterfaceDefaultRoute in the network
environment file.
* Controllers do not have the default route on the External network in
the NIC config templates (required for reachability from remote subnets).
* Incorrect subnet mask on the ExternalNetCidr in the network environment.
* Incorrect ExternalAllocationPools values in the network environment.
* Incorrect Ethernet switch config for the Controllers.
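
A quick sanity check for the first and third causes in the list above is to confirm that the value of ExternalInterfaceDefaultRoute actually falls inside ExternalNetCidr. The sketch below does that arithmetic in plain shell. The helper names and the example addresses are hypothetical, and it only handles IPv4 in dotted-quad form without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to its 32-bit integer value.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit 0) if address $1 lies inside CIDR $2, e.g. 10.0.0.0/24.
ip_in_cidr() {
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Example with made-up values; substitute the ones from your
# network environment file:
#   ip_in_cidr 10.0.0.1 10.0.0.0/24 && echo "default route is on the External subnet"
```

If the check fails, either the default route or the subnet mask in the network environment is likely wrong.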
The issue was reproduced 4 times, once a day starting 06/25/16,
each time with exactly the same error at Step4
of overcloud-ControllerNodesPostDeployment.
In the meantime, I can no longer reproduce the error.
The 3xNode HA Controller + 1xCompute config works.
There was one more issue: 3xNode HA Controller + 2xCompute
failed immediately when overcloud-deploy.sh started, because
only 4 nodes could be introspected. I will test it tomorrow morning.
Thanks a lot.
Boris.
--
Dan Sneddon | Principal OpenStack Engineer
dsneddon(a)redhat.com |
redhat.com/openstack
650.254.4025 | dsneddon:irc @dxs:twitter