On 06/29/2016 10:42 AM, Dan Sneddon wrote:
> On 06/29/2016 07:03 AM, Boris Derzhavets wrote:
>> Boris Derzhavets has shared a OneDrive file with you. To view it, click
>> the link below.
>>
>> <https://1drv.ms/u/s!AqjiDzRpwaKogSHAekH8ZluOaclk>
>>
>> HeatCrash2.txt 1.gz
>>
>> Reattach gzip archive via One Drive
>>
>>
>>
>> -----------------------------------------------------------------------
>> *From:* rdo-list-bounces@redhat.com <rdo-list-bounces@redhat.com> on
>> behalf of Boris Derzhavets <bderzhavets@hotmail.com>
>> *Sent:* Wednesday, June 29, 2016 9:36 AM
>> *To:* John Trowbridge; shardy@redhat.com
>> *Cc:* rdo-list@redhat.com
>> *Subject:* [rdo-list] HA overcloud-deploy.sh crashes again (
>> ControllerOvercloudServicesDeployment_Step4 )
>>
>>
>> Attempt to follow the steps suggested in
>> http://hardysteven.blogspot.ru/2016/06/tripleo-partial-stack-updates.html
>>
>>
>> ./deploy-overstack crashes
>>
>>
>> 2016-06-29 12:42:41
>> [overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk-ControllerOvercloudServicesDeployment_Step4-nzdoizlgrmx2]:
>> CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment
>> to server failed: deploy_status_code : Deployment exited with non-zero
>> status code: 6
>> 2016-06-29 12:42:42 [ControllerOvercloudServicesDeployment_Step4]:
>> CREATE_FAILED Error:
>> resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:43
>> [overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk]: CREATE_FAILED
>> Resource CREATE failed: Error:
>> resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:44 [ControllerNodesPostDeployment]: CREATE_FAILED
>> Error:
>> resources.ControllerNodesPostDeployment.resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:44 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:45 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:45 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:46 [overcloud]: CREATE_FAILED Resource CREATE failed:
>> Error:
>> resources.ControllerNodesPostDeployment.resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:46 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:47 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:47 [ControllerDeployment]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:48 [NetworkDeployment]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:48 [2]: SIGNAL_COMPLETE Unknown
>> Stack overcloud CREATE_FAILED
>> Deployment failed: Heat Stack create failed.
>> + heat stack-list
>> + grep -q CREATE_FAILED
>> + deploy_status=1
>> ++ heat resource-list --nested-depth 5 overcloud
>> ++ grep FAILED
>> ++ grep 'StructuredDeployment '
>> ++ cut -d '|' -f3
>> + for failed in '$(heat resource-list --nested-depth 5
>> overcloud | grep FAILED |
>> grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
>> + heat deployment-show 655c77fc-6a78-4cca-b4b7-a153a3f4ad52
>> + for failed in '$(heat resource-list --nested-depth 5
>> overcloud | grep FAILED |
>> grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
>> + heat deployment-show 1fe5153c-e017-4ee5-823a-3d1524430c1d
>> + for failed in '$(heat resource-list --nested-depth 5
>> overcloud | grep FAILED |
>> grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
>> + heat deployment-show bf6f25f4-d812-41e9-a7a8-122de619a624
>> + exit 1
>>
>> *****************************
>> Troubleshooting steps :-
>> *****************************
>>
>> [stack@undercloud ~]$ . stackrc
>> [stack@undercloud ~]$ heat resource-list overcloud | grep
>> ControllerNodesPost
>> | ControllerNodesPostDeployment |
>> f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 |
>> OS::TripleO::ControllerPostDeployment | CREATE_FAILED |
>> 2016-06-29T12:11:21 |
>>
>>
>> [stack@undercloud ~]$ heat stack-list -n | grep "^|
>> f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3"
>> | f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 |
>> overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk
>> | CREATE_FAILED | 2016-06-29T12:31:11 | None |
>> 17f82f6e-e0ca-44c6-9058-de82c00d4f79 |
>>
>>
>>
>> [stack@undercloud ~]$ heat event-list -m
>> f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3
>> overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk
>>
>> +------------------------------------------------------+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+---------------------+
>> | resource_name |
>> id |
>> resource_status_reason
>> | resource_status | event_time |
>> +------------------------------------------------------+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+---------------------+
>> | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk |
>> 10ec0cf9-b3c9-4191-9966-3f4d47f27e2a | Stack CREATE started
>> . . . . . . . . . . . . . . . . .
>> Step1,2,3 succeeded
>> . . . . . . . . . . . . . . . . .
>>
>> | CREATE_IN_PROGRESS | 2016-06-29T12:31:14 |
>> | ControllerPuppetConfig |
>> a2a1df33-5106-425c-b16d-8d2df709b19f | state
>> changed
>> | CREATE_COMPLETE | 2016-06-29T12:35:02 |
>> | ControllerOvercloudServicesDeployment_Step4 |
>> 1e151333-4de5-4e7b-907c-ea0f42d31a47 | state
>> changed
>> | CREATE_IN_PROGRESS | 2016-06-29T12:35:03 |
>> | ControllerOvercloudServicesDeployment_Step4 |
>> 7bf36334-3d92-4554-b6c0-41294a072ab6 | Error:
>> resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6 | CREATE_FAILED |
>> 2016-06-29T12:42:42 |
>> | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk
>> | e72fb6f4-c2aa-4fe8-9bd1-5f5ad152685c | Resource CREATE failed:
>> Error:
>> resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6 | CREATE_FAILED | 2016-06-29T12:42:43 |
>> +------------------------------------------------------+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+---------------------+
>>
>> [stack@undercloud ~]$ heat stack-show
>> overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | grep
>> NodeConfigIdentifiers
>> | | "NodeConfigIdentifiers":
>> "{u'deployment_identifier': 1467202276, u'controller_config': {u'1':
>> u'os-apply-config deployment 796df02a-7550-414b-a084-8b591a13e6db
>> completed,Root CA cert injection not enabled.,TLS not enabled.,None,',
>> u'0': u'os-apply-config deployment 613ec889-d852-470a-8e4c-6e243e1d2033
>> completed,Root CA cert injection not enabled.,TLS not enabled.,None,',
>> u'2': u'os-apply-config deployment c8b099d0-3af4-4ba0-a056-a0ce60f40e2d
>> completed,Root CA cert injection not enabled.,TLS not enabled.,None,'},
>> u'allnodes_extra': u'none'}" |
>>
>> However, once stack creation has crashed, an update doesn't help.
>>
>> [stack@undercloud ~]$ heat stack-update -x
>> overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk -e update_env.yaml
>> ERROR: PATCH update to non-COMPLETE stack is not supported.
>>
>> DUE TO :-
>>
>> [stack@undercloud ~]$ heat stack-list
>> +--------------------------------------+------------+---------------+---------------------+--------------+
>> | id | stack_name | stack_status |
>> creation_time | updated_time |
>> +--------------------------------------+------------+---------------+---------------------+--------------+
>> | 17f82f6e-e0ca-44c6-9058-de82c00d4f79 | overcloud | CREATE_FAILED |
>> 2016-06-29T12:11:20 | None |
>> +--------------------------------------+------------+---------------+---------------------+--------------+
>>
>>
>> The complete error output of `heat deployment-show
>> 655c77fc-6a78-4cca-b4b7-a153a3f4ad52` is attached as a gzip archive.
>>
>>
>> Thanks.
>>
>> Boris.
>>
>>
>>
>> _______________________________________________
>> rdo-list mailing list
>> rdo-list@redhat.com
>> https://www.redhat.com/mailman/listinfo/rdo-list
>>
>> To unsubscribe: rdo-list-unsubscribe@redhat.com
>>
>
> The failure occurred during the post-deployment, which means that the
> initial deployment succeeded, but then the steps that are done to the
> completed overcloud failed.
>
> This is most commonly attributable to network problems between the
> Undercloud and the Overcloud Public API. The Undercloud needs to reach
> the Public API in order to do some of the post-configuration steps. If
> this API isn't reachable, you end up with the error you saw above.
>
> You can test this connectivity by pinging the Public API VIP from the
> Undercloud. Starting with the failed deployment, run "neutron
> port-list" against the Undercloud and look for the IP on the port
> named "public_virtual_ip". You should be able to ping this address from
> the Undercloud. If you can't reach that IP, then you need to check the
> connectivity/routing between the Undercloud and the External network on
> the Overcloud.
>
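The connectivity check described above could be scripted roughly as
follows. This is only a sketch: it assumes the table layout printed by
"neutron port-list" (id | name | mac_address | fixed_ips, with an
"ip_address" entry in the last column), and the helper name
vip_from_port_list is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `neutron port-list` table output on stdin and
# print the IP address of the port named "public_virtual_ip".
vip_from_port_list() {
    grep 'public_virtual_ip' |
        sed -n 's/.*"ip_address": *"\([^"]*\)".*/\1/p' |
        head -n 1
}

# Usage on the Undercloud (assumes stackrc is in the current directory):
#   . stackrc
#   vip=$(neutron port-list | vip_from_port_list)
#   ping -c 3 "$vip"
```

If the ping fails, the problem is likely in the routing between the
Undercloud and the External network rather than in Heat itself.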
I should also mention common causes of this problem:
* Incorrect value for ExternalInterfaceDefaultRoute in the network
environment file.
* Controllers do not have the default route on the External network in
the NIC config templates (required for reachability from remote subnets).
* Incorrect subnet mask on the ExternalNetCidr in the network environment.
* Incorrect ExternalAllocationPools values in the network environment.
* Incorrect Ethernet switch config for the Controllers.
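
Several of the causes above come down to an address that does not fit
the External subnet. A rough sanity check can be sketched in shell,
assuming you copy the values out of the network environment file by
hand. The function names ip_to_int and ip_in_cidr and the example
addresses are illustrative only, not part of TripleO.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
# The function body runs in a subshell so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed (exit status 0) if address $1 lies inside CIDR $2.
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Example with made-up values (substitute your ExternalInterfaceDefaultRoute
# and ExternalNetCidr):
#   ip_in_cidr 10.0.0.1 10.0.0.0/24 && echo "default route is inside the CIDR"
```

The same check can be applied to the start and end addresses of each
ExternalAllocationPools entry.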
The issue was reproduced four times, on a daily basis starting 06/25/16,
each time with exactly the same error at Step4 of
overcloud-ControllerNodesPostDeployment.
For the moment I cannot reproduce the error:
the 3xNode HA Controller + 1xCompute config works.
There was one more issue: 3xNode HA Controller + 2xCompute
failed immediately when overcloud-deploy.sh started, because
only 4 nodes could be introspected. I will test it tomorrow morning.
Thanks a lot.
Boris.
--
Dan Sneddon | Principal OpenStack Engineer
dsneddon@redhat.com | redhat.com/openstack
650.254.4025 | dsneddon:irc @dxs:twitter