Unfortunately, "No valid host" is the most generic error in OpenStack.
Maybe someday Nova will provide a better error message for that error,
but in the meantime we need to check the scheduler logs
(/var/log/nova/nova-scheduler.log) for more clues.
I usually grep for 'returned 0 host' to find which filter is failing to
match any hosts. That narrows the search space to investigate further
why the filter fails.
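For example, something along these lines on the undercloud (assuming
the default log location) usually points at the culprit:

[stack@undercloud ~]$ sudo grep 'returned 0 host' /var/log/nova/nova-scheduler.log

Each match names the filter (RamFilter, ComputeCapabilitiesFilter,
etc.) that eliminated the last remaining hosts for a scheduling
attempt, which tells you which requirement the nodes failed.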
On 06/20/2016 05:30 AM, Gerard Braad wrote:
Hi,
as mentioned in a previous email, I am deploying baremetal nodes using
the quickstart. At the moment I can introspect the nodes correctly,
but I am unable to deploy to them.
I performed the checks described in
/tripleo-docs/doc/source/troubleshooting/troubleshooting-overcloud.rst:
The flavor list I have is unchanged:
[stack@undercloud ~]$ openstack flavor list
+--------------------------------------+---------------+------+------+-----------+-------+-----------+
| ID                                   | Name          | RAM  | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+---------------+------+------+-----------+-------+-----------+
| 2e72ffb5-c6d7-46fd-ad75-448c0ad6855f | baremetal     | 4096 | 40   | 0         | 1     | True      |
| 6b8b37e4-618d-4841-b5e3-f556ef27fd4d | oooq_compute  | 8192 | 49   | 0         | 1     | True      |
| 973b58c3-8730-4b1f-96b2-fda253c15dbc | oooq_control  | 8192 | 49   | 0         | 1     | True      |
| e22dc516-f53f-4a71-9793-29c614999801 | oooq_ceph     | 8192 | 49   | 0         | 1     | True      |
| e3dce62a-ac8d-41ba-9f97-84554b247faa | block-storage | 4096 | 40   | 0         | 1     | True      |
| f5fe9ba6-cf5c-4ef3-adc2-34f3b4381915 | control       | 4096 | 40   | 0         | 1     | True      |
| fabf81d8-44cb-4c25-8ed0-2afd124425db | compute       | 4096 | 40   | 0         | 1     | True      |
| fe512696-2294-40cb-9d20-12415f45c1a9 | ceph-storage  | 4096 | 40   | 0         | 1     | True      |
| ffc859af-dbfd-4e27-99fb-9ab02f4afa79 | swift-storage | 4096 | 40   | 0         | 1     | True      |
+--------------------------------------+---------------+------+------+-----------+-------+-----------+
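Note that for profile matching it is not the flavor names that count
but the flavor extra specs; those can be inspected with, for example:

[stack@undercloud ~]$ openstack flavor show control -f value -c properties

which should report something like capabilities:profile='control',
capabilities:boot_option='local' if the flavor has been tagged.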
In instackenv.json the nodes have been assigned as:
[stack@undercloud ~]$ cat instackenv.json
{
  "nodes": [
    {
      "_comment": "ooo1",
      "pm_type": "pxe_ipmitool",
      "mac": [
        "00:26:9e:9b:c3:36"
      ],
      "cpu": "16",
      "memory": "65536",
      "disk": "370",
      "arch": "x86_64",
      "pm_user": "root",
      "pm_password": "admin",
      "pm_addr": "10.0.108.126",
      "capabilities": "profile:control,boot_option:local"
    },
    {
      "_comment": "ooo2",
      "pm_type": "pxe_ipmitool",
      "mac": [
        "00:26:9e:9c:38:a6"
      ],
      "cpu": "16",
      "memory": "65536",
      "disk": "370",
      "arch": "x86_64",
      "pm_user": "root",
      "pm_password": "admin",
      "pm_addr": "10.0.108.127",
      "capabilities": "profile:compute,boot_option:local"
    }
  ]
}
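A quick sanity check that these capabilities actually ended up on the
registered nodes is to grep the node properties, using one of the node
UUIDs shown below, e.g.:

[stack@undercloud ~]$ ironic node-show 0956df36-b642-44b8-a67f-0df88270372b | grep capabilities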
[stack@undercloud ~]$ ironic node-list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 0956df36-b642-44b8-a67f-0df88270372b | None | None          | power off   | manageable         | False       |
| cc311355-f373-4e5c-99be-31ba3185639d | None | None          | power off   | manageable         | False       |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
I then perform the introspection manually:
[stack@undercloud ~]$ openstack baremetal introspection bulk start
Setting nodes for introspection to manageable...
Starting introspection of node: 0956df36-b642-44b8-a67f-0df88270372b
Starting introspection of node: cc311355-f373-4e5c-99be-31ba3185639d
Waiting for introspection to finish...
Introspection for UUID 0956df36-b642-44b8-a67f-0df88270372b finished
successfully.
Introspection for UUID cc311355-f373-4e5c-99be-31ba3185639d finished
successfully.
Setting manageable nodes to available...
Node 0956df36-b642-44b8-a67f-0df88270372b has been set to available.
Node cc311355-f373-4e5c-99be-31ba3185639d has been set to available.
Introspection completed.
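If in doubt about what introspection actually discovered, the stored
data can be dumped as well, assuming python-ironic-inspector-client is
available on the undercloud, e.g.:

[stack@undercloud ~]$ openstack baremetal introspection data save 0956df36-b642-44b8-a67f-0df88270372b | python -m json.tool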
[stack@undercloud ~]$ ironic node-list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 0956df36-b642-44b8-a67f-0df88270372b | None | None          | power off   | available          | False       |
| cc311355-f373-4e5c-99be-31ba3185639d | None | None          | power off   | available          | False       |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
After this, I start the deployment. I have set the control and compute
roles to use the respective flavors.
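In TripleO the role flavors are typically tied to node profiles
through extra specs, along the lines of:

[stack@undercloud ~]$ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="control" control
[stack@undercloud ~]$ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute" compute

A profile requested by a flavor that no available node advertises is
one common way to end up with "returned 0 hosts" from the scheduler.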
[stack@undercloud ~]$ ./overcloud-deploy.sh
<snip>
+ openstack overcloud deploy --templates --timeout 60 --control-scale 1 --control-flavor control --compute-scale 1 --compute-flavor compute --ntp-server pool.ntp.org -e /tmp/deploy_env.yaml
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
2016-06-20 08:18:33 [overcloud]: CREATE_IN_PROGRESS Stack CREATE started
2016-06-20 08:18:33 [HorizonSecret]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:33 [RabbitCookie]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:33 [PcsdPassword]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:33 [MysqlClusterUniquePart]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:33 [MysqlRootPassword]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:33 [Networks]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:34 [VipConfig]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:34 [HeatAuthEncryptionKey]: CREATE_IN_PROGRESS state changed
2016-06-20 08:18:34 [overcloud-VipConfig-i4dgmk37z6hg]: CREATE_IN_PROGRESS Stack CREATE started
2016-06-20 08:18:34 [overcloud-Networks-4pb3htxq7rkd]: CREATE_IN_PROGRESS Stack CREATE started
<snip>
2016-06-20 08:19:06 [Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2016-06-20 08:19:06 [Controller]: DELETE_IN_PROGRESS state changed
2016-06-20 08:19:06 [NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2016-06-20 08:19:06 [NovaCompute]: DELETE_IN_PROGRESS state changed
2016-06-20 08:19:09 [Controller]: DELETE_COMPLETE state changed
2016-06-20 08:19:09 [NovaCompute]: DELETE_COMPLETE state changed
2016-06-20 08:19:12 [Controller]: CREATE_IN_PROGRESS state changed
2016-06-20 08:19:12 [NovaCompute]: CREATE_IN_PROGRESS state changed
2016-06-20 08:19:14 [Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2016-06-20 08:19:14 [Controller]: DELETE_IN_PROGRESS state changed
2016-06-20 08:19:14 [NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2016-06-20 08:19:14 [NovaCompute]: DELETE_IN_PROGRESS state changed
But as you can see, the deployment fails.
I checked the introspection data and verified that the disk, memory,
and CPU values match or exceed the flavor requirements:
[stack@undercloud ~]$ ironic node-show 0956df36-b642-44b8-a67f-0df88270372b
+------------------------+-------------------------------------------------------------------------+
| Property               | Value                                                                   |
+------------------------+-------------------------------------------------------------------------+
| chassis_uuid           |                                                                         |
| clean_step             | {}                                                                      |
| console_enabled        | False                                                                   |
| created_at             | 2016-06-20T05:51:17+00:00                                               |
| driver                 | pxe_ipmitool                                                            |
| driver_info            | {u'ipmi_password': u'******', u'ipmi_address': u'10.0.108.126',         |
|                        | u'ipmi_username': u'root', u'deploy_kernel':                            |
|                        | u'07c794a6-b427-4e75-ba58-7c555abbf2f8', u'deploy_ramdisk': u'67a66b7b- |
|                        | 637f-4b25-bcef-ed39ae32a1f4'}                                           |
| driver_internal_info   | {}                                                                      |
| extra                  | {u'hardware_swift_object': u'extra_hardware-0956df36-b642-44b8-a67f-    |
|                        | 0df88270372b'}                                                          |
| inspection_finished_at | None                                                                    |
| inspection_started_at  | None                                                                    |
| instance_info          | {}                                                                      |
| instance_uuid          | None                                                                    |
| last_error             | None                                                                    |
| maintenance            | False                                                                   |
| maintenance_reason     | None                                                                    |
| name                   | None                                                                    |
| power_state            | power off                                                               |
| properties             | {u'memory_mb': u'65536', u'cpu_arch': u'x86_64', u'local_gb': u'371',   |
|                        | u'cpus': u'16', u'capabilities':                                        |
|                        | u'profile:control,boot_option:local'}                                   |
| provision_state        | available                                                               |
| provision_updated_at   | 2016-06-20T07:32:46+00:00                                               |
| raid_config            |                                                                         |
| reservation            | None                                                                    |
| target_power_state     | None                                                                    |
| target_provision_state | None                                                                    |
| target_raid_config     |                                                                         |
| updated_at             | 2016-06-20T07:32:46+00:00                                               |
| uuid                   | 0956df36-b642-44b8-a67f-0df88270372b                                    |
+------------------------+-------------------------------------------------------------------------+
The hypervisor stats are also populated, but only the count matches
the number of nodes; all the resource totals are zero.
[stack@undercloud ~]$ nova hypervisor-stats
+----------------------+-------+
| Property | Value |
+----------------------+-------+
| count | 2 |
| current_workload | 0 |
| disk_available_least | 0 |
| free_disk_gb | 0 |
| free_ram_mb | 0 |
| local_gb | 0 |
| local_gb_used | 0 |
| memory_mb | 0 |
| memory_mb_used | 0 |
| running_vms | 0 |
| vcpus | 0 |
| vcpus_used | 0 |
+----------------------+-------+
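With the Ironic driver each bare metal node is reported as a separate
hypervisor, so a count of 2 looks right, but all-zero
memory_mb/local_gb/vcpus totals suggest the nova-compute resource
tracker has not (yet) published the nodes' resources. The tracker only
refreshes periodically, so one simple thing to try is waiting a minute
or two after the nodes become available and checking again before
deploying:

[stack@undercloud ~]$ sleep 120 && nova hypervisor-stats

If the totals stay at zero, /var/log/nova/nova-compute.log on the
undercloud would be the next place to look.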
Registering the nodes with profile:baremetal has the same effect.
What other parameters are used to decide whether a node can be
deployed to? I am probably missing a small detail... what can I check
to make sure the deployment starts?
regards,
Gerard