hi all

i am attempting to build a basic 2-node overcloud. my previous emails described the problems i encountered.

what i have:
- 1 vm called rdo with undercloud AND overcloud. this one has not been updated since november, and i keep restoring snapshots from that date.
- a 2nd vm called rdo2, fully updated, where the overcloud fails to deploy to one specific physical node

observations: (unscientific!)
the 2 physical nodes are both good. i tested this by redeploying on rdo again and again. i even swapped their order in instackenv.json and redeployed successfully, starting from the instackenv.json step.
however, on rdo2 one particular machine refuses to deploy. the order doesn't matter: if it is the controller, it fails; if it is the compute, it fails.
i am using the same flavour on both rdo vms, so i believe i have ruled out that variable too.
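
for anyone who wants to repeat the swap test, it amounts to something like the sketch below. this is only an illustration: it assumes instackenv.json has a top-level "nodes" list with the two machines as its first two entries, and the function name is made up for this email.

```python
import json


def swap_first_two_nodes(instackenv_text):
    """Return instackenv JSON text with the first two nodes swapped."""
    data = json.loads(instackenv_text)
    nodes = data["nodes"]  # assumption: a top-level "nodes" list
    nodes[0], nodes[1] = nodes[1], nodes[0]
    return json.dumps(data, indent=2)
```

the result is written back to instackenv.json before repeating the deployment from the instackenv.json step.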

how far did i reach?
over the past few days i have opened the console and watched this particular machine pxe boot, get an ip, reboot, change its hostname to reflect the ip, reboot to localhost.localdomain (?) and then power off. i did not sit and watch it for the entire 209 minutes, so these observations are unscientific.

last error:
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
Stack failed with status: Resource CREATE failed: resources.Controller: ResourceInError: resources[0].resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
Heat Stack create failed.

real    209m19.252s
user    0m21.695s
sys     0m2.402s

what am i looking for?
what should i look for in the logs? they are huge; they don't get rotated for some reason.
i would like to know why this particular physical machine refuses to deploy, so i can fix it. i believe i have eliminated every variable except the machine itself. it has me puzzled and frustrated, as i need to move on to the next stage, network isolation.
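
since the logs are too big to read by hand, my plan is to filter them with something like the sketch below. it is a rough idea only: the search strings are my guess at what matters (the second one comes from the error above), and the function name and the path in the usage comment are just examples.

```python
def matching_lines(lines, needles=("ERROR", "No valid host")):
    """Yield (line_number, line) for every line containing any needle."""
    for number, line in enumerate(lines, start=1):
        if any(needle in line for needle in needles):
            yield number, line.rstrip("\n")


# usage sketch (the path is an assumption, adjust to the log of interest):
# with open("/var/log/nova/nova-scheduler.log", errors="replace") as log:
#     for number, line in matching_lines(log):
#         print(number, line)
```

it reads the file one line at a time, so the size of the log should not matter.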

any ideas?

thanks