On 01/28/2016 07:12 AM, Mohammed Arafa wrote:
Hi all,
I am attempting to build a basic 2-node overcloud. My previous emails
described the problems I encountered.
What I have:
- 1 VM called rdo with the undercloud AND overcloud. This one has not
been updated since November, and I keep restoring snapshots to that date.
- a 2nd VM called rdo2, fully updated; the overcloud fails to deploy to
a specific physical node.
Observations (unscientific!):
The 2 physical nodes are both good. I tested by redeploying on rdo again
and again. I even swapped their order in instackenv.json and redeployed
successfully from the instackenv.json step.
However, I have one particular machine that refuses to deploy. It
doesn't matter in what order: if it is the controller, it fails; if it
is the compute, it fails.
I am using the same flavor on both rdo VMs, but again, I believe I have
ruled out that variable.
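For reference, something like the following should confirm whether the
node properties Ironic registered actually satisfy the flavor (this
assumes the stock ironic and nova CLIs on the undercloud and "baremetal"
as the deployment flavor name; adjust to your setup):

    # list enrolled nodes and their provision state
    ironic node-list

    # properties Ironic registered for the suspect node (placeholder UUID)
    ironic node-show <node-uuid> | grep -A4 properties

    # cpu/ram/disk the deployment flavor demands
    nova flavor-show baremetal

If the node's cpus/memory_mb/local_gb come in under what the flavor
asks for, the scheduler will quietly reject that node every time.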
How far did I reach?
Over the past few days I have opened the console and watched this
particular machine PXE boot, get an IP, reboot, change its hostname to
reflect the IP, reboot to localhost.localdomain (?), and then power off.
I am not saying I sat down and watched it for the entire 209 minutes,
but I have observed it unscientifically.
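One more data point worth grabbing when it powers off is what Ironic
thinks the node's state is, and whether it recorded an error (the node
UUID below is a placeholder):

    # provision_state should be progressing toward "active";
    # last_error is often populated when a deploy is abandoned
    ironic node-show <node-uuid> | grep -E "provision_state|last_error"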
Last error:
Deploying templates in the directory
/usr/share/openstack-tripleo-heat-templates
Stack failed with status: Resource CREATE failed: resources.Controller:
ResourceInError: resources[0].resources.Controller: Went to status ERROR
due to "Message: No valid host was found. There are not enough hosts
available., Code: 500"
Heat Stack create failed.
real 209m19.252s
user 0m21.695s
sys 0m2.402s
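For context, that 500 comes from the Nova scheduler: no enrolled node
passed its filters. One quick sanity check is whether Nova can even see
both Ironic nodes, since each one shows up as a "hypervisor" in an
Ironic deployment (stock novaclient assumed):

    # both physical nodes should be listed here
    nova hypervisor-list

    # aggregate cpu/ram/disk Nova believes it can schedule onto
    nova hypervisor-stats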
What am I looking for?
What do I look for in the logs? My logs are huge; they don't get
rotated for some reason.
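On the rotation point: the services rely on logrotate, so a missing or
broken drop-in would explain it. A minimal sketch of one (the path and
thresholds are illustrative, not the stock TripleO config):

    # /etc/logrotate.d/nova-local -- illustrative only
    /var/log/nova/*.log {
        daily
        rotate 7
        compress
        missingok
        notifempty
    }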
I would like to know the reason this particular physical machine refuses
to deploy, so I can fix it. I believe I have eliminated all the
variables except the machine itself, and it has me puzzled and
frustrated, as I need to move on to the next stage, network isolation.
Any ideas?
The point at which it is failing seems to be before the node is fully
deployed; that is, before we start doing Puppet applies on it to
configure it.
This is a helpful distinction, because we can limit the search space for
possible issues. This is almost certainly a Nova/Ironic issue. The best
log to look at for Nova in this case would be the scheduler log at
/var/log/nova/nova-scheduler.log, while the best log to look at for
Ironic would be the conductor log at /var/log/ironic/ironic-conductor.log.
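A reasonable first pass over those two logs (the grep patterns below are
just common markers, not exhaustive):

    # scheduler: find the filter run that rejected every host
    sudo grep -B2 "returned 0 hosts" /var/log/nova/nova-scheduler.log | tail -n 20

    # conductor: deploy errors against the failing node
    sudo grep -iE "error|failed" /var/log/ironic/ironic-conductor.log | tail -n 20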
If your logs are very large, it may be better to delete them and
reproduce the issue in order to further limit the search space. Note
that the issue is most likely reproduced within the first 30 minutes of
that test, so you won't need to wait for the full 200+ minutes, which I
am guessing just hits the deploy timeout.
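If you do clear the logs before reproducing, truncating rather than
deleting keeps the daemons' open file handles valid, so nothing ends up
written to a deleted inode:

    # empty the logs in place
    sudo truncate -s 0 /var/log/nova/nova-scheduler.log
    sudo truncate -s 0 /var/log/ironic/ironic-conductor.log

    # then watch both while the deploy runs
    sudo tail -f /var/log/nova/nova-scheduler.log \
                 /var/log/ironic/ironic-conductor.log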
Thanks
_______________________________________________
Rdo-list mailing list
rdo-list@redhat.com
https://www.redhat.com/mailman/listinfo/rdo-list
To unsubscribe: rdo-list-unsubscribe@redhat.com