[Rdo-list] [RDO-Manager] Solving "nova is locked" errors from Ironic

Dmitry Tantsur dtantsur at redhat.com
Tue Jun 23 11:26:08 UTC 2015


Hi all,

So, the Ironic task manager is attacking us again:
https://bugzilla.redhat.com/show_bug.cgi?id=1233452

We already hit this once before:
https://bugzilla.redhat.com/show_bug.cgi?id=1212134
For that one I implemented retries upstream in ironicclient and backported
them to our packages. Later I had to bump the retry timeout in
instack-undercloud to 1 minute:
https://github.com/rdo-management/instack-undercloud/blob/master/scripts/instack-ironic-deployment#L72-L74

Now we have the same problem in another place and I wonder how to fix
it. I have 2 obvious ideas:
1. Patch ironicclient to have a longer default timeout (2 mins?)
2. Update stackrc to carry larger IRONIC_MAX_RETRIES and
IRONIC_RETRY_INTERVAL values

I'd prefer the latter, as it does not touch the ironicclient package, only
the undercloud installation tool. It can also be changed more easily at
run time. WDYT?
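
For illustration, option 2 could look roughly like the sketch below in
stackrc. The numbers are placeholders I have not tested, chosen only so that
the total retry window comes to about 2 minutes, and it assumes the client
picks both variables up from the environment:

```shell
# Hypothetical values, not tested: 24 attempts spaced 5 seconds apart.
export IRONIC_MAX_RETRIES=24
export IRONIC_RETRY_INTERVAL=5

# Rough upper bound on how long the client keeps retrying, in seconds.
IRONIC_RETRY_WINDOW=$((IRONIC_MAX_RETRIES * IRONIC_RETRY_INTERVAL))
echo "ironic retry window: ${IRONIC_RETRY_WINDOW}s"
```

Anyone hitting the error on particularly slow hardware could then override
either variable in their shell without rebuilding any package.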

I wonder what the root cause is as well. I suspect some very slow BMCs
take too much time to do power actions or power syncs. If so, any timeout
we pick is a guess: it's possible we'll guess wrong, and the problem will
become rarer but still persist.

Thanks,
Dmitry



