[Rdo-list] [RDO-Manager] Solving "nova is locked" errors from Ironic

Imre Farkas ifarkas at redhat.com
Tue Jun 23 16:05:08 UTC 2015


On 06/23/2015 05:51 PM, Dmitry Tantsur wrote:
> On 06/23/2015 05:39 PM, Imre Farkas wrote:
>> On 06/23/2015 01:26 PM, Dmitry Tantsur wrote:
>>> Hi all,
>>>
>>> So, Ironic task manager is attacking us again:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1233452
>>>
>>> Previously we already had
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1212134
>>> for which I implemented retries upstream in ironicclient and backported
>>> them to our packages. Later I had to bump the retry timeout in
>>> instack-undercloud to 1 minute:
>>> https://github.com/rdo-management/instack-undercloud/blob/master/scripts/instack-ironic-deployment#L72-L74
>>>
>>> Now we have the same problem in another place and I wonder how to fix
>>> it. I have two obvious ideas:
>>> 1. Patch ironicclient to have a longer default timeout (2 mins?)
>>> 2. Update stackrc to carry larger IRONIC_MAX_RETRIES and
>>> IRONIC_RETRY_INTERVAL values
>>>
>>> I'd prefer the latter, as it does not touch the ironicclient package,
>>> only the undercloud installation tool. It can also be changed more
>>> easily at run time. WDYT?
>>
>> Hi Dmitry,
>>
>> These env variables work with the unified CLI too, not just with the
>> instack script, right? I assume the answer is yes (as it seems the
>> script is using the CLI), so #2 seems to be a good option. However, I
>> was wondering whether others will also hit this issue outside RDO, which
>> would make #1 a better option. Considering that, we can try to fix it in
>> ironicclient and if there's any push-back, we can still go with updating
>> stackrc.
>> Also, as an improvement to #1, can we make the timeout configurable in
>> ironicclient, e.g. by passing a --timeout flag?
>
> It is configurable, that's why I'm not that worried about other users of
> ironicclient :)
>

Then #2 seems perfectly fine! ;-)
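
For illustration only, here is a minimal sketch of how a retry count and
a retry interval combine into an overall wait budget. It is not the
ironicclient code: LockedError and call_with_retries are hypothetical
names, and the fallback values 5 and 2 are placeholders rather than
confirmed client defaults. Only the two environment variable names come
from this thread.

```python
import os
import time


class LockedError(Exception):
    """Hypothetical stand-in for the conflict raised while a node is locked."""


def call_with_retries(func, max_retries=None, retry_interval=None,
                      sleep=time.sleep):
    """Call func(), retrying on LockedError up to max_retries more times."""
    # Assumption: fall back to the environment variables named in this
    # thread; the defaults 5 and 2 are placeholders for the sketch.
    if max_retries is None:
        max_retries = int(os.environ.get("IRONIC_MAX_RETRIES", "5"))
    if retry_interval is None:
        retry_interval = float(os.environ.get("IRONIC_RETRY_INTERVAL", "2"))

    for attempt in range(max_retries + 1):
        try:
            return func()
        except LockedError:
            if attempt == max_retries:
                raise  # retry budget exhausted, surface the error
            sleep(retry_interval)  # wait for the lock to be released
```

In this sketch the worst-case time spent waiting is roughly
max_retries * retry_interval seconds, so raising either value in stackrc
widens the window in which a slow operation can finish before the caller
gives up.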

>>
>> Imre
>>
>>
>>>
>>> I wonder what the root cause is as well. I suspect some very slow BMCs
>>> take too much time to do power actions or power syncs. It's possible
>>> we'll guess wrong, and the problem will persist even if it becomes
>>> rarer.
>>>
>>> Thanks,
>>> Dmitry
>>>
>>> _______________________________________________
>>> Rdo-list mailing list
>>> Rdo-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>
>>> To unsubscribe: rdo-list-unsubscribe at redhat.com
>>
>



