[rdo-dev] Long queue in RDO SF
tdecacqu at redhat.com
Tue Feb 13 13:29:54 UTC 2018
On February 13, 2018 12:52 pm, Jakub Ruzicka wrote:
> On Mon, Feb 12, 2018 at 5:08 PM, Tristan Cacqueray <tdecacqu at redhat.com> wrote:
>> On February 12, 2018 8:59 am, Javier Pena wrote:
>>> My only doubt is why this does not show up as "NOT_REGISTERED" in Zuul as
>>> it did before.
>> This is because we changed check_job_registration to False in zuul.conf
>> to make Zuul always queue new jobs. We did that because during the previous
>> nodepool outage, Zuul would fail with NOT_REGISTERED when no slaves
>> were online (zuul(v2) only registers jobs for available labels).
>> Perhaps we could add a check for missing JJB jobs in zuul.yaml, or revert
>> check_job_registration back to True.
> I was previously confused by NOT_REGISTERED on a wrong configuration too,
> but it's still better than having the job stuck. That said, I didn't know
> how to debug this error; someone with experience told me how to fix it based on
> So do I understand it correctly that Zuul has no good way of communicating
> job configuration errors?
This is the design of the zuul(v2) gearman architecture: jobs only get
registered when the associated labels are available. So when nodepool or
jenkins gets restarted, it can take a few minutes before slaves are
online, and any change getting queued in that period will get the
> Isn't this possibly an issue to be solved in
> upstream Zuul? Something like returning CONFIG_ERROR that is clickable and
> leads to a log of config errors.
The only way would be to prevent adding unknown jobs to the pipeline in
the first place. Though this would be a temporary measure until the
migration to zuul(v3), which does exactly that by default.
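
For illustration, such a pre-check could look roughly like the sketch
below. It is not an existing Zuul or JJB feature: the function names
(iter_job_names, find_missing_jobs), the sample layout, and the job names
are all hypothetical. It assumes the layout has already been parsed into
plain Python data, that jobs are listed under pipeline-named keys of each
project, and that the set of job names defined in JJB was collected
separately. Project templates are not expanded.

```python
# Sketch only: the layout shape and all job names here are illustrative
# assumptions, not the actual RDO configuration.


def iter_job_names(entry):
    """Yield every job name found in a pipeline entry.

    An entry may be a job name (str), a list of entries, or a mapping
    of a job name to the entries that depend on it.
    """
    if isinstance(entry, str):
        yield entry
    elif isinstance(entry, list):
        for item in entry:
            yield from iter_job_names(item)
    elif isinstance(entry, dict):
        for parent, children in entry.items():
            yield parent
            yield from iter_job_names(children)


def find_missing_jobs(layout, defined_jobs):
    """Return {project name: sorted job names absent from defined_jobs}."""
    pipelines = {p["name"] for p in layout.get("pipelines", [])}
    missing = {}
    for project in layout.get("projects", []):
        unknown = set()
        for key, entry in project.items():
            # Only keys that match a pipeline name carry job lists.
            if key in pipelines:
                unknown.update(
                    name for name in iter_job_names(entry)
                    if name not in defined_jobs
                )
        if unknown:
            missing[project["name"]] = sorted(unknown)
    return missing


if __name__ == "__main__":
    # Hypothetical data standing in for the parsed layout and JJB output.
    layout = {
        "pipelines": [{"name": "check"}, {"name": "gate"}],
        "projects": [
            {"name": "example-project",
             "check": ["unit-tests", {"build-rpm": ["install-rpm"]}]},
        ],
    }
    jjb_jobs = {"unit-tests", "build-rpm"}
    for project, jobs in find_missing_jobs(layout, jjb_jobs).items():
        print(project, "references undefined jobs:", ", ".join(jobs))
```

Running something like this in the config repository's own check job
would reject a change that references an undefined job before it merges,
instead of leaving later changes stuck in the queue.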