On February 13, 2018 12:52 pm, Jakub Ruzicka wrote:
On Mon, Feb 12, 2018 at 5:08 PM, Tristan Cacqueray <tdecacqu(a)redhat.com> wrote:
> On February 12, 2018 8:59 am, Javier Pena wrote:
> [snip]
>
>> My only doubt is why this does not show up as "NOT_REGISTERED" in Zuul as
>> it did before.
>>
>
> This is because we changed check_job_registration to False in zuul.conf
> to make Zuul always queue new jobs. We did that because during the previous
> nodepool outage, Zuul would fail with NOT_REGISTERED when no slaves
> were online (zuul(v2) only registers jobs for available labels).
>
> Perhaps we could add a check for missing JJB jobs in zuul.yaml, or revert
> check_job_registration back to true.
I was previously confused by NOT_REGISTERED on a wrong configuration too,
but it's still better than having the job stuck. That said, I didn't know
how to debug this error; someone with experience told me how to fix it based
on guesswork.
So do I understand correctly that Zuul has no good way of communicating
job configuration errors?
This is the design of the zuul(v2) gearman architecture: jobs only get
registered when the associated labels are available. So when nodepool or
jenkins gets restarted, it can take a few minutes before the slaves are
online, and any change queued in that period will get the
NOT_REGISTERED error.
Isn't this possibly an issue to be solved in upstream Zuul? Something like
returning a CONFIG_ERROR that is clickable and leads to a log of config
errors.
The only way would be to prevent adding unknown jobs to the pipeline in
the first place. Though this would be a temporary measure until the
migration to zuul(v3), which does exactly that by default.
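
For illustration only, here is a rough sketch of what such a check could
look like. It is not an existing Zuul or JJB feature. It assumes the layout
has already been parsed into a dict with a "projects" list, where each
project maps pipeline names to job entries (plain names, lists, or
parent-to-children mappings), and that the list of job names defined in JJB
is obtained separately. The function names and the skipped "template" key
are hypothetical choices for this example.

```python
def referenced_jobs(layout):
    """Collect every job name referenced by the projects of a parsed layout.

    Assumed (hypothetical) layout shape:
      {"projects": [{"name": "...", "<pipeline>": [job entries]}]}
    """
    names = set()

    def walk(node):
        # A job entry may be a plain name, a list of entries, or a mapping
        # of a parent job to the jobs that depend on it.
        if isinstance(node, str):
            names.add(node)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, dict):
            for parent, children in node.items():
                names.add(parent)
                walk(children)

    for project in layout.get("projects", []):
        for key, value in project.items():
            # Assumption: every key except these two names a pipeline.
            if key not in ("name", "template"):
                walk(value)
    return names


def missing_jobs(layout, defined_jobs):
    """Return the referenced job names that are absent from defined_jobs."""
    return sorted(referenced_jobs(layout) - set(defined_jobs))
```

Running something like this before merging a configuration change would
flag a job that the layout references but JJB never defines, rather than
leaving the change queued.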
Regards,
-Tristan