Hello stackers,

I have a three-controller cluster deployed with TripleO (Queens, tripleo-current). After every fresh deployment, the first several instance launches fail: new instances either get stuck in BUILD or go to ERROR state. This continues until the "nova_placement" container on the controller holding the InternalAPI VIP becomes "unhealthy" and is restarted manually with docker restart. Only after that does everything go back to normal. The problem reproduces consistently on every fresh deployment.
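
For reference, the manual workaround amounts to roughly the following. This is only a sketch: the function name restart_if_unhealthy is made up for illustration, and it assumes it is run on the controller holding the InternalAPI VIP by a user allowed to talk to the Docker daemon.

```shell
# Hypothetical helper: restart a container only if Docker's health check
# currently reports it as "unhealthy".
restart_if_unhealthy() {
    name="$1"
    # .State.Health.Status is only populated for containers that define a health check.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name")
    if [ "$status" = "unhealthy" ]; then
        docker restart "$name"
    else
        echo "$name is $status; leaving it alone"
    fi
}
```

Calling it as `restart_if_unhealthy nova_placement` does what I have been doing by hand. It does not address the root cause; it only automates the restart.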

I found a similar description of the symptoms in this bug [1]. The report is for Rocky, but the symptoms match what I see on Queens.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1630069

Here is an excerpt from /var/log/containers/nova/nova-scheduler.log on one of the controller nodes:
...
2018-10-06 04:57:37.926 28 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Host mapping not found for host overcloud-novacompute-1.localdomain. Not tracking instance info for this host.
2018-10-06 04:57:37.926 25 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Host mapping not found for host overcloud-novacompute-1.localdomain. Not tracking instance info for this host.
2018-10-06 04:57:37.926 30 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Host mapping not found for host overcloud-novacompute-1.localdomain. Not tracking instance info for this host.
2018-10-06 04:57:37.926 32 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Host mapping not found for host overcloud-novacompute-1.localdomain. Not tracking instance info for this host.
2018-10-06 04:57:37.926 26 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Host mapping not found for host overcloud-novacompute-1.localdomain. Not tracking instance info for this host.
2018-10-06 04:57:37.927 28 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Received a sync request from an unknown host 'overcloud-novacompute-1.localdomain'. Re-created its InstanceList.
2018-10-06 04:57:37.927 25 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Received a sync request from an unknown host 'overcloud-novacompute-1.localdomain'. Re-created its InstanceList.
2018-10-06 04:57:37.927 30 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Received a sync request from an unknown host 'overcloud-novacompute-1.localdomain'. Re-created its InstanceList.
2018-10-06 04:57:37.927 32 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Received a sync request from an unknown host 'overcloud-novacompute-1.localdomain'. Re-created its InstanceList.
2018-10-06 04:57:37.927 26 INFO nova.scheduler.host_manager [req-dfa89ff4-cbde-482e-ad40-0696294ffdd1 - - - - -] Received a sync request from an unknown host 'overcloud-novacompute-1.localdomain'. Re-created its InstanceList.

Could the same bug be affecting Queens as well?

Cody