4. A critical bug was found in both the Juno and Kilo versions of nova. If I
launch approximately 20 VMs via a heat resource group with floating IPs,
only about 7 of the VMs show their ports as assigned in nova. The others
do get their ports assigned, because they can reach the DHCP and metadata
servers, so their networking is operational. neutron port-list shows their
ports as active. However, nova list does not show their IPs from the
instance info cache.
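
For anyone who wants to check for the same mismatch, here is a minimal
Python sketch. It assumes the port and server records have already been
fetched and turned into plain dicts; the function name and the sample
records are hypothetical, and the field names (device_id, status, id,
addresses) are my assumption about how the two APIs label this data. It
only flags instances that have an ACTIVE Neutron port but no addresses on
the nova side.

```python
def find_missing_in_nova(neutron_ports, nova_servers):
    """Return sorted IDs of servers that own an ACTIVE port but show no IPs.

    Both arguments are lists of plain dicts (assumed shapes, see above).
    """
    # Instance IDs that Neutron reports as owning at least one ACTIVE port.
    active_devices = {
        port["device_id"]
        for port in neutron_ports
        if port.get("status") == "ACTIVE"
    }
    # Servers whose address map is empty even though a port is active.
    return sorted(
        server["id"]
        for server in nova_servers
        if server["id"] in active_devices and not server.get("addresses")
    )


# Made-up sample data, for illustration only.
sample_ports = [
    {"device_id": "vm-1", "status": "ACTIVE"},
    {"device_id": "vm-2", "status": "ACTIVE"},
]
sample_servers = [
    {"id": "vm-1", "addresses": {"private": ["10.0.0.3"]}},
    {"id": "vm-2", "addresses": {}},
]
missing = find_missing_in_nova(sample_ports, sample_servers)
```

With the sample data above, missing contains only vm-2, the instance
whose port is active in Neutron but absent from nova's view.
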
My only workaround for this problem is to run the Icehouse version of nova
(api, conductor, scheduler, compute), which works perfectly. I have filed
a bug with a 100% reliable, easy-to-use reproducer and more details and
logs here:
Interestingly, in my informal tests Icehouse nova is about 4x faster at
placing VMs in the active state than Juno or Kilo, so that may need some
attention as well. Just watching top, it appears neutron-server is much
busier (~35% CPU utilization of 1 core during the entire ->ACTIVE
process) with the Juno/Kilo releases.
Note that I spent about 7 days trying to debug this problem, but the code
handles IP assignments in about 40 different places in the code base,
including exchanges over RPC and python-neutronclient, so it is very
difficult to track. I would appreciate help from a nova expert to debug
the problem further.