[Rdo-list] Fedora 20 / Devstack Networking Issues
bnemec at redhat.com
Tue Jan 28 15:56:43 UTC 2014
----- Original Message -----
> On 01/26/2014 12:01 AM, Perry Myers wrote:
> > Ok, I've been chasing down some networking issues along with some other
> > folks. Here's what I'm seeing:
> > Starting with a vanilla F20 cloud image running on a F20 host, clone
> > devstack into it and run stack.sh.
> > First thing is that the RabbitMQ server issue I noted a few weeks ago is
> > still intermittently there. So during the step where rabbitmqctl is run
> > to set the password of the rabbit admin user, it might fail and all
> > subsequent AMQP communication fails which makes a lot of the nova
> > commands in devstack also fail.
> > But... if you get past this error (since it is intermittent), then
> > devstack seems to complete successfully. Standard commands like nova
> > list, keystone user-list, etc all work fine.
> > I did note though that access to Horizon does not work. I need to
> > investigate this further.
> > But worse than that is when you run nova boot, the host to guest
> > networking (remember this is devstack running in a VM) immediately gets
> > disconnected. This issue is 100% reproducible and multiple users are
> > reporting it (tsedovic, eharney, bnemec cc'd)
> > I did some investigation when this happens and here's what I found...
> > If I do:
> > $ brctl delif br100 eth0
> > I was immediately able to ping the guest from the host and vice versa.
> > If I reattach eth0 back to br100, networking stops again
> > Another thing... I notice that on the system br100 does not have an ip
> > address, but eth0 does. I thought when doing bridged networking like
> > this, the bridge should have the ip address and the physical iface that
> > is attached to the bridge does not get an ip addr.
> > So... I tweaked /etc/sysconfig/network-scripts/ifcfg-eth0 to remove the
> > dhcp from the bootproto line and I copied ifcfg-eth0 to ifcfg-br100
> > allowing it to use bootproto dhcp
> > I brought both ifaces down and then brought them both up. eth0 first
> > and br100 second
> > This time, br100 got the dhcp address from the host and networking
> > worked fine.
> > So is this just an issue with how nova is setting up bridges?
> > Since this network disconnect didn't happen until nova launched a vm, I
> > imagine this isn't a problem with devstack itself, but is likely an
> > issue with Nova Networking somehow.
> > Russell/DanS, is there any chance that all of the refactoring you did in
> > Nova Networking very recently introduced a regression?
> I suppose it's possible. You could try going back to before any of the
> nova-network-objects patches went in. The first one to merge was:
> commit a8c73c7d3298589440579d67e0c5638981dd7718
> Merge: a1f6e85 aa40c8f
> Author: Jenkins <jenkins at review.openstack.org>
> Date: Wed Jan 15 18:38:37 2014 +0000
> Merge "Make nova-network use Service object"
> Try going back to before that and see if you get a different result. If
> so, try using "git bisect" to find the offending commit.
> Russell Bryant
I am still unable to reproduce the networking issue in my environment. I booted a stock Fedora 20 cloud image, installed git, cloned devstack, and ran it with a minimal localrc configuration (so using the defaults of nova-network and RabbitMQ). Other than the RabbitMQ race issue, which always makes my first stack.sh run on a new VM fail, I had no problem completing stack.sh and booting a nova instance. The instance's IP was correctly moved to the bridge for me. If this is a regression in nova-network, then it only shows up in combination with some other circumstance that isn't present for me.
As I mentioned in our off-list discussion of this, I run my development VMs in a local OpenStack installation, so maybe there's some difference in the way the networking works there. In any case, while I don't have an answer, hopefully another data point will help in figuring this out.
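
For anyone comparing a working setup against a broken one, here is a rough sketch of how to check whether the IPv4 address ended up on the bridge or stayed on the NIC after nova boot. It assumes the iproute2 "ip" tool is available and that its one-line output puts the interface name in the second field and "inet" in the third. The function names are made up for this example and do not come from devstack or nova.

```shell
#!/bin/sh
# Sketch only: ifaces_with_ipv4 and address_location are hypothetical
# helper names, not part of devstack or nova.

# Read `ip -o -4 addr show` style output on stdin and print the name of
# every interface that holds an IPv4 address, one per line.
ifaces_with_ipv4() {
    awk '$3 == "inet" { print $2 }' | sort -u
}

# Usage: address_location BRIDGE NIC < ip-output
# Prints "bridge", "nic", "both" or "neither" depending on which of the
# two named interfaces holds an IPv4 address.
address_location() {
    bridge=$1
    nic=$2
    on_bridge=no
    on_nic=no
    for name in $(ifaces_with_ipv4); do
        if [ "$name" = "$bridge" ]; then on_bridge=yes; fi
        if [ "$name" = "$nic" ]; then on_nic=yes; fi
    done
    case "$on_bridge$on_nic" in
        yesyes) echo both ;;
        yesno)  echo bridge ;;
        noyes)  echo nic ;;
        *)      echo neither ;;
    esac
}

# Example, run inside the devstack VM after nova boot:
#   ip -o -4 addr show | address_location br100 eth0
```

In my environment I would expect this to print "bridge" or "both" (br100 may also carry an address of its own), while the setup Perry describes, with the address still on eth0, would print "nic".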