[Rdo-list] Trying out Neutron Quickstart running into issues with netns (l2 agent and dhcp agent)

Gilles Dubreuil gilles at redhat.com
Tue Aug 6 00:46:11 UTC 2013


On Mon, 2013-08-05 at 14:53 -0230, Brent Eagles wrote:
> On 08/04/2013 11:27 AM, Perry Myers wrote:
> > Hi,
> >
> > I followed the instructions at:
> > http://openstack.redhat.com/Neutron-Quickstart
> > http://openstack.redhat.com/Running_an_instance_with_Neutron
> >
> > I ran this on a RHEL 6.4 VM with latest updates from 6.4.z.  I made sure
> > to install the netns enabled kernel from RDO repos and reboot with that
> > kernel before running packstack so that I didn't need to reboot the VM
> > after the packstack install (and have br-ex disappear)
> >
> > The packstack install went without incident.  And I was able to follow
> > the launch an instance instructions.
> >
> > I noticed that the cirros VM took a long time to get to a login prompt
> > on the VNC console.  From looking at the console output it appears that
> > the instance was waiting for a dhcp address.
> >
> > Once the VNC session got me to a login prompt, I logged in (as the
> > cirros user) and confirmed that eth0 did not have an ip address.
> >
> > So, something networking related prevented the instance from getting an
> > IP which of course makes ssh'ing into the instance via the floating ip
> > later in the instructions not work properly.
> >
> > I tried ifup'ing eth0 and dhcp discovers were sent out but not responded to.
> >
> > One thing is that on the host running OpenStack services (the VM I ran
> > packstack on), I don't see dnsmasq running except for the default
> > libvirt network:
> >
> >> [admin@rdo-mgmt ~(keystone_demo)]$ ps -ef | grep dnsmas
> >> nobody    1968     1  0 08:59 ?        00:00:00 /usr/sbin/dnsmasq --strict-order --local=// --domain-needed --pid-file=/var/run/libvirt/network/default.pid --conf-file= --except-interface lo --bind-interfaces --listen-address 192.168.122.1 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/default.leases --dhcp-lease-max=253 --dhcp-no-override --dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile --addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
> >
> > So... that seems to be a problem :)
> >
> > Just to confirm, I am running the right kernel:
> >> [root@rdo-mgmt log(keystone_demo)]# uname -a
> >> Linux rdo-mgmt 2.6.32-358.114.1.openstack.el6.x86_64 #1 SMP Wed Jul 3 02:11:25 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
> >
> >> [root@rdo-mgmt log(keystone_demo)]# rpm -q iproute kernel
> >> iproute-2.6.32-23.el6_4.netns.1.x86_64
> >> kernel-2.6.32-358.114.1.openstack.el6.x86_64
> >
> >  From quantum server.log:
> >> 2013-08-04 09:10:48    ERROR [keystoneclient.common.cms] Verify error: Error opening certificate file /var/lib/quantum/keystone-signing/signing_cert.pem
> >> 140222780139336:error:02001002:system library:fopen:No such file or directory:bss_file.c:126:fopen('/var/lib/quantum/keystone-signing/signing_cert.pem','r')
> >> 140222780139336:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:129:
> >>
> >> 2013-08-04 09:10:48    ERROR [keystoneclient.common.cms] Verify error: Error loading file /var/lib/quantum/keystone-signing/cacert.pem
> >> 140279285741384:error:02001002:system library:fopen:No such file or directory:bss_file.c:126:fopen('/var/lib/quantum/keystone-signing/cacert.pem','r')
> >> 140279285741384:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:129:
> >> 140279285741384:error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib:by_file.c:279:
> >
> >  From quantum dhcp-agent.log:
> >
> >> 2013-08-04 09:08:05    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
> >>      data = self._dataqueue.get(timeout=self._timeout)
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
> >>      return waiter.wait()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
> >>      return get_hub().switch()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
> >>      return self.greenlet.switch()
> >> Empty
> >> 2013-08-04 09:08:05    ERROR [quantum.agent.dhcp_agent] Failed reporting state!
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 702, in _report_state
> >>      self.agent_state)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
> >>      topic=self.topic)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
> >>      return rpc.call(context, self._get_topic(topic), msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
> >>      return _get_impl().call(CONF, context, topic, msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
> >>      rpc_amqp.get_connection_pool(conf, Connection))
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
> >>      rv = list(rv)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
> >>      raise rpc_common.Timeout()
> >> Timeout: Timeout while waiting on RPC response.
> >> 2013-08-04 09:08:05  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 56.853869 sec
> >> 2013-08-04 09:08:06     INFO [quantum.agent.dhcp_agent] Synchronizing state
> >> 2013-08-04 09:32:34    ERROR [quantum.agent.dhcp_agent] Unable to enable dhcp.
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 131, in call_driver
> >>      getattr(driver, action)()
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/dhcp.py", line 124, in enable
> >>      reuse_existing=True)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 554, in setup
> >>      namespace=namespace)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
> >>      ns_dev.link.set_address(mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
> >>      self._as_root('set', self.name, 'address', mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
> >>      kwargs.get('use_root_namespace', False))
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
> >>      namespace)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
> >>      root_helper=root_helper)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
> >>      raise RuntimeError(m)
> >> RuntimeError:
> >> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'link', 'set', 'tap07d8cc77-fc', 'address', 'fa:16:3e:da:66:28']
> >> Exit code: 2
> >> Stdout: ''
> >> Stderr: 'RTNETLINK answers: Device or resource busy\n'
> >> 2013-08-04 09:32:36     INFO [quantum.agent.dhcp_agent] Synchronizing state
> >> 2013-08-04 09:32:41    ERROR [quantum.agent.dhcp_agent] Unable to enable dhcp.
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 131, in call_driver
> >>      getattr(driver, action)()
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/dhcp.py", line 124, in enable
> >>      reuse_existing=True)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 554, in setup
> >>      namespace=namespace)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
> >>      ns_dev.link.set_address(mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
> >>      self._as_root('set', self.name, 'address', mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
> >>      kwargs.get('use_root_namespace', False))
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
> >>      namespace)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
> >>      root_helper=root_helper)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
> >>      raise RuntimeError(m)
> >
> > The RTNETLINK errors just repeat indefinitely
> >
> >  From openvswitch-agent.log:
> >
> >> 2013-08-04 09:08:29    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
> >>      data = self._dataqueue.get(timeout=self._timeout)
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
> >>      return waiter.wait()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
> >>      return get_hub().switch()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
> >>      return self.greenlet.switch()
> >> Empty
> >> 2013-08-04 09:08:29    ERROR [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Failed reporting state!
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/plugins/openvswitch/agent/ovs_quantum_agent.py", line 201, in _report_state
> >>      self.agent_state)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
> >>      topic=self.topic)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
> >>      return rpc.call(context, self._get_topic(topic), msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
> >>      return _get_impl().call(CONF, context, topic, msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
> >>      rpc_amqp.get_connection_pool(conf, Connection))
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
> >>      rv = list(rv)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
> >>      raise rpc_common.Timeout()
> >> Timeout: Timeout while waiting on RPC response.
> >
> > Do we have a race condition wrt various Quantum agents connecting to the
> > qpid bus that is just generating initial qpid connection error messages
> > that can be safely ignored?
> >
> > If so, is there any way we can clean this up?
> >
> >  From l3-agent.log:
> >
> >> 2013-08-04 09:08:06    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
> >>      data = self._dataqueue.get(timeout=self._timeout)
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
> >>      return waiter.wait()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
> >>      return get_hub().switch()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
> >>      return self.greenlet.switch()
> >> Empty
> >> 2013-08-04 09:08:06    ERROR [quantum.agent.l3_agent] Failed reporting state!
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 723, in _report_state
> >>      self.agent_state)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
> >>      topic=self.topic)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
> >>      return rpc.call(context, self._get_topic(topic), msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
> >>      return _get_impl().call(CONF, context, topic, msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
> >>      rpc_amqp.get_connection_pool(conf, Connection))
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
> >>      rv = list(rv)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
> >>      raise rpc_common.Timeout()
> >> Timeout: Timeout while waiting on RPC response.
> >> 2013-08-04 09:08:06  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 56.554131 sec
> >> 2013-08-04 09:08:10    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
> >>      data = self._dataqueue.get(timeout=self._timeout)
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
> >>      return waiter.wait()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
> >>      return get_hub().switch()
> >>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
> >>      return self.greenlet.switch()
> >> Empty
> >> 2013-08-04 09:08:10    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 637, in _sync_routers_task
> >>      context, router_id)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 77, in get_routers
> >>      topic=self.topic)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
> >>      return rpc.call(context, self._get_topic(topic), msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
> >>      return _get_impl().call(CONF, context, topic, msg, timeout)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
> >>      rpc_amqp.get_connection_pool(conf, Connection))
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
> >>      rv = list(rv)
> >>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
> >>      raise rpc_common.Timeout()
> >> Timeout: Timeout while waiting on RPC response.
> >> 2013-08-04 09:08:10  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 20.022704 sec
> >> 2013-08-04 09:11:33    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 638, in _sync_routers_task
> >>      self._process_routers(routers, all_routers=True)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 621, in _process_routers
> >>      self.process_router(ri)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 319, in process_router
> >>      self.external_gateway_added(ri, ex_gw_port, internal_cidrs)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 410, in external_gateway_added
> >>      prefix=EXTERNAL_DEV_PREFIX)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
> >>      ns_dev.link.set_address(mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
> >>      self._as_root('set', self.name, 'address', mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
> >>      kwargs.get('use_root_namespace', False))
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
> >>      namespace)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
> >>      root_helper=root_helper)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
> >>      raise RuntimeError(m)
> >> RuntimeError:
> >> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'link', 'set', 'qg-46ed452c-5e', 'address', 'fa:16:3e:e7:d8:30']
> >> Exit code: 2
> >> Stdout: ''
> >> Stderr: 'RTNETLINK answers: Device or resource busy\n'
> >> 2013-08-04 09:12:11    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
> >> Traceback (most recent call last):
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 638, in _sync_routers_task
> >>      self._process_routers(routers, all_routers=True)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 621, in _process_routers
> >>      self.process_router(ri)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 319, in process_router
> >>      self.external_gateway_added(ri, ex_gw_port, internal_cidrs)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 410, in external_gateway_added
> >>      prefix=EXTERNAL_DEV_PREFIX)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
> >>      ns_dev.link.set_address(mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
> >>      self._as_root('set', self.name, 'address', mac_address)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
> >>      kwargs.get('use_root_namespace', False))
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
> >>      namespace)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
> >>      root_helper=root_helper)
> >>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
> >>      raise RuntimeError(m)
> >
> > Same qpid connection issue, which I'm assuming can just be ignored at
> > this point.  But there are also similar device busy errors when creating
> > the namespace for the l3 agent.
> >
> > It appears that the issue with both the l3 agent and the dhcp agent is
> > that the namespace can't be created, which causes both of them to fail.
> >
> > Anyone have any thoughts on what to look at next here?
> >
> > Perry
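
To see which devices keep hitting the busy error, something like the
following could summarise an agent log. It is only a sketch: busy_devices
is a made-up helper name, it assumes Python 3 (the hosts above run 2.6),
and it assumes the log keeps the "Command:" / "Stderr:" layout shown in
the excerpts above.

```python
import ast
from collections import Counter


def busy_devices(log_text):
    """Count devices whose 'ip link set <dev> ...' command failed as busy.

    Hypothetical helper: relies on each failing command being logged as a
    'Command: [...]' line followed later by a 'Stderr: ...' line.
    """
    counts = Counter()
    command = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Command:"):
            try:
                # The command is logged as a Python list literal.
                command = ast.literal_eval(line[len("Command:"):].strip())
            except (ValueError, SyntaxError):
                command = None
        elif line.startswith("Stderr:") and command is not None:
            if "Device or resource busy" in line and "set" in command:
                idx = command.index("set")
                if idx + 1 < len(command):
                    # The device name follows 'set' in 'ip link set <dev>'.
                    counts[command[idx + 1]] += 1
            command = None
    return counts
```

Feeding it the contents of dhcp-agent.log and l3-agent.log should show
whether the failures are limited to the tap and qg- devices seen above.
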
> 
> I ran into these issues as well. I noticed that ovs_use_veth was 
> commented out in dhcp_agent.ini and l3_agent.ini. I uncommented them and 
> set them to True and restarted. The vm now has an IP address.
> 
That does seem to be the case on RDO: ovs_use_veth is commented out there.
In RHOS, on the other hand, it seems to be set by default
in /usr/share/quantum/quantum-dist.conf.
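
A quick way to check a given agent file might look like the sketch below.
The function name ovs_use_veth_enabled is made up for illustration, the
code assumes Python 3 and that the option lives under [DEFAULT], and it
only looks at the text it is given, so it does not account for values
picked up from quantum-dist.conf.

```python
import configparser


def ovs_use_veth_enabled(ini_text):
    """Return True only if ovs_use_veth is set to a true value.

    Hypothetical helper: a commented-out or missing option counts as
    False, which is the situation described above on RDO.
    """
    # strict=False tolerates duplicate options; interpolation=None avoids
    # errors on values that contain a literal '%'.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return parser.getboolean("DEFAULT", "ovs_use_veth", fallback=False)
```

It could be run against the contents of dhcp_agent.ini and l3_agent.ini
(probably under /etc/quantum, though that location is an assumption here)
before and after making the change.
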

> I noticed something else peculiar though... the public network.. the one 
> set as the gateway for the router has dhcp enabled. I'm not sure why we 
> would do that.
> 
> Cheers,
> 
> Brent
> 




