[Rdo-list] Trying out Neutron Quickstart running into issues with netns (l2 agent and dhcp agent)

Brent Eagles beagles at redhat.com
Mon Aug 5 17:23:39 UTC 2013


On 08/04/2013 11:27 AM, Perry Myers wrote:
> Hi,
>
> I followed the instructions at:
> http://openstack.redhat.com/Neutron-Quickstart
> http://openstack.redhat.com/Running_an_instance_with_Neutron
>
> I ran this on a RHEL 6.4 VM with latest updates from 6.4.z.  I made sure
> to install the netns enabled kernel from RDO repos and reboot with that
> kernel before running packstack so that I didn't need to reboot the VM
> after the packstack install (and have br-ex disappear).
>
> The packstack install went without incident, and I was able to follow
> the launch-an-instance instructions.
>
> I noticed that the cirros VM took a long time to get to a login prompt
> on the VNC console.  From looking at the console output it appears that
> the instance was waiting for a dhcp address.
>
> Once the VNC session got me to a login prompt, I logged in (as the
> cirros user) and confirmed that eth0 did not have an ip address.
>
> So, something networking-related prevented the instance from getting an
> IP, which of course means that ssh'ing into the instance via the floating
> IP later in the instructions does not work.
>
> I tried ifup'ing eth0 and dhcp discovers were sent out but not responded to.
>
> One thing is that on the host running OpenStack services (the VM I ran
> packstack on), I don't see dnsmasq running except for the default
> libvirt network:
>
>> [admin@rdo-mgmt ~(keystone_demo)]$ ps -ef | grep dnsmas
>> nobody    1968     1  0 08:59 ?        00:00:00 /usr/sbin/dnsmasq --strict-order --local=// --domain-needed --pid-file=/var/run/libvirt/network/default.pid --conf-file= --except-interface lo --bind-interfaces --listen-address 192.168.122.1 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/default.leases --dhcp-lease-max=253 --dhcp-no-override --dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile --addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
>
> So... that seems to be a problem :)
>
> Just to confirm, I am running the right kernel:
>> [root@rdo-mgmt log(keystone_demo)]# uname -a
>> Linux rdo-mgmt 2.6.32-358.114.1.openstack.el6.x86_64 #1 SMP Wed Jul 3 02:11:25 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
>
>> [root@rdo-mgmt log(keystone_demo)]# rpm -q iproute kernel
>> iproute-2.6.32-23.el6_4.netns.1.x86_64
>> kernel-2.6.32-358.114.1.openstack.el6.x86_64
>
> From quantum server.log:
>> 2013-08-04 09:10:48    ERROR [keystoneclient.common.cms] Verify error: Error opening certificate file /var/lib/quantum/keystone-signing/signing_cert.pem
>> 140222780139336:error:02001002:system library:fopen:No such file or directory:bss_file.c:126:fopen('/var/lib/quantum/keystone-signing/signing_cert.pem','r')
>> 140222780139336:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:129:
>>
>> 2013-08-04 09:10:48    ERROR [keystoneclient.common.cms] Verify error: Error loading file /var/lib/quantum/keystone-signing/cacert.pem
>> 140279285741384:error:02001002:system library:fopen:No such file or directory:bss_file.c:126:fopen('/var/lib/quantum/keystone-signing/cacert.pem','r')
>> 140279285741384:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:129:
>> 140279285741384:error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib:by_file.c:279:
>
> From quantum dhcp-agent.log:
>
>> 2013-08-04 09:08:05    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>      data = self._dataqueue.get(timeout=self._timeout)
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>      return waiter.wait()
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>      return get_hub().switch()
>>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>      return self.greenlet.switch()
>> Empty
>> 2013-08-04 09:08:05    ERROR [quantum.agent.dhcp_agent] Failed reporting state!
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 702, in _report_state
>>      self.agent_state)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
>>      topic=self.topic)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>      return rpc.call(context, self._get_topic(topic), msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>      return _get_impl().call(CONF, context, topic, msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>      rpc_amqp.get_connection_pool(conf, Connection))
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>      rv = list(rv)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>      raise rpc_common.Timeout()
>> Timeout: Timeout while waiting on RPC response.
>> 2013-08-04 09:08:05  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 56.853869 sec
>> 2013-08-04 09:08:06     INFO [quantum.agent.dhcp_agent] Synchronizing state
>> 2013-08-04 09:32:34    ERROR [quantum.agent.dhcp_agent] Unable to enable dhcp.
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 131, in call_driver
>>      getattr(driver, action)()
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/dhcp.py", line 124, in enable
>>      reuse_existing=True)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 554, in setup
>>      namespace=namespace)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>      ns_dev.link.set_address(mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>      self._as_root('set', self.name, 'address', mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>      kwargs.get('use_root_namespace', False))
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>      namespace)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>      root_helper=root_helper)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>      raise RuntimeError(m)
>> RuntimeError:
>> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'link', 'set', 'tap07d8cc77-fc', 'address', 'fa:16:3e:da:66:28']
>> Exit code: 2
>> Stdout: ''
>> Stderr: 'RTNETLINK answers: Device or resource busy\n'
>> 2013-08-04 09:32:36     INFO [quantum.agent.dhcp_agent] Synchronizing state
>> 2013-08-04 09:32:41    ERROR [quantum.agent.dhcp_agent] Unable to enable dhcp.
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 131, in call_driver
>>      getattr(driver, action)()
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/dhcp.py", line 124, in enable
>>      reuse_existing=True)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/dhcp_agent.py", line 554, in setup
>>      namespace=namespace)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>      ns_dev.link.set_address(mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>      self._as_root('set', self.name, 'address', mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>      kwargs.get('use_root_namespace', False))
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>      namespace)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>      root_helper=root_helper)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>      raise RuntimeError(m)
>
> The RTNETLINK errors just repeat indefinitely.
>
> From openvswitch-agent.log:
>
>> 2013-08-04 09:08:29    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>      data = self._dataqueue.get(timeout=self._timeout)
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>      return waiter.wait()
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>      return get_hub().switch()
>>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>      return self.greenlet.switch()
>> Empty
>> 2013-08-04 09:08:29    ERROR [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Failed reporting state!
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/plugins/openvswitch/agent/ovs_quantum_agent.py", line 201, in _report_state
>>      self.agent_state)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
>>      topic=self.topic)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>      return rpc.call(context, self._get_topic(topic), msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>      return _get_impl().call(CONF, context, topic, msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>      rpc_amqp.get_connection_pool(conf, Connection))
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>      rv = list(rv)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>      raise rpc_common.Timeout()
>> Timeout: Timeout while waiting on RPC response.
>
> Do we have a race condition wrt various Quantum agents connecting to the
> qpid bus that is just generating initial qpid connection error messages
> that can be safely ignored?
>
> If so, is there any way we can clean this up?
>
> From l3-agent.log:
>
>> 2013-08-04 09:08:06    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>      data = self._dataqueue.get(timeout=self._timeout)
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>      return waiter.wait()
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>      return get_hub().switch()
>>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>      return self.greenlet.switch()
>> Empty
>> 2013-08-04 09:08:06    ERROR [quantum.agent.l3_agent] Failed reporting state!
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 723, in _report_state
>>      self.agent_state)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/rpc.py", line 66, in report_state
>>      topic=self.topic)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>      return rpc.call(context, self._get_topic(topic), msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>      return _get_impl().call(CONF, context, topic, msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>      rpc_amqp.get_connection_pool(conf, Connection))
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>      rv = list(rv)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>      raise rpc_common.Timeout()
>> Timeout: Timeout while waiting on RPC response.
>> 2013-08-04 09:08:06  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 56.554131 sec
>> 2013-08-04 09:08:10    ERROR [quantum.openstack.common.rpc.amqp] Timed out waiting for RPC response.
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 495, in __iter__
>>      data = self._dataqueue.get(timeout=self._timeout)
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 298, in get
>>      return waiter.wait()
>>    File "/usr/lib/python2.6/site-packages/eventlet/queue.py", line 129, in wait
>>      return get_hub().switch()
>>    File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 177, in switch
>>      return self.greenlet.switch()
>> Empty
>> 2013-08-04 09:08:10    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 637, in _sync_routers_task
>>      context, router_id)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 77, in get_routers
>>      topic=self.topic)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
>>      return rpc.call(context, self._get_topic(topic), msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/__init__.py", line 140, in call
>>      return _get_impl().call(CONF, context, topic, msg, timeout)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/impl_qpid.py", line 611, in call
>>      rpc_amqp.get_connection_pool(conf, Connection))
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 614, in call
>>      rv = list(rv)
>>    File "/usr/lib/python2.6/site-packages/quantum/openstack/common/rpc/amqp.py", line 500, in __iter__
>>      raise rpc_common.Timeout()
>> Timeout: Timeout while waiting on RPC response.
>> 2013-08-04 09:08:10  WARNING [quantum.openstack.common.loopingcall] task run outlasted interval by 20.022704 sec
>> 2013-08-04 09:11:33    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 638, in _sync_routers_task
>>      self._process_routers(routers, all_routers=True)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 621, in _process_routers
>>      self.process_router(ri)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 319, in process_router
>>      self.external_gateway_added(ri, ex_gw_port, internal_cidrs)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 410, in external_gateway_added
>>      prefix=EXTERNAL_DEV_PREFIX)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>      ns_dev.link.set_address(mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>      self._as_root('set', self.name, 'address', mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>      kwargs.get('use_root_namespace', False))
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>      namespace)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>      root_helper=root_helper)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>      raise RuntimeError(m)
>> RuntimeError:
>> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'link', 'set', 'qg-46ed452c-5e', 'address', 'fa:16:3e:e7:d8:30']
>> Exit code: 2
>> Stdout: ''
>> Stderr: 'RTNETLINK answers: Device or resource busy\n'
>> 2013-08-04 09:12:11    ERROR [quantum.agent.l3_agent] Failed synchronizing routers
>> Traceback (most recent call last):
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 638, in _sync_routers_task
>>      self._process_routers(routers, all_routers=True)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 621, in _process_routers
>>      self.process_router(ri)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 319, in process_router
>>      self.external_gateway_added(ri, ex_gw_port, internal_cidrs)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/l3_agent.py", line 410, in external_gateway_added
>>      prefix=EXTERNAL_DEV_PREFIX)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/interface.py", line 181, in plug
>>      ns_dev.link.set_address(mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 180, in set_address
>>      self._as_root('set', self.name, 'address', mac_address)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 167, in _as_root
>>      kwargs.get('use_root_namespace', False))
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 47, in _as_root
>>      namespace)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/ip_lib.py", line 58, in _execute
>>      root_helper=root_helper)
>>    File "/usr/lib/python2.6/site-packages/quantum/agent/linux/utils.py", line 61, in execute
>>      raise RuntimeError(m)
>
> Same qpid connection issue, which I'm assuming can just be ignored at
> this point.  But there are also similar device busy errors when creating
> the namespace for the l3 agent.
>
> It appears that the issue with both the l3 agent and the dhcp agent is that
> the namespace can't be created, which causes both of them to fail.
>
> Anyone have any thoughts on what to look at next here?
>
> Perry

I ran into these issues as well. I noticed that ovs_use_veth was
commented out in dhcp_agent.ini and l3_agent.ini. I uncommented it in
both files, set it to True, and restarted. The VM now has an IP address.
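
For anyone who wants to script that change, here is a rough sketch of the
edit in Python. It is only an illustration, not something from packstack:
the helper names are made up, and the file locations (/etc/quantum/dhcp_agent.ini
and /etc/quantum/l3_agent.ini) are an assumption based on where rootwrap.conf
lives in the tracebacks above. It handles the option whether it is commented
out or already set to something else.

```python
import re

# Matches an "ovs_use_veth" line whether or not it is commented out,
# e.g. "# ovs_use_veth = False" or "ovs_use_veth=False".
# [ \t]* is used instead of \s* so the match never spans line breaks.
_OPTION_RE = re.compile(r"^[ \t]*#?[ \t]*ovs_use_veth[ \t]*=.*$", re.MULTILINE)


def enable_ovs_use_veth(ini_text):
    """Return ini_text with every ovs_use_veth line set to True.

    If the option does not appear at all, the text is returned unchanged
    so the caller can decide where to add it.
    """
    return _OPTION_RE.sub("ovs_use_veth = True", ini_text)


def patch_file(path):
    """Rewrite one agent config file in place (hypothetical helper).

    Returns True if the file was modified.
    """
    with open(path) as handle:
        original = handle.read()
    updated = enable_ovs_use_veth(original)
    if updated != original:
        with open(path, "w") as handle:
            handle.write(updated)
    return updated != original
```

Run as root, patch_file("/etc/quantum/dhcp_agent.ini") and
patch_file("/etc/quantum/l3_agent.ini") would apply the change (assuming
those paths are right for your install); the DHCP and L3 agents still need
to be restarted afterwards to pick it up.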

I noticed something else peculiar, though: the public network, the one
set as the gateway for the router, has DHCP enabled. I'm not sure why we
would do that.

Cheers,

Brent



