[rdo-list] Problem with ha-router

Emilien Macchi emilien at redhat.com
Mon Oct 23 16:57:03 UTC 2017


Hey Cédric,

You might get some help on openstack-dev [tripleo] or by filing a bug in
launchpad/tripleo, but in my experience with rdo-list, there is no TripleO
support here.
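
In the meantime, a quick way to spot routers with no active agent is to script
over the CLI table you pasted. This is only an untested sketch; the helper
names are mine, not part of any OpenStack client library:

```python
# Sketch: parse the ASCII table printed by
# `neutron l3-agent-list-hosting-router <router>` and map each host to
# its ha_state. Helper names are illustrative, not an OpenStack API.

def parse_ha_states(table_text):
    """Map host -> ha_state from the CLI's ASCII table output."""
    states = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +---+ border rows
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) == 5 and cells[0] != "id":
            states[cells[1]] = cells[4]  # host and ha_state columns
    return states

def has_active(states):
    """True if at least one agent reports the router as active."""
    return any(s == "active" for s in states.values())
```

If `has_active()` comes back False for a router, every agent is standby, which
would be consistent with the unbound HA port warning you pasted.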

HTH,

On Mon, Oct 23, 2017 at 1:23 AM, Cedric Lecomte <clecomte at redhat.com> wrote:

> Hello all,
>
> I tried to deploy RDO Pike without containers on our internal platform.
>
> The setup is pretty simple:
>  - 3 Controllers in HA
>  - 5 Ceph
>  - 4 Compute
>  - 3 Object-Store
>
> I didn't use any exotic parameters.
> This is my deployment command:
>
> openstack overcloud deploy --templates
>   -e environement.yaml
>   --ntp-server 0.pool.ntp.org
>   -e storage-env.yaml
>   -e network-env.yaml
>   -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph.yaml
>
>   --control-scale 3 --control-flavor control
>   --compute-scale 4 --compute-flavor compute
>   --ceph-storage-scale 5 --ceph-storage-flavor ceph-storage
>   --swift-storage-flavor swift-storage --swift-storage-scale 3
>   -e scheduler_hints_env.yaml
>   -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml
>
>   -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml
>
> *environement.yaml:*
> parameter_defaults:
>   ControllerCount: 3
>   ComputeCount: 4
>   CephStorageCount: 5
>   OvercloudCephStorageFlavor: ceph-storage
>   CephDefaultPoolSize: 3
>   ObjectStorageCount: 3
>
> *network-env.yaml:*
> resource_registry:
>   OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
>   OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
>   OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/templates/nic-configs/ceph-storage.yaml
>   OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/templates/nic-configs/swift-storage.yaml
>
> parameter_defaults:
>   InternalApiNetCidr: 172.16.0.0/24
>   TenantNetCidr: 172.17.0.0/24
>   StorageNetCidr: 172.18.0.0/24
>   StorageMgmtNetCidr: 172.19.0.0/24
>   ManagementNetCidr: 172.20.0.0/24
>   ExternalNetCidr: 10.41.11.0/24
>   InternalApiAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
>   TenantAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
>   StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
>   StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]
>   ManagementAllocationPools: [{'start': '172.20.0.10', 'end': '172.20.0.200'}]
>   # Leave room for floating IPs in the External allocation pool
>   ExternalAllocationPools: [{'start': '10.41.11.10', 'end': '10.41.11.30'}]
>   # Set to the router gateway on the external network
>   ExternalInterfaceDefaultRoute: 10.41.11.254
>   # Gateway router for the provisioning network (or Undercloud IP)
>   ControlPlaneDefaultRoute: 192.168.131.253
>   # The IP address of the EC2 metadata server. Generally the IP of the Undercloud
>   EC2MetadataIp: 192.0.2.1
>   # Define the DNS servers (maximum 2) for the overcloud nodes
>   DnsServers: ["10.38.5.26"]
>   InternalApiNetworkVlanID: 202
>   StorageNetworkVlanID: 203
>   StorageMgmtNetworkVlanID: 204
>   TenantNetworkVlanID: 205
>   ManagementNetworkVlanID: 206
>   ExternalNetworkVlanID: 198
>   NeutronExternalNetworkBridge: "''"
>   ControlPlaneSubnetCidr: '24'
>   BondInterfaceOvsOptions: "mode=balance-xor"
>
> *storage-env.yaml:*
> parameter_defaults:
>   ExtraConfig:
>     ceph::profile::params::osds:
>         '/dev/sdb': {}
>         '/dev/sdc': {}
>         '/dev/sdd': {}
>         '/dev/sde': {}
>         '/dev/sdf': {}
>         '/dev/sdg': {}
>   SwiftRingBuild: false
>   RingBuild: false
>
>
> *scheduler_hints_env.yaml*
> parameter_defaults:
>     ControllerSchedulerHints:
>         'capabilities:node': 'control-%index%'
>     NovaComputeSchedulerHints:
>         'capabilities:node': 'compute-%index%'
>     CephStorageSchedulerHints:
>         'capabilities:node': 'ceph-storage-%index%'
>     ObjectStorageSchedulerHints:
>         'capabilities:node': 'swift-storage-%index%'
>
> After a little use, I found that one controller is unable to get an active
> ha-router, and I got this output:
>
> neutron l3-agent-list-hosting-router XXX
> +--------------------------------------+------------------------------------+----------------+-------+----------+
> | id                                   | host                               | admin_state_up | alive | ha_state |
> +--------------------------------------+------------------------------------+----------------+-------+----------+
> | 420a7e31-bae1-4f8c-9438-97839cf190c4 | overcloud-controller-0.localdomain | True           | :-)   | standby  |
> | 6a943aa5-6fd1-4b44-8557-f0043b266a2f | overcloud-controller-1.localdomain | True           | :-)   | standby  |
> | dd66ef16-7533-434f-bf5b-25e38c51375f | overcloud-controller-2.localdomain | True           | :-)   | standby  |
> +--------------------------------------+------------------------------------+----------------+-------+----------+
>
> So each time a router is scheduled on this controller, I can't get an active
> router. I tried to compare the configurations, but everything seems to be
> fine. I redeployed to see if it would help, and the only thing that changed
> is the controller where the ha-routers are stuck.
>
> The only message that I got is from OVS:
>
> 2017-10-20 08:38:44.930 136145 WARNING neutron.agent.rpc [req-0ad9aec4-f718-498f-9ca7-15b265340174 - - - - -] Device Port(admin_state_up=True,allowed_address_pairs=[],binding=PortBinding,binding_levels=[],created_at=2017-10-20T08:38:38Z,data_plane_status=<?>,description='',device_id='a7e23552-9329-4572-a69d-d7f316fcc5c9',device_owner='network:router_ha_interface',dhcp_options=[],distributed_binding=None,dns=None,fixed_ips=[IPAllocation],id=7b6d81ef-0451-4216-9fe5-52d921052cb7,mac_address=fa:16:3e:13:e9:3c,name='HA port tenant 0ee0af8e94044a42923873939978ed42',network_id=ffe5ffa5-2693-4d35-988e-7290899601e0,project_id='',qos_policy_id=None,revision_number=5,security=PortSecurity(7b6d81ef-0451-4216-9fe5-52d921052cb7),security_group_ids=set([]),status='DOWN',updated_at=2017-10-20T08:38:44Z) is not bound.
> 2017-10-20 08:38:44.944 136145 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-0ad9aec4-f718-498f-9ca7-15b265340174 - - - - -] Device 7b6d81ef-0451-4216-9fe5-52d921052cb7 not defined on plugin or binding failed
>
> Any idea?
>
> --
>
> LECOMTE Cedric
>
> Senior Software Engineer
>
> Red Hat
>
> <https://www.redhat.com>
>
> clecomte at redhat.com
> <https://red.ht/sig>
> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
>
> _______________________________________________
> rdo-list mailing list
> rdo-list at redhat.com
> https://www.redhat.com/mailman/listinfo/rdo-list
>
> To unsubscribe: rdo-list-unsubscribe at redhat.com
>



-- 
Emilien Macchi

