[rdo-list] Problem with ha-router
Emilien Macchi
emilien at redhat.com
Mon Oct 23 16:57:03 UTC 2017
Hey Cédric,
You might get some help on openstack-dev [tripleo] or by filing a bug in
launchpad/tripleo, but from my experience with rdo-list, there is no tripleo
support here.
HTH,
On Mon, Oct 23, 2017 at 1:23 AM, Cedric Lecomte <clecomte at redhat.com> wrote:
> Hello all,
>
> I tried to deploy RDO Pike without containers on our internal platform.
>
> The setup is pretty simple:
> - 3 Controllers in HA
> - 5 Ceph
> - 4 Compute
> - 3 Object-Store
>
> I didn't use any exotic parameters.
> This is my deployment command:
>
> openstack overcloud deploy --templates \
>   -e environement.yaml \
>   --ntp-server 0.pool.ntp.org \
>   -e storage-env.yaml \
>   -e network-env.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph.yaml \
>   --control-scale 3 --control-flavor control \
>   --compute-scale 4 --compute-flavor compute \
>   --ceph-storage-scale 5 --ceph-storage-flavor ceph-storage \
>   --swift-storage-flavor swift-storage --swift-storage-scale 3 \
>   -e scheduler_hints_env.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml
>
> *environement.yaml:*
> parameter_defaults:
>   ControllerCount: 3
>   ComputeCount: 4
>   CephStorageCount: 5
>   OvercloudCephStorageFlavor: ceph-storage
>   CephDefaultPoolSize: 3
>   ObjectStorageCount: 3
>
> *network-env.yaml:*
> resource_registry:
>   OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
>   OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
>   OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/templates/nic-configs/ceph-storage.yaml
>   OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/templates/nic-configs/swift-storage.yaml
>
> parameter_defaults:
>   InternalApiNetCidr: 172.16.0.0/24
>   TenantNetCidr: 172.17.0.0/24
>   StorageNetCidr: 172.18.0.0/24
>   StorageMgmtNetCidr: 172.19.0.0/24
>   ManagementNetCidr: 172.20.0.0/24
>   ExternalNetCidr: 10.41.11.0/24
>   InternalApiAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
>   TenantAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
>   StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
>   StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]
>   ManagementAllocationPools: [{'start': '172.20.0.10', 'end': '172.20.0.200'}]
>   # Leave room for floating IPs in the External allocation pool
>   ExternalAllocationPools: [{'start': '10.41.11.10', 'end': '10.41.11.30'}]
>   # Set to the router gateway on the external network
>   ExternalInterfaceDefaultRoute: 10.41.11.254
>   # Gateway router for the provisioning network (or Undercloud IP)
>   ControlPlaneDefaultRoute: 192.168.131.253
>   # The IP address of the EC2 metadata server. Generally the IP of the Undercloud
>   EC2MetadataIp: 192.0.2.1
>   # Define the DNS servers (maximum 2) for the overcloud nodes
>   DnsServers: ["10.38.5.26"]
>   InternalApiNetworkVlanID: 202
>   StorageNetworkVlanID: 203
>   StorageMgmtNetworkVlanID: 204
>   TenantNetworkVlanID: 205
>   ManagementNetworkVlanID: 206
>   ExternalNetworkVlanID: 198
>   NeutronExternalNetworkBridge: "''"
>   ControlPlaneSubnetCidr: '24'
>   BondInterfaceOvsOptions: "mode=balance-xor"
>
> *storage-env.yaml:*
> parameter_defaults:
>   ExtraConfig:
>     ceph::profile::params::osds:
>       '/dev/sdb': {}
>       '/dev/sdc': {}
>       '/dev/sdd': {}
>       '/dev/sde': {}
>       '/dev/sdf': {}
>       '/dev/sdg': {}
>   SwiftRingBuild: false
>   RingBuild: false
>
>
> *scheduler_hints_env.yaml:*
> parameter_defaults:
>   ControllerSchedulerHints:
>     'capabilities:node': 'control-%index%'
>   NovaComputeSchedulerHints:
>     'capabilities:node': 'compute-%index%'
>   CephStorageSchedulerHints:
>     'capabilities:node': 'ceph-storage-%index%'
>   ObjectStorageSchedulerHints:
>     'capabilities:node': 'swift-storage-%index%'
>
> After a little use, I found that one controller is unable to get an active
> ha-router, and I got this output:
>
> neutron l3-agent-list-hosting-router XXX
> +--------------------------------------+------------------------------------+----------------+-------+----------+
> | id                                   | host                               | admin_state_up | alive | ha_state |
> +--------------------------------------+------------------------------------+----------------+-------+----------+
> | 420a7e31-bae1-4f8c-9438-97839cf190c4 | overcloud-controller-0.localdomain | True           | :-)   | standby  |
> | 6a943aa5-6fd1-4b44-8557-f0043b266a2f | overcloud-controller-1.localdomain | True           | :-)   | standby  |
> | dd66ef16-7533-434f-bf5b-25e38c51375f | overcloud-controller-2.localdomain | True           | :-)   | standby  |
> +--------------------------------------+------------------------------------+----------------+-------+----------+
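>
> When every agent reports "standby" like this, it usually means keepalived
> never elected a master for that router. One way to dig further on each
> controller (a rough sketch, not verified on this deployment; the router ID
> below is a placeholder, and the state-file path is the L3 agent's usual
> default, which may differ on your setup):

```shell
# Placeholder: replace with the UUID of the stuck router.
ROUTER_ID=<router_id>

# VRRP state as recorded by the L3 agent (typically "master" or "backup"):
cat /var/lib/neutron/ha_confs/$ROUTER_ID/state

# Check that a keepalived instance is actually running for this router:
ps aux | grep keepalived | grep "$ROUTER_ID"

# Inside the router namespace, the HA interface should be UP, and on the
# master it should carry the VIP addresses:
ip netns exec qrouter-$ROUTER_ID ip addr
```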
>
> So each time a router is scheduled on this controller, I can't get an active
> router. I tried to compare the configuration, but everything seems to be
> good. I redeployed to see if it would help, and the only thing that changed
> is the controller where the ha-routers are stuck.
>
> The only message that I got is from OVS:
>
> 2017-10-20 08:38:44.930 136145 WARNING neutron.agent.rpc [req-0ad9aec4-f718-498f-9ca7-15b265340174 - - - - -] Device Port(admin_state_up=True,allowed_address_pairs=[],binding=PortBinding,binding_levels=[],created_at=2017-10-20T08:38:38Z,data_plane_status=<?>,description='',device_id='a7e23552-9329-4572-a69d-d7f316fcc5c9',device_owner='network:router_ha_interface',dhcp_options=[],distributed_binding=None,dns=None,fixed_ips=[IPAllocation],id=7b6d81ef-0451-4216-9fe5-52d921052cb7,mac_address=fa:16:3e:13:e9:3c,name='HA port tenant 0ee0af8e94044a42923873939978ed42',network_id=ffe5ffa5-2693-4d35-988e-7290899601e0,project_id='',qos_policy_id=None,revision_number=5,security=PortSecurity(7b6d81ef-0451-4216-9fe5-52d921052cb7),security_group_ids=set([]),status='DOWN',updated_at=2017-10-20T08:38:44Z) is not bound.
> 2017-10-20 08:38:44.944 136145 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-0ad9aec4-f718-498f-9ca7-15b265340174 - - - - -] Device 7b6d81ef-0451-4216-9fe5-52d921052cb7 not defined on plugin or binding failed
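>
> The "is not bound" warning suggests the HA port never got a successful
> binding on that controller. One way to confirm (a sketch; run with admin
> credentials sourced, and note the port UUID is taken from the warning above):

```shell
# Inspect the binding of the HA port from the log. A vif_type of
# "binding_failed", or an empty binding host, would point at an ML2/OVS
# binding problem on that controller (e.g. an agent or bridge_mappings
# mismatch).
openstack port show 7b6d81ef-0451-4216-9fe5-52d921052cb7 \
  -c binding_host_id -c binding_vif_type -c status
```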
>
> Any idea?
>
> --
>
> LECOMTE Cedric
>
> Senior Software Engineer
>
> Red Hat
>
> <https://www.redhat.com>
>
> clecomte at redhat.com
> <https://red.ht/sig>
> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
>
> _______________________________________________
> rdo-list mailing list
> rdo-list at redhat.com
> https://www.redhat.com/mailman/listinfo/rdo-list
>
> To unsubscribe: rdo-list-unsubscribe at redhat.com
>
--
Emilien Macchi