[Rdo-list] Failing to deploy Mitaka on baremetal

Marius Cornea marius at remote-lab.net
Mon Feb 8 17:26:45 UTC 2016


Hi Raoul,

Can you post the output of the following commands on that compute node please?

cat /etc/os-net-config/config.json | python -m json.tool
ip a
ip r
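
It would also help to see which route and interface the node actually
picks to reach that gateway; something like this (just a guess at where
to look) should show it:

ip route get 10.1.241.254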

Thanks

On Mon, Feb 8, 2016 at 6:15 PM, Raoul Scarazzini <rasca at redhat.com> wrote:
> Just another update. I fixed the connectivity issue (nic1 and nic2 were
> inverted in the yaml files) but the setup fails anyway.
> The problem now, looking into the compute's
> /var/lib/heat-config/deployed directory, is this one:
>
> {
>   "deploy_stdout": "Trying to ping 172.16.0.14 for local network
> 172.16.0.0/24...SUCCESS\nTrying to ping 172.17.0.16 for local network
> 172.17.0.0/24...SUCCESS\nTrying to ping 172.18.0.14 for local network
> 172.18.0.0/24...SUCCESS\nTrying to ping default gateway
> 10.1.241.254...FAILURE\n10.1.241.254 is not pingable.\n",
>   "deploy_stderr": "",
>   "deploy_status_code": 1
> }
>
> Funny thing is that I'm able to ping 10.1.241.254 from the compute
> nodes, so I'm wondering why it fails during the deployment.
>
> Could it be that something is not ready yet on the network side at that
> point? If so, what? Can you give me some hints on how to debug this?
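>
> I suppose I could also try re-running the failing script by hand on the
> compute node, something like this (though I guess I'd also have to
> reproduce the environment variables that Heat passes to it):
>
> bash -x /var/lib/heat-config/heat-config-script/a435044e-9be8-42ea-8b03-92bee12b3d23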
>
> --
> Raoul Scarazzini
> rasca at redhat.com
>
> On 8/2/2016 12:01:29, Raoul Scarazzini wrote:
>> Hi David,
>> you are absolutely right, I made too many assumptions. First of all, I'm
>> installing everything with rdo-manager, using an identical set of
>> configurations that came from my previous (working) osp-director 8 setup.
>>
>> Things seem to fail during the compute node verifications. Specifically here:
>>
>> Feb 05 16:37:23 overcloud-novacompute-0 os-collect-config[6014]:
>> [2016-02-05 16:37:23,707] (heat-config) [ERROR] Error running
>> /var/lib/heat-config/heat-config-script/a435044e-9be8-42ea-8b03-92bee12b3d23.
>> [1]
>>
>> Looking at that script I identified two different actions:
>>
>> 1) # For each unique remote IP (specified via Heat) we check to
>> # see if one of the locally configured networks matches and if so we
>> # attempt a ping test the remote network IP.
>>
>> 2) # Ping all default gateways. There should only be one
>> # if using upstream t-h-t network templates but we test
>> # all of them should some manual network config have
>> # multiple gateways.
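>>
>> If I read the script right, the gateway part (2) boils down to something
>> like this (my own paraphrase, not the exact t-h-t code):
>>
>> # ping each default gateway found in the routing table once
>> for GW in $(ip route | awk '/^default/ {print $3}'); do
>>   if ping -c 1 -w 10 $GW &> /dev/null; then
>>     echo "$GW is pingable."
>>   else
>>     echo "$GW is not pingable."
>>     exit 1
>>   fi
>> done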
>>
>> And in fact, after verifying, I'm not able to reach the compute nodes
>> from the controllers or other computes on the InternalApiAllocationPools
>> (172.17.0.x) or TenantAllocationPools (172.16.0.x) networks.
>>
>> I'm using a specific network setup, as I said, the same one I was using
>> with osp-director 8. So I've got a specific network-management.yaml file
>> in which I've specified these settings:
>>
>> resource_registry:
>>   OS::TripleO::BlockStorage::Net::SoftwareConfig: /home/stack/nic-configs/cinder-storage.yaml
>>   OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/nic-configs/compute.yaml
>>   OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/nic-configs/controller.yaml
>>   OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/nic-configs/swift-storage.yaml
>>   OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/nic-configs/ceph-storage.yaml
>>
>> parameter_defaults:
>>   # Customize the IP subnets to match the local environment
>>   InternalApiNetCidr: 172.17.0.0/24
>>   StorageNetCidr: 172.18.0.0/24
>>   StorageMgmtNetCidr: 172.19.0.0/24
>>   TenantNetCidr: 172.16.0.0/24
>>   ExternalNetCidr: 172.20.0.0/24
>>   ControlPlaneSubnetCidr: '24'
>>   InternalApiAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
>>   StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
>>   StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]
>>   TenantAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
>>   ExternalAllocationPools: [{'start': '172.20.0.10', 'end': '172.20.0.200'}]
>>   # Specify the gateway on the external network.
>>   ExternalInterfaceDefaultRoute: 172.20.0.254
>>   # Gateway router for the provisioning network (or Undercloud IP)
>>   ControlPlaneDefaultRoute: 192.0.2.1
>>   # Generally the IP of the Undercloud
>>   EC2MetadataIp: 192.0.2.1
>>   DnsServers: ["10.1.241.2"]
>>   InternalApiNetworkVlanID: 2201
>>   StorageNetworkVlanID: 2203
>>   StorageMgmtNetworkVlanID: 2204
>>   TenantNetworkVlanID: 2202
>>   ExternalNetworkVlanID: 2205
>>   # Floating IP networks do not have to use br-ex; they can use any
>>   # bridge as long as the NeutronExternalNetworkBridge is set to "''".
>>   NeutronExternalNetworkBridge: "''"
>>
>> # Variables in "parameters" apply an actual value to one of the top-level params
>> parameters:
>>   # The OVS logical->physical bridge mappings to use. Defaults to mapping
>>   # br-ex - the external bridge on hosts - to a physical name 'datacentre'
>>   # which can be used to create provider networks (and we use this for the
>>   # default floating network) - if changing this either use different
>>   # post-install network scripts or be sure to keep 'datacentre' as a
>>   # mapping network name.
>>   # Unfortunately this option is overridden by the command line, due to
>>   # a limitation (that will be fixed), so even declaring this won't have effect.
>>   # See overcloud-deploy.sh for all the explanations.
>>   # https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-without-mergepy.yaml#L112
>>   NeutronBridgeMappings: "datacentre:br-floating"
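>>
>> For reference, I'm passing this file at deploy time more or less like
>> this (simplified from my overcloud-deploy.sh, so the exact paths and
>> flags may differ slightly):
>>
>> openstack overcloud deploy --templates \
>>   -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
>>   -e /home/stack/network-management.yaml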
>>
>> Obviously the controller.yaml was modified to reflect my needs, as
>> described here [1], and as you can see I have not declared a
>> ManagementNetworkVlan, since it was not needed before.
>> So the first question is: what is this new network and how does it
>> differ from the other networks already available? Can it affect
>> communications that were working before?
>>
>> Many thanks
>>
>> [1]
>> https://github.com/rscarazz/openstack/blob/master/ospd-network-isolation-considerations.md
>>
>> --
>> Raoul Scarazzini
>> rasca at redhat.com
>>
>> On 6/2/2016 00:38:38, David Moreau Simard wrote:
>>> Hi Raoul,
>>>
>>> A good start would be to give us some more details about how you did the
>>> installation.
>>>
>>> What installation tool/procedure? What repositories?
>>>
>>> David Moreau Simard
>>> Senior Software Engineer | Openstack RDO
>>>
>>> dmsimard = [irc, github, twitter]
>>>
>>> On Feb 5, 2016 5:16 AM, "Raoul Scarazzini" <rasca at redhat.com> wrote:
>>>
>>>     Hi,
>>>     I'm trying to deploy Mitaka on a baremetal environment composed
>>>     of 3 controllers and 4 computes.
>>>     After introspection the nodes seemed fine, though for one of them I had
>>>     to run introspection by hand, since it was not completing the process.
>>>     But in the end all my 7 nodes were in state "available".
>>>
>>>     Launching the overcloud deploy, the controller part goes fine, but
>>>     then it gives me this error about a compute:
>>>
>>>     2016-02-05 09:26:59 [NovaCompute]: CREATE_FAILED  ResourceInError:
>>>     resources.NovaCompute: Went to status ERROR due to "Message: Exceeded
>>>     maximum number of retries. Exceeded max scheduling attempts 3 for
>>>     instance 0227f7c1-3c2b-4e10-93bf-e7d84a7aca71. Last exception: Port
>>>     b8:ca:3a:66:ef:5a is still in use.
>>>
>>>     The funny thing is that I can't find the incriminated ID anywhere:
>>>     it's not an Ironic node ID, nor a Nova one.
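>>>
>>>     That string looks more like a MAC address than a UUID, so maybe I
>>>     should be grepping the port lists for it instead? Something like:
>>>
>>>     ironic port-list | grep -i b8:ca:3a:66:ef:5a
>>>     neutron port-list | grep -i b8:ca:3a:66:ef:5a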
>>>
>>>     Can you point me in the right direction?
>>>
>>>     Many thanks,
>>>
>>>     --
>>>     Raoul Scarazzini
>>>     rasca at redhat.com
>>>
>>
>



