[Rdo-list] Failing to deploy Mitaka on baremetal
Marius Cornea
marius at remote-lab.net
Mon Feb 8 17:26:45 UTC 2016
Hi Raoul,
Can you post the output of the following commands on that compute node please?
cat /etc/os-net-config/config.json | python -m json.tool
ip a
ip r
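
In the meantime, here is a rough sketch (not part of any TripleO tooling) for pulling the failing ping targets out of a deploy result like the one you pasted. The sample string is the deploy_stdout from your message with the mail line wraps joined; the function names are made up for illustration, and in practice you would read the JSON file from /var/lib/heat-config/deployed instead of using the embedded sample.

```python
import json
import re

# Sample rebuilt from the deploy result quoted in this thread; in practice
# load the JSON file found under /var/lib/heat-config/deployed instead.
SAMPLE = json.dumps({
    "deploy_stdout": (
        "Trying to ping 172.16.0.14 for local network 172.16.0.0/24...SUCCESS\n"
        "Trying to ping 172.17.0.16 for local network 172.17.0.0/24...SUCCESS\n"
        "Trying to ping 172.18.0.14 for local network 172.18.0.0/24...SUCCESS\n"
        "Trying to ping default gateway 10.1.241.254...FAILURE\n"
        "10.1.241.254 is not pingable.\n"
    ),
    "deploy_stderr": "",
    "deploy_status_code": 1,
})

# Matches lines such as "Trying to ping <target>...SUCCESS".
PING_LINE = re.compile(
    r"^Trying to ping (?P<what>.+?)\.\.\.(?P<result>SUCCESS|FAILURE)$"
)


def summarize_pings(deploy_json):
    """Return (status_code, [(description, result), ...]) for one result."""
    data = json.loads(deploy_json)
    results = []
    for line in data.get("deploy_stdout", "").splitlines():
        match = PING_LINE.match(line.strip())
        if match:
            results.append((match.group("what"), match.group("result")))
    return data.get("deploy_status_code"), results


def failed_targets(deploy_json):
    """Return only the ping descriptions that ended in FAILURE."""
    _, results = summarize_pings(deploy_json)
    return [what for what, result in results if result == "FAILURE"]


if __name__ == "__main__":
    code, results = summarize_pings(SAMPLE)
    print("deploy_status_code:", code)
    for what, result in results:
        print(result, what)
```

This only reformats what the script already logged, so it helps to spot which target failed across several nodes but does not explain why.
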
Thanks
On Mon, Feb 8, 2016 at 6:15 PM, Raoul Scarazzini <rasca at redhat.com> wrote:
> Just another update. I fixed the connectivity issue (nic1 and nic2 were
> inverted in the yaml files) but the setup fails anyway.
> The problem now (looking into the compute's
> /var/lib/heat-config/deployed directory) is this one:
>
> {
> "deploy_stdout": "Trying to ping 172.16.0.14 for local network
> 172.16.0.0/24...SUCCESS\nTrying to ping 172.17.0.16 for local network
> 172.17.0.0/24...SUCCESS\nTrying to ping 172.18.0.14 for local network
> 172.18.0.0/24...SUCCESS\nTrying to ping default gateway
> 10.1.241.254...FAILURE\n10.1.241.254 is not pingable.\n",
> "deploy_stderr": "",
> "deploy_status_code": 1
> }
>
> The funny thing is that I'm able to ping 10.1.241.254 from the compute
> nodes, so I'm wondering why it fails during the deployment.
>
> Could it be that something is not ready on the network side? If so, what?
> Can you give me some hints on how to debug this?
>
> --
> Raoul Scarazzini
> rasca at redhat.com
>
> On 8/2/2016 12:01:29, Raoul Scarazzini wrote:
>> Hi David,
>> you are absolutely right, I made too many assumptions. First of all, I'm
>> installing everything with rdo-manager, using an identical set of
>> configurations that came from my previous (working) osp-director 8 setup.
>>
>> Things seem to fail during the compute node verification. Specifically here:
>>
>> Feb 05 16:37:23 overcloud-novacompute-0 os-collect-config[6014]:
>> [2016-02-05 16:37:23,707] (heat-config) [ERROR] Error running
>> /var/lib/heat-config/heat-config-script/a435044e-9be8-42ea-8b03-92bee12b3d23.
>> [1]
>>
>> Looking at that script, I identified two different actions:
>>
>> 1) # For each unique remote IP (specified via Heat) we check to
>> # see if one of the locally configured networks matches and if so we
>> # attempt a ping test the remote network IP.
>>
>> 2) # Ping all default gateways. There should only be one
>> # if using upstream t-h-t network templates but we test
>> # all of them should some manual network config have
>> # multiple gateways.
>>
>> And in fact, after checking, I'm not able to reach the compute nodes
>> from the controllers or from other computes on addresses inside either
>> InternalApiAllocationPools (172.17.0) or TenantAllocationPools (172.16.0).
>>
>> I'm using a specific network setup, as I said the same one I was using
>> with osp-director 8. So I've got a specific network-management.yaml file
>> in which I've specified these settings:
>>
>> resource_registry:
>>   OS::TripleO::BlockStorage::Net::SoftwareConfig: /home/stack/nic-configs/cinder-storage.yaml
>>   OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/nic-configs/compute.yaml
>>   OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/nic-configs/controller.yaml
>>   OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/nic-configs/swift-storage.yaml
>>   OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/nic-configs/ceph-storage.yaml
>>
>> parameter_defaults:
>>   # Customize the IP subnets to match the local environment
>>   InternalApiNetCidr: 172.17.0.0/24
>>   StorageNetCidr: 172.18.0.0/24
>>   StorageMgmtNetCidr: 172.19.0.0/24
>>   TenantNetCidr: 172.16.0.0/24
>>   ExternalNetCidr: 172.20.0.0/24
>>   ControlPlaneSubnetCidr: '24'
>>   InternalApiAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
>>   StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
>>   StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]
>>   TenantAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
>>   ExternalAllocationPools: [{'start': '172.20.0.10', 'end': '172.20.0.200'}]
>>   # Specify the gateway on the external network.
>>   ExternalInterfaceDefaultRoute: 172.20.0.254
>>   # Gateway router for the provisioning network (or Undercloud IP)
>>   ControlPlaneDefaultRoute: 192.0.2.1
>>   # Generally the IP of the Undercloud
>>   EC2MetadataIp: 192.0.2.1
>>   DnsServers: ["10.1.241.2"]
>>   InternalApiNetworkVlanID: 2201
>>   StorageNetworkVlanID: 2203
>>   StorageMgmtNetworkVlanID: 2204
>>   TenantNetworkVlanID: 2202
>>   ExternalNetworkVlanID: 2205
>>   # Floating IP networks do not have to use br-ex, they can use any
>>   # bridge as long as the NeutronExternalNetworkBridge is set to "''".
>>   NeutronExternalNetworkBridge: "''"
>>
>> # Variables in "parameters" apply an actual value to one of the
>> # top-level params
>> parameters:
>>   # The OVS logical->physical bridge mappings to use. Defaults to
>>   # mapping br-ex - the external bridge on hosts - to a physical name
>>   # 'datacentre' which can be used to create provider networks (and we
>>   # use this for the default floating network) - if changing this either
>>   # use different post-install network scripts or be sure to keep
>>   # 'datacentre' as a mapping network name.
>>   # Unfortunately this option is overridden by the command line, due to
>>   # a limitation (that will be fixed), so even declaring this won't have
>>   # any effect.
>>   # See overcloud-deploy.sh for all the explanations.
>>   # https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-without-mergepy.yaml#L112
>>   NeutronBridgeMappings: "datacentre:br-floating"
>>
>> Obviously the controller.yaml was modified to reflect my needs, as
>> described here [1], and as you can see I have not declared a
>> ManagementNetworkVlan, since this was not needed before.
>> So the first question is: what is this new network and how does it
>> differ from the other networks currently available? Can this
>> affect communications that were working before?
>>
>> Many thanks
>>
>> [1]
>> https://github.com/rscarazz/openstack/blob/master/ospd-network-isolation-considerations.md
>>
>> --
>> Raoul Scarazzini
>> rasca at redhat.com
>>
>> On 6/2/2016 00:38:38, David Moreau Simard wrote:
>>> Hi Raoul,
>>>
>>> A good start would be to give us some more details about how you did the
>>> installation.
>>>
>>> What installation tool/procedure? What repositories?
>>>
>>> David Moreau Simard
>>> Senior Software Engineer | Openstack RDO
>>>
>>> dmsimard = [irc, github, twitter]
>>>
>>> On Feb 5, 2016 5:16 AM, "Raoul Scarazzini" <rasca at redhat.com> wrote:
>>>
>>> Hi,
>>> I'm trying to deploy Mitaka on a baremetal environment composed
>>> of 3 controllers and 4 computes.
>>> After introspection the nodes seem fine, even though for one of them I
>>> had to do the introspection by hand, since it was not completing the
>>> process. But in the end all my 7 nodes were in state "available".
>>>
>>> When launching the overcloud deploy, the controller part goes fine, but
>>> then it gives me this error about compute:
>>>
>>> 2016-02-05 09:26:59 [NovaCompute]: CREATE_FAILED ResourceInError:
>>> resources.NovaCompute: Went to status ERROR due to "Message: Exceeded
>>> maximum number of retries. Exceeded max scheduling attempts 3 for
>>> instance 0227f7c1-3c2b-4e10-93bf-e7d84a7aca71. Last exception: Port
>>> b8:ca:3a:66:ef:5a is still in use.
>>>
>>> The funny thing is that I can't find the offending ID anywhere: it is
>>> neither an Ironic node ID nor a Nova one.
>>>
>>> Can you help point me in the right direction?
>>>
>>> Many thanks,
>>>
>>> --
>>> Raoul Scarazzini
>>> rasca at redhat.com
>>>
>>> _______________________________________________
>>> Rdo-list mailing list
>>> Rdo-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/rdo-list
>>>
>>> To unsubscribe: rdo-list-unsubscribe at redhat.com
>>>
>>
>
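
For reference, the two checks described in the script comments quoted above (ping each remote IP that falls in a locally configured network, then ping every default gateway) can be sketched roughly as follows. This is only an illustration of the logic, not the actual heat-config script: the function names are hypothetical, LOCAL_NETWORKS is copied from the *NetCidr parameters in the quoted yaml and stands in for whatever "ip a" really reports on the node, and no ping is actually sent.

```python
import ipaddress

# Copied from the *NetCidr parameters quoted in this thread. Whether the
# compute node really has an address on each of them is an assumption.
LOCAL_NETWORKS = {
    "Tenant": "172.16.0.0/24",
    "InternalApi": "172.17.0.0/24",
    "Storage": "172.18.0.0/24",
    "StorageMgmt": "172.19.0.0/24",
    "External": "172.20.0.0/24",
}


def matching_network(ip, networks=LOCAL_NETWORKS):
    """Return the name of the first network containing ip, or None."""
    address = ipaddress.ip_address(ip)
    for name, cidr in networks.items():
        if address in ipaddress.ip_network(cidr):
            return name
    return None


def plan_ping_tests(remote_ips, gateways, networks=LOCAL_NETWORKS):
    """List the ping tests the two steps would attempt, without pinging."""
    plan = []
    # Step 1: each unique remote IP is tested only if a local network matches.
    for ip in sorted(set(remote_ips), key=ipaddress.ip_address):
        name = matching_network(ip, networks)
        if name is not None:
            plan.append(("remote", ip, name))
    # Step 2: every default gateway is tested, whatever network it is on.
    for gateway in gateways:
        plan.append(("gateway", gateway, matching_network(gateway, networks)))
    return plan


if __name__ == "__main__":
    remote_ips = ["172.16.0.14", "172.17.0.16", "172.18.0.14"]
    for entry in plan_ping_tests(remote_ips, ["10.1.241.254"]):
        print(entry)
```

A gateway that comes back with no matching network only means it is not on one of the networks listed in LOCAL_NETWORKS; it says nothing by itself about whether the gateway is reachable through another interface or route.
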