[Rdo-list] Failing to deploy Mitaka on baremetal
Raoul Scarazzini
rasca at redhat.com
Tue Feb 9 06:58:17 UTC 2016
Hi Marius,
here it is:
# cat /etc/os-net-config/config.json | python -m json.tool
{
    "network_config": [
        {
            "addresses": [
                {
                    "ip_netmask": "192.0.2.21/24"
                }
            ],
            "dns_servers": [
                "10.1.241.2"
            ],
            "members": [
                {
                    "name": "nic2",
                    "primary": true,
                    "type": "interface"
                },
                {
                    "addresses": [
                        {
                            "ip_netmask": "172.17.0.12/24"
                        }
                    ],
                    "type": "vlan",
                    "vlan_id": 2201
                },
                {
                    "addresses": [
                        {
                            "ip_netmask": "172.18.0.11/24"
                        }
                    ],
                    "type": "vlan",
                    "vlan_id": 2203
                },
                {
                    "addresses": [
                        {
                            "ip_netmask": "172.16.0.10/24"
                        }
                    ],
                    "type": "vlan",
                    "vlan_id": 2202
                }
            ],
            "name": "br-ex",
            "routes": [
                {
                    "ip_netmask": "169.254.169.254/32",
                    "next_hop": "192.0.2.1"
                },
                {
                    "default": true,
                    "next_hop": "192.0.2.1"
                }
            ],
            "type": "ovs_bridge",
            "use_dhcp": false
        }
    ]
}
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: em3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether b8:ca:3a:66:f1:b4 brd ff:ff:ff:ff:ff:ff
3: em4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether b8:ca:3a:66:f1:b5 brd ff:ff:ff:ff:ff:ff
4: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether b8:ca:3a:66:f1:b0 brd ff:ff:ff:ff:ff:ff
    inet 10.1.241.9/24 brd 10.1.241.255 scope global dynamic em1
       valid_lft 530sec preferred_lft 530sec
    inet6 fe80::baca:3aff:fe66:f1b0/64 scope link
       valid_lft forever preferred_lft forever
5: em2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP qlen 1000
    link/ether b8:ca:3a:66:f1:b2 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::baca:3aff:fe66:f1b2/64 scope link
       valid_lft forever preferred_lft forever
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 06:65:08:31:11:35 brd ff:ff:ff:ff:ff:ff
7: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether b8:ca:3a:66:f1:b2 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.21/24 brd 192.0.2.255 scope global br-ex
       valid_lft forever preferred_lft forever
    inet6 fe80::baca:3aff:fe66:f1b2/64 scope link
       valid_lft forever preferred_lft forever
8: vlan2203: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 96:6a:10:5b:1a:47 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.11/24 brd 172.18.0.255 scope global vlan2203
       valid_lft forever preferred_lft forever
    inet6 fe80::946a:10ff:fe5b:1a47/64 scope link
       valid_lft forever preferred_lft forever
9: vlan2202: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether ca:e0:0d:b0:7e:30 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.10/24 brd 172.16.0.255 scope global vlan2202
       valid_lft forever preferred_lft forever
    inet6 fe80::c8e0:dff:feb0:7e30/64 scope link
       valid_lft forever preferred_lft forever
10: vlan2201: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether c2:24:5f:4c:37:6c brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.12/24 brd 172.17.0.255 scope global vlan2201
       valid_lft forever preferred_lft forever
    inet6 fe80::c024:5fff:fe4c:376c/64 scope link
       valid_lft forever preferred_lft forever
11: br-floating: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether de:7f:cd:d7:c1:46 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::dc7f:cdff:fed7:c146/64 scope link
       valid_lft forever preferred_lft forever
12: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 36:39:0f:39:85:4c brd ff:ff:ff:ff:ff:ff
13: br-tun: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 72:d1:82:d9:15:4f brd ff:ff:ff:ff:ff:ff
# ip r
default via 10.1.241.254 dev em1
10.1.241.0/24 dev em1 proto kernel scope link src 10.1.241.9
169.254.169.254 via 192.0.2.1 dev br-ex
172.16.0.0/24 dev vlan2202 proto kernel scope link src 172.16.0.10
172.17.0.0/24 dev vlan2201 proto kernel scope link src 172.17.0.12
172.18.0.0/24 dev vlan2203 proto kernel scope link src 172.18.0.11
192.0.2.0/24 dev br-ex proto kernel scope link src 192.0.2.21
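
For completeness, a quick way to double-check what os-net-config would apply
from that file, without touching the interfaces, is something like this (a
rough sketch, assuming the --noop flag is available in the os-net-config
version shipped on the image):

# os-net-config --noop --debug -c /etc/os-net-config/config.json
# ip r | grep ^default

Note that the config above requests the default route via 192.0.2.1 on br-ex,
while the routing table shows the default via 10.1.241.254 on em1 (coming from
the DHCP lease on em1).
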
Many thanks,
--
Raoul Scarazzini
rasca at redhat.com
On 8/2/2016 18:26:45, Marius Cornea wrote:
> Hi Raoul,
>
> Can you post the output of the following commands on that compute node please?
>
> cat /etc/os-net-config/config.json | python -m json.tool
> ip a
> ip r
>
> Thanks
>
> On Mon, Feb 8, 2016 at 6:15 PM, Raoul Scarazzini <rasca at redhat.com> wrote:
>> Just another update. I fixed the connectivity issue (nic1 and nic2 were
>> inverted in the yaml files), but the setup fails anyway.
>> The problem now, looking into the compute's
>> /var/lib/heat-config/deployed directory, is this one:
>>
>> {
>> "deploy_stdout": "Trying to ping 172.16.0.14 for local network
>> 172.16.0.0/24...SUCCESS\nTrying to ping 172.17.0.16 for local network
>> 172.17.0.0/24...SUCCESS\nTrying to ping 172.18.0.14 for local network
>> 172.18.0.0/24...SUCCESS\nTrying to ping default gateway
>> 10.1.241.254...FAILURE\n10.1.241.254 is not pingable.\n",
>> "deploy_stderr": "",
>> "deploy_status_code": 1
>> }
>>
>> The funny thing is that I'm able to ping 10.1.241.254 from the compute
>> nodes, so I'm wondering why it fails during the deployment.
>>
>> Could it be that something is not ready on the network side at that point?
>> If so, what? Can you give me some hints on how to debug this?
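>>
>> One thing I can try, to debug it, is to re-run by hand the same script that
>> os-collect-config executed (just a sketch, assuming the generated script can
>> simply be re-executed on the node and still finds the inputs Heat gave it):
>>
>> # on the failing compute node, using the script id from the journal error
>> SCRIPT=/var/lib/heat-config/heat-config-script/a435044e-9be8-42ea-8b03-92bee12b3d23
>> bash -x "$SCRIPT"; echo "exit code: $?"
>> # and, separately, the specific check that is reported as failing:
>> ping -c 3 10.1.241.254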
>>
>> --
>> Raoul Scarazzini
>> rasca at redhat.com
>>
>> On 8/2/2016 12:01:29, Raoul Scarazzini wrote:
>>> Hi David,
>>> you are absolutely right, I made too many assumptions. First of all, I'm
>>> installing everything with rdo-manager, using an identical set of
>>> configurations that came from my previous (working) osp-director 8 setup.
>>>
>>> Things seem to fail during the compute node verifications. Specifically here:
>>>
>>> Feb 05 16:37:23 overcloud-novacompute-0 os-collect-config[6014]:
>>> [2016-02-05 16:37:23,707] (heat-config) [ERROR] Error running
>>> /var/lib/heat-config/heat-config-script/a435044e-9be8-42ea-8b03-92bee12b3d23.
>>> [1]
>>>
>>> Looking at that script I identified two different actions (roughly
>>> sketched after the quoted comments below):
>>>
>>> 1) # For each unique remote IP (specified via Heat) we check to
>>> # see if one of the locally configured networks matches and if so we
>>> # attempt a ping test the remote network IP.
>>>
>>> 2) # Ping all default gateways. There should only be one
>>> # if using upstream t-h-t network templates but we test
>>> # all of them should some manual network config have
>>> # multiple gateways.
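>>>
>>> In shell terms the two checks boil down to roughly this (my own quick
>>> sketch of the idea, not the exact code of the generated script):
>>>
>>> # 1) ping a remote IP only if it sits on one of the locally configured networks
>>> remote_ip=172.17.0.16
>>> if ! ip route get "$remote_ip" | head -n1 | grep -q " via "; then
>>>     ping -c 3 -w 10 "$remote_ip" || echo "FAILURE: $remote_ip is not pingable"
>>> fi
>>> # 2) ping every default gateway found in the routing table
>>> for gw in $(ip route | awk '/^default/ {print $3}'); do
>>>     ping -c 3 -w 10 "$gw" || echo "FAILURE: $gw is not pingable"
>>> done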
>>>
>>> And in fact, after checking, I'm not able to reach the compute nodes from
>>> the controllers or from other computes on the addresses inside the
>>> InternalApiAllocationPools (172.17.0.x) or TenantAllocationPools (172.16.0.x).
>>>
>>> I'm using a specific network setup, as I said, the same one I was using
>>> with osp-director 8. So I've got a specific network-management.yaml file
>>> in which I've specified these settings:
>>>
>>> resource_registry:
>>>   OS::TripleO::BlockStorage::Net::SoftwareConfig: /home/stack/nic-configs/cinder-storage.yaml
>>>   OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/nic-configs/compute.yaml
>>>   OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/nic-configs/controller.yaml
>>>   OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/nic-configs/swift-storage.yaml
>>>   OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/nic-configs/ceph-storage.yaml
>>>
>>> parameter_defaults:
>>>   # Customize the IP subnets to match the local environment
>>>   InternalApiNetCidr: 172.17.0.0/24
>>>   StorageNetCidr: 172.18.0.0/24
>>>   StorageMgmtNetCidr: 172.19.0.0/24
>>>   TenantNetCidr: 172.16.0.0/24
>>>   ExternalNetCidr: 172.20.0.0/24
>>>   ControlPlaneSubnetCidr: '24'
>>>   InternalApiAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
>>>   StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
>>>   StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]
>>>   TenantAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
>>>   ExternalAllocationPools: [{'start': '172.20.0.10', 'end': '172.20.0.200'}]
>>>   # Specify the gateway on the external network.
>>>   ExternalInterfaceDefaultRoute: 172.20.0.254
>>>   # Gateway router for the provisioning network (or Undercloud IP)
>>>   ControlPlaneDefaultRoute: 192.0.2.1
>>>   # Generally the IP of the Undercloud
>>>   EC2MetadataIp: 192.0.2.1
>>>   DnsServers: ["10.1.241.2"]
>>>   InternalApiNetworkVlanID: 2201
>>>   StorageNetworkVlanID: 2203
>>>   StorageMgmtNetworkVlanID: 2204
>>>   TenantNetworkVlanID: 2202
>>>   ExternalNetworkVlanID: 2205
>>>   # Floating IP networks do not have to use br-ex, they can use any bridge
>>>   # as long as the NeutronExternalNetworkBridge is set to "''".
>>>   NeutronExternalNetworkBridge: "''"
>>>
>>> # Variables in "parameters" apply an actual value to one of the top-level params
>>> parameters:
>>>   # The OVS logical->physical bridge mappings to use. Defaults to mapping
>>>   # br-ex - the external bridge on hosts - to a physical name 'datacentre'
>>>   # which can be used to create provider networks (and we use this for the
>>>   # default floating network) - if changing this either use different
>>>   # post-install network scripts or be sure to keep 'datacentre' as a
>>>   # mapping network name.
>>>   # Unfortunately this option is overridden by the command line, due to a
>>>   # limitation (that will be fixed), so even declaring this won't have any effect.
>>>   # See overcloud-deploy.sh for all the explanations.
>>>   # https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-without-mergepy.yaml#L112
>>>   NeutronBridgeMappings: "datacentre:br-floating"
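>>>
>>> For context, this environment file then gets included in the deploy
>>> command together with the network isolation environment, roughly like
>>> this (just a sketch, the real options live in my overcloud-deploy.sh;
>>> the last line is the command-line override mentioned in the comment
>>> above, assuming the client accepts --neutron-bridge-mappings):
>>>
>>> openstack overcloud deploy --templates \
>>>   -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
>>>   -e /home/stack/network-management.yaml \
>>>   --neutron-bridge-mappings datacentre:br-floating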
>>>
>>> Obviously the controller.yaml was modified to reflect my needs, as
>>> described here [1], and as you can see I have not declared a
>>> ManagementNetworkVlan, since it was not needed before.
>>> So the first question is: what is this new network and how does it differ
>>> from the other networks already available? Can it affect communications
>>> that were working before?
>>>
>>> Many thanks
>>>
>>> [1]
>>> https://github.com/rscarazz/openstack/blob/master/ospd-network-isolation-considerations.md
>>>
>>> --
>>> Raoul Scarazzini
>>> rasca at redhat.com
>>>
>>> On 6/2/2016 00:38:38, David Moreau Simard wrote:
>>>> Hi Raoul,
>>>>
>>>> A good start would be to give us some more details about how you did the
>>>> installation.
>>>>
>>>> What installation tool/procedure? What repositories?
>>>>
>>>> David Moreau Simard
>>>> Senior Software Engineer | Openstack RDO
>>>>
>>>> dmsimard = [irc, github, twitter]
>>>>
>>>> On Feb 5, 2016 5:16 AM, "Raoul Scarazzini" <rasca at redhat.com> wrote:
>>>>
>>>> Hi,
>>>> I'm trying to deploy Mitaka on a baremetal environment composed of
>>>> 3 controllers and 4 computes.
>>>> After introspection the nodes seemed fine, even though for one of them I
>>>> had to run introspection by hand, since it was not completing the
>>>> process. But in the end all my 7 nodes were in state "available".
>>>>
>>>> Launching the overcloud deploy, the controller part goes fine, but
>>>> then it gives me this error about the compute nodes:
>>>>
>>>> 2016-02-05 09:26:59 [NovaCompute]: CREATE_FAILED ResourceInError:
>>>> resources.NovaCompute: Went to status ERROR due to "Message: Exceeded
>>>> maximum number of retries. Exceeded max scheduling attempts 3 for
>>>> instance 0227f7c1-3c2b-4e10-93bf-e7d84a7aca71. Last exception: Port
>>>> b8:ca:3a:66:ef:5a is still in use.
>>>>
>>>> The funny thing is that I can't find the offending ID anywhere: it's
>>>> not an Ironic node ID, nor a Nova one.
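>>>>
>>>> (Since b8:ca:3a:66:ef:5a looks like a MAC address rather than an
>>>> instance or node UUID, a rough way to hunt for it from the undercloud
>>>> could be something like this, just a sketch with the classic
>>>> ironic/neutron clients, grepping their port lists for that address:)
>>>>
>>>> ironic port-list | grep -i b8:ca:3a:66:ef:5a
>>>> neutron port-list | grep -i b8:ca:3a:66:ef:5a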
>>>>
>>>> Can you help point me in the right direction?
>>>>
>>>> Many thanks,
>>>>
>>>> --
>>>> Raoul Scarazzini
>>>> rasca at redhat.com