Just another update: I fixed the connectivity issue (nic1 and nic2 were
inverted in the yaml files), but the setup fails anyway.
The problem now (looking into the compute's
/var/lib/heat-config/deployed directory) is this one:
{
  "deploy_stdout": "Trying to ping 172.16.0.14 for local network 172.16.0.0/24...SUCCESS\nTrying to ping 172.17.0.16 for local network 172.17.0.0/24...SUCCESS\nTrying to ping 172.18.0.14 for local network 172.18.0.0/24...SUCCESS\nTrying to ping default gateway 10.1.241.254...FAILURE\n10.1.241.254 is not pingable.\n",
  "deploy_stderr": "",
  "deploy_status_code": 1
}
The funny thing is that I'm able to ping 10.1.241.254 from the compute
nodes, so I'm wondering why it fails during the deployment.
Could it be that something is not ready on the network side? If so, what?
Can you give me some hints on how to debug this?
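
One way to test the "not ready yet" hypothesis would be to repeat the gateway ping right after the network configuration is applied and count how many attempts pass before it answers. The following is only a minimal sketch under stated assumptions: the helper names ping_once and attempts_until_reachable are hypothetical, and the ping flags assume the Linux iputils ping. It is not what the validation script itself does.

```python
import subprocess
import time
from typing import Callable, Optional


def ping_once(host: str) -> bool:
    """Send one ICMP echo using the system ping (Linux iputils flags assumed)."""
    cmd = ["ping", "-c", "1", "-W", "1", host]
    completed = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode == 0


def attempts_until_reachable(
    probe: Callable[[], bool],
    max_attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return the 1-based attempt on which probe() first succeeds, else None."""
    for attempt in range(1, max_attempts + 1):
        if probe():
            return attempt
        # Wait before retrying, but not after the final attempt.
        if attempt < max_attempts:
            sleep(delay)
    return None
```

Calling attempts_until_reachable(lambda: ping_once("10.1.241.254")) on a compute node would return 1 if the gateway answers immediately, a larger number if it only becomes reachable after a delay, and None if it never answers within the allowed attempts.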
--
Raoul Scarazzini
rasca(a)redhat.com
On 8/2/2016 12:01:29, Raoul Scarazzini wrote:
Hi David,
you are absolutely right, I made too many assumptions. First of all, I'm
installing everything with rdo-manager, using an identical set of
configurations that came from my previous (working) osp-director 8 setup.
Things seem to fail on the compute node verification. Specifically here:
Feb 05 16:37:23 overcloud-novacompute-0 os-collect-config[6014]:
[2016-02-05 16:37:23,707] (heat-config) [ERROR] Error running
/var/lib/heat-config/heat-config-script/a435044e-9be8-42ea-8b03-92bee12b3d23.
[1]
Looking at that script, I identified two different actions:
1) # For each unique remote IP (specified via Heat) we check to
# see if one of the locally configured networks matches and if so we
# attempt a ping test the remote network IP.
2) # Ping all default gateways. There should only be one
# if using upstream t-h-t network templates but we test
# all of them should some manual network config have
# multiple gateways.
And in fact, after checking, I'm not able to reach the compute nodes
from the controllers or from other computes on the addresses in the
InternalApiAllocationPools (172.17.0) or TenantAllocationPools (172.16.0).
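
The first action quoted above (match each remote IP against the locally configured networks, then ping it) can be sketched as follows. This is an illustration written with Python's ipaddress module, not the script's own code; the function name ping_targets is hypothetical. The first three addresses and networks are the ones that appear in the deploy output, while 172.20.0.11 is a made-up address added only to show a remote IP with no matching local network.

```python
import ipaddress
from typing import Dict, Iterable


def ping_targets(remote_ips: Iterable[str], local_cidrs: Iterable[str]) -> Dict[str, str]:
    """Map each unique remote IP to the local network (CIDR) that contains it.

    Remote IPs that match no local network are skipped, mirroring the
    "if one of the locally configured networks matches" condition.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in local_cidrs]
    targets = {}
    for ip in sorted(set(remote_ips)):
        address = ipaddress.ip_address(ip)
        for network in networks:
            if address in network:
                targets[ip] = str(network)
                break
    return targets


# Networks assumed to be configured locally on the node.
LOCAL = ["172.16.0.0/24", "172.17.0.0/24", "172.18.0.0/24"]
# Remote IPs; the last one is hypothetical and has no local match.
REMOTE = ["172.16.0.14", "172.17.0.16", "172.18.0.14", "172.20.0.11"]
```

Each key of ping_targets(REMOTE, LOCAL) is an address that would then be pinged; an unreachable address on a matching network is what makes this part of the validation fail.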
I'm using a specific network setup; as I said, it is the same one I was
using with osp-director 8. So I've got a specific network-management.yaml
file in which I've specified these settings:
resource_registry:
  OS::TripleO::BlockStorage::Net::SoftwareConfig: /home/stack/nic-configs/cinder-storage.yaml
  OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/nic-configs/compute.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/nic-configs/controller.yaml
  OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/nic-configs/swift-storage.yaml
  OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/nic-configs/ceph-storage.yaml

parameter_defaults:
  # Customize the IP subnets to match the local environment
  InternalApiNetCidr: 172.17.0.0/24
  StorageNetCidr: 172.18.0.0/24
  StorageMgmtNetCidr: 172.19.0.0/24
  TenantNetCidr: 172.16.0.0/24
  ExternalNetCidr: 172.20.0.0/24
  ControlPlaneSubnetCidr: '24'
  InternalApiAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
  StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
  StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]
  TenantAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
  ExternalAllocationPools: [{'start': '172.20.0.10', 'end': '172.20.0.200'}]
  # Specify the gateway on the external network.
  ExternalInterfaceDefaultRoute: 172.20.0.254
  # Gateway router for the provisioning network (or Undercloud IP)
  ControlPlaneDefaultRoute: 192.0.2.1
  # Generally the IP of the Undercloud
  EC2MetadataIp: 192.0.2.1
  DnsServers: ["10.1.241.2"]
  InternalApiNetworkVlanID: 2201
  StorageNetworkVlanID: 2203
  StorageMgmtNetworkVlanID: 2204
  TenantNetworkVlanID: 2202
  ExternalNetworkVlanID: 2205
  # Floating IP networks do not have to use br-ex, they can use any
  # bridge as long as the NeutronExternalNetworkBridge is set to "''".
  NeutronExternalNetworkBridge: "''"

# Variables in "parameters" apply an actual value to one of the
# top-level params
parameters:
  # The OVS logical->physical bridge mappings to use. Defaults to
  # mapping br-ex - the external bridge on hosts - to a physical name
  # 'datacentre' which can be used to create provider networks (and we
  # use this for the default floating network) - if changing this either
  # use different post-install network scripts or be sure to keep
  # 'datacentre' as a mapping network name.
  # Unfortunately this option is overridden by the command line, due to
  # a limitation (that will be fixed), so even declaring this won't have effect.
  # See overcloud-deploy.sh for all the explanations.
  # https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud...
  NeutronBridgeMappings: "datacentre:br-floating"
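
As a quick sanity check on settings like these, one could verify that every allocation pool lies inside its network's CIDR and that the external gateway belongs to the external network. The sketch below is only an illustration using Python's ipaddress module; the helper name pool_in_cidr and the SETTINGS table are hypothetical, with the values copied by hand from the parameters above.

```python
import ipaddress


def pool_in_cidr(cidr: str, start: str, end: str) -> bool:
    """True if the start..end pool is ordered and lies fully inside cidr."""
    network = ipaddress.ip_network(cidr)
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    return first in network and last in network and first <= last


# (CIDR, pool start, pool end) copied from the parameter_defaults section.
SETTINGS = {
    "InternalApi": ("172.17.0.0/24", "172.17.0.10", "172.17.0.200"),
    "Storage": ("172.18.0.0/24", "172.18.0.10", "172.18.0.200"),
    "StorageMgmt": ("172.19.0.0/24", "172.19.0.10", "172.19.0.200"),
    "Tenant": ("172.16.0.0/24", "172.16.0.10", "172.16.0.200"),
    "External": ("172.20.0.0/24", "172.20.0.10", "172.20.0.200"),
}

# ExternalInterfaceDefaultRoute should sit inside ExternalNetCidr.
EXTERNAL_GATEWAY_OK = (
    ipaddress.ip_address("172.20.0.254") in ipaddress.ip_network("172.20.0.0/24")
)

# Names of networks whose pool does not fit its CIDR (empty when consistent).
BAD_POOLS = [name for name, args in SETTINGS.items() if not pool_in_cidr(*args)]
```

A non-empty BAD_POOLS or a false EXTERNAL_GATEWAY_OK would point at a typo in the environment file rather than at the switches or the NIC templates.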
Obviously the controller.yaml was modified to reflect my needs, as
described here [1], and as you can see, I haven't declared a
ManagementNetworkVlan, since it was not needed before.
So the first question is: what is this new network, and how does it
differ from the other networks currently available? Can it affect
communications that were working before?
Many thanks
[1]
https://github.com/rscarazz/openstack/blob/master/ospd-network-isolation-...
--
Raoul Scarazzini
rasca(a)redhat.com
On 6/2/2016 00:38:38, David Moreau Simard wrote:
> Hi Raoul,
>
> A good start would be to give us some more details about how you did the
> installation.
>
> What installation tool/procedure ? What repositories ?
>
> David Moreau Simard
> Senior Software Engineer | Openstack RDO
>
> dmsimard = [irc, github, twitter]
>
> On Feb 5, 2016 5:16 AM, "Raoul Scarazzini" <rasca(a)redhat.com> wrote:
>
> Hi,
> I'm trying to deploy Mitaka on a baremetal environment composed
> of 3 controllers and 4 computes.
> After introspection the nodes seem fine, even though for one of them I
> needed to do the introspection by hand, since it was not completing the
> process. In the end all 7 of my nodes were in the "available" state.
>
> When launching the overcloud deploy, the controller part goes fine, but
> then it gives me this error about the compute:
>
> 2016-02-05 09:26:59 [NovaCompute]: CREATE_FAILED ResourceInError:
> resources.NovaCompute: Went to status ERROR due to "Message: Exceeded
> maximum number of retries. Exceeded max scheduling attempts 3 for
> instance 0227f7c1-3c2b-4e10-93bf-e7d84a7aca71. Last exception: Port
> b8:ca:3a:66:ef:5a is still in use.
>
> The funny thing is that I can't find the offending ID anywhere: it's
> neither an Ironic node ID nor a Nova one.
>
> Can you help point me in the right direction?
>
> Many thanks,
>
> --
> Raoul Scarazzini
> rasca(a)redhat.com
>
> _______________________________________________
> Rdo-list mailing list
> Rdo-list(a)redhat.com
> https://www.redhat.com/mailman/listinfo/rdo-list
>
> To unsubscribe: rdo-list-unsubscribe(a)redhat.com
>