[Rdo-list] Overcloud deploy stuck for a long time
by Tzach Shefi
Hi,
The server is running CentOS 7.1, and the VM hosting the undercloud got as
far as the overcloud deploy stage.
It looks like it's stuck; nothing has advanced for a while.
Any ideas on what to check?
[stack@instack ~]$ openstack overcloud deploy --templates
Deploying templates in the directory
/usr/share/openstack-tripleo-heat-templates
[91665.696658] device vnet2 entered promiscuous mode
[91665.781346] device vnet3 entered promiscuous mode
[91675.260324] kvm [71183]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
[91675.291232] kvm [71200]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
[91767.799404] kvm: zapping shadow pages for mmio generation wraparound
[91767.880480] kvm: zapping shadow pages for mmio generation wraparound
[91768.957761] device vnet2 left promiscuous mode
[91769.799446] device vnet3 left promiscuous mode
[91771.223273] device vnet3 entered promiscuous mode
[91771.232996] device vnet2 entered promiscuous mode
[91773.733967] kvm [72245]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
[91801.270510] device vnet2 left promiscuous mode
Thanks
Tzach
8 years, 10 months
[Rdo-list] Fwd: [OpenStack-docs] [install-guide] Status of RDO
by Rich Bowen
I wanted to be certain that everyone has seen this message to
OpenStack-docs, and the subsequent conversation at
http://lists.openstack.org/pipermail/openstack-docs/2015-October/007622.html
This is quite serious, as Lana is basically saying that RDO isn't a
viable way to deploy OpenStack in Liberty, and so it's being removed
from the docs.
It would be helpful if someone closer to Liberty packages, and Delorean,
could participate there in a constructive way to bring this to a happy
conclusion before the release tomorrow.
Thanks.
--Rich
-------- Forwarded Message --------
Subject: [OpenStack-docs] [install-guide] Status of RDO
Date: Wed, 14 Oct 2015 16:22:45 +1000
From: Lana Brindley <openstack(a)lanabrindley.com>
To: openstack-docs(a)lists.openstack.org <openstack-docs(a)lists.openstack.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi everyone,
We've been unable to obtain good pre-release packages from Red Hat for
the Fedora and Red Hat/CentOS repos, despite our best efforts. This has
left the RDO Install Guide in a largely untested state, so I don't feel
confident publishing it at this stage.
As far as we can tell, Fedora are no longer planning on having
pre-release packages available, so this might be a permanent change for
that OS. For Red Hat/CentOS, it seems to be a temporary problem, so
hopefully we can get the packages, complete testing, and publish the
book soon.
The patch to remove RDO is here, for anyone who cares to comment:
https://review.openstack.org/#/c/234584/
Lana
- --
Lana Brindley
Technical Writer
Rackspace Cloud Builders Australia
http://lanabrindley.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQEcBAEBCAAGBQJWHfS1AAoJELppzVb4+KUyM7cH/ii5Ekz5vjTe3dTykXBUbWGt
bR2XJTAbS/mFB+xayecNNPLvgejI6Nxvk8msSFNnN7/ZyDNwr+eceQw7ftMKuJnR
h7qKBb6o5iayLJxgNRK3Kjo13NjGdaiXwfLTbB5br/aiP2HHsrDRexAcLteUCKGt
eHbZUEYqg4VADUvodxNpbZ+7fHuXrIRZoH4aDQ4+o1p0dCdw+vkjzF/MzPSgZFar
Rq9L94rpofDat9ymuW48c+SgUeOnmTvxwEN8ExTENNMXo4nUOJwcUS65J6XURO9K
RUGvjPmSmm7ZaQGE+koKyGZSzF/Oqoa+vBUwxdeQqmtr2tWo//jlUVV/PDc8QV0=
=rQp4
-----END PGP SIGNATURE-----
_______________________________________________
OpenStack-docs mailing list
OpenStack-docs(a)lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-docs
9 years, 1 month
[Rdo-list] RDO-Manager HA Pacemaker in Compute Nodes
by Pedro Sousa
Hi all,
I would like to be able to automatically recover the VMs when a compute
node dies, as described here:
http://blog.clusterlabs.org/blog/2015/openstack-ha-compute/
I've checked that I have pacemaker_remote.service
and the NovaCompute/NovaEvacuate pacemaker resources on my compute nodes, but
they don't seem to be configured/running:
[root@overcloud-novacompute-0 openstack]# systemctl list-unit-files | grep pacemaker
pacemaker.service disabled
pacemaker_remote.service disabled
[root@overcloud-novacompute-0 openstack]# pcs status
Error: cluster is not currently running on this node
Is there a way to activate this on stack deployment? Or do I have to
customize it?
Thanks,
Pedro Sousa
9 years, 1 month
[Rdo-list] How to access the 192.0.2.1:8004 URL to get the deployment failure logs
by Ramkumar GOWRISHANKAR
Hi,
My virtual test bed deployment, with just one controller and no computes, is
failing at ControllerNodesPostDeployment. The debugging steps for a failed
deployment say to run the following command: "heat resource-show overcloud
ControllerNodesPostDeployment". When I run the command, I see three URLs
starting with http://192.0.2.1:8004.
How do I access these URLs? When I try a wget on these URLs, or when I
create an SSH tunnel from the base machine and try to access the URLs, I get
a permission denied message. When I try to access just the base URL
(http://192.0.2.1:8004 mapped to http://localhost:8005) via a tunnel, I get
the following message:
{"versions": [{"status":"CURRENT", "id": "v1.0", "links": [{"href":"http://localhost:8005/v1/","rel":"self"}]}]}
I have looked through the /var/log/heat/ folder for any error messages but
I cannot find any more detailed error message other than deployment failed
at step 1 LoadBalancer.
Any pointers on how to debug a deployment?
Thanks,
Ramkumar
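A plain wget sends no Keystone token, which is a likely (though unconfirmed) reason the resource links are refused while the unauthenticated version document at the base URL is served. The sketch below assumes a token has already been obtained separately (for example with "openstack token issue") and that the tunnel maps 192.0.2.1:8004 to localhost:8005 as described in the message. The helper names are hypothetical, and the URL in the test is an illustrative path, not one taken from the deployment.

```python
import json
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def rewrite_for_tunnel(url, local_netloc="localhost:8005"):
    """Point a Heat link at the local end of the SSH tunnel (hypothetical helper)."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, local_netloc, parts.path, parts.query, parts.fragment)
    )


def build_heat_request(url, token):
    """Build a GET request that carries the Keystone token in X-Auth-Token."""
    request = urllib.request.Request(url)
    request.add_header("X-Auth-Token", token)
    request.add_header("Accept", "application/json")
    return request


def fetch_json(url, token):
    """Fetch one of the resource links and decode the JSON body."""
    with urllib.request.urlopen(build_heat_request(url, token)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_json(rewrite_for_tunnel(link), token) on each of the three links would then return the resource details, assuming the token is valid for the undercloud.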
9 years, 2 months
[Rdo-list] Network Isolation setup check
by Raoul Scarazzini
Hi everybody,
I'm trying to deploy a tripleo environment with network isolation using
a pool of 8 machines: 3 controller, 2 compute and 3 storage.
Each of those machines has 2 network interfaces: the first one (em1) is
connected to the LAN, and the second one (em2) is used for the undercloud
provisioning.
The ultimate goal of the setup is to have the ExternalNet on em1 (so as
to be able to put instances with floating IPs on the LAN) and all the
other networks (InternalApi, Storage and StorageMgmt) on em2.
To achieve this, I created the following network-environment.yaml
configuration:
resource_registry:
  OS::TripleO::BlockStorage::Net::SoftwareConfig: /home/stack/nic-configs/cinder-storage.yaml
  OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/nic-configs/compute.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/nic-configs/controller.yaml
  OS::TripleO::ObjectStorage::Net::SoftwareConfig: /home/stack/nic-configs/swift-storage.yaml
  OS::TripleO::CephStorage::Net::SoftwareConfig: /home/stack/nic-configs/ceph-storage.yaml
parameter_defaults:
  # Customize the IP subnets to match the local environment
  InternalApiNetCidr: 172.17.0.0/24
  StorageNetCidr: 172.18.0.0/24
  StorageMgmtNetCidr: 172.19.0.0/24
  TenantNetCidr: 172.16.0.0/24
  ExternalNetCidr: 10.1.240.0/24
  ControlPlaneSubnetCidr: '24'
  InternalApiAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
  StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
  StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]
  TenantAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
  ExternalAllocationPools: [{'start': '10.1.240.10', 'end': '10.1.240.200'}]
  # Specify the gateway on the external network.
  ExternalInterfaceDefaultRoute: 10.1.240.254
  # Gateway router for the provisioning network (or Undercloud IP)
  ControlPlaneDefaultRoute: 192.0.2.1
  # Generally the IP of the Undercloud
  EC2MetadataIp: 192.0.2.1
  DnsServers: ["10.1.241.2"]
  InternalApiNetworkVlanID: 2201
  StorageNetworkVlanID: 2203
  StorageMgmtNetworkVlanID: 2204
  TenantNetworkVlanID: 2202
  # This won't actually be used since external is on native VLAN, just here for reference
  #ExternalNetworkVlanID: 38
  # Floating IP networks do not have to use br-ex, they can use any bridge as long as the NeutronExternalNetworkBridge is set to "''".
  NeutronExternalNetworkBridge: "''"
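As a first formal check of the addressing in this file, a short script can confirm that each allocation pool lies inside its CIDR, that the CIDRs do not overlap, and that the external gateway is in the external subnet but outside the pool. This is only a sketch: the values are copied by hand from the configuration above, the function name is hypothetical, and it says nothing about the NIC, bridge or switch side of the setup.

```python
import ipaddress

# (CIDR, pool start, pool end) copied from the network-environment.yaml above.
NETWORKS = {
    "InternalApi": ("172.17.0.0/24", "172.17.0.10", "172.17.0.200"),
    "Storage": ("172.18.0.0/24", "172.18.0.10", "172.18.0.200"),
    "StorageMgmt": ("172.19.0.0/24", "172.19.0.10", "172.19.0.200"),
    "Tenant": ("172.16.0.0/24", "172.16.0.10", "172.16.0.200"),
    "External": ("10.1.240.0/24", "10.1.240.10", "10.1.240.200"),
}
EXTERNAL_GATEWAY = "10.1.240.254"


def check_networks(networks, external_gateway):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    parsed = {}
    for name, (cidr, start, end) in networks.items():
        net = ipaddress.ip_network(cidr)
        first = ipaddress.ip_address(start)
        last = ipaddress.ip_address(end)
        parsed[name] = net
        if first not in net or last not in net:
            problems.append(f"{name}: allocation pool is outside {cidr}")
        if first > last:
            problems.append(f"{name}: pool start is after pool end")
    # Compare every pair of networks once.
    names = sorted(parsed)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if parsed[a].overlaps(parsed[b]):
                problems.append(f"{a} and {b}: CIDRs overlap")
    gateway = ipaddress.ip_address(external_gateway)
    if gateway not in parsed["External"]:
        problems.append("External gateway is outside the External CIDR")
    pool_start = ipaddress.ip_address(networks["External"][1])
    pool_end = ipaddress.ip_address(networks["External"][2])
    if pool_start <= gateway <= pool_end:
        problems.append("External gateway falls inside the allocation pool")
    return problems


print(check_networks(NETWORKS, EXTERNAL_GATEWAY) or "no addressing problems found")
```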
And modified the controller.yaml file in this way (default parts are
omitted, nic1 == em1 and nic2 == em2):
...
...
resources:
OsNetConfigImpl:
type: OS::Heat::StructuredConfig
properties:
group: os-apply-config
config:
os_net_config:
network_config:
-
type: ovs_bridge
name: {get_input: bridge_name}
use_dhcp: false
dns_servers: {get_param: DnsServers}
addresses:
-
ip_netmask:
list_join:
- '/'
- - {get_param: ControlPlaneIp}
- {get_param: ControlPlaneSubnetCidr}
routes:
-
ip_netmask: 169.254.169.254/32
next_hop: {get_param: EC2MetadataIp}
members:
-
type: interface
name: nic1
addresses:
-
ip_netmask: {get_param: ExternalIpSubnet}
routes:
-
ip_netmask: 0.0.0.0/0
next_hop: {get_param: ExternalInterfaceDefaultRoute}
-
type: interface
name: nic2
# force the MAC address of the bridge to this interface
primary: true
-
type: vlan
vlan_id: {get_param: InternalApiNetworkVlanID}
addresses:
-
ip_netmask: {get_param: InternalApiIpSubnet}
-
type: vlan
vlan_id: {get_param: StorageNetworkVlanID}
addresses:
-
ip_netmask: {get_param: StorageIpSubnet}
-
type: vlan
vlan_id: {get_param: StorageMgmtNetworkVlanID}
addresses:
-
ip_netmask: {get_param: StorageMgmtIpSubnet}
-
type: vlan
vlan_id: {get_param: TenantNetworkVlanID}
addresses:
-
ip_netmask: {get_param: TenantIpSubnet}
outputs:
OS::stack_id:
description: The OsNetConfigImpl resource.
value: {get_resource: OsNetConfigImpl}
The deploy of the overcloud was invoked with this command:
openstack overcloud deploy --templates --libvirt-type=kvm --ntp-server
10.5.26.10 --control-scale 3 --compute-scale 2 --ceph-storage-scale 3
--block-storage-scale 0 --swift-storage-scale 0 --control-flavor
baremetal --compute-flavor baremetal --ceph-storage-flavor baremetal
--block-storage-flavor baremetal --swift-storage-flavor baremetal
--templates -e
/usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml
-e
/usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml
-e /home/stack/network-environment.yaml
What I need to know is whether my configuration is formally correct,
since I ran into network problems once the post-deployment steps
were done.
I still don't know (we're investigating) whether those problems are related
to the switch configuration (so the hardware side), but for some reason
everything exploded.
What I saw while the machines were still reachable was what I expected:
the external address assigned to em1 and the VLANs correctly assigned to
em2, with all the external IP addresses able to ping one another.
But I was not able to do further tests.
From your point of view, am I missing something?
Many thanks,
--
Raoul Scarazzini
rasca(a)redhat.com
9 years, 2 months
[Rdo-list] TryStack Outage Report 2015-10-28
by Kambiz Aghaiepour
TryStack Outage
Wednesday Oct 28, 2015
Impact -
Earlier Wednesday morning, TryStack ( http://x86.trystack.org/ )
experienced an outage for several hours beginning in the early hours of
the day. The outage impacted all tenants and appears to have been
caused by resource exhaustion from tenant network services that had built
up over the course of several months. To return services to
normal, resources (networks, router ports, etc.) for tenants without any
running VMs were manually deleted, freeing up system resources on our
neutron host and returning TryStack to normal operations.
Per Tenant Fix -
If you have occasion to use TryStack as a sandbox environment, you may
need to delete and recreate the router in your tenant if you find that your
launched guests are not acquiring a DHCP address correctly or cannot be
reached over an associated floating IP address.
Ongoing Resource Management -
In order to prevent exhaustion of system resources, we have been
automatically deleting VMs 24 hours after they are
created. Additionally, we clear router gateways as well as floating IP
allocations 12 hours after they are set (the public subnet is a /24
network and anyone with an account can use the public subnet free of
charge, hence the need for aggressively culling resources)
Until today we had not been purging other resources, and over the course
of the last three to four months, the tenant/project count has grown to
just over 1300 tenants. Many users log in a few times, create their
networks and routers, launch some test VMs, and may not revisit
TryStack for some time. As such the qrouter and qdhcp network
namespaces are created, and ports created in OVS, along with associated
dnsmasq processes for each subnet the tenant creates. We are adding
management and culling of these additional resource types using the
ospurge utility ( see: https://github.com/openstack/ospurge )
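The age-based policy described above (VMs removed after 24 hours, router gateways and floating IP allocations after 12) can be sketched as a simple selection over resource records. This is an illustration only, not the TryStack tooling or ospurge: the record layout ("id", "kind", "created") and the function name are hypothetical, and the actual deletion calls are left out.

```python
from datetime import datetime, timedelta, timezone

# Maximum age per resource kind, taken from the policy described above.
MAX_AGE = {
    "vm": timedelta(hours=24),
    "router_gateway": timedelta(hours=12),
    "floating_ip": timedelta(hours=12),
}


def select_expired(resources, now):
    """Return the resources whose age has reached the limit for their kind.

    Each resource is a dict with hypothetical keys "id", "kind" and
    "created" (a timezone-aware datetime for when it was created or set).
    Kinds without an entry in MAX_AGE are never selected.
    """
    expired = []
    for resource in resources:
        limit = MAX_AGE.get(resource["kind"])
        if limit is not None and now - resource["created"] >= limit:
            expired.append(resource)
    return expired


now = datetime(2015, 10, 28, 12, 0, tzinfo=timezone.utc)
sample = [
    {"id": "vm-old", "kind": "vm", "created": now - timedelta(hours=30)},
    {"id": "vm-new", "kind": "vm", "created": now - timedelta(hours=2)},
    {"id": "fip-old", "kind": "floating_ip", "created": now - timedelta(hours=13)},
    {"id": "net-old", "kind": "network", "created": now - timedelta(days=90)},
]
print([r["id"] for r in select_expired(sample, now)])
```

The 90-day-old network in the sample is not selected, which mirrors the gap the report describes: resource types with no age limit accumulate until they are purged some other way.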
IRC Alerting -
We have also added IRC bots that can announce alerts in the #trystack
channel in Freenode. Alerts are sent to the IRC bot via a nagios
instance monitoring the environment.
Grafana / Graphite -
We are currently working on building dashboards using grafana, using a
graphite backend, and collectd agents sending data to graphite. Will
Foster has built an initial dashboard to see resource utilization and
trending at a glance (Thanks Will!). The dashboard(s) are not yet ready
for public consumption, but we plan on making a read-only grafana
interface available in the near future. For a sample of what the
dashboard will look like, see :
http://ibin.co/2Kf8i9WxsWIl
(The image is only depicting part of the dashboard as it is only a
screenshot).
--
Red Hat, Inc.
100 East Davie Street
Raleigh, NC 27601
"All tyranny needs to gain a foothold is for people of good conscience
to remain silent." --Thomas Jefferson
9 years, 2 months
[Rdo-list] Discovery password
by Alessandro Vozza
Hi
I’m running discovery on my bare-metal nodes, but I’m running into network problems: some of them (Dell blades) are discovered correctly, while others can’t send their results back to the undercloud (although they do boot the discovery image correctly). Is there a way to drop into a shell at the console of the failing nodes and check the network configuration there? In the good ol’ days of staypuft we could pass the rootpw= parameter to PXE; is that an option now as well?
thanks
alessandro
9 years, 2 months