[Rdo-list] Permission denied errors after installing a Storage Node in a Swift Cluster
by Diogo Vieira
Hello,
I have a 4-node Swift cluster (1 Proxy Node and 3 Storage Nodes) to which I would like to add a 4th Storage Node. I've tried two different ways to add it: the first was manual and the second used packstack (which in itself had some other problems, described here[1] if you'd like to help; I worked around that one by symlinking swift-ring-builder to the real binary). With both methods, as soon as I add the 4th Storage Node I start getting errors in syslog coming from rsync, like:
> May 5 13:48:55 host-10-10-6-28 object-replicator: rsync: recv_generator: mkdir "/device3/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a" (in object) failed: Permission denied (13)
> May 5 13:48:55 host-10-10-6-28 object-replicator: *** Skipping any contents from this failed directory ***
> May 5 13:48:55 host-10-10-6-28 object-replicator: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.0pre1]
> May 5 13:48:55 host-10-10-6-28 object-replicator: Bad rsync return code: 23 <- ['rsync', '--recursive', '--whole-file', '--human-readable', '--xattrs', '--itemize-changes', '--ignore-existing', '--timeout=30', '--contimeout=30', '--bwlimit=0', '/srv/node/device1/objects/134106/f2a', '10.10.6.30::object/device3/objects/134106']
or
> May 5 13:48:23 host-10-10-6-28 object-replicator: rsync: recv_generator: failed to stat "/device2/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a/1399288004.47239.data" (in object): Permission denied (13)
> May 5 13:48:23 host-10-10-6-28 object-replicator: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.0pre1]
> May 5 13:48:23 host-10-10-6-28 object-replicator: Bad rsync return code: 23 <- ['rsync', '--recursive', '--whole-file', '--human-readable', '--xattrs', '--itemize-changes', '--ignore-existing', '--timeout=30', '--contimeout=30', '--bwlimit=0', '/srv/node/device1/objects/134106/f2a', '10.10.6.29::object/device2/objects/134106']
Please note that I get these messages on one of the older Storage Nodes, while on the newly added one I get errors like these:
> May 5 14:17:44 host-10-10-6-34 object-auditor: ERROR Trying to audit /srv/node/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 173, in failsafe_object_audit#012 self.object_audit(location)#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 191, in object_audit#012 with df.open():#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1025, in open#012 data_file, meta_file, ts_file = self._get_ondisk_file()#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1114, in _get_ondisk_file#012 "Error listing directory %s: %s" % (self._datadir, err))#012DiskFileError: Error listing directory /srv/node/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a: [Errno 13] Permission denied: '/srv/node/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a'
> May 5 14:17:44 host-10-10-6-34 object-auditor: ERROR Trying to audit /srv/node/device4/objects/215176/306/d222240f67449968145f65edd48ad306: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 173, in failsafe_object_audit#012 self.object_audit(location)#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 191, in object_audit#012 with df.open():#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1025, in open#012 data_file, meta_file, ts_file = self._get_ondisk_file()#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1114, in _get_ondisk_file#012 "Error listing directory %s: %s" % (self._datadir, err))#012DiskFileError: Error listing directory /srv/node/device4/objects/215176/306/d222240f67449968145f65edd48ad306: [Errno 13] Permission denied: '/srv/node/device4/objects/215176/306/d222240f67449968145f65edd48ad306'
> May 5 14:17:44 host-10-10-6-34 object-auditor: ERROR Trying to audit /srv/node/device4/objects/79135/961/4d47ee46560698a7a938d225a2357961: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 173, in failsafe_object_audit#012 self.object_audit(location)#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 191, in object_audit#012 with df.open():#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1025, in open#012 data_file, meta_file, ts_file = self._get_ondisk_file()#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1114, in _get_ondisk_file#012 "Error listing directory %s: %s" % (self._datadir, err))#012DiskFileError: Error listing directory /srv/node/device4/objects/79135/961/4d47ee46560698a7a938d225a2357961: [Errno 13] Permission denied: '/srv/node/device4/objects/79135/961/4d47ee46560698a7a938d225a2357961'
> May 5 14:17:44 host-10-10-6-34 object-auditor: Object audit (ZBF) "forever" mode completed: 0.06s. Total quarantined: 0, Total errors: 3, Total files/sec: 52.40, Total bytes/sec: 0.00, Auditing time: 0.06, Rate: 0.98
as well as:
> May 5 14:21:27 host-10-10-6-34 xinetd[519]: START: rsync pid=7319 from=10.10.6.28
> May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: name lookup failed for 10.10.6.28: Name or service not known
> May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: connect from UNKNOWN (10.10.6.28)
> May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: rsync to object/device4/objects/134106 from UNKNOWN (10.10.6.28)
> May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: receiving file list
> May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: rsync: recv_generator: mkdir "/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a" (in object) failed: Permission denied (13)
> May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: *** Skipping any contents from this failed directory ***
> May 5 14:21:28 host-10-10-6-34 rsyncd[7319]: sent 234 bytes received 243 bytes total size 1,048,576
> May 5 14:21:28 host-10-10-6-34 xinetd[519]: EXIT: rsync status=0 pid=7319 duration=1(sec)
I have no idea whether they're related. I've checked that the swift user owns the /srv/node folders. When I added the node manually I assumed the problem was on my side, but after hitting the same issue with packstack I have no idea what the cause could be. Can someone help me?
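For reference, this is roughly what I am checking on the new node (the device4 path and the rsyncd.conf location are assumptions based on the logs above, not verified facts):

# does the newly added device really belong to swift, and is SELinux blocking it?
ls -ldZ /srv/node /srv/node/device4
getenforce
restorecon -Rnv /srv/node        # -n only reports what would be relabelled, changes nothing
# is the rsync "object" module pointing at /srv/node and running as the swift user?
grep -A6 '^\[object\]' /etc/rsyncd.conf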
[1]: https://ask.openstack.org/en/question/28776/packstack-aborting-installati...
Thank you in advance,
Diogo Vieira <dfv(a)eurotux.com>
Programador
Eurotux Informática, S.A. | www.eurotux.com
(t) +351 253 680 300
[Rdo-list] RDO Nova error - cannot launch an instance
by Adam Fyfe
Hi List
I cannot launch an instance.
I get this in compute.log:
2014-05-15 17:02:09.225 4209 TRACE nova.compute.manager [instance: 31f99d54-1da0-4847-b8f0-d38fe1617ef9] libvirtError: Hook script execution failed: Hook script /etc/libvirt/hooks/qemu qemu failed with error code 256
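The error points at a custom hook under /etc/libvirt/hooks. One way to see why it exits non-zero is to run it by hand (a rough sketch that assumes the hook is a shell script; the domain name and operation below are just placeholders for what libvirt would pass):

ls -l /etc/libvirt/hooks/qemu          # does it exist, and is it executable?
# libvirt invokes it as: qemu <domain name> <operation> <sub-operation>, with the domain XML on stdin
echo '<domain/>' | bash -x /etc/libvirt/hooks/qemu test-instance prepare begin -
echo $?                                # a non-zero exit here should reproduce the failure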
Any help would be super!
thanks
adam
[Rdo-list] [package announce] openstack-packstack icehouse update
by Pádraig Brady
Icehouse RDO packstack has been updated as follows.
openstack-packstack-2014.1.1-0.12.dev1068:
- Ensure all puppet modules dependencies installed on all nodes
- [Nova] Setup ssh keys to support ssh-based live migration (lp#1311168)
- [Nova] Support multiple sshkeys per host
- [Nova] Fix vcenter parameters duplicated in answer file (rhbz#1061372, rhbz#1092008)
- [Ceilometer] Install ceilometer compute agent on compute nodes (lp#1318383)
- [Ceilometer] Start openstack-ceilometer-notification service (rhbz#1096268)
- [Horizon] Fix help_url to point to upstream docs (rhbz#1080917)
- [Horizon] Fix invalid keystone_default_role causing swift issues
- [Horizon] Improved SSL configuration (rhbz#1078130)
- [Neutron] Fix ML2 install (rhbz#1096510)
[Rdo-list] Launching a Nova instance results in "NovaException: Unexpected vif_type=binding_failed"
by Kashyap Chamarthy
Setup: A 2-node install (in virtual machines w/ nested virt) with
IceHouse (Neutron w/ ML2+OVS+GRE) on Fedora 20, but OpenStack IceHouse
packages are from Rawhide (Version details below).
Problem
-------
An attempt to launch a Nova instance as a user tenant results in this traceback saying "Unexpected vif_type". The interesting thing is that the instance goes ACTIVE when I launch it as the admin tenant.
2014-05-13 07:06:32.123 29455 ERROR nova.compute.manager [req-402f21c1-98ed-4600-96b9-84efdb9c823d cb68d099e78d490ab0adf4030881153b 0a6eb2259ca142e7a80541db10835e71] [instance: 950de10f-4368-4498-b46a-b1595d057e38] Error: Unexpected vif_type=binding_failed
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] Traceback (most recent call last):
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1311, in _build_instance
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] set_access_ip=set_access_ip)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 399, in decorated_function
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] return function(self, context, *args, **kwargs)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1723, in _spawn
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] LOG.exception(_('Instance failed to spawn'), instance=instance)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] six.reraise(self.type_, self.value, self.tb)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1720, in _spawn
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] block_device_info)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2250, in spawn
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] write_to_disk=True)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3431, in to_xml
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] disk_info, rescue, block_device_info)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3247, in get_guest_config
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] flavor)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 384, in get_config
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] _("Unexpected vif_type=%s") % vif_type)
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38] NovaException: Unexpected vif_type=binding_failed
2014-05-13 07:06:32.123 29455 TRACE nova.compute.manager [instance: 950de10f-4368-4498-b46a-b1595d057e38]
2014-05-13 07:06:32.846 29455 ERROR oslo.messaging.rpc.dispatcher [-] Exception during message handling: Unexpected vif_type=binding_failed
Notes/Observations/diagnostics
------------------------------
- I can reach the inter-webs from the router namespace, but not from the DHCP namespace.
- In nova.conf, for 'libvirt_vif_driver', I tried both of the options below separately, and also tried commenting it out; an upstream Nova commit[1] from 4 Apr 2014 marks it as deprecated.
libvirt_vif_driver=nova.virt.libvirt.vif.LibvirtGenericVIFDriver
libvirt_vif_driver=nova.virt.libvirt.vif.LibvirtHybridOVSBridgeDriver
- Some diagnostics are here[2]
- From some debugging (and from the diagnostics above), I suspect the br-tun bridge is missing (a quick check is sketched below)
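A minimal way to confirm that, assuming the OVS agent is supposed to create br-tun on the compute node (plain commands, nothing here is specific to my setup):

ovs-vsctl list-br                      # br-int and br-tun should both be listed when GRE tunneling is on
ovs-vsctl show | grep -A4 'Bridge br-tun'
neutron agent-list                     # from the controller: is the Open vSwitch agent on this host alive?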
Related
-------
I see a related Neutron bug[3], but that's not the root cause of this
bug.
Versions
--------
Nova, Neutron, libvirt, QEMU, OpenvSwitch versions:
openstack-nova-compute-2014.1-2.fc21.noarch
openstack-neutron-2014.1-11.fc21.noarch
libvirt-daemon-kvm-1.1.3.5-1.fc20.x86_64
qemu-system-x86-1.6.2-4.fc20.x86_64
openvswitch-2.0.1-1.fc20.x86_64
[1] https://git.openstack.org/cgit/openstack/nova/commit/?id=9f6070e194504cc2...
[2] https://gist.github.com/kashyapc/0d4869796c7ea79bfb89
[3] https://bugs.launchpad.net/neutron/+bug/1244255
nova.conf and ml2_conf.ini
--------------------------
nova.conf:
$ cat /etc/nova/nova.conf | grep -v ^$ | grep -v ^#
[DEFAULT]
logdir = /var/log/nova
state_path = /var/lib/nova
lock_path = /var/lib/nova/tmp
volumes_dir = /etc/nova/volumes
dhcpbridge = /usr/bin/nova-dhcpbridge
dhcpbridge_flagfile = /etc/nova/nova.conf
force_dhcp_release = True
injected_network_template = /usr/share/nova/interfaces.template
libvirt_nonblocking = True
libvirt_use_virtio_for_bridges=True
libvirt_inject_partition = -1
#libvirt_vif_driver=nova.virt.libvirt.vif.LibvirtGenericVIFDriver
#libvirt_vif_driver=nova.virt.libvirt.vif.LibvirtHybridOVSBridgeDriver
#iscsi_helper = tgtadm
sql_connection = mysql://nova:nova@192.169.142.97/nova
compute_driver = libvirt.LibvirtDriver
libvirt_type=qemu
rootwrap_config = /etc/nova/rootwrap.conf
auth_strategy = keystone
firewall_driver=nova.virt.firewall.NoopFirewallDriver
enabled_apis = ec2,osapi_compute,metadata
my_ip=192.169.142.168
network_api_class = nova.network.neutronv2.api.API
neutron_url = http://192.169.142.97:9696
neutron_auth_strategy = keystone
neutron_admin_tenant_name = services
neutron_admin_username = neutron
neutron_admin_password = fedora
neutron_admin_auth_url = http://192.169.142.97:35357/v2.0
linuxnet_interface_driver = nova.network.linux_net.LinuxOVSInterfaceDriver
firewall_driver = nova.virt.firewall.NoopFirewallDriver
security_group_api = neutron
rpc_backend = nova.rpc.impl_kombu
rabbit_host = 192.169.142.97
rabbit_port = 5672
rabbit_userid = guest
rabbit_password = fedora
glance_host = 192.169.142.97
[keystone_authtoken]
auth_uri = http://192.169.142.97:5000
admin_tenant_name = services
admin_user = nova
admin_password = fedora
auth_host = 192.169.142.97
auth_port = 35357
auth_protocol = http
signing_dirname = /tmp/keystone-signing-nova
ml2 plugin:
$ cat /etc/neutron/plugin.ini | grep -v ^$ | grep -v ^#
[ml2]
type_drivers = gre
tenant_network_types = gre
mechanism_drivers = openvswitch
[ml2_type_flat]
[ml2_type_vlan]
[ml2_type_gre]
tunnel_id_ranges = 1:1000
[ml2_type_vxlan]
[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
enable_security_group = True
What am I missing?
I'm still investigating by playing with these config settings in ml2_conf.ini (a rough sketch of the section follows):
enable_tunneling = True
integration_bridge = br-int
tunnel_bridge = br-tun
bridge_mappings = ens2:br-ex
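For context, this is roughly the agent-side section I am experimenting with. The file name, section layout and the local_ip value are assumptions about my setup (local_ip taken from my_ip in nova.conf above), not something verified:

[ovs]
enable_tunneling = True
integration_bridge = br-int
tunnel_bridge = br-tun
# this compute node's own tunnel endpoint address (assumed)
local_ip = 192.169.142.168
bridge_mappings = ens2:br-ex
[agent]
tunnel_types = gre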
--
/kashyap
[Rdo-list] Cinder volume deleting issue
by anand ts
Hi all,
I have a multi-node OpenStack Havana (RDO) setup on CentOS 6.5.
Issue: I can't delete a Cinder volume.
When I try to delete it through the command line:
[root@cinder ~(keystone_admin)]# cinder list
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
|                  ID                  | Status | Display Name | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
| fe0fdad1-2f8a-4cce-a173-797391dbc7ad | in-use |     vol2     |  10  |     None    |   true   | b998107b-e708-42a5-8790-4727fed879a3 |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
[root@cinder ~(keystone_admin)]# cinder delete fe0fdad1-2f8a-4cce-a173-797391dbc7ad
Delete for volume fe0fdad1-2f8a-4cce-a173-797391dbc7ad failed: Invalid volume: Volume status must be available or error, but current status is: in-use (HTTP 400) (Request-ID: req-d9be63f0-476a-4ecd-8655-20491336ee8b)
ERROR: Unable to delete any of the specified volumes.
When I try to delete it through the dashboard I get an error as well (screenshot attached to this mail).
This occurred because an instance with the Cinder volume attached was deleted from the database without detaching the volume. Now the volume is marked in-use but attached to nothing.
Please find the cinder logs here: http://paste.openstack.org/show/80333/
Is there any workaround for this problem?
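One direction I am considering (just a sketch, not yet verified on this setup) is forcing the volume's status back so the delete can go through:

# tell Cinder the volume is no longer in use, then retry the delete
cinder reset-state --state available fe0fdad1-2f8a-4cce-a173-797391dbc7ad
cinder delete fe0fdad1-2f8a-4cce-a173-797391dbc7ad

If Cinder still considers the volume attached after that, the attachment record itself probably needs cleaning up in the database, which I would rather avoid.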
[Rdo-list] keystone error when run packstack
by HuJun
Hi Guys:
I hit a Keystone error when running packstack on Fedora 20; it looks like a parameter error.
What can I do to get past it?
[root@cloudf ~]# cat /etc/fedora-release
Fedora release 20 (Heisenbug)
[root@cloudf ~]# uname -a
Linux cloudf 3.14.2-200.fc20.x86_64 #1 SMP Mon Apr 28 14:40:57 UTC 2014
x86_64 x86_64 x86_64 GNU/Linux
[root@cloudf ~]# packstack
--answer-file=packstack-answers-20140512-202737.txt
Welcome to Installer setup utility
Installing:
Clean Up... [ DONE ]
Setting up ssh keys... [ DONE ]
Discovering hosts' details... [ DONE ]
Adding pre install manifest entries... [ DONE ]
Adding MySQL manifest entries... [ DONE ]
Adding QPID manifest entries... [ DONE ]
Adding Keystone manifest entries... [ DONE ]
Adding Glance Keystone manifest entries... [ DONE ]
Adding Glance manifest entries... [ DONE ]
Installing dependencies for Cinder... [ DONE ]
Adding Cinder Keystone manifest entries... [ DONE ]
Adding Cinder manifest entries... [ DONE ]
Checking if the Cinder server has a cinder-volumes vg...[ DONE ]
Adding Nova API manifest entries... [ DONE ]
Adding Nova Keystone manifest entries... [ DONE ]
Adding Nova Cert manifest entries... [ DONE ]
Adding Nova Conductor manifest entries... [ DONE ]
Adding Nova Compute manifest entries... [ DONE ]
Adding Nova Scheduler manifest entries... [ DONE ]
Adding Nova VNC Proxy manifest entries... [ DONE ]
Adding Nova Common manifest entries... [ DONE ]
Adding Openstack Network-related Nova manifest entries...[ DONE ]
Adding Neutron API manifest entries... [ DONE ]
Adding Neutron Keystone manifest entries... [ DONE ]
Adding Neutron L3 manifest entries... [ DONE ]
Adding Neutron L2 Agent manifest entries... [ DONE ]
Adding Neutron DHCP Agent manifest entries... [ DONE ]
Adding Neutron LBaaS Agent manifest entries... [ DONE ]
Adding Neutron Metadata Agent manifest entries... [ DONE ]
Adding OpenStack Client manifest entries... [ DONE ]
Adding Horizon manifest entries... [ DONE ]
Adding Swift Keystone manifest entries... [ DONE ]
Adding Swift builder manifest entries... [ DONE ]
Adding Swift proxy manifest entries... [ DONE ]
Adding Swift storage manifest entries... [ DONE ]
Adding Swift common manifest entries... [ DONE ]
Adding Provisioning manifest entries... [ DONE ]
Adding Ceilometer manifest entries... [ DONE ]
Adding Ceilometer Keystone manifest entries... [ DONE ]
Adding Nagios server manifest entries... [ DONE ]
Adding Nagios host manifest entries... [ DONE ]
Adding post install manifest entries... [ DONE ]
Preparing servers... [ DONE ]
Installing Dependencies... [ DONE ]
Copying Puppet modules and manifests... [ DONE ]
Applying Puppet manifests...
Applying 192.168.0.101_prescript.pp
192.168.0.101_prescript.pp : [ DONE ]
Applying 192.168.0.101_mysql.pp
Applying 192.168.0.101_qpid.pp
192.168.0.101_mysql.pp : [ DONE ]
192.168.0.101_qpid.pp : [ DONE ]
Applying 192.168.0.101_keystone.pp
Applying 192.168.0.101_glance.pp
Applying 192.168.0.101_cinder.pp
[ ERROR ]
ERROR : Error appeared during Puppet run: 192.168.0.101_keystone.pp
Error: /Stage[main]/Keystone::Roles::Admin/Keystone_role[_member_]: Could not evaluate: Execution of '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list' returned 2: usage: keystone [--version] [--timeout <seconds>]
You will find full trace in log /var/tmp/packstack/20140512-210500-q42ayh/manifests/192.168.0.101_keystone.pp.log
Please check log file /var/tmp/packstack/20140512-210500-q42ayh/openstack-setup.log for more information
Additional information:
* Time synchronization installation was skipped. Please note that
unsynchronized time on server instances might be problem for some
OpenStack components.
* Did not create a cinder volume group, one already existed
* File /root/keystonerc_admin has been created on OpenStack client
host 192.168.0.101. To use the command line tools you need to source the
file.
* To access the OpenStack Dashboard browse to
http://192.168.0.101/dashboard.
Please, find your login credentials stored in the keystonerc_admin in
your home directory.
* To use Nagios, browse to http://192.168.0.101/nagios username :
nagiosadmin, password : 971d4caec4534007
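To narrow down whether the problem is the keystone client itself or the credentials packstack passes to it, the failing call can be reproduced by hand, roughly like this (reading the token from keystone.conf is an assumption about what packstack configured; the token value below is a placeholder):

grep '^admin_token' /etc/keystone/keystone.conf
keystone --os-endpoint http://127.0.0.1:35357/v2.0/ --os-token <admin_token_from_above> role-list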
--
--------------------
Jun Hu
mobile:186 8035 6499
Tel :0755-8282 2635
email :jhu@novell.com
jhu_com(a)163.com
Suse, China
----------------------
[Rdo-list] [OFI] Astapor, Foreman, Staypuft interaction
by Martyn Taylor
All,
I had some discussion this morning with Petr Chalupa about HA orchestration, particularly around the HA Controller node deployment.
This particular role behaves slightly differently to the other roles in
a Staypuft deployment in that it requires more than one puppet run to
complete.
Up to now we have worked on the assumption that once we have received a
successful puppet run report in foreman, then the node associated with
the role is configured and ready to go. We use this for scheduling the
next list of nodes in a given deployment.
We do have a workaround for the HA Controller issue described above in the astapor modules. Blocking is implemented in the subsequent puppet modules that depend on the HA Controller services. This means that any dependent modules will wait until the controller completes before proceeding. This results in the following behaviour.
Sequence
- Controller Nodes Provisioned.
- First puppet run returns successful.
- LVM Block Storage is provisioned.
- Controller Node puppet run 2 completes
- LVM Block storage puppet run completes.
In this case, the LVM block storage is provisioned before the
controllers are complete, but will block until the Controller puppet run
2 completes.
This workaround is sufficient for the time being, but really we would like Staypuft to orchestrate the whole process, rather than have it partially orchestrated by the puppet modules and partially by Staypuft.
The difficulty we have right now in Staypuft is that, without knowing the specific implementation details of the puppet modules, there is no clear way to detect whether a node with role X is complete so that we can schedule the next roles in the sequence.
What we need here is a clear interface for determining the status of a puppet class and/or HostGroup for the Astapor modules.
I have 2 questions around this,
1. Does there currently exist any way to consistently detect the status of a role/list of classes within Foreman for Astapor classes that we can utilize?
- If so, can we do this without knowing the implementation details of the Astapor puppet modules? (We do not want to, for example, look for class-specific facts in Foreman, since these vary between classes and may change in Astapor.)
2. If not 1: is it possible to add something to the puppet modules to explicitly show that a class/Hostgroup is complete? I am thinking of something along the lines of reporting a "Ready" flag back to Foreman (a rough sketch of that idea follows below).
If none of the above, any other suggestions?
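To make 2. a bit more concrete, this is the kind of thing I mean (purely a sketch; the fact name, hostname and credentials are made up):

# last step of the controller manifest drops an external fact on the node...
echo 'quickstack_controller_ready=true' > /etc/facter/facts.d/quickstack_controller_ready.txt
# ...which is uploaded to Foreman with the next report, so Staypuft could poll for it:
curl -s -k -u admin:changeme \
  https://foreman.example.com/api/v2/hosts/controller1.example.com/facts \
  | grep -o 'quickstack_controller_ready[^,]*'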
Cheers
Martyn
[Rdo-list] Can't Deploy Foreman with openstack-foreman-installer for Bare Metal Provisioning (undefined method `[]' for nil:NilClass)
by Ramon Acedo
Hi all,
I have been trying to test the OpenStack Foreman Installer with different combinations of Foreman versions and of the installer itself (and even different versions of Puppet) with no success so far.
I know that Packstack alone works but I want to go all the way with multiple hosts and bare metal provisioning to eventually use it for large deployments and scale out Nova Compute and other services seamlessly.
The error I get when running the foreman_server.sh script is always:
--------------
rake aborted!
undefined method `[]' for nil:NilClass
Tasks: TOP => db:seed
(See full trace by running task with --trace)
--------------
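To get the full trace that the message refers to, I re-run the seed step by hand, roughly like this (the paths and the SCL wrapper are assumptions; the SCL part only applies to the Foreman 1.5 case):

cd /usr/share/foreman
# Foreman 1.5 from RDO is a ruby193 SCL build; for Foreman 1.3 plain rake should do
scl enable ruby193 'RAILS_ENV=production rake db:seed --trace'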
After that, if Foreman starts, there’s nothing in the "Host groups" section which is supposed to be prepopulated by the foreman_server.sh script (as described in http://red.ht/1jdJ03q).
The process I follow is very simple:
1. Install a clean RHEL 6.5 or CentOS 6.5
2. Enable EPEL
3. Enable the rdo-release repo:
a. rdo-release-havana-7: Foreman 1.3 and openstack-foreman-installer 1.0.6
b. rdo-release-havana-8: Foreman 1.5 and openstack-foreman-installer 1.0.6
c. rdo-release-icehouse-3: Foreman 1.5 and openstack-foreman-installer 2.0 (as a note here, the SCL repo needs to be enabled before the next step too).
4. Install openstack-foreman-installer
5. Create and export the needed variables:
export PROVISIONING_INTERFACE=eth0
export FOREMAN_GATEWAY=192.168.5.100
export FOREMAN_PROVISIONING=true
6. Run the script foreman_server.sh from /usr/share/openstack-foreman-installer/bin
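In shell form, steps 4-6 boil down to this (values copied from the list above, nothing new):

yum install -y openstack-foreman-installer
export PROVISIONING_INTERFACE=eth0
export FOREMAN_GATEWAY=192.168.5.100
export FOREMAN_PROVISIONING=true
cd /usr/share/openstack-foreman-installer/bin
sh ./foreman_server.sh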
For 3a and 3b I also tried with an older version of Puppet (3.2) with the same result.
These are the full outputs:
3a: http://fpaste.org/97739/ (Havana and Foreman 1.3)
3b: http://fpaste.org/97760/ (Havana and Foreman 1.3 with Puppet 3.2)
3c: http://fpaste.org/97838/ (Icehouse and Foreman 1.5)
I'm sure somebody on the list has tried to deploy and configure Foreman for bare metal installations (DHCP+PXE), and the documentation and the foreman_server.sh script suggest it should be possible in a fairly easy way.
I filed a bug, as it might well be one, pending confirmation: https://bugzilla.redhat.com/show_bug.cgi?id=1092443
Any help is really appreciated!
Many thanks.
Ramon
[Rdo-list] deploy RDO in a non-internet environment
by Kun Huang
Hi guys
I want to use RDO as the default OpenStack deployment in my lab. However, for various reasons the servers cannot reach the external network, so at the very least I should mirror the RDO-related repositories first, such as EPEL. The list below is my plan:
epel, epel-testing, foreman, puppetlabs, rdo-release
Are those enough? Has anybody tried this?
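This is the kind of mirroring I have in mind (a rough sketch; the repo id, paths and mirror host are placeholders):

# mirror one repository onto a box that the lab servers can reach
yum install -y yum-utils createrepo
reposync --repoid=epel --download_path=/var/www/html/mirror/
createrepo /var/www/html/mirror/epel/
# then point the lab hosts at it with a .repo file using baseurl=http://<mirror-host>/mirror/epel/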
thanks :)
10 years, 6 months