Hello,
I have a 4-node Swift cluster (1 Proxy Node and 3 Storage Nodes) to which I would like to add a 4th Storage Node. I've tried two different ways of adding it: first manually, and then with packstack (which had a separate problem of its own, described here[1] if you'd like to help; I worked around it by symlinking swift-ring-builder to the real binary). With both methods, as soon as I add the 4th Storage Node I start getting rsync errors in syslog like:
May 5 13:48:55 host-10-10-6-28 object-replicator: rsync: recv_generator: mkdir "/device3/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a" (in object) failed: Permission denied (13)
May 5 13:48:55 host-10-10-6-28 object-replicator: *** Skipping any contents from this failed directory ***
May 5 13:48:55 host-10-10-6-28 object-replicator: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.0pre1]
May 5 13:48:55 host-10-10-6-28 object-replicator: Bad rsync return code: 23 <- ['rsync', '--recursive', '--whole-file', '--human-readable', '--xattrs', '--itemize-changes', '--ignore-existing', '--timeout=30', '--contimeout=30', '--bwlimit=0', '/srv/node/device1/objects/134106/f2a', '10.10.6.30::object/device3/objects/134106']
or
May 5 13:48:23 host-10-10-6-28 object-replicator: rsync: recv_generator: failed to stat "/device2/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a/1399288004.47239.data" (in object): Permission denied (13)
May 5 13:48:23 host-10-10-6-28 object-replicator: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.0pre1]
May 5 13:48:23 host-10-10-6-28 object-replicator: Bad rsync return code: 23 <- ['rsync', '--recursive', '--whole-file', '--human-readable', '--xattrs', '--itemize-changes', '--ignore-existing', '--timeout=30', '--contimeout=30', '--bwlimit=0', '/srv/node/device1/objects/134106/f2a', '10.10.6.29::object/device2/objects/134106']
Please note that these messages come from one of the older Storage Nodes. On the newly added node I get errors like these:
May 5 14:17:44 host-10-10-6-34 object-auditor: ERROR Trying to audit /srv/node/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 173, in failsafe_object_audit#012 self.object_audit(location)#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 191, in object_audit#012 with df.open():#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1025, in open#012 data_file, meta_file, ts_file = self._get_ondisk_file()#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1114, in _get_ondisk_file#012 "Error listing directory %s: %s" % (self._datadir, err))#012DiskFileError: Error listing directory /srv/node/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a: [Errno 13] Permission denied: '/srv/node/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a'
May 5 14:17:44 host-10-10-6-34 object-auditor: ERROR Trying to audit /srv/node/device4/objects/215176/306/d222240f67449968145f65edd48ad306: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 173, in failsafe_object_audit#012 self.object_audit(location)#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 191, in object_audit#012 with df.open():#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1025, in open#012 data_file, meta_file, ts_file = self._get_ondisk_file()#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1114, in _get_ondisk_file#012 "Error listing directory %s: %s" % (self._datadir, err))#012DiskFileError: Error listing directory /srv/node/device4/objects/215176/306/d222240f67449968145f65edd48ad306: [Errno 13] Permission denied: '/srv/node/device4/objects/215176/306/d222240f67449968145f65edd48ad306'
May 5 14:17:44 host-10-10-6-34 object-auditor: ERROR Trying to audit /srv/node/device4/objects/79135/961/4d47ee46560698a7a938d225a2357961: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 173, in failsafe_object_audit#012 self.object_audit(location)#012 File "/usr/lib/python2.7/site-packages/swift/obj/auditor.py", line 191, in object_audit#012 with df.open():#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1025, in open#012 data_file, meta_file, ts_file = self._get_ondisk_file()#012 File "/usr/lib/python2.7/site-packages/swift/obj/diskfile.py", line 1114, in _get_ondisk_file#012 "Error listing directory %s: %s" % (self._datadir, err))#012DiskFileError: Error listing directory /srv/node/device4/objects/79135/961/4d47ee46560698a7a938d225a2357961: [Errno 13] Permission denied: '/srv/node/device4/objects/79135/961/4d47ee46560698a7a938d225a2357961'
May 5 14:17:44 host-10-10-6-34 object-auditor: Object audit (ZBF) "forever" mode completed: 0.06s. Total quarantined: 0, Total errors: 3, Total files/sec: 52.40, Total bytes/sec: 0.00, Auditing time: 0.06, Rate: 0.98
as well as:
May 5 14:21:27 host-10-10-6-34 xinetd[519]: START: rsync pid=7319 from=10.10.6.28
May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: name lookup failed for 10.10.6.28: Name or service not known
May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: connect from UNKNOWN (10.10.6.28)
May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: rsync to object/device4/objects/134106 from UNKNOWN (10.10.6.28)
May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: receiving file list
May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: rsync: recv_generator: mkdir "/device4/objects/134106/f2a/82f6a3461bb69f80918a1a508a8bdf2a" (in object) failed: Permission denied (13)
May 5 14:21:27 host-10-10-6-34 rsyncd[7319]: *** Skipping any contents from this failed directory ***
May 5 14:21:28 host-10-10-6-34 rsyncd[7319]: sent 234 bytes received 243 bytes total size 1,048,576
May 5 14:21:28 host-10-10-6-34 xinetd[519]: EXIT: rsync status=0 pid=7319 duration=1(sec)
I have no idea whether these errors are related. I've checked, and the swift user owns the /srv/node folders. When I added the node manually I assumed I had made a mistake somewhere, but since packstack produces the same errors I no longer know what the cause could be. Can someone help me?
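The ownership check could also be done recursively, since the errors above point at directories several levels below /srv/node rather than at the top-level folders. What follows is only a minimal sketch of such a check, not something taken from Swift itself: the function names are hypothetical, and it assumes the account is called "swift" and that the script runs as root so it can read every directory.

```python
import os
import pwd


def unowned_paths_by_uid(root, uid):
    """Return every path at or below root whose owner is not uid.

    Directories that cannot be listed are reported as well, because
    os.walk would otherwise skip them silently.
    """
    problems = []

    def record_unreadable(err):
        # os.walk passes the OSError raised while listing a directory.
        problems.append(err.filename)

    if os.lstat(root).st_uid != uid:
        problems.append(root)
    for dirpath, dirnames, filenames in os.walk(root, onerror=record_unreadable):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are checked themselves, not their targets.
            if os.lstat(path).st_uid != uid:
                problems.append(path)
    return problems


def unowned_paths(root, user):
    """Same check, looking the uid up from a user name (assumed: "swift")."""
    return unowned_paths_by_uid(root, pwd.getpwnam(user).pw_uid)
```

Calling unowned_paths("/srv/node", "swift") on each Storage Node would list anything under the device mount points that the swift user does not own. An empty list would only rule out ownership; it says nothing about mode bits or other access controls.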
Thank you in advance,