Hello,

I'm seeing a consistent behaviour with my OpenStack deployment (libvirt/KVM) and Cinder using GlusterFS, and I'm having trouble finding the real cause, or working out whether it's abnormal at all.

I have configured Cinder to use GlusterFS as its storage backend. The Gluster volume is a replica 2 across 8 disks on 2 servers, and Cinder provides several volumes attached to several instances. The problem is this: it is not uncommon for one of the Gluster servers to reboot suddenly due to power failures (an infrastructure problem that is unavoidable right now). When this happens, the instances start to see the attached volume as read-only, which forces me to hard reboot the instance so it can access the volume normally again.

Here are my questions. The Gluster volume is created in such a way that no replica is on the same server as the master, so if I lose a server due to hardware failure, the other is still usable. I don't really understand why the instances can't just use the replica brick when one of the servers reboots.

Also, why is the data still there and readable, but not writable, after a GlusterFS failure? Is this a problem with my implementation? A configuration error on my part? Something known in OpenStack? A Cinder thing? libvirt? GlusterFS?

Having to hard reboot the instances is not a big issue right now, but I still want to understand what's happening and whether I can avoid it.

Some specifics:

GlusterFS version is 3.5
All systems are CentOS 6.5
OpenStack version is Icehouse, installed with packstack/RDO

Thanks in advance!


--
Elías David.