[Rdo-list] read only volumes when glusterfs fails

Elías David elias.moreno.tec at gmail.com
Fri Sep 12 02:14:55 UTC 2014


Thank you, JuanFra!

I've read the link you posted; however, I'm still not completely clear
on it.

As far as I know, in the case of a ping timeout, once the server is
accessible again a reconnect is issued: the client starts talking to
the server again, the lock tables are updated, and after that
operations are carried out normally.
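
To be concrete about what I took away from Joe's post: the hang window
seems to be governed by the volume's network.ping-timeout option (42
seconds by default, if I've read it right). Something like the
following is what I had in mind -- 'cinder-volumes' here is just a
placeholder for my actual volume name:

    # shorten the window before the client gives up on a dead server
    gluster volume set cinder-volumes network.ping-timeout 10

    # the changed value shows up under "Options Reconfigured"
    gluster volume info cinder-volumes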

Since the reconnect does happen but operation doesn't return to normal
(read-write access before, read-only access after), that brings me to
the next point: ext4.

My bricks are formatted with XFS; however, the volume that Cinder
attaches to the instance was formatted as ext4. Given that going
read-only seems to be ext4's way of dealing with losing access to the
disk, could I avoid this issue by mounting the ext4 partition with
'errors=continue'? And what would happen if I formatted that same
partition as XFS instead, do you know?
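
In case it helps to see what I mean, this is roughly what I'd try
inside the guest -- /dev/vdb is just a stand-in for whatever device the
Cinder volume appears as:

    # one-off: remount with the relaxed error policy
    mount -o remount,errors=continue /dev/vdb /mnt/data

    # or store 'continue' as the default policy in the superblock
    tune2fs -e continue /dev/vdb

    # fstab equivalent
    /dev/vdb  /mnt/data  ext4  defaults,errors=continue  0 2

Though from what I've read, errors=continue only changes how ext4
reacts to errors it detects in its own metadata; if the underlying I/O
keeps failing, writes would still fail either way.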

On Thu, Sep 11, 2014 at 9:59 AM, JuanFra Rodriguez Cardoso <
juanfra.rodriguez.cardoso at gmail.com> wrote:

> Hi Elias:
>
> This post by Joe Julian may help you solve that trouble:
>
>
> http://joejulian.name/blog/keeping-your-vms-from-going-read-only-when-encountering-a-ping-timeout-in-glusterfs/
>
> Regards,
> ---
> JuanFra Rodriguez Cardoso
>
> 2014-09-11 3:21 GMT+02:00 Elías David <elias.moreno.tec at gmail.com>:
>
>> Hello,
>>
>> I'm seeing a consistent behaviour in my OpenStack deployment
>> (libvirt/KVM) with Cinder using GlusterFS, and I'm having trouble
>> finding the real cause, or even telling whether it's abnormal at all.
>>
>> I have configured Cinder to use GlusterFS as the storage backend; the
>> volume is a replica 2 across 8 disks on 2 servers, and several
>> Cinder-provided volumes are attached to several instances. The problem
>> is this: it's not uncommon for one of the Gluster servers to reboot
>> suddenly due to power failures (an infrastructure problem that is
>> unavoidable right now). When this happens, the instances start to see
>> the attached volume as read-only, which forces me to hard reboot each
>> instance so it can access the volume normally again.
>>
>> Here are my doubts: the Gluster volume is created in such a way that
>> no replica lives on the same server as its counterpart, so if I lose
>> one server to a hardware failure, the other remains usable. I don't
>> really understand why the instances can't simply use the replica brick
>> when one of the servers reboots.
>>
>> Also, why is it that during a GlusterFS failure the data is still
>> there and can be read, but can't be written to? Is this a problem with
>> my deployment? A configuration error on my part? Something known to
>> OpenStack? A Cinder thing? libvirt? GlusterFS?
>>
>> Having to hard reboot the instances is not a big issue right now, but
>> I'd nevertheless like to understand what's happening and whether I can
>> avoid it.
>>
>> Some specifics:
>>
>> - GlusterFS version: 3.5
>> - All systems: CentOS 6.5
>> - OpenStack: Icehouse, installed with Packstack/RDO
>>
>> Thanks in advance!
>>
>> --
>> Elías David.
>>
>


-- 
Elías David.