GFS, Xen and bonding
If you’re trying to use a GFS2/Xen combo on a bridged network, you may find this post useful.
For the past few months, I’ve been setting up a SaaS infrastructure for my company. This required a certain level of resilience, some of it achieved with the help of GFS.
Let me quote Wikipedia to get you up to speed: […]
*The Global File System (GFS) is a shared disk file system for Linux computer clusters. […]
GFS and GFS2 differ from distributed file systems (such as AFS, Coda, or InterMezzo) because they allow all nodes to have direct concurrent access to the same shared block storage. In addition, GFS or GFS2 can also be used as a local filesystem.*
Long story short, you can concurrently access one file system from several different nodes. This is exactly what we want: our back-end is started from several nodes, and should one of the nodes crash, another one will take over. GFS guarantees that the file system is never corrupted in the process. Setting up GFS is rather well documented, so this shouldn’t be a problem.
Did you sense a “but”? Spot on! We also went for Xen virtualization, and that gets problematic once you run GFS on top of it. Here’s the thing: our dom0s host some VMs (duh), and those VMs use GFS. Nothing surprising so far. The network interfaces on the domUs need to go through the physical interfaces of the dom0, so we set up Xen with network bridging. And just to make it interesting, we use bonding on the physical host (did I mention resilience?). This is where the whole thing starts to act up: the GFS daemons (cman and others) complain about losing packets, the cluster never becomes quorate again, etc. Of course, Red Hat support will blame it on your network, saying that the switch is either overloaded or doesn’t let multicast through. Yeah, right.
The real reason is: they NEVER tested the Xen/bonding/GFS combo. Just take a look at the start-up sequence in /etc/rc3.d: first the interfaces are brought up, then GFS is started (cman, rgmanager, etc.), and only THEN is Xen started. This looks fine, but let’s not forget that Xen, in bridging mode, renames bond0 to pbond0 and then builds its bridges, which disrupts communication for the already-running GFS daemons: they’ll get hella confused.
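If you want to check the ordering on your own box, here is a minimal Python sketch. It assumes the usual SysV convention that the S-prefixed entries of a runlevel directory are started in sorted name order. The entry names and priority numbers in EXAMPLE_RC3 are made up for illustration and will differ on your system, and start_order and started_before are just helper names I picked.

```python
import re

# Hypothetical contents of /etc/rc3.d; the priority numbers are illustrative
# only and are not taken from any particular distribution.
EXAMPLE_RC3 = ["S98xend", "S10network", "K74ntpd", "S21cman", "S26gfs"]

_START_ENTRY = re.compile(r"^S(\d+)(.+)$")


def start_order(entries):
    """Return service names in the order the S* entries would be started."""
    started = sorted(name for name in entries if _START_ENTRY.match(name))
    # Strip the "S<priority>" prefix to keep only the service name.
    return [_START_ENTRY.match(name).group(2) for name in started]


def started_before(entries, first, second):
    """True if service `first` is started earlier than service `second`."""
    order = start_order(entries)
    return order.index(first) < order.index(second)


print(start_order(EXAMPLE_RC3))
print(started_before(EXAMPLE_RC3, "cman", "xend"))
```

On a real host you would feed it the directory listing instead, e.g. `start_order(os.listdir("/etc/rc3.d"))` after an `import os`. If cman comes out ahead of xend, you have the ordering described above.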
The solution is simple: disable the GFS daemons in the regular start-up sequence and start them at the very end instead (say, from rc.local), once Xen has finished building its bridges. Problem solved! I will make a tutorial out of this; I just wanted to drop a quick note right away to share it.
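Until that tutorial exists, here is a rough Python sketch of what the workaround boils down to. It only prints the commands, it does not run anything. It assumes a Red Hat style system with the chkconfig and service tools. The service list is limited to the two daemons named above (your cluster probably runs more), and the function names are mine.

```python
# Assumption: only the daemons named in this post; extend the list to match
# whatever cluster services your own set-up actually starts.
CLUSTER_SERVICES = ["cman", "rgmanager"]


def disable_commands(services):
    """Commands that remove the services from the normal boot sequence."""
    return ["chkconfig %s off" % name for name in services]


def rc_local_lines(services):
    """Lines to append to rc.local so the services start last, in list order."""
    return ["service %s start" % name for name in services]


print("# run once as root:")
for command in disable_commands(CLUSTER_SERVICES):
    print(command)

print("# append to rc.local:")
for line in rc_local_lines(CLUSTER_SERVICES):
    print(line)
```

The order of the list matters: the daemons are started in the order given, so keep cman first.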