Clustered/Distributed Network Filesystems, Which Ones Live Up to the Hype?
I've tried to find a good, sensible solution to cluster with. Each technology has its pros and cons, there is no perfect solution, and I've found a lot of "exaggerations" about the applications, benefits and performance of these different filesystems.
I first started off with DRBD and I have to say it does live up to the hype. It is quite reliable, although keeping the kernel module and the userspace tools in sync can be annoying: they must match, and a kernel upgrade can leave you with a version mismatch that brings down your whole cluster if you don't plan and test ahead of time. The performance is excellent (it can use 100% of the network bandwidth if you choose). I like that, in its asynchronous mode, it writes locally first and then transfers the data over the network. It gives you a blend of essentially native performance and the redundancy you require, and it even lets you control the transfer rate.
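The version-matching problem is easy to check for before a kernel upgrade. Here is a minimal Python sketch of such a check. The function names are my own, the sample strings are only illustrative of the kind of text a module and its tools report, and the rule that matching major.minor is "good enough" is an assumption, so check the DRBD release notes for the real compatibility rules.

```python
import re


def parse_version(text):
    """Return (major, minor, patch) from the first x.y.z number found in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def versions_compatible(kernel_text, userland_text):
    """Assumed rule: module and tools are a match when major.minor agree."""
    return parse_version(kernel_text)[:2] == parse_version(userland_text)[:2]


# Illustrative strings only; paste in whatever your module and tools report.
kernel_line = "version: 8.3.11 (api:88/proto:86-96)"
tools_line = "DRBDADM_VERSION=8.4.3"

if versions_compatible(kernel_line, tools_line):
    print("kernel module and userspace tools look matched")
else:
    print("MISMATCH: do not upgrade or restart until this is sorted out")
```

Running something like this on a test node after installing the new kernel, and before touching the production cluster, is the kind of planning ahead I mean.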
The drawbacks include the one I already mentioned: the userspace tools and the kernel module must match, and upgrading one or the other (which often causes a version mismatch) will bring your cluster down if you aren't careful to plan ahead. A lot of the time you have to compile something manually, whether it's the kernel module or the userspace tools. It also does not handle "split-brain" in an easy and obvious way. The documentation is generally quite good and useful, but it left me high and dry when I went searching for the best way to handle split-brain.
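For what it's worth, the manual split-brain recovery I eventually pieced together looks roughly like the sketch below. It only builds and prints the commands rather than running them. It assumes DRBD 8.4 syntax (on 8.3 the discard step is written `drbdadm -- --discard-my-data connect <resource>`), the resource name `r0` is hypothetical, and you have to decide for yourself which node's changes get thrown away.

```python
def split_brain_recovery_commands(resource):
    """Return the drbdadm commands to run on each side, as argv lists.

    "victim" is the node whose changes will be discarded;
    "survivor" is the node whose data is kept.
    """
    victim = [
        ["drbdadm", "disconnect", resource],
        ["drbdadm", "secondary", resource],
        # Assumed DRBD 8.4 syntax; this throws away the victim's changes.
        ["drbdadm", "connect", "--discard-my-data", resource],
    ]
    survivor = [
        # Only needed if the survivor has dropped to StandAlone.
        ["drbdadm", "connect", resource],
    ]
    return {"victim": victim, "survivor": survivor}


plan = split_brain_recovery_commands("r0")  # "r0" is a hypothetical resource
for node, commands in plan.items():
    print(f"on the {node}:")
    for argv in commands:
        print("   ", " ".join(argv))
```

I deliberately print the commands instead of executing them, because picking the wrong victim destroys data and that decision should stay with a human.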
DRBD's other drawback/annoyance is that you must dedicate a full partition (or other block device) to it, which is often not the preferred or ideal solution.
The other large benefit is that it behaves like a native/local filesystem, and I've never had any application issue when using it for any kind of VPS use (many others, such as OCFS2 and GlusterFS, do have their quirks with virtualization).
Another drawback of DRBD is that it ONLY knows how to be RAID1; you can't aggregate your storage. But it's a great way to mirror data between different sites or for redundancy purposes, and it works perfectly if a network RAID1 solution is what you need. I'd say it's the most robust and polished for what it does.
OCFS2 isn't that hard to configure, but what's the point of it? As far as I can tell it behaves like a clustered/aggregated RAID0 filesystem: it has no redundancy of its own, so any mirroring has to come from the storage underneath it. I almost don't understand the point of it, unless you need vast amounts of storage without any regard for data integrity. Supposedly OCFS2 is self-healing, which is good, but to my mind it's not really enterprise-ready without any built-in redundancy options.
OCFS2 also crashed when I used it with things like VMware. I tried all kinds of workarounds but kept getting crashes. It's not a full-featured filesystem, and VMware seems to sense this.
I also dislike all the different services associated with it. If something breaks, you have to figure out in what order to reload the kernel modules and the two services that go with them. This makes it a bit stressful and unnecessarily complex.
I will say it could be a great solution for a massive online RAID0-type array if downtime and redundancy aren't of any concern (maybe for Photobucket-type places).
I still don't see the relevance of OCFS2. It seems overly complicated, with no special features or anything at all that would prompt me to use it, but at least I can say I tried it and played around with it.
I'd say GlusterFS is pretty good, but it has always been overrated and overhyped. There are often performance issues with it, and the documentation is sparse and cryptic at best.
It has been touted as something you should run VPSes on, but I've tried this with poor results. Things are improving, but a number of years after its release it still hasn't fully matured and has a long way to go.
However, with a little tweaking I think GlusterFS may be the most practical all-around solution for almost any use. You can set it up as RAID 0+1 or in almost any other way imaginable, and you can aggregate the storage bricks.
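To make the difference between mirroring, aggregating, and the RAID 0+1 style concrete, here is a small Python sketch of the usable capacity each layout gives you. It is an idealized model of my own: it ignores metadata overhead, the function names and brick sizes are made up, and it assumes bricks are grouped into replica sets in the order listed.

```python
def mirrored_capacity(brick_sizes_gb):
    """RAID1 style: every brick holds a full copy, so the smallest brick is the limit."""
    return min(brick_sizes_gb)


def aggregated_capacity(brick_sizes_gb):
    """RAID0 style: bricks are simply added together, with no redundancy."""
    return sum(brick_sizes_gb)


def distributed_replicated_capacity(brick_sizes_gb, replica):
    """RAID 0+1 style: consecutive groups of `replica` bricks mirror each other,
    and the mirrored groups are then added together."""
    if replica < 1 or len(brick_sizes_gb) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return sum(
        min(brick_sizes_gb[i:i + replica])
        for i in range(0, len(brick_sizes_gb), replica)
    )


bricks = [500, 500, 1000, 1000]  # hypothetical brick sizes in GB
print("mirror everything:     ", mirrored_capacity(bricks), "GB")
print("aggregate everything:  ", aggregated_capacity(bricks), "GB")
print("2-way mirrors, summed: ", distributed_replicated_capacity(bricks, 2), "GB")
```

With those four bricks, a pure mirror gives you 500 GB, pure aggregation gives 3000 GB with nothing protected, and the 0+1 style lands in between at 1500 GB with every file stored twice.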
Best of all, you can use your existing filesystem and space; there's no need to dedicate a partition to it. The bad thing is that there's no way to control transfer rates as you can with DRBD, and there doesn't seem to be a way to control how data is written.
I'd prefer to have an option like DRBD's, where the write is done locally and then transferred over the wire; this would make it possible and practical for VPS solutions.
The other issue is the lack of mmap support, and for some other reason checkpointing VPSes does not work because of the underlying filesystem structure.
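If you want to check whether a given mount handles mmap before trusting applications to it, a probe along these lines should do. It is a minimal Python sketch: the function name is my own, and it only tests one page of shared, writable mapping, so passing it does not guarantee that every application will be happy.

```python
import mmap
import os
import tempfile


def supports_shared_mmap(directory):
    """Return True if a file in `directory` can be mapped shared and writable,
    and a write made through the mapping is visible to a normal read()."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"\0" * mmap.PAGESIZE)
        try:
            with mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_WRITE) as mapping:
                mapping[:4] = b"test"
                mapping.flush()
        except (OSError, ValueError):
            # The filesystem refused the mapping or the flush.
            return False
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 4) == b"test"
    finally:
        os.close(fd)
        os.unlink(path)


# Point this at your cluster mount instead; the temp directory is used here
# only so the sketch runs anywhere.
print(supports_shared_mmap(tempfile.gettempdir()))
```

On a local disk this should report success; the interesting case is pointing it at the network filesystem mount you plan to put VPS images on.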
I think I'm being hard on GlusterFS because it's only starting to come close to living up to its hype, and it has a lot of common-sense, fundamental issues that should have been addressed immediately and still haven't been resolved.
With that said, if GlusterFS can provide better documentation, performance and flexibility (DRBD-style delayed writes), then I think it could come out on top as the one-stop solution that is superior in every way.