LizardFS stores data across several nodes (which run on commodity hardware, even inside virtual machines) and has a built-in HA mechanism. This works very well if failover can be handled manually or in a simple way (e.g. by using Pacemaker). There is also an automated built-in failover mechanism if you are willing to pay for a support contract (which is not expensive).
Update: After testing the commercial failover mechanism, which comes with the support contract or a software demo, I can say that it works as expected. It is easy to manage and just seems to work. To be fair, I have not used it in production yet, only in tests, but it looks very good.
Improvements to My Organization
We have been running LizardFS for a while now, but only inside a testing environment. It will be deployed to production soon, where it will make specific files available in several datacenters in a very fast and efficient way. Normally you would buy a very expensive enterprise storage solution, but this open source software simply does the job well enough.
In addition, we have been running LizardFS in a second pilot environment for a couple of months now. It stores over one million files across several servers located in different datacenters. So far, this setup has also been rock-solid, just as you would expect.
Room for Improvement
If you don't have a support contract and therefore don't have access to the automated failover mechanism, you need to build it yourself. This can be achieved with common open source software, but you need to know what you are doing. It can be a painful process, since you will have to write some scripts on your own (e.g. an OCF agent if you decide to use Pacemaker and CRM).
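To give a feel for what such a self-written agent involves, here is a minimal Python sketch of the action dispatch that an OCF resource agent performs. The exit codes are the standard OCF ones. The `MasterHooks` callables are hypothetical placeholders for whatever commands start, stop and probe the LizardFS master in your own setup; nothing in this sketch is taken from LizardFS itself.

```python
from dataclasses import dataclass
from typing import Callable

# Standard OCF resource agent exit codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


@dataclass
class MasterHooks:
    """Hypothetical callables wrapping your own start/stop/probe commands."""
    start: Callable[[], bool]
    stop: Callable[[], bool]
    is_running: Callable[[], bool]


def dispatch(action: str, hooks: MasterHooks) -> int:
    """Map an OCF action name to an OCF exit code."""
    if action == "monitor":
        return OCF_SUCCESS if hooks.is_running() else OCF_NOT_RUNNING
    if action == "start":
        if hooks.is_running():
            return OCF_SUCCESS  # starting a running resource must succeed
        return OCF_SUCCESS if hooks.start() else OCF_ERR_GENERIC
    if action == "stop":
        if not hooks.is_running():
            return OCF_SUCCESS  # stopping a stopped resource must succeed
        return OCF_SUCCESS if hooks.stop() else OCF_ERR_GENERIC
    return OCF_ERR_UNIMPLEMENTED
```

A real agent would additionally have to answer the `meta-data` action with an XML description of itself, and for a master/standby setup it would probably need promote and demote actions as well, which is where most of the effort goes.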
Use of Solution
I have been using it for 9 months now.
No, the deployment is very easy if you are able to find the online documentation.
Not yet. Even when we went through several edge-case test scenarios LizardFS simply worked as expected.
Not yet, and I do not expect to, since this piece of software is meant to scale.
Customer Service and Technical Support
Support responds very quickly and gives very helpful advice. So far, everything has been fine.
I was using GlusterFS for months (only inside test environments), but it turned out that GlusterFS fails under our specific load scenario. We had issues with file consistency and the encryption feature.
It was very easy, but the detailed configuration is difficult, since the online documentation does not explain every config option well.
Update: The official documentation was just released and answered many of my personal questions.
We implemented it ourselves; however, the developers behind LizardFS sent someone to review and tune our setup just to be sure. His hints were really helpful.
Pricing, Setup Cost and Licensing
Forget the price, it is worth it.
Other Solutions Considered
GlusterFS, MooseFS and many others: all failed our tests, either because of the way they are designed or because of bugs.
In addition, we also had a look at Ceph (which is an object storage with CephFS as some sort of emulated filesystem), but the part we need is not yet ready for production.
Make sure to test LizardFS under real-world workloads and check whether it works well enough for your needs. So far, I have found it to be the best distributed storage solution I have ever worked with. You should also define in which cases a failover of the master server should happen and how it should happen.
Furthermore, you should save yourself the trouble of building your own HA construct. I highly recommend at least having a look at their commercial HA solution.
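As an illustration of what defining "in which cases a failover shall happen" can mean in practice, here is a small Python sketch of one possible rule: only fail over after several consecutive failed health checks, so that a single lost probe does not move the master. The class name and the threshold of 3 are assumptions made up for this example; they do not describe how LizardFS or its commercial HA solution decides.

```python
from dataclasses import dataclass


@dataclass
class FailoverPolicy:
    """Trigger failover only after several consecutive failed health checks."""
    max_failures: int = 3  # hypothetical threshold, tune for your environment
    failures: int = 0      # consecutive failed checks seen so far

    def record(self, master_healthy: bool) -> bool:
        """Record one health-check result; return True if failover should start."""
        if master_healthy:
            self.failures = 0  # any successful check resets the counter
            return False
        self.failures += 1
        return self.failures >= self.max_failures
```

You would call `record()` once per check interval with the result of your own probe. How the failover itself is carried out is a separate decision and is not covered by this sketch.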
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Aug 23 2016