What is most valuable?
- Ease of configuration
- Bundled ecosystem support
- MapR NFS - especially since we use Docker containers in every public-facing app and a few internal logging ones, too
- Reliability - I feel like I will likely never lose my data, with replication and simple backup methods, but that last one is hard to validate and I hope I never have to.
How has it helped my organization?
We leverage the features it comes with, MapR NFS being a good example. It allows our Docker environment to write directly to HDFS and NoSQL on the MapR cluster without the need for other tools. It also serves clients who need to connect to secured volumes for data ingestion, or who need some resources to crunch numbers.
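To illustrate the idea, here is a minimal sketch of how a containerized app might write to the cluster once the MapR NFS export is mounted into the container. It is only a sketch under assumptions: the mount path `/mapr/my.cluster.com`, the `logs/app.log` file name, and the `append_log_line` helper are all hypothetical and not taken from our setup. The point is that the app uses ordinary file I/O and needs no Hadoop client library.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical mount point: assumes the MapR NFS export has been mounted
# into the container at this path. Adjust to your own environment.
DEFAULT_MOUNT = Path("/mapr/my.cluster.com")


def append_log_line(mount_root, relative_path, line):
    """Append one line to a file under the mount using plain file I/O."""
    target = Path(mount_root) / relative_path
    # Create parent directories if they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "a", encoding="utf-8") as handle:
        handle.write(line.rstrip("\n") + "\n")
        handle.flush()
        # Ask the OS to push the data out to the underlying file system.
        os.fsync(handle.fileno())
    return target


if __name__ == "__main__":
    # A temporary directory stands in for the NFS mount so the sketch
    # runs anywhere; in a container you would pass DEFAULT_MOUNT instead.
    with tempfile.TemporaryDirectory() as fake_mount:
        path = append_log_line(fake_mount, "logs/app.log", "service started")
        print(path.read_text(encoding="utf-8"), end="")
```

Because the function only takes a directory path, the same code can be tested against a local directory and pointed at the real mount in production.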
What needs improvement?
I'd say we've had issues with pricing.
Also, we had CLDB errors with M3 that made it seem a little unstable, but after getting some support with it, we learned how the CLDB propagates information, and haven't had issues since. M7 felt a lot more robust, and I don't recall any CLDB issues there.
For how long have I used the solution?
We've been using it for about a year, and it's been in production for about six to seven months.
What was my experience with deployment of the solution?
We had no real issues with deployment. There is a learning curve involved, of course, but aside from that, there are only the usual deployment considerations.
What do I think about the stability of the solution?
I think that MapR, compared to vanilla Hadoop, gives you a lot less to worry about, which makes it, in my view, a lot more stable.
How are customer service and technical support?
Support is a bit slow to respond to emails, but we are able to call in and get help on the phone in emergencies, which is very useful. The MapR answers site is quite useful, as MapR engineers and architects, including Ted Dunning, the chief architect, are quick to respond to user questions.
Which solution did I use previously and why did I switch?
I've only used MapR in production.
How was the initial setup?
MapR setup is very simple, and with some Linux and a little system admin knowledge, it was quite easy to get set up. It's like knowing how to drive without having to know how the engine works, whereas vanilla Hadoop requires the driver to be a bit more of a mechanic, if that makes sense.
What about the implementation team?
We implemented it in-house, as part of training with a view to getting certified. We had support from MapR for any issues, though, as well as architectural validation for planning the cluster and examining our needs before provisioning.
What's my experience with pricing, setup cost, and licensing?
We have a multi-tenant pricing model for clients using the platform, so we expect to get more back. The distribution is quite expensive, though, so assessing needs and use cases first is crucial to realizing the gains.
Which other solutions did I evaluate?
We tried vanilla Hadoop in tests, as well as Hortonworks and Cloudera sandboxes.
What other advice do I have?
In terms of Hadoop growing pains, there was some pain with the M3 license, compared to the M7 license, which I found largely painless. However, it was nowhere near the amount of pain I had playing with vanilla Hadoop.
MapR is a great distribution, although I have limited experience with other distributions. I know that I have never come across the NameNode issue, and MapR even translates some POSIX calls to HDFS and abstracts away a lot of the complexity. It's enterprise ready, and comes with a host of features that really simplify some scenarios.
For example, MapR NFS provides great flexibility when it comes to connecting to the cluster and ingesting data. It also has true multi-tenancy, which allows clients to trust our platform with their data, knowing that there are several layers of security, including encryption, authentication, and volume access control.
I would recommend using the free MapR training resources and posting on MapR answers. The sandbox is a great place to start, and also there is pretty extensive documentation on their site.