- Ease of configuration
- Bundled ecosystem support
- MapR NFS - especially since we use Docker containers in every public-facing app and a few internal logging ones, too
- Reliability - I feel like I will likely never lose my data, with replication and simple backup methods, but that last one is hard to validate and I hope I never have to.
Improvements to My Organization
We leverage the features it comes with, MapR NFS for example. It allows our Docker environment to write directly to HDFS and NoSQL on the MapR cluster without the need for other tools. It also lets clients connect to secured volumes for data ingestion, or even use some resources to crunch some numbers.
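To give a rough idea of what "without the need for other tools" means in practice: once the cluster is NFS-mounted inside a container, ingestion can be ordinary file I/O. The sketch below is only an illustration, not our actual code. The function name `append_record`, the `MAPR_MOUNT` environment variable and the `ingest/app.log` path are hypothetical, and the mount point (often something like `/mapr/<cluster-name>`) depends on how the volume is mounted in your setup.

```python
import os
import tempfile
from pathlib import Path


def append_record(mount_root, relative_path, record):
    """Append one text record to a file under a mounted volume.

    mount_root is assumed to be wherever the cluster is NFS-mounted
    inside the container; nothing here is MapR-specific, it is plain
    POSIX-style file I/O.
    """
    target = Path(mount_root) / relative_path
    # Create intermediate directories on the mounted volume if needed.
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "a", encoding="utf-8") as handle:
        # Normalise to exactly one trailing newline per record.
        handle.write(record.rstrip("\n") + "\n")
        handle.flush()
        # Ask the OS to push the data out before returning.
        os.fsync(handle.fileno())
    return target


if __name__ == "__main__":
    # MAPR_MOUNT is a hypothetical variable; fall back to a temp dir
    # so the sketch runs on a machine without any cluster mounted.
    root = os.environ.get("MAPR_MOUNT") or tempfile.mkdtemp()
    path = append_record(root, "ingest/app.log", "container started")
    print(f"wrote to {path}")
```

The point is that the application only needs a path; any access control on the volume is enforced on the cluster side rather than in the client code.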
Room for Improvement
I'd say we've had issues with pricing.
Also, we had CLDB errors with M3 that made it seem a little unstable, but after getting some support with it, we learned how the CLDB propagates information, and haven't had issues since. M7 felt a lot more robust, and I don't recall any CLDB issues there.
Use of Solution
We've been using it for about a year, and it's been in production for about six to seven years.
We did not really have any deployment issues. There is a learning curve involved, of course, but aside from that, there are only the usual deployment considerations.
I think that with MapR, compared to vanilla Hadoop, there is a lot less to worry about, which, in my view, makes it a lot more stable.
Customer Service and Technical Support
Support is a bit slow to respond to emails, but we are able to call in and get help on the phone in emergencies, which is very useful. The MapR answers site is quite useful, as MapR engineers and architects, including Ted Dunning, the chief architect, are quick to respond to user questions.
I've only used MapR in production.
MapR setup is very simple, and with some Linux and a little system admin knowledge, it was quite easy to get set up. It's like knowing how to drive and not having to know how the engine works. Vanilla Hadoop, on the other hand, requires the driver to be a bit more of a mechanic, if that makes sense.
We implemented it in-house, as part of training with a view to getting certified. We had support from MapR for any issues, though, as well as architectural validation for planning the cluster and examining our needs before provisioning.
Pricing, Setup Cost and Licensing
We have a multi-tenant pricing model for clients using the platform, so we expect to get more back. However, the distribution is quite expensive, so assessing needs and use cases first is crucial to realizing the gains.
Other Solutions Considered
We tried vanilla Hadoop in tests, as well as Hortonworks and Cloudera sandboxes.
In terms of Hadoop growing pains, there was some pain, compared to the M7 license, which I found largely painless. However, it was nowhere near the amount of pain I had playing with vanilla Hadoop.
MapR is a great distribution, although I have limited experience with other distributions. I know that I have never come across the NameNode issue, and MapR even translates some POSIX calls to HDFS and abstracts away a lot of the complexity. It's enterprise ready, and comes with a host of features that really simplify some scenarios.
For example, MapR NFS provides great flexibility when it comes to connecting to the cluster and ingesting data. It also has true multi-tenancy, which allows clients to trust our platform with their data, knowing that there are several layers of security, including encryption, authentication, and volume access control.
I would recommend using the free MapR training resources and posting on MapR answers. The sandbox is a great place to start, and also there is pretty extensive documentation on their site.
Disclosure: My company has a business relationship with this vendor other than being a customer: We partnered with MapR earlier in the year to offer an analytics solution built on top of MapR. However, we pay for support.
Nov 24 2015