Cloudera Distribution for Hadoop Review

For the clusters using CM, we are able to more tightly control and manage the configuration of all nodes in the clusters. However, HBase 1.0 has stability issues, and processing speed needs improvement.


What is most valuable?

  • Cluster rolling restarts 
  • Cluster-wide configuration management

How has it helped my organization?

For the clusters using CM, we are able to more tightly control and manage the configuration of all nodes in the clusters. 

We are currently running six production clusters totaling 900+ nodes, and are building three more clusters. If someone has a custom configuration on a node that they haven’t communicated, I can ignore that configuration and bring the node into line with how we’ve decided to run the cluster, which is very beneficial.
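The idea of spotting undeclared per-node settings can be sketched as a simple diff against the agreed cluster configuration. This is only an illustration in Python, not how Cloudera Manager works internally; the function name `config_drift` and the example keys and values are hypothetical.

```python
def config_drift(desired, actual):
    """Return {key: (actual_value, desired_value)} for every key that differs.

    A value of None means the key is absent on that side.
    """
    drift = {}
    for key in sorted(set(desired) | set(actual)):
        have = actual.get(key)
        want = desired.get(key)
        if have != want:
            drift[key] = (have, want)
    return drift


# Hypothetical example values, for illustration only.
desired = {"dfs.replication": "3", "hbase.regionserver.handler.count": "30"}
node = {
    "dfs.replication": "2",
    "hbase.regionserver.handler.count": "30",
    "custom.flag": "on",
}

drift = config_drift(desired, node)
for key, (have, want) in drift.items():
    print(f"{key}: node has {have!r}, cluster wants {want!r}")
```

Anything reported here is a setting that would be overwritten or dropped when the node is brought back to the cluster-wide configuration.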

What needs improvement?

HBase 1.0 stability and processing speed are major areas for improvement. Right now, our Cloudera 5 clusters run four to seven times slower than our Cloudera 4 clusters with our Storm and Kafka topologies, which makes real-time processing a major challenge.

CM’s API is very limited and difficult to use when multiple clusters are managed in the same CM instance.
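For context, working with several clusters behind one CM instance generally means walking the REST API cluster by cluster. The following is a rough Python sketch, assuming the `/clusters` and `/clusters/{name}/services` endpoints and an API version of `v10`; the version, host name, and helper names are assumptions, and authentication is left out.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_VERSION = "v10"  # assumption: substitute the version your CM instance supports


def services_by_cluster(base_url, fetch):
    """Map each cluster name to the list of its service names.

    `fetch` takes a URL and returns the decoded JSON body as a dict.
    """
    root = f"{base_url.rstrip('/')}/api/{API_VERSION}/clusters"
    result = {}
    for cluster in fetch(root)["items"]:
        name = cluster["name"]
        # Cluster names may contain spaces, so they are percent-encoded.
        services = fetch(f"{root}/{quote(name, safe='')}/services")["items"]
        result[name] = [svc["name"] for svc in services]
    return result


def http_fetch(url):
    """Plain unauthenticated GET; a real CM instance would also need credentials."""
    with urlopen(url) as resp:
        return json.load(resp)
```

A call such as `services_by_cluster("http://cm-host:7180", http_fetch)` (host name hypothetical) would issue one request per cluster, which is the kind of per-cluster bookkeeping that grows as more clusters share a CM instance.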

For how long have I used the solution?

We've used it for approximately two years. We also use Cloudera Manager, which I would rate 6/10.

What was my experience with deployment of the solution?

No issues encountered.

What do I think about the stability of the solution?

Cloudera 5 is currently very unstable. Between two Cloudera 5 clusters, we have an incident at least twice a week due to what are now outstanding bugs.

What do I think about the scalability of the solution?

It's very easy to deploy and scale as large as you want. However, once created, the CM management cluster is difficult to scale up as you add more clusters to the same CM instance.

Which solution did I use previously and why did I switch?

No previous solution was used.

How was the initial setup?

We were already running one production cluster with approximately 75 nodes when I joined, so I’m not familiar with what was needed to get the initial production cluster up. Once I joined, I assisted in standing up the additional nodes and clusters using our Chef automation.

What about the implementation team?

In-house, via Chef automation. Chef, or similar systems, make it much simpler to stand up large-scale clusters. That said, I have not used or evaluated vendor team implementation methods.

Disclosure: I am a real user, and this review is based on my own experience and opinions.