Cloudera Distribution for Hadoop Review

Open-source solution for intelligent data management and analysis

What is our primary use case?

We make recommendations to clients for using different models of this solution to handle data intelligently.

How has it helped my organization?

It gives us the opportunity to offer more options to our clients and create better solution models.

What is most valuable?

We find CDSW useful and plan to use it as a one-stop application for model build and training. Currently, we use Zeppelin notebook and we want to gravitate to a single application for notebooks.

What needs improvement?

The Data Science Workbench does not properly support multiple programming languages. We were trying to use Scala and Python for some solutions we wanted to deploy, but they didn't work properly, so we had to come up with workarounds. If the Data Science Workbench supported multiple programming languages, our workflow would be easier and the solutions could be better.

Another area we would like to see improved is integration. For example, we would like to use H2O machine learning, which is an open-source product, and Cloudera doesn't support H2O.

If they could support H2O and also add multi-language support to the Cloudera Data Science Workbench, that would be great. But the biggest thing that would help right now is H2O support.

Finally, one other improvement I would suggest is integrating data privacy software into Cloudera. The product is not quite complete in this respect.

For how long have I used the solution?

We have been using the solution for approximately eight weeks.

What do I think about the stability of the solution?

From a stability point of view, we know that there is a new product coming out that merges Cloudera and Hortonworks. Its proposed name is Unity. We know this means some changes will be happening within the environment, but we don't believe they will be radical changes that affect the existing software we have. It should just add functionality from the Hortonworks integration. We also know that Cloudera support will be available if we need it.

What do I think about the scalability of the solution?

While we have not yet done a lot to scale the solution, we think it is going to be quite scalable because it works on a distributed architecture.

We will probably start with 10 or 15 users once we roll the solution out into production, which will probably be at the end of this week. Afterward, we expect the user base to grow by double-digit percentages. But that is just to start with. Over a few years, we plan to start thinking about rolling out our experiences to our international businesses as well. This would be a substantial increase in user base.

How are customer service and technical support?

At the moment, from what we have been able to experience, technical support seems to be fine. I would rate it between seven and eight out of ten.

Which solution did I use previously and why did I switch?

We did not consider other solutions.

How was the initial setup?

The initial setup was difficult and we didn't like it. That is only because we implemented it with other software solutions outside Cloudera and needed to do the integrations. 

We are still working out problems with some integrations after eight weeks. It's up and running, but we're still optimizing, which is why I would call the setup medium to complex. But that was the situation for us and our particular needs. It may not be as complex for other businesses at all.

What about the implementation team?

We have been working through the implementation with our own team.

Which other solutions did I evaluate?

We did consider other opportunities. Although we are quite comfortable with our current solution, we may look at Hortonworks again, but that is not yet confirmed. We believe, from what we have read and what has been advertised, that Hortonworks and Cloudera are going to eventually merge and become one product. According to some sources, it has already happened.

We're simply trying to get the best of both worlds.

What other advice do I have?

I would rate the product as it currently is at eight out of ten. The score is not higher because of the workarounds we have to do for certain models, since using multiple programming languages is not supported. For example, a single notebook is inflexible if you want to use other programming languages.

As far as other advice for people considering this solution, I would say take a good look at your business need before you decide on this technology and which solution to choose. Make sure that you are not already able to solve for your particular, identified needs using your existing technology before even considering a change.

You want to be sure you're applying the technology to the right business case because of actual need and not just change for change's sake.

Which version of this solution are you currently using?

Enterprise Data Platform
Disclosure: I am a real user, and this review is based on my own experience and opinions.