Cloudera Distribution for Hadoop Reviews

Cloudera Distribution for Hadoop is the #2 ranked solution among our top Hadoop tools. It is rated 3.9 out of 5 stars and is most commonly compared to Amazon EMR: Cloudera Distribution for Hadoop vs Amazon EMR

Real User
BI Manager at Discovery Health
Jul 17 2019

What is most valuable?

We find CDSW useful and plan to use it as a one-stop application for model build and training. Currently, we use Zeppelin notebook and we want to gravitate to a single application for notebooks.

How has it helped my organization?

It gives us the opportunity to offer more options to our clients and create better solution models.

What needs improvement?

The Data Science Workbench doesn't support multiple programming languages, and it needs to. We were trying to use Scala and Python for some solutions we wanted to deploy, but… more »

Which solution did I use previously and why did I switch?

We did not consider other solutions.

What other advice do I have?

I would rate the product as it currently stands an eight out of ten. The score is not higher because of the workarounds that we have to do when it comes to certain models… more »

Which other solutions did I evaluate?

We did consider other opportunities. Although we are quite comfortable with our current solution we may look at Hortonworks again, but that is not yet confirmed. We believe, from what we have read and… more »
Zjaen Coetzee
Real User
Data Management at BCX
Jul 23 2019

What is most valuable?

The feature that we've used quite intensively is Spark, specifically for how it can speed up the processing of some of our data.

What needs improvement?

The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently… more »

What's my experience with pricing, setup cost, and licensing?

The pricing is very competitive. It's not bad.

Which solution did I use previously and why did I switch?

This is our first solution. We tested a bunch of other technologies, but that was our first one and we're still using it.

What other advice do I have?

I would recommend the solution given that they've proven the business case and that they've proven the technology. We have found that if you don't use or address the right business code you end up… more »

Which other solutions did I evaluate?

We considered working with a few other companies, including IBM Bluemix.
Learn what your peers think about Cloudera Distribution for Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: April 2020.
442,986 professionals have used our research since 2012.
NavneetKaur
Real User
Senior Software Engineer at a tech services company with 10,001+ employees
Mar 12 2020

What is most valuable?

The most valuable feature is Impala, the querying engine, which is very fast. We have been able to work with one terabyte of data in less than 20 minutes. The speed makes it easy for us to process all of the data that comes in, in time. The support is very good. All of the data has automatic triple… more »

What needs improvement?

There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon. When we are upgrading CDH, there are many things that need to be upgraded and it would be helpful if it were bundled. As it is now, we have to upgrade many different things separately.

Which solution did I use previously and why did I switch?

We did use another solution prior to this one but it could not keep up with our increase in data.

What other advice do I have?

The suitability of this solution depends on the size of the data that you are going to be working with. If you are going to be working with a huge dataset that contains many gigabytes of data, then this is a good solution. For smaller datasets, you should also consider other technologies. My advice… more »
Mohamed Gomaa
Real User
Data engineer at a tech services company with 11-50 employees
Mar 31 2020

What is most valuable?

Cloudera is always developing new tools and supports a wide range of tools. We also really like the Cloudera community. You can have any question and will have your answer within a few hours. Cloudera is better than other competitors because they acquired Hortonworks.

What needs improvement?

We're processing a huge amount of data on our system. Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. For us, having a big data environment is not optional. Cloudera is trying to adopt new technologies. I think the idea of open source tools is now dominating. So Cloudera has to decide how to deal… more »

What other advice do I have?

In terms of the advice, I would say to focus on what tools are available on the market. In terms of open-source, most companies are delivering open source technologies and providing support to these tools. Now I have the option to purchase a license for whatever platform for $1. I can deliver it with another small company at a lower cost. If I was the decision-maker, I'd invest in open-source… more »
Real User
AD - Associate Director at a financial services firm with 10,001+ employees
Sep 19 2020

What is most valuable?

It allows us to store huge amounts of data, which is an advantage. They have BI (Business Intelligence) tools. There are many AI tools. We are able to connect and analyze the data to get reports. The… more »

How has it helped my organization?

It has been helpful in allowing data storage in one centralized location with data lakes and all of the surrounding applications. All of the data processes are being stored into the Big Data Lake.

What needs improvement?

The performance can be improved. We have experienced some performance issues. It is not as sophisticated as Oracle Sybase. Currently, we are using many other tools such as Spark and Blade Job to… more »

What's my experience with pricing, setup cost, and licensing?

When comparing with Oracle Sybase and SQL, it's cheaper. It's not expensive.

Which solution did I use previously and why did I switch?

Previously we were using Oracle Sybase SQL. We switched because now, we have introduced Big Data.

What other advice do I have?

I am a part of security and software development. We are currently considering migrating to the cloud, and planning on using Microsoft Azure, mainly for the Big Data component. I would rate this… more »
Doron Sela
Real User
DBA team manager at a financial services firm with 1,001-5,000 employees
Jul 16 2019

What is most valuable?

The feature I find most valuable is that the solution is easy to install and to work with. It starts with the installation, and from there on the management is very simple and centralized.

What needs improvement?

I would like to see an improvement in how the solution helps me to handle the whole cluster. For example, when I'm going down to a specific tool, like Kafka, Cloudera Manager doesn't really help me. Then I have to search Google for other Kafka knowledge and tools.

What other advice do I have?

I had a bad experience connecting the Cloudera Distribution for Hadoop cluster to my other resources in the company, like Active Directory or the firewall. I would like the outside environment to be easier to handle. I will rate this eight out of ten because the solution doesn't cover everything. It is a very complicated solution because it contains a lot of internal tools.
Sumit Chaudhuri
Consultant
Lead Consultant - Product Development at FIS (http://www.fisglobal.com/)
Jul 15 2019

What is most valuable?

Keeping multiple copies of each file, and MapReduce tools like Pig and Hive: thanks to their flexibility, it is easy to develop applications with little or almost no knowledge of Java and SQL. And capability… more »

How has it helped my organization?

That is still in the PUC stage. As I have mentioned, our analysts used to do the actuarial work on a spreadsheet, but after the Hadoop implementation they are gaining confidence that the analysis is now more appropriate… more »

What needs improvement?

On the product side, I don't have much to comment. But other upcoming technologies such as RPA, AI, Go, etc. have ample training materials with a variety of use cases, which users can… more »

What's my experience with pricing, setup cost, and licensing?

Which solution did I use previously and why did I switch?

No. When we heard of Hadoop, we tried only that. I mean, we tried to migrate from spreadsheets to Hadoop.

Which other solutions did I evaluate?

Not really.
Consultant
Senior Consultant & Training at a tech services company with 51-200 employees
Jul 17 2019

What is most valuable?

I like the combination of all the tools that allow me to provide solutions and enable me to solve the use cases I'm working on. You need tools or components to foresee everything, and they are all in our emails. Sometimes you try several of them, and sometimes one will work better than the other. So you have to test the tools to see what works for you.

What needs improvement?

We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve. I believe they are working on that.

What other advice do I have?

I will rate this solution a nine out of ten because nothing is ever perfect. You will always face problems, but I'm quite happy with Cloudera.

What is Cloudera Distribution for Hadoop?

Cloudera Distribution for Hadoop is the world's most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. More enterprises have downloaded CDH than all other such distributions combined.
Cloudera Distribution for Hadoop customers
37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC