Cloudera Distribution for Hadoop Overview

Cloudera Distribution for Hadoop is the #2 ranked solution in top Hadoop tools and top NoSQL Databases. IT Central Station users give Cloudera Distribution for Hadoop an average rating of 8 out of 10. Cloudera Distribution for Hadoop is most commonly compared to HPE Ezmeral Data Fabric. The top industry researching this solution is computer software, with professionals from computer software companies accounting for 28% of all views.
What is Cloudera Distribution for Hadoop?
Cloudera Distribution for Hadoop is the world's most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. More enterprises have downloaded CDH than all other such distributions combined.
Buyer's Guide

Download the Hadoop Buyer's Guide including reviews and more. Updated: November 2021

Cloudera Distribution for Hadoop Customers
37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
Pricing Advice

What users are saying about Cloudera Distribution for Hadoop pricing:
  • "When comparing with Oracle Sybase and SQL, it's cheaper. It's not expensive."
  • "The price could be better for the product."

Cloudera Distribution for Hadoop Reviews

NavneetKaur
Senior Software Engineer at a tech services company with 10,001+ employees
Real User
Top 10
Performs well and the technical support is helpful, but the upgrade process needs to be consolidated

Pros and Cons

  • "The most valuable feature is Impala, the querying engine, which is very fast."
  • "There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon."

What is our primary use case?

We are dealing with data from the telecom industry. We were using an Oracle system but our volume has increased. We now have a lot of real-time data that needs to be transformed so that it can be made available and used.

What is most valuable?

The most valuable feature is Impala, the querying engine, which is very fast. We have been able to work with one terabyte of data in less than 20 minutes. The speed makes it easy for us to process all of the data that comes in, in time.

The support is very good.

All of the data has automatic triple replication in order to secure integrity.

What needs improvement?

There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon.

When we are upgrading CDH, there are many things that need to be upgraded and it would be helpful if it were bundled. As it is now, we have to upgrade many different things separately.

For how long have I used the solution?

I have been working with the Cloudera Distribution for Hadoop for around two years.

What do I think about the stability of the solution?

It is a stable solution.

What do I think about the scalability of the solution?

The scalability is good and it works on commodity hardware. One of the problems we have right now is that there is a lot of data and we're moving it from our Oracle solution. This means that there is a double cost, in terms of storage, during our transition to working with big data.

We are using a data lake that is a store for all of the data in our organization. There are more than 25 projects, with between 25 and 30 people in each one, for a total of almost 1,000 people. All of them are dependent on this solution.

Most of our users are technicians who have problems to solve using the data available to them. A couple of them are data scientists and the remainder are upper management, who do the analysis.

How are customer service and technical support?

The technical support is very good. Whenever we open a ticket, we get support right away.

Which solution did I use previously and why did I switch?

We did use another solution prior to this one but it could not keep up with our increase in data.

What other advice do I have?

The suitability of this solution depends on the size of the data that you are going to be working with. If you are going to be working with a huge dataset that contains many gigabytes of data, then this is a good solution. For smaller datasets, you should also consider other technologies.

My advice for anybody who is implementing this solution is to take some time to learn it. Beyond that, be sure to contact support if you have any problems because they are very helpful.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
MG
Data engineer at a tech services company with 11-50 employees
Real User
Top 20
Supports a wide range of tools and has a good support community

Pros and Cons

  • "We also really like the Cloudera community. You can have any question and will have your answer within a few hours."
  • "Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. Having a big data environment is not optional for us."

What is our primary use case?

Our primary use case for this solution is hosting a large amount of data on our platform, along with the processing and analysis of that data.

What is most valuable?

Cloudera is always developing new tools and supports a wide range of tools. We also really like the Cloudera community. You can have any question and will have your answer within a few hours. Cloudera is better than other competitors because they acquired Hortonworks.

What needs improvement?

We're processing a huge amount of data on our system. Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. Having a big data environment is not optional for us. Cloudera is trying to adopt new technologies.

I think open-source tools are now dominating, so Cloudera has to decide how to deal with them. I subscribe to Cloudera to get an enterprise version, but I have found that I can get some of its features from other vendors at a lower cost than Cloudera. They should lower the price.

For how long have I used the solution?

We have been using Cloudera for a year. 

What do I think about the stability of the solution?

It's stable. I have no issue regarding the stability.

What do I think about the scalability of the solution?

It's scalable. You can add more nodes and you can expand your cluster easily.

How are customer service and technical support?

After we open a ticket, the issue can be resolved very quickly; they have a management portal. I don't contact them directly, but I haven't heard of anybody having any problems with it.

How was the initial setup?

The initial setup is complicated. We needed the vendor to install it themselves. The deployment took around three weeks. Three of our people were involved, but only to follow up and supervise the deployment; they did not deploy anything themselves. The vendor did it.

What other advice do I have?

In terms of the advice, I would say to focus on what tools are available on the market. In terms of open-source, most companies are delivering open source technologies and providing support to these tools. Now I have the option to purchase a license for whatever platform for $1. I can deliver it with another small company at a lower cost. If I was the decision-maker, I'd invest in open-source tools. Cloudera and all of these companies are trying to adapt to these big data technologies and open source tools. Cloudera is trying to put it inside their platform so that we can have a compatible solution.

I would rate it an eight out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
RS
AD - Associate Director at a financial services firm with 10,001+ employees
Real User
Top 10
Feature rich and scalable with good support, but there are performance issues and the security could be improved

Pros and Cons

  • "The main advantage is the storage is less expensive."
  • "Currently, we are using many other tools such as Spark and Blade Job to improve the performance."

What is our primary use case?

We are using this solution for storing Big Data in one centralized location.

How has it helped my organization?

It has been helpful in allowing data storage in one centralized location with data lakes and all of the surrounding applications.

All of the data processes are being stored into the Big Data Lake.

What is most valuable?

It allows us to store huge amounts of data, which is an advantage.

They have BI (Business Intelligence) tools. There are many AI tools.

We are able to connect and analyze the data to get reports. The reports are very good.

The main advantage is the storage is less expensive.

What needs improvement?

The performance can be improved. We have experienced some performance issues. It is not as sophisticated as Oracle Sybase.

Currently, we are using many other tools such as Spark and Blade Job to improve the performance.

The setup could be simplified; it's complex.

The security needs to be improved.

For how long have I used the solution?

I have been using this solution since 2015.

What do I think about the stability of the solution?

It's a stable solution.

What do I think about the scalability of the solution?

Scalability is good. The data is replicated; by default, with Big Data there is a replication factor.

We have grown over the years. When we started we had 10 nodes, and we have since increased to a large number of nodes.

How are customer service and technical support?

Technical support is good. I have been able to learn from them. As a developer, I am learning every day.

I would rate the technical support a ten out of ten.

Which solution did I use previously and why did I switch?

Previously we were using Oracle Sybase SQL. We switched because we have now introduced Big Data.

How was the initial setup?

The initial setup was complex.

It's not as simple as Oracle Sybase.

It's a complex architecture because you have raw data and many engines.

What's my experience with pricing, setup cost, and licensing?

When comparing with Oracle Sybase and SQL, it's cheaper. It's not expensive.

What other advice do I have?

I am a part of security and software development. 

We are currently considering migrating to the cloud, and planning on using Microsoft Azure, mainly for the Big Data component.

I would rate this solution a five out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
KG
Associate Manager at a consultancy with 501-1,000 employees
Real User
Top 5 Leaderboard
Easy to install, good technical support, and with a single script we can run jobs within minutes

What is our primary use case?

We use this solution to process data.

When using an SQL Server you have to build indexes and you need to fine-tune the data.

We import the data that is in the SQL Source.

With a single script, we are able to run the jobs within minutes, which is an advantage.

We are using the Power BI model for the business convention. The performance in Power BI will be reduced if you incorporate more calculations. Those calculations are captured in the Hadoop layer and processed.

What needs improvement?

It could be faster and more user-friendly.

For how long have I used the solution?

I have been using this solution for seven months.

What do I think about the stability of the solution?

It's a stable product. I don't see any performance issues.

What do I think about the scalability of the solution?

This solution is scalable. We have 40 users for different projects in our organization.

We will continue to use this solution.

How are customer service and technical support?

Technical support is good.

Which solution did I use previously and why did I switch?

I didn't use any other product.

How was the initial setup?

The installation is straightforward.

Our clients provide us with access to use it directly.

Once we have been given access to the edge nodes, we are able to run the scripts in the Hadoop layer.

What's my experience with pricing, setup cost, and licensing?

We do not pay for licensing because our customers forward it, so there is no need to purchase the license for the project.

What other advice do I have?

I would recommend this solution.

I would rate Cloudera Distribution for Hadoop a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
TG
BI Manager at an insurance company with 10,001+ employees
Real User
Includes several useful proprietary tools

Pros and Cons

  • "CDH has a wide variety of proprietary tools that we use, like Impala. So from that perspective, it's quite useful as opposed to something open-source. We get a lot of value from Cloudera's proprietary tools."
  • "It would be useful if Cloudera had more tools like SQL Engines that offer the traditional relational database. We have to do a lot of work preparing the data outside Cloudera before getting it into the platform."

How has it helped my organization?

CDH has a wide variety of proprietary tools that we use, like Impala. So from that perspective, it's quite useful as opposed to something open-source. We get a lot of value from Cloudera's proprietary tools. 

What needs improvement?

Integration is one of the main things we struggle with because we're working with several other environments. For example, we've got an MPP environment outside the Hadoop environment. Many cloud-based platforms like Azure are fully integrated with technology that gives you MPP machine learning and data lakes all in one environment. We've got on-premises IBM solutions and Cloudera, so it isn't easy to integrate. It would be useful if Cloudera had more tools like SQL Engines that offer the traditional relational database. We have to do a lot of work preparing the data outside Cloudera before getting it into the platform. And ideally, we should get as much raw data as possible into the platform before we can do the engineering, so we have machine learning and model training.

For how long have I used the solution?

I've been using CDH for about two years, or rather, I manage the team that uses it.

What do I think about the stability of the solution?

We haven't had any issues with Cloudera. It's a solid product. 

What do I think about the scalability of the solution?

Cloudera is dependable, and it's completely scalable.

How are customer service and support?

We have engaged the technical support based in the UK. My team hasn't worked with them directly, but the administration team has. To my knowledge, they're fairly responsive. 

What other advice do I have?

I rate Cloudera Distribution for Hadoop eight out of 10.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
EricLin
Chairman at Athemaster co.,ltd.
Real User
Top 5 Leaderboard
Performs cost analysis tasks for our customers in the financial industry

What is our primary use case?

We are a solution provider and this is one of the systems that we implement for our clients.

Our clients for this product are in the financial industry and they use it to perform cost analysis tasks.

What is most valuable?

The most valuable feature is Kubernetes.

What needs improvement?

The price of this solution could be lowered.

For how long have I used the solution?

We have been using the Cloudera Distribution for Hadoop for five years.

What do I think about the stability of the solution?

It is a stable solution.

What do I think about the scalability of the solution?

The Cloudera Distribution for Hadoop can be scaled. Our customers are enterprise-level companies and they have about 100 users for this solution.

How are customer service and technical support?

We offer technical support for this solution to our customers.

Which solution did I use previously and why did I switch?

We did not use another solution prior to this one.

How was the initial setup?

The initial setup is straightforward.

What's my experience with pricing, setup cost, and licensing?

The pricing is expensive.

Which other solutions did I evaluate?

Cloudera really has no competition.

What other advice do I have?

I would rate this solution a nine out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: reseller
GW
Chief Executive Officer at a financial services firm with 51-200 employees
Real User
Top 20
Overall operational, stable but price could be better

What is our primary use case?

We use the solution for data warehousing.

What is most valuable?

The product as a whole is good.

What needs improvement?

There are better solutions out there that have more features than this one.

For how long have I used the solution?

I have just started using the solution.

What do I think about the stability of the solution?

I do not know of any issues with the stability of the solution.

What about the implementation team?

I have an internal team that does maintenance for the solution.

What's my experience with pricing, setup cost, and licensing?

The price could be better for the product.

Which deployment model are you using for this solution?

On-premises

Disclosure: I am a real user, and this review is based on my own experience and opinions.