Cloudera Distribution for Hadoop Room for Improvement

LS
Head of Big Data and Analytics Competency center at OTP Bank Hungary

The Cloudera training is terrible. Five years ago, they had up-to-date training material and instructor-led courses that were pretty good. These days, the material is outdated, and the training is very expensive and irrelevant to the new platform. It's hard for administrators or developers to gather the information they need. We now sign up for training hosted by other companies, such as Udemy courses, which are better than Cloudera's. The quality of their professional services has also declined. What is really missing is a well-designed UI where people can get insight into data. We don't feel that Cloudera has a good SQL UI, and there is a lot of room for improvement.

Shahan Rehman - PeerSpot reviewer
Senior Business Development Manager at BBI Consultancy

The tool's ability to be deployed on a cloud model is an area of concern where improvements are required. The tool works very well when deployed on an on-premises model. The deployment on a cloud platform is where Cloudera needs to work more. There are competitors who are way ahead of Cloudera.

Miodrag Milojevic - PeerSpot reviewer
Senior Data Architect at Yettel

Cloudera Distribution for Hadoop is not always completely stable, which can be a concern for big data solutions. Sometimes there are problems with the network, and there can be communication issues with Active Directory or similar systems due to authorization scheduling, resulting in occasional problems. The implementation process is quite complex because of the schedules.

Buyer's Guide
Cloudera Distribution for Hadoop
April 2024
Learn what your peers think about Cloudera Distribution for Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,415 professionals have used our research since 2012.
Hamid M. Hamid - PeerSpot reviewer
Data architect at Banking Sector

Pricing could be improved.

Miodrag-Stanic - PeerSpot reviewer
Senior Architect at Yettel

We switched to Airflow because Cloudera is outdated. It's not widely used. It would be good if we had Spark 3.5; the Spark version we have is quite old. Cloudera is now offering an alternate solution as a replacement for AWS. AWS works badly with small files.

The solution is not fit for on-premises distributions. It should be containerized so we can deploy it as containers within Kubernetes. We had one upgrade from CDH to CDP, which took a long time. I would expect a containerized deployment to upgrade much more quickly than what we experienced.

Thishen Govender - PeerSpot reviewer
BI Manager at Discovery Health

The governance aspect of the solution should be improved. The pricing renewal notices can also be a bit challenging for us. It requires providing a substantial amount of notice for renewal, which has been a notable difficulty in our experience.

KY
Senior IT Application Architect at an insurance company with 5,001-10,000 employees

The competitors provide better functionalities.

Thishen Govender - PeerSpot reviewer
BI Manager at Discovery Health

Integration is one of the main things we struggle with because we're working with several other environments. For example, we've got an MPP environment outside the Hadoop environment. Many cloud-based platforms like Azure are fully integrated with technology that gives you MPP machine learning and data lakes all in one environment. We've got on-premises IBM solutions and Cloudera, so it isn't easy to integrate. It would be useful if Cloudera had more tools like SQL Engines that offer the traditional relational database. We have to do a lot of work preparing the data outside Cloudera before getting it into the platform. And ideally, we should get as much raw data as possible into the platform before we can do the engineering, so we have machine learning and model training.

Atif Tariq - PeerSpot reviewer
Cloud and Big Data Engineer | Developer at Huawei Cloud Middle East

The company is struggling to keep up with the upgrades of various components, and they are not willing to invest more in Cloudera.

The company is still switching from traditional methods to cutting-edge technology. While the deployed product is generally functional, there are instances where it presents difficulties. For example, the high SPs do not allow for metadata patching once it is created in the panel. This restriction limits our ability to make changes to the metadata.

I am aware that some companies are using open-source alternatives, which offer more flexibility. So, product maturity with cutting-edge technology will take more time.  

The primary concern is the cost. If you have the budget and are willing to pay for it, then it's fine. However, if we don't want to spend more money, it's not the best option.

RS
AD - Associate Director at a financial services firm with 10,001+ employees

The performance can be improved. We have experienced some performance issues. It is not as sophisticated as Oracle or Sybase.

Currently, we are using many other tools such as Spark and Blade Job to improve the performance.

The setup is complex and could be simplified.

The security needs to be improved.

Thishen Govender - PeerSpot reviewer
BI Manager at Discovery Health

The Data Science Workbench doesn't support multiple languages. It needs to support multiple programming languages. We were trying to use Scala and Python for some solutions we wanted to deploy, but they didn't work properly. As a result, we had to come up with workarounds. If the Data Science Workbench supported multiple programming languages, our workflow would be easier and the solutions could be better.

Another aspect we would like to see improved is better opportunities for integration. For example, we would like to use H2O machine learning, which is an open-source product, and Cloudera doesn't support H2O.

If they could support H2O and also deploy multi-language support on the Cloudera Data Science Workbench, that would be great. But the biggest thing that would help right now is H2O support.

Finally, one other improvement I would suggest is integrating data privacy software into Cloudera. It is not quite complete in this aspect.

KG
Vice President at a financial services firm with 10,001+ employees

The setup and administration were not easy with Cloudera Distribution for Hadoop. They could be improved.

The solution has a limited feature list, so having more features is something I'd like to see in the next release of Cloudera Distribution for Hadoop.

YM
CEO at AM-BITS LLC

The areas of improvement depend on the scale of the project. For banking customers, security features and an essential budget for commercial licenses would be the top priority. Data regulation could be the most crucial for a project with extensive data or an extra use case.

AK
Senior Data Architect Manager at Unifonic

The only thing that needs improvement is the cost. It's a very expensive solution, which is one of the main reasons companies are not attracted to the product.

Hamid M. Hamid - PeerSpot reviewer
Data architect at Banking Sector

The pricing needs to improve. If the price had been affordable, we might have continued using Cloudera. We switched to HPE because of the cost.

Sayyed Aadil - PeerSpot reviewer
Hadoop Admin at Tata Consultancy

There are multiple bugs when we update.

Suresh_Srinivasan - PeerSpot reviewer
Co-Founder at FORMCEPT Technologies

The security of this solution could be improved. There should also be a way to have blockchain-enabled storage with HDFS.

EricLin - PeerSpot reviewer
Chairman at Athemaster co.,ltd.

The dashboard could be improved.

Mohammed Hamad - PeerSpot reviewer
AI & Data Engineering Lead at a tech services company with 10,001+ employees

Cloudera's prices are too high and are not competitive with other solutions. They could also improve the Data Science Workbench and add some more features, like wizard activities.

DS
DBA team manager at a financial services firm with 1,001-5,000 employees

I would like to see an improvement in how the solution helps me to handle the whole cluster. For example, when I'm drilling down into a specific tool like Kafka, Cloudera Manager doesn't really help me. I then have to search Google and rely on other Kafka knowledge and tools.

KG
Associate Manager at a consultancy with 501-1,000 employees

It could be faster and more user-friendly.

it_user900987 - PeerSpot reviewer
Data Management at BCX

The one thing that we struggled with predominantly was support. Because it was relatively new, support was always a big issue, and I think it's still a bit of an ongoing concern for the team currently managing it.

In the next release, I think it would be helpful if there was easier integration into all the other existing data back corners. It will be a big plus as it's a favorite capability. We had to go with a third-party application in order to achieve that.

SC
Lead Consultant - Product Development at FIS (http://www.fisglobal.com/)

On the product side, I don't have much to comment. However, other emerging technologies such as RPA, AI, and Go have ample training materials with a variety of use cases that users can understand and align with their current requirements. By comparison, I haven't seen much training material from Cloudera.

it_user357645 - PeerSpot reviewer
Data/Big Data Architect at a healthcare company with 1,001-5,000 employees

Sometimes heavy queries do not finish at all. It would be good to see the progress of a heavy script in the Impala shell or to have some other way to access it.

NK
Senior Software Engineer at a tech services company with 10,001+ employees

The maximum block size is one gigabyte, which is an area of storage that could be improved.

When we are upgrading CDH, there are many things that need to be upgraded and it would be helpful if it were bundled. As it is now, we have to upgrade many different things separately.

it_user363186 - PeerSpot reviewer
Team Lead / Data Architect at a tech services company with 51-200 employees

We ran into some difficulties when importing Hive tables from another cluster.

I want to point out that we encounter many problems related to cloud storage and how resources are managed. Our learning has been that, although it is quite simple to deploy single machines on the cloud, deploying clusters of machines is much more complex because many factors need to be considered: individual machines, connectivity across machines, and storage.

it_user347172 - PeerSpot reviewer
System Engineer at a tech company with 10,001+ employees

HBase 1.0 stability issues and processing speed are major areas for improvement. Right now, our Cloudera 5 clusters run four to seven times slower than our Cloudera 4 clusters using our Storm and Kafka topologies, which makes real-time processing a major challenge.

CM’s API is very limited and difficult to use with multiple clusters in the same CM instance.

it_user370224 - PeerSpot reviewer
Director of Data Management at a media company with 51-200 employees

Full support for all Spark SQL features, support for SparkR, and Hive compatibility for tables saved from DataFrames.

Cloudera CDH5.5.x does not support SparkR. SparkR, which integrates R models through the Spark API, would be a great addition because it would enable fast, near-real-time analytical integration of R models with data feeds.

The functionality in Spark SQL to save a DataFrame as a table in Hive produces a table that is not compatible with Hive. There is a workaround for this: create the Hive table first and then do inserts.
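The workaround described above could look roughly like the following sketch. It only builds the two HiveQL statements as strings. The table name `analytics.events`, the temporary view name `events_tmp`, the column list, and both helper functions are hypothetical and used purely for illustration. On a cluster, the statements would presumably be run through a Hive-enabled SQL context after the DataFrame has been registered as a temporary table; that step is left as a comment because it depends on the Spark version in use.

```python
# Minimal sketch (assumption, not the reviewer's actual code): let Hive own the
# table definition, then load rows with INSERT instead of saving the DataFrame
# directly as a table.

def hive_create_table_sql(table, columns, stored_as="PARQUET"):
    """Build a CREATE TABLE statement from (column name, Hive type) pairs."""
    cols = ", ".join("`{}` {}".format(name, hive_type) for name, hive_type in columns)
    return "CREATE TABLE IF NOT EXISTS {} ({}) STORED AS {}".format(table, cols, stored_as)


def hive_insert_sql(table, source_view, columns):
    """Build an INSERT that copies the listed columns from a temporary view."""
    names = ", ".join("`{}`".format(name) for name, _ in columns)
    return "INSERT INTO TABLE {} SELECT {} FROM {}".format(table, names, source_view)


# Hypothetical schema and names, for illustration only.
columns = [("user_id", "BIGINT"), ("event", "STRING")]
create_sql = hive_create_table_sql("analytics.events", columns)
insert_sql = hive_insert_sql("analytics.events", "events_tmp", columns)

# On a cluster (assumed usage, not executed here):
#   1. register the DataFrame as the temporary table "events_tmp"
#   2. run create_sql, then insert_sql, through the Hive-enabled SQL context
print(create_sql)
print(insert_sql)
```

Because the table is created with plain HiveQL, its metadata is written by Hive itself, which is the point of the workaround; the storage format shown here is only an assumed default.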

Cloudera CDH5.5.x is a great product. Adopting the features that are not currently supported would make it even better, but their absence by no means subtracts from its desirability.


it_user374058 - PeerSpot reviewer
Vice President - Big Data and Delivery at a computer software company with 51-200 employees
  • Some of the UI features seem confusing, e.g., charts on the CM Services page
  • Sometimes it gets confusing to follow a single path for installation due to multiple recommended approaches, e.g., parcels vs. packages

AG
Engineering Manager/Solution architect at a computer software company with 201-500 employees

When you compare Cloudera with EMR, EMR has a lot of administrative features, so you don't need to manage the solution. Cloudera is not as easy, as it requires more DevOps resources than other solutions.

it_user364473 - PeerSpot reviewer
R&D Solutions Architect at a tech vendor with 10,001+ employees

Mainly they have to continuously evolve following the technology trends and replace or adapt part of their solutions accordingly.

it_user364431 - PeerSpot reviewer
Consultant at a tech consulting company with 51-200 employees

More customization, better documentation for the API (basically it's the same for all Cloudera Hadoop components).

it_user347565 - PeerSpot reviewer
Lead Bigdata Developer at a tech services company with 10,001+ employees

Apache Kudu needs improvement. It's a real-time updatable database.

MA
Technical Presales Engineer at a tech services company with 51-200 employees

They should work on the solution's pricing. Also, finding resources with good experience in the solution is difficult. Thus, they should upgrade their technical capabilities in the market. 

They should add features like AutoML and AutoDev for enhanced machine-learning experiences. In addition, they should consider developing an integration capability similar to Informatica for an end-to-end enterprise solution.

AD
Senior Consultant & Training at a tech services company with 51-200 employees

We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve. I believe they are working on that. 

it_user374703 - PeerSpot reviewer
Data Consultant with 10,001+ employees

I'd like to see improvements to Impala. It also needs a more integrated environment with Spark, data warehouses, storage systems, and the cloud. Additionally, I'd want more UIs for components of the ecosystem, preferably centralized in a gateway.

it_user347592 - PeerSpot reviewer
Senior Analyst - Strategy Analytics at a consultancy with 10,001+ employees

It needs more standardized documentation on Hive.

ND
IT expert at a comms service provider with 201-500 employees

The procedure for operations could be simplified.

it_user356769 - PeerSpot reviewer
Director of Data Architecture at a financial services firm with 501-1,000 employees

Some areas are under rapid development, like Spark.

GW
Chief Executive Officer at a financial services firm with 51-200 employees

There are better solutions out there that have more features than this one.

EricLin - PeerSpot reviewer
Chairman at Athemaster co.,ltd.

The price of this solution could be lowered.

MG
Data engineer at a tech services company with 11-50 employees

We're processing a huge amount of data on our system. Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. Having a big data environment is not optional for us. Cloudera is trying to adopt new technologies.

I think the idea of open-source tools is now dominating, so Cloudera has to decide how to deal with open-source tools. I subscribe to Cloudera to get an enterprise version, but I have found that I can get some of its features from other vendors at a lower cost than Cloudera. They should lower the price.

it_user347787 - PeerSpot reviewer
Lead Instructor at a tech company with 501-1,000 employees

Spark with R integration is missing. Also, it is lacking Spark SQL support.

MI
Project Coordinator at a manufacturing company with 1,001-5,000 employees

The user infrastructure and user interface need to be improved, as well as the performance. The GUI needs to be better.

it_user347535 - PeerSpot reviewer
Software Engineer at a tech services company with 501-1,000 employees

The licensing was by node. I think licensing by size of data managed would be a useful improvement.

it_user345477 - PeerSpot reviewer
Software Design Engineer at a marketing services firm with 501-1,000 employees

We're currently trying to recover from a failed installation, and it's a little bit difficult. The installer should restart the installation where it left off.
