We performed a comparison between Cloudera Distribution for Hadoop and Spark SQL based on real PeerSpot user reviews.
Find out in this report how the two Hadoop solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The solution's most valuable feature is the enterprise data platform."
"We also really like the Cloudera community. You can have any question and will have your answer within a few hours."
"The product provides better data processing features than other tools."
"In terms of scalability, if you have enough hardware you can scale out. Scalability doesn't have any issues."
"Cloudera is a very manageable solution with good support."
"The scalability of Cloudera Distribution for Hadoop is excellent."
"With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
"The main advantage is the storage is less expensive."
"The speed of getting data."
"Overall the solution is excellent."
"Offers a variety of methods to design queries and incorporates the regular SQL syntax within tasks."
"I find the Thrift connection valuable."
"Spark SQL's efficiency in managing distributed data and its simplicity in expressing complex operations make it an essential part of our data pipeline."
"Data validation and ease of use are the most valuable features."
"It is a stable solution."
"The stability was fine. It behaved as expected."
"The initial setup of Cloudera is difficult."
"There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon."
"Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
"The procedure for operations could be simplified."
"The price of this solution could be lowered."
"The competitors provide better functionalities."
"The dashboard could be improved."
"The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it."
"It takes a bit of time to get used to using this solution versus Pandas as it has a steep learning curve."
"It would be beneficial for aggregate functions to include a code block or toolbox that explains its calculations or supported conditional statements."
"There should be better integration with other solutions."
"Anything to improve the GUI would be helpful."
"The solution needs to include graphing capabilities. Including financial charts would help improve everything overall."
"There are many inconsistencies in syntax for the different querying tasks."
"In the next update, we'd like to see better performance for small points of data. It is possible but there are better tools that are faster and cheaper."
"It would be useful if Spark SQL integrated with some data visualization tools."
Cloudera Distribution for Hadoop is ranked 2nd in Hadoop with 47 reviews, while Spark SQL is ranked 4th in Hadoop with 14 reviews. Cloudera Distribution for Hadoop is rated 8.0, while Spark SQL is rated 7.8. The top reviewer of Cloudera Distribution for Hadoop writes "Good end-to-end security features and we like that it's cloud independent". On the other hand, the top reviewer of Spark SQL writes "Offers the flexibility to handle large-scale data processing". Cloudera Distribution for Hadoop is most compared with Amazon EMR, HPE Ezmeral Data Fabric, Apache Spark, MongoDB and Cassandra, whereas Spark SQL is most compared with Apache Spark, IBM Db2 Big SQL, SAP HANA, HPE Ezmeral Data Fabric and Netezza Analytics.
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.