Apache Spark vs. Cloudera Distribution for Hadoop

As of April 2019, Apache Spark is ranked 1st in Hadoop with 8 reviews, while Cloudera Distribution for Hadoop is ranked 2nd in Hadoop with 1 review. The top reviewer of Apache Spark writes, "We are able to solve problems, e.g., reporting on big data, that we were not able to tackle in the past." The top reviewer of Cloudera Distribution for Hadoop writes, "We use this solution to use big data for our analyses." Apache Spark is most often compared with Spring Boot, Azure Stream Analytics and AWS Lambda. Cloudera Distribution for Hadoop is most often compared with Hortonworks Data Platform, Cassandra and Amazon EMR.
Find out what your peers are saying about Apache, Cloudera, Hortonworks and others in Hadoop. Updated: March 2019.
332,881 professionals have used our research since 2012.
Ranking

Apache Spark
Ranking: 1st out of 24 in Hadoop
Views: 15,422
Comparisons: 8,386
Reviews: 7
Average Words per Review: 175
Avg. Rating: 7.7

Cloudera Distribution for Hadoop
Ranking: 2nd out of 24 in Hadoop
Views: 18,064
Comparisons: 7,470
Reviews: 1
Average Words per Review: 361
Avg. Rating: 9.0

Top Comparisons
Compared 16% of the time.
Compared 12% of the time.
Overview

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store the reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
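
The linear dataflow described above can be illustrated in a few lines of plain Python. This is only a sketch under stated assumptions, not Spark or Hadoop code: it runs on a single machine, and the function names, the temporary files, and the word-count task are all hypothetical choices made for illustration. It reads input from disk, maps a function across the records, reduces the mapped results, and stores the reduction on disk.

```python
import json
import os
import tempfile
from collections import Counter
from functools import reduce


def map_phase(line):
    """Map step: turn one input line into a Counter of word occurrences."""
    return Counter(line.lower().split())


def reduce_phase(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def run_job(input_path, output_path):
    """Hypothetical single-machine job following the read/map/reduce/store flow."""
    # 1. Read input data from disk.
    with open(input_path, encoding="utf-8") as src:
        lines = [line.strip() for line in src if line.strip()]
    # 2. Map a function across the data.
    mapped = map(map_phase, lines)
    # 3. Reduce the results of the map.
    totals = reduce(reduce_phase, mapped, Counter())
    # 4. Store the reduction results on disk.
    with open(output_path, "w", encoding="utf-8") as dst:
        json.dump(dict(totals), dst)
    return dict(totals)


def demo():
    """Run the job on a tiny made-up input inside a temporary directory."""
    with tempfile.TemporaryDirectory() as workdir:
        input_path = os.path.join(workdir, "input.txt")
        output_path = os.path.join(workdir, "counts.json")
        with open(input_path, "w", encoding="utf-8") as f:
            f.write("spark hadoop spark\nhadoop cloudera\n")
        return run_job(input_path, output_path)
```

Calling `demo()` runs the four steps on a two-line sample file. In this sketch, a second job over the same data would have to start again from the file on disk, which loosely mirrors the limitation that an in-memory working set such as an RDD is meant to address.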

Cloudera Distribution for Hadoop (CDH) is the world's most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. More enterprises have downloaded CDH than all other such distributions combined.
Sample Customers
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Cloudera Distribution for Hadoop: 37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
