Apache Spark vs. Cloudera Distribution for Hadoop

Apache Spark is ranked 1st in Hadoop with 6 reviews vs Cloudera Distribution for Hadoop which is ranked 3rd in Hadoop with 1 review. The top reviewer of Apache Spark writes "We are able to solve problems, e.g., reporting on big data, that we were not able to tackle in the past". The top reviewer of Cloudera Distribution for Hadoop writes "Cloudera Manager is a good tool to administer. Sometimes it gets confusing to follow a single path for installation". Apache Spark is most compared with Apache NiFi, AWS Lambda and Azure Stream Analytics. Cloudera Distribution for Hadoop is most compared with IBM InfoSphere BigInsights, Hortonworks Data Platform and Cassandra.
Find out what your peers are saying about Apache, Hortonworks, Cloudera and others in Hadoop.
300,846 professionals have used our research since 2012.

Quotes From Members Comparing Apache Spark vs. Cloudera Distribution for Hadoop

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
Ranking
Apache Spark: 13,237 views, 7,841 comparisons, 6 reviews, 359 followers, avg. rating 8.3
Cloudera Distribution for Hadoop: 16,839 views, 8,763 comparisons, 1 review, 360 followers, avg. rating 6.0
Null Product: 0 views, 0 comparisons, 1 review, 1 follower, avg. rating 5.0
Top Comparisons
Compared 17% of the time.
Compared 15% of the time.
Website/Video
Apache
Cloudera
Null Vendor
Overview

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
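
The linear read, map, reduce, store dataflow described above can be sketched in plain Python. This is only a single-process illustration, not Spark or Hadoop code: the names `map_words`, `add_counts`, `run_mapreduce_job`, and `demo`, as well as the word-count task itself, are assumptions made up for the example.

```python
import json
import tempfile
from collections import Counter
from functools import reduce
from pathlib import Path


def map_words(line):
    """Map step: turn one input line into a partial word count."""
    return Counter(line.lower().split())


def add_counts(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


def run_mapreduce_job(input_path, output_path):
    """Hypothetical single-process stand-in for one MapReduce job."""
    # 1. Read input data from disk.
    lines = Path(input_path).read_text(encoding="utf-8").splitlines()
    # 2. Map a function across the data.
    mapped = map(map_words, lines)
    # 3. Reduce the results of the map.
    totals = reduce(add_counts, mapped, Counter())
    # 4. Store the reduction results on disk.
    Path(output_path).write_text(json.dumps(totals), encoding="utf-8")
    return dict(totals)


def demo():
    """Run the job on a tiny temporary file and return the stored result."""
    with tempfile.TemporaryDirectory() as workdir:
        source = Path(workdir) / "input.txt"
        target = Path(workdir) / "counts.json"
        source.write_text("spark hadoop spark\nhadoop spark\n", encoding="utf-8")
        run_mapreduce_job(source, target)
        return json.loads(target.read_text(encoding="utf-8"))
```

In this sketch, any follow-up job that wanted to reuse the counts would first have to read `counts.json` back from disk. That round trip between jobs is the kind of rigidity the overview attributes to MapReduce, and it is what a working set shared across steps is meant to avoid.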

Cloudera Distribution for Hadoop (CDH) is the world's most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. More enterprises have downloaded CDH than all other such distributions combined.
Information Not Available
Sample Customers
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Cloudera Distribution for Hadoop: 37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
Null Product: Information Not Available
Top Industries
Apache Spark (visitors reading reviews): Comms Service Provider 17%, Manufacturing Company 14%, Financial Services Firm 13%, Marketing Services Firm 11%
Cloudera Distribution for Hadoop (visitors reading reviews): Financial Services Firm 28%, Marketing Services Firm 11%, Media Company 7%, Insurance Company 7%
Null Product: No Data Available
Company Size
Apache Spark (reviewers): Small Business 33%, Midsize Enterprise 20%, Large Enterprise 47%
Apache Spark (visitors reading reviews): Small Business 19%, Midsize Enterprise 18%, Large Enterprise 64%
Cloudera Distribution for Hadoop (reviewers): Small Business 27%, Midsize Enterprise 33%, Large Enterprise 40%
Cloudera Distribution for Hadoop (visitors reading reviews): Small Business 18%, Midsize Enterprise 23%, Large Enterprise 59%
Null Product: No Data Available
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
