2015-10-15 17:24:00 UTC

AWS EMR vs Hadoop


I do not see a big advantage of using Cloudera or Hortonworks Hadoop over AWS EMR.

I would like to know which key pain points these vendors address that AWS EMR cannot.

Thanks.

Guest
3 Answers

Top 5 Consultant

Here are the key points that differentiate EMR from packaged Hadoop software on a private cluster:

Amazon Web Services Elastic MapReduce (EMR) is clearly a simple and fast
way to get started with Hadoop. As with any cloud offering, the trade-off is
control and security. With your corporate data in the cloud, you are trusting
someone else, and you are somewhat limited in the types of things
you can do. AWS EMR leverages open source Apache Hadoop
components almost exclusively.

EMR is cheap, but not as easy to use as some of the value-add components in Hortonworks, Cloudera or IBM products.
If I leverage IBM InfoSphere BigInsights on my own cluster, I gain
ease of use through robust tools, security that I can control, and standard SQL queries
through BigSQL instead of HiveQL. Additionally, the support would be superior.
Cost is of course higher with a private cluster, since you purchase software and/or support.
So for these reasons, many people do get started with AWS EMR.

To summarize, the advantages of EMR are cost and open source components, versus flexibility, control, security, and convenience for a private Hadoop cluster.

Full disclosure: I work for an IBM Business Partner.

2015-10-20 18:02:08 UTC
Vendor

In my opinion, it is more about support and certification across Apache projects and vendor products. In open source, if you run into an issue, you fix the problem yourself. You can build your own distribution and deploy it with EMR, or you can take a certified distribution like CDH or HDP and have assistance.

2015-10-20 14:03:24 UTC
Vendor

If you are interested in why Hadoop is so important, I suggest visiting this link:

http://collaberatact.com/apache-way-of-building-software-2/

2015-10-20 11:58:30 UTC