We performed a comparison between Amazon EMR, Hortonworks Data Platform, and IBM InfoSphere BigInsights [EOL] based on real PeerSpot user reviews.
"In Amazon EMR it is easy to rebuild anything, easy to upgrade and has good fault tolerance."
"It allows users to access the data through a web interface."
"The solution is scalable."
"The ability to resize the cluster is what really makes it stand out over other Hadoop and big data solutions."
"When we grade big jobs from on-prem to the cloud, we do it in EMR with Spark."
"Amazon EMR is a good solution that can be used to manage big data."
"This is the best tool for hosts and it's really flexible and scalable."
"We are using applications, such as Splunk, Livy, Hadoop, and Spark. We are using all of these applications in Amazon EMR and they're helping us a lot."
"Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request."
"The Hortonworks solution is so stable. It is working as a production system, without any error, without any downtime. If I have downtime, it is mostly caused by the hardware of the computers."
"Ambari Web UI: user-friendly."
"The data platform is pretty neat. The workflow is also really good."
"It is a scalable platform."
"Ranger for security; with Ranger we can manage users' permissions/access controls very easily."
"The product offers a fairly easy setup process."
"The upgrades and patches must come from Hortonworks."
"InfoSphere Streams was the one core product from the platform that we were using. We were building a real-time response system and we built it on InfoSphere Streams."
"Amazon EMR is continuously improving, but maybe something like CI/CD out-of-the-box or integration with Prometheus Grafana."
"The product must add some of the latest technologies to provide more flexibility to the users."
"The problem for us is it starts very slow."
"We don't have much control. If we have multiple users, if they want to scale up, the cost will go and increase and we don't know how we can restrict that price part."
"There is no need to pay extra for third-party software."
"As people are shifting from legacy solutions to other technologies, Amazon EMR needs to add more features that give more flexibility in managing user data."
"The legacy versions of the solution are not supported in the new versions."
"The initial setup was time-consuming."
"Security and workload management need improvement."
"Hive performance. If Hive performance increased, Hadoop would replace (not everywhere) traditional databases."
"More information could be there to simplify the process of running the product."
"Since Cloudera acquired HDP, it's been bundled with CBH and HDP. However, the biggest challenge is cloud storage integration with Azure, GCP, and AWS."
"It would also be nice if there were less coding involved."
"The cost of the solution is high and there is room for improvement."
"I would like to see more support for containers such as Docker and OpenShift."
"It's at end of life and no longer will there be improvements."
"The UI was not interactive: Responses used to be very slow and hang up at times."