Apache Hadoop Reviews
Jul 29 2019
What is most valuable? The most valuable thing about this program for us is that it is very powerful and very cheap. We're using a lot of the program's modules and features because we're using software and hardware that can be difficult to integrate. For example…
How has it helped my organization? It helps us work with older products and more easily create solutions.
What needs improvement? We are using HDTM circuit boards, and I worry about the future of this product and compatibility with future releases. It's a concern because, for now, we do not have a clear path to upgrade. The Hadoop product is in version three and we'd…
Which solution did I use previously and why did I switch? We had a very old version of Hadoop, which had already been installed by another company, and we upgraded it. We didn't really switch; we just upgraded what was here.
What other advice do I have? I would give this product a rating of eight out of ten. It would not be a ten out of ten because of some problems we are having with the upgrade to the newer version. It would have been better for us if these problems were not holding us…
Aug 14 2018
Parallel processing allows us to get jobs done, but the platform needs more direct integration of visualization applications
What is most valuable? Scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing; for instance, processing call-detail records for telecom.
How has it helped my organization? There is a lot of difference. I think the best case is that we are able to drill down to transactional records and really build a root-cause analysis for various issues that might arise, on demand. Because we're able to process in parallel…
What needs improvement? In general, Hadoop has a lot of different component parts to the platform - things like Hive and HBase - and they're all moving somewhat independently and somewhat in parallel. I think as you look to platforms in the cloud or into…
Which solution did I use previously and why did I switch? There are the older relational database technologies: Netezza, SQL Server, MySQL, Oracle, Teradata. All have some advantages and some disadvantages. Most notably, they are all significantly more expensive in terms of the capital expense…
What other advice do I have? Implement for defined use cases. Don't expect it to all just work very easily. I would rate this platform a seven out of 10. On the one hand, it's the only place you can use certain functions, and on the other hand, it's not going to put…
Sep 30 2019
What is most valuable? We don't use many of the Hadoop features, like Pig or Sqoop, but what I like most is using the Ambari feature. You have to use Ambari; otherwise it is very difficult to configure. What comes with the standard setup is what we mostly use…
What needs improvement? Hadoop itself is quite complex, especially if you want it running on a single machine, so to get it set up is a big mission. It seems that Hadoop is on its way out and Spark is the way to go. You can run Spark on a single machine and it's…
Which solution did I use previously and why did I switch? We used the more traditional database solutions such as SAP IQ and data marts, but now it's changing more towards Data Science and Big Data. We are a smaller infrastructure, so that's how we are set up.
What other advice do I have? It's good for what it is meant to do, a lot of big data, but it's not as good for low latency applications. If you have to perform quick queries on naive or analytics it can be frustrating. It can be useful for what it was intended to be used…
Which other solutions did I evaluate? There was an evaluation, but it was a decision to implement with Data Lake and Hortonworks data platform.
Nov 27 2019
What is most valuable? The ability to add multiple nodes without any restriction is the solution's most valuable aspect.
What needs improvement? What needs improvement depends on the customer and the use case. The classical Hadoop, for example, we consider an old variant. Most now work with flash data. There is a very wide application for this solution, but in enterprise companies, if you work with classical BI systems, it would be good to…
What's my experience with pricing, setup cost, and licensing? We originally built on Hortonworks tech which didn't require any licensing, but that is getting discontinued in 2022, so it's been proposed we move to Cloudera which will have licensing costs associated with it.
What other advice do I have? We use the on-premises deployment model. It's a requirement for the company we work with, which is a bank. Often customers demand we work with on-premises deployment models. I'd rate the solution seven out of ten. In terms of the ability to build middleware and offer scalability, it would be 10 out…
Dec 17 2019
What is most valuable? The most valuable feature is the database.
What needs improvement? We're finding vulnerabilities in running it 24/7. We're experiencing some downtime that affects the data. It would be good to have more advanced analytics tools.
Which solution did I use previously and why did I switch? We didn't previously use a different solution.
What other advice do I have? We use the on-premises deployment model. We're more inclined towards an operational data source to fill our customer's needs. Hadoop is good for analytics and some reporting requirements. It's a good solution for those needing something for the purposes of management reporting. I'd rate the solution…
Dec 16 2019
What is most valuable? The solution is perfect for when you have big data. It's good for managing and replication. It's good for storing historical data and handling analytics on a huge amount of data.
What needs improvement? It could be because the solution is open source, and therefore not funded like bigger companies, but we find the solution runs slow. The solution isn't as mature as SQL or Oracle and therefore lacks many features. The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment.
What other advice do I have? I've used the solution under cloud, hybrid and on-premises deployment models. I'd recommend the solution, but it depends on the company's requirements. If you don't have huge amounts of data, you probably don't need Hadoop. If you need a completely private environment, and you have lots of big data, consider Hadoop. You don't even need to invest in the infrastructure as you can just use a cloud…
Jul 17 2019
What is our primary use case? We use this solution for our Enterprise Data Lake.
How has it helped my organization? Using this solution has reduced the overall TCO. It has also improved data processing time for the machine and provides greater insight into our unstructured data.
What is most valuable? The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.
What needs improvement? We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.
For how long have I used the solution? More than four years.
Feb 11 2020
What is our primary use case? The primary use is as a data lake.
How has it helped my organization? Using this solution has allowed us to consolidate the data. It has made it such that data science-based algorithms can be written for predictive analytics.
What is most valuable? The most valuable features are powerful tools for ingestion, as data is in multiple systems.
What needs improvement? It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake.
For how long have I used the solution? I have been using Apache Hadoop for two years.
What is Apache Hadoop? The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
Apache Hadoop customers: Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab