Apache Hadoop Reviews

Arul Mani
Real User
CEO
Mar 18 2018

What is most valuable?

HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new...

How has it helped my organization?

Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of...

What needs improvement?

Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These...

What's my experience with pricing, setup cost, and licensing?

We normally do not suggest any specific distributions. When it comes to cloud, our suggestion would be to choose different types of instances...

What other advice do I have?

Our general suggestion to any customer is not to blindly look and compare different options. Rather, list the exact business needs - current and...
Samuel Feinberg
Real User
Analytics Platform Manager at a consultancy with 10,001+ employees
Aug 14 2018

What is most valuable?

* Scalability
* Parallel processing

There are jobs that cannot be done unless you have massively parallel processing; for instance, processing...

How has it helped my organization?

There is a lot of difference. I think the best case is that we are able to drill down to transactional records and really build a root-cause...

What needs improvement?

In general, Hadoop has a lot of different component parts to the platform - things like Hive and HBase - and they're all moving somewhat...

Which solutions did we use previously?

There are the older relational database technologies: Netezza, SQL Server, MySQL, Oracle, Teradata. All have some advantages and some...

What other advice do I have?

Implement for defined use cases. Don't expect it to all just work very easily. I would rate this platform a seven out of 10. On the one hand,...
Find out what your peers are saying about Apache, Pivotal, Oracle and others in Data Warehouse.
302,095 professionals have used our research since 2012.
Colt Rodgers
Real User
Infrastructure Engineer at Zirous, Inc.
Mar 22 2017

What is most valuable?

The Distributed File System, which is the base of Hadoop, has been the most valuable feature with its ability to store...

How has it helped my organization?

We do use the Hadoop platform internally, but mostly it is for R&D purposes. However, many of the recent projects...

What needs improvement?

Hadoop in and of itself stores data with 3x redundancy and our organization has come to the conclusion that the default...

What's my experience with pricing, setup cost, and licensing?

It's open source.

Which solutions did we use previously?

We started off using Apache Hadoop for our initial Big Data initiative and have stuck with it since.

What other advice do I have?

Try, try, and try again. Experiment with MapReduce and YARN. Fine tune your processes and you will see some insane...
Randy Chng
Real User
Senior Associate at a financial services firm with 10,001+ employees
Sep 17 2017

What is most valuable?

Impala. As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of...

How has it helped my organization?

The quick access to data enabled more frequent data-backed decisions.

What needs improvement?

The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be...

What's my experience with pricing, setup cost, and licensing?

Not much advice, as pricing and licensing are handled at an enterprise level. However, do take into consideration that data...

Which solutions did we use previously?

No. Two years ago this was a new team and hence there were no legacy systems to speak of.

What other advice do I have?

Try open-source Hadoop first but be aware of greater implementation complexity. If open-source Hadoop is "too" complex,...
Chetna
Real User
Big Data Engineer at a tech vendor with 5,001-10,000 employees
Jun 29 2017

What do you think of Apache Hadoop?

What is most valuable?

HDFS allows you to store large data sets optimally.

How has it helped my organization?

After switching to big data pipelines, our query performance improved a hundredfold.

What needs improvement?

The procedure for rolling restarts of data nodes could be further optimized. Also, I/O operations can be optimized for more performance.

For how long have I used the solution?

I have used Hadoop for over three years.

What do I think about the stability of the solution?

Once we had an issue with stability, due to a complete shutdown of a cluster. Bringing up a cluster took a lot of time because of some order that needed to be followed.

What do I think about the scalability of the solution?

We have not had scalability...
Naveen Karnam
Consultant
Software Architect at a tech services company with 10,001+ employees
Mar 18 2018

What is most valuable?

High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.

How has it helped my organization?

We start with data mashing on Hive and finally use this for KPI visualization. This intermediate step not only mashes data in the form that we want through data Cube slicing, but...

What needs improvement?

At the beginning, MRs on Hive made me think we should get down to Hadoop MRs to have better control of the data. But later, Hive as a platform upgraded very well. I still think a...

What other advice do I have?

I rate it an eight out of 10. It's huge, complex, and slow, but it does what it is meant for.
Chitharanjan Billa
Consultant
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
Mar 15 2018

What do you think of Apache Hadoop?

What is our primary use case?

* Content management solution
* Unified Data solution
* Apache Hadoop running on Linux

What is most valuable?

* Data ingestion: It has rapid speed, if Apache Accumulo is used.
* Data security
* Inexpensive

What needs improvement?

It needs better user interface (UI) functionalities.

For how long have I used the solution?

Three to five years.

What's my experience with pricing, setup cost, and licensing?

There are no licensing costs involved, hence money is saved on the software infrastructure.

What is Apache Hadoop?

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
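
As a rough illustration of the "simple programming models" mentioned above, the sketch below imitates the map, shuffle, and reduce steps of a word count in plain Python on a single machine. It is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. On a real cluster, the framework would run the map and reduce steps in parallel across many nodes and handle the grouping and failure recovery itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as a framework would between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases sequentially; a cluster would distribute them."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, the work can in principle be split across machines, which is the property the description above relies on.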
Apache Hadoop customers
Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab