Apache Hadoop Reviews

Arul Mani
Real User
CEO
Mar 18 2018

What is most valuable?

HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new…

How has it helped my organization?

Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of…

What needs improvement?

Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These…

What's my experience with pricing, setup cost, and licensing?

We normally do not suggest any specific distributions. When it comes to cloud, our suggestion would be to choose different types of instances…

What other advice do I have?

Our general suggestion to any customer is not to blindly look and compare different options. Rather, list the exact business needs - current and…
Samuel Feinberg
Real User
Analytics Platform Manager at a consultancy with 10,001+ employees
Aug 14 2018

What is most valuable?

* Scalability
* Parallel processing

There are jobs that cannot be done unless you have massively parallel processing; for instance, processing…

How has it helped my organization?

There is a lot of difference. I think the best case is that we are able to drill down to transactional records and really build a root-cause…

What needs improvement?

In general, Hadoop has a lot of different component parts to the platform - things like Hive and HBase - and they're all moving somewhat…

If you previously used a different solution, which one did you use and why did you switch?

There are the older relational database technologies: Netezza, SQL Server, MySQL, Oracle, Teradata. All have some advantages and some…

What other advice do I have?

Implement for defined use cases. Don't expect it to all just work very easily. I would rate this platform a seven out of 10. On the one hand, it's…
Find out what your peers are saying about Apache, Pivotal, Oracle and others in Data Warehouse. Updated: March 2019.
326,282 professionals have used our research since 2012.
Randy Chng
Real User
Senior Associate at a financial services firm with 10,001+ employees
Sep 17 2017

What is most valuable?

Impala. As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time…

How has it helped my organization?

The quick access to data enabled more frequent data-backed decisions.

What needs improvement?

The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed…

What's my experience with pricing, setup cost, and licensing?

Not much advice, as pricing and licensing are handled at an enterprise level. However, do take into consideration that data…

If you previously used a different solution, which one did you use and why did you switch?

No. Two years ago this was a new team and hence there were no legacy systems to speak of.

What other advice do I have?

Try open-source Hadoop first but be aware of greater implementation complexity. If open-source Hadoop is "too" complex…
Chetna
Real User
Big Data Engineer at a tech vendor with 5,001-10,000 employees
Jun 29 2017

What is most valuable?

HDFS allows you to store large data sets optimally.

How has it helped my organization?

After switching to big data pipelines, our query performance improved a hundredfold.

What needs improvement?

Rolling restarts of data nodes could be further optimized. I/O operations could also be optimized for better performance.

For how long have I used the solution?

I have used Hadoop for over three years.

What do I think about the stability of the solution?

We had a stability issue once, caused by a complete shutdown of a cluster. Bringing the cluster back up took a long time because a specific startup order had to be followed.

What do I think about the scalability of the solution?

We have not had scalability issues.

Naveen Karnam
Consultant
Software Architect at a tech services company with 10,001+ employees
Mar 18 2018

What is most valuable?

High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.

How has it helped my organization?

We start with data mashing on Hive and finally use this for KPI visualization. This intermediate step not only mashes data in the form that we want through data Cube slicing, but also…

What needs improvement?

At the beginning, MRs on Hive made me think we should get down to Hadoop MRs to have better control of the data. But later, Hive as a platform upgraded very well. I still think a…

What other advice do I have?

I rate it an eight out of 10. It's huge, complex, and slow, but it does what it is meant to do.
Chitharanjan Billa
Consultant
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
Mar 15 2018

What is our primary use case?

* Content management solution
* Unified Data solution
* Apache Hadoop running on Linux

What is most valuable?

* Data ingestion: It has rapid speed, if Apache Accumulo is used.
* Data security
* Inexpensive

What needs improvement?

It needs better user interface (UI) functionality.

For how long have I used the solution?

Three to five years.

What's my experience with pricing, setup cost, and licensing?

There are no licensing costs involved, hence money is saved on the software infrastructure.

What is Apache Hadoop?

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
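
To make the "simple programming models" mentioned above a little more concrete, here is a minimal sketch of the map, shuffle, and reduce steps that MapReduce-style jobs are built from, using word counting as the example. It is plain single-process Python and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster, the map and reduce steps would run in parallel on many machines and the framework would handle the shuffle and any failures.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence in a single process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

Because each call to map_phase depends only on its own line, and each call to reduce_phase only on its own key, the work can be split across machines without the steps needing to coordinate with each other.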
Apache Hadoop customers
Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab