Apache Hadoop Reviews

Arul Mani
Real User
CEO
Mar 18 2018

What is most valuable?

HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect.

How has it helped my organization?

Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of Hadoop and ecosystem components and tools, and managing it…

What needs improvement?

Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove…

What's my experience with pricing, setup cost, and licensing?

We normally do not suggest any specific distributions. When it comes to cloud, our suggestion would be to choose different types of instances offered by Amazon cloud, as we are technology partners of…

What other advice do I have?

Our general suggestion to any customer is not to blindly look and compare different options. Rather, list the exact business needs - current and future - and then prepare a matrix to see product…

Which other solutions did I evaluate?

None, as this stack is familiar to us and we were sure it could be used for such engagements without much hassle. Our primary criteria were the ability to migrate our existing RDBMS-based PoC and…
Samuel Feinberg
Real User
Analytics Platform Manager at a consultancy with 10,001+ employees
Aug 14 2018

What is most valuable?

* Scalability
* Parallel processing

There are jobs that cannot be done unless you have massively parallel processing; for instance, processing call-detail records for telecom.

How has it helped my organization?

There is a lot of difference. I think the best case is that we are able to drill down to transactional records and really build a root-cause analysis for various issues that might arise, on demand. Because we're able to process in parallel…

What needs improvement?

In general, Hadoop has a lot of different component parts to the platform - things like Hive and HBase - and they're all moving somewhat independently and somewhat in parallel. I think as you look to platforms in the cloud or into…

If you previously used a different solution, which one did you use and why did you switch?

There are the older relational database technologies: Netezza, SQL Server, MySQL, Oracle, Teradata. All have some advantages and some disadvantages. Most notably, they are all significantly more expensive in terms of the capital expense…

What other advice do I have?

Implement for defined use cases. Don't expect it to all just work very easily. I would rate this platform a seven out of 10. On the one hand, it's the only place you can use certain functions, and on the other hand, it's not going to put…
Find out what your peers are saying about Apache, Pivotal, Snowflake Computing and others in Data Warehouse. Updated: July 2019.
353,599 professionals have used our research since 2012.
Randy Chng
Real User
Senior Associate at a financial services firm with 10,001+ employees
Sep 17 2017

What is most valuable?

Impala. As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into…

How has it helped my organization?

The quick access to data enabled more frequent data-backed decisions.

What needs improvement?

The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks.
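
The chunking workaround can be illustrated with a minimal Python sketch. This is not the reviewer's code and does not use Impala or any Hadoop API; the function names `iter_chunks` and `count_by_key_in_chunks`, the `(key, value)` row shape, and the chunk size are all hypothetical. It only shows the general idea: aggregate one bounded chunk at a time and keep just the running totals in memory.

```python
from collections import Counter
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_by_key_in_chunks(rows, chunk_size=1000):
    """Sum values per key, holding only one chunk plus the totals in memory."""
    totals = Counter()
    for chunk in iter_chunks(rows, chunk_size):
        partial = Counter()
        for key, value in chunk:
            partial[key] += value
        # Merge the chunk's partial sums into the running totals.
        totals.update(partial)
    return dict(totals)


if __name__ == "__main__":
    sample = [("a", 1), ("b", 2), ("a", 3)]
    print(count_by_key_in_chunks(sample, chunk_size=2))
```

This approach only works directly for aggregations that can be merged from partial results, such as sums and counts; other queries may need a different decomposition.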

What's my experience with pricing, setup cost, and licensing?

Not much advice, as pricing and licensing are handled at an enterprise level. However, do take into consideration that data storage and compute capacity scale differently and…

If you previously used a different solution, which one did you use and why did you switch?

No. Two years ago this was a new team and hence there were no legacy systems to speak of.

What other advice do I have?

Try open-source Hadoop first but be aware of greater implementation complexity. If open-source Hadoop is "too" complex, then consider a vendor packaged Hadoop solution…

Which other solutions did I evaluate?

Yes. Oracle Exadata and Teradata.
Naveen Karnam
Consultant
Software Architect at a tech services company with 10,001+ employees
Mar 18 2018

What is most valuable?

High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.

How has it helped my organization?

We start with data mashing on Hive and finally use this for KPI visualization. This intermediate step not only mashes data in the form that we want through data Cube slicing, but also helps us save states as snapshots for multiple time frames. Without this, we would have had to plan another data…

What needs improvement?

At the beginning, MRs on Hive made me think we should get down to Hadoop MRs to have better control of the data. But later, Hive as a platform upgraded very well. I still think a Spark-type layer on top gives you an edge over having only Hive.

What other advice do I have?

I rate it an eight out of 10. It's huge, complex, and slow, but it does what it is meant for.
MahalingamShanmugam
Real User
User
Jul 17 2019

What do you think of Apache Hadoop?

What is our primary use case?

We use this solution for our Enterprise Data Lake.

How has it helped my organization?

Using this solution has reduced the overall TCO. It has also improved data processing time for the machine and provides greater insight into our unstructured data.

What is most valuable?

The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.

What needs improvement?

We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.

For how long have I used the solution?

More than four years.
Chitharanjan Billa
Consultant
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
Mar 15 2018

What do you think of Apache Hadoop?

What is our primary use case?

* Content management solution
* Unified Data solution
* Apache Hadoop running on Linux

What is most valuable?

* Data ingestion: It has rapid speed, if Apache Accumulo is used.
* Data security
* Inexpensive

What needs improvement?

It needs better user interface (UI) functionalities.

For how long have I used the solution?

Three to five years.

What's my experience with pricing, setup cost, and licensing?

There are no licensing costs involved, hence money is saved on the software infrastructure.

What is Apache Hadoop?

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
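
One of the "simple programming models" associated with Hadoop is MapReduce. The sketch below is a single-process Python illustration of its map, shuffle, and reduce phases using a word count; it is an assumption-laden teaching example, not Hadoop's API, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical. On a real cluster the map and reduce steps would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence, in one process."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

Because each line is mapped independently and each key is reduced independently, the work can in principle be split across machines, which is the property the description above relies on.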
Apache Hadoop customers
Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab