Apache Hadoop Pros and Cons

Apache Hadoop Pros

Arul Mani
CEO
Initially, with an RDBMS alone, we had a lot of work and few servers running on-premises and in the cloud for PoC and incubation. Using Hadoop with its ecosystem components and tools, managed on Amazon EC2, we created a Big Data "lab" that centralizes all our work and solutions in a single repository. This has cut the time spent on maintenance, development and, especially, data processing challenges.
Since both Apache Hadoop and Amazon EC2 are elastic in nature, we can scale and expand on demand for a specific PoC, and scale down when it's done.
The most valuable features are HDFS and Kafka: ingesting huge volumes and varieties of unstructured/semi-structured data is feasible, which helps us quickly onboard a new Big Data analytics prospect.
Samuel Feinberg
Analytics Platform Manager at a consultancy with 10,001+ employees
Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing.
Randy Chng
Senior Associate at a financial services firm with 10,001+ employees
As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R.
Find out what your peers are saying about Apache, Pivotal, Snowflake Computing and others in Data Warehouse. Updated: July 2019.
352,246 professionals have used our research since 2012.
Naveen Karnam
Software Architect at a tech services company with 10,001+ employees
High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.
Chitharanjan Billa
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
Data ingestion: it is fast if Apache Accumulo is used.

Apache Hadoop Cons

Arul Mani
CEO
Based on our needs, we would like to see a tool for data visualization and an enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our effort and the time needed to prove to a customer that this will help them.
There were general installation/dependency issues, but nothing major or complex. Migrating data from MySQL to Hive was a little challenging, but we were able to get through it with support from forums and a little trial and error.
Samuel Feinberg
Analytics Platform Manager at a consultancy with 10,001+ employees
I would like to see more direct integration of visualization applications.
Randy Chng
Senior Associate at a financial services firm with 10,001+ employees
The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks.
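As a rough illustration of the chunking workaround mentioned above, here is a minimal Python sketch. It does not use any Impala or Hadoop API: the names `iter_chunks` and `chunked_group_sum` are hypothetical, and the in-memory list of `(key, value)` rows merely stands in for a result set too large to aggregate in one pass. The idea is to compute a partial aggregate per chunk and merge the partials, so only one chunk is held in memory at a time.

```python
from collections import Counter
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def chunked_group_sum(rows, chunk_size):
    """Sum values per key by aggregating one chunk at a time.

    Hypothetical stand-in for running a GROUP BY query per chunk
    and merging the partial results.
    """
    totals = Counter()
    for chunk in iter_chunks(rows, chunk_size):
        partial = Counter()
        for key, value in chunk:
            partial[key] += value
        # Merge this chunk's partial sums into the running totals.
        totals.update(partial)
    return dict(totals)


if __name__ == "__main__":
    sample = [("a", 1.0), ("b", 2.0), ("a", 3.0), ("c", 4.0), ("b", 5.0)]
    print(chunked_group_sum(sample, chunk_size=2))
```

This pattern only works directly for aggregates that can be merged from partial results, such as sums and counts; something like a median would need a different approach.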
Chitharanjan Billa
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
It needs better user interface (UI) functionality.