Apache Hadoop Room for Improvement

Arul Mani
Based on our needs, we would like to see a tool for data visualization, an enhanced Ambari for management, and a pre-built IoT hub/model. These would reduce the effort and time needed to prove to a customer that this will help them.
User at a comms service provider with 1,001-5,000 employees
We are using HDTM circuit boards, and I worry about the future of this product and its compatibility with future releases. It's a concern because, for now, we do not have a clear upgrade path. Hadoop is now at version three and we'd like to upgrade to it, but as far as I know, that's not a simple thing. Many of the features in this product are open source, so if something isn't included with the distribution, we are not limited: we can take things from the internet and integrate them. As far as I know, we are using Presto, which isn't included in HDP (Hortonworks Data Platform), and it works fine. Not everything has to be included in the release. If something is outside of HDP and it works, that is good enough for me. We have the flexibility to incorporate it ourselves.
Samuel Feinberg
User at a consultancy with 10,001+ employees
In general, Hadoop has a lot of different component parts to the platform, such as Hive and HBase, and they're all moving somewhat independently and somewhat in parallel. As you look to platforms in the cloud or to walled-garden concepts, like Cloudera or Azure, you see that a third party can make sure all the components work together before they are used for business purposes. That removes a layer of administration, configuration, and technical support. I would like to see more direct integration of visualization applications.
Find out what your peers are saying about Apache, Pivotal, Snowflake Computing and others in Data Warehouse. Updated: February 2020.
398,890 professionals have used our research since 2012.
User at a tech vendor with 501-1,000 employees
Hadoop itself is quite complex, especially if you want it running on a single machine, so getting it set up is a big mission. It seems that Hadoop is on its way out and Spark is the way to go. You can run Spark on a single machine and it's easier to set up. In the next release, I would like to see Hive be more responsive for smaller queries, with reduced latency. I don't know that this is viable, but if it is possible, I would like lower latency on the smaller queries used for analysis and analytics. I would also like a smaller version that can be run on a local machine. There are installations that do that, but they are quite difficult, so a smaller version that is easy to install and explore would be an improvement.
Yevgen Manzhulyanov
What needs improvement depends on the customer and the use case. We consider classical Hadoop, for example, an old variant. Most now work with flash data. There is a very wide range of applications for this solution, but in enterprise companies that work with classical BI systems, it would be good to include an additional presentation layer for BI solutions. There is a lack of virtualization and presentation layers, so you can't take it and implement it like a ready-made solution.
User at RBSG Internet Operations
We're finding vulnerabilities in running it 24/7. We're experiencing some downtime that affects the data. It would be good to have more advanced analytics tools.
Naveen Karnam
Software Architect at Self-employed
At the beginning, MapReduce (MR) jobs on Hive made me think we should drop down to plain Hadoop MR jobs to have better control of the data. But later, Hive as a platform improved considerably. I still think a Spark-type layer on top gives you an edge over having only Hive.
User at a tech services company with 11-50 employees
It could be because the solution is open source, and therefore not funded the way products from bigger companies are, but we find that the solution runs slowly. The solution isn't as mature as SQL or Oracle and therefore lacks many features. It also needs a more effective GUI to create a better user environment.
We would like more dynamic ways of merging this machine data with other internal data so that we can get more meaning out of it.
Abhik Ray
User at a tech services company with 201-500 employees
It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake.
Chitharanjan Billa
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
It needs better user interface (UI) functionality.