Apache Hadoop Room for Improvement

Arul Mani
CEO
Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them.
ITexp677
IT Expert at a comms service provider with 1,001-5,000 employees
We are using HDTM circuit boards, and I worry about the future of this product and compatibility with future releases. It's a concern because, for now, we do not have a clear path to upgrade. The Hadoop product is in version three and we'd like to upgrade to the third version. But as far as I know, it's not a simple thing. There are a lot of features in this product that are open-source. If something isn't included with the distribution, we are not limited. We can take things from the internet and integrate them. For example, we are using Presto, which isn't included in HDP (Hortonworks Data Platform), and it works fine. Not everything has to be included in the release. If something is outside of HDP and it works, that is good enough for me. We have the flexibility to incorporate it ourselves.
Samuel Feinberg
Analytics Platform Manager at a consultancy with 10,001+ employees
In general, Hadoop has a lot of different component parts to the platform - things like Hive and HBase - and they're all moving somewhat independently and somewhat in parallel. I think as you look to platforms in the cloud or into walled-garden concepts, like Cloudera or Azure, you see that the third party can make sure all the components work together before they are used for business purposes. That reduces a layer of administration, configuration, and technical support. I would like to see more direct integration of visualization applications.
Find out what your peers are saying about Apache, Pivotal, Snowflake Computing and others in Data Warehouse. Updated: September 2019.
371,355 professionals have used our research since 2012.
reviewer860583
Data Scientist at a tech vendor with 501-1,000 employees
Hadoop itself is quite complex, especially if you want it running on a single machine, so getting it set up is a big mission. It seems that Hadoop is on its way out and Spark is the way to go. You can run Spark on a single machine and it's easier to set up. In the next release, I would like to see Hive be more responsive for smaller queries, with reduced latency. I don't think that this is viable, but if it is possible, lower latency on smaller queries for analysis and analytics would help. I would also like a smaller version that can be run on a local machine. There are installations that do that, but they are quite difficult, so I would say a smaller version that is easy to install and explore would be an improvement.
Naveen Karnam
Software Architect at a tech services company with 10,001+ employees
At the beginning, MRs on Hive made me think we should get down to Hadoop MRs to have better control of the data. But later, Hive as a platform upgraded very well. I still think a Spark-type layer on top gives you an edge over having only Hive.
MahalingamShanmugam
User
We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.
Chitharanjan Billa
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
It needs better user interface (UI) functionality.