We performed a comparison between Apache Hadoop and VMware Tanzu Greenplum based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"Hadoop File System is compatible with almost all the query engines."
"Apache Hadoop can manage large amounts and volumes of data with relative ease, which is a feature that is beneficial."
"High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization."
"The performance is pretty good."
"What I like about Apache Hadoop is that it's for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most because some solutions allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies."
"Most valuable features are HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect."
"As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"With VMware Tanzu Greenplum, one can make a huge database table and analyze the queries by adding in the SQL command. Some hint or command for the query goes over the multi-parallel execution."
"A very good, open-source platform."
"Tanzu Greenplum's most valuable features include the integration of modern data science approaches across an MPP platform."
"Scalable (Massive) Parallel Processing (MPP) – The ability to bring to bear large amounts of compute against large data sets with Greenplum and the EMC DCA has proven itself to be very effective."
"Pivotal Greenplum's shared-nothing architecture."
"It's super easy to deploy and it also supports different languages and analytics."
"The parallel load features mean that Greenplum is capable of high-volume data loading in parallel to all of the cluster segments, which is really valuable."
"Helps us to achieve large-scale analytics."
"General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error."
"It could be more user-friendly."
"The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support."
"We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it."
"The upgrade path should be improved because it is not as easy as it should be."
"The solution is very expensive."
"It requires a great deal of learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities are overlapping, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it."
"The solution is not easy to use. The solution should be easy to use and suitable for almost any case connected with the use of big data or working with large amounts of data."
"If you have a user consuming a huge load of resources, it takes down the entire system."
"VMware Tanzu Greenplum needs improvement in the memory area and improved methods for quick access to the disc. So, one of the quick goals of Greenplum must work on enhancing access to the disc by adding hints in the database."
"Maintenance is time-consuming."
"The installation is difficult and should be made easier."
"They need to interact more with customers. They need to explain the features, especially when there are new releases of Greenplum. I know just from information I've found that it has other features: it can be used for analytics and for integration with Big Data and Hadoop. They need to focus on this part with the customer."
"They should add more analytics. Their documentation could also be improved so that I don't have to bother my co-workers and tech support so often."
"Lacks sufficient inbuilt machine-learning functions for complex use cases."
"Extra filters would be helpful."
Apache Hadoop is ranked 5th in Data Warehouse with 32 reviews, while VMware Tanzu Greenplum is ranked 9th in Data Warehouse with 36 reviews. Both are rated 7.8. The top reviewer of Apache Hadoop writes "A file system for data collection that contains needed information and files". The top reviewer of VMware Tanzu Greenplum writes "Very efficient at large scale analytics; lacks inbuilt machine-learning functions for complex use cases". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake and Amazon Redshift, whereas VMware Tanzu Greenplum is most compared with Oracle Exadata, Vertica, Oracle Database Appliance, Snowflake and Teradata.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.