We performed a comparison between Apache Hadoop and VMware Tanzu Greenplum based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The performance is pretty good."
"The most valuable feature is the database."
"As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R."
"Its integration is Hadoop's best feature because that allows us to support different tools in a big data platform."
"The solution is easy to expand. We haven't seen any issues with it in that sense. We've added 10 servers, and we've added two nodes. We've been expanding since we started using it, since we started out so small. Companies that need to scale shouldn't have a problem doing so."
"Data ingestion: It has rapid speed, if Apache Accumulo is used."
"One valuable feature is that we can download data."
"Since both Apache Hadoop and Amazon EC2 are elastic in nature, we can scale and expand on demand for a specific PoC, and scale down when it's done."
"With VMware Tanzu Greenplum, one can make a huge database table and analyze the queries by adding in the SQL command. Some hint or command for the query goes over the multi-parallel execution."
"Helps us to achieve large-scale analytics."
"We chose Greenplum because of the architecture in terms of clustering databases and being able to have, or at least utilize the resources that are sitting on a database."
"It works very well with large database queries."
"Tanzu Greenplum's most valuable features include the integration of modern data science approaches across an MPP platform."
"The parallel load features mean that Greenplum is capable of high-volume data loading in parallel to all of the cluster segments, which is really valuable."
"Scalable (Massive) Parallel Processing (MPP) – The ability to bring to bear large amounts of compute against large data sets with Greenplum and the EMC DCA has proven itself to be very effective."
"Very fast for query processing."
"The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks."
"The solution is not easy to use. The solution should be easy to use and suitable for almost any case connected with the use of big data or working with large amounts of data."
"There is a lack of virtualization and presentation layers, so you can't take it and implement it like a radio solution."
"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."
"It could be more user-friendly."
"In certain cases, the configurations for dealing with data skewness do not make any sense."
"The solution is very expensive."
"General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error."
"Some integration with other platforms like design tools, and ETL development tools, that will enable some advanced functionality, like fully down processing, etc."
"VMware Tanzu Greenplum needs improvement in the memory area and improved methods for quick access to the disc. So, one of the quick goals of Greenplum must work on enhancing access to the disc by adding hints in the database."
"We would like to see Greenplum maintain a closer relationship with and parity to features implemented in PostgreSQL."
"Tanzu Greenplum's compression for GPText could be made more efficient."
"They need to interact more with customers. They need to explain the features, especially when there are new releases of Greenplum. I know just from information I've found that it has other features: it can be used for analytics and for integration with Big Data and Hadoop. They need to focus on this part with the customer."
"I saw some limitation with respect to the column store, and removing this would be an improvement."
"Lacks sufficient inbuilt machine-learning functions for complex use cases."
"Maintenance is time-consuming."
Apache Hadoop is ranked 5th in Data Warehouse with 11 reviews, while VMware Tanzu Greenplum is ranked 9th in Data Warehouse with 6 reviews. Both Apache Hadoop and VMware Tanzu Greenplum are rated 7.8. The top reviewer of Apache Hadoop writes "Has good processing power and speed and is capable of handling large volumes of data and doing online analysis". On the other hand, the top reviewer of VMware Tanzu Greenplum writes "Very efficient at large scale analytics; lacks inbuilt machine-learning functions for complex use cases". Apache Hadoop is most compared with Microsoft Azure Synapse Analytics, Azure Data Factory, Oracle Exadata, Snowflake and Amazon Redshift, whereas VMware Tanzu Greenplum is most compared with Oracle Exadata, Oracle Database Appliance, Vertica, Snowflake and Teradata. See our Apache Hadoop vs. VMware Tanzu Greenplum report.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.