Compare Apache Hadoop vs. Vertica

Apache Hadoop is ranked 5th in Data Warehouse with 8 reviews while Vertica is ranked 4th in Data Warehouse with 7 reviews. Apache Hadoop is rated 7.6, while Vertica is rated 9.2. The top reviewer of Apache Hadoop writes "An inexpensive and flexible suite that helps users integrate varied legacy systems". On the other hand, the top reviewer of Vertica writes "Allows us to take volumes and process them at a very high speed". Apache Hadoop is most compared with Snowflake, VMware Tanzu Greenplum and Oracle Exadata, whereas Vertica is most compared with Snowflake, Apache Hadoop and Amazon Redshift. See our Apache Hadoop vs. Vertica report.
Apache Hadoop: 13,336 views | 11,258 comparisons
Vertica: 16,008 views | 8,979 comparisons
Find out what your peers are saying about Apache Hadoop vs. Vertica and other solutions. Updated: March 2020.
408,459 professionals have used our research since 2012.
Quotes From Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Pros
Apache Hadoop:
The most valuable features are powerful tools for ingestion, as data is in multiple systems.
The most valuable feature is the database.
It's good for storing historical data and handling analytics on a huge amount of data.
The ability to add multiple nodes without any restriction is the solution's most valuable aspect.
What comes with the standard setup is what we mostly use, but Ambari is the most important.
The best thing about this solution is that it is very powerful and very cheap.
The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.
Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing.

Vertica:
Vertica's most outstanding features are the compression rates achieved and the speed of access to high-volume data.
Allows us to take volumes and process them at a very high speed.
The performance is very good and the aggregate records are fast.
Eighty percent of the ETL operations have improved since implementing this solution.
It maximizes cloud economics for mission-critical big data analytical initiatives.
It maximizes cloud economics with Eon Mode by scaling cluster size to meet variable workload demands.
Bulk loads, batch loads, and micro-batch loads have made it possible for our organization to process near real-time ingestions and faster analytics.
Any novice user can tune Vertica queries with minimal training (or no training at all).

Cons
Apache Hadoop:
It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake.
It would be good to have more advanced analytics tools.
The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment.
There is a lack of virtualization and presentation layers, so you can't take it and implement it like a radio solution.
In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency.
The upgrade path should be improved because it is not as easy as it should be.
We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.
I would like to see more direct integration of visualization applications.

Vertica:
Support is an area where it could get better.
Promotion and marketing must be improved. Even though it is a very useful product at a very good price, it is not as "popular" as it should be.
When it is about to reach the maximum storage capacity, it becomes slow.
Fact-to-fact joins on multi-billion record tables perform poorly.
It needs integration with multiple clouds.
It should provide a GUI for data management and tuning.
Monitoring tools need to be lightweight. They should not take up heavy resources of the main server.
If you do not utilize the tuning tools like projections, encoding, partitions, and statistics, then performance and scalability will suffer.

Pricing and Cost Advice
Apache Hadoop:
This is a low-cost and powerful solution.

Vertica:
It is fast to purchase through the AWS Marketplace.
The pricing and licensing depend on the size of your environment and the zone where you want to implement.
Read the fine print carefully.

Answers from the Community
Morten Calisch
C Dove
Real User

I haven't used SQream personally. However, if you are only considering GPU-based RDBMSs, please check the following:
https://hackernoon.com/which-gpu-database-is-right-for-me-6ceef6a17505

Tristan Bergh
Real User

Your best DB for 40+ TB is Apache Spark, Drill and the Hadoop stack, in the cloud.

Use the public cloud provider's elastic store (S3, Azure BLOB, Google Drive) and then stand up Apache Spark on a cluster sized to run your queries within 20 minutes. Based on my experience (Azure BLOB store, Databricks, PySpark), you may need around 500 nodes with 32 GB each for reading 40 TB of data.

Costs can be contained by running your own clusters, but Databricks manages clusters for you.

I would recommend optimizing your 40TB data store into the Databricks delta format after an initial parse.
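
The sizing in the answer above (about 500 nodes of 32 GB for 40 TB, with queries finishing within 20 minutes) can be sanity-checked with some rough arithmetic. The sketch below is only a back-of-envelope estimate, not a benchmark: the function name `cluster_scan_estimate` is hypothetical, decimal units (1 TB = 1000 GB) are assumed, and it assumes a single full scan of the data spread evenly across nodes.

```python
def cluster_scan_estimate(data_tb, nodes, node_mem_gb, minutes):
    """Back-of-envelope figures for scanning `data_tb` terabytes once.

    Assumptions (not from the original answer): decimal units,
    one full pass over the data, and perfectly even distribution.
    """
    data_gb = data_tb * 1000              # 1 TB = 1000 GB (decimal)
    total_mem_gb = nodes * node_mem_gb    # aggregate cluster memory
    seconds = minutes * 60
    aggregate_gb_per_s = data_gb / seconds
    per_node_mb_per_s = aggregate_gb_per_s * 1000 / nodes
    return {
        "total_mem_tb": total_mem_gb / 1000,
        "mem_to_data_ratio": total_mem_gb / data_gb,
        "aggregate_gb_per_s": aggregate_gb_per_s,
        "per_node_mb_per_s": per_node_mb_per_s,
    }


# Figures quoted in the answer: 40 TB, 500 nodes, 32 GB each, 20 minutes.
estimate = cluster_scan_estimate(data_tb=40, nodes=500, node_mem_gb=32, minutes=20)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the cluster holds 16 TB of memory in total, so the 40 TB data set would be streamed from the object store rather than held in memory, and each node would need to sustain roughly 67 MB/s of read throughput to finish one pass in 20 minutes.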

Ranking
Apache Hadoop
Rank: 5th out of 31 in Data Warehouse
Views: 13,336
Comparisons: 11,258
Reviews: 8
Average Words per Review: 399
Avg. Rating: 7.6

Vertica
Rank: 4th out of 31 in Data Warehouse
Views: 16,008
Comparisons: 8,979
Reviews: 7
Average Words per Review: 199
Avg. Rating: 9.1
Top Comparisons
Compared 32% of the time.
Compared 14% of the time.
Compared 17% of the time.
Compared 13% of the time.
Compared 11% of the time.
Also Known As
Vertica: Micro Focus Vertica, HPE Vertica, HPE Vertica on Demand
Overview
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

Micro Focus Vertica is the most advanced SQL database analytics portfolio built from the very first line of code to address the most demanding Big Data analytics initiatives. Micro Focus Vertica delivers speed without compromise, scale without limits, and the broadest range of consumption models. Choose Vertica on premise, on demand, in the cloud, or on Hadoop. With support for all leading BI and visualization tools, open source technologies like Hadoop and R, and built-in analytical functions, Vertica helps you derive more value from your Enterprise Data Warehouse and data lakes and get to market faster with your analytics initiatives.

To learn more about Micro Focus Vertica Advanced Analytics, visit our website.

Sample Customers
Apache Hadoop: Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
Vertica: Cerner, Game Show Network Game, Guess by Marciano, Supercell, Etsy, Nascar, Empirix, adMarketplace, and Cardlytics.
Top Industries
VISITORS READING REVIEWS
Software R&D Company: 36%
Comms Service Provider: 15%
Financial Services Firm: 13%
Government: 7%
REVIEWERS
Media Company: 21%
Software R&D Company: 21%
Marketing Services Firm: 17%
Comms Service Provider: 14%
VISITORS READING REVIEWS
Software R&D Company: 37%
Comms Service Provider: 20%
Media Company: 8%
Financial Services Firm: 5%
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.