Apache Hadoop vs Vertica Comparison 2024

Apache Hadoop

Read 34 Apache Hadoop reviews

2,387 views | 2,021 comparisons

Vertica

Read 83 Vertica reviews

3,967 views | 3,087 comparisons

Comparison Buyer's Guide

Apache Hadoop vs. Vertica

May 2024

Executive Summary

We performed a comparison between Apache Hadoop and Vertica based on real PeerSpot user reviews.

Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.

To learn more, read our detailed Apache Hadoop vs. Vertica Report (Updated: May 2024).

772,679 professionals have used our research since 2012.

Q&A Highlights

Question: Which is the best RDBMS solution for big data?

Answer: I haven't used SQream personally. However, if you are only considering GPU-based RDBMSs, please check the following: https://hackernoon.com/which-gpu-database-is-right-for-me-6ceef6a17505

Featured Review

Akhilesh Chipre

Senior Associate Consultant at Applied Materials

Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge

We primarily use Kafka for intensive data streaming. For batch-based processing, we use Hadoop. Additionally, we have our own custom batch catalog... Read more →

Anonymous User

Senior Technology Architect at a tech vendor

Great performance, stable and easy to use database

Vertica provides benefits to our customers and helps with their performance.

Quotes From Members

We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:

Pros

"The most valuable features are powerful tools for ingestion, as data is in multiple systems."

"The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable."

"Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of Hadoop and ecosystem components and tools, and managing it in Amazon EC2, we have created a Big Data "lab" which helps us to centralize all our work and solutions into a single repository. This has cut down the time in terms of maintenance, development and, especially, data processing challenges."

"The performance is pretty good."

"Apache Hadoop is crucial in projects that save and retrieve data daily. Its valuable features are scalability and stability. It is easy to integrate with the existing infrastructure."

"The tool's stability is good."

"We selected Apache Hadoop because it is not dependent on third-party vendors."

"High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization."

More Apache Hadoop Pros →

"Vertica is a great product because customers can compress and code data. The infrastructure that data warehouse solutions need is a commodity server so that customers don't have to invest in infrastructure."

"It maximizes cloud economics with Eon Mode by scaling cluster size to meet variable workload demands."

"The most valuable feature of Vertica is the unmatchable database performance."

"The feature of the product that is most important is the speed. I needed a columnar database, and its speed is what it's built to do, and so that's what really does differentiate Vertica from its competitors."

"The most valuable feature of Vertica is the ability to receive large aggregations at a very quick pace. The use case of subclusters is very good."

"Integrated R and geospatial functions are helping us improve efficiency and explore new revenue streams."

"The product's initial setup phase is extremely simple."

"The fast columnar store database structure allows our query times to be at least 10x faster than on any other database."

More Vertica Pros →

Cons

"The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment."

"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."

"The stability of the solution needs improvement."

"Real-time data processing is weak. This solution is very difficult to run and implement."

"The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks."

"What could be improved in Apache Hadoop is its user-friendliness. It's not that user-friendly, but maybe it's because I'm new to it. Sometimes it feels so tough to use, but it could be because of two aspects: one is my incompetency, for example, I don't know about all the features of Apache Hadoop, or maybe it's because of the limitations of the platform. For example, my team is maintaining the business glossary in Apache Atlas, but if you want to change any settings at the GUI level, an advanced level of coding or programming needs to be done in the back end, so it's not user-friendly."

"I would like to see more direct integration of visualization applications."

"The solution is very expensive."

More Apache Hadoop Cons →

"The documentation of Vertica is an area with shortcomings where improvements are required."

"Performance of management of metadata layer (database catalog) needs improvement. We still have to have smaller customers on PostgreSQL; Vertica cannot manage thousands of schemata."

"Vertica's native cloud support could be improved, and its installation could be made easier."

"In a future release, we would like to have artificial intelligence capabilities like neural networks. Customers are demanding this type of analytics."

"It needs integration with multiple clouds."

"We faced some challenges when trying to use the temporary tables feature."

"We are looking for a cheaper deployment for the solution. Although we did a lot of benchmarks, like Redshift. We tried Redshift, it didn't work. It didn't work out for us as well."

"The integration with AI has room for improvement."

More Vertica Cons →

Pricing and Cost Advice

"Do take into consideration that data storage and compute capacity scale differently, and hence purchasing a "boxed" / "all-in-one" solution (software and hardware) might not be the best idea."

"There are no licensing costs involved, hence money is saved on the software infrastructure."

"This is a low cost and powerful solution."

"The price of Apache Hadoop could be less expensive."

"If my company can use the cloud version of Apache Hadoop, particularly the cloud storage feature, it would be easier and would cost less because an on-premises deployment has a higher cost during storage, for example, though I don't know exactly how much Apache Hadoop costs."

"We don't directly pay for it. Our clients pay for it, and they usually don't complain about the price. So, it is probably acceptable."

"The price could be better. Hortonworks no longer exists, and Cloudera killed the free version of Hadoop."

"We just use the free version."

More Apache Hadoop Pricing and Cost Advice →

"Work with a vendor, if possible, and take advantage of more aggressive discounts at mid-fiscal year (April) and fiscal year-end (October)."

"It's free up to three nodes and 1TB, and then get in contact with their sales guys."

"Start with license per 1TB. Starting from hundreds of TB there is unlimited licensing to be considered. Move historical data to HDFS/S3 which are significantly cheaper or even free."

"The first TB is free and you can use all the Vertica features. After 1TB you have to pay for licensing. The product is worth it, but be aware of this condition, and plan. The compression ratio is explained in the documentation."

"I think it's starting to get a little expensive. Open source products are starting to get more robust, so I think that's something that they need to start looking at in terms of licensing."

"Read the fine print carefully."

"It is fast to purchase through the AWS Marketplace."

"The pricing and licensing depend on the size of your environment and the zone where you want to implement."

More Vertica Pricing and Cost Advice →

Answers from the Community

Anonymous User

Special Adviser Strategy at a university

Question: Which is the best RDBMS solution for big data?

Yuval Klein

SQreamDB is a GPU database. It is not suitable for real-time OLTP, of course.

Cassandra is best suited for OLTP database use cases, when you need a scalable database (instead of SQL Server or Postgres).
SQream is a GPU database suited for OLAP purposes. It is best suited for a very large data warehouse with very large queries that need massively parallel processing, since GPUs are great at massively parallel workloads.

Also, SQream is quite cheap since it needs only one server with a GPU card; the better the GPU card, the better, since we will have more CPU activity. It's only for a very big data warehouse, not for small ones.

Apr 19, 2020

Tristan Bergh

Your best DB for 40+ TB is Apache Spark, Drill and the Hadoop stack, in the cloud.

Use the public cloud provider's elastic store (S3, Azure BLOB, google drive) and then stand up Apache Spark on a cluster sized to run your queries within 20 minutes. Based on my experience (Azure BLOB store, Databricks, PySpark) you may need around 500 32GB nodes for reading 40 TB of data.

Costs can be contained by running your own clusters but Databricks manage clusters for you.

I would recommend optimizing your 40TB data store into the Databricks delta format after an initial parse.

Jan 28, 2020
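
As a rough check on the sizing estimate in the answer above (around 500 nodes of 32 GB for 40 TB of data), the sketch below works out the implied aggregate memory and the data-to-memory ratio. It assumes that "32GB" refers to memory per node and uses decimal units (1 TB = 1,000 GB); the function names are illustrative and are not part of any Spark or Databricks API.

```python
def cluster_memory_tb(nodes: int, gb_per_node: int) -> float:
    """Aggregate cluster memory in decimal TB (assumes 1 TB = 1,000 GB)."""
    return nodes * gb_per_node / 1000


def data_to_memory_ratio(data_tb: float, nodes: int, gb_per_node: int) -> float:
    """How many times larger the data set is than total cluster memory."""
    return data_tb / cluster_memory_tb(nodes, gb_per_node)


if __name__ == "__main__":
    # Figures quoted in the answer: 500 nodes, 32 GB each, 40 TB of data.
    print(cluster_memory_tb(500, 32))         # aggregate memory in TB
    print(data_to_memory_ratio(40, 500, 32))  # data size relative to memory
```

Under these assumptions the cluster holds about 16 TB of memory in total, so the 40 TB data set is roughly 2.5 times larger than what fits in memory at once. Actual requirements depend on the file format, compression, and the queries being run.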

Russell Rothstein (PeerSpot)

Morten, the most popular comparisons of SQream can be found here: www.itcentralstation.com
The top ones include Cassandra, MemSQL, MongoDB, and Vertica.

Jan 27, 2020

See all 4 answers »

Questions from the Community

What do you like most about Apache Hadoop?

Top Answer: It's primarily open source. You can handle huge data volumes and create your own views, workflows, and tables. I can also use it for real-time data streaming.

Read all 25 answers →

What is your experience regarding pricing and costs for Apache Hadoop?

Top Answer: We just use the free version.

Read all 8 answers →

What needs improvement with Apache Hadoop?

Top Answer: Since it is an open-source product, there won't be much support. So, you have to have deeper knowledge. You need to improvise based on that.

Read all 25 answers →

What do you like most about Vertica?

Top Answer: Vertica is easy to use and provides really high performance, stability, and scalability.

Read all 21 answers →

What is your experience regarding pricing and costs for Vertica?

Top Answer: Vertica has a perpetual license, but they are currently trying to convert all those licenses to subscription-based licenses on a yearly basis.

Read all 12 answers →

What needs improvement with Vertica?

Top Answer: Vertica's native cloud support could be improved, and its installation could be made easier. It's possible to deploy the solution on different hyperscalers, but it's not an easy process. Vertica is an… more »

Read all 21 answers →

Ranking

Apache Hadoop
Ranking: 6th out of 35 in Data Warehouse
Views: 2,387
Comparisons: 2,021
Reviews: 34
Average Words per Review: 530
Rating: 7.8

Vertica
Ranking: 4th out of 35 in Data Warehouse
Views: 3,967
Comparisons: 3,087
Reviews: 83
Average Words per Review: 377
Rating: 8.0

Comparisons

Azure Data Factory vs. Apache Hadoop

Compared 19% of the time.

Microsoft Azure Synapse Analytics vs. Apache Hadoop

Compared 16% of the time.

Oracle Exadata vs. Apache Hadoop

Compared 12% of the time.

Snowflake vs. Apache Hadoop

Compared 9% of the time.

SAP IQ vs. Apache Hadoop

Compared 2% of the time.

More Apache Hadoop Competitors →

Snowflake vs. Vertica

Compared 18% of the time.

SQL Server vs. Vertica

Compared 13% of the time.

Amazon Redshift vs. Vertica

Compared 10% of the time.

Teradata vs. Vertica

Compared 10% of the time.

BigQuery vs. Vertica

Compared 7% of the time.

More Vertica Competitors →

Also Known As

Micro Focus Vertica, HPE Vertica, HPE Vertica on Demand

Learn More

Apache

OpenText

Overview

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
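
The "simple programming models" mentioned above include MapReduce, in which a map step emits key-value pairs and a reduce step aggregates the values for each key. The following is a minimal single-process sketch of that idea in plain Python, not Hadoop's actual API. Function names such as map_phase, shuffle, and reduce_phase are hypothetical, and a real cluster would run these phases in parallel across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on in-memory lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big clusters", "data warehouse"]
    print(word_count(sample))
```

Because each map call looks at only one line and each reduce call at only one key, the work can in principle be split across machines, which is what makes the model a fit for the distributed processing described in the overview.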

Vertica is a deploy-anywhere SQL database created for elasticity, speed, and advanced analytics. Vertica enables today’s busy teams to modernize their data warehouses, democratize data and analytics to enable increased access, and deploy analytics in a hybrid cloud environment. Additionally, Vertica merges how companies power their analytics by providing a scalable, open, and elastic database with numerous intuitive features.

In today’s marketplace, data volumes continue to grow rapidly, and the broader use of analytics by citizen data scientists is causing many companies to re-examine their systems in order to meet the demands of an aggressive marketplace. Analytics is evolving swiftly and continually. New data from social media, blogs, IoT sources, data streams, gas and electrical grids, and mobile networks is constantly being gathered in extensive data sets. This presents organizations with a new opportunity to become more data driven, and they must be able to manage the new data growth and identify the trends and sequences that can lead to both improved business opportunities and continued repeat business from their clients.

Vertica Benefits:

Some of Vertica's most useful benefits include:

Efficiency: Vertica provides robust compression and intuitive impressions. As a result, users require significantly less storage and hardware than with other comparable data analytics solutions. The progressive Vertica architecture results in queries that are 10-50 times faster than on other platforms while storing more data per server.
Integration: Each new iteration of Vertica is tested and certified with the latest ETL and visualization tools. It actively supports Java Database Connectivity (JDBC), Open Database Connectivity (ODBC), and popular SQL providers. All these solutions and most leading BI and visualization tools interact seamlessly, making Vertica a very cost-effective solution overall and a solid business investment.
Cloud flexibility: With Vertica, users do not have to get locked into a single cloud vendor. Users are able to take complete advantage of the infrastructure that is already in place. Vertica seamlessly integrates with popular public clouds, including Google Cloud Platform (GCP), Azure, AWS, Alibaba, VMware clouds, and more. It also provides for easy portability across on-premises and multi-cloud environments and data lakes. Vertica provides a robust, flexible platform for running a company’s analytical and computing workloads, which allows applications to run simultaneously in numerous environments in a hybrid cloud infrastructure. Vertica is able to seamlessly use public clouds and private data centers, and it grants the flexibility to switch in an instant.
Security: Vertica offers dynamic end-to-end security with support for partner solutions and industry-standard protocols such as Apache Sentry, AWS IAM, Kerberos, LDAP, and more. Vertica utilizes an intuitive layered security model that provides multiple authentication and authorization mechanisms. Vertica also maintains an audit trail, which can be natively exported to other security domains for analysis and persistence.

Reviews from Real Users

“I am using Vertica for aggregations and dashboards. The most valuable feature of Vertica is the ability to receive large aggregations at a very quick pace. The use case of subclusters is very good.” - Bijal S., Group Chief Technology Officer at Netcore Solutions

“The hardware usage and speed has been the most valuable feature of this solution. It is very fast and has saved us a lot of money.” - Munkhsaikhan B., Project Lead - Digital Transformation Unit at Bodi Electronics LLC

Sample Customers

Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab

Cerner, Game Show Network Game, Guess by Marciano, Supercell, Etsy, Nascar, Empirix, adMarketplace, and Cardlytics.

Top Industries

REVIEWERS
Financial Services Firm: 35%
Comms Service Provider: 24%
Hospitality Company: 6%
Consumer Goods Company: 6%

VISITORS READING REVIEWS
Financial Services Firm: 29%
Computer Software Company: 11%
University: 6%
Manufacturing Company: 5%

REVIEWERS
Computer Software Company: 19%
Media Company: 17%
Marketing Services Firm: 14%
Comms Service Provider: 11%

VISITORS READING REVIEWS
Financial Services Firm: 18%
Computer Software Company: 16%
Manufacturing Company: 8%
Comms Service Provider: 5%

Company Size

REVIEWERS
Small Business: 33%
Midsize Enterprise: 19%
Large Enterprise: 47%

VISITORS READING REVIEWS
Small Business: 15%
Midsize Enterprise: 11%
Large Enterprise: 74%

REVIEWERS
Small Business: 32%
Midsize Enterprise: 26%
Large Enterprise: 42%

VISITORS READING REVIEWS
Small Business: 21%
Midsize Enterprise: 14%
Large Enterprise: 66%

Apache Hadoop is ranked 6th in Data Warehouse with 34 reviews, while Vertica is ranked 4th in Data Warehouse with 83 reviews. Apache Hadoop is rated 7.8, while Vertica is rated 8.2. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Vertica writes "A user-friendly tool that needs to improve its documentation part". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake, and SAP IQ, whereas Vertica is most compared with Snowflake, SQL Server, Amazon Redshift, Teradata, and BigQuery. See our Apache Hadoop vs. Vertica report.

See our list of best Data Warehouse vendors and best Cloud Data Warehouse vendors.

We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.