Apache Hadoop vs Vertica comparison

Apache Hadoop: 2,630 views | 2,223 comparisons | 89% willing to recommend
Vertica (OpenText): 4,276 views | 3,331 comparisons | 90% willing to recommend
Comparison Buyer's Guide
Executive Summary

We performed a comparison between Apache Hadoop and Vertica based on real PeerSpot user reviews.

Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
To learn more, read our detailed Apache Hadoop vs. Vertica Report (Updated: March 2024).
768,857 professionals have used our research since 2012.
Q&A Highlights
Question: Which is the best RDBMS solution for big data?
Answer: I haven't used SQream personally. However, if you are only considering GPU-based RDBMSs, please check the following: https://hackernoon.com/which-gpu-database-is-right-for-me-6ceef6a17505
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
"Its integration is Hadoop's best feature because that allows us to support different tools in a big data platform.""The scalability of Apache Hadoop is very good.""It is a file system for data collection. There are nodes in this cluster that contain all the information, directories, and other files. The nodes are based on the MySQL database.""The most valuable feature is the database.""The ability to add multiple nodes without any restriction is the solution's most valuable aspect.""It's open-source, so it's very cost-effective.""​​Data ingestion: It has rapid speed, if Apache Accumulo is used.""Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."


"I like the projection feature, which increases query performance.""It's the fastest database I have ever tested. That's the most important feature of Vertica.""Its projections and encoding are excellent tools for tuning large volumes.""DBAs don’t need to add a partition every month/quarter like with other DBs.""Allows us to take volumes and process them at a very high speed.""I appreciate the flexibility offered by Vertica's projections. It allows for modifying the primary projection without altering the tables, which helps to optimize queries without the need to modify the underlying data.""It maximize cloud economics for mission-critical big data analytical initiatives.""Vertica has a few features that I like. From an architecture standpoint, they have separated compute and storage. So you have low-cost object storage for primary storage and the ability to have several sub-clusters working off the same ObjectStore. So it provides workload isolation."


Cons
"Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them.""The price could be better. I think we would use it more, but the company didn't want to pay for it. Hortonworks doesn't exist anymore, and Cloudera killed the free version of Hadoop.""It requires a great deal of learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities are overlapping, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it.""I think more of the solution needs to be focused around the panel processing and retrieval of data.""The solution is not easy to use. The solution should be easy to use and suitable for almost any case connected with the use of big data or working with large amounts of data.""The solution is very expensive.""The integration with Apache Hadoop with lots of different techniques within your business can be a challenge.""In certain cases, the configurations for dealing with data skewness do not make any sense."


"It should provide a GUI interface for data management and tuning.""Vertica can improve automation and documentation. Additionally, the solution can be simplified.""The geospatial functionality could be designed better.""The biggest problem is the cost of cloud deployment.""When it is about to reach the maximum storage capacity, it becomes slow.""If you do not utilize the tuning tools like projections, encoding, partitions, and statistics, then performance and scalability will suffer.""Documentation has become much better, but can always use some improvement.""The documentation of Vertica is an area with shortcomings where improvements are required."


Pricing and Cost Advice
  • "Do take into consider that data storage and compute capacity scale differently and hence purchasing a "boxed" / 'all-in-one" solution (software and hardware) might not be the best idea."
  • "​There are no licensing costs involved, hence money is saved on the software infrastructure​."
  • "This is a low cost and powerful solution."
  • "The price of Apache Hadoop could be less expensive."
  • "If my company can use the cloud version of Apache Hadoop, particularly the cloud storage feature, it would be easier and would cost less because an on-premises deployment has a higher cost during storage, for example, though I don't know exactly how much Apache Hadoop costs."
  • "We don't directly pay for it. Our clients pay for it, and they usually don't complain about the price. So, it is probably acceptable."
  • "The price could be better. Hortonworks no longer exists, and Cloudera killed the free version of Hadoop."
  • "We just use the free version."

  • "Work with a vendor, if possible, and take advantage of more aggressive discounts at mid-fiscal year (April) and fiscal year-end (October).​"
  • "It's free up to three nodes and 1TB, and then get in contact with their sales guys."
  • "Start with license per 1TB. Starting from hundreds of TB there is unlimited licensing to be considered. Move historical data to HDFS/S3 which are significantly cheaper or even free."
  • "The first TB is free and you can use all the Vertica features. After 1TB you have to pay for licensing. The product is worth it, but be aware of this condition, and plan. The compression ratio is explained in the documentation."
  • "I think it's starting to get a little expensive. Open source products are starting to get more robust, so I think that's something that they need to start looking at in terms of licensing."
  • "Read the fine print carefully."
  • "It is fast to purchase through the AWS Marketplace."
  • "The pricing and licensing depend on the size of your environment and the zone where you want to implement."

    Answers from the Community
    Anonymous User
    Yuval Klein
    Real User

    SQreamDB is a GPU database. It is, of course, not suitable for real-time OLTP.

    Cassandra is best suited for OLTP use cases where you need a scalable database (instead of SQL Server or Postgres).
    SQream is a GPU database suited for OLAP purposes. It is best suited to a very large data warehouse with very large queries that need massively parallel execution, since GPUs are great at massively parallel workloads.

    Also, SQream is quite cheap, since you need only one server with a GPU card; the better the GPU card, the better, since we will have more CPU activity. It's only for a very big data warehouse, not for small ones.

    Tristan Bergh
    Real User

    Your best DB for 40+ TB is Apache Spark, Drill and the Hadoop stack, in the cloud.

    Use the public cloud provider's elastic store (S3, Azure BLOB, google drive) and then stand up Apache Spark on a cluster sized to run your queries within 20 minutes. Based on my experience (Azure BLOB store, Databricks, PySpark) you may need around 500 32GB nodes for reading 40 TB of data.

    Costs can be contained by running your own clusters but Databricks manage clusters for you.

    I would recommend optimizing your 40TB data store into the Databricks delta format after an initial parse.
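
The sizing quoted in this answer (around 500 nodes of 32 GB for 40 TB of data) can be sanity-checked with simple arithmetic. The sketch below is only an illustration: it assumes decimal units (1 TB = 1,000 GB) and that "32GB" refers to memory per node, neither of which the reviewer states, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the cluster sizing quoted above.
# Assumptions (not stated by the reviewer): decimal units, and
# "32GB nodes" means 32 GB of memory per node.

GB_PER_TB = 1_000


def cluster_memory_tb(nodes: int, gb_per_node: int) -> float:
    """Aggregate memory across the whole cluster, in TB."""
    return nodes * gb_per_node / GB_PER_TB


def data_per_node_gb(data_tb: float, nodes: int) -> float:
    """Share of the data set each node has to read, in GB."""
    return data_tb * GB_PER_TB / nodes


if __name__ == "__main__":
    print(cluster_memory_tb(500, 32))  # aggregate cluster memory in TB
    print(data_per_node_gb(40, 500))   # data read per node in GB
```

Under these assumptions the cluster holds 16 TB of memory in aggregate and each node reads about 80 GB, so the full 40 TB would not fit in memory at once and would be processed in passes.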

    Russell Rothstein (PeerSpot)
    Vendor

    Morten, the most popular comparisons of SQream can be found here: www.itcentralstation.com
    The top ones include Cassandra, MemSQL, MongoDB, and Vertica.

    Questions from the Community
    Top Answer: Tools like Apache Hadoop are knowledge-intensive in nature. Unlike other tools in the market currently, we cannot understand knowledge-intensive products straight away. To use Apache Hadoop, a person…
    Top Answer: The product's initial setup phase is extremely simple.
    Top Answer: In my opinion, nothing needs improvement in the solution as it is a great product. The documentation of Vertica is an area with shortcomings where improvements are required. Vertica needs to increase…
    Ranking
                                Apache Hadoop                     Vertica
    Ranking                     5th out of 34 in Data Warehouse   4th out of 34 in Data Warehouse
    Views                       2,630                             4,276
    Comparisons                 2,223                             3,331
    Reviews                     11                                10
    Average Words per Review    532                               353
    Rating                      8.0                               8.3
    Comparisons
    Snowflake: compared 18% of the time.
    SQL Server: compared 15% of the time.
    Amazon Redshift: compared 11% of the time.
    Teradata: compared 10% of the time.
    SingleStore: compared 1% of the time.
    Also Known As
    Micro Focus Vertica, HPE Vertica, HPE Vertica on Demand
    Overview
    The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
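
The "simple programming models" mentioned above are most commonly associated with MapReduce, one of Hadoop's core modules. The following is a minimal single-process sketch of that model in plain Python. It does not use any Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

In a real cluster the map and reduce steps run in parallel on many machines and the framework handles the shuffle and any node failures; this sketch only shows the shape of the computation.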

    Vertica is a deploy-anywhere SQL database created for elasticity, speed, and advanced analytics. Vertica enables today’s busy teams to modernize their data warehouses, democratize data and analytics to enable increased access, and deploy analytics in a hybrid cloud environment. Additionally, Vertica merges how companies power their analytics by providing a scalable, open, and elastic database with numerous intuitive features.

    In today’s marketplace, organizations are experiencing continued robust growth of data volumes, and citizen data scientists’ broader use of analytics is causing many companies to revisit and re-examine their systems in order to match the demands of an aggressive marketplace. Analytics is evolving swiftly. New data from social media, blogs, IoT sources, data streams, gas and electrical grids, and mobile networks is constantly being gathered in extensive data sets. This presents organizations with a new opportunity to become more data driven, and they must be able to manage the new data growth and identify the trends and sequences that can lead to both improved business opportunities and continued repeat business from their clients.

    Vertica Benefits:

    Vertica has many valuable key benefits. Some of its most useful benefits include:

    • Efficiency:  Vertica provides robust compression and intuitive impressions. This results in users requiring significantly less storage and hardware than other comparable data analytics solutions. The progressive Vertica architecture results in queries that are 10-50 times faster than other platforms while providing more storage data per server.
    • Integration: Each new iteration of Vertica is tested and certified with the latest ETL and visualization tools. It actively supports Java Database Connectivity (JDBC), Open Database Connectivity (ODBC), and popular SQL providers. All these solutions and most leading BI and visualization tools interact seamlessly, making Vertica overall a very cost-effective solution and solid business investment.
    • Cloud flexibility: With Vertica, users do not have to get locked into a single cloud vendor. Users are able to take complete advantage of the current infrastructure that is already in place. Vertica seamlessly integrates with popular public clouds, including Google Cloud Platform (GCP), Azure, AWS, Alibaba, VMware clouds, and more. It also provides for easy portability across on-premise and multi-cloud environments and data lakes. Vertica designs a robust flexible platform for running a company’s analytical and computing workloads, which allows applications to run simultaneously on numerous environments in a hybrid cloud infrastructure. Vertica is able to seamlessly use public clouds and private data centers, and it grants the flexibility to switch in an instant.
    • Security: Vertica offers dynamic end-to-end security with support for partner solutions and industry-standard protocols such as Apache Sentry, AWS IAM, Kerberos, LDAP, and more. Vertica utilizes an intuitive layered security model that provides multiple authentication and authorization mechanisms. Vertica also maintains an audit trail, which can be natively exported to other security domains for analysis and persistence.

    Reviews from Real Users

    “I am using Vertica for aggregations and dashboards. The most valuable feature of Vertica is the ability to receive large aggregations at a very quick pace. The use case of subclusters is very good.” - Bijal S., Group Chief Technology Officer at Netcore Solutions

    “The hardware usage and speed has been the most valuable feature of this solution. It is very fast and has saved us a lot of money.” - Munkhsaikhan B.,  Project Lead - Digital Transformation Unit at Bodi Electronics LLC

    Sample Customers
    Apache Hadoop: Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
    Vertica: Cerner, Game Show Network Game, Guess by Marciano, Supercell, Etsy, Nascar, Empirix, adMarketplace, and Cardlytics.
    Top Industries
    Apache Hadoop
    REVIEWERS
    Financial Services Firm: 38%
    Comms Service Provider: 25%
    Hospitality Company: 6%
    Consumer Goods Company: 6%
    VISITORS READING REVIEWS
    Financial Services Firm: 27%
    Computer Software Company: 10%
    Comms Service Provider: 6%
    University: 6%
    Vertica
    REVIEWERS
    Computer Software Company: 19%
    Media Company: 17%
    Marketing Services Firm: 14%
    Comms Service Provider: 11%
    VISITORS READING REVIEWS
    Financial Services Firm: 18%
    Computer Software Company: 15%
    Manufacturing Company: 8%
    Comms Service Provider: 6%
    Company Size
    Apache Hadoop
    REVIEWERS
    Small Business: 34%
    Midsize Enterprise: 23%
    Large Enterprise: 43%
    VISITORS READING REVIEWS
    Small Business: 15%
    Midsize Enterprise: 11%
    Large Enterprise: 75%
    Vertica
    REVIEWERS
    Small Business: 32%
    Midsize Enterprise: 26%
    Large Enterprise: 42%
    VISITORS READING REVIEWS
    Small Business: 20%
    Midsize Enterprise: 14%
    Large Enterprise: 66%

    Apache Hadoop is ranked 5th in Data Warehouse with 32 reviews, while Vertica is ranked 4th in Data Warehouse with 83 reviews. Apache Hadoop is rated 7.8, while Vertica is rated 8.2. The top reviewer of Apache Hadoop writes "A file system for data collection that contains needed information and files". On the other hand, the top reviewer of Vertica writes "A user-friendly tool that needs to improve its documentation part". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake and Oracle Big Data Appliance, whereas Vertica is most compared with Snowflake, SQL Server, Amazon Redshift, Teradata and SingleStore. See our Apache Hadoop vs. Vertica report.


    We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.