Compare Apache Hadoop vs. Snowflake

Apache Hadoop is ranked 4th in Data Warehouse with 11 reviews while Snowflake is ranked 1st in Data Warehouse with 10 reviews. Apache Hadoop is rated 7.6, while Snowflake is rated 8.2. The top reviewer of Apache Hadoop writes "We are able to ingest huge volumes/varieties of data, but it needs a data visualization tool and enhanced Ambari for management". On the other hand, the top reviewer of Snowflake writes "Fast, convenient and requires almost no administration". Apache Hadoop is most compared with Snowflake, Pivotal Greenplum and Oracle Exadata, whereas Snowflake is most compared with Apache Hadoop, Microsoft Azure SQL Data Warehouse and Amazon Redshift. See our Apache Hadoop vs. Snowflake report.
Apache Hadoop: 12,931 views | 11,053 comparisons
Snowflake: 18,513 views | 13,119 comparisons
Find out what your peers are saying about Apache Hadoop vs. Snowflake and other solutions. Updated: January 2020.
399,230 professionals have used our research since 2012.
Quotes From Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Pros
Apache Hadoop:
"The most valuable features are powerful tools for ingestion, as data is in multiple systems."
"The most valuable feature is the database."
"It's good for storing historical data and handling analytics on a huge amount of data."
"The ability to add multiple nodes without any restriction is the solution's most valuable aspect."
"What comes with the standard setup is what we mostly use, but Ambari is the most important."
"The best thing about this solution is that it is very powerful and very cheap."
"The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics."
"Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing."

Snowflake:
"I like the idea that you can assign roles and responsibilities, limiting access to data."
"It has great flexibility whenever we are loading data and performs ELT (extract, load, transform) techniques instead of ETL."
"The snapshot feature is good, the rollback feature is good and the interface is user-friendly."
"The thing I find most valuable is that scalability, space storage, and computing power is separate. When you scale up, it is live from one second to the next, constantly available as you scale, so there is no downtime or interruption of services."
"The initial setup is straightforward. You just need to follow the documentation."
"They separate compute and storage. You can scale storage independently of the computer, or you can scale computing independently of storage. If you need to buy more computer parts you can add new virtual warehouses in Snowflake. Similarly, if you need more storage, you take more storage. It's most scalable in the database essentially; typically you don't have this scalability independence on-premises."
"The most valuable features are the clustering, LS50, being able to change the size, the pay per use feature, the flexibility with many different sources and analytic applications."
"As long as you don't need to worry about the storage or cost, this solution would be one of the best ones on the market for scalability purposes."

Cons
Apache Hadoop:
"It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake."
"It would be good to have more advanced analytics tools."
"The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment."
"There is a lack of virtualization and presentation layers, so you can't take it and implement it like a radio solution."
"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."
"The upgrade path should be improved because it is not as easy as it should be."
"We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it."
"I would like to see more direct integration of visualization applications."

Snowflake:
"If you go with one cloud provider, you can't switch."
"They do have a native connector to connect with integration tools for loading data, but it would be much better to have the functionality built-in."
"Availability is a problem."
"Maybe there could be some more connectors to other systems, but this is what they are constantly developing anyway."
"The solution could improve the user interface and add functionality to the system."
"The solution should offer an on-premises version also. We have some requirements where we would prefer to use it as a template."
"Support needs improvement, as it can take several days before you get some initial support."
"There are some stored procedures that we've had trouble with. The solution also needs to fine-tune the connectors to be able to connect into the system source."

Pricing and Cost Advice
Apache Hadoop:
"This is a low cost and powerful solution."
"There are no licensing costs involved, hence money is saved on the software infrastructure."

Snowflake:
"You pay based on the data that you are storing in the data warehouse and there are no maintenance costs."
"The whole licensing system is based on credit points. You can also make a license agreement with the company so that you buy credit points and then you use them. What you do not use in one year can be carried over to the next year."
"Pricing can be confusing for customers."

Answers from the Community
Miriam Tover
Manish-Kapoor (Tata Consultancy Services)
Consultant

Interactive querying as a consumption pattern is something Snowflake handles much better than Hadoop and its related query engines (Impala, Presto, Drill, etc.). Heavy data-science query workloads can be an expensive pattern on Snowflake, and Hadoop can provide a more cost-efficient solution there. Hadoop also remains relevant as a back-end data processing engine, because using Snowflake for data transformation costs more and offers only limited procedural language capabilities (JavaScript-based stored procedures). Snowflake fares much better than Hadoop in terms of administrative complexity.

Sreenivasan Ramanujam
User

Apache Hadoop is for data lake use cases. But getting data out of Hadoop for meaningful analytics does need quite an amount of work, using Spark, Hive, Presto, and so on. The way I look at Snowflake and Hadoop is that they complement each other. For the data lake you can use Hadoop, and for the data warehouse companies can use Snowflake. Depending on the size of the company, you can turn Snowflake into a data lake use case too. Snowflake is SQL friendly, and you don't need to carry out any circus to get data in and out of Snowflake.

Ranking
Apache Hadoop: 4th out of 30 in Data Warehouse
Views: 12,931
Comparisons: 11,053
Reviews: 10
Average Words per Review: 427
Avg. Rating: 7.5

Snowflake: 1st out of 30 in Data Warehouse
Views: 18,513
Comparisons: 13,119
Reviews: 10
Average Words per Review: 642
Avg. Rating: 8.2
Top Comparisons
Compared 33% of the time.
Compared 26% of the time.
Compared 13% of the time.
Compared 23% of the time.
Compared 12% of the time.
Also Known As
Snowflake Computing
Learn
Apache
Snowflake Computing
Overview
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
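
The "simple programming models" mentioned above are most commonly associated with MapReduce. As a rough illustration only, here is a single-process sketch of the map, shuffle, and reduce steps in plain Python. It does not use any Hadoop API, and all function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical; on a real cluster each input split would be handled by a different machine.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(splits: List[List[str]]) -> Dict[str, int]:
    """Run map, shuffle, and reduce over a list of input splits.

    Each inner list stands in for a split that one node would process
    locally; here everything runs sequentially in one process.
    """
    pairs: List[Tuple[str, int]] = []
    for split in splits:
        for record in split:
            pairs.extend(map_phase(record))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input: two splits with one record each.
    example_splits = [["big data big clusters"], ["data lakes and data warehouses"]]
    print(word_count(example_splits))
```

The map and reduce functions hold no shared state, which is what lets a framework of this kind spread the work over many machines and rerun a failed piece without affecting the rest.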

Snowflake provides a data warehouse built for the cloud, delivering a solution capable of solving problems for which legacy, on-premises and cloud data platforms were not designed.

Sample Customers
Apache Hadoop: Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
Snowflake: Accordant Media, Adobe, Kixeye Inc., Revana, SOASTA, White Ops
Top Industries
VISITORS READING REVIEWS
Software R&D Company: 34%
Comms Service Provider: 15%
Financial Services Firm: 15%
Government: 7%
VISITORS READING REVIEWS
Software R&D Company: 39%
Financial Services Firm: 8%
Comms Service Provider: 7%
Retailer: 7%
Company Size
REVIEWERS
Small Business: 31%
Midsize Enterprise: 23%
Large Enterprise: 46%
REVIEWERS
Small Business: 20%
Midsize Enterprise: 20%
Large Enterprise: 60%
VISITORS READING REVIEWS
Small Business: 4%
Midsize Enterprise: 10%
Large Enterprise: 86%
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.