Compare Apache Hadoop vs. Snowflake

Apache Hadoop is ranked 4th in Data Warehouse with 7 reviews while Snowflake is ranked 7th in Data Warehouse with 5 reviews. Apache Hadoop is rated 7.6, while Snowflake is rated 8.2. The top reviewer of Apache Hadoop writes "We are able to ingest huge volumes/varieties of data, but it needs a data visualization tool and enhanced Ambari for management". On the other hand, the top reviewer of Snowflake writes "Stable with good technical support, but the solution is expensive on longrun". Apache Hadoop is most compared with Snowflake, Pivotal Greenplum and Oracle Exadata, whereas Snowflake is most compared with Apache Hadoop, Microsoft Azure SQL Data Warehouse and Amazon Redshift. See our Apache Hadoop vs. Snowflake report.
Apache Hadoop: 11,979 views | 10,438 comparisons
Snowflake: 15,169 views | 11,128 comparisons
Find out what your peers are saying about Apache Hadoop vs. Snowflake and other solutions. Updated: September 2019.
372,622 professionals have used our research since 2012.
Quotes From Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

What comes with the standard setup is what we mostly use, but Ambari is the most important.

The best thing about this solution is that it is very powerful and very cheap.

The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.

Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing.

Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of Hadoop and ecosystem components and tools, and managing it in Amazon EC2, we have created a Big Data "lab" which helps us to centralize all our work and solutions into a single repository. This has cut down the time in terms of maintenance, development and, especially, data processing challenges.

Since both Apache Hadoop and Amazon EC2 are elastic in nature, we can scale and expand on demand for a specific PoC, and scale down when it's done.

Most valuable features are HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect.

High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.

They separate compute and storage. You can scale storage independently of the computer, or you can scale computing independently of storage. If you need to buy more computer parts you can add new virtual warehouses in Snowflake. Similarly, if you need more storage, you take more storage. It's most scalable in the database essentially; typically you don't have this scalability independence on-premises.

The most valuable features are the clustering, LS50, being able to change the size, the pay per use feature, the flexibility with many different sources and analytic applications.

As long as you don't need to worry about the storage or cost, this solution would be one of the best ones on the market for scalability purposes.

The distributed architecture of Snowflake has the capacity to process huge datasets faster and allows us to scale up and down according to our needs.

In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency.

The upgrade path should be improved because it is not as easy as it should be.

We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.

I would like to see more direct integration of visualization applications.

Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them.

General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error.

It needs better user interface (UI) functionalities.

The solution should offer an on-premises version also. We have some requirements where we would prefer to use it as a template.

Support needs improvement, as it can take several days before you get some initial support.

There are some stored procedures that we've had trouble with. The solution also needs to fine-tune the connectors to be able to connect into the system source.

Snowflake has to improve their spatial parts since it doesn't have much in terms of geo-spatial queries.

Pricing and Cost Advice
This is a low cost and powerful solution.

There are no licensing costs involved, hence money is saved on the software infrastructure.

Pricing can be confusing for customers.

Answers from the Community
Miriam Tover

Snowflake handles interactive querying as a consumption pattern much better than Hadoop and its related query engines (Impala, Presto, Drill, etc.). Heavy data-science query workloads can be expensive on Snowflake, and Hadoop can provide a more cost-efficient solution for them. Hadoop is also still relevant as a back-end data processing engine, because using Snowflake for data transformation costs more and offers limited procedural language capabilities (JavaScript-based stored procedures). Snowflake fares much better than Hadoop in terms of administrative complexity.

26 June 19
Top Comparisons
Compared 31% of the time.
Compared 30% of the time.
Compared 13% of the time.
Compared 26% of the time.
Compared 15% of the time.
Also Known As
Snowflake Computing
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
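As a rough illustration of the "simple programming models" mentioned above, the sketch below imitates the map, shuffle and reduce steps of a word count in plain Python on a single machine. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for this example. On a real cluster the framework would run the map and reduce steps in parallel across many nodes and handle the grouping and failure recovery itself.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

The point of the model is that the author only writes the per-record map step and the per-key reduce step; distribution across machines is left to the framework.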

Snowflake provides a data warehouse built for the cloud, delivering a solution capable of solving problems for which legacy, on-premises and cloud data platforms were not designed.

Sample Customers
Apache Hadoop: Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
Snowflake: Accordant Media, Adobe, Kixeye Inc., Revana, SOASTA, White Ops
Top Industries
Software R&D Company: 30%
Financial Services Firm: 20%
Comms Service Provider: 11%

Software R&D Company: 32%
Financial Services Firm: 10%
Insurance Company: 8%
Comms Service Provider: 7%
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.