We performed a comparison between Apache Hadoop and Snowflake based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable."
"Hadoop is extensible — it's elastic."
"Most valuable features are HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect."
"It's good for storing historical data and handling analytics on a huge amount of data."
"The ability to add multiple nodes without any restriction is the solution's most valuable aspect."
"High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization."
"Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing."
"Apache Hadoop is crucial in projects that save and retrieve data daily. Its valuable features are scalability and stability. It is easy to integrate with the existing infrastructure."
"All the people who are working with Snowflake are extremely happy with it because it is designed from a data-warehousing point of view, not the other way around. You have a database and then you tweak it and then it becomes a data warehouse."
"The Mbps they have established is quite a bit faster than any other data warehouse."
"The best thing about Snowflake is its flexibility in changing warehouse sizes or computational power."
"Time travel is one feature that really helps us out."
"The overall ecosystem was easy to manage. Given that we weren't a very highly technical group, it was preferable to other things we looked at because it could do all of the cloud tunings. It can tune your data warehouse to an appropriate size for controlled billing, resume and sleep functions, and all such things. It was much more simple than doing native Azure or AWS development. It was stable, and their support was also perfect. It was also very easy to deploy. It was one of those rare times where they did exactly what they said they could do."
"The distributed architecture of Snowflake has the capacity to process huge datasets faster and allows us to scale up and down according to our needs."
"The product is quite fast."
"The product's most important feature is unloading data to S3."
"There is a lack of virtualization and presentation layers, so you can't take it and implement it like a radio solution."
"The upgrade path should be improved because it is not as easy as it should be."
"The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks."
"The integration with Apache Hadoop with lots of different techniques within your business can be a challenge."
"It could be more user-friendly."
"Real-time data processing is weak. This solution is very difficult to run and implement."
"We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it."
"The solution needs a better tutorial. There are only documents available currently. There's a lot of YouTube videos available. However, in terms of learning, we didn't have great success trying to learn that way. There needs to be better self-paced learning."
"Snowflake needs to improve its programming part. Though the tool has Snowpath, it doesn’t support all features like its competitor, Databricks. Snowflake doesn’t support external data ingestion capabilities. You need to have third-party tools for that. Also, the tool needs to incorporate data integration features in its future releases."
"Snowflake could improve migration. It should be made easier. It would be beneficial if it could offer some OLTP features. One of our customers was using Oracle for both data warehousing and OLTP workloads, and they were able to migrate their data warehousing workloads to Snowflake without major issues. However, for some of their OLTP requirements, such as needing a response time of fewer than 10 milliseconds for certain queries, Snowflake is currently unable to provide that."
"If you go with one cloud provider, you can't switch."
"I see room for improvement when it comes to credit performance. The other thing I'd like to be improved is the warehouse facility."
"Getting data out of the tool to third-party applications is difficult."
"Snowflake has to improve their spatial parts since it doesn't have much in terms of geo-spatial queries."
"I would like to see more transparency in data processing, ATLs, and compute areas - which should give more comfort to the end users."
"This solution could be improved by offering machine learning apps."
Apache Hadoop is ranked 5th in Data Warehouse with 34 reviews while Snowflake is ranked 1st in Data Warehouse with 94 reviews. Apache Hadoop is rated 7.8, while Snowflake is rated 8.4. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Snowflake writes "Good usability, good data sharing and elastic compute features, and requires less DBA involvement". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Teradata and BigQuery, whereas Snowflake is most compared with BigQuery, Azure Data Factory, Teradata, Vertica and Teradata Cloud Data Warehouse.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
Apache Hadoop is for data lake use cases. But getting data out of Hadoop for meaningful analytics does indeed need quite an amount of work, using Spark, Hive, Presto, and so on. The way I look at Snowflake and Hadoop is that they complement each other. For a data lake you can use Hadoop, and for a data warehouse companies can use Snowflake. Depending on the size of the company, you can turn Snowflake into a data lake use case too. Snowflake is SQL-friendly, and you don't need to carry out any circus to get data in and out of Snowflake.
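One of the Hadoop reviewers quoted above notes that the memory limitation "can be bypassed by processing the data in chunks." As a rough illustration of that general idea only, here is a minimal sketch in plain Python. It does not use any Hadoop, Hive, or Spark API; the function names `read_in_chunks` and `count_by_key` and the sample `sensor-…` records are hypothetical, chosen just for this example.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, List


def read_in_chunks(records: Iterable, chunk_size: int) -> Iterator[List]:
    """Yield successive lists holding at most chunk_size records each."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # input exhausted
        yield chunk


def count_by_key(records: Iterable[str], chunk_size: int = 1000) -> Counter:
    """Count occurrences of each key while holding one chunk in memory at a time."""
    totals: Counter = Counter()
    for chunk in read_in_chunks(records, chunk_size):
        totals.update(chunk)  # merge this chunk's counts into the running totals
    return totals


if __name__ == "__main__":
    # Hypothetical input: a lazily generated stream of record keys.
    events = (f"sensor-{i % 3}" for i in range(10))
    print(count_by_key(events, chunk_size=4))
```

Because the input is consumed lazily, only one chunk plus the running totals sits in memory at any point. This works for aggregations whose partial results can be merged, such as counts and sums; whether it applies to a given Hadoop workload depends on the query.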