We performed a comparison between Apache Hadoop and Snowflake based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"Since both Apache Hadoop and Amazon EC2 are elastic in nature, we can scale and expand on demand for a specific PoC, and scale down when it's done."
"High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization."
"Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of Hadoop and ecosystem components and tools, and managing it in Amazon EC2, we have created a Big Data "lab" which helps us to centralize all our work and solutions into a single repository. This has cut down the time in terms of maintenance, development and, especially, data processing challenges."
"The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable."
"The most valuable feature is the database."
"It's good for storing historical data and handling analytics on a huge amount of data."
"Apache Hadoop can manage large amounts and volumes of data with relative ease, which is a feature that is beneficial."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"The cloning functionality has been the most valuable. I have been able to completely copy databases. The data sharing concept is also useful. As compared to, for example, SAP, Snowflake is a lot more open, and it allows a lot more connectivity for other providers than an SAP ecosystem."
"The solution's computing time is less."
"The speed of data loading and being able to quickly create the environment are most valuable."
"It requires no maintenance on our part. They handle all that. The speed is phenomenal. The pricing isn't really anything more than what you would be paying for a SQL server license or another tool to execute the same thing. We have zero maintenance on our side to do anything and the speed at which it performs queries and loads the data is amazing. It handles unstructured data extremely well, too. So, if the data is in a JSON array or an XML, it handles that super well."
"I like the ability to work with a managed service on the cloud and that is easy to start with."
"The most valuable features are the clustering, LS50, being able to change the size, the pay per use feature, the flexibility with many different sources and analytic applications."
"As long as you don't need to worry about the storage or cost, this solution would be one of the best ones on the market for scalability purposes."
"Its performance is a big advantage. When you run a query, its performance is very good. The inbound and outbound share features are also very useful for sharing a particular database. By using these features, you can allow others to access the Snowflake database and query it, which is another advantage of this solution. It has good security, and we can easily integrate it. We can connect it with multiple source systems."
"The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support."
"The solution is very expensive."
"I think more of the solution needs to be focused around the panel processing and retrieval of data."
"In certain cases, the configurations for dealing with data skewness do not make any sense."
"From the Apache perspective or the open-source community, they need to add more capabilities to make life easier from a configuration and deployment perspective."
"The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment."
"Hadoop's security could be better."
"The price could be better. I think we would use it more, but the company didn't want to pay for it. Hortonworks doesn't exist anymore, and Cloudera killed the free version of Hadoop."
"Their strategy is just to leverage what you've got and put Snowflake in the middle. It does work well with other tools. You have to buy a separate reporting tool and a separate data loading tool, whereas, in some platforms, these tools are baked in. In the long-term, they'll need to add more direct partnerships to the ecosystem so that it's not like adding on tools around Snowflake to make it work. They can also consider including Snowflake native reporting tools versus partnering with other reporting tools. It would kind of change where they sit in the market."
"For the Snowflake database, there should be some third-party features for the ETL. It would also be good to be able to use some kind of controls to get the data either from another database or a flat file. Its price should be improved. It should be cheaper than Microsoft."
"Sometimes it can be tricky to manage multiple environments if you're purely using Snowflake as your scripting and pipeline environment."
"Snowflake needs to improve its programming part. Though the tool has Snowpark, it doesn’t support all features like its competitor, Databricks. Snowflake doesn’t support external data ingestion capabilities. You need to have third-party tools for that. Also, the tool needs to incorporate data integration features in its future releases."
"Snowflake has support for stored procedures, but it is not that powerful."
"I would like to see a client version of the GUI."
"Snowflake could improve migration. It should be made easier. It would be beneficial if it could offer some OLTP features. One of our customers was using Oracle for both data warehousing and OLTP workloads, and they were able to migrate their data warehousing workloads to Snowflake without major issues. However, for some of their OLTP requirements, such as needing a response time of fewer than 10 milliseconds for certain queries, Snowflake is currently unable to provide that."
"Snowflake could improve if they had an Operational Data Store (ODS) space."
Apache Hadoop is ranked 5th in Data Warehouse with 31 reviews while Snowflake is ranked 1st in Data Warehouse with 92 reviews. Apache Hadoop is rated 7.8, while Snowflake is rated 8.4. The top reviewer of Apache Hadoop writes "A file system for data collection that contains needed information and files". On the other hand, the top reviewer of Snowflake writes "Good usability, good data sharing and elastic compute features, and requires less DBA involvement". Apache Hadoop is most compared with Microsoft Azure Synapse Analytics, Azure Data Factory, Oracle Exadata, Teradata and BigQuery, whereas Snowflake is most compared with BigQuery, Azure Data Factory, Teradata, Vertica and Teradata Cloud Data Warehouse. See our Apache Hadoop vs. Snowflake report.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
Apache Hadoop is for data lake use cases, but getting data out of Hadoop for meaningful analytics takes quite a bit of work, using Spark, Hive, Presto, and so on. The way I look at Snowflake and Hadoop is that they complement each other: you can use Hadoop for the data lake and Snowflake for the data warehouse. Depending on the size of the company, you can use Snowflake for data lake use cases too. Snowflake is SQL friendly, and you don't need to jump through hoops to get data in and out of it.