We performed a comparison between Apache Hadoop and Snowflake based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The performance is pretty good."
"The scalability of Apache Hadoop is very good."
"Most valuable features are HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect."
"Hadoop is extensible — it's elastic."
"Apache Hadoop is crucial in projects that save and retrieve data daily. Its valuable features are scalability and stability. It is easy to integrate with the existing infrastructure."
"It's good for storing historical data and handling analytics on a huge amount of data."
"High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization."
"The tool's stability is good."
"The cloning functionality has been the most valuable. I have been able to completely copy databases. The data sharing concept is also useful. As compared to, for example, SAP, Snowflake is a lot more open, and it allows a lot more connectivity for other providers than an SAP ecosystem."
"Time travel is one feature that really helps us out."
"The most efficient way for real-time dashboards or analytical business intelligence reports to be sent to the customer."
"The solution is stable."
"Snowflake's most valuable features are data enrichment and flattening."
"The ability to share the data and the ability to scale up and down easily are the most valuable features. The concept of data sharing and data plumbing made it very easy to provide and share data. The ability to refresh your Dev or QA just by doing a clone is also valuable. It has the dynamic scale up and scale down feature. Development and deployment are much easier as compared to other platforms where you have to go through a lot of stuff. With a tool like DBT, you can do modeling and transformation within a single tool and deploy to Snowflake. It provides continuous deployment and continuous integration abilities. There is a separation of storage and compute, so you only get charged for your usage. You only pay for what you use. When we share the data downstream with business partners, we can specifically create compute for them, and we can charge back the business."
"It helped us to build MVP (minimum viable product) for our idea of building a data warehouse model for small businesses."
"Working with Parquet files is support out of the box and it makes large dataset processing much easier."
"There is a lack of virtualization and presentation layers, so you can't take it and implement it like a radio solution."
"The load optimization capabilities of the product are an area of concern where improvements are required."
"The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment."
"What could be improved in Apache Hadoop is its user-friendliness. It's not that user-friendly, but maybe it's because I'm new to it. Sometimes it feels so tough to use, but it could be because of two aspects: one is my incompetency, for example, I don't know about all the features of Apache Hadoop, or maybe it's because of the limitations of the platform. For example, my team is maintaining the business glossary in Apache Atlas, but if you want to change any settings at the GUI level, an advanced level of coding or programming needs to be done in the back end, so it's not user-friendly."
"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."
"I think more of the solution needs to be focused around the panel processing and retrieval of data."
"In certain cases, the configurations for dealing with data skewness do not make any sense."
"The price could be better. I think we would use it more, but the company didn't want to pay for it. Hortonworks doesn't exist anymore, and Cloudera killed the free version of Hadoop."
"There is a need for improvements in the documentation, this would allow more people to switch over to this solution."
"I have heard people having difficulty with the machine learning model, so there may be room for improvement."
"The cost efficiency and monitoring of this solution could be improved. It's easy to spend a lot on Snowflake and it does offer monitoring tools but they're pretty basic."
"We are yet to figure out how to integrate tools, such as Liquibase, to release changes to our data warehouse model."
"Every product has room for improvement, although in this case, it needs some broadening of the functionality."
"The documentation could improve. They should provide architecture information."
"Pricing is an issue for many customers."
"To ensure the proper functioning of Snowflake as an MDS, it relies heavily on other partner tools."
Apache Hadoop is ranked 5th in Data Warehouse with 34 reviews while Snowflake is ranked 1st in Data Warehouse with 94 reviews. Apache Hadoop is rated 7.8, while Snowflake is rated 8.4. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Snowflake writes "Good usability, good data sharing and elastic compute features, and requires less DBA involvement". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Teradata and BigQuery, whereas Snowflake is most compared with BigQuery, Azure Data Factory, Teradata, Vertica and Teradata Cloud Data Warehouse. See our Apache Hadoop vs. Snowflake report.
See our list of best Data Warehouse vendors and best Cloud Data Warehouse vendors.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
Apache Hadoop is for data lake use cases, but getting data out of Hadoop for meaningful analytics does require quite a lot of work, using Spark, Hive, Presto, and so on. The way I look at Snowflake and Hadoop is that they complement each other: you can use Hadoop for the data lake, and companies can use Snowflake for the data warehouse. Depending on the size of the company, you can turn Snowflake into a data lake use case too. Snowflake is SQL-friendly, and you don't need to perform any acrobatics to get data in and out of Snowflake.