We performed a comparison between Apache Hadoop and Snowflake based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"I liked that Apache Hadoop was powerful, had a lot of tools, and the fact that it was free and community-developed."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"It's good for storing historical data and handling analytics on a huge amount of data."
"One valuable feature is that we can download data."
"The most valuable feature is scalability and the possibility to work with major information and open source capability."
"The ability to add multiple nodes without any restriction is the solution's most valuable aspect."
"As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R."
"It is a file system for data collection. There are nodes in this cluster that contain all the information, directories, and other files. The nodes are based on the MySQL database."
"The solution is stable."
"The most valuable feature is the snapshot database. In one second, you can just take a snapshot of the database for test purposes."
"It requires no maintenance on our part. They handle all that. The speed is phenomenal. The pricing isn't really anything more than what you would be paying for a SQL server license or another tool to execute the same thing. We have zero maintenance on our side to do anything and the speed at which it performs queries and loads the data is amazing. It handles unstructured data extremely well, too. So, if the data is in a JSON array or an XML, it handles that super well."
"Data sharing is a good feature. It is a majorly used feature. The elastic compute is another big feature. Separating compute and storage gives you flexibility. It doesn't require much DBA involvement because it doesn't need any performance tuning. We are not really doing any performance tuning, and the entire burden of performance tuning and SQL tuning is on Snowflake. Its usability is very good. I don't need to ramp up any user, and its onboarding is easier. You just onboard the user, and you are done with it. There are simple SQL and UI, and people are able to use this solution easily. Ease of use is a big thing in Snowflake."
"The way it is built and designed is valuable. The way the shared model is built and the way it exploits the power of the cloud is very good. Certain features related to administration and management, akin to Oracle Flashback and all that, are very important for modern-day administration and management. It is also good in terms of managing and improving performance, indexing, and partitioning. It is sort of completely automated. Everything is essentially under the hood, and the engine takes care of it all. As a data warehouse on the cloud, Snowflake stands strong on its ground even though each of the cloud providers has its own data warehouse, such as Redshift for AWS or Synapse for Azure."
"It has great flexibility whenever we are loading data and performs ELT (extract, load, transform) techniques instead of ETL."
"A user-friendly and reliable solution."
"The pricing is reasonable and matches the rest of the market."
"The stability of the solution needs improvement."
"The solution needs a better tutorial. There are only documents available currently. There's a lot of YouTube videos available. However, in terms of learning, we didn't have great success trying to learn that way. There needs to be better self-paced learning."
"From the Apache perspective or the open-source community, they need to add more capabilities to make life easier from a configuration and deployment perspective."
"The integration with Apache Hadoop with lots of different techniques within your business can be a challenge."
"The upgrade path should be improved because it is not as easy as it should be."
"General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error."
"I mentioned it definitely, and this is probably the only feature we can improve a little bit because the terminal and coding screen on Hadoop is a little outdated, and it looks like the old C++ bio screen. If the UI and UX can be improved slightly, I believe it will go a long way toward increasing adoption and effectiveness."
"The solution is very expensive."
"Maybe there could be some more connectors to other systems, but this is what they are constantly developing anyway."
"The product's performance could be improved."
"From the documentation, the black box is not very descriptive. Snowflake does not reveal how exactly the data is processed or sourced."
"These aren't as crucial, but there are common errors sometimes where the database is down, or a table is nullified and a new table is added and you are not given access to that. With those errors, you don't have permissions."
"Portability is a big hurdle right now for our clients. Porting all of your existing SQL ecosystem, such as stored procedures, to Snowflake is a major pain point. Currently, Snowflake stored procedures use JavaScript, but they should support SQL-based stored procedures. It would be a huge advantage if you can write your stored procedures using SQL. It seems that they are working on this feature, and they are yet to release it. I remember seeing some notes saying that they were going to do that in the future, but the sooner this feature comes out, it would be better for Snowflake because there are a lot of clients with whom I'm interacting, and their main hurdle is to take their existing Oracle or SQL Server stored procedures and move them into Snowflake. For this, you need to learn JavaScript and how it works, which is not easy and becomes a little tricky. If it supports SQL-based procedures, then you can just cut-paste the SQL code, run it, and easily fix small issues."
"The UI could improve because sometimes in the security query the UI freezes. We then have to close the window and restart."
"If they could bring in some tools for data integration, it would be really great."
"The design of the product is easily misunderstood."
Apache Hadoop is ranked 5th in Data Warehouse with 11 reviews while Snowflake is ranked 1st in Data Warehouse with 40 reviews. Apache Hadoop is rated 7.8, while Snowflake is rated 8.4. The top reviewer of Apache Hadoop writes "Has good processing power and speed and is capable of handling large volumes of data and doing online analysis". On the other hand, the top reviewer of Snowflake writes "Easy to set up with great cloning and time travel". Apache Hadoop is most compared with Microsoft Azure Synapse Analytics, Azure Data Factory, Oracle Exadata, Teradata and BigQuery, whereas Snowflake is most compared with BigQuery, Azure Data Factory, Teradata, Vertica and Teradata Cloud Data Warehouse.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
Apache Hadoop is for data lake use cases, but getting data out of Hadoop for meaningful analytics does need quite an amount of work, using Spark, Hive, Presto, and so on. The way I look at Snowflake and Hadoop is that they complement each other. For the data lake you can use Hadoop, and for the data warehouse companies can use Snowflake. Depending on the size of the company, you can turn Snowflake into a data lake too. Snowflake is SQL friendly, and you don't need to carry out any circus to get data in and out of Snowflake.
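
To make the "SQL friendly" point a bit more concrete, here is a minimal sketch of the load-first, transform-in-SQL pattern (the ELT approach one of the reviewers mentions). It is only an illustration under loud assumptions: Python's built-in `sqlite3` stands in for the warehouse, and the table names, sample rows, and `run_elt` function are all hypothetical. It is not Snowflake or Hadoop code.

```python
import sqlite3

# Hypothetical raw records, standing in for files that landed in a data lake.
RAW_ROWS = [
    ("2023-01-05", "emea", " 120.50 "),
    ("2023-01-05", "apac", "80"),
    ("2023-01-06", "emea", "not-a-number"),
]


def run_elt(rows):
    """Load raw text rows as-is, then clean and aggregate them with SQL."""
    # Assumption: an in-memory SQLite database plays the role of the warehouse.
    conn = sqlite3.connect(":memory:")

    # Load step: every column is stored as untouched text in a staging table.
    conn.execute(
        "CREATE TABLE staging_sales (sale_date TEXT, region TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", rows)

    # Transform step: done inside the database with plain SQL.
    # The GLOB filter is a crude check that keeps only amounts starting
    # with a digit, so the malformed row is skipped.
    conn.execute(
        """
        CREATE TABLE sales_by_region AS
        SELECT UPPER(region) AS region,
               SUM(CAST(TRIM(amount) AS REAL)) AS total
        FROM staging_sales
        WHERE TRIM(amount) GLOB '[0-9]*'
        GROUP BY UPPER(region)
        """
    )

    result = conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for region, total in run_elt(RAW_ROWS):
        print(region, total)
```

The raw rows are never cleaned before loading; all shaping happens in the SQL statement, which is the part a SQL-friendly warehouse makes easy.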