We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
What comes with the standard setup is what we mostly use, but Ambari is the most important.
The best thing about this solution is that it is very powerful and very cheap.
The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.
Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing.
Initially, with RDBMS alone, we had a lot of work and a few servers running on-premises and in the cloud for the PoC and incubation. By using Hadoop with its ecosystem components and tools, and managing it in Amazon EC2, we have created a Big Data "lab" which helps us to centralize all our work and solutions into a single repository. This has cut down the time spent on maintenance, development and, especially, data processing challenges.
Since both Apache Hadoop and Amazon EC2 are elastic in nature, we can scale and expand on demand for a specific PoC, and scale down when it's done.
Most valuable features are HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect.
High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.
They separate compute and storage. You can scale storage independently of compute, or you can scale compute independently of storage. If you need to buy more compute, you can add new virtual warehouses in Snowflake. Similarly, if you need more storage, you take more storage. It's essentially the most scalable database; typically you don't have this scaling independence on-premises.
The most valuable features are the clustering, LS50, being able to change the size, the pay per use feature, the flexibility with many different sources and analytic applications.
As long as you don't need to worry about the storage or cost, this solution would be one of the best ones on the market for scalability purposes.
The distributed architecture of Snowflake has the capacity to process huge datasets faster and allows us to scale up and down according to our needs.
In the next release, I would like to see Hive become more responsive for smaller queries, with lower latency.
The upgrade path should be improved because it is not as easy as it should be.
We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.
I would like to see more direct integration of visualization applications.
Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them.
General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error.
It needs better user interface (UI) functionality.
The solution should offer an on-premises version also. We have some requirements where we would prefer to use it as a template.
Support needs improvement, as it can take several days before you get some initial support.
There are some stored procedures that we've had trouble with. The solution also needs to fine-tune the connectors to be able to connect into the system source.
Snowflake has to improve its spatial capabilities, since it doesn't offer much in terms of geospatial queries.
Pricing and Cost Advice
This is a low cost and powerful solution.
There are no licensing costs involved, hence money is saved on the software infrastructure.
Pricing can be confusing for customers.
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
Snowflake provides a data warehouse built for the cloud, delivering a solution capable of solving problems for which legacy, on-premises and cloud data platforms were not designed.
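The "simple programming models" mentioned in the Apache Hadoop description are best known through MapReduce. The following is a minimal single-process sketch of that model in plain Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are illustrative assumptions. On a real cluster, each map call would run on the machine holding its input split, and the framework would handle the shuffle and any failed tasks.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of text splits."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input: each string stands in for one split of a large file.
    splits = ["big data big clusters", "data on clusters"]
    print(word_count(splits))
```

Because each map call depends only on its own split and each reduce call only on its own key, both phases can be spread across many machines, which is the property the reviewers refer to when they mention parallel processing.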
Apache Hadoop: Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
Snowflake: Accordant Media, Adobe, Kixeye Inc., Revana, SOASTA, White Ops
Software R&D Company: 30%
Financial Services Firm: 20%
Comms Service Provider: 11%
Software R&D Company: 32%
Financial Services Firm: 10%
Comms Service Provider: 7%