We performed a comparison between Apache Hadoop and Microsoft Parallel Data Warehouse based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The most valuable feature is the database."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"It's good for storing historical data and handling analytics on a huge amount of data."
"Since both Apache Hadoop and Amazon EC2 are elastic in nature, we can scale and expand on demand for a specific PoC, and scale down when it's done."
"We selected Apache Hadoop because it is not dependent on third-party vendors."
"The most valuable feature is scalability and the possibility to work with major information and open source capability."
"Hadoop is extensible — it's elastic."
"Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of Hadoop and ecosystem components and tools, and managing it in Amazon EC2, we have created a Big Data "lab" which helps us to centralize all our work and solutions into a single repository. This has cut down the time in terms of maintenance, development and, especially, data processing challenges."
"The UI is very simple and functional for my clients, most of the clients that use the solution are not experts."
"It is a very stable database."
"The data transmissions between the data models is the most valuable feature."
"Microsoft Parallel Data Warehouse integrates beautifully with other Microsoft ecosystem products."
"One of the most important features is the ease of using MS SQL."
"The most valuable features are the performance and usability."
"The most valuable feature is the business intelligence (BI) part of it."
"Tools like the BI and SAS are excellent."
"It needs better user interface (UI) functionalities."
"The solution is very expensive."
"From the Apache perspective or the open-source community, they need to add more capabilities to make life easier from a configuration and deployment perspective."
"The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment."
"Real-time data processing is weak. This solution is very difficult to run and implement."
"It requires a great deal of learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities are overlapping, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it."
"The solution needs a better tutorial. There are only documents available currently. There's a lot of YouTube videos available. However, in terms of learning, we didn't have great success trying to learn that way. There needs to be better self-paced learning."
"The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks."
"This solution would be improved with an option for in-memory data analysis."
"If the database is large with a lot of columns then it is difficult to clean the data."
"Some compatibility issues occur during deployment, so we need to build the product from scratch for some features."
"The query is slow if we don't optimize it."
"We find the cost of the solution to be a little high."
"It could be made more user-friendly for business users which would increase the user base."
"Sometimes, the product requires rolling back to its previous version during a software update. This particular area could be enhanced."
"Concurrent queries are limited to 32, making it more of a data storage mechanism instead of an active DWH solution."
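One reviewer above notes that insufficient memory can be worked around by processing the data in chunks. As a rough illustration of that general idea only (not tied to either product's API), the sketch below computes a mean over a stream of values while holding just one fixed-size batch in memory at a time. The function names `chunks` and `chunked_mean` and the default batch size are hypothetical choices made for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunks(rows: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` items from `rows`."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def chunked_mean(rows: Iterable[float], size: int = 1000) -> float:
    """Compute the mean of `rows` one batch at a time.

    Only running totals and the current batch are kept in memory,
    so the input can be larger than available RAM as long as it
    can be iterated (for example, read line by line from a file).
    """
    total = 0.0
    count = 0
    for batch in chunks(rows, size):
        total += sum(batch)
        count += len(batch)
    if count == 0:
        raise ValueError("no rows to aggregate")
    return total / count


if __name__ == "__main__":
    # Illustrative input: the integers 1..10 processed in batches of 3.
    print(chunked_mean(range(1, 11), size=3))
```

The same pattern applies to any aggregate that can be combined from partial results (sums, counts, minima, maxima); aggregates that need the whole dataset at once, such as an exact median, require a different approach.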
Apache Hadoop is ranked 5th in Data Warehouse with 33 reviews while Microsoft Parallel Data Warehouse is ranked 8th in Data Warehouse with 32 reviews. Apache Hadoop is rated 7.8, while Microsoft Parallel Data Warehouse is rated 7.6. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Microsoft Parallel Data Warehouse writes "An easy to setup tool that allows its users to write stored procedure, making it a scalable product". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake and Teradata, whereas Microsoft Parallel Data Warehouse is most compared with Microsoft Azure Synapse Analytics, Oracle Exadata, SAP BW4HANA, Snowflake and VMware Tanzu Greenplum.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.