Compare Apache Hadoop vs. Microsoft Parallel Data Warehouse

Apache Hadoop is ranked 4th in Data Warehouse with 6 reviews while Microsoft Parallel Data Warehouse is ranked 14th in Data Warehouse with 2 reviews. Apache Hadoop is rated 7.6, while Microsoft Parallel Data Warehouse is rated 4.6. The top reviewer of Apache Hadoop writes "We are able to ingest huge volumes/varieties of data, but it needs a data visualization tool and enhanced Ambari for management". On the other hand, the top reviewer of Microsoft Parallel Data Warehouse writes "Concurrency issues forced the customer to use the raw DB as a secondary solution". Apache Hadoop is most compared with Snowflake, Pivotal Greenplum and Oracle Exadata, whereas Microsoft Parallel Data Warehouse is most compared with Oracle Exadata, Snowflake and Teradata. See our Apache Hadoop vs. Microsoft Parallel Data Warehouse report.
Find out what your peers are saying about Apache Hadoop vs. Microsoft Parallel Data Warehouse and other solutions. Updated: September 2019.
365,820 professionals have used our research since 2012.
Quotes From Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Pros
The best thing about this solution is that it is very powerful and very cheap.
The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.
Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing.
Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of Hadoop and ecosystem components and tools, and managing it in Amazon EC2, we have created a Big Data "lab" which helps us to centralize all our work and solutions into a single repository. This has cut down the time in terms of maintenance, development and, especially, data processing challenges.
Since both Apache Hadoop and Amazon EC2 are elastic in nature, we can scale and expand on demand for a specific PoC, and scale down when it's done.
Most valuable features are HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect.
High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.
Data ingestion: It has rapid speed, if Apache Accumulo is used.


It handles high volumes of data very well.
It has allowed fast daily loads and analysis of millions of rows of data, which eventually moved to near real-time.


Cons
The upgrade path should be improved because it is not as easy as it should be.
We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.
I would like to see more direct integration of visualization applications.
Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them.
General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error.
It needs better user interface (UI) functionalities.


It needs more compatibility with common BI tools.
Concurrent queries are limited to 32, making it more of a data storage mechanism instead of an active DWH solution.


Pricing and Cost Advice
This is a low cost and powerful solution.
There are no licensing costs involved, hence money is saved on the software infrastructure.


Information Not Available
Ranking

Apache Hadoop
Ranking: 4th out of 30 in Data Warehouse
Views: 11,781
Comparisons: 10,295
Reviews: 7
Average Words per Review: 440
Avg. Rating: 7.6

Microsoft Parallel Data Warehouse
Ranking: 14th out of 30 in Data Warehouse
Views: 2,995
Comparisons: 2,240
Reviews: 2
Average Words per Review: 116
Avg. Rating: 4.5

Top Comparisons
Compared 32% of the time.
Compared 31% of the time.
Compared 13% of the time.
Also Known As
Microsoft PDW, SQL Server Data Warehouse, Microsoft SQL Server Parallel Data Warehouse
Overview
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
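
As an illustration of the "simple programming models" mentioned above, here is a minimal sketch of the map, shuffle, and reduce steps behind MapReduce, one of the models Hadoop supports. It is plain Python running in a single process and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. On a real cluster, the framework would run the map and reduce steps on many machines and perform the grouping between them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key (done by the framework on a real cluster)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence, in one process, over in-memory lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["big data", "big cluster"])` counts "big" twice and the other two words once each. Because each line is mapped independently and each key is reduced independently, the same logic can in principle be spread across many machines, which is the property the overview describes.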

The traditional structured relational data warehouse was never designed to handle the volume of exponential data growth, the variety of semi-structured and unstructured data types, or the velocity of real time data processing. Microsoft's SQL Server data warehouse solution integrates your traditional data warehouse with non-relational data and it can handle data of all sizes and types, with real-time performance.

Sample Customers
Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
Auckland Transport, Erste Bank Group, Urban Software Institute, NJVC, Sheraton Hotels and Resorts, Tata Steel Europe
Top Industries
VISITORS READING REVIEWS
Financial Services Firm: 26%
Software R&D Company: 21%
Comms Service Provider: 10%
Government: 10%
No Data Available
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.