Microsoft Parallel Data Warehouse Review

Microsoft PDW History


Originally published at https://www.linkedin.com/pulse/microsoft-pdw-history-datallegro-stephen-c-folkerts

Microsoft SQL Server Parallel Data Warehouse (PDW) is the result of Microsoft's 2008 acquisition of DATAllegro for roughly $238M. DATAllegro was the invention of Stuart Frost, created to compete with Netezza, which is now IBM PureData System for Analytics. Frost founded DATAllegro in 2003, was CEO of the company from the beginning, and specified the architecture of the product.

Netezza came to market with a compelling value proposition. It leveraged the open source Postgres DBMS. It used an appliance business model to create a tightly integrated software and hardware stack, removing a significant area of complexity for DBAs and other system staff. It shifted to sequential I/O from the random I/O more typical of SMP architectures. This allowed the use of much larger and cheaper SATA disk drives and led to a highly competitive price/performance ratio. However, there was a significant flaw in Netezza's strategy: it created a highly proprietary hardware platform and, effectively, a proprietary software platform, with little of Postgres remaining.

Netezza secured its first few customers around the time DATAllegro was being founded. Looking at the Netezza architecture, Stuart Frost realized that there was an opportunity to create a similar value proposition while using a completely non-proprietary platform. Frost’s vision was to create a massively parallel DW appliance with an embedded, off-the-shelf open source Ingres DBMS running on Linux and using completely standard servers, networking and storage from major vendors.

Each server in DATAllegro ran a highly tuned copy of the Ingres DBMS and custom Java on SUSE Linux. These separate database servers were turned into a massively parallel, shared-nothing database system that offered very good performance, especially under complex mixed workloads.

Once Microsoft acquired DATAllegro in 2008, the first obvious task was to port the appliance over to the Microsoft SQL Server and Windows stack. Microsoft worked on this migration internally between 2008 and 2010, under the project name ‘Madison’. In 2010, IBM ponied up $1.8 billion for DATAllegro's biggest competitor, Netezza.

Microsoft Parallel Data Warehouse (PDW)

See my article Microsoft Parallel Data Warehouse (PDW) for a more in-depth look at Microsoft SQL Server PDW.

Microsoft Analytics Platform System (APS)

See my article Microsoft Analytics Platform System (APS) for a more in-depth look at Microsoft APS.

These views are my own and may not necessarily reflect those of my current or previous employers.

Disclosure: I am a real user, and this review is based on my own experience and opinions.