Processing speed – especially loading and transformation of large data sets.
Processing speed – especially loading and transformation of large data sets.
Before we implemented Greenplum, our weekly data loads (for third party marketing data sets) were taking over three days. (We also had some monthly data that could take up to 3 days to load and transform via Informatica.) After we implemented Greenplum, the loads were reduced to less than nine hours. Previously, we were receiving data early Wed a.m. and not getting out to the salesforce (if we were lucky) until noon on the following Monday. Now we get the data to the field early Friday mornings before they wake up.
The Greenplum appliance itself has had some reliability issues, so it would be great if that could be improved in the next version. More critical, though, is that the latest devices are not backward compatible. i.e., We have to replace our entire environment to upgrade. That’s quite an expense. I would hope they could improve the upgrade roadmap in the future.
We have used EMC Consulting for some projects, and we have lots of EMC storage.
If you can, do a benchmark with other MPP options including cloud alternatives. Although our Greenplum implementation was very successful (going on 4 years ago), I wish we had benchmarked against Teradata and Netezza (now IBM) at least. Today, I would consider not even buying hardware… just doing it all in the cloud.