Matillion ETL Review

With built-in verification and sampling, anywhere along the transformation-pipeline, ETL engineers can check, see and sample the data.

How has it helped my organization?

It is helping to make Makerbot a data-driven company.

What is most valuable?

The most valuable features are the components for SFDC, RDS, Marketo, Facebook, and Google AdWords; built-in verification; and scheduling, restarting & logging.

On a Redshift project, before Matillion was released, two people literally spent over one month using Sqoop to pull very wide data tables from to Redshift. On a new project using Matillion, it took me 10 minutes to set up and begin importing data from

Built-in verification and sampling is a fabulous feature for ETL engineers. Anywhere along the transformation-pipeline, one can check, see and sample the data. This saves days & weeks of effort and leads to a far more agile project.

What needs improvement?

More frequent releases are needed because of API changes from Google, Marketo, and Facebook. These vendors frequently upgrade their APIs and, as a result, often deprecate older versions that are only a few months old. The only way to use the Matillion components for these APIs successfully is for the Matillion release process to step up to the plate and deliver far more frequent "minor" API releases (as opposed to "major" product releases).

Even automating these minor API releases might not be a bad idea. Offering them as a paid "platinum" service could open up a new revenue stream from customers willing to pay, and would take the headache out of this very valuable set of marketing components in Matillion.

What do I think about the stability of the solution?

I only encountered stability issues when accidentally performing EC2-intensive Python jobs (i.e., not Redshift-intensive SQL). These can kill the Matillion EC2 instance.

What do I think about the scalability of the solution?

I have not encountered any scalability issues.

How is customer service and technical support?

Customer Service:

Customer service is excellent.

Technical Support:

Technical support is above-and-beyond.

Which solutions did we use previously?

SnapLogic and Informatica: too slow; for MPP, they are just glorified and expensive Python schedulers.

Python scripts: high maintenance.

How was the initial setup?

Initial setup was straightforward.

What about the implementation team?

I implemented it myself.

What was our ROI?

We achieved ROI in less than one year.

What's my experience with pricing, setup cost, and licensing?

The first two weeks are free. After that, pay by the hour for the smallest instance for the next 2-3 months, then take out the discounted yearly rate from AWS Marketplace for the instance and developers in your team.

Which other solutions did I evaluate?

We also evaluated SnapLogic, Informatica, Talend, and Hadoop.

What other advice do I have?

The mindset of the traditional ETL tools is to off-load transformation to another server/DB. This totally misses the point of MPP and especially of Redshift. Load the data into Redshift early and then transform it inside Redshift ("ELT" not "ETL"). Matillion orchestrates the loading and transformation "pipelines", then gets out of the way whilst Redshift does what it is good at (i.e. the "grunt work").
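
To illustrate the "ELT not ETL" idea, here is a minimal sketch in Python. It is not Matillion's API or an actual Redshift job: the standard-library sqlite3 module stands in for Redshift, and the table names (raw_orders, customer_totals) and the run_elt function are hypothetical. The raw rows are loaded first, a few are sampled as a mid-pipeline check, and the aggregation runs as SQL inside the database rather than in Python on the orchestrating machine.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows first, then transform them with SQL inside the database.

    sqlite3 is only a stand-in for Redshift here; table names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")

    # "E" and "L": land the raw data in the database untouched.
    conn.execute("CREATE TABLE raw_orders (customer TEXT, amount_cents INTEGER)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)

    # Mid-pipeline check: sample a couple of rows before transforming.
    sample = conn.execute("SELECT * FROM raw_orders LIMIT 2").fetchall()

    # "T": the database engine does the grunt work, not the Python process.
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT customer, SUM(amount_cents) AS total_cents
        FROM raw_orders
        GROUP BY customer
        """
    )
    totals = conn.execute(
        "SELECT customer, total_cents FROM customer_totals ORDER BY customer"
    ).fetchall()

    conn.close()
    return sample, totals


if __name__ == "__main__":
    sample, totals = run_elt([("acme", 100), ("bolt", 250), ("acme", 50)])
    print(sample)
    print(totals)
```

The Python process only issues statements and reads back small results; the heavy lifting stays in the database, which is the same division of labour described above between the orchestrator and Redshift.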

Disclosure: I am a real user, and this review is based on my own experience and opinions.