Pentaho Data Integration Review

It lacks the capability to produce formatted crosstab reports. It connects seamlessly to most commonly used data sources.


Valuable Features

It is a lightweight ETL tool that's easy to get started with. It connects seamlessly to most commonly used data sources.

Improvements to My Organization

The organization chose the Pentaho ETL and Reporting solutions because they were cost-effective compared to competitors. The ETL part certainly met that objective and also served its purpose.

Room for Improvement

Newer versions have been released since then, so some of these issues may already be fixed. The most troublesome missing feature was the ability to produce formatted crosstab reports in the BI Reporting product. The annoyance that troubled us most was that every step in a transformation that needed data created its own data connection. With some data sources, such as Greenplum, this was a problem because they limit the number of available connections.
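To illustrate the connection problem in general terms, here is a minimal Python sketch of the alternative we would have liked: several steps borrowing from a small, fixed pool instead of each opening its own connection. This is not PDI's API or how PDI works internally; `FakeConnection`, `BoundedPool` and `run_steps` are hypothetical names made up for the example, and the pool size of 2 is arbitrary.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    opened = 0  # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.opened += 1

    def query(self, sql):
        return f"ran: {sql}"


class BoundedPool:
    """Hands out at most `size` connections; callers wait when all are in use."""

    def __init__(self, size, factory=FakeConnection):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=5):
        conn = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back for the next step


def run_steps(pool, statements):
    """Run each 'step' on a borrowed connection instead of a new one."""
    results = []
    for sql in statements:
        with pool.connection() as conn:
            results.append(conn.query(sql))
    return results


if __name__ == "__main__":
    pool = BoundedPool(2)
    run_steps(pool, ["select 1", "select 2", "select 3"])
    print(FakeConnection.opened)
```

With this shape, the number of open connections is capped by the pool size no matter how many steps need data, which is the property that matters when the database enforces a connection limit.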

Use of Solution

I used it for three years, from 2012 to 2015, and stopped only because I left the organization.

Deployment Issues

One issue we encountered constantly with PDI deployments was that the environment parameters for jobs had to be updated manually through the designer module, 'Spoon'. Although the product has a feature for keeping environment variables outside Spoon, that didn't work for us because we had one development server used for Dev, QA and UAT.
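What we needed, in general terms, was one parameter set per environment, selected by name at run time, even when all environments share a server. The Python sketch below shows only that idea; it is not PDI's variable mechanism, and the environment names, keys and values (`DB_HOST`, `DB_NAME`, and so on) are invented for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Hypothetical per-environment settings; in practice each would be its own file.
ENVIRONMENTS = {
    "dev": "DB_HOST=dev-db\nDB_NAME=sales_dev\n",
    "qa": "DB_HOST=qa-db\nDB_NAME=sales_qa\n",
    "uat": "# user acceptance\nDB_HOST=uat-db\nDB_NAME=sales_uat\n",
}


def load_parameters(env_name):
    """Return the parameter set for one environment, failing loudly on a typo."""
    if env_name not in ENVIRONMENTS:
        raise KeyError(f"unknown environment: {env_name}")
    return parse_properties(ENVIRONMENTS[env_name])


if __name__ == "__main__":
    print(load_parameters("qa"))
```

Selecting the set by name means a job promoted from Dev to QA to UAT needs no manual edits, only a different environment name passed in when it is launched.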

Stability Issues

There were no issues with the stability.

Scalability Issues

We had no issues scaling it across the company as needed.

Customer Service and Technical Support

It's about average. Most of the help we got was through Google searches and Wiki pages. One time we had an issue with a feature: our version of PDI could not handle microseconds. The product owner came up with a solution but, instead of applying the patch, wanted to sell it to us for a fee.

Initial Setup

I am only aware of the client-side setup, which was simple enough. It was pretty much a one-step installation process.

Implementation Team

It was done by an in-house team. A couple of issues we discovered later concerned memory configuration for the environment. This needs to be evaluated and fine-tuned; otherwise, you can run into job failures with large amounts of data. We ran into this issue with 'Commit' points and 'Sort' steps.

Other Solutions Considered

An evaluation was performed, but I was not involved in it.

Disclosure: I am a real user, and this review is based on my own experience and opinions.