This is the third of a six-part review of the Pentaho BI suite. In each part of the review, we will take a look at the components that make up the BI suite, according to how they would be used in the real world.
In this third part, we'll be discussing the tools and facilities, with which all of the reports are designed, generated, and served. A full BI suite should have a few reporting facilities that are usable by users with different level of technical/database knowledge.
Why is this important? Because in the real world, owners of data (people who consume the reports to make various business decisions) ranges from accountants, customer account managers, supply-chain managers, C-level executives, manufacturing managers, etc. Notice that proficiency in writing SQL queries a prerequisite to any of those positions?
In the Pentaho BI Suite, we have these reporting components:
Let's discuss each of these components individually. The screenshots below are sanitized to remove references to our actual clients. A fictitious company called “DonutWorld” is used to illustrate and relate the concepts.
This Java standalone program feels like the Eclipse Java development IDE because they share the UI library. If you are already familiar with Jasper Reports, iReports, or Crystal Report, the concepts are similar (bands, groups, details, sub-reports). You start with a master report in which you can combine different data sources (SQL and MDX queries in this case) into a layout that is managed via a set of properties.
Learning experience: As with any report designers, which are complex software because of the sheer number of tweak-able properties governing each element of the reports, one has to be prepared to learn the PRD. While the tools are laid out logically, it will take some time for a new personnel to absorb the main concepts. The sub-report facility is one of the most powerful feature of this program and it is the key to create reports that drills into more than one axis (or dimension) of data.
Usage experience: Things like the placement accuracy of elements within the page is not 100% precise and there are times when I had to work around the quirks and inconsistencies revolving around setting default values for properties, especially the ones containing formulas. Be prepared to have a dedicated personnel (either a permanent employee or a consultant) that can be reached for report designs *and* subsequent modifications. In addition, aesthetic considerations are also important in order to create a visually engaging reports (who wants to read a boring and bland report?).
Figure 1. The typical look of PRD when designing a report.
The Data Source facility is accessible from within the Pentaho BI Server UI (the PUC, see Part 2 of this review series for more information). Once you have logged in, look for a section on the screen that allows you to create or manage existing data sources.
This feature allows data personnel to setup “models” that can be constructed from various data sources, that represents a flat-view of data, of which a non-technical data owners can create ad-hoc reports or dashboards. Obviously this feature does not alleviate the need for knowing how to use the available tools for creating those reports and dashboards. It simply detach the dependency on crafting SQL/MDX queries and the intricacies of OLAP data structures from creating an ad-hoc report.
Learning experience: A data personnel who are familiar with the Data Warehouse (DW) can easily create models out of SQL queries against existing tables within the DW, or by using MDX queries against existing OLAP cubes. Data owners who are familiar with the data itself, can then start to use the Saiku Ad hoc Reporting tool or the CDE (Community-tools Dashboard Editor) to create dashboards. In reality, expect a couple of weeks for the personnels to get accustomed to this feature. Assumption: A knowledgeable BI teacher or consultant is available during this time. Usage experience: By separating the technical-database skill from the ability to generate ad-hoc reports, Pentaho has provided a way for organizations to streamline their business decision-making process further away from the technical minutiae that tends to bog down the process with details that are not relevant to the business goals. I highly rate this feature in the Pentaho BI Suite as one of the more innovative contribution to the area of Business Process Management.
Figure 2. Creating a model out of a SQL query
NOTE: The most important part of using this facility has to do more with business process than the familiarity of the data itself. Without a good process in place, it is quite obvious that the reports can get out of sync with the underlying data model. This is where the construction and maturity of the Data Warehouse is tested. For example, a DW with sufficient maturity will notify the data personnel of any data model changes which will trigger the updating of the Model Data Structure, which may or may not have an effect on the ad-hoc reports.
If the DW is designed correctly, there should be quite a few fact tables that can readily be translated into a Model Data Source. This is the first step. Now let's look at how to use this model.
Saiku is the name of two tools available from the PUC. The first one is the Saiku Analytics tool which allows us drill into an OLAP cube and perform analysis using aggregated measures (we'll review this in Part 4). The second one is the Saiku Ad-hoc Reporting tool. This is the one we are going to look into at this time. Using the modern UI library such as jQuery, the developers of Saiku give us a convenient drag-and-drop UI that is easy to learn and use.
Once a model is published, it will be available to choose from the drop-down list on the top left of the Saiku Ad-hoc Reporting tool. See the screenshot below:
Figure 3. A Saiku report in progress
Next, you can start to choose from the list of available fields in the model to specify as part of either the Columns list, or Groups list. Next, from the same list of available fields, you can specify some values as filters. The most obvious example would be the transaction date and time range which determines what period is the report for.
As you select the fields into the proper report elements, the tool started to populate the preview area with what the report would look like. You can also specify aggregation for each of the groupings, which is very handy.
There is a limited control on templates which governs the appearance of the report, but obviously won't be enough for serious usages. The best remedy however, is available, via the exporting to .prpt file, which you can open in the PRD and tweak to your heart's content.
After you are happy with the report, you can save it for later editing. Another thoughtful design decision by the Pentaho team.
In overall, the Saiku Ad-hoc Reporting tool is a handy facility to craft quick reports that answer specific questions based on the available model data sources. If your data personnel diligently updates and maintains the models, this tool can be invaluable to support your business decisions.
None of the above discussions would mean a whole lot without a practical and useful way for the reports to be delivered to its requesters. Here, the comprehensive nature of the Pentaho BI Suite helps by providing the facilities like xaction and input UI controls for report parameters.
For example a report designed in PRD can be published on the PUC. At some point it is opened by the user on the PUC who supplies the necessary parameters, then the xaction script fire an ETL which renders a .prpt file into a .pdf and either email it to the requester or drop it in a shared folder.
Reports can also be “burst” via an ETL script that utilizes the Pentaho Reporting Output step available from within Spoon (the ETL editor). I have used this method to distribute periodically-generated reports to different recipients containing data that is specific to the said recipient's access permission level. This saves a lot of time and increased the efficiency of up-to-date information distribution inside a company.
The reporting tools in the Pentaho BI Suite is designed to allow different users within the company to generate reports that are either pre-designed or ad-hoc. The reports are made available on the Pentaho User Console (PUC) where users login and initiate the report generation. Reports can also be scheduled to be generated via ETL scripts.
The PRD will be instantly recognizable by anyone who has experience using tools like Crystal Reports and its derivatives. You can also specify MDX queries against any OLAP cube schema published in the Pentaho BI Server as a data source.
The Model Data Source facility allows data owners who are not data personnels to create ad-hoc reports quickly and save it for future use and modifications.
The Saiku Ad-Hoc report is the UI with which available models can be used to generate reports on-the-fly. These reports can also be saved for later use.
Next in part-four, we will discuss the Pentaho Mondrian (MDX query engine) and the OLAP Cube Schema tools.