Data profiling, data quality reporting
Sometimes a project knows little about a particular set of data. IA is good at data profiling and data discovery: it gives insight into data type, format, uniqueness, completeness, frequency distribution, and so on. The other powerful feature of IA is its ability to check data against business rules and report how many records violate each rule.
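For readers unfamiliar with these terms, here is a minimal Python sketch of what such metrics and rule checks amount to. It is not IA code and does not reflect how IA computes anything; the function names, the sample rows, and the "balance must not be negative" rule are all made up for illustration.

```python
from collections import Counter

def profile_column(values):
    """Return basic profile metrics for one column (None means missing)."""
    present = [v for v in values if v is not None]
    total = len(values)
    return {
        # Share of records that have a value at all.
        "completeness": len(present) / total if total else 0.0,
        # Share of distinct values among the non-missing ones.
        "uniqueness": len(set(present)) / len(present) if present else 0.0,
        # Observed Python types, as a stand-in for inferred data types.
        "types": Counter(type(v).__name__ for v in present),
        # Frequency distribution of the non-missing values.
        "frequencies": Counter(present),
    }

def count_violations(rows, rule):
    """Count the records for which the rule predicate returns False."""
    return sum(1 for row in rows if not rule(row))

# Hypothetical sample data.
rows = [
    {"account_num": "A1", "balance": 10},
    {"account_num": "A2", "balance": -5},
    {"account_num": None, "balance": 7},
    {"account_num": "A1", "balance": 3},
]

profile = profile_column([r["account_num"] for r in rows])
violations = count_violations(rows, lambda r: r["balance"] >= 0)
```

With this sample, three of the four account numbers are present, one value repeats, and one record breaks the balance rule.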
Data rules, column analysis, virtual tables
The interface is not the most friendly. Performance is another weak area.
The following features are documented in the user guide but do not work:
1. Global Logical Variables (GLVs)
2. Migrating projects. Neither the internal method (Export/Import) nor the command line interface (CLI) method works reliably. Both always error out.
3. When you open a data rule and close it without making any modifications, IA still asks whether you want to save your changes. This is a bit disturbing: you know you did not change anything, yet you start to doubt yourself.
My wish list for new features:
1. Ability to use functions on data sources. I do not understand how IBM could miss this. Data sources are not visible when coding custom expressions. For example, if you have a field called CUSTOMER.ACCOUNT_NUM, you cannot code TRIM(ACCOUNT_NUM), because functions can only be applied to variables, not directly to fields. My workaround is to create a variable in the rule definition, then bind it to the field in the data rule. I have a rule where I manipulate about 12 fields (concatenate, substring, length, coalesce, etc.), and I had to add 12 lines to the definition that do nothing but refer to these variables. In other words, I coded seemingly useless rule conditions like address1 = address1 just so I have a variable for each field I want to apply functions to. Huge oversight on the part of IBM.
2. Copy a data rule and modify the copy. Right now only rule definitions can be copied, not data rules. Sometimes I need to create two or more versions of the same rule. IA forces me to generate each of them from scratch. This is annoying when version 2 is only slightly different from version 1. If it took me an hour to code the original, it would take me close to that amount of time to code the new version. If I could copy and modify, the effort would only take maybe 5 minutes.
3. The date of last modification. IA only shows the date of creation, which is generally useless. The last modification date is far more important and needs to be available and visible.
4. A file manager, a la Windows Explorer. I may want to see the list of rules and sort them by date of modification.
5. Enhanced dedup on output. Currently, IA can only exclude duplicates based on the entire record. It should allow deduping on a select set of columns.
6. Feature to select one record from multiple matches in a join. For instance, in Oracle SQL one can use FETCH FIRST ROW ONLY or ROWNUM, and in SQL Server one can use TOP 1.
7. Ability to sort the output.
8. New virtual tables take a while to appear. You create one and the list doesn't list the new table. Wait 15 minutes or so and maybe it will be listed. Or log out and log back in.
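To make wish list items 5 and 6 concrete, here is a small Python sketch of the behavior I have in mind. This is not something IA offers, and the function names, column names, and sample rows are hypothetical.

```python
def dedup_on(rows, key_columns):
    """Keep the first row seen for each combination of key_columns."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result

def join_first_match(left, right, key, order_by):
    """Left join that keeps one right-hand row per key: the first after sorting by order_by."""
    first = {}
    for r in sorted(right, key=lambda r: r[order_by]):
        first.setdefault(r[key], r)  # later matches for the same key are ignored
    return [{**l, **first.get(l[key], {})} for l in left]

# Item 5: dedup on a chosen set of columns instead of the entire record.
output = [
    {"cust_id": 1, "city": "Oslo", "note": "a"},
    {"cust_id": 1, "city": "Oslo", "note": "b"},
    {"cust_id": 2, "city": "Rome", "note": "c"},
]
deduped = dedup_on(output, ["cust_id", "city"])

# Item 6: pick one record when a join finds multiple matches.
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
addresses = [
    {"cust_id": 1, "seq": 2, "city": "Oslo"},
    {"cust_id": 1, "seq": 1, "city": "Rome"},
]
joined = join_first_match(customers, addresses, "cust_id", "seq")
```

Here deduped keeps the rows with notes "a" and "c", and customer 1 is joined only to the address with the lowest seq.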
The tool sometimes crashes or freezes, but the latest version, 11.7, is more stable than previous ones.
Scale of 1 to 10: 8. While IBM is excellent at responding to inquiries, it is slow to implement much-needed software fixes. While that is common in the industry, I would still like to see IBM fix software bugs sooner.
Same as customer service.
No, I never had the chance.
I have not been involved in setup, but I understand it is very complex and not for the faint of heart.
I was not involved in the selection.
Get the latest version. Compare it with competing products. Know that there are not many experts in the product and that you may have to pay a premium to hire them.