In a recent question posted on IT Central Station, a community member asked for advice about designing reports using Jaspersoft. He is comparing Pentaho and Jaspersoft, and wants to know which to use to clean his data first. Here's a quick roundup of what our community members had to say:
reviewer111504 wrote that it "Makes no difference really from a design and report point of view. They both are created from the Kettle codebase; however, Pentaho with its latest release is revamping its data ingestion solutions, so it's ahead of the curve against Jasper ETL. If you already paid for the full Jasper licences then it's a no-brainer to use the Jasper toolkit."
Several users were quick to point out that Jaspersoft is not an ETL tool at all. Igor Korelič explained that "Jaspersoft is strictly a reporting tool and has no ETL capability. Pentaho has some ETL capabilities, but in my experience I will always use Talend DI." reviewer72435 agreed: "Jaspersoft is for embedding analytics/reports within applications, predominantly in a Java application framework. I don't believe it has any 'clean the data first' capabilities, which is more of a data quality/ETL requirement."
Another user, Ivan de Vargas Lopes Jr, says "Ideally, you use an ETL tool like Talend or SSIS (SQL Server Integration Services), because they are specialized tools for data processing. But between the two tools mentioned, Pentaho is simpler than Jaspersoft."
BIExpert221 is clearly a Pentaho fan. He wrote "I only have a background in Pentaho usage so can't speak to how easy Jaspersoft is to use; however, I have to say Pentaho DI is excellent and is one of the most straightforward ETL tools I've seen in 15 years of working in BI. It has an extremely user-friendly interface and an impressive, passionate user community who can help you if you get stuck, as well as excellent documentation on how to do things. I'd thoroughly recommend using Pentaho DI."
CIO Marty Smith suggests that the deciding factor may be data volume rather than the tools themselves: "The data cleansing capability of the two products is very comparable. But it really depends on the volume of data and the time allowed to clean and load the data."
Do you agree with these recommendations? Please let the community know by commenting below.