We performed a comparison between IBM InfoSphere DataStage and Pentaho Data Integration and Analytics based on real PeerSpot user reviews.
Find out in this report how the two Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"We like the flexibility of modeling."
"We are mostly using transmission rules. It has a lot of functions and logic related to transmission. It is a user-friendly tool with in-built functions."
"Once you have InfoSphere up and running properly, it is stable."
"The ETL tools are probably the most valuable feature. It has an IBM tool, a friendly UI and it makes things more comfortable."
"IBM is stable and accurate to monitor. It's easy to understand to monitor the data lineage from source to target."
"The most valuable feature is the data integration for data warehousing."
"The solution is stable."
"ETL is the most valuable feature."
"The area where Lumada has helped us is in the commercial area. There are many extractions to compose reports about our sales team performance and production steps. Since we are using Lumada to gather data from each industry in each country, we can get data from Argentina, Chile, Brazil, and Colombia at the same time. We can then concentrate and consolidate it in only one place, like our data warehouse. This improves our production performance and need for information about the industry, production data, and commercial data."
"I can use Python, which is open-source, and I can run other scripts, including Linux scripts. It's user-friendly for running any object-based language. That's a very important feature because we live in a world of open-source."
"The fact that it enables us to leverage metadata to automate data pipeline templates and reuse them is definitely one of the features that we like the best. The metadata injection is helpful because it reduces the need to create and maintain additional ETLs. If we didn't have that feature, we would have lots of duplicated ETLs that we would have to create and maintain. The data pipeline templates have definitely been helpful when looking at productivity and costs."
"Its drag-and-drop interface lets me and my team implement all the solutions that we need in our company very quickly. It's a very good tool for that."
"It makes it pretty simple to do some fairly complicated things. Both I and some of our other BI developers have made stabs at using, for example, SQL Server Integration Services, and we found them a little bit frustrating compared to Data Integration. So, its ease of use is right up there."
"This solution allows us to create pipelines using a minimal amount of custom coding."
"The fact that it's a low-code solution is valuable. It's good for more junior people who may not be as experienced with programming."
"It's my understanding that the product can scale."
"Improvements for DataStage could include better integration with modern data sources like cloud solutions and documents, along with enhancing its capability to handle non-structured data."
"It would be great if they can include some basic version of data quality checking features."
"The pricing should be lower."
"It doesn't have any big data connections. It would be good to have them because most of the systems are moving towards big data. There should also be a user-friendly way to interact with the cloud. Its loading process is very slow. It takes a lot of time for around 5 or 6 million records, and we are not able to provide real-time data to the vendors due to this delay. Its performance needs to be improved. It is also like a legacy system. It is not updated much. In higher versions, they only do small changes. We would like to have new features and new technologies."
"Reduced cost would allow more customers to choose the product. It's quite expensive in relation to the cost of other similar solutions."
"The error messaging needs to be improved."
"There could be more customization options for the product."
"The documentation and in-application help for this solution need to be improved, especially for new features."
"The performance could be improved. If they could have analytics perform well on large volumes, that would be a big deal for our products."
"The reporting definitely needs improvement. There are a lot of general, basic features that it doesn't have. A simple feature you would expect a reporting tool to have is the ability to search the repository for a report. It doesn't even have that capability. That's been a feature that we've been asking for since the beginning and it hasn't been implemented yet."
"In the Community edition, it would be nice to have more modules that allow you to code directly within the application. It could have R or Python completely integrated into it, but this could also be because I'm using an older version."
"If you're working with a larger data set, I'm not so sure it would be the best solution. The larger things got the slower it was."
"Although it is a low-code solution with a graphical interface, often the error messages that you get are of the type that a developer would be happy with. You get a big stack of red text and Java errors displayed on the screen, and less technical people can get intimidated by that. It can be a bit intimidating to get a wall of red error messages displayed. Other graphical tools that are focused at the power user level provide a much more user-friendly experience in dealing with your exceptions and guiding the user into where they've made the mistake."
"I was not happy with the Pentaho Report Designer because of the way it was set up. There was a zone and, under it, another zone, and under that another one, and under that another one. There were a lot of levels and places inside the report, and it was a little bit complicated. You have to search all these different places using a mouse, clicking everywhere... each report is coded in a binary file... You cannot search with a text search tool..."
"It could be better integrated with programming languages, like Python and R. Right now, if I want to run a Python code on one of my ETLs, it is a bit difficult to do. It would be great if we have some modules where we could code directly in a Python language. We don't really have a way to run Python code natively."
"The product needs more plugins."
IBM InfoSphere DataStage is ranked 7th in Data Integration with 37 reviews while Pentaho Data Integration and Analytics is ranked 16th in Data Integration with 48 reviews. IBM InfoSphere DataStage is rated 7.8, while Pentaho Data Integration and Analytics is rated 8.0. The top reviewer of IBM InfoSphere DataStage writes "User-friendly with a lot of functions for transmission rules, but has slow performance and not suitable for a huge volume of data". On the other hand, the top reviewer of Pentaho Data Integration and Analytics writes "It's flexible and can do almost anything I want it to do". IBM InfoSphere DataStage is most compared with SSIS, IBM Cloud Pak for Data, Azure Data Factory, Talend Open Studio and Informatica PowerCenter, whereas Pentaho Data Integration and Analytics is most compared with Azure Data Factory, SSIS, Talend Open Studio, Oracle Data Integrator (ODI) and AWS Database Migration Service.
We monitor all Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.