We performed a comparison between Pentaho Data Integration and Analytics and SSIS based on real PeerSpot user reviews.
Find out in this report how the two Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The solution has a free to use community version."
"Data transformation within Pentaho is a nice feature that they have and that I value."
"One of the valuable features is the ability to use PL/SQL statements inside the data transformations and jobs."
"We can schedule job execution in the BA Server, which is the front-end product we're using right now. That scheduling interface is nice."
"The amount of data that it loads and processes is good."
"Provides a good open source option."
"It's very simple compared to other products out there."
"Its drag-and-drop interface lets me and my team implement all the solutions that we need in our company very quickly. It's a very good tool for that."
"It is easy to set up the product."
"The most valuable aspect of this solution is that it is simple to use and it offers a flexible custom script task."
"The most valuable features for our company are the flexibility and the quick turn around time in producing simple ETL solutions."
"I have used most of the standard SQL features, but the ones that stand out are the Data Flows and Bulk Import."
"The script component is very powerful, things that you cannot normally do, is feasible through C#."
"There are many good features in this solution including the data fields, database integration, support for SQL views, and the lookups for matching information."
"The interface is very user-friendly."
"Built in reports show package execution and messages. Logging can also be customized so only what is needed is logged. There is also an excellent logging replacement called BiXpress that provides both historical and real-time monitoring which is more efficient and much more robust than the built-in logging capabilities. And none of this requires custom coding to make it useful unlike many other ETL tools."
"It could be better integrated with programming languages, like Python and R. Right now, if I want to run a Python code on one of my ETLs, it is a bit difficult to do. It would be great if we have some modules where we could code directly in a Python language. We don't really have a way to run Python code natively."
"I would like to see support for some additional cloud sources. It doesn't support Azure, for example. I was trying to do a PoC with Azure the other day but it seems they don't support it."
"If you develop it on MacBook, it'll be quite a hassle."
"The reporting definitely needs improvement. There are a lot of general, basic features that it doesn't have. A simple feature you would expect a reporting tool to have is the ability to search the repository for a report. It doesn't even have that capability. That's been a feature that we've been asking for since the beginning and it hasn't been implemented yet."
"If you're working with a larger data set, I'm not so sure it would be the best solution. The larger things got the slower it was."
"I would like to see more improvements with AS400 DB2."
"Its basic functionality doesn't need a whole lot of change. There could be some improvement in the consistency of the behavior of different transformation steps. The software did start as open-source and a lot of the fundamental, everyday transformation steps that you use when building ETL jobs were developed by different people. It is not a seamless paradigm. A table input step has a different way of thinking than a data merge step."
"Although it is a low-code solution with a graphical interface, often the error messages that you get are of the type that a developer would be happy with. You get a big stack of red text and Java errors displayed on the screen, and less technical people can get intimidated by that. It can be a bit intimidating to get a wall of red error messages displayed. Other graphical tools that are focused at the power user level provide a much more user-friendly experience in dealing with your exceptions and guiding the user into where they've made the mistake."
"SSIS sometimes hangs, and there are some problems with servers going down after they've been patched."
"The interface could use improvement, as well as the administrative tools. Jobs fail from time to time for different reasons. It's not a problem with Microsoft, or SSIS itself. The problems are external, but to find the problems and analyze them it takes too much time."
"I would like to see better integration with Power BI."
"We have issues with SSIS connectors while extracting data from Excel sources."
"It would be nice if you could run SSIS on other environments besides Windows."
"SSIS is cumbersome despite its drag-and-drop functionality. For example, let's say I have 50 tables with 30 columns. You need to set a data type for each column and table. That's around 1,500 objects. It gets unwieldy adding validation for every column. Previously, SSIS automatically detected the data type, but I think they removed this feature. It would automatically detect if it's an integer, primary key, or foreign key column. You had fewer problems building the model."
"It hangs a lot of the time."
"The solution should work on the GPU, graphical processing unit. There should also be piping integration available."
Pentaho Data Integration and Analytics is ranked 15th in Data Integration with 48 reviews while SSIS is ranked 2nd in Data Integration with 69 reviews. Pentaho Data Integration and Analytics is rated 8.0, while SSIS is rated 7.6. The top reviewer of Pentaho Data Integration and Analytics writes "It's flexible and can do almost anything I want it to do". On the other hand, the top reviewer of SSIS writes "Maintaining the solution and contacting its support team is easy". Pentaho Data Integration and Analytics is most compared with Azure Data Factory, Talend Open Studio, Oracle Data Integrator (ODI), AWS Glue and SAP Data Services, whereas SSIS is most compared with Informatica PowerCenter, Talend Open Studio, IBM InfoSphere DataStage, Oracle Data Integrator (ODI) and AWS Glue.
We monitor all Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
There are two products I know about:
* TimeXtender: Microsoft-based. Transformation logic is quite good and can easily be extended with T-SQL. Has a semantic layer that generates metadata for cubes. Price approx. $40K. Works with tables.
* Attunity (bought by Qlik): technology-agnostic, nice web interface, expensive (> €100K). Works with transaction logs.
There are many other pure ETL tools:
* ERWIN has a nice one.
Depends upon the technologies being used. If you're using Oracle for both OLTP and OLAP then you'll get a lot of value from an Oracle solution.
The other question is how up to date do you want your OLAP DB to be? GoldenGate is a good answer if you're looking to minimize latency, but it can be expensive. ODI is less expensive but better suited to bulkier data sets. If an Oracle product wasn't an option, I'd probably consider something like Informatica.
Hi Rajneesh,
Yes, here is the feature comparison between the community and enterprise editions: www.hitachivantara.com
And a short description of the community edition: www.predictiveanalyticstoday.com
And the download link: community.hitachivantara.com
You can ask more from the great community: forums.pentaho.com
Regards
Károly
We usually use Talend.
Look here: community.talend.com
As someone mentioned, if you're purely an Oracle shop and staying that way, then there's value in prioritizing Oracle tools. However, let me contrast that with this caveat...
Consider expectations for tool and vendor longevity. Oracle has a long history of retiring and/or replacing tools, leaving customers in the cold with prior versions/tools (I've been burned multiple times by Oracle product retirements or replacements, including OWB, Oracle Designer2k, Oracle Express, Oracle OEDW, and their purchase of Sagent ETL, which was later abandoned).
But I would also consider these questions and relative prioritization:
What are your organization's plans for moving to other database technologies?
Where is your org going with on-prem versus cloud solutions? How important are PaaS versus IaaS solutions?
Where is your current staff's expertise?
Prioritize mature over immature tools.
How many sources do you have? What are their technologies and does the integration tool support them?
Is it just moving data from a single ERP such as Oracle EBS to OLAP? When you say OLAP, what do you mean by that? Are you talking about the Oracle OLAP product or something else? That makes a really big difference, of course: if your ETL tool doesn't support your source(s) and target(s), then it shouldn't be considered.
Given the industry's trajectory, I myself would highly prioritize PaaS solutions over others.
What is the OLAP that you are using? Hosted in Cloud or on-premise?
The target DB should have its own tool to extract data.
Pentaho is a really nice tool if open source is the only option.
Please think about issues such as upgrades and disaster recovery in the future. These operations are very easy in Pentaho.
I can only suggest one thing for replication, and that is Qlik (ex-Attunity).
Hi Karoly, thanks for your input. The community at forums.pentaho.com is not allowing registrations for new users. I guess they accept queries from customers only and not from anyone. Do you know any other forum, community, or SME contacts who can help with queries?