IBM InfoSphere DataStage Reviews


Vendor: IBM

3.9 out of 5

37 reviews
82% willing to recommend

2,018 followers

What is IBM InfoSphere DataStage?

IBM InfoSphere DataStage is a data integration tool for designing, developing, and running jobs that move and transform data, aimed at organizations of different sizes. It integrates data across multiple systems through a high-performance parallel framework and supports extended metadata management, enterprise connectivity, and integration of all types of data.


IBM InfoSphere DataStage is ranked #7 among Data Integration Tools. PeerSpot users give it an average rating of 7.8 out of 10, and it is most commonly compared to Azure Data Factory. It is popular among large enterprises, which account for 75% of users researching this solution on PeerSpot. The top industry researching this solution is financial services, accounting for 26% of all views.


Featured reviews

Sumeet Zalpuri

Data engineer at ASR Nederland N.V.

Apr 7, 2023

I'm working for an insurance provider. They have applications where they register claims, insurance, et cetera. We get a flat file from the vendor and put those flat files into our Oracle Data Warehouse and report on the data. We publish those reports to our institutional investors, partners, and…


Rahul Saxena

Manager - Business Technology Solutions at a consultancy with 1,001-5,000 employees

Jan 23, 2024

I deal with companies from the healthcare industry. The solutions are largely cloud-based. In data-rich industries like telecom or BFSI, such tools are extensively used. Healthcare also has a lot of data. I will encourage people to use the solution. It is quite an easy tool. Every stage has a help guide. It’s an extensive documentation. We can understand the purpose of a stage, how the connection has to be set up, how to set up a username and password, and whom we should contact. New users must start using the tool and explore it. They might have to invest ten days or two weeks to understand the workflows and options. It is easy to learn. My company is a partner with IBM. Overall, I rate the product a nine out of ten.


Murali B

Data Engineer at Ernst & Young

Mar 28, 2024

DataStage facilitated our peak data integration projects. For example, big data integrations have happened, particularly when we worked with BigQuery files... that integration server. DataStage parallel processing capabilities have improved data tasks. When I worked with DataStage, it could handle around two terabytes of data. We have other appliances as well, but we're processing data concurrently. It was good. My team supported it well, and everything worked fine. The GUI was good. Compared to Cloud Pak for Data, we have some enhanced connectors in the standard InfoSphere DataStage version. That standard version is really good; it's easy to use. When we want to find out the absolute quality of data, the governance features really helped. For example, when we tried to identify discrepancies between systems, it worked well.


IBM InfoSphere DataStage market share

As of March 2024, the market share of IBM InfoSphere DataStage in the Data Integration category stands at 7.1%, marking an increase of 31.9% compared to the previous year, according to calculations based on PeerSpot user engagement data.


Key learnings from peers

Last updated Apr 9, 2024

Valuable Features

"Compared to other ETL tools, DataStage has excellent debugging and development capabilities. And the availability of connectors, even though we sometimes have to opt for specific ones. Also, the availability of patches is good."
"The most valuable feature for our data processing needs is IBM InfoSphere DataStage's capability to handle ETL tasks with large record volumes."
"IBM is stable and accurate to monitor. It's easy to understand to monitor the data lineage from source to target."

Room for Improvement

"In terms of intermediate storage, we have some challenges, especially with customers who store data in intermediate locations."
"Improvements for DataStage could include better integration with modern data sources like cloud solutions and documents, along with enhancing its capability to handle non-structured data."
"DataStage is quite expensive. It is too hard to find a consultant using DataStage in Turkey."

Pricing

"The pricing is competitive but on the higher side of the pricing scale."
"The solution is cheap."
"The product is expensive."

These insights are based on the in-depth reviews provided by peers to help you make a better buying decision.



Top industries (by visitors reading reviews)

Financial Services Firm: 26%
Manufacturing Company: 11%
Computer Software Company: 10%

Other industries listed, without percentages: Insurance Company, Government, Retailer, Healthcare Company, Educational Organization, Comms Service Provider, Energy/Utilities Company, University, Media Company, Real Estate/Law Firm, Wholesaler/Distributor, Construction Company, Logistics Company, Transportation Company, Non Profit, Legal Firm, Hospitality Company, Recreational Facilities/Services Company, Pharma/Biotech Company, Performing Arts, Outsourcing Company.


Learn more about IBM InfoSphere DataStage

The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical framework for moving data from source systems to target systems. IBM InfoSphere DataStage can deliver data to data warehouses, data marts, operational data stores, and other enterprise applications. The tool supports two integration patterns: extract, transform, and load (ETL) and extract, load, and transform (ELT). The platform scales through parallel processing and enterprise connectivity.
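The difference between the two patterns is mainly where the transformation runs. The following is a minimal Python sketch of that idea, not DataStage code: the sample rows, the table names, and the functions run_etl and run_elt are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import sqlite3

# Hypothetical source rows (customer, amount as text), as they might arrive in a flat file.
SOURCE_ROWS = [("alice", " 10.50"), ("bob", "3.25 "), ("alice", "4.00")]


def run_etl(rows):
    """ETL: clean and aggregate before loading, so the target only receives finished data."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0.0) + float(amount.strip())
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE totals (customer TEXT, total REAL)")
    con.executemany("INSERT INTO totals VALUES (?, ?)", sorted(totals.items()))
    return con.execute("SELECT customer, total FROM totals ORDER BY customer").fetchall()


def run_elt(rows):
    """ELT: load the raw rows into a staging table, then transform inside the target with SQL."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE staging (customer TEXT, amount TEXT)")
    con.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    con.execute(
        "CREATE TABLE totals AS "
        "SELECT customer, SUM(CAST(TRIM(amount) AS REAL)) AS total "
        "FROM staging GROUP BY customer"
    )
    return con.execute("SELECT customer, total FROM totals ORDER BY customer").fetchall()


if __name__ == "__main__":
    print(run_etl(SOURCE_ROWS))
    print(run_elt(SOURCE_ROWS))
```

Both functions produce the same totals table. In the ELT variant the raw data also remains available in the target's staging table, and the transformation work is pushed down to the target database.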

The solution has various versions, catering to different types of companies, which include the Server Edition, the Enterprise Edition, and the MVS Edition. Depending on which version a company has bought, different goals can be achieved. They include the following:

Designing data flows to extract information from multiple sources, transform the data, and deliver it to target databases or applications.
Delivery of relevant and accurate data through direct connections to enterprise applications.
Reduction of development time and improvement of consistency through prebuilt functions.
Utilization of InfoSphere Information Server tools for accelerating the project delivery cycle.

IBM InfoSphere DataStage can be deployed in various ways, including:

As a service: The tool can be accessed through a subscription model, where its capabilities are part of IBM DataStage on IBM Cloud Pak for Data as a Service. This option offers full management on IBM Cloud.
On premises or in any cloud: The two editions, IBM DataStage Enterprise and IBM DataStage Enterprise Plus, can run workloads on premises or in any cloud when added to IBM DataStage on IBM Cloud Pak for Data as a Service.
On premises: The basic jobs of the tool can be run on premises using IBM DataStage.

IBM InfoSphere DataStage Features

The tool has various features through which users can integrate and utilize their data effectively. The components of IBM InfoSphere DataStage include:

AI services: The tool offers services such as data science, event messaging, data warehousing, and data virtualization. It accelerates processes through artificial intelligence (AI) and offers a connection with IBM Cloud Paks - the cloud-native insight platform of the solution.
Parallel engine: This feature optimizes ETL performance so that data can be processed at scale. It does so through a parallel engine and load balancing, which together maximize throughput.
Metadata support: This feature of the product uses the IBM Watson Knowledge Catalog to protect companies' sensitive data and monitor who can access it and at what levels.
Automated delivery pipelines: IBM InfoSphere DataStage reduces costs by automating continuous integration and delivery of pipelines.
Prebuilt connectors: The feature for prebuilt connectivity and stages allows users to move data between multiple cloud sources and data warehouses, including IBM native products.
IBM DataStage Flow Designer: This feature offers assistance through machine learning design. The product offers its clients a user-friendly interface which facilitates the work process.
IBM InfoSphere QualityStage: The tool provides a feature that automatically resolves data quality issues and increases the reliability of the delivered data.
Automated failure detection: Through this feature, companies can reduce infrastructure management efforts, relying on the automated detection that the tool offers.
Distributed data processing: Cloud runtimes can be executed remotely through this feature while maintaining data sovereignty and decreasing costs.
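
To make the parallel engine item above more concrete, here is a rough Python sketch of partition parallelism in general: rows are hash-partitioned by key, each partition is processed by its own worker, and the partial results are merged. This illustrates the general pattern only and is not DataStage's engine. The function names are hypothetical, and a thread pool is used only to keep the example short; CPU-bound work in CPython would need processes to run truly in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


def partition_by_key(rows, num_partitions):
    """Hash-partition (key, value) rows so that equal keys always share a partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in rows:
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return partitions


def sum_partition(partition):
    """Aggregate one partition independently of all the others."""
    totals = {}
    for key, value in partition:
        totals[key] = totals.get(key, 0) + value
    return totals


def parallel_totals(rows, num_partitions=4):
    """Partition the rows, aggregate each partition in a worker, then merge the results."""
    partitions = partition_by_key(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_results = list(pool.map(sum_partition, partitions))
    merged = {}
    for partial in partial_results:
        # Keys are disjoint across partitions, so a plain update is a safe merge.
        merged.update(partial)
    return merged


if __name__ == "__main__":
    sample = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
    print(parallel_totals(sample))
```

Because all rows with the same key land in the same partition, each worker can aggregate without coordinating with the others, which is what lets this kind of design scale with the number of partitions.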

IBM InfoSphere DataStage Benefits

This solution offers many benefits for the companies that utilize it for data integration. Some of these benefits include:

Increased speed of workload execution due to better balancing and a parallel engine.
Reduction of data movement costs through integrations and seamless design of jobs.
Modernization of data integration by extending the capabilities of companies' data.
Delivery of reliable data through IBM Cloud Pak for Data.
Utilization of a drag-and-drop interface which assists in the delivery of data without the need for code.
Effective data manipulation allows data to be merged before being mapped and transformed.
Easier access for users to their data through visual maps of the process and the delivered data.