What are the must-have features for a Data integration system?

From a user's or a developer's perspective. Which solutions have you used that include these features?

33 Answers

Real User

Thanks, Gary, for the details I missed. Indeed, the cost, required skills, and maintenance of a licensed tool are the make-or-break factors when deciding which tool to choose, and they often become clear only through experience over time.

Top 5, Popular, Real User

The assumption that you are referring to batch/ETL data integration rather than process integration, as Abhishek said, is critical. The primary differences between ETL and EAI are in expected latency, data format, and data versus process transformation. I say that because the two categories of integration tools overlap in functionality while being fundamentally different in design and expectations.

Other potential tools are Ab Initio and Microsoft SQL Server Integration Services. You can't go wrong with InfoSphere or Informatica if money is no object and you are willing to make the investment.

While all of the above tools share some level of the features that Abhishek mentioned (which was a great summary, btw), the devil, as they say, is in the details. Usability of logging, scalability, richness and usability of metadata, user friendliness, level of support, scheduling: these are often very different, and you won't easily see the significant differences until you have actually used the tools for a while. The differences show up in how much customization developers have to do to make the tools perform well and remain supportable in the real world.

Real User

I assume that "Data Integration systems" here does not refer to messaging or EAI. For non-messaging, non-EAI data integration systems, I would look for these features:
01. Range of built-in components available to transform data.
02. Degree of customization possible at transformation level, process level and group of processes (batch) level.
03. Exception handling mechanism supported: built-in, through configuration, or custom.
04. Scheduling and Reporting of status of process(es).
05. Notification, Alerts and Logging, as required for specific process(es).
06. Variety of Sources and Targets which can be used for design of ETL process(es).
07. Architecture scalability: processing large (and very large) data volumes, failure recovery, high availability.
08. Concurrency and stability of the system during BCP and/or failover recovery.
09. Available and accessible Metadata Repository.
10. Client software available on various compatible operating systems (optional).
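
To make items 01, 03 and 05 a bit more concrete, here is a minimal sketch in plain Python of what a tool gives you out of the box: pluggable transformation steps, row-level exception handling with a reject list, and logging of the run status. All names here (run_pipeline, RunReport, trim_name, parse_amount) are made up for illustration and are not taken from any of the products discussed in this thread.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_sketch")  # hypothetical logger name

Row = Dict[str, object]
Transform = Callable[[Row], Row]


@dataclass
class RunReport:
    """Status of one run: rows loaded plus rejected rows with the reason."""
    loaded: int = 0
    rejected: List[Tuple[Row, str]] = field(default_factory=list)


def run_pipeline(source: Iterable[Row], transforms: List[Transform],
                 target: List[Row]) -> RunReport:
    """Apply each transform in order; reject a failing row, keep going."""
    report = RunReport()
    for row in source:
        try:
            for step in transforms:
                row = step(row)
        except Exception as exc:  # row-level exception handling
            log.warning("rejected row %r: %s", row, exc)
            report.rejected.append((row, str(exc)))
            continue
        target.append(row)
        report.loaded += 1
    log.info("run finished: %d loaded, %d rejected",
             report.loaded, len(report.rejected))
    return report


# Two illustrative "built-in components" (hypothetical).
def trim_name(row: Row) -> Row:
    return {**row, "name": str(row["name"]).strip()}


def parse_amount(row: Row) -> Row:
    return {**row, "amount": float(row["amount"])}


source = [{"name": " Ann ", "amount": "10.5"},
          {"name": "Bob", "amount": "n/a"}]
target: List[Row] = []
report = run_pipeline(source, [trim_name, parse_amount], target)
```

The second row fails in parse_amount and ends up in the reject list instead of aborting the whole batch. How much of this you get built in, versus how much you have to hand-code like the above, is the kind of thing worth comparing between tools.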

A few of the market-leading solutions that meet most of these requirements are:
1. Informatica PowerCenter (licensed).
2. IBM InfoSphere DataStage (licensed).
3. SAP BODI (licensed).
4. Talend DI (OpenSource).
5. Pentaho Kettle (OpenSource).
6. Oracle DI/Oracle WB (licensed).

Find out what your peers are saying about Informatica, Microsoft, Talend and others in Data Integration Tools. Updated: June 2021.