What is our primary use case?
We are supporting a healthcare domain vendor located in the US. We get data from various domains, such as health insurance. We have member data, provider data, and consumer data. We also have client-related stuff and broker-related commission data.
We get the data from these domains, and after receiving it, we apply transformation rules, such as joins. We also standardize the data through formatting and field validations, such as formatting date fields and validating dates and times. We also apply other routine transformations with some business logic. After all this, we send the data to the business.
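As a rough illustration of that kind of flow, here is a minimal Python sketch. It is not DataStage code. The field names (`provider_id`, `birth_date`, `name`), the accepted date formats, and the reject handling are all assumptions made for the example; the real jobs are built with the tool's own stages.

```python
from datetime import datetime

# Hypothetical input formats; the review does not say which ones the feeds use.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def standardize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def join_and_standardize(members, providers):
    """Inner-join members to providers on provider_id and clean the date field.

    Returns (clean_rows, rejected_rows). A row is rejected when it has no
    matching provider or when its birth_date fails validation.
    """
    providers_by_id = {p["provider_id"]: p for p in providers}
    clean, rejected = [], []
    for member in members:
        provider = providers_by_id.get(member["provider_id"])
        birth_date = standardize_date(member["birth_date"])
        if provider is None or birth_date is None:
            rejected.append(member)
            continue
        clean.append(
            {**member, "birth_date": birth_date, "provider_name": provider["name"]}
        )
    return clean, rejected
```

Rows that fail the join or the date check are kept in a separate reject list rather than dropped, so they can be reviewed before the clean rows are sent on.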
What is most valuable?
We mostly use the transformation rules. It has a lot of functions and logic related to transformation. It is a user-friendly tool with built-in functions.
What needs improvement?
It doesn't have any big data connections. It would be good to have them because most of the systems are moving towards big data. There should also be a user-friendly way to interact with the cloud.
Its loading process is very slow. It takes a long time to load around 5 or 6 million records, and because of this delay we are not able to provide real-time data to the vendors. Its performance needs to be improved.
It also feels like a legacy system. It is not updated much. Newer versions bring only small changes. We would like to have new features and new technologies.
For how long have I used the solution?
I have been using this solution for around 15 years.
What do I think about the scalability of the solution?
It is easy to scale. In my project, six or seven people are using this solution, but in my company, we have around 15 to 16 projects.
How are customer service and technical support?
We have an internal admin team for support. If they are not able to solve an issue, they raise a ticket with the IBM team. In the last ten years, we had to contact IBM only two to three times. Our internal team is able to handle most of the issues.
How was the initial setup?
Its initial setup is moderately complex. It required some coordination with the vendor because their system also needed to be ready. We also get maintenance support from them.
What's my experience with pricing, setup cost, and licensing?
Our internal team takes care of group licensing and cost. We don't have individual licenses. We have group licensing at the company level. Usually, IBM doesn't charge anything separately on the licensing side.
For storage and everything else, we pay around $6,000 per month, which is not very high. It includes Linux data storage, execution, and licensing. They charge $40 per hour of execution. Based on that, we spend around $2,000 on the production environment and $1,000 on the lower environment for testing and development executions. For the mainframe, we use a Db2 database, and we spend around $1,000 on it as well. All this comes to around $6,000. We would, however, like to see some cost reduction.
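The arithmetic behind these figures can be checked with a short Python sketch. It assumes the two execution amounts are billed purely at the quoted hourly rate, which the figures imply but do not state outright; the variable and function names are made up for the example.

```python
HOURLY_EXECUTION_RATE = 40  # USD per hour of execution, as quoted
MONTHLY_TOTAL = 6000        # USD, approximate monthly total, as quoted

# Itemized monthly figures quoted above (USD).
ITEMIZED = {
    "production_execution": 2000,
    "lower_env_execution": 1000,
    "db2_mainframe": 1000,
}


def summarize_costs():
    """Derive execution hours and the part of the total that is not itemized."""
    itemized_total = sum(ITEMIZED.values())
    return {
        "production_hours": ITEMIZED["production_execution"] / HOURLY_EXECUTION_RATE,
        "lower_env_hours": ITEMIZED["lower_env_execution"] / HOURLY_EXECUTION_RATE,
        "itemized_total": itemized_total,
        "not_itemized": MONTHLY_TOTAL - itemized_total,
    }


if __name__ == "__main__":
    for key, value in summarize_costs().items():
        print(f"{key}: {value}")
```

Under that assumption, the spend works out to roughly 50 hours of production execution and 25 hours of lower-environment execution per month. The three itemized amounts add up to $4,000, so about $2,000 of the monthly total is not broken out in the figures above.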
What other advice do I have?
DataStage is a good ETL tool, but it is not suitable for huge volumes of data. It works well for low to medium volumes. I would advise others to do a feasibility study and evaluate the options available in the market in terms of features and cost.
I would rate IBM InfoSphere DataStage a seven out of ten.