We just raised a $30M Series A: Read our story

SSIS OverviewUNIXBusinessApplication

SSIS is #4 ranked solution in top Data Integration Tools. IT Central Station users give SSIS an average rating of 8 out of 10. SSIS is most commonly compared to Informatica PowerCenter: SSIS vs Informatica PowerCenter. The top industry researching this solution is Computer Software Company, accounting for 25% of all views.
What is SSIS?
SSIS is a package is an organized collection of connections, control flow elements, data flow elements, event handlers, variables, parameters, and configurations, that you assemble using either the graphical design tools that SQL Server Integration Services provides, or build programmatically. You then save the completed package to SQL Server, the SSIS Package Store, or the file system, or you can deploy the ssISnoversion project to the SSIS server. The package is the unit of work that is retrieved, executed, and saved.

SSIS is also known as SQL Server Integration Services.

Buyer's Guide

Download the Data Integration Tools Buyer's Guide including reviews and more. Updated: October 2021

SSIS Customers
PKP Energetyka, UniCredit Bank, Mostar, waldwasser, Ashok leyland, Florida Atlantic University, Stadt Frankfurt am Main
SSIS Video

Archived SSIS Reviews (more than two years old)

Filter by:
Filter Reviews
Industry
Loading...
Filter Unavailable
Company Size
Loading...
Filter Unavailable
Job Level
Loading...
Filter Unavailable
Rating
Loading...
Filter Unavailable
Considered
Loading...
Filter Unavailable
Order by:
Loading...
  • Date
  • Highest Rating
  • Lowest Rating
  • Review Length
Search:
Showingreviews based on the current filters. Reset all filters
RS
Big Data Architect, Integration Specialist at Groundswell Group Inc
Real User
Easy to implement and supports many connectors, but performance-tuning requires extensive expertise

What is our primary use case?

Our primary use for this solution is to move data between points and applications.

How has it helped my organization?

This is a flexible tool to use and comes with the MSSQL server package.

What is most valuable?

This solution is easy to implement, has a wide variety of connectors, has support for Visual Basic, and supports the C language.

What needs improvement?

Tuning using this solution requires extensive expertise to improve performance.

For how long have I used the solution?

We have been using this solution for fifteen years.

What's my experience with pricing, setup cost, and licensing?

This solution is included with the MSSQL server package.

What is our primary use case?

Our primary use for this solution is to move data between points and applications.

How has it helped my organization?

This is a flexible tool to use and comes with the MSSQL server package.

What is most valuable?

This solution is easy to implement, has a wide variety of connectors, has support for Visual Basic, and supports the C language.

What needs improvement?

Tuning using this solution requires extensive expertise to improve performance.

For how long have I used the solution?

We have been using this solution for fifteen years.

What's my experience with pricing, setup cost, and licensing?

This solution is included with the MSSQL server package.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
SA
Data Architect and Modeler at a tech services company with 201-500 employees
Real User
Works well for small and medium-sized projects, but additional SQL connectors need to be supplied

Pros and Cons

  • "The initial setup of this solution is very straightforward."
  • "The performance of this solution is not as good as other tools in the market."

What is our primary use case?

We use this solution for extracting data from various databases and saving it in our data warehouse.

We use the on-premise deployment model.

What needs improvement?

My team is facing problems regarding the database connectors, which are not available. The MySQL connectors need to be purchased from outside vendors. They should provide connections for more SQL databases, free of charge.

The performance of this solution is not as good as other tools in the market. Compared to the same job is running in a different tool, it will take longer using SSIS.

For how long have I used the solution?

I have been working with this solution for one year.

What do I think about the stability of the solution?

This is a stable solution.

What do I think about the scalability of the solution?

This solution is scalable.

The number of people I have using this solution depends on the size of the project. Normally, I need three to five ETL developers. Sometimes, if the project is big enough, then I will need more.

How are customer service and technical support?

I have not contacted Microsoft Technical Support for this solution, although we have sometimes accessed the internet to research problems that we face.

Which solution did I use previously and why did I switch?

In my previous company, I was leading a team who were working with Informatica. Here, they stick to Microsoft technologies and are unwilling to change.

How was the initial setup?

The initial setup of this solution is very straightforward. In a few hours, everything was up and running.

What other advice do I have?

The decision to use this particular solution includes many factors. Some companies do not want to purchase a license for another product because this one comes included with the database.

SSIS worked well for small or medium-sized Projects. For larger projects with huge data, I believe that you should search for another solution as you will need to do manual fine tuning. Additionally, some components such as SCD will show unexpected errors with huge data.

 As Microsoft is very slow in providing updates and enhancements to SSIS, I see that the future for Integration projects in Saudi Arabia goes towards other vendors products such as  Informatica powercenter, IBM DataStage, and Oracle ODI

Compared to other Integration tools, I would rate it a six out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Find out what your peers are saying about Microsoft, Informatica, Oracle and others in Data Integration Tools. Updated: October 2021.
543,424 professionals have used our research since 2012.
MG
Technical Product Owner, Team Lead at ARCA
Real User
A well designed, straightforward, and easy-to-use solution

Pros and Cons

  • "The setup is straightforward. It's very easy to install."
  • "Video training would be a helpful addition."

How has it helped my organization?

It's too early to say how it has improved my organization, as we've just begun using the solution. First, we need to create the appropriate role before we see organizational improvements. We're just in the discovery phase of the tool.

What is most valuable?

I couldn't point out a specific feature above all others, but I can say the solution is very useful to us. It's well designed and quite straightforward. There isn't much of a learning curve involved. It's a well-documented solution. 

What needs improvement?

It's at a very early stage in terms of adopting the tool, so I can't speak to if there are areas for improvement just yet.

Video training would be a helpful addition, however.

For how long have I used the solution?

I've been using the solution for one year.

What do I think about the stability of the solution?

The stability of the solution is impressive.

What do I think about the scalability of the solution?

The scalability of SSIS is very good. Right now, only I am using the solution, but we plan to increase usage in the future.

How are customer service and technical support?

I haven't required any technical support, so I'm not in a position to judge it.

How was the initial setup?

The setup is straightforward. It's very easy to install.

What about the implementation team?

We did the implementation ourselves.

Which other solutions did I evaluate?

We did not evaluate other options before choosing this solution.

What other advice do I have?

We are just beginning development with this solution, so we're currently starting with the on-premises version.

So far, I haven't been disappointed by the tool. I have not explored the dashboard tuning feature yet, so I don't know if there are things we can do, like building our own matrixes or choosing deployment preferences, but I hope that will be the case.

I'd rate the solution ten out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
AltanAtabarut
Founding Partner at Altdata Analytics
Real User
Top 5Leaderboard
Has the potential to expand so that you can immediately write code onto the SQL server

Pros and Cons

  • "With this solution, there is the potential to expand, so that you can immediately write code onto the SQL server."
  • "The solution should work on the GPU, graphical processing unit. There should also be piping integration available."

What is our primary use case?

I'm using the solution primarily for e-memory and e-database calculation, mostly data blending (ETL) and massive table joining, etc. I also use it for data preparation and modeling.

What is most valuable?

With this solution, there is the potential to expand so that you can immediately write code onto the SQL server. Auto-parallelization is also critical for us. It decides automatically how many parallel sessions you have to run to get your query handled. You don't need to take care of any code, which is ideal. It's also highly scalable.  

What needs improvement?

The solution should work on the GPU, graphical processing unit. There should also be a piping integration available.

The design of the dashboard isn't great, visually, but the solution works, so this isn't so important to me.

SSIS's competitor, Autonomous Database by Oracle, offers you, depending on the performance of the server, new hardware, should you update your software. It advises you on how to optimize your system. It's AI and works as an artificial database manager. Microsoft doesn't have anything like this yet, but it would be nice if they did.

For how long have I used the solution?

I've been using the solution since 2008 or 2009.

What do I think about the stability of the solution?

The stability of the solution is perfect.

What do I think about the scalability of the solution?

The solution is very scalable. We have approximately 25-30 users for the on-premises version, and our cloud version has around 5,000 users.

How are customer service and technical support?

We don't ask for much from technical support. We utilize community information, and source answers via the internet or online user information. I do find that communicating with Oracle is much better.

Which solution did I use previously and why did I switch?

We previously used Oracle as an integration tool, but it's much more expensive than Microsoft.

How was the initial setup?

The initial setup was easy.

What about the implementation team?

We had a consultant that assisted us with the implementation.

What other advice do I have?

I use the private cloud as well as the on-premises version of the solution.

I would rate this solution eight out of ten. I would rate it higher if the solution offered full automation AI for database managing, setup and management, and scaling of the solution.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Igor Smirnov
Director at Netica d.o.o.
Real User
Excellent automated workflow and functionality

Pros and Cons

  • "The workflow features have been very valuable. You can have automated workflows and all the steps are controlled. The workflow functionality of integration services is excellent."
  • "It needs more integration tools, so you can connect to different sources."

What is our primary use case?

We primarily use the solution for integration packages for ETL in order to build data warehouses.

What is most valuable?

The workflow features have been very valuable. You can have automated workflows and all the steps are controlled. The workflow functionality of integration services is excellent.

What needs improvement?

The solution should offer more information online for users of SSIS. There are only a few general data sources available, but it would be helpful if there was more extensive information.

The solution could also use some more intelligent discs for data manipulation, or maybe some artificial intelligence as well. It needs more integration tools, so you can connect to different sources.

For how long have I used the solution?

I've been using the solution since 2008.

What do I think about the stability of the solution?

The solution is quite stable.

What do I think about the scalability of the solution?

The solution is very scalable. The amount of people that use SSIS from our end depends on the project, so it varies. We increase the usage of this product based on the projects that we get. Basically, each project that we do, we use this solution for integration purposes.

How are customer service and technical support?

Technical support is quite good.

Which solution did I use previously and why did I switch?

I've always used SSIS. We never considered any other solution. We might have used some tools that were developed in-house occasionally, but by and large, we always use SSIS.

How was the initial setup?

The initial setup was straightforward.

What other advice do I have?

We use the on-premises version of the solution.

I would recommend the solution to others. I'd give it a rating of eight out of ten.

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner.
BIConsule1f7
BI Consultant at a tech services company with 201-500 employees
Consultant
The drag and drop feature makes it an easy to use solution and it has a good user experience

Pros and Cons

  • "It has a drag and drop feature that makes it easy to use. It has a good user experience because it takes into account your most-used tools and they're lined up nicely so you can just drag and drop without looking too far. It also integrates nicely with Microsoft."
  • "There were some issues when we tried to connect it to data storage. It was a connection issue."

What is our primary use case?

We use the on-prem version of this solution. Our primary use case of this solution is for integrated data on the SQL server. 

What is most valuable?

It has a drag and drop feature that makes it easy to use. It has a good user experience because it takes into account your most-used tools and they're lined up nicely so you can just drag and drop without looking too far. It also integrates nicely with Microsoft.

What needs improvement?

There are a lot of features that could make this solution better. 

There were some issues when we tried to connect it to data storage. It was a connection issue. There is also room for improvement in the underlying language. 

For how long have I used the solution?

I have been using this solution for three to four months.

What do I think about the stability of the solution?

It is stable. 

What do I think about the scalability of the solution?

From personal experience, I really can't comment on the scalability, but from what I've seen online, and what I heard from other people is that it scales quite nicely.

How are customer service and technical support?

I haven't had to contact technical support. 

Which solution did I use previously and why did I switch?

We previously used an in-house solution based on Oracle APEX.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

We used a consultant for the integration.

Which other solutions did I evaluate?

We chose SSIS because it was a good financial decision. It was the most cost-effective option at the time. 

There are many projects for the company that I work for that requires implementation for a client. I started with it because I just wanted to learn about it, and I had the chance to use it because there was an opportunity that opened up to implement it.

What other advice do I have?

If you have data needs and you are a small or medium enterprise this is a good solution. If you are an organization and you're not dealing with terabytes of data, but you still need to analyze data to make that decision, you should go with this solution.

I would rate it a 7.5 out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ST
Software Developer at a transportation company with 1,001-5,000 employees
Real User
Offers good data integration but there was an incident where data extraction crashed

Pros and Cons

  • "It has good data integration and good processes."
  • "I would also like to see full integration with our BI because then our full load of data will be available in our organization. They should incorporate an ATL process."

What is most valuable?

It has good data integration and good processes. 

What needs improvement?

I would like to see an auto-scheduler feature in the Developer version of this solution.

I would also like to see full integration with our BI because then our full load of data will be available in our organization. They should incorporate an ATL process. 

For how long have I used the solution?

I have been using this solution for the last three years.

What do I think about the stability of the solution?

It is stable.

What do I think about the scalability of the solution?

Compared to other solutions, this solution is scalable. 

We only have three users using this solution. 

How are customer service and technical support?

We haven't had to use their technical support. We don't really have any issues. We only had a problem with the data extraction. 

Which solution did I use previously and why did I switch?

This is the first tool of this sort that we've used in my company. 

How was the initial setup?

The initial setup was complex. 

What about the implementation team?

We implemented it ourselves. We did the ATL. It was nice in the previous version. There was an issue where we were extracting a big load of data and the system crashed. 

What other advice do I have?

I would rate it a seven out of ten. Not a ten because of the data extraction that crashed. Tableau is better in this aspect. The transformation of data is very good in SSIS. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
AhmedShaban
Database Consultant
Reseller
Easy data retrieval and data flow with excellent reporting features

Pros and Cons

  • "The reporting on the solution is perfect. I didn't expect to see reporting features, but they are great."
  • "There was also not enough instructions from Microsoft in regards to this application or this technology, which can easily be improved upon."

What is our primary use case?

We've used the solution to create some data flows for one of the governmental sectors here in Saudi Arabia. I have created some applications for exporting data from Oracle databases to SQL databases.

What is most valuable?

I like that the data connections and all ODB connections are able to retrieve data from so many different databases. That's not only from SQL but also from other data sources as well. The data flow, which controls the data, including where to store it, is an excellent feature and is one of the most interesting points about SQL Integration Services.

The reporting on the solution is perfect. I didn't expect to see reporting features, but they are great.

What needs improvement?

More coding is needed to make it easier. It needs more development and enhancement. It also needs to offer different integration services for SQL. We had to do a lot fo the implementation ourselves, but more automating of the processes would make things easier.

Data migration needs to be simplified. I've found that it is really difficult to match, especially regarding data types. The old records often have different data types than we have now have in newer databases. Converting or transferring this kind of data is very important, and right now it's very difficult.

There were also not enough instructions from Microsoft in regard to this application or this technology, which can easily be improved upon.

For how long have I used the solution?

I've been using the solution for one year.

What do I think about the stability of the solution?

The solution is stable, as long as you provide good infrastructure for the servers you're dealing with. That means making sure the SQL will be very stable as well as the SAN storage, the RAM, the CPU, and other components of the server. It is quite important.

What do I think about the scalability of the solution?

The solution is very scalable. A client contacted me two months after deployment with a request to enlarge the services and add some more customization, which I managed to do easily. It's an enterprise-level solution, with the whole of Saudi Arabia using it for something called an E-Visa. People are applying for a visa to get into Saudi Arabia through this application. The number of users is quite high and worldwide.

How are customer service and technical support?

We haven't needed the assistance of technical support so far.

Which solution did I use previously and why did I switch?

Previously, we used the .NET framework, which we are still using as well.

How was the initial setup?

The initial setup of the solution was easy. Deployment took less than two hours. You only need one person to implement and maintain it. At the same time, we do have another person who monitors the server-side.

What about the implementation team?

I deployed the solution myself.

Which other solutions did I evaluate?

As a teacher, I worked with SSIS, so I previously used this solution. I didn't look at other options.

What other advice do I have?

In terms of advice I would give in regard to implementing this product, I'd say to examine your business needs in relation to the solution. You need to make sure those are met before implementing it. A lot of companies use data filers and they already have existing databases. You need to make sure the solution you choose is able to deal with old databases of client records.

I would rate this solution ten out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer: Reseller.
KirillSlivchikov
Owner at 7Spring Consult
Real User
Top 20
Offers excellent integration and has great simplicity

Pros and Cons

  • "The simplicity of the solution is great. The solution also offers excellent integration."
  • "The interface could use improvement, as well as the administrative tools. Jobs fail from time to time for different reasons. It's not a problem with Microsoft, or SSIS itself. The problems are external, but to find the problems and analyze them it takes too much time."

What is most valuable?

The simplicity of the solution is great. The solution also offers excellent integration.

What needs improvement?

The interface could use improvement, as well as the administrative tools. Jobs fail from time to time for different reasons. It's not a problem with Microsoft, or SSIS itself. The problems are external, but to find the problems and analyze them it takes too much time. If SSIS could make some new monitors or new features for finding the reasons for the problems in the processes and analyzing it for how to correct the issues, that would be great.

For how long have I used the solution?

I've been using the solution for over six months.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

Right now we're working with Azure, and it's scalable, but it's expensive to do so. If we had features to scale the software part of the solution, it would be great.

How are customer service and technical support?

We have support not only for SSIS but Microsoft as a whole because we're an official partner of Microsoft in data warehousing and have a partnership agreement with the company. However, support seems to be connected with sales, and if I call with technical problems and ask if they have some feature to solve the problem, months can go by with no results.

They do, however, have a professional community online and in 95% of our incidents, we can find the solution there instead of calling.

How was the initial setup?

The initial setup was simple and straightforward.

What other advice do I have?

I would recommend the solution. It's always worked for me and it's usually a top choice for my clients.

I would rate this solution at seven or eight out of ten. It's not a perfect solution, but it works well.

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner.
ITCS user
Platform Lead: Global Markets Data Analytics at Rand Merchant Bank
Real User
A solution that offers excellent compatibility with Microsoft products and an easy interface

Pros and Cons

  • "Its compatibility with Microsoft products has been very valuable to our company. It fits well within the architecture."
  • "We're in the process of switching to Informatica, and we need to work out data lineage and data profiling and to improve the quality of our data. SSIS, however, is not that compatible with Informatica. We managed to connect it to Informatica Metadata Manager, but we don't get good lineage, so we have to redo all our ETLs using the Informatica process in order to accept the proper data lineage."

What is our primary use case?

Our primary use is as an ETL tool to move data across our various environments.

What is most valuable?

Its compatibility with Microsoft products has been very valuable to our company. It fits well within the architecture.

The interface is also very good. If users are familiar with Microsoft, then they'll be quite familiar with the interface.

What needs improvement?

We're in the process of switching to Informatica, and we need to work out data lineage and data profiling and to improve the quality of our data. SSIS, however, is not that compatible with Informatica. We managed to connect it to Informatica Metadata Manager, but we don't get good lineage, so we have to redo all our ETLs using the Informatica process in order to accept the proper data lineage.

As an example of what I'd like to see in SSIS is something that I think is very effective in Informatica. In Informatica PowerCenter, you define your sources and destinations once and after that, you can simply drag and drop into the designer. It's quite a nice feature. I don't know if SSIS has that feature, but if they don't, if they could allow the setup system configurations in one step and then offer drag and drop functionality, that would be extremely useful.

For how long have I used the solution?

I've been using the solution for six years.

What do I think about the stability of the solution?

The solution is quite stable. I don't think we've had any issues.

What do I think about the scalability of the solution?

In terms of scalability, we haven't tried big data because in our enterprise we have to use Informatica to connect to the dedupe. I can't say how well it reads with the file, but we've never had an issue. However, I can't speak to scalability because we're never really, really big data.

How are customer service and technical support?

We've never had to contact technical support. Our developers would just use Google, and usually, they'll be able to find answers on the Internet.

How was the initial setup?

The initial setup was straightforward. We've got quite a strong SQL experience in my team. We have about five people working on the solution, including Data engineers and BI developers. They're all admins.

What about the implementation team?

We did the implementation ourselves. We have in-house database administrators. They usually set up all the services.

What other advice do I have?

We're using the on-premises version of the solution.

My only recommendation to those considering implementation of the solution would be, if they want to check data lineage, they should set up the packages with a particular configuration that they would have to follow in order to get data lineage. If they were to start it from scratch, my recommendation would be just to follow the pattern to ensure data lineage.

I would rate this solution eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user1048011
Senior BI Developer at a music company with 501-1,000 employees
Real User
Integrates well with other solutions in our on-premise data warehouse

Pros and Cons

  • "It is easily scheduled and integrates well with SQL Server and SQL Server Agent jobs."
  • "Improving the login procedure would make our reporting easier on monitoring our ETL processes."

What is our primary use case?

We use this solution for data warehousing.

We are using the on-premise deployment model.

How has it helped my organization?

This solution allows us to be repeatable in the sense of how we move data from A to B. It is easily scheduled and integrates well with SQL Server and SQL Server Agent jobs. We also use it with Microsoft BizTalk, and it's quite handy.

What is most valuable?

We use everything in this solution, including a third party component called COZYROC. We try to explore and use this product to its fullest.

What needs improvement?

The login process needs improvement. At the moment, when you run SSIS and it's logging on SysTD within the server itself, I think that you have the option to do verbose and a couple of others. Sometimes, it is difficult to follow how executions are actually working. They are just one after another with an execution set. If you're running multiple packages at the same time, it's difficult to track which packages are running and to locate those specifically.

Improving the login procedure would make our reporting easier on monitoring our ETL processes.

For how long have I used the solution?

I have been using this solution for approximately eleven years.

What do I think about the stability of the solution?

The stability is very good. I've been using it for a long time. If you know how to use it then it works fairly well.

What do I think about the scalability of the solution?

The scalability is fairly good. I've used it from very, very small instances to two very large ones, where we're moving large quantities of data on a daily basis.

The actual tool itself, in a sense of transformations, your ability to pick and look up and do certain things is fairly good. Scalability, you can either have it on the same server as the database or you can move it onto its own server. It just depends on the situation.

If you've got a heavy load on the database during the day and you're doing SSIS then you have the option to separate them. But as in our case, which is a data warehouse, you can keep them on the same server because you're getting a lot of your batch processes overnight. So, most of the resources are being used by SSIS. Then during the day, you have the users using the database server for reporting or data capture or data inserts and data updates.

Our users for this solution are primarily BI developers, and we have twenty to twenty-five of them, mostly offshore.

If we grow here then we'll need more SSIS servers, or we'll get bigger boxes. At the moment, I think that we're where we need to be, and I don't have any plans moving forward at this stage.

How are customer service and technical support?

We have not had to contact Microsoft technical support. We tend to be able to solve most of the issues that crop up with SSIS, which I suppose is another reason why we have it. If there are issues then we can fix them pretty quickly.

Which solution did I use previously and why did I switch?

Prior to this solution, before 2005, we used DTS. 

How was the initial setup?

The initial setup of this solution is part of the DBA's role, and I did not involve myself in this installation. However, I have done it before. To set up a SysTD is pretty straightforward, and it can be configured to your own company ETL framework.

The length of time required for deployment depends on how good your DBAs are. Some of them take quite a while because they have to install, configure, and then do some test runs. Then all of the permissions, etc, have to be taken care of. Some of these things are easy and some of them are difficult. I would say that it will take a least a week. 

We have five people maintain this solution at the moment. We have the DBAs on standby before we do any sort of deployment to UAT or production.

What about the implementation team?

We handled the implementation with our in-house DBAs.

What's my experience with pricing, setup cost, and licensing?

When you purchase SQL Server, SSIS tends to come with it. Whether you purchase the standard edition or an enterprise edition, SSIS comes with it. Whether you choose to install it on a separate server, or the same server as the database, that really comes down to Microsoft's charging.

My advice is to look at what your configuration will be because most companies have their own deals with Microsoft.

Which other solutions did I evaluate?

We did not evaluate other products before choosing this solution. However, in the last couple of years, I have researched a few others. One of these solutions looks good, but I'm not sure how well it would be in a data warehouse situation.

What other advice do I have?

My advice for anybody who is considering this solution is to install the trial version first. You can get the SQL development edition, which pretty much has everything the enterprise edition has. You can download it, put it somewhere and use it as a development or testing area. Then, if you like it, look to purchase it.

We're looking to move more to the cloud at some point. I don't know when, but we'll be doing more research before we do.

Overall, I think for what this solution does, it's pretty much all there. I don't see any way or any changes that can be made to make it work faster, or easier. I know the tool inside out, so we know what to do with it.

The other solutions that I have looked at appear to be very good in certain situations. These are good for specific information, for situations that cover everything. With SSIS, it is specific to SQL and what we do, but it's lacking in some things like logging. Monitoring itself, for example, is what is missing.

I would rate this solution an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sasidhar Parupudi
Principle Consultant at Keyrus South Africa
Consultant
Excellent for producing simple ETL solutions but the mapping process is cumbersome

Pros and Cons

  • "The most valuable features for our company are the flexibility and the quick turn around time in producing simple ETL solutions."
  • "I come from a coding background and this tool is graphically based. Sometimes I think it's cumbersome to do mapping graphically. If there was a way to provide a simple script, it would be helpful and make it easier to use."

What is our primary use case?

We primarily use the solution for data transformation.

What is most valuable?

The most valuable features for our company are the flexibility and the quick turn around time in producing simple ETL solutions. Also, the use of the configuration on Azure. Our clients are basically secret sever based, so it works well there.

What needs improvement?

In terms of improvements to the solution, I don't think I have that expertise on it to recommend product enhancements right now. I've only used a few parts of it. Based on what I've used so far, I am happy with it.

I come from a coding background and this tool is graphically based. Sometimes I think it's cumbersome to do mapping graphically. If there was a way to provide a simple script, it would be helpful and make it easier to use.

For how long have I used the solution?

I've been using the solution for the past six months.

What do I think about the stability of the solution?

The solution is currently running in production. It's stable.

How are customer service and technical support?

We've never had to contact technical support. We generally handle any issues ourselves.

How was the initial setup?

The initial setup is straightforward. Deployment is done by another team. I download the package and deliver it to them and they deploy it onto the cloud. In terms of maintenance, we only need one person.

What other advice do I have?

We are using the 2014 Public Cloud version of the solution, under Microsoft Azure.

In terms of recommending the solution, it obviously depends on the level users you have. If you mainly have a simple graphical based user then maybe you could go for SSIS. If you want to take control of the whole ETL process, you should look for another tool. I use Informatica PowerCenter as well. That's a bit more of a robust ETL solution.

I would rate the solution at six out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
LD
Data Scientist at a tech vendor with 501-1,000 employees
Real User
A strong solution for batch processing with good community support

Pros and Cons

  • "I have used most of the standard SQL features, but the ones that stand out are the Data Flows and Bulk Import."
  • "This solution needs full support for real-time processing."

What is our primary use case?

We use this solution for ETL, which includes data summation and cleaning. 

This solution used in an on-premise deployment, for now.

What is most valuable?

I have used most of the standard SQL features, but the ones that stand out are the Data Flows and Bulk Import.

What needs improvement?

The synchronous processing needs to be improved. For batch processing, it works fine, but when you start to do real-time processing, I find that this solution is not strong, depending on how you use it. If you use it for short batches, micro batching, that might work, but it is not as good for queuing real-time processing. This solution needs full support for real-time processing.

The solution needs better support for XML and JSON.

For how long have I used the solution?

I've been using this solution for the past ten years.

What do I think about the stability of the solution?

I use this product extensively, on a daily basis. It is stable.

Over time, I think we'll most likely start to decrease usage. This will happen as we move to more real-time processing, and we will most likely start to do more queue processing. In terms of batch processing, it will scale down considerably.

What do I think about the scalability of the solution?

In the catalog, it's supposed to be scalable. I think that it has support for an SQL cluster.

In my opinion, I think it's a bit more limited in terms of scalability, although it scales with the database. I would say that the scalability is intermediate in terms of being able to launch multiple instances, or it could do load balancing as well. I think that would be a bit more challenging.

How are customer service and technical support?

We haven't needed to contact Microsoft technical support. When we have trouble we usually use Google to search for what we need to find out. Also, in terms of issues, there's a lot of information on SQLServerCentral and Stack Overflow.

Which solution did I use previously and why did I switch?

For the most part, we started with this solution. 

How was the initial setup?

The initial setup of this solution is pretty much straightforward.

If you want to develop with Visual Studio then you have to install the data service add-ons afterward, so it is a bit cumbersome. Then, if you want to use the catalog on the database, with deployment there are often security issues and you have to get an SQL catalog up and running. This can also be a bit cumbersome.

I would say that it takes a day or two to deploy this solution in a new environment, and it can be completed by one or two people. A single developer may be sufficient to deploy and maintain the system.

What about the implementation team?

I implemented some of this myself and had help in terms of setting up the security. There are often security settings that require the assistance of a DBA.

Which other solutions did I evaluate?

This solution was already in place. That's what is available and it's what people know. Going forward, this will most likely change.

What other advice do I have?

My advice to anybody implementing this solution is to look into whether to use it on a catalog in a database, versus using package deployment. There are pros and cons to both approaches in terms of deployment and security. I would say that's something that needs to be evaluated quite early. There are lots of benefits to the catalog, but also a bit more admin attached to it.

Another consideration is real-time processing needs. If this is a requirement then I would recommend against using this solution, unless the next version has a new set of features specifically for real-time processing.

I would rate this solution an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ET
BI and Report Development Technician at a tech vendor with 201-500 employees
Real User
Simple to install and has support for SQL views

Pros and Cons

  • "There are many good features in this solution including the data fields, database integration, support for SQL views, and the lookups for matching information."
  • "The debugging could be improved because when it came to solving the errors that I've experienced in the past, I've had to look at the documentation for more information."

What is our primary use case?

I use this solution to create BI reports. I have used SSIS in more than twenty projects over the past six years.

What is most valuable?

There are many good features in this solution including the data fields, database integration, support for SQL views, and the lookups for matching information.

What needs improvement?

I have had problems in the past with this tool. The debugging could be improved because when it came to solving the errors that I've experienced in the past, I've had to look at the documentation for more information.

The main difficulty that I have had is to match different data types, where there were errors that I could not fix.

I would like to see better integration with the control sources. 

For how long have I used the solution?

We have been using SSIS for ix years.

How are customer service and technical support?

I am satisfied with the technical support for this solution.

How was the initial setup?

The initial setup of this solution is quite simple.

What about the implementation team?

We handled the installation of this solution in-house.

What other advice do I have?

Overall, I am quite satisfied with this product.

I would rate this solution an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Alireza Nikseresht
CEO at Shiraz University
Real User
User-friendly solution with a good dashboard

Pros and Cons

  • "It's already very user-friendly and has a good dashboard."
  • "Sometimes when we want to publish to other types of databases it's not easy to publish to those databases. For example, the Jet Database Engine. Before the SSIS supported Jet Database Engine but nowadays it doesn't support the Jet Database Engine. We connect to many databases such as Access database, SparkPros databases and the other types of databases using Jet Database Engines now and SSIS now doesn't seem to support it in our databases."

How has it helped my organization?

This product is always used in projects because I'm in a company that develops software for other companies and we accept this type of work like programming and developing data models and writing the script for organizations to transfer their data between databases. We usually use these tools, and it's a cheap tool for us.

What is most valuable?

Usually, when I want to import data from other databases I use this solution. It's been about fifteen years that I've worked with an SQL server, and I'm a software developer. I've had many companies with different databases, and it's ranged from FoxPro, EDX and Oracle, and some other databases. When I want to import data I usually use this tool for importing data from the other databases to the server. It is really good, and it's solved many problems for me because I can choose the source table, destination table, source field, and destination field. I can match these things, I can write some queries to do this and because of that, I use this tool.

What needs improvement?

Sometimes when we want to publish to other types of databases it's not easy to publish to those databases. For example, the Jet Database Engine. Before the SSIS supported Jet Database Engine but nowadays it doesn't support the Jet Database Engine. We connect to many databases such as Access database, SparkPros databases and the other types of databases using Jet Database Engines and SSIS now doesn't seem to support it in our databases.

In future releases, it would be helpful to support new SQL databases. It would also be helpful if there was a way to use SQL with other languages like Python or ASP.NET.

For how long have I used the solution?

I've been using the solution for 15 years.

What do I think about the stability of the solution?

The stability is very good and I recommend it to my friends, my coworkers, and the other companies.

What do I think about the scalability of the solution?

The solution is scalable.

How was the initial setup?

The initial setup was very easy for us and very good.

What about the implementation team?

I installed the solution by myself.

What's my experience with pricing, setup cost, and licensing?

The price is important to us. We are based in Iran, and we look for products with a good price because of the sanctions. Some other products such as Oracle are expensive here and we do not recommend these products to our customers. They are expensive and they are very difficult to work with them. They are very difficult and very complicated but Microsoft products are not like that.

Which other solutions did I evaluate?

I used this product when I was a student so I chose this technology because I understood it best. 

What other advice do I have?

We chose the solution because we have some data in Oracle and we wanted to extract it and load the data into SQL Server

The type of port, like SSIS, I always recommend other companies to use it and other Microsoft products because they are very easy to use. There are very handy tools. It's very user-friendly and has a good dashboard.

I would rate the solution nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Steve Seeber
Senior Director Business Intelligence at Xtivia
Real User
An inexpensive tool with a short learning curve

What is our primary use case?

The primary use case for this solution varies according to the customer, but it typically involves moving data from OLTP systems into a data warehouse and/or data marts.

How has it helped my organization?

This solution has provided an inexpensive tool and it is easy to find experienced developers.

What is most valuable?

The most valuable features are the relatively short learning curve, and the automation capabilities provided through the BIML add-in for SSDT.

What needs improvement?

The UI could use refinement. I would like to see more standard components out of the box, such as SFTP, and data compression components.

For how long have I used the solution?

More than five years.

What is our primary use case?

The primary use case for this solution varies according to the customer, but it typically involves moving data from OLTP systems into a data warehouse and/or data marts.

How has it helped my organization?

This solution has provided an inexpensive tool and it is easy to find experienced developers.

What is most valuable?

The most valuable features are the relatively short learning curve, and the automation capabilities provided through the BIML add-in for SSDT.

What needs improvement?

The UI could use refinement. I would like to see more standard components out of the box, such as SFTP, and data compression components.

For how long have I used the solution?

More than five years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Brian Dandeneau
Business Process and Strategy Specialist Advisor at NTTData
Consultant
Top 5Leaderboard
It was a easily adaptable to our group. The scalability needs some improvement.

What is our primary use case?

Movement of data and creation of files. ALl the typical things that you would have a ETL solution do. Data movements were in the millions and no calculations were completed. This means it was always a select * from where ever it was coming from and going to. Light translations like concatenation was being used. 

How has it helped my organization?

SSIS was easily adaptable to our group. It was cheaper than the other tools that we compared it to, however I feel that we got what we paid for.

What is most valuable?

The packaging and how it is organized is good for someone that really has never seen ETL before. 

What needs improvement?

Scalability of SSIS needs some improvement. Seems to get sluggish as soon as we hit a high volume of data.

For how long have I used the solution?

Still implementing.

What do I think about the stability of the solution?

Again it failed a lot and by a lot I mean every day. The failures were false alarms and caused many sleepless night for our company that I used to work for.

What do I think about the scalability of the solution?

SSIS is good for smaller shops that don't really have a high volume of data.

How are customer service and technical support?

I would rate the customer service as poor. 

Which solution did I use previously and why did I switch?

We were using Hyperion Application Link. We switched because HAL was being sunset.

How was the initial setup?

Never participated in the initial setup.

What about the implementation team?

In-house.

What was our ROI?

For the money, it's a decent tool. However, if the budget was larger I would have gone with a different tool

What's my experience with pricing, setup cost, and licensing?

Look at how this product is sold to you. Ask yourself, am I getting everything that I need. Its more expensive to get the additional adapters after the fact.

Which other solutions did I evaluate?

We looked at ODI, Informatica, and DataStage. All three we had in-house. ODI was the better option and after dealing with SSIS for only a few months, we ended up using ODI.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
VD
student at University of Newcastle
User
A change in the metadata source cripples the whole ETL process, requiring each module to be manually reopened. The solution can be deployed into the cloud through Data Factory

What is our primary use case?

SSIS is a very flexible solution that allows data to be generated through code or external software. As a result, we can create reproducible patterns and improve code quality.

How has it helped my organization?

Because of the code generation, code quality improved and the time to develop decreased. This led to a quicker and more agile development phase.

What is most valuable?

It has the ability to be deployed into the cloud through Data Factory, and run completely as a software as a service in the cloud.

What needs improvement?

Metadata management. A change in the metadata source cripples the whole ETL process, requiring each module to be manually reopened.

For how long have I used the solution?

More than five years.

What is our primary use case?

SSIS is a very flexible solution that allows data to be generated through code or external software. As a result, we can create reproducible patterns and improve code quality.

How has it helped my organization?

Because of the code generation, code quality improved and the time to develop decreased. This led to a quicker and more agile development phase.

What is most valuable?

It has the ability to be deployed into the cloud through Data Factory, and run completely as a software as a service in the cloud.

What needs improvement?

Metadata management. A change in the metadata source cripples the whole ETL process, requiring each module to be manually reopened.

For how long have I used the solution?

More than five years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
RajeshHegde
Assistant Manager at a tech services company with 10,001+ employees
MSP
In addition to the ETL challenges, it allows us to do DBA activitities, scripting and many other tasks.

Pros and Cons

  • "In SSIS, the scope is not only to handle ETL challenges, but it will allow us to do so many other tasks, such as DBA activities, scripting, calling any .exe or scripts, etc."
  • "SSIS can improve in handling different data sources like Salesforce connectivity, Oracle Cloud's connectivity, etc."

How has it helped my organization?

For the full version of the SQL Server, SSIS, SSAS and SSRS will come as additional features for free. Hence, my organization need not spend extra money for other ETL, reporting and analysis tools. This can give very good flexibility.

What is most valuable?

In SSIS, the scope is not only to handle ETL challenges, but it will allow us to do so many other tasks, such as DBA activities, scripting, calling any .exe or scripts, etc.

What needs improvement?

SSIS can improve in handling different data sources like Salesforce connectivity, Oracle Cloud's connectivity, etc. Also, handling of the different data types will be a big challenge here; so expecting improvement in these areas.

What do I think about the stability of the solution?

There were no stability issues.

What do I think about the scalability of the solution?

There were no scalability issues.

How are customer service and technical support?

Technical support is very good.

Which solution did I use previously and why did I switch?

Based on the client's requirements, we switched over to this solution.

How was the initial setup?

The setup is straightforward.

Which other solutions did I evaluate?

We looked at Informatica and DataStage.

What other advice do I have?

Don't worry, go ahead.

Disclosure: My company has a business relationship with this vendor other than being a customer:
ITCS user
Business Intelligence Consultant at a tech services company with 51-200 employees
Consultant
All the necessary types of enumerators are available, but writing custom components and sharing them across multiple ETL streams is tricky.

Valuable Features:

  • Compatibility with the rest of Microsoft BI stack
  • Scripting component which opens unlimited possibilities: by having the full .Net stack/libraries available, if you need a yellow rubber duck to quack every million rows, it will quack
  • Inter-stream dependency mechanism built directly into the tool; the whole ETL solution can be driven off it without having to employ third party software
  • Parallel processing of a data flow: let's say there's five steps in the flow and ech step consumes output from the preceding step; once step #1 is completed and its data passed to step #2, the engine will feed the next batch of rows to step #1. Once the first batch of rows reaches the last step, there are five concurrent steps working on five different data sets, maximizing utilization of server resources
  • All the necessary types of enumerators (files, xml nodes, rows and many more)

Improvements to My Organization:

  • SSIS packages are stored in XML-like format; we keep them in SVN which makes it very easy to track changes
  • Built-in dependency tool is really great; you can not only react to success/failure results, but you can evaluate any expression in the flow and execute (or bypass) any components based on the result of that expression. This gives you full control to build decision trees of any complexity

Room for Improvement:

  • Writing custom components and sharing them across multiple ETL streams is tricky and requires specialistic voodoo knowledge
  • Make the GUI less eye-candy and more responsive; especially the scripting component needs some TLC regarding UI responsiveness

Initial Setup:

It's easy to deploy, you just need a simple file-copy mechanism that is sufficient for most deployment scenarios.

Other Advice:

The good:

  • Straightforward, intuitive, quick to learn
  • Built-in debugger: variables, data viewers, breakpoints - everything you would expect from a modern software development tool
  • Sleek, snappy UI
  • Great flexibility with storing and deploying ETL components

The bad:

  • Third party plug-ins may become incompatible with future versions of SSIS
  • Upgrading is usually painful and time consuming despite what MS says
  • Compatibility issues may arise when used with non-Microsoft technologies
Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Senior Consultant with 51-200 employees
Vendor
It allows us to add C# or other code into ETL flow, but it doesn't have a built-in version control, so you have to get that from other vendors too (so it lacks support for multiple developers).

What is most valuable?

Its easy to use, fast to develop. Quite easy to also add C# or other code into ETL flow. Transfer to production is much faster than for example with IBM Infosphere Datastage, you can just copy the packages. Haven't found bugs much at all. Its license cost is also quite cheap compared to IBM and Informatica offerings. Overall when choosing a new ETL software I would look into SSIS or one of the open source tools.

What needs improvement?

-If you want to connect to SAP for example you have to buy add-in from other company and the same applies to many other sources.

-It doesn't have a built-in version control, so you have to get that from other vendors too (so it lacks support for multiple developers).

-Visual studio crashes sometimes.

-Doesn't have good ELT functionality, though ofcourse one can just do SQL.

-Overall there's many small things that could be done to make development faster and the product is not definitely perfect, but one has to compare to offerings of other vendors, which are not 'perfect' from usability and performance standpoint eather.

For how long have I used the solution?

Been using it for 5 years now.

What do I think about the scalability of the solution?

Scalability not on the same level as with IBM Datastage.

Which solution did I use previously and why did I switch?

Many customers are switching from IBM Datastage to SSIS. Think its the ease of use and license costs.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITCS user
Developer at a tech consulting company with 51-200 employees
Consultant
Microsoft SQL Server Integration Services is one of the most effective ETL tools available in the market

Valuable Features:

Microsoft SQL Server Integration Services provides you the ability to build ETL solutions with very minimum background knowledge. If you are already familiar with DTS packages or fairly new to ETL, it is very easy for you to start with your first assignment using SSIS.Microsoft SSIS not only provides you the traditional ETL tasks for loading data from source to destination, it also provides you the ability to send emails using it and write your custom code using Script task and offers to process SSAS task, which makes it stand out from its competitors.It is specifically designed to provide high data transfer using parallel computational logic.Very easy to configure and deploy and maintenance is very low.

Room for Improvement:

It is unable to provide separate user account for each user like Data Hub. Actually Microsoft SSIS is more designed to build ETL solutions for enterprise level unlike Data Hub where each user can access the database and have their own sets of databases.Need to improve error logging although it does provide error logs but there is still more room for improvement in this area. Microsoft SQL Server Integration Services is one of the most effective ETL tools available in the market.Microsoft SQL Server Integration Services is very low cost compared to the market’s leader Informatica (Power Center) and almost offers everything you need to build your ETL solution.

Other Advice:

You need to have Microsoft SQL Server (Enterprise Edition) licensed to run Microsoft SSIS on your production environment. Like every other Microsoft product, It offers a very vibrant MSDN community along with Microsoft support staff for assistance.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user91539
Engineer at a tech services company with 501-1,000 employees
Consultant
SSIS vs. BizTalk - which is best for integration with Dynamics Ax
During the last couple of years, I have integrated data with Dynamics Ax both with SSIS and BizTalk. A common question I'm asked is what is the difference when every thing is possible in SSIS why do we need BizTalk or what does BizTalk provide different from SSIS. So My answer to this question is something like: Everything that BizTalk provides can be implemented in SSIS. But the major difference is batch processing. Usually SSIS package are used to migrate large set of data or dataset. BizTalk provide the operations to be perform on one message at time or real time processing. Because everything in BizTalk is XML so BizTalk is very slow on large set of data. BizTalk provides large number of adapters, while In SSIS you have to use direct connection by Oldb, or Sql db to…

During the last couple of years, I have integrated data with Dynamics Ax both with SSIS and BizTalk. A common question I'm asked is what is the difference when every thing is possible in SSIS why do we need BizTalk or what does BizTalk provide different from SSIS. So My answer to this question is something like:

Everything that BizTalk provides can be implemented in SSIS. But the major difference is batch processing. Usually SSIS package are used to migrate large set of data or dataset. BizTalk provide the operations to be perform on one message at time or real time processing. Because everything in BizTalk is XML so BizTalk is very slow on large set of data. BizTalk provides large number of adapters, while In SSIS you have to use direct connection by Oldb, or Sql db to communicate with different database and depend on OlDb connections. In BizTalk large number of Adapter provided to communicate which may or may not be depend on OlDB connection. Build in Tracking system (BAM) and its display on BAM portal is also big advantage on SSIS. For this purpose you have to make a custom tracking system in SSIS which require a lot of coding. Third advantage of BizTalk over SSIS is BRE. Business rule engine. BRE provide the condition whose value can be changed and complete follow of BizTalk application. These BRE roles can be used in multiple biztalk application while these functionality can be achieved on config files in SSIS.

In conclusion, when we required less data integration/migration and require complex decision making we used BizTalk. For example we have to implement complex work flow on single record. BizTalk application also used route data, read from one location, transform it and drop on other location. A simple example of this transactional data, when one transaction is occur in one system and its impact or integration will required on other system we will use BizTalk. BizTalk is a rapid development tool as compare to SSIS.

When we have a large sum of data, we require less complexity and requirement of integrated systems are based on Same technology then we have to use SSIS. Usually SSIS is used to migrate or integrate the non-transaction data or step up data. The delay of migration and integration possible or example Batch processing. SSIS is built for ETL process, it is not rapid integration tool.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user91536
Architect at a marketing services firm with 51-200 employees
Vendor
SSIS vs Hadoop
On one corner we have Hadoop, a massively distributed JVM-based data processing engine with a Map & Reduce API and a proven track record in handling huge data-sets. On the other corner we have SSIS, a natively non-distributed ETL engine part of the SQL Server family tool-set with .NET code extensibility features and a drag and drop UI (for the most part anyway). Two sweet technologies, probably shouldn’t be compared to each other but we’re doing it anyway, pitted head to head against a data mapping task to the death (or at least to the recycling of my test VMs)… Now FIGHT! Recently I have been tasked with building a data processing layer tracking social signals with the following characteristics: Input data is flat files. Although initially the amount of data might not…

On one corner we have Hadoop, a massively distributed JVM-based data processing engine with a Map & Reduce API and a proven track record in handling huge data-sets. On the other corner we have SSIS, a natively non-distributed ETL engine part of the SQL Server family tool-set with .NET code extensibility features and a drag and drop UI (for the most part anyway). Two sweet technologies, probably shouldn’t be compared to each other but we’re doing it anyway, pitted head to head against a data mapping task to the death (or at least to the recycling of my test VMs)… Now FIGHT!

Recently I have been tasked with building a data processing layer tracking social signals with the following characteristics:

  • Input data is flat files. Although initially the amount of data might not be classified under “Big Data” per-say, but certainly had the potential to grow very quickly. Files were very small JSON format (1 KB average).
  • Output data is flat files. Delimited file which will be queried through a Hive Warehouse layer.
  • Data is only Mapped and not Reduced.Which means data is only extracted from the flat files and processed but never aggregated, and in any case SSIS is not capable of reducing (or aggregating) data in a scale-out architecture without building a custom intermediary layer (such as temporarily placing data in a database).
  • Data Latency into Hive is of Paramount Importance.

Both technologies are capable of iterating through a large number of flat files, extracting information and building an output, and when we take the Reduce operation out of the equation, we level the playing field and now both technologies can be scaled out, albeit Hadoop in a perhaps more friendly manner.

Although these technologies have a wider application and usage that they might be better suited to, in this experiment I was only interested in performance figures on this basic task.

In order to test these technologies against the mapping task, I have built two test machines, one for SSIS with SQL Server to support the SSIS Catalogue database, and another for a simple 3 node Hadoop cluster, the technical specification for each scenario is as follows:

Integration Service (SSIS) Hadoop
CPU 4 Cores / Node 2 Cores / Node
RAM 8 GB / Node 3 GB / Node
Nodes 1 VM 3 VMs
OS Windows Server 2012 CentOS
Edition SQL Server 2012 Cloudera CDH 4

Although the specifications for each test setup is slightly different, which makes the comparison fairly “unscientific”, the over-all processing resources available for each test scenario should be fairly comparable, with the Hadoop cluster gaining a slight edge in terms of over-all CPU cores and RAM. Besides, we are only looking for a really considerable difference in the result to warrant a favouritism of one technology over the other in this business requirement.

I ran two test scenarios:

  • Scenario 1: 33,000 small (1KB) JSON input files, each file will have about 5 – 10 values to extract against a key (mapping).
  • Scenario 2: 33 input files (every 1,000 files in scenario 1 is concatenated)

The results of the test were as follows:

Scenario 1 (33,000) Scenario 2 (33)
SSIS 14.5 (Min) 3.94 (Sec)
Hadoop Cluster 957 (Min) 134 (Sec)

As can be deduced from the results above, 1 SSIS instance showed up to 66X better performance in handling and processing flat files than the same job running in a Hadoop cluster.

Learnings from SSIS vs Hadoop Test

There are a few key learnings that has been gained by doing this experiment:

  • Hadoop has a terrible start time when operating on a file, the processing engine could take up-to 5 seconds before it could actually start processing the file, were SSIS takes less than 0.2 of a second. Java has never been a very agile language in my opinion.
  • Hadoop is not intended to handle a large number of small files, instead try combining smaller files into bigger concatenations. Sometimes it is considerably faster to have a pre-processing step that concatenates files into smaller batches.
  • Although the number of “Reducers” for a Hadoop job could be easily controlled, it is more difficult to control how many “Mappers” available for a job across the cluster, and Hadoop does not always adhere to the user-set number of Mappers.
  • Although SSIS outperforms Hadoop by an average of 50X on this simple task, Hadoop scales in a much more user-friendly manner, and allows users to “Reduce” or aggregate the data across all nodes for a particular job, a feature that is not supported by the out-of-the-box Integration Service.
  • Don’t just jump on new technologies, you need to test it and ensure that it is suitable for your particular business requirement, Hadoop is a great distributed processing engine when used in the correct context. It is too easy these days for managers and BI people to band around the term “Hadoop” for everything “Big Data”, from data processing to warehousing, but you need to take the time to separate the wheat from the chaff.
  • HDInsight (Microsoft’s Hadoop distribution which runs on Windows and Azure) was another technology that we were investigating at the time, although performance was extremely terrible that it was eliminated from the race fairly quickly.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Product Categories
Data Integration Tools
Buyer's Guide
Download our free Data Integration Tools Report and find out what your peers are saying about Microsoft, Informatica, Oracle, and more!