Microsoft 400x400

SSIS Review
SSIS 2012 - The good, the bad, and the ugly

3,976
4

Sql Server Integration Services (SSIS) 2012 Review

Summary

SSIS 2012 - the Good the Bad, and the Ugly

SSIS is a unique ETL tool for a number of reasons. Most shops already own it but may treat it as a one-off free utility rather than as an enterprise ETL tool even though it can compete with the best of ETL tools. Some of the lack of respect is due to immature beginnings in its former "DTS" days. And some of it is because it comes bundled with Sql Server and does not require a separate ETL server. It can't be good if its free right?  Especially when competitors charge 6 figures for the same functionality. Don't we get what we pay for?  

My answer is, generally yes but this may be an exception worth noting.

SSIS has become extremely flexible, extensible and integrated with many other Microsoft tools and add-ons. It's incredibly cost effective, easy to learn the basics quickly (although like all ETL tools requires the traditional learning curve to get good at) and has an immense user base. There are also endless bits of quality shareware available that seamlessly plug-in and a wide variety of low priced vendor supported add-on products to fill in any out-of-the-box gaps (see my other review of MellissaData Data Quality MatchUp for SSIS). And if you can't find something you need and you know how to code C#.net you can extend the tool yourself.

The Good

  • How is free for a price? (for anyone running enterprise or BI editions of SqlServer)
  • Limitless Extensibility

This comes from the fact that SSIS is merely Visual Studio code and comes with templates to add your own custom components. A large variety of pre-built shareware is available at the codeplex.com website and vendors such as MellisaData and Pragmatic Works provide sophisticated add-on components from advanced realtime monitoring to state-of-the-art data quality plug-ins. BIML shareware allows for automated ETL code generation based on XML templates. Some of the shareware available on Codeplex are very high value such as the MultipleHash component providing very sophisticated hashing to support CDC and SCD operations. If you don't like out of the box functionality (such as the SCD wizard which is largely worthless) then there's likely a worthwhile supported replacement by some vendor for a reasonable price. There's built in support for CDC from many database sources such as Sql Server and Oracle for no added charge which is unheard of among their competitors.

Adding plug-ins such as BiXpress, Task Factory and DocXpress from Pragmatic Works are relatively inexpensive tools that do some really amazing things. BiXpress is a MUST for providing real-time and historical monitoring of ETL including tracking package parameter and local variable value changes both in real-time and historical. I highly recommend MelissaData Matchup for SSIS - you'll never see any other data quality tool as easy to use as that one which seamlessly integrates with SSIS.

  • New Version 2016

Many improvements are here with the latest version such as ability to deploy a single package within a project (a feature lost when going from 2008R2 to the new project deployment model in 2012).

  • Common Development Environment - Visual Studio

Unlike any other ETL tool - if you learn Visual Studio, you gain familiarity with an entire toolset. Its navigation and project organization is common to all other .net development. Along with Visual Studio you get all the  source control plug-ins inherent with the tool such as TFS (aka. TFVC in VSO which is cloud based and free!), and Git.

  • Job/package Parameters

2012 Version introduced very flexible parameter capability superior to many other ETL tools. Project parameters integrate into Sql Agent to provide job-level ability to dynamically change runtime values such as source/target connections.

  • Endless Add-ons

Either via shareware or purchased products. Codeplex.com provides huge amount of free shareware code. Vendors such as KingswaySoft and PragmaticWorks and many others provide multitude of inexpensive add-ons from adapters to enhanced components.

  • Very Large User Community

Just google if you don't know the answer - from youtube to blogs there's an incredible amount of information out there about this tool.

           List of good features...

  1. Respectable Performance
  2. Robust historical repository reporting provided in the included SSISDB repository
  3. Includes connectivity to large variety of sources/targets
  4. Built in CDC for multiple sources (formerly a purchased add-on from Attunity). This feature alone is often 6 figure add-on from other vendors.
  5. High value/low cost Data Quality component integrations from MelissaData
  6. Sophisticated breakpoint debug capability including inside VB and C# scripts
  7. Integration with Change Control Software (e.g. TFS, Subversion, Git)
  8. Fully integrated with Sql Server Agent for scheduling including dynamic job step parameters
  9. Integrates with SSAS tabular and cubes as well as Data Mining algorithms
  10. Includes data profiling task and wizard
  11. High level of sophistication with source/target drivers
  12. Free Attunity OLEDB drivers for higher performance connections to Oracle and Teradata
  13. Multiple plug-ins for interfacing with applications such as Salesforce.com and Dynamics CRM
  14. Longevity of the tool and consistent support and enhancements by Microsoft
  15. Full power of either VB or C# script tasks to accomplish pretty much anything that isn't already included

The Bad

  • No direct support for push-down of joins

You have to write push down join & lookup SQL to the database yourself via stored procedures or use of the SQL Task to get very high performance. That said, this is a common complaint for nearly all ETL tools on the market and those that offer an alternative such as Informatica offer them at a very expensive add-on price.

  • Slowly Changing Dimension (SCD) Wizard has poor performance

No surprise here as this is a common issue with ETL SCD wizards. Requires custom/shareware enhanced wizard or one from 3rd party vendor to get good performance. This begs the question why Microsoft can't come up with a better wizard since it's pretty straight forward to code a dynamic merge as a workaround. Both much improved Codeplex shareware version and supported version from PragmaticWorks are however available as add-ons.

Here's a list of constraints or potential gotchas...

  1. Difficult to monitor server memory usage/bottlenecks. Possible but non-intuitive.
  2. Expression Language primitive and inconsistent with other languages. The workaround is to use the script task that allows either VB or C#.
  3. Limited native scalability - No direct support for multiple server/clustered installation/processing.
  4. Flat file connectors are overly difficult to build and maintain - Changes in columns and file layout is cumbersome and problematic. Datatype detection is almost always wrong requiring manual settings for every column.
  5. Tight coupling with Sql Server for metadata repository.
  6. Default datatypes not always correct when reading from views. This can cause syntax errors in data flows and components such as union all. Workarounds are to explicitly convert in datatype conversion task or override the default metadata datatype.
  7. Previous metadata often does not disappear when changing targets/sources. Tasks have the tendency to hold onto previous datatypes and lengths and not automatically pick up the changes.
  8. Inconsistent data types for variables and parameters. Start with one set of datatypes coming from a database, combine with a completely different set of internal variable datatypes, another set with either Vb or C# variable datatypes when using script tasks, another set of datatypes when passing parameters to stored procedures, and yet another when applying SSIS expressions...and it's quite the mess in the end. You get it figured out eventually but it leaves you scratching your head asking why it has to be so difficult when it's all the same vendor's product. The C# and VB and database datatypes are a given but why can't the others follow one of those?
  9. Confusing context/scope for variables to watch when running multiple levels of parent/child execution. Debug mode shows all of them at the same time and the context for each set is not intuitively obvious. For example you get list of each parent and child in the hierarchy of calls and its easy to mistake which package the variables relate to when viewing in the debugger.
  10. Expression editor is clumsy and language is cryptic.  There are 3rd party shareware editors which may help develop your code but the end result is hard to read if complex. As a general rule it is best to move complex expressions into VB or C# script task to avoid hard to maintain expressions.

The Ugly

Deployment instability with very large (in size of packages) projects

It is not unusual to get "out of memory" errors IF you use the default deployment wizard which is 32 bit. However there is a 64 bit version that eliminates this issue but you have to realize that using it isn't the default.

SSDT (the development tool) gobbling memory so you have to exit at least once a day to free up memory.

Containers that help group tasks have several very annoying bugs. For example, sometimes attempts to resize will make the diagram tool move about wildly and out of control. A task within a container sometimes becomes detached and you can't get it back into the container. Some containers just decide they aren't going to accept more tasks. The workarounds include:  exit SSDT and come back in, and create a new container and move all the tasks (which may not be reasonable if you have lots of tasks with precedence sequence constraints). Another sometimes helpful workaround is using auto-layout which often resolves display problems.

Prior to 2012 not recommended

Prior versions had many issues including debugging instability with large parent-child package call volumes (@30 seemed to be the magic instability number) and com locking issues when running many parallel threads. It's largest drawback however was it was overly complicated with its prior parameter passing through XML configurations which were very difficult to debug and manage. That said, it was still superior to ETL tools that require passing parameters via just files. But these issues were resolved with 2012 when they introduced project and package parameters and changed memory management for parent/child package calls.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
7 visitors found this review helpful
Ae31ecf3 2279 4139 897d a4e625666cc4 avatar?1439988666Justin duperre avatar 1433343384?1433343382Birsen tunc yazar avatar 1433926590?1433926588D565c461 a5b8 45c9 bd61 06a7a2bcfea1 avatar4f3daf44 44b1 469d a4b5 60bdb4c61f08 avatar?1458919119Anonymous avatar x30Anonymous avatar x30

4 Comments

Anonymous avatar x30

Great summary. Thanks.

Like (0)17 July 15
Picture 1288 1365323206
GaryMReal UserTOP 5POPULAR

Update regarding deployment instability... It is caused by very large projects and using the default install shortcut 32bit deployment wizard. Running the 64 bit version resolves the issue.

Like (0)10 January 16
Piotr lipski li?1414331704
Piotr LipskiConsultantTOP 20

You are missing one important feature in "The Good" secion: the new (2012+) deployment / storage model (i.e. Integration Services Catalogs) while indeed tightly coupled with the SQL Server technology, provides a very handy access to all SSIS metadata (including execution history & results) via pure SQL. No need to click the GUI any more in order to get the status of all currently running tasks, error details of failed components and so on. You can even execute tasks by calling SQL procedures built into the solution. A big winner for me.

Like (1)12 September 16
Justin duperre avatar 1433343384?1433343382
Justin DuperreReal UserTOP 20

Great summary - knowing the built in SCD is notoriously slow, I've built a custom type 2 template within a data flow task that is fairly versatile. I am interested to see how other ETL "canned" SCD options compare.

Like (2)12 September 16
Anonymous avatar x30
Guest

Have A Question About SSIS?

Our experts can help. 218,478 professionals have used our research on 5,616 solutions.
Why do you like it?

Sign Up with Email