SQL Server Integration Services (SSIS) was introduced in SQL Server 2005 as the successor to Data Transformation Services. It was a complete revolution and SSIS was firmly positioned as Microsoft’s Enterprise ETL tool. When SQL Server 2008 was released, a lot of changes were made under the hood regarding stability and performance and a couple of new features were introduced. SQL Server 2008R2 came with very few changes, but SQL Server 2012 revolutionized the way of working with SSIS. It introduced some features but, more important, it introduced the Project Deployment Model. This model allows easier deployment to the server, but also provides a better and easier manageable administration framework. In order to not completely break with previous editions of SSIS and to maintain existing code bases, the way of working in SSIS 2005 till 2008R2 is preserved as the Package Deployment Model. When developing SSIS 2012 packages, you have a choice between the twp models. This article will introduce you to the concepts of both models, which can help you in the preparation for exam 70-463.
The Package Deployment Model
Let’s start with the oldest model. If you do not want to migrate existing packages—developed in earlier versions of SSIS—or if you like developing SSIS solutions like you did since SQL Server 2005, this model is the way to go. Designing the package itself—the control flow and the data flow—is pretty much the same in both models, so let’s focus on the important characteristics of the package deployment model which are not present in the project deployment model.
Creating a Project
When starting a new project in Visual Studio, there is no option to choose for the package deployment model.
Every new project is automatically created in the new project deployment model. If you want to work in the package deployment model, you can simply convert the project by right-clicking the solution and choosing Convert to Package Deployment Model.
You can do this at any point in time, even if you have already designed a few packages. However, the conversion is only possible if you didn’t use any of the new features of the project deployment model. These new features will be discussed when treating the project deployment model. While converting the project, SSIS will check for compatibility.
If the conversion succeeded, the solution will indicate that you’re working with the package deployment model.
When you finished designing the package, you deploy it to a server. However, most likely some objects need to change value, depending on the environment where the package is running. For example, you have a connection string pointing to a database where you extract data from. When you design the package, the connection string points to the test database, but when you run the package in production, you don’t want to extract data from the test database but from the production database. Since it is not feasible to manually edit every package after it has been deployed to a server, SSIS provides package configurations.
The concept is simple: You have a set of configurations that specify values for SSIS objects inside a package. You define different sets of values for each environment. When the package runs, it searches for the configurations and applies the values found to the objects, resulting in a dynamic execution of the package.
There are several options for configurations:
* SQL Server configuration—The configurations values are stored inside in one or more tables in SQL Server. The advantage is that sensitive data, such as passwords for connection strings, are protected by SQL Server security.
* XML file—The configurations are stored inside an XML file with the extension .dtsconfig. This type of configuration is particularly useful in an environment where there is no database engine present to store the configurations.
* Environment variable—This type of configuration is typically used to point SSIS to the location of other configurations. For example, the environment variable stores a connection string to a SQL Server database, where the configuration tables are located. When SSIS starts, it uses this connection string to find the other configurations. This is called indirect configuration. If you would configure a configuration with an explicit value, such as the file path to an XML file, it is a direct configuration. The downside of direct configurations is that this file path has to be exactly the same on all environments.
* Registry entry…The configuration is stored in the registry. This must be the least used option in practice.
* Parent package variable…This is a special type of configuration. When a child package is started by a parent package through “Execute Package Task,” the child package can retrieve the value of a variable of the parent package and apply it to one of its own objects.
If a package cannot find its configurations when it starts to run, it will apply the values configured during design time (see the figure above, where variable2 keeps its original value B). This can give unwanted effects. Suppose you run a package in production but the XML config file is missing. The package will return to its default values and start to update the test server.
A full discussion of package configurations is beyond the scope of this article. For a more detailed treatment of configurations and how to configure them, please refer to the following sources:
When a package is running on the server, you want to know what is going on. SSIS provides native logging capabilities that allow you to track the progress of a package and to troubleshoot if necessary. To enable logging, right-click on the control flow canvas and choose Logging…
In the logging dialog, you need to choose the type of logging. The following options are available:
* Text file
* SQL Server—This option has the advantage that the logging results are easily to query. SSIS logs to a table called sysssislog.
* SQL Server profiler—This logs to a trace file. This option is almost never used in reality.
* Windows event log
* XML files—Be aware that the structure of XML files comes with an overhead.
Text file, XML file and SQL Server profiler logging need a file connection. SQL Server needs an OLE DB connection, but the Windows event log doesn’t need any connection.
You can specify multiple log types in the same package. Once you add a log provider, you need to assign events that will be logged. You can specify events for any level of the control flow, indicated by the tree in the left pane. This allows you to do fine-grained logging with specific events for specific tasks.
OnError and OnWarning, because they tell you when something is wrong. OnInformation events can contain useful information, such as the number of rows written by a destination component in the data flow, but they are logged so often that there is some overhead in including this event. The TechNet article Integration Services (SSIS) Logging has more information about the integrated logging capabilities of SSIS.
If this native logging doesn’t suit your needs, you can construct your own custom logging framework. Using “Execute SQL Tasks” and system variables such as StartTime or PackageName, you can log the start and the end of the package. With the use of event handlers, you can capture errors and log them to your own auditing tables. In the data flow, you have the Rowcount transformation that enables you to capture the numbers of rows that pass through the pipeline. The possibilities are endless with a custom framework, but it costs significantly more development effort to incorporate them in the packages and to keep it consistent between different packages.
For more information about custom logging, please refer to the following resources:
Deployments are the reason why the models got their name. In the Package Deployment model, you can deploy single packages, while in the Project Deployment model, you can only deploy the entire project. There are a few different options when you deploy a package to a server:
* You can deploy the package to SQL Server. The packages are stored inside the MSDB database. The advantage of this method is that possible sensitive information is protected by SQL Server and that all of your packages are included in the backup of the MSDB database. (You do back up your databases, don’t you?)
* Another option is the file system. There are no constraints: You just put them somewhere in a folder, preferably in a place to which the account executing the packages has access.
* The last option is the SSIS package store. This is a special folder, by default C:\Program Files\Microsoft SQL Server\100\DTS\Packages, where you can store your packages. The only advantage over the regular file system is that packages stored in this location are monitored by the SSIS service and show up when you browse this service; more on that later.
The question is, how do you get your package to either of those locations? Again, there are a few options on how to deploy your packages.
* You can deploy packages manually. For file system deployments, this is as easy as copy pasting the packages to the new location. In SQL Server, you need to log into the service and import the package.
* Through Visual Studio, you can create a deployment manifest when you build the project. This manifest launches a deployment wizard.
* There is a command line utility, DTUTIL, which you can use to deploy a package. You can also use it to manage the encryption of a package, to sign a package, and to manage folders in the SSIS service. The biggest advantage of DTUTIL is that you can easily incorporate it into a batch script to automate deployments.
* BIDSHelper, a free add-on for Visual Studio on Codeplex, has the option to deploy packages with a simple mouse click.
The article SSIS Deployments gives an overview of the different deployment methods for the package deployment model.
Administration is fairly limited. When packages are stored in the file system, there is no administration at all: You are responsible for what happens to the packages and how they are stored.
The only administration possible is within the SSIS service. You can log into this service with SQL Server Management Studio.
Inside the service, you can either monitor running packages or you can go through packages stored in the SSIS package store or in the MSDB database. Packages can be organized into a folder structure.
Packages can be imported, as noted in the deployment section earlier, and they can also be exported, upgraded, or executed. There is a basic security mechanism through package roles.
Packages are executed through the DTEXEC executable. There are different methods for executing a package, either dealing with DTEXEC directly or indirectly:
* Manually executing a package: by double-clicking a package stored in the file system or by right-clicking a package in the SSIS service and choosing “Run Package”. This launches DTEXECUI—also called the execute package utility—which is a user interface on top of DTEXEC that helps you with the execution of the package. DTEXECUI also shows the log messages when the package is run. You can specify additional configuration files, alter connection managers, specify execution options, add extra logging providers, and so on.
* Using an SQL server agent job: There is a special job step type reserved for executing SSIS packages. The same options to control the execution of the SSIS package as with the execute package utility are present. The advantage of using an agent job is that you can schedule the package easily and that you can specify an alternative account—an SQL server agent proxy—to run the package.
* By directly calling DTEXEC: As with DTUTIL, DTEXEC can be called through the command line or a batch script. Since it is an ordinary executable, you can schedule DTEXEC with any scheduler possible, such as the Windows Scheduler.
Although the SSIS service monitors running packages, it is not necessary to have the service enabled in order to execute packages.
The first part of this article gave an overview of the package deployment model, which is how you designed, scheduled, and executed SSIS packages in the earlier versions of SSIS. These methods are still available, but as an alternative to the project deployment model, which will be discussed in Part 2 of this article.