What are SSIS Packages?

SSIS packages are collections of work developed using SSIS. They are used to supervise and handle data workflows, which include tasks such as extracting, transforming, and loading data. Each SSIS package is a bundle of different components that enable users to design and execute data integration processes. These components include data flow tasks, control flow tasks, and connections. SSIS packages are a high-performance tool for automating data movement, transforming data according to business requirements, and managing complex data processing tasks.

Want to know more about SSIS? Check out the complete overview in the What is SSIS? article.

Importance and relevance of SSIS in data integration

SSIS enables us to build one or multiple solutions to our requirements that are highly stable, easily maintainable over time, and highly flexible to adapt to unexpected changes without a high redesign cost.

Additionally, the ability to perform all of this through a graphical interface, without writing any code, expands the opportunities for professionals who can be involved in developing solutions using this tool.

Key purposes and applications of SSIS packages

SSIS is primarily used to automate moving, modifying, and loading data from diverse sources to various destinations.

All this can be done through a single unified tool, giving the business high efficiency in the development and maintenance process of the solutions.

Examples include moving data from one repository to another, correcting, modifying, or adapting information to specific requirements, filtering data according to needs, validating incoming and outgoing information to other systems, and fully automating moving and transforming data in a reliable manner.

It can easily modify its components and make it cost-effective to maintain an application if data sources or data models change in the future.

Why use SSIS packages?

An ETL process, by definition, involves extracting data from heterogeneous sources, transforming it to adapt to our storage requirements, and finally loading it into our data repositories, where it has been validated and processed.

We will now proceed to detail the standard phases of an ETL process:

  • EXTRACT: In this stage, we aim to read data from various sources, regardless of their format, according to a defined data model. These sources include plain text files, Excel files, tables in a specific database, and countless other possibilities.
  • TRANSFORM: After extracting the information, it is often necessary to adapt it to the target object. This step is where we match the source object with the target data model. All necessary data conversions occur to ensure the process runs smoothly during the insertion step. For example, if a text is found in a field where a numeric value is expected, a conversion must be made to prevent errors. It is also possible to adapt data models. For example, source data might combine information about a purchased ticket and the customer. In this case, we can split them into two objects - one for the ticket and one for the customer - each with its destination.

    Moreover, we can calculate and add data not present in the source before sending it to the destination. We could calculate the total amount of a ticket, a tax rate, or a discount and include this information in the final output.
  • LOAD: After completing the previous operations, the data can be loaded into database tables, multiple files, or any required destination. SSIS is designed to handle various destinations, processing large volumes of data with high speed, reliability, and flexibility.

Benefits of using SSIS packages

You experience the following benefits when you use SSIS packages:

  • Microsoft's suite allows for the entire process to be integrated into a single, unified environment.
  • High capacity to test and implement changes with great speed and low probability of errors.
  • User-friendly environment that allows professionals without a high degree of specificity to work on solutions.
  • Packages can be generated perfectly and easily tracked through a standard version control system.
  • It has a wide variety of preconfigured internal tools to perform most of the typical activities of an ETL.
  • It has connectors and ways to connect to a wide variety of data sources and destinations, increasing interoperability with a large number of systems. It also has the possibility of not having access to a particular data source, developing it, and adding it to the packages without major inconvenience in a language that can be C#.
  • It has a solid execution and error control system that allows for easy and reliable operation.
  • It has high performance for loading large volumes of data.

Components of a SSIS package

A SSIS package is a logically arranged set of connections, variables, parameters, event handlers, control and data flow elements, and configurations that you can develop programmatically or with the help of the graphical design tools provided by SSIS.

The diagram illustrates a simple package containing a Control Flow with a Data Flow task, which, in turn, includes a data flow:

Components of a SSIS package

Control Flow

This section allows us to manage the overall interaction between the different components of the solution. It provides the most general view of the projects.

Control Flow

One of the main components of a Control Flow is the Container. Containers are grouping elements that help organize and structure the solution. They provide a way to manage and control the tasks within the package. There are two types of containers:

  • Sequential containers that allow you to program a series of tasks or steps to be performed one after the other in a linear manner, with a single start and end.
  • Repeating containers that allow you to execute a series of programmed steps but repeat them as many times as necessary.

The other basic building block is called Tasks, which represent the individual actions that make up our processes. Tasks define the specific operation each step in the workflow will perform. There are many types of tasks that offer various options for ETL processes to manipulate data. For example, tasks can handle FTP transfers, read from a flat file, or write data to database tables.

Data Flow

SSIS provides various data flow components, including:

  • Sources: These define where the data will be read from.
  • Transformations: These components allow us to manipulate the data to meet the specific requirements of the processing or target data model.
  • Destinations: These objects receive the data after it has been extracted and processed.

A noteworthy case of third-party tools is the Devart SSIS Data Flow Components that enhance SSIS and provide interoperability with cloud connectors in an easy and reliable way.

Data Flow

Event handlers: managing and handling events

These provide a powerful and flexible way for users to execute custom logic in response to specific events based on their needs. Tasks, containers, or the package itself can trigger events. They are often used for error handling, event logging, and performing additional tasks before or after data processing. They can also be a viable option to direct the execution of a package through events.

Parameters and variables: dynamic data handling

Using parameters and variables together allows for flexible and dynamic data handling in SSIS. Parameters enable dynamic package configuration, while variables provide a mechanism to manage and manipulate data throughout the package's execution. This combination ensures that SSIS packages can adapt to changing data requirements and execution conditions and facilitate robust and scalable data integration processes.

Connection managers: configuring data connections

In SSIS, Connection Managers help configure and manage connections to various data sources and destinations. They enable SSIS packages to interact with databases, files, and other external systems by defining the details required to establish a connection.

How to create a SSIS package

To build packages, you need to have Visual Studio Installed. With the prerequisite mentioned in the previous step, a set of tools known as SSDT (SQL Server Data Tools) will be required. To install it, you must enter Visual Studio installer, select the Modify option, and then select SQL Server Data Tools under Data Storage and Processing from the available options. For more information, refer to our instructions on how to install SSIS.

Step-by-step process of creating a new SSIS package

To begin, launch Visual Studio on your machine. On the Configure your new project page that opens, select the Integration Services Project as a solution name, specify the project name and its location, and click Create to create the SSIS project.

Configure your new project

The home screen will appear, providing you with the primary work environment.

Work environment

Design the package

Let's create a simple project to get a general overview of the process.

Control Flow

Navigate to the Control Flow tab of the new package. Drag tasks, such as Data Flow Task or Execute SQL Task, from the SSIS Toolbox onto the design surface. Configure each task by double-clicking it and adjusting the properties as required.

Control Flow

Data Flow

Add a Data Flow Task to the Control Flow tab and double-click Data Flow Task to switch to the Data Flow tab. From the SSIS Toolbox, drag data flow components and configure each component by setting its properties and defining the necessary connections.

Control Flow

Watch our step-by-step video tutorial to master SSIS packages in no time!

Executing a SSIS package

Let us explore the ways to execute an SSIS package:

Executing from Visual Studio: Open the required package and press F5 or click Start. This method can be used to debug or troubleshoot package execution.

Control Flow

Executing from SQL Server Management Studio: After installing the package, in Object Explorer, right-click the package and select Execute. This method is commonly used to automate package execution using SQL Server Agent.

Best practices for execution and monitoring

  • In a production environment, it is recommended to use a schedule to execute packages in a programmed and automated manner. This allows for optimal resource orchestration, such as defining specific times for process execution, preventing simultaneous execution of multiple processes, etc. Additionally, an execution log for each package should be maintained to efficiently track errors or warnings that occur during execution. Notifications, such as email alerts, can also be sent for situations requiring operator attention.
  • During development, Visual Studio can be used for execution, debugging, and error detection. Logging and alerts are less critical in this environment as the user executes processes on-demand and in real-time.

Handling errors and troubleshooting execution issues

It is crucial that the information provided by the processes is neither excessive nor insufficient to determine production errors. To avoid time-consuming debugging during critical moments, it's important to carefully design how packages communicate with the user.

Logging and auditing SSIS package execution

Logs can be written to flat files, databases, or the Windows Event Viewer. The choice of destination(s) will largely depend on the application type. For instance, if an FTP transfer fails, an error might be logged to a folder related to the transferred files. If an error occurs within a process interacting with a database, it would be more appropriate to log it within the database itself. There are no strict rules governing this decision; it should be tailored to the specific needs of the solution.

Try to create a SSIS package with the Devart Data Flow Components

Devart SSIS Data Flow Components offer enhanced connectivity solutions that integrate with different data sources that Microsoft SSIS does not natively support. These components are designed to extend the functionality of SSIS, allowing for effective data manipulation and integration with systems, such as Salesforce, Oracle, MySQL, PostgreSQL, and many others.


Conclusion

While this text has provided a general overview and simple examples, SSIS is a powerful tool with immense capabilities. It's a comprehensive suite that can manage virtually any company's entire data flow. Its ability to read and write data from diverse sources gives it exceptional flexibility.

SSIS can be extended using Visual Studio to make it highly customizable. This means that you can tailor it to fit unique business requirements, even those that have yet to be conceived.

Despite the rise of cloud-based alternatives, SSIS remains a robust, reliable, and scalable choice for on-premise data integration and transformation. It's particularly well-suited for organizations that rely heavily on Microsoft technologies.

SSIS boasts a large and active community, along with comprehensive online documentation that provide ample opportunities for learning, troubleshooting, and collaboration.

Having weighed its pros and cons, it's clear that SSIS offers a compelling solution for businesses seeking to streamline their data processes. Its scalability, flexibility, and extensive capabilities make it a strong choice for organizations of all sizes.

SSIS Data Flow Components

High performance SSIS components for popular databases and cloud applications