This is the first blog in a two-part series focusing on performance tuning of SQL Server Integration Services (SSIS).
Developers are usually unaware of internal components of the SSIS run-time engine, thus do not include them while creating the packages and start to face performance impact the moment data volume starts increasing.
It is, therefore, imperative for developers to not only be aware of these internal components but also know how to modify them, and how the changes can impact the performance of SSIS packages during runtime.
The purpose of this blog is to make you all familiar with “Integration Services: Performance Tuning”, which we usually overlook while creating the SSIS packages. We will also discuss SSIS buffers and how to deal with it when it comes to enhancing a package’s performance. In addition, understanding Execution Trees will help us understand how SQL Server Integration Services manage data flows and how data is distributed among data pipelines.
SQL Server Integration Services Buffer
SSIS buffer is an internal memory used by SSIS to hold data during the execution of the package. The metadata is invariable and cannot be changed once the design is complete. This means we cannot change the fields, data types, etc. at SSIS package execution time. SSIS transformations manipulate and transform the data inside the buffers. All the operations occurring in control flow and data flow level use SSIS buffer, making it essential for developers to optimize the use of SSIS buffer.
Here are some data flow level settings that help in enhancing the performance of SSIS Packages.
Data Type Conversion in Query Level
Developers should avoid using data type conversion, instead try to cast or convert the data fields at the query level for maximum result.
Data Access Mode
Developers should also avoid using the Table or View option for data access when reading a few columns from the data source. Using SQL command and selecting only required columns makes much more sense.
Data Access Mode Destination Properties
Another necessary step is to use the fast load option when inserting data into destinations as SSIS internally uses a Bulk Insert statement instead of a simple INSERT statement.
Upon checking the table lock option, SQL acquires a lock for the destination table, which ensures faster data load in the table. All other operations, requested by any user during the locking period will be queued and would operate only after all the data is loaded.
Checking this option specifies that the data flow pipeline engine will validate the source data against the constraints of the target table. The option is unchecked, by default, which improves the package performance at data load time.
Rows per Batch
The default value of this property is -1, it means all the rows coming from the source are treated as a single batch.
Maximum insert commit size
The default value for this setting is ‘2147483647’ (largest value for a 4-byte integer type). Once the package is complete, all the rows are successfully committed.
Specifying the “Maximum insert commit size” means that the SSIS engine will ignore the “Rows per Batch” value.
Not specifying the “Maximum insert commit size” means that the SSIS engine will use the value of the “Rows per Batch” field and commit the number of rows specified for this property into the destination.
The value of MAXCONCURRENTEXECUTABLES and engine threads should be what it is by default.
This is a package level property and its default value is -1, which indicates the maximum number of parallel tasks that can be executed is equal to the total number of processors + 2.
Parallel Tasks = Total Number of Processors + 2
Engine Threads is a property of the data flow task with a default value of 10. This value indicates the number of threads that will be created to process the data flow tasks.
Avoid Blocking Transformations
Several block transformations exist in SSIS, such as Sort, Aggregate, Fuzzy Group, Merge, etc. Developers should try to avoid these transformations in the SSIS packages, and rather perform sorting and aggregations at the query level.
SQL Server Integration Services is highly popular among developers but there are some elements and properties that can help improve the overall performance of the tool but are often overlooked. The purpose of this two-part series is to highlight those elements and properties and help you tune the performance of SSIS.
This is it from our end for the 1st part. Sit back for the next part for more information on improving the performance of SSIS.
Until next time!
- SQL Server 2017 on Ubuntu Platform
Unlike previous releases of SQL Server, this time Microsoft has also given us the choice of picking the operating system on which we wish to install the application. This means…
- Different Application of Joins in SQL
Structurally organized with large quantities of information, databases are the backbone of every organization. That is why companies invest in database applications such as MySQL, PostgreSQL, Microsoft SQL Server and…
- Microsoft SQL Server Execution Plan
In a database environment, certain questions related to performance pop up repeatedly: why is my query running slowly? And why is SQL Server not using the index? etc. As a…
- Working with SQL Server Database on Microsoft Azure (Part 2)
The huge volume of data generated around us, today, requires optimum storage and management to have any benefit for organizations. With a host of advantages that MS Azure possesses, especially…
- Performance Tuning of SAP Business Objects Reports
SAP Business Objects (BO) is a reporting and analytics tool used to visualize data in different forms like charts, graphs, tabular reports, etc. The tool makes reporting and analysis simple…
- Amazon Athena: An Overview
It is difficult to fathom the progress of the computing world. Only a couple of decades ago, 1.44MB of memory space offered by floppy disks seemed a lot of storage…