3/01/2012

Exploring SSIS Architecture and Execution History Through Scripting

When you are using SSIS, there soon comes a time when you are confronted with having to do a tricky task such as searching for particular connection strings in all your SSIS packages, or checking the execution history of scheduled SSIS jobs.  You can do this type of work effectively in TSQL as Feodor explains.
My previous article on SSIS was focused on the architecture and the functioning of the product. I had also provided a few essential T-SQL scripts which provide certain ways for documenting the SSIS environment.
In this article I will focus more on the T-SQL scripting and the ways to reveal configuration, performance and architectural information through scripting.

Exploring SSIS’s Metadata Objects

SSIS Metadata Objects in sys.objects

Let’s start simple by exploring the metadata objects that are related to the SSIS installation.
If we look at the Integration Services metadata objects in SQL 2005 we will notice that the objects contain the ‘DTS’ phrase in their names. By executing the following script in SQL 2005 we will get all objects related to the SSIS metadata (notice that the script is executed in the MSDB context):
USE msdb ;
SELECT  *FROM    sys.objectsWHERE   name LIKE '%dts%'
Later on, in the SQL 2008 and later we have objects containing the phrase ‘DTS’ as well as ‘SSIS’ in the names. Execute the following script to view the objects (again, in the context of the MSDB database):
USE msdb ;
SELECT  *FROM    sys.objectsWHERE   name LIKE '%dts%'
        
OR name LIKE '%ssis%'
Why is this? In SQL Server 2005 you will find dbo.sysdtspackages and dbo. sysdtspackages90, which help SQL Server distinguish between Integration Services packages created in BIDS and legacy packages inherited and transferred from the old SQL Server 2000 DTS (Data Transformation Services).
In SQL Server 2008 and up we find dbo. sysdtspackages and dbo. sysssispackages, where the first table contains legacies, and the second – the BIDS packages with versions from 2005 and 2008.

SSIS Metadata Objects in other system tables

In SQL 2008 and up we have:
name - 2008
sysdtscategories One row for each category description
sysdtspackagelog Legacy
sysdtspackages Legacy
sysdtssteplog Legacy
sysdtstasklog Legacy
sysssislog One row per entry generated by SSIS package at runtime (when the SQL Server log provider is used)
sysssispackagefolders One row for each folder in the SSIS structure
sysssispackages One row for each SSIS package
… and in SQL 2005 there is …
name - 2005
sysdtscategories One row for each category description
sysdtslog90 One row per entry generated by SSIS package at runtime
sysdtspackagefolders90 One row for each folder in the SSIS structure
sysdtspackagelog Legacy
sysdtspackages Legacy
sysdtspackages90 One row for each SSIS package
sysdtssteplog Legacy
sysdtstasklog Legacy

Structure and contents of the SSIS packages

As we know, the SSIS packages are just structured XML files that contain all information needed for the package to carry out its tasks. In other words, the SSIS package itself contains the objects in the flows, the precedence, the connections and their configurations.
SSIS Packages may be saved on the file system, or in MSDB repository. In the case of the package being saved in MSDB, the package definition is saved in the packagedata column of the dbo.sysssispackages table (or in dbo.sysdtspackages90 in SQL Server 2005).
The column itself is of the image datatype, hence in order for us to retrieve the contents, we need to cast it as a VARBINARY(MAX) first, and then as a XML data type.
Depending on the security level of the package, however, it might not be very easy to explore the contents of the package definitions in MSDB; in case the package is encrypted, the package definition will begin with the EncryptedData tag.

Retrieving the definitions of the SSIS Packages

So, here is how to retrieve the definitions of the SSIS packages in MSDB:

In 2005:

SELECT     p.[name] AS [PackageName]
          
,[description] AS [PackageDescription]
          
,CASE [packagetype]
              
WHEN 0 THEN 'Undefined'
              
WHEN 1 THEN 'SQL Server Import and Export Wizard'
              
WHEN 2 THEN 'DTS Designer in SQL Server 2000'
              
WHEN 3 THEN 'SQL Server Replication'
              
WHEN 5 THEN 'SSIS Designer'
              
WHEN 6 THEN 'Maintenance Plan Designer or Wizard'
          
END     AS [PackageType]
          
,CASE [packageformat]
              
WHEN 0 THEN 'SSIS 2005 version'
              
WHEN 1 THEN 'SSIS 2008 version'
          
END AS [PackageFormat]
          
,p.[createdate]
          
,CAST(CAST(packagedata AS VARBINARY(MAX)) AS XML) PackageXML
          
FROM      [msdb].[dbo].[sysdtspackages90]     p

In 2008 and up:

SELECT     p.[name] AS [PackageName]
          
,[description] AS [PackageDescription]
          
,CASE [packagetype]
              
WHEN 0 THEN 'Undefined'
              
WHEN 1 THEN 'SQL Server Import and Export Wizard'
              
WHEN 2 THEN 'DTS Designer in SQL Server 2000'
              
WHEN 3 THEN 'SQL Server Replication'
              
WHEN 5 THEN 'SSIS Designer'
              
WHEN 6 THEN 'Maintenance Plan Designer or Wizard'
          
END     AS [PackageType]
          
,CASE [packageformat]
              
WHEN 0 THEN 'SSIS 2005 version'
              
WHEN 1 THEN 'SSIS 2008 version'
          
END AS [PackageFormat]
          
,p.[createdate]
          
,CAST(CAST(packagedata AS VARBINARY(MAX)) AS XML) PackageXML
          
FROM      [msdb].[dbo].[sysssispackages]     p
Now that we have the definition, what can we do with it? We can parse it and extract some useful data.

Extracting connection strings from an SSIS definition

Here is how to retrieve the data connection strings:

In SQL 2005:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS) -- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="DelayValidation"][1]',
    
'varchar(100)') AS DelayValidation,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)') AS ObjectName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="Description"][1]',
    
'varchar(100)') AS Description,
  
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="Retain"][1]'
, 'varchar(MAX)') Retain
  
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="ConnectionString"][1]'
, 'varchar(MAX)') ConnectionStringFROM  --( SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XML) PackageXML
          
FROM      [msdb].[dbo].[sysdtspackages90]
        
) PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:ConnectionManager') SSIS_XML ( SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysdtspackages90] c ON PackageXML.id = c.id

In SQL 2008 and up:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS) -- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="DelayValidation"][1]',
    
'varchar(100)') AS DelayValidation,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)') AS ObjectName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="Description"][1]',
    
'varchar(100)') AS Description,
  
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="Retain"][1]'
, 'varchar(MAX)') Retain
  
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="ConnectionString"][1]'
, 'varchar(MAX)') ConnectionStringFROM  --( SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XML) PackageXML
          
FROM      [msdb].[dbo].[sysssispackages]
        
) PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:ConnectionManager') SSIS_XML ( SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysssispackages] c ON PackageXML.id = c.id

Extracting connection strings from an SSIS definition

Here is how to retrieve the package configurations:

In SQL 2005:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS) -- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)') AS ConfigurationType,
  
CASE CAST(SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)') AS INT)
    
WHEN 0 THEN 'Parent Package'
    
WHEN 1 THEN 'XML File'
    
WHEN 2 THEN 'Environmental Variable'
    
WHEN 3 THEN 'Registry Entry'
    
WHEN 4 THEN 'Parent Package via Environmental Variable'
    
WHEN 5 THEN 'XML File via Environmental Variable'
    
WHEN 6 THEN 'Registry Entry via Environmental Variable'
    
WHEN 7 THEN 'SQL Server'
  
END AS ConfigurationTypeDesc,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationVariable"][1]',
    
'varchar(100)') AS ConfigurationVariable,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)') AS ConfigurationName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationString"][1]',
    
'varchar(100)') AS ConfigurationStringFROM  ( SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XML) PackageXML
          
FROM      [msdb].[dbo].[sysdtspackages90]
        
) PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:Configuration') SSIS_XML ( SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysdtspackages90] c ON PackageXML.id = c.id

In SQL 2008 and up:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS) -- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)') AS ConfigurationType,
  
CASE CAST(SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)') AS INT)
    
WHEN 0 THEN 'Parent Package'
    
WHEN 1 THEN 'XML File'
    
WHEN 2 THEN 'Environmental Variable'
    
WHEN 3 THEN 'Registry Entry'
    
WHEN 4 THEN 'Parent Package via Environmental Variable'
    
WHEN 5 THEN 'XML File via Environmental Variable'
    
WHEN 6 THEN 'Registry Entry via Environmental Variable'
    
WHEN 7 THEN 'SQL Server'
  
END AS ConfigurationTypeDesc,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationVariable"][1]',
    
'varchar(100)') AS ConfigurationVariable,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)') AS ConfigurationName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationString"][1]',
    
'varchar(100)') AS ConfigurationStringFROM  ( SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XML) PackageXML
          
FROM      [msdb].[dbo].[sysssispackages]
        
) PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:Configuration') SSIS_XML ( SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysssispackages] c ON PackageXML.id = c.id
There are many other aspects to be explored in the definitions of the SSIS packages, and it is all matter of finding the node names and parsing them.
In the remaining part of this article, I would like to shift our attention to these areas: the interaction between SQL Agent and the SSIS packages and some scripts to gather performance statistics.

Overriding the package internal configurations

The SSIS packages can be executed in several ways: as a scheduled job from the SQL Server Agent or from the command line (or even from a batch file).
Regardless of which method is used for the execution, it is always the DTExec.exe who carries the task.
Before executing the SSIS package, the SQL Server Agent or the command line script have to form an execution string and pass parameters to the DTExec, and thus control the execution of the package.
Here is a script which shows all SQL Agent jobs steps which execute SSIS packages and the custom configurations provided through the SQL Agent job:
USE [msdb]
GO
SELECT j.job_id,
      
s.srvname,
      
j.name,
      
js.subsystem,
      
js.step_id,
      
js.command,
      
j.enabled,
      
js.output_file_name,
      
js.last_run_outcome,
      
js.last_run_duration,
      
js.last_run_retries,
      
js.last_run_date,
      
js.last_run_time,
        
js.proxy_id FROM   dbo.sysjobs jJOIN   dbo.sysjobsteps js
  
ON  js.job_id = j.job_id JOIN   MASTER.dbo.sysservers s
  
ON  s.srvid = j.originating_server_id--filter only the job steps which are executing SSIS packages WHERE  subsystem = 'SSIS'--use the line below to enter some search criteria
--AND js.command LIKE N'%ENTER_SEARCH%'
GO
As you noticed, you can even use the script above to filter and search through the configurations of the SQL Agent Jobs. For example, you can search for all jobs which are executing encrypted SSIS packages by using …
AND    js.command LIKE N'%/DECRYPT%'
…as a search criteria in the above script. You may also want to search for a server name, for example.

Exploring execution history

Finally, let’s look into some execution history of the SSIS packages which are scheduled as SQL Server Agent jobs.
The following script will return all SQL Server Agent Jobs, which are currently (as of the moment of the execution of the script) executing SSIS packages and also the last execution time and duration, as well as the execution command.
SET NOCOUNT ON-- Check if the SQL Server Agent is runningIF EXISTS ( SELECT  1
            
FROM    MASTER.dbo.sysprocesses
            
WHERE   program_name = N'SQLAgent - Generic Refresher' )
    
BEGIN
        SELECT  
@@SERVERNAME AS 'InstanceName' ,
                
1 AS 'SQLServerAgentRunning'
    
END
ELSE
    BEGIN
        SELECT  
@@SERVERNAME AS 'InstanceName' ,
                
0 AS 'SQLServerAgentRunning'              
        
RAISERROR('The SQL Server Agent is not running.', 16, 1) WITH SETERROR ;              
    
END-- Execute the scriptIF EXISTS ( SELECT  *
            
FROM    tempdb.dbo.sysobjects
            
WHERE   id = OBJECT_ID(N'[tempdb].[dbo].[Temp1]') )
    
DROP TABLE [tempdb].[dbo].[Temp1]
GO
CREATE TABLE [tempdb].[dbo].[Temp1]
    
(
      
job_id UNIQUEIDENTIFIER NOT NULL ,
      
last_run_date NVARCHAR(20) NOT NULL ,
      
last_run_time NVARCHAR(20) NOT NULL ,
      
next_run_date NVARCHAR(20) NOT NULL ,
      
next_run_time NVARCHAR(20) NOT NULL ,
      
next_run_schedule_id INT NOT NULL ,
      
requested_to_run INT NOT NULL ,
      
request_source INT NOT NULL ,
      
request_source_id SYSNAME COLLATE database_default
                                
NULL ,
      
running INT NOT NULL ,
      
current_step INT NOT NULL ,
      
current_retry_attempt INT NOT NULL ,
      
job_state INT NOT NULL
    )
DECLARE @job_owner SYSNAME
DECLARE
@is_sysadmin INT
SET
@is_sysadmin = ISNULL(IS_SRVROLEMEMBER('sysadmin'), 0)SET @job_owner = SUSER_SNAME()INSERT  INTO [tempdb].[dbo].[Temp1]
        
EXECUTE MASTER.dbo.xp_sqlagent_enum_jobs @is_sysadmin, @job_owner
        
UPDATE  [tempdb].[dbo].[Temp1]SET     last_run_time = RIGHT('000000' + last_run_time, 6) ,
        
next_run_time = RIGHT('000000' + next_run_time, 6) ;-----SELECT  j.name AS JobName ,
        
j.enabled AS Enabled ,
        
CASE x.running
          
WHEN 1 THEN 'Running'
          
ELSE CASE h.run_status
                
WHEN 2 THEN 'Inactive'
                
WHEN 4 THEN 'Inactive'
                
ELSE 'Completed'
              
END
        END AS
CurrentStatus ,
        
COALESCE(x.current_step, 0) AS CurrentStepNbr ,
        
CASE x.running
          
WHEN 1 THEN js.step_name
          
ELSE NULL
        
END AS CurrentStepName ,
        
CASE WHEN x.last_run_date > 0
            
THEN CONVERT (DATETIME, SUBSTRING(x.last_run_date, 1, 4) + '-'
                  
+ SUBSTRING(x.last_run_date, 5, 2) + '-'
                  
+ SUBSTRING(x.last_run_date, 7, 2) + ' '
                  
+ SUBSTRING(x.last_run_time, 1, 2) + ':'
                  
+ SUBSTRING(x.last_run_time, 3, 2) + ':'
                  
+ SUBSTRING(x.last_run_time, 5, 2) + '.000', 121)
            
ELSE NULL
        
END AS LastRunTime ,
        
CASE h.run_status
          
WHEN 0 THEN 'Fail'
          
WHEN 1 THEN 'Success'
          
WHEN 2 THEN 'Retry'
          
WHEN 3 THEN 'Cancel'
          
WHEN 4 THEN 'In progress'
        
END AS LastRunOutcome ,
        
CASE WHEN h.run_duration > 0
            
THEN ( h.run_duration / 1000000 ) * ( 3600 * 24 )
                  + (
h.run_duration / 10000 % 100 ) * 3600 + ( h.run_duration
                                                              
/ 100 % 100 )
                  *
60 + ( h.run_duration % 100 )
            
ELSE NULL
        
END AS LastRunDuration ,
        
js.command AS SSISPackageExecutionCommandFROM    [tempdb].[dbo].[Temp1] x
        
LEFT JOIN msdb.dbo.sysjobs j ON x.job_id = j.job_id
        
JOIN msdb.dbo.sysjobsteps js ON js.job_id = j.job_id
        
LEFT OUTER JOIN msdb.dbo.syscategories c ON j.category_id = c.category_id
        
LEFT OUTER JOIN msdb.dbo.sysjobhistory h ON x.job_id = h.job_id
                                                    
AND x.last_run_date = h.run_date
                                                    
AND x.last_run_time = h.run_time
                                                    
AND h.step_id = 0WHERE   x.running = 1
        
AND js.subsystem = 'SSIS'
DROP TABLE [tempdb].[dbo].[Temp1]
In conclusion, SSIS is a vast product which provides significant amount of metadata available to the SQL Server administrator. In this article I have shown the way to explore the SSIS metadata through some scripts and hopefully they will make the daily administration of your SSIS environments much easier.

No comments:

Post a Comment