Saturday, November 5, 2011

SSIS: "And If You've Made it Wrong..."


When you start working with SSIS, there's one thing nobody mentions: you have to learn a new syntax for expressions. It's not T-SQL, it's not VB.NET, it's not JavaScript, it's not like anything you've used before. It's "SSIS Expression Syntax", and you can learn all about it at Integration Services Expression Reference. (And yes, I'm pretty sure "it won't keep you comin' back for more"!)

I bring this up because until you understand this, you're going to think SSIS was written in FORTRAN, because anytime you have to enter an expression, nothing works. Ever. For example, let's write an expression in T-SQL to strip out all occurrences of the form feed character (0x0C, decimal 12, "FF") from a string. Simple, right?

-- T-SQL code.  Don't try this in SSIS. 
DECLARE @FF CHAR
SET @FF = 0x0C
DECLARE @Msg VARCHAR(50)

SET @Msg = 'Hello ' + @FF + 'SSIS'
PRINT @Msg

SET @Msg = REPLACE(@Msg, 0x0C, '')
PRINT @Msg

The output, depending on your font, collation, codepage, which version of Windows you're using, and hat size, will look something like this:

Hello §SSIS
Hello SSIS

So, we've proven we know how to use the REPLACE function in T-SQL. Admirable, but it doesn't help us in SSIS. That is, this doesn't work:

REPLACE(SampleBeg, 0x0C, '')

Try that expression there, and you'll get the usual unhelpful SSIS error message stack. For some reason, SSIS likes double-quotes, not single-quotes, and of course has its own unique way of expressing character constants in hex:

REPLACE(SampleBeg, "\x000C", "")

I'm sure there are excellent reasons why SSIS is such a Frankenstein's monster. When you're building something that might not be such a good idea to begin with, you're going to find some funky design choices forced on you, and my guess is that's what happened with SSIS. All I know is, the way it's been designed, it will be a long time before I can get it to do what I want.

Friday, November 4, 2011

How Much Data Is In That Column?

This is a special-purpose script, but it still may be useful to someone, someday.


Imagine you have many tables with similar names and the same column names. This might happen if the tables are created automatically by a Job on a daily basis, for example. For each column you specify, this script displays the declared width, an example value, and the average and maximum number of bytes stored, and it does this across all tables matching a LIKE pattern that you also specify.

This script might be useful if you wanted to see the average "fullness" of each column, or how big the widest value is for a given column. This might be good to know if you're planning to copy data from this column to another table, for example.


-------------------------------------------------------------------------------
/* Sample setup code:                                                          
                                                                               
 USE AdventureWorks                                                            
                                                                               
 IF OBJECT_ID('dbo.Invoice_2011_10') IS NOT NULL DROP TABLE dbo.Invoice_2011_10
 IF OBJECT_ID('dbo.Invoice_2011_11') IS NOT NULL DROP TABLE dbo.Invoice_2011_11
 IF OBJECT_ID('dbo.Invoice_2011_12') IS NOT NULL DROP TABLE dbo.Invoice_2011_12
                                                                               
 CREATE TABLE dbo.Invoice_2011_10 (AccountNumber INT, Comment NVARCHAR(50))    
 CREATE TABLE dbo.Invoice_2011_11 (AccountNumber INT, Comment NVARCHAR(50))    
 CREATE TABLE dbo.Invoice_2011_12 (AccountNumber INT, Comment NVARCHAR(50))    
                                                                               
 INSERT dbo.Invoice_2011_10 (AccountNumber, Comment) VALUES (  2, 'ABC')       
 INSERT dbo.Invoice_2011_11 (AccountNumber, Comment) VALUES (234, 'ABCDEF')    
 INSERT dbo.Invoice_2011_12 (AccountNumber, Comment) VALUES (234, 'ABCDEFGHI') 
                                                                               
 -- Now set the values as follows, and run this script.                        
 --     @SchemaName  is 'dbo'                                                  
 --     @TableFormat is 'Invoice_2011%'                                        
 --     @tColumns    should contain 'AccountNumber' and 'Comment' *only*       
                                                                               
                                                                             */
-------------------------------------------------------------------------------

SET NOCOUNT ON


-------------------------------------------------------------------------------
-- These are the variables you need to set.  There are no changes necessary    
-- after this section.                                                         
-------------------------------------------------------------------------------

-- The schema for the tables. 
DECLARE @SchemaName sysname = 'dbo'

-- The LIKE format for all the tables you want to examine. 
DECLARE @TableFormat sysname = 'Invoice_2011%'

-- The set of column names in the set of tables you want to report on. 
DECLARE @tColumns TABLE (ColumnName sysname)

INSERT @tColumns
       (ColumnName)
VALUES
       ('AccountNumber')
     , ('Comment')
        

-------------------------------------------------------------------------------
-- Variables (you don't need to change these).                                 
-------------------------------------------------------------------------------

DECLARE @TableName    sysname       = ''
DECLARE @ColumnName   sysname       = ''

DECLARE @Sql          NVARCHAR(MAX) = ' '                                 + CHAR(10)
                                    + 'SET NOCOUNT ON'                    + CHAR(10)
                                    + 'SET STATISTICS IO OFF'             + CHAR(10)
                                    + ' '                                 + CHAR(10)
                                    + 'DECLARE @tTotals TABLE '           + CHAR(10)
                                    + '    (ColumnName   sysname,'        + CHAR(10)
                                    + '     Example      NVARCHAR(MAX), ' + CHAR(10)
                                    + '     AverageLen   INT, '           + CHAR(10)
                                    + '     MaximumLen   INT, '           + CHAR(10)
                                    + '     DeclaredSize INT) '           + CHAR(10)

-- The *COLUMNNAME* is just a placeholder; each column will call REPLACE with 
-- its name from the set you specified above.  Done this way to keep all the 
-- variables you need to change in one place, and at the top of the file. 
DECLARE @ColumnsSQL   NVARCHAR(MAX) = 'QUOTENAME(''*COLUMNNAME*''), '            + CHAR(10)
                                    + '       MIN([*COLUMNNAME*]), '             + CHAR(10)
                                    + '       AVG(DATALENGTH([*COLUMNNAME*])), ' + CHAR(10)
                                    + '       MAX(DATALENGTH([*COLUMNNAME*])), ' + CHAR(10)

DECLARE @ColLengthSQL NVARCHAR(MAX) = ''


-------------------------------------------------------------------------------
-- Process each table that matches the LIKE expression.                        
-------------------------------------------------------------------------------

RAISERROR('All column sizes are in bytes, not characters.', 10, 1) WITH NOWAIT

DECLARE curTables CURSOR FOR
 SELECT name
   FROM sys.tables
  WHERE name LIKE @TableFormat ESCAPE '$'
    AND type_desc = 'USER_TABLE'
       AND SCHEMA_ID(@SchemaName) = schema_id
  ORDER BY name

OPEN curTables
FETCH NEXT FROM curTables INTO @TableName

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @Sql += ' '                                                                              + CHAR(10)
             +  'RAISERROR('' '', 10, 1) WITH NOWAIT'                                            + CHAR(10)
             +  'RAISERROR(''Table ' + @SchemaName + '.' + @TableName + ''', 10, 1) WITH NOWAIT' + CHAR(10)

    SET @ColLengthSQL = '       MIN(COL_LENGTH(''' 
                      + @SchemaName + '.' + @TableName 
                      + ''', ''*COLUMNNAME*'')) ' + CHAR(10)
    
    -- Process each column in the set. 
    DECLARE curColumns CURSOR FOR
        SELECT ColumnName
          FROM @tColumns
   
    OPEN curColumns
    FETCH NEXT FROM curColumns INTO @ColumnName
              
    WHILE @@FETCH_STATUS = 0
    BEGIN
    
        -- Some types don't support MIN, etc. 
        IF EXISTS (SELECT *
                     FROM sys.columns   c
                     JOIN sys.types     t
                       ON c.system_type_id = t.system_type_id
                    WHERE t.name IN ('uniqueidentifier', 'bit', 'xml')
                      AND c.object_id = OBJECT_ID(@SchemaName + '.' + @TableName)
                      AND c.name = @ColumnName)
        BEGIN
            RAISERROR('Skipping %s.%s.%s because type is unsupported', 10, 1, @SchemaName, @TableName, @ColumnName) WITH NOWAIT
            FETCH NEXT FROM curColumns INTO @ColumnName
            CONTINUE
        END

        -- Build the INSERT statement. 
        SET @Sql += ' '                               + CHAR(10)
                 +  'INSERT @tTotals (ColumnName, '   + CHAR(10)
                 +  '                 Example, '      + CHAR(10)
                 +  '                 AverageLen, '   + CHAR(10)
                 +  '                 MaximumLen, '   + CHAR(10)
                 +  '                 DeclaredSize) ' + CHAR(10)
                 +  'SELECT '
                 
        SET @Sql += REPLACE(@ColumnsSQL + @ColLengthSQL, '*COLUMNNAME*', @ColumnName)
                 +  '  FROM ' + @SchemaName + '.' + @TableName + CHAR(10)
                 +  ' WHERE ' + QUOTENAME(@ColumnName) + ' IS NOT NULL '  + CHAR(10)
                 
        SET @Sql += ' '                                                                + CHAR(10)
                 +  'RAISERROR(''    Column ' + @ColumnName + ''', 10, 1) WITH NOWAIT' + CHAR(10)
        
        FETCH NEXT FROM curColumns INTO @ColumnName
    END
    
    CLOSE curColumns
    DEALLOCATE curColumns

    FETCH NEXT FROM curTables INTO @TableName
END
    
CLOSE curTables
DEALLOCATE curTables


-------------------------------------------------------------------------------
-- Done with the tables.  Display the results.                                 
-------------------------------------------------------------------------------

SET @Sql += ' '                                                     + CHAR(10)
         +  'RAISERROR('' '', 10, 1) WITH NOWAIT'                   + CHAR(10)
         +  ' '                                                     + CHAR(10)
         +  '  SELECT ColumnName          AS [Column Name], '       + CHAR(10)
         +  '         MAX(Example)        AS [Example], '           + CHAR(10)
         +  '         AVG(AverageLen)     AS [Average Length], '    + CHAR(10)
         +  '         MAX(MaximumLen)     AS [Maximum Length], '    + CHAR(10)
         +  '         MIN(DeclaredSize)   AS [Declared Size] '      + CHAR(10)
         +  '    FROM @tTotals '                                    + CHAR(10)
         +  'GROUP BY ColumnName '                                  + CHAR(10)
         +  'ORDER BY ColumnName '                                  
     
EXEC (@Sql)

-- PRINT is limited to 8000 bytes.  There are better solutions out there. 
DECLARE @Idx INT = 1

WHILE @Idx < LEN(@Sql)
BEGIN
    PRINT SUBSTRING(@Sql, @Idx, 4000)
    SET @Idx += 4000
END
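
For the sample setup at the top of the script, the final result set should come out roughly like this (the lengths are DATALENGTH values, so the NVARCHAR strings count two bytes per character, and the Example column just carries one surviving sample value per column):

Column Name       Example      Average Length   Maximum Length   Declared Size
---------------   ----------   --------------   --------------   -------------
[AccountNumber]   234          4                4                4
[Comment]         ABCDEFGHI    12               18               100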

Sunday, October 16, 2011

Adding Up the Logical Reads in the Output Window


Probably the first thing I learned about query tuning is that "clock-on-the-wall" time is definitely not the thing you want to measure. The elapsed time a query takes is influenced by so many factors (other processes running, disk speed, number of processors, amount of RAM, phase of the moon, a few hundred database settings) that it's too rough a metric for serious performance tuning.

What I've been taught to measure, instead, is logical reads: the number of data-page reads needed to execute the query. If the query doesn't change, and the underlying data doesn't change, it's been my experience (and it makes sense theoretically) that you can run a given query all day long, under any load conditions you can create, and the number of logical reads will always be the same. This immunity to outside "noise" makes it ideal for tuning: you can be pretty sure (given the same schema, of course - indexes leap to mind - and at least similar data) that the logical read counts on your machine will be the same on another machine (such as the Production server, or your boss's laptop).

Ok, so how do we capture logical reads? Easy - simply enable the STATISTICS IO option, like so:

SET STATISTICS IO ON

Once turned on for a given session, this emits the number of logical reads made against every table (including temp tables) for every statement executed, until you turn it off (by saying SET STATISTICS IO OFF). Let's look at a single line that would be emitted:

Table 'Invoices'. Scan count 1, logical reads 332, physical reads 10, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

You'll notice a lot of other values displayed on this line, but for now, ignore them. One that you should especially ignore (for now) is physical reads, because its count is included in the logical reads value. That is, in our one-line example, we can see there are going to be 332 pages read from somewhere. That "somewhere" will be either RAM (the buffer pool) or disk (your "C:" drive, for example), but both of these "somewheres" are counted by logical reads. The physical reads value counts only what was read from disk this time, which is exactly the kind of unpredictable "clock-on-the-wall" information we know to be unreliable for query tuning (for now). Why is it unpredictable? Because who knows whether a given page happened to be in the buffer pool, or had to be fetched from disk, at the precise moment we needed it? It's unknowable, and even if it weren't - if you had perfect knowledge of the innards of the computer at that moment - it's still unpredictable.
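
Here's a minimal sketch of that repeatability (the table name is just an example - use any table you have handy): run the same query twice with STATISTICS IO on. The logical reads come back identical both times, while the physical reads typically drop to zero on the second run, because by then the pages are sitting in the buffer pool.

SET STATISTICS IO ON

-- First run: may show physical reads if the pages aren't cached yet.
SELECT COUNT(*) FROM Sales.SalesOrderDetail

-- Second run: same logical reads, but physical reads are usually 0 by now.
SELECT COUNT(*) FROM Sales.SalesOrderDetail

SET STATISTICS IO OFF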

Looking at our one-line example above, everything seems pretty rosy. SET STATISTICS IO ON spits out the logical reads we need, so all we have to do is total them up. But this is where it gets ugly, because the information we're interested in is interspersed with our own PRINT messages, RAISERROR output, warning and informational messages from the system, and data output from SELECT statements (if we're using "Output to Text" mode, which I often do, as it's the only way to get the dang output window to scroll). This can all be quite a mess to read:

Starting at 12:25:30 PM...
Table 'Customers'. Scan count 6, logical reads 10213968, physical reads 125391, read-ahead reads 254294, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CustomerNum   InvoiceNum   Amt
----------- ------------ --------------------------------------
2344   1331111   333.23
4492   4837227 44222.84
3434   222444   91811.33
98474   54422  22.09
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Updating the total prices...
(1 row(s) affected)
Table 'Invoices'. Scan count 1, logical reads 332, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
12:30:38: Added row to the Invoices table for 2011/10/12
Warning: Aggregate calculation excluded NULL values.
Table 'Worktable'. Scan count 3, logical reads 120, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Now, try to total up the logical reads in that output. If there's only a few lines of output, it's manageable, but imagine trying to decipher hundreds of lines of stuff like this. Over time, I developed a rigamarole system:

  • copy the entire contents of the Output window to the clipboard
  • fire up Excel, and paste the output into it
  • from Excel's "Data" tab, click the "Sort A to Z" button
  • select only the rows that start with "Table..." and copy them to the clipboard
  • paste the rows into a new window in Management Studio
  • use a regular expression search-and-replace to strip out all the text on each line before the logical read numbers
  • use a regular expression search-and-replace to strip out all the text on each line after the logical read numbers
  • copy the column of logical read numbers back into Excel
  • from Excel's "Formulas" tab, use the AutoSum button to, finally, calculate the total number of logical reads

"Crude but effective" doesn't quite do this process justice. There must be a better way, I thought.

My first idea was an AddOn for Management Studio, but the more I looked into it, the more I realized I just couldn't face the tedium of application programming (yes, I'm now permanently spoiled by a platform - SQL Server - that actually works without swearing at it). And while SQL Server isn't exactly famous for its text-manipulation ability, what I was doing didn't have to be fast or elegant. And I'd rather hack something together in T-SQL in a few hours than beat my head against the C# wall for a few days. (Don't believe me? Ask someone who's written a truly great Visual Studio add-on.) As a bonus, the people using the code would be able to understand it, and maybe even build upon and improve it.

So, after an embarrassingly long delay caused by forgetting, again, that REPLACE replaces all occurrences, not just the first one, here it is. It works correctly as far as I can tell, but don't use it as an example of how to write good T-SQL code. (Please tell me about any bugs you find. Use at your own risk. Some settling of contents may occur during shipping.)

SET NOCOUNT ON
SET QUOTED_IDENTIFIER OFF

DECLARE @Text NVARCHAR(MAX) =
"-- INSERT YOUR TEXT BETWEEN THESE TWO LINES -----------------------------------
Starting at 12:25:30 PM...
Table 'Customers'. Scan count 6, logical reads 10213968, physical reads 125391, read-ahead reads 254294, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CustomerNum   InvoiceNum   Amt
----------- ------------ --------------------------------------
2344   1331111   333.23
4492   4837227 44222.84
3434   222444   91811.33
98474   54422  22.09
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Updating the total prices...
(1 row(s) affected)
Table 'Invoices'. Scan count 1, logical reads 332, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
12:30:38: Added row to the Invoices table for 2011/10/12
Warning: Aggregate calculation excluded NULL values.
Table 'Worktable'. Scan count 3, logical reads 120, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- INSERT YOUR TEXT BETWEEN THESE TWO LINES -----------------------------------"

DECLARE @Lines TABLE (Txt NVARCHAR(MAX), Num INT)
DECLARE @nLF   INT

WHILE 1=1
BEGIN
    -- Find the next linefeed; no linefeed means no more lines.
    SET @nLF = CHARINDEX(NCHAR(10), @Text)
    IF @nLF = 0  BREAK
    
    -- A linefeed this early means there's no real line here (a stray LF or a
    -- blank line): discard one character and try again.
    IF @nLF < 3
    BEGIN
        SET @Text = STUFF(@Text, 1, 1, '')  
        CONTINUE
    END
    
    -- Save the line, minus its trailing CR/LF pair.
    INSERT @Lines (Txt)
    VALUES (LEFT(@Text, @nLF - LEN(NCHAR(13) + NCHAR(10))))

    -- Remove the saved line (up through its CR) from the text and go around again.
    SET @nLF = PATINDEX('%' + NCHAR(13) + NCHAR(10) + '%', @Text)
    SET @Text = STUFF(@Text, 1, @nLF, '')
END

DELETE @Lines WHERE Txt NOT LIKE '%, logical reads %'
UPDATE @Lines SET Num = PATINDEX('%, logical reads %', Txt)
UPDATE @Lines SET Txt = STUFF(Txt, 1, Num + LEN(', logical reads '), '')
UPDATE @Lines SET Num = CHARINDEX(',', Txt)
UPDATE @Lines SET Txt = LEFT(Txt, Num - 1)
UPDATE @Lines SET Num = CAST(Txt AS INT)
DELETE @Lines WHERE Num = 0

SELECT Num      AS 'Logical Reads'       FROM @Lines
SELECT SUM(Num) AS 'Total Logical Reads' FROM @Lines

SET QUOTED_IDENTIFIER ON

To use this script, just copy-and-paste the entire contents of the Output window between the two "-- INSERT YOUR TEXT BETWEEN THESE TWO LINES" markers in the @Text variable's definition, and hit F5 (Ctrl-E, whatever) to execute the script. You should get output like this:
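
(These numbers come from the sample text baked into @Text above; rows with zero logical reads are discarded by the DELETE near the bottom of the script.)

Logical Reads
-------------
10213968
332
120

Total Logical Reads
-------------------
10214420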

The first set of results represents the SET STATISTICS IO rows that have non-zero logical reads, and is useful mostly for debugging the script. You can suppress this output by commenting out the first SELECT statement in the script (third line of code from the bottom). The second set of results is what we're looking for: the total number of logical reads that appear in the text of the Output window.

Sunday, October 9, 2011

SSIS: Word-wrapping Annotations (Using Only Notepad)

Remember that Twilight Zone episode, where a man sells his soul to the Devil for immortality, only to be sentenced to life in prison the next day?

I'm learning SSIS (2005, for now), and have discovered, like all those before me, that Microsoft implemented Annotations, but... they don't word-wrap! Your Annotation must fit on one really long line! Bwa-ha-ha-ha-ha-ha!

Of course, I'm no longer surprised by anything SSIS: I even have a bookmark for "SSIS" Google searches now. And that's how I found Paul Blackwell's excellent Hidden SSIS Features: Word Wrapping Your Annotations And More which covers the subject extremely well. I highly recommend the entire article, especially the section titled Transparency & Word Wrapping.

Another very interesting read is [SSIS] Modifier l'apparence des packages via l'API [FAIL] by François Jehl. (We non-Francophones can use Google's translate-to-English, or the page itself has a Microsoft translation button.) After some research, his conclusion is that Annotations are a feature from a product created before SSIS ("DaVinci Design Surface", hence "MSDDS"), which might explain why Annotations are implemented (hacked, really) into SSIS package files as XML nested inside the document's XML. (Hmmm... that would explain all the &lt;s and &gt;s floating around!) For what François was attempting, that was a fatal roadblock. Fortunately, we're not trying anything nearly so ambitious or useful as he was - certainly nothing Notepad can't handle!

Anyway, using what they've uncovered, here's how to make your Annotations word-wrap. (This is how I do it - use at your own risk, and make a backup of your package file first). Close Visual Studio, and open the package file in Notepad. Search for controlprogid="MSDDS.Text", and you'll be treated to nested XML that looks something like this:

&lt;ddscontrol controlprogid="MSDDS.Text" left="4118" top="-655" logicalid="18" controlid="9" masterid="0" hint1="0" hint2="0" width="6153" height="994" noresize="0" nomove="0" nodefaultattachpoints="1" autodrag="0" usedefaultiddshape="1" selectable="1" showselectionhandles="1" allownudging="1" isannotation="0" dontautolayout="0" groupcollapsed="0" tabstop="1" visible="1" snaptogrid="0"&gt;
    &lt;control&gt;
      &lt;ddsxmlobjectstreaminitwrapper binary="0002000009180000e20300000300640000000500008008000000000000002a000054006800650001000000900144420100065461686f0074006f002000506d615800540068006500200063006f006e006e0065006300740069006f006e0020006d0061006e00610067006500054006800650007200200066006f0074006f0020005000720020007400680069007300200d61582005400610073006b0020006600610069006c0073002000730070006f007200610064000540068006500074006f00200050006900630061006c006c007900200075006e006c00650073007300200069007400270073002000730065007400200d6158005400680065002006100730073006900760065002000540068006500004d006f00640065002e0000000000" /&gt;
    &lt;/control&gt;
    &lt;layoutobject&gt;
      &lt;ddsxmlobj /&gt;
    &lt;/layoutobject&gt;
    &lt;shape groupshapeid="0" groupnode="0" /&gt;
  &lt;/ddscontrol&gt;

See the "2" at position 61 of the ddsxmlobjectstreaminitwrapper item's binary value? To enable word-wrap, simply OR in a value of 1: in this case, we change that "2" (binary 10) to a "3" (binary 11). Once that's done, save the package file (you made a backup first, right?), exit Notepad, open the package with Visual Studio, and - voilà! - your Annotation is word-wrapped! Now do the same thing for all your other Annotations. (Hey, it's a hack, not magic.)

So, if it's just a bit-flip, why didn't Microsoft simply put a checkbox on the Properties page for Annotations? Because... Annotations don't have Property pages! Bwa-ha-ha-ha-ha-ha!

Tuesday, October 4, 2011

SSIS FTP Task: Directory is not specified in the file connection manager

Are you getting this error message, been Googling for an hour, and nothing helps?

Error: 0xC0029186 at {Your FTP Task}, {Your FTP Task Description}:
Directory is not specified in the file connection manager {Your Variable Name}.

If you entered the value for the source and/or target filename variable and enclosed it with double-quotes, try removing them. Worked for me.
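
For example (the UNC path here is purely hypothetical, just to show the difference), a file connection variable whose value is entered as

"\\FtpStage\Outbound\Invoices.txt"

produces the error above, while the same value without the enclosing quotes works fine:

\\FtpStage\Outbound\Invoices.txt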

(Before coming over to the SQL Server side, I was a full-time Windows application developer, working in C, C++, and C#, ever since Microsoft QuickC for Windows was released, way back in 1990. If you've ever wondered what it's like being a Windows application developer, imagine that everything in SQL Server was designed and implemented just like SSIS. If you've ever spent much time writing application code for Windows, you'll know exactly what I mean. If not, count your blessings!)

Thursday, September 29, 2011

Cleaning User Stuff from the master Database


This is an upgrade to a script I wrote a long time ago. It's handy when you find a system whose master database is somehow full of user objects. (I know that none of us ever make that mistake...)

The biggest improvement is that this script now only generates the DROP statements; it doesn't execute them. You have to do that yourself, and you should be sure to back up your master database (you do that on a regular basis anyway, right?), and be darn sure to read all the comments, and of course examine the generated T-SQL code, before executing it. Use at your own risk!


-------------------------------------------------------------------------------
-- Drops all user objects that have been accidentally created in 'master'.     
-- This script erases everything in the master database that is not Microsoft- 
-- shipped.  HOWEVER, be careful: some user objects BELONG in the master       
-- database (database triggers, security objects, for example). Backing up the 
-- master database first would be a prudent step.  Use at your own risk.       
-- NOTE: This script only generates the T-SQL code required.  You have to copy 
-- it and do the actual execution yourself.  You may have to run this script   
-- several times due to dependencies; you may also get error messages about    
-- objects that cannot be found, also due to dependencies.                     
-------------------------------------------------------------------------------

USE master

DECLARE @Sql            NVARCHAR(MAX)       SET @Sql            = ''
DECLARE @ObjectName     sysname             SET @ObjectName     = ''
DECLARE @ConstraintName sysname             SET @ConstraintName = ''
DECLARE @TableName      sysname             SET @TableName      = ''
DECLARE @TypeDesc       NVARCHAR(4000)      SET @TypeDesc       = ''
DECLARE @Type           NVARCHAR(1024)      SET @Type           = ''
DECLARE @Score          INT                 SET @Score          = 0
DECLARE @ObjectID       INT                 SET @ObjectID       = 0

DECLARE @ENumber        INT                 
DECLARE @ESeverity      INT                 
DECLARE @EState         INT                 
DECLARE @EProcedure     NVARCHAR(126)       
DECLARE @ELine          INT                 
DECLARE @EMessageRecv   NVARCHAR(2048)      
DECLARE @EMessageSent   NVARCHAR(440)       

RAISERROR('-- Execute this code only *AFTER* you have examined it thoroughly!', 10, 1) WITH NOWAIT
RAISERROR('USE master', 10, 1) WITH NOWAIT


-------------------------------------------------------------------------------
-- Most objects follow this pattern.                                           
-------------------------------------------------------------------------------

DECLARE cur CURSOR DYNAMIC FOR
    SELECT name,
           type_desc
      FROM sys.objects
     WHERE is_ms_shipped = 0
     ORDER BY type_desc, name

OPEN cur
FETCH FIRST FROM cur INTO @ObjectName, @TypeDesc

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        IF @TypeDesc = 'VIEW'
        BEGIN
            SET @Sql = 'DROP VIEW ' + QUOTENAME(@ObjectName)
            RAISERROR('%s', 10, 1, @Sql) WITH NOWAIT    
        END
        ELSE IF @TypeDesc NOT LIKE '%$_CONSTRAINT' ESCAPE '$'
        BEGIN
            SET @Type  = REVERSE(@TypeDesc)
            SET @Score = CHARINDEX('_', @Type)
            SET @Type  = SUBSTRING(@Type, 0, @Score)
            SET @Type  = REVERSE(@Type)
            SET @Sql   = 'DROP ' + @Type + ' ' + QUOTENAME(@ObjectName)
            RAISERROR('%s', 10, 1, @Sql) WITH NOWAIT    
        END
    END TRY
    BEGIN CATCH
        SET @ENumber      = ISNULL(ERROR_NUMBER(),          -1)
        SET @ESeverity    = ISNULL(ERROR_SEVERITY(),        -1)
        SET @EState       = ISNULL(ERROR_STATE(),            0)    IF @EState = 0 SET @EState = 42
        SET @EProcedure   = ISNULL(ERROR_PROCEDURE(), N'{N/A}')
        SET @ELine        = ISNULL(ERROR_LINE(),            -1)
        SET @EMessageRecv = ISNULL(ERROR_MESSAGE(),        N'')
        SET @EMessageSent = N''

        IF ERROR_PROCEDURE() IS NOT NULL   SET @EMessageSent = N'Error %d, Level %d, State %d, Procedure %s, Line %d, Message: '
        SET @EMessageSent = @EMessageSent + ERROR_MESSAGE()
        RAISERROR(@EMessageSent, 10 /* @ESeverity */ , @EState, @ENumber, @ESeverity, @EState, @EProcedure, @ELine) WITH LOG
    END CATCH

    FETCH NEXT FROM cur INTO @ObjectName, @TypeDesc
END

CLOSE cur
DEALLOCATE cur


-------------------------------------------------------------------------------
-- Constraints are the only exception to the pattern above.                    
-------------------------------------------------------------------------------

DECLARE cur CURSOR DYNAMIC FOR
    SELECT name,
           type_desc,
           OBJECT_NAME(parent_object_id)
      FROM sys.objects
     WHERE is_ms_shipped = 0
       AND type_desc LIKE '%$_CONSTRAINT' ESCAPE '$'
  ORDER BY type_desc, name

OPEN cur    
FETCH FIRST FROM cur INTO @ConstraintName, @TypeDesc, @TableName

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        SET @Sql = 'ALTER TABLE ' + QUOTENAME(@TableName) + ' DROP CONSTRAINT ' + QUOTENAME(@ConstraintName)
        RAISERROR('%s', 10, 1, @Sql) WITH NOWAIT    
    END TRY
    BEGIN CATCH
        SET @ENumber      = ISNULL(ERROR_NUMBER(),          -1)
        SET @ESeverity    = ISNULL(ERROR_SEVERITY(),        -1)
        SET @EState       = ISNULL(ERROR_STATE(),            0)    IF @EState = 0 SET @EState = 42
        SET @EProcedure   = ISNULL(ERROR_PROCEDURE(), N'{N/A}')
        SET @ELine        = ISNULL(ERROR_LINE(),            -1)
        SET @EMessageRecv = ISNULL(ERROR_MESSAGE(),        N'')
        SET @EMessageSent = N''

        IF ERROR_PROCEDURE() IS NOT NULL   SET @EMessageSent = N'Error %d, Level %d, State %d, Procedure %s, Line %d, Message: '
        SET @EMessageSent = @EMessageSent + ERROR_MESSAGE()
        RAISERROR(@EMessageSent, 10 /* @ESeverity */ , @EState, @ENumber, @ESeverity, @EState, @EProcedure, @ELine) WITH LOG
    END CATCH

    FETCH NEXT FROM cur INTO @ConstraintName, @TypeDesc, @TableName
END

CLOSE cur
DEALLOCATE cur


-------------------------------------------------------------------------------
-- Done.                                                                       
-------------------------------------------------------------------------------

RAISERROR('-- Execute this code only *AFTER* you have examined it thoroughly!', 10, 1) WITH NOWAIT
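
With a few hypothetical stray objects in master (the object names below are invented for illustration), the generated output would look something like this:

-- Execute this code only *AFTER* you have examined it thoroughly!
USE master
DROP PROCEDURE [usp_OopsWrongDatabase]
DROP TABLE [StrayInvoices]
DROP VIEW [vw_AccidentallyCreatedHere]
ALTER TABLE [StrayInvoices] DROP CONSTRAINT [PK_StrayInvoices]
-- Execute this code only *AFTER* you have examined it thoroughly!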

Thursday, August 4, 2011

"Select Top 1000 Rows" Doesn't Show SPARSE Columns?

A colleague pointed out something I'd never noticed about SQL Server Management Studio's (SSMS) "Select Top 1000 Rows" feature: it doesn't display SPARSE columns. This is not a bug, but rather by design: SQL Server tables can have up to 30,000 SPARSE columns, so imagine the issues with viewing thousands of columns at a time!


The annoying thing, however, is that no matter how few columns a table has, not even a single SPARSE column will be displayed. To work around this limitation, I've written a script that creates views for all tables with SPARSE columns. Since views don't suffer from this limitation in SSMS, you can use "Select Top 1000 Rows" on them to effectively see all the columns on the table.

I personally run it as a "startup script", just to semi-automate the maintenance of re-creating the views when tables change over time, but it could just as easily be set to run as a SQL Agent Job, or run manually if desired.

-------------------------------------------------------------------------------
-- Declare variables.                                                          
-------------------------------------------------------------------------------

DECLARE @Prefix     NVARCHAR(50)  = '_'
DECLARE @Suffix     NVARCHAR(50)  = '_SPARSE'
DECLARE @SchemaName sysname       = ''
DECLARE @TableName  sysname       = ''
DECLARE @ViewName   sysname       = ''
DECLARE @Sql        NVARCHAR(MAX) = ''


-------------------------------------------------------------------------------
-- Since tables can be added, modified, and deleted, start with a clean slate. 
-------------------------------------------------------------------------------

DECLARE cur CURSOR FOR
    SELECT SCHEMA_NAME(schema_id)   AS 'SchemaName'
         , name                     AS 'TableName'
     FROM sys.views
    WHERE is_ms_shipped = 0 
      AND name LIKE '%' + @Suffix ESCAPE '$'

OPEN cur
FETCH NEXT FROM cur INTO @SchemaName, @TableName

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @ViewName = QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName)

    IF OBJECT_ID(@ViewName, 'V') IS NOT NULL
    BEGIN        
        SET @Sql = 'DROP VIEW ' + @ViewName
        EXEC sys.sp_executesql @Sql 
    END

    FETCH NEXT FROM cur INTO @SchemaName, @TableName
END

CLOSE cur
DEALLOCATE cur


-------------------------------------------------------------------------------
-- Iterate through all tables and create view for those with SPARSE columns.   
-------------------------------------------------------------------------------

DECLARE cur CURSOR FOR
    SELECT DISTINCT SCHEMA_NAME(t.schema_id)   AS 'SchemaName'
                  , t.name                     AS 'TableName'
     FROM sys.tables    t
     JOIN sys.columns   c
       ON t.object_id = c.object_id
    WHERE t.type_desc     = 'USER_TABLE'
      AND t.is_ms_shipped = 0 
      AND c.is_sparse     = 1

OPEN cur
FETCH NEXT FROM cur INTO @SchemaName, @TableName

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @ViewName  = QUOTENAME(@SchemaName) + '.' + QUOTENAME(@Prefix + @TableName + @Suffix)
    SET @TableName = QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName)
    
    IF OBJECT_ID(@ViewName, 'V') IS NULL
    BEGIN        
        SET @Sql = 'CREATE VIEW ' + @ViewName + ' AS  SELECT * FROM ' + @TableName
        EXEC sys.sp_executesql @Sql 
    END

    FETCH NEXT FROM cur INTO @SchemaName, @TableName
END

CLOSE cur
DEALLOCATE cur
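
To make the naming concrete: with the default @Prefix ('_') and @Suffix ('_SPARSE') above, a table such as dbo.MyTable (the test table created below) gets its companion view from this generated statement:

CREATE VIEW [dbo].[_MyTable_SPARSE] AS  SELECT * FROM [dbo].[MyTable]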

Once the above code has been run, any tables you have with SPARSE columns will have views created for them. (If you don't happen to have any, here's some code to create a small test table.)

SET NOCOUNT ON

-------------------------------------------------------------------------------
-- Set up test table.                                                          
-------------------------------------------------------------------------------

IF OBJECT_ID('MyTable') IS NOT NULL
    DROP TABLE MyTable

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE MyTable
(
    TableID            INT                  NOT NULL,
    MySparseColumn     NVARCHAR (50) SPARSE     NULL,
    MyNonSparseColumn  NVARCHAR (50)        NOT NULL
)

GO

INSERT MyTable (TableID, MySparseColumn, MyNonSparseColumn)
VALUES (1, NULL, 'ABC')

GO 10

Once you've got the views created, simply use the "Select Top 1000 Rows" feature as usual, and you'll be able to see all the columns in your tables...

... even the SPARSE ones.

Thursday, June 16, 2011

Instantly Find Size of Tables on Disk

Here's a script to tell you how big your table(s) are, subtotaled by indexes, with a grand total. Includes LOBs, and excludes dropped pages.

SELECT CASE WHEN GROUPING(t.name) = 0 THEN t.name ELSE 'GRAND TOTAL' END    AS 'Table Name'
     , CASE WHEN GROUPING(i.name) = 0 THEN i.name ELSE 'TABLE TOTAL' END    AS 'Index Name'
     , SUM(au.total_pages) * 8 / 1024                                       AS 'Allocated MB'
     , SUM(au.used_pages)  * 8 / 1024                                       AS 'Used MB'

  FROM sys.allocation_units   au

  LEFT JOIN sys.partitions    pHobt
    ON au.container_id = pHobt.hobt_id
   AND (au.type_desc = 'IN_ROW_DATA'  OR  au.type_desc = 'ROW_OVERFLOW_DATA')

  LEFT JOIN sys.partitions    pLob
    ON au.container_id = pLob.partition_id
   AND au.type_desc    = 'LOB_DATA'

  JOIN sys.indexes            i  
    ON (i.object_id = pHobt.object_id  AND  i.index_id = pHobt.index_id)
         OR
       (i.object_id = pLob.object_id   AND  i.index_id = pLob.index_id)
   
  JOIN sys.tables             t
    ON i.object_id  = t.object_id
  
 WHERE au.type_desc != 'DROPPED'
   AND t.type_desc   = 'USER_TABLE'
   AND COALESCE(pHobt.rows, pLob.rows, 0) > 0
   AND t.name LIKE '%MyTable%'
  
 GROUP BY ROLLUP(t.name, i.name)
 ORDER BY t.name, i.name 

Saturday, June 11, 2011

Easy Documentation: Does Anything Even Use That Index?

I don't know if it's ever happened to you, but I always find myself wishing I had documented what the devil I was thinking when I added an index to a table. I mean, it's all well and good to create an index, but it's annoying when six months later you can't remember what it was for. Is anything actually using this index? Can I maybe drop it?

Yes, yes, I suppose I could document my indexes, and I suppose some people do that kind of thing... but it just seems like so much... work.

Wouldn't it be cool if there were a way to automatically document which stored procedures use which indexes? And how many times each procedure used each index? And when the last time that was? And if it were all integrated into SQL Server Management Studio, with no work required on your part?

Of course it would be cool; so here it is. In this post, I'll just present the code, and what it will do for you; I'll walk through it in my next post.

Briefly, this script examines the current and cached query plans, and parses the info we want out of them; then it adds extended properties to each index.

After you run this script, each of your indexes will carry extended properties, one for each stored procedure that uses it (the script's header comment below shows what they look like). Of course, a query plan using the index had to have been in the plan cache for it to show up here; since the script only takes a few seconds to run, I have a job that runs it every so often. Over the long run, all the indexes eventually get found in the cache; it just depends on how often your various applications run. Note that the date in each extended property is updated every time the script is run.
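
If you'd rather not drill through SSMS to see them, a quick query like this one (not part of the script, and 'dbo.MyTable' is just a placeholder table name) lists the index-level extended properties the script maintains; primary key constraints are stored as object-level properties and need a slightly different query:

SELECT i.name    AS 'Index Name'
     , ep.name   AS 'Procedure Name'
     , ep.value  AS 'Usage'
  FROM sys.extended_properties   ep
  JOIN sys.indexes               i
    ON ep.major_id = i.object_id
   AND ep.minor_id = i.index_id
 WHERE ep.class_desc = 'INDEX'
   AND ep.major_id   = OBJECT_ID('dbo.MyTable')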

The Script


-------------------------------------------------------------------------------
-- Document Indexes via Query Plans and Extended Properties.sql                
--                                                                             
-- 2011-06-11 - Larry Leonard - http://sqlsoundings.blogspot.com               
--                                                                             
-- (Note: In the comments below "index" is shorthand for "index or primary key 
-- constraint", and "stored procedure" is shorthand for "stored procedure,     
-- trigger, etc.".)                                                            
--                                                                             
-- This script first examines each currently executing query plan and extracts 
-- (a) the name of each index each query plan is using and (b) the name of the 
-- executing stored procedure for the query plan.  An Extended Property for    
-- each stored procedure is added to the corresponding index(es).              
--                                                                             
-- As this script is run, over time, every index (assuming it's ever caught in 
-- use) gathers a set of Extended Properties.  Each Extended Property displays 
-- the *name* of the stored procedure that used it, how many *times* since the 
-- last SQL Server restart it used it, and the *date* this script last updated 
-- this Extended Property.  For example:                                       
--                                                                             
--    usp_CalculateInvoice     Used this constraint  3315 times   (2011-03-22) 
--    TR_Customer_Update       Used this index         42 times   (2011-05-01) 
--                                                                             
-- This script runs against the current database.  It should be run on an      
-- irregular basis so that every query is more likely to be caught executing   
-- at least once.  How long it takes for every query to have executed at least 
-- once depends entirely on your system.  If after a sufficient amount of time 
-- has passed, an index does not have any Extended Properties created by this  
-- script, that doesn't prove the index is unused, only that the index needs   
-- investigation. Don't drop indexes based solely on the usage of this script. 
-- You should read and test this script thoroughly before using it in a        
-- production environment.                                                     
-------------------------------------------------------------------------------


RAISERROR('"Document Indexes via Query Plans and Extended Properties.sql" starting on "%s"' , 10, 1, @@SERVERNAME) WITH NOWAIT, LOG

IF DB_ID() < 5
BEGIN
    RAISERROR('This is a system database.', 16, 1)
    RETURN
END

SET NOCOUNT ON
SET XACT_ABORT ON

-- We don't want to kill some other process doing real work.
SET DEADLOCK_PRIORITY -10


-------------------------------------------------------------------------------
-- Variables. None of these should need to be changed in normal use.           
-------------------------------------------------------------------------------

DECLARE @DatabaseName             sysname        = ''
DECLARE @ProcedureName            sysname        = ''
DECLARE @SchemaName               sysname        = ''
DECLARE @SchemaNameUnquoted       sysname        = ''
DECLARE @TableName                sysname        = ''
DECLARE @TableNameUnquoted        sysname        = ''
DECLARE @IndexName                sysname        = ''
DECLARE @PlanUseCount             INT            = 0
DECLARE @PlanUseCountString       NVARCHAR(50)   = ''
DECLARE @Len                      INT            = 0
DECLARE @IndexNamePrev            sysname        = ''
DECLARE @IndexNameUnquoted        sysname        = ''
DECLARE @ExtendedPropertyName     sysname        = ''
DECLARE @ExtendedPropertyValue    NVARCHAR(3750) = ''       -- The value of an extended property is a sql_variant value that can contain up to 7,500 bytes of data. 
DECLARE @ExtendedPropertyValueOld NVARCHAR(3750) = ''
DECLARE @Level2Type               VARCHAR(50)    = ''
DECLARE @Now                      NVARCHAR(50)   = CONVERT(NVARCHAR, CAST(GETDATE() AS DATE), 120)
DECLARE @Msg                      NVARCHAR(2048) = ''
DECLARE @Error                    INT            = 0
DECLARE @AddedCount               INT            = 0
DECLARE @UpdatedCount             INT            = 0
DECLARE @UnchangedCount           INT            = 0

RAISERROR('Looking for indexes in the plan cache...', 10, 1) WITH NOWAIT

DECLARE cur CURSOR FOR 
   WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
    cteCachedPlansXml
    AS
    (
         SELECT DISTINCT
                cp.plan_handle                                       AS plan_handle

              , COALESCE(RelOp.op.value(N'TableScan [1]/Object[1]/@Database',          N'varchar(128)'),
                         RelOp.op.value(N'OutputList[1]/ColumnReference[1]/@Database', N'varchar(128)'),
                         RelOp.op.value(N'IndexScan [1]/Object[1]/@Database',          N'varchar(128)'))
                                                                     AS DatabaseName

              , COALESCE(RelOp.op.value(N'TableScan [1]/Object[1]/@Schema',            N'varchar(128)'),
                         RelOp.op.value(N'OutputList[1]/ColumnReference[1]/@Schema',   N'varchar(128)'),
                         RelOp.op.value(N'IndexScan [1]/Object[1]/@Schema',            N'varchar(128)')) 
                                                                     AS SchemaName

              , COALESCE(RelOp.op.value(N'TableScan [1]/Object[1]/@Table',             N'varchar(128)'),
                         RelOp.op.value(N'OutputList[1]/ColumnReference[1]/@Table',    N'varchar(128)'),
                         RelOp.op.value(N'IndexScan [1]/Object[1]/@Table',             N'varchar(128)'))
                                                                     AS TableName
             
              , COALESCE(RelOp.op.value(N'TableScan [1]/Object[1]/@Index',             N'varchar(128)'),
                         RelOp.op.value(N'OutputList[1]/ColumnReference[1]/@Index',    N'varchar(128)'),
                         RelOp.op.value(N'IndexScan [1]/Object[1]/@Index',             N'varchar(128)'))
                                                                     AS IndexName

              , QUOTENAME(OBJECT_NAME(qp.objectid))                  AS ProcedureName
              , ISNULL(cp.usecounts, 0)                              AS PlanUseCount              
           FROM sys.dm_exec_cached_plans                 cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle)   qp
    CROSS APPLY qp.query_plan.nodes(N'//RelOp')          RelOp (op)
          WHERE qp.dbid         = DB_ID()
            AND cp.cacheobjtype IN ('Compiled Plan', 'Compiled Plan Stub', 'Parse Tree', 'Extended Proc', 'CLR Compiled Func', 'CLR Compiled Proc')
            AND cp.objtype      IN ('Proc', 'Prepared', 'Adhoc', 'ReplProc', 'Trigger', 'View', 'Default', 'UsrTab', 'SysTab', 'Check', 'Rule')            
    )
    ,
    cteCachedPlans AS
    (
         SELECT DISTINCT 
                ctecpx.SchemaName                                                                     AS SchemaName
              , ctecpx.TableName                                                                      AS TableName
              , ctecpx.IndexName                                                                      AS IndexName
              , SUBSTRING(ctecpx.IndexName, 2, LEN(ctecpx.IndexName) - 2)                             AS IndexNameUnquoted
              , CASE MIN(CAST(i.is_primary_key as int)) WHEN 0 THEN 'INDEX' ELSE 'CONSTRAINT' END     AS Level2Type
              , ctecpx.ProcedureName                                                                  AS ProcedureName
              , MIN(ctecpx.PlanUseCount)                                                              AS PlanUseCountMin
              , MAX(ctecpx.PlanUseCount)                                                              AS PlanUseCountMax
              , SUM(ctecpx.PlanUseCount)                                                              AS PlanUseCountSum
           FROM cteCachedPlansXml                        ctecpx               
           JOIN sys.dm_exec_query_stats                  qs
             ON ctecpx.plan_handle = qs.plan_handle
           JOIN sys.indexes                              i
             ON OBJECT_ID(ctecpx.TableName) = i.object_id
            AND INDEXPROPERTY(OBJECT_ID(ctecpx.TableName), SUBSTRING(ctecpx.IndexName, 2, LEN(ctecpx.IndexName) - 2), 'IndexId') = i.index_id
           JOIN sys.tables                               t
             ON  i.object_id = t.object_id
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)      st
          WHERE st.dbid              = DB_ID()
            AND ctecpx.DatabaseName  = QUOTENAME(DB_NAME())
            AND ctecpx.SchemaName   != QUOTENAME('sys')
            AND ctecpx.SchemaName    IS NOT NULL 
            AND ctecpx.TableName     NOT LIKE '$[$#%' ESCAPE '$'
            AND ctecpx.TableName     IS NOT NULL 
            AND ctecpx.IndexName     IS NOT NULL 
            AND ctecpx.ProcedureName IS NOT NULL
            AND t.type_desc NOT IN ('SQL_SCALAR_FUNCTION')          -- Added to as I find them. 
       GROUP BY ctecpx.SchemaName
              , ctecpx.TableName
              , ctecpx.IndexName
              , ctecpx.ProcedureName
    )
    ,
    cteIndexExtendedProperties AS
    (
        SELECT ctecp.SchemaName
             , ctecp.TableName
             , ctecp.IndexName
             , ctecp.IndexNameUnquoted
             , ctecp.ProcedureName
             , ctecp.PlanUseCountSum
             , ISNULL(CAST(ep.value AS NVARCHAR(3750)), '')   AS ExtendedPropertyValue
             , ctecp.Level2Type                               AS Level2Type
          FROM cteCachedPlans             ctecp
     LEFT JOIN sys.extended_properties    ep
            ON ep.major_id = OBJECT_ID(ctecp.SchemaName + '.' + ctecp.TableName, 'U')
           AND ep.minor_id = INDEXPROPERTY(OBJECT_ID(ctecp.SchemaName + '.' + ctecp.TableName, 'U'), ctecp.IndexNameUnquoted, 'IndexId')
         WHERE ISNULL(ep.class_desc, 'INDEX') = 'INDEX'
    )
    ,
    ctePkExtendedProperties AS
    (
        SELECT ctecp.SchemaName
             , ctecp.TableName
             , ctecp.IndexName
             , ctecp.IndexNameUnquoted
             , ctecp.ProcedureName
             , ctecp.PlanUseCountSum    
             , ISNULL(CAST(ep.value AS NVARCHAR(3750)), '')   AS ExtendedPropertyValue
             , ctecp.Level2Type                               AS Level2Type
          FROM cteCachedPlans             ctecp
     LEFT JOIN sys.extended_properties    ep
            ON ep.major_id = OBJECT_ID(ctecp.SchemaName + '.' + ctecp.TableName + '.' + ctecp.IndexName, 'PK')
           AND ep.minor_id = 0                                                              -- 0 means on the index, not on any columns. 
         WHERE ISNULL(ep.class_desc, 'OBJECT_OR_COLUMN') = 'OBJECT_OR_COLUMN'
    )
    ,
    cteExtendedProperties AS
    (
        SELECT cteiep.SchemaName                               AS SchemaName
             , cteiep.TableName                                AS TableName
             , cteiep.IndexName                                AS IndexName
             , cteiep.ProcedureName                            AS ProcedureName
             , SUM(cteiep.PlanUseCountSum)                     AS PlanUseCount 
             , MIN(cteiep.ExtendedPropertyValue)               AS ExtendedPropertyValue
             , ISNULL(cteiep.Level2Type, ctepkep.Level2Type)   AS Level2Type
          FROM cteIndexExtendedProperties   cteiep
     LEFT JOIN ctePkExtendedProperties      ctepkep
            ON cteiep.SchemaName    = ctepkep.SchemaName
           AND cteiep.TableName     = ctepkep.TableName
           AND cteiep.IndexName     = ctepkep.IndexName
           AND cteiep.ProcedureName = ctepkep.ProcedureName
      GROUP BY cteiep.SchemaName, cteiep.TableName, cteiep.IndexName, cteiep.ProcedureName, ISNULL(cteiep.Level2Type, ctepkep.Level2Type)
    )
    SELECT SchemaName
         , TableName
         , IndexName
         , ProcedureName
         , PlanUseCount 
         , ExtendedPropertyValue
         , Level2Type
      FROM cteExtendedProperties        
  ORDER BY SchemaName
         , TableName
         , IndexName
         , ProcedureName
         , PlanUseCount 
         , ExtendedPropertyValue
         , Level2Type


-------------------------------------------------------------------------------
-- Now that we've created the SELECT statement, iterate it through a cursor.   
-------------------------------------------------------------------------------

OPEN cur
RAISERROR('Found %d query plans with index usage in the cache', 10, 1, @@CURSOR_ROWS) WITH NOWAIT

FETCH NEXT FROM cur
 INTO @SchemaName, @TableName, @IndexName, @ProcedureName, @PlanUseCount, @ExtendedPropertyValue, @Level2Type
 
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY

        -- Build some values we're going to need. 
        SET @Len = LEN(@ProcedureName)  SET @ExtendedPropertyName = SUBSTRING(@ProcedureName, 2, @Len - 2)
        SET @Len = LEN(@SchemaName)     SET @SchemaNameUnquoted   = SUBSTRING(@SchemaName,    2, @Len - 2)
        SET @Len = LEN(@TableName)      SET @TableNameUnquoted    = SUBSTRING(@TableName,     2, @Len - 2)
        SET @Len = LEN(@IndexName)      SET @IndexNameUnquoted    = SUBSTRING(@IndexName,     2, @Len - 2)
        SET @Len = LEN(@Level2Type)     SET @Level2Type           = @Level2Type + SPACE(LEN('CONSTRAINT') - @Len)
        
        SET @ExtendedPropertyValueOld = @ExtendedPropertyValue
        SET @PlanUseCountString       = REPLACE(CONVERT(NVARCHAR(50), CAST(@PlanUseCount AS MONEY), 1), '.00', '')
        SET @PlanUseCountString       = SPACE(10 - LEN(@PlanUseCountString)) + @PlanUseCountString
        SET @ExtendedPropertyValue    = 'Used this ' + LOWER(@Level2Type) + ' ' + @PlanUseCountString 
        SET @ExtendedPropertyValue   += CASE @PlanUseCount WHEN 1 THEN ' time' ELSE ' times' END 
        SET @ExtendedPropertyValue   += '   (' + @Now + ')'

        -- If the index name is new, create a display heading for it; just for readability. 
        IF @IndexName != @IndexNamePrev
        BEGIN
            SET @IndexNamePrev = @IndexName
            SET @Msg = @SchemaNameUnquoted + '.' + @TableNameUnquoted + '.' + @IndexNameUnquoted
            RAISERROR(' ',  10, 1) WITH NOWAIT
            RAISERROR(@Msg, 10, 1) WITH NOWAIT
        END   
        
        -- If the value hasn't changed, nothing to do. 
        BEGIN TRAN 

        IF @ExtendedPropertyValue = @ExtendedPropertyValueOld
        BEGIN
            RAISERROR('   Unchanged: %-40s = %s', 10, 1, @ExtendedPropertyName, @ExtendedPropertyValue) WITH NOWAIT
            SET @UnchangedCount += 1
        END
        ELSE IF EXISTS (SELECT *
                         FROM sys.extended_properties    ep
                        WHERE ep.class_desc = 'INDEX'
                          AND ep.name       = @ExtendedPropertyName
                          AND ep.major_id   = OBJECT_ID(@SchemaNameUnquoted + '.' + @TableNameUnquoted, 'U')
                          AND ep.minor_id   = INDEXPROPERTY(OBJECT_ID(@SchemaNameUnquoted + '.' + @TableNameUnquoted, 'U'), @IndexNameUnquoted, 'IndexId')
                       )
                  OR

                EXISTS (SELECT *
                          FROM sys.extended_properties    ep
                     LEFT JOIN sys.key_constraints        kc
                            ON ep.major_id = kc.parent_object_id
                           AND ep.minor_id = 0   
                         WHERE ep.class_desc = 'OBJECT_OR_COLUMN'
                           AND ep.name       = @ExtendedPropertyName
                           AND ep.major_id   = OBJECT_ID(@SchemaNameUnquoted + '.' + @IndexNameUnquoted, 'PK')
                       )
        BEGIN    
            -- Update any existing one. 
            EXEC @Error = sys.sp_updateextendedproperty 
                @name       = @ExtendedPropertyName, 
                @value      = @ExtendedPropertyValue,
                @level0type = 'SCHEMA',    @level0name = @SchemaNameUnquoted, 
                @level1type = 'TABLE',     @level1name = @TableNameUnquoted,
                @level2type = @Level2Type, @level2name = @IndexNameUnquoted

            IF @Error = 0
            BEGIN
                SET @Msg = @SchemaNameUnquoted + '.' + @TableNameUnquoted + '.' + @IndexNameUnquoted
                RAISERROR('   Updated:   %-40s = %s', 10, 1, @ExtendedPropertyName, @ExtendedPropertyValue) WITH NOWAIT
                SET @UpdatedCount += 1
            END
            ELSE
            BEGIN
                SET @Msg = ISNULL(ERROR_MESSAGE(), '{ No error }')
                RAISERROR(@Msg, 16, 1) WITH NOWAIT, LOG
            END
        END
        ELSE
        BEGIN
            -- Write extended property. 
            EXEC @Error = sys.sp_addextendedproperty 
                @name       = @ExtendedPropertyName, 
                @value      = @ExtendedPropertyValue,
                @level0type = 'SCHEMA',    @level0name = @SchemaNameUnquoted, 
                @level1type = 'TABLE',     @level1name = @TableNameUnquoted,
                @level2type = @Level2Type, @level2name = @IndexNameUnquoted

            IF @Error = 0
            BEGIN
                SET @Msg = @SchemaNameUnquoted + '.' + @TableNameUnquoted + '.' + @IndexNameUnquoted
                RAISERROR('   Added:     %-40s = %s', 10, 1, @ExtendedPropertyName, @ExtendedPropertyValue) WITH NOWAIT
                SET @AddedCount += 1
            END
            ELSE
            BEGIN
                SET @Msg = ISNULL(ERROR_MESSAGE(), '{ No error }')
                RAISERROR(@Msg, 16, 42) WITH NOWAIT, LOG
            END
        END
        
        -- Done with this extended property. 
        COMMIT
    END TRY
    BEGIN CATCH
        -- MS bug: if sp_addextendedproperty or sp_dropextendedproperty throw an exception, they 
        -- leave a transaction open.  You can verify this by examining the @@TRANCOUNT value. 
        SET @Msg  = '   *** ERROR ' + CAST(@@ERROR AS NVARCHAR) + ': ' + ISNULL(ERROR_MESSAGE(), '{ No error }') + CHAR(10)
        SET @Msg += '   The @@TRANCOUNT is ' + CAST(@@TRANCOUNT AS NVARCHAR) + CHAR(10) + '   '
        
        -- If  1, the transaction is committable. 
        -- If -1, the transaction is uncommittable and should be rolled back. 
        -- If  0, there is no transaction and a commit or rollback operation would generate an error. 
        IF XACT_STATE() = -1 
        BEGIN
            SET @Msg += 'The transaction is in an uncommittable state.  Rolling back all transactions.'
            ROLLBACK
        END
        ELSE IF XACT_STATE() = 1 
        BEGIN
            SET @Msg += 'The transaction is committable.  Committing transaction.'
            COMMIT 
        END 

        -- If this is a deadlock, the rest of the indexes on this table will probably deadlock too. 
        -- Skip to the next table. 
        IF ERROR_NUMBER() = 1205
        BEGIN
            WHILE @@FETCH_STATUS = 0
            BEGIN
                FETCH NEXT FROM cur
                 INTO @SchemaName, @TableName, @IndexName, @ProcedureName, @PlanUseCount, @ExtendedPropertyValue, @Level2Type
            
                IF QUOTENAME(@TableNameUnquoted) != @TableName  BREAK
                RAISERROR('   Skipping %s.%s.%s to get to the next table', 10, 1, @SchemaName, @TableName, @IndexName)
            END
        END
        ELSE
        BEGIN
            -- If it wasn't a deadlock, something else bad must have happened, so we don't want to 
            -- continue - this will cause an abort that displays the error message. 
            RAISERROR(@Msg, 16, 1) WITH NOWAIT, LOG
        END
    END CATCH

    -- Next row. 
    FETCH NEXT FROM cur
     INTO @SchemaName, @TableName, @IndexName, @ProcedureName, @PlanUseCount, @ExtendedPropertyValue, @Level2Type
END

-- Close and deallocate the cursor. 
CLOSE cur
DEALLOCATE cur

RAISERROR(' ',    10, 1) WITH NOWAIT
RAISERROR('Done - Extended properties: added %d new, updated %d existing, %d unchanged', 10, 1, @AddedCount, @UpdatedCount, @UnchangedCount) WITH NOWAIT, LOG

GO
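Once the run finishes, one quick way to spot-check what it wrote is to query sys.extended_properties for index-level properties. A minimal sketch; adjust the filter on ep.name to match whatever you passed in @ExtendedPropertyName:

-- A minimal sketch: review the index-level extended properties in the current database.
SELECT OBJECT_SCHEMA_NAME(ep.major_id)   AS 'Schema Name',
       OBJECT_NAME(ep.major_id)          AS 'Table Name',
       i.name                            AS 'Index Name',
       ep.name                           AS 'EP Name',
       ep.value                          AS 'EP Value'
  FROM sys.extended_properties   ep
  JOIN sys.indexes               i
    ON ep.major_id = i.object_id
   AND ep.minor_id = i.index_id
 WHERE ep.class_desc = 'INDEX'
 ORDER BY OBJECT_SCHEMA_NAME(ep.major_id), OBJECT_NAME(ep.major_id), i.name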

Saturday, May 14, 2011

Get Read-to-Write Ratio for All Indexes

Here's a quick script to help you quantify how useful your indexes are, in the sense of how much they're used to speed up queries, compared to how much work it is for SQL Server to maintain them during inserts, updates, and deletes. This only reflects the usage since the last SQL Server boot.
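Since those counters reset at every restart, it's worth checking how long the instance has actually been up before trusting the numbers. A minimal sketch, using the fact that tempdb is recreated at startup:

-- A minimal sketch: tempdb is recreated at startup, so its create_date tells us how
-- long the usage counters have been accumulating.
SELECT create_date                              AS 'Instance Started',
       DATEDIFF(HOUR, create_date, GETDATE())   AS 'Hours of Usage Data'
  FROM sys.databases
 WHERE name = 'tempdb'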

-- Displays the read-to-update ratio of all indexes.  Good for finding ones that may not be needed.
DECLARE @dbid INT = DB_ID(DB_NAME())

; WITH cteUser AS
(
    SELECT object_id,
           index_id,
           user_seeks + user_scans + user_lookups   AS 'User Reads',
           user_updates                             AS 'User Updates'
      FROM sys.dm_db_index_usage_stats   
     WHERE database_id = @dbid
)
,
cteSystem AS
(
    SELECT object_id,
           index_id,
           system_seeks + system_scans + system_lookups   AS 'System Reads',
           system_updates                                 AS 'System Updates'
      FROM sys.dm_db_index_usage_stats   
     WHERE database_id = @dbid
)
,
cteTotal AS
(
    SELECT u.object_id,
           u.index_id,
           [User Reads]   + [System Reads]     AS 'Total Reads',
           [User Updates] + [System Updates]   AS 'Total Updates'
      FROM cteUser     u
      JOIN cteSystem   s
        ON u.object_id = s.object_id
       AND u.index_id  = s.index_id
)
,
cteReadToUpdateRatio AS
(
    SELECT object_id,
           index_id,
           CONVERT(NVARCHAR,
                  CONVERT(MONEY, ISNULL(
                                        CAST([Total Reads] AS REAL)
                                        /
                                        NULLIF([Total Updates], 0.0)
                                        , [Total Reads]
                                        )
                               , 1
                         )
                  )   AS 'Read-to-Update Ratio'
                  
      FROM cteTotal
)
SELECT OBJECT_SCHEMA_NAME(i.object_id)                                                                  AS 'Schema Name',
       OBJECT_NAME(i.object_id)                                                                         AS 'Table Name',
       i.name                                                                                           AS 'Index Name',
       REPLACE(CONVERT(NVARCHAR, CONVERT(MONEY, u.[User Reads]),                      1), '.00', '')    AS 'User Reads',                   
       REPLACE(CONVERT(NVARCHAR, CONVERT(MONEY, u.[User Updates]),                    1), '.00', '')    AS 'User Updates',                 
       REPLACE(CONVERT(NVARCHAR, CONVERT(MONEY, s.[System Reads]),                    1), '.00', '')    AS 'System Reads',                 
       REPLACE(CONVERT(NVARCHAR, CONVERT(MONEY, s.[System Updates]),                  1), '.00', '')    AS 'System Updates',               
       REPLACE(CONVERT(NVARCHAR, CONVERT(MONEY, t.[Total Reads]),                     1), '.00', '')    AS 'Total Reads',                
       REPLACE(CONVERT(NVARCHAR, CONVERT(MONEY, t.[Total Updates]),                   1), '.00', '')    AS 'Total Updates',                
                                                r.[Read-to-Update Ratio]                                AS 'Read-to-Update Ratio',         
       REPLACE(CONVERT(NVARCHAR, CONVERT(MONEY, t.[Total Reads] + t.[Total Updates]), 1), '.00', '')    AS 'Total Reads + Total Updates'
  FROM cteUser                        u
  JOIN cteSystem                      s
    ON u.object_id = s.object_id
   AND u.index_id  = s.index_id
  JOIN cteTotal                       t
    ON u.object_id = t.object_id
   AND u.index_id  = t.index_id
  JOIN cteReadToUpdateRatio           r
    ON u.object_id = r.object_id
   AND u.index_id  = r.index_id
  JOIN sys.indexes                    i
    ON u.object_id = i.object_id
   AND u.index_id  = i.index_id
  JOIN sys.objects                    o
    ON u.object_id = o.object_id
 WHERE OBJECTPROPERTY(o.object_id, 'IsUserTable') = 1
   AND i.is_primary_key = 0
 ORDER BY CAST(r.[Read-to-Update Ratio] AS REAL)

The results appear with the "least useful" indexes first. Be sure to read and test this script before using it in a Production environment.

Sunday, May 8, 2011

Reclaim Space using DBCC CLEANTABLE

Sweep

Image by The Real Estreya via Flickr

I just read Pradeep Adiga's excellent post on using DBCC CLEANTABLE to reclaim space left over after a column with a variable length type has been dropped from a table.

According to Books Online, the variable-length data types are VARCHAR, NVARCHAR, VARCHAR(MAX), NVARCHAR(MAX), VARBINARY, VARBINARY(MAX), TEXT, NTEXT, IMAGE, SQL_VARIANT, and XML. If you drop a column of one of these types from a table, SQL Server doesn't release the space it occupied; I imagine this is a speed optimization.
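If you're curious whether a particular table is actually carrying dead space, a quick before-and-after check with sp_spaceused will show it. A minimal sketch, using a hypothetical table name:

-- A minimal sketch: compare the space figures before and after DBCC CLEANTABLE.
-- dbo.MyBigTable is a hypothetical table that recently had a variable-length column dropped.
EXEC sp_spaceused 'dbo.MyBigTable'

DBCC CLEANTABLE(0, 'dbo.MyBigTable') WITH NO_INFOMSGS

EXEC sp_spaceused 'dbo.MyBigTable'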

Thus inspired, here's a little script to run DBCC CLEANTABLE on all the user tables in the current database. You might be thinking, "How often does a column get dropped from a table, really?" The answer of course is, "It depends." On one huge (and abused) database I ran this script on, I gained 8 GB of space!

-----------------------------------------------------------------------------
-- Reclaim space from dropped variable length columns and text columns.      
-----------------------------------------------------------------------------

RAISERROR('Reclaiming space from dropped variable length columns and text columns...', 10, 1) WITH NOWAIT

DECLARE @TableName  sysname       = ''
DECLARE @SchemaName sysname       = ''

DECLARE cur CURSOR FOR 
    SELECT SCHEMA_NAME(schema_id)   AS 'SchemaName' 
         , name                     AS 'TableName'
      FROM sys.tables
     WHERE is_ms_shipped = 0
       AND type_desc     = 'USER_TABLE'
  ORDER BY 'SchemaName', 'TableName'

OPEN cur
FETCH NEXT FROM cur INTO @SchemaName, @TableName        

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @TableName = @SchemaName + '.' + @TableName
    RAISERROR('   Examining table %s', 10, 1, @TableName) WITH NOWAIT
    DBCC CLEANTABLE(0, @TableName, 100000) WITH NO_INFOMSGS
    FETCH NEXT FROM cur INTO @SchemaName, @TableName        
END

CLOSE cur
DEALLOCATE cur

RAISERROR('Done.', 10, 1) WITH NOWAIT

Note that we specify a "batch_size" value of 100,000 rows per transaction to keep the logfile small while processing, and to keep the table locks (!) short. The appropriate value for a given system will require some experimentation: on mine, it stopped getting faster above 100,000 rows. Also, BOL points out that this script does not need to be run on a regular basis, only when a variable-length or text column is dropped; how often that actually happens depends on your environment.

Monday, April 25, 2011

How to Hide Metadata Information Using Extended Properties

Ok, a more honest title for this post might be, 'How to Clean Up the Mess After You've Hosed the Extended Properties on Primary Key Constraints', but that wouldn't fit.

What I was trying to accomplish is a topic for another post. But while grumbling loudly as I wrote the code to fix what I had inadvertently done, I realized it could be used to "hide" information about tables. I say "hide", in quotes, because anyone with access to Books Online could find it easily. This is really just a way of storing information about tables, and not having it appear in SQL Server Management Studio (SSMS).

What I ran into involves the difference between primary key constraints and primary key indexes. The most important distinction between them is "existence"; there's no such thing as a primary key index. If you're confused, keep reading. If you've heard this story already, skip ahead a bit. Simply put, when someone is talking about "primary key indexes", they're using verbal shorthand for, "the index that the primary key constraint uses to enforce uniqueness." (As far as I know, this kind of index has no specific name, which is good, because if it did, we'd all confuse it with "primary key constraint", so we'd be in the same boat we are now, but without the ability to abruptly clarify things by saying, "but there's no such thing as a whatever-it-is." We should count our blessings.) To sum up, when it comes to primary keys, we have a "primary key constraint", and a unique index of some sort to help it. (We won't be getting into the whole clustered vs. non-clustered thing here.)
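If you want to see that pairing for yourself, the metadata spells it out: each primary key constraint records the index that backs it. A minimal sketch:

-- A minimal sketch: each primary key constraint points at the unique index that enforces it.
-- Note that the constraint and the index typically share the same name, which is part of the confusion.
SELECT kc.name        AS 'PK Constraint Name',
       i.name         AS 'Backing Index Name',
       i.type_desc    AS 'Index Type'
  FROM sys.key_constraints   kc
  JOIN sys.indexes           i
    ON kc.parent_object_id = i.object_id
   AND kc.unique_index_id  = i.index_id
 WHERE kc.type = 'PK'
 ORDER BY kc.name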

Now we're ready to talk about setting extended properties on primary key constraints. First we create a little test table:

CREATE TABLE MyTable
(
    MyPkColumn INT NOT NULL,
    CONSTRAINT PK_MyTable PRIMARY KEY (MyPkColumn) 
) 

And now we add an extended property to the PK_MyTable constraint, as so:

EXEC sys.sp_addextendedproperty 
    @name       = N'Extended property for PK_MyTable constraint'
  , @value      = N'This is the PK for MyTable'
  , @level0type = N'SCHEMA'
  , @level0name = N'dbo'
  , @level1type = N'TABLE'
  , @level1name = N'MyTable'
  , @level2type = N'CONSTRAINT'
  , @level2name = N'PK_MyTable'

Once added, this extended property will appear in SSMS, as expected. Oddly, it appears on the index-the-primary-key-constraint-uses, not on the constraint itself. (And no, I'm not suggesting you create tables in the master database.) So far, so good. But given that the "slot" for the index is already taken, and knowing what we do about the non-existence of "primary key indexes", what happens if we do this?

EXEC sys.sp_addextendedproperty 
    @name       = N'Secret extended property for PK_MyTable index'
  , @value      = N'This is the PK index for MyTable'
  , @level0type = N'SCHEMA'
  , @level0name = N'dbo'
  , @level1type = N'TABLE'
  , @level1name = N'MyTable'
  , @level2type = N'INDEX'
  , @level2name = N'PK_MyTable'

Ideally (I think anyway) this should generate an error: we're adding an extended property to what is really a CONSTRAINT, but setting @level2type to INDEX. What it actually does is create the extended property as if there really were such a thing as a "primary key index". Interestingly, but perhaps not surprisingly, this new extended property doesn't show up (at least not anywhere I can find) in SSMS. So how do we know it's there? We use the system views, of course!

-- Emit code to drop all extended properties on "primary key indexes". 

SELECT ep.major_id
     , ep.minor_id
     , s.name           AS 'Schema Name'
     , t.name           AS 'Table Name'
     , i.name           AS 'Index Name'
     , ep.name          AS 'EP Name'
     , ep.value         AS 'EP Value'
       
     , 'USE ' + DB_NAME() + '   ' +
       'EXEC sys.sp_dropextendedproperty '
            + '@name = '                                 +  '''' +  ep.name  + ''', '
            + '@level0type = ''SCHEMA'', @level0name = ' + quotename(s.name) + ', ' 
            + '@level1type = ''TABLE'',  @level1name = ' + quotename(t.name) + ', '
            + '@level2type = ''INDEX'',  @level2name = ' + quotename(i.name)   AS 'T-SQL'

  FROM sys.extended_properties    ep
  JOIN sys.indexes                i
    ON ep.major_id = i.object_id
   AND ep.minor_id = i.index_id
  JOIN sys.tables                 t
    ON i.object_id = t.object_id
  JOIN sys.schemas                s
    ON t.schema_id = s.schema_id
 WHERE ep.class_desc    = 'INDEX'
   AND i.is_primary_key = 1
 ORDER BY s.name, t.name, i.name, ep.name

The above displays the offending extended properties on the current database; the last column contains the code necessary to drop each one; for example:

EXEC sys.sp_dropextendedproperty 
    @name       = 'Secret extended property for PK_MyTable index', 
    @level0type = 'SCHEMA', 
    @level0name = [dbo], 
    @level1type = 'TABLE',  
    @level1name = [MyTable], 
    @level2type = 'INDEX',  
    @level2name = [PK_MyTable]
 


Thursday, April 7, 2011

Find and Kill All System-Named Statistics

I hate system-named objects.

Some of this is to be expected from someone like me, who, depending on the decade, could be called "persnickety", "perfectionist", "OCD", or "The Anal Retentive Chef." But there's also a very practical reason to name all your database objects.

If you're using a third-party tool to synch up database schemas, you may find that every time you copy a system-named object from one side of the synch to the other, an entirely new name is generated. For example, you might have a statistic object named dbo.Customers._WA_Sys_00000001_45544755 on the left-hand side of the synch, but when you tell the tool to copy it to the right-hand side, you get a statistic named dbo.Customers._WA_Sys_00000001_436BFEE3.

Well, that won't do.

Below is a stored procedure that locates and drops all system-named statistics. After you run it, just create a new statistic object with a name that follows your SQL nomenclature guidelines; there's a quick sketch of that after the procedure. (Whaddaya mean you don't have any? Get some!)

-- These SET options are recorded when the procedure is created. 
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO


-------------------------------------------------------------------------------
-- Drops all system-named statistics.                                          
-------------------------------------------------------------------------------

CREATE PROCEDURE dbo.usp_CleanupStatisticsWithSysNames
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @Cmd        NVARCHAR(MAX) = ''
    DECLARE @SchemaName sysname       = ''
    DECLARE @TableName  sysname       = ''
    DECLARE @StatName   sysname       = ''

    DECLARE cur CURSOR FOR
        SELECT m.name   AS SchemaName
             , t.name   AS TableName
             , s.name   AS StatName
          FROM sys.stats           s
          JOIN sys.stats_columns   sc
            ON s.object_id = sc.object_id
           AND s.stats_id  = sc.stats_id
          JOIN sys.tables          t
            ON s.object_id = t.object_id
          JOIN sys.schemas         m
            ON t.schema_id = m.schema_id
         WHERE (s.name LIKE '\_WA\_Sys\_%' ESCAPE '\'  OR  s.auto_created = 1)
           AND s.object_id > 100
      ORDER BY SchemaName
             , TableName
             , StatName

    OPEN cur
    FETCH NEXT FROM cur INTO @SchemaName, @TableName, @StatName

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @Cmd = 'DROP STATISTICS ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName) + '.' + QUOTENAME(@StatName)
        RAISERROR(@Cmd, 10, 1) WITH NOWAIT
        EXEC sp_executesql @Cmd
        FETCH NEXT FROM cur INTO @SchemaName, @TableName, @StatName
    END

    CLOSE cur
    DEALLOCATE cur
END
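
As promised, a minimal sketch of recreating one of the dropped statistics with an explicit name; the table, column, and statistic names here are hypothetical:

-- A minimal sketch: recreate the statistic under a name that follows your nomenclature
-- guidelines.  All names below are hypothetical.
CREATE STATISTICS ST_Customers_LastName
    ON dbo.Customers (LastName)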

Rumor has it that the "WA" in the system-generated names stands for "Washington", as in "Redmond".

Tuesday, April 5, 2011

What About "Under-Used" Indexes?

We've all spent time looking for unused indexes (right?), but today my boss asked about under-used indexes. Now, I wrote a script a while ago to find completely unused indexes, but finding indexes that were only being used a little was just different enough to require a new script.

The script below displays all indexes, with the least-used (including unused) ones first, along with totals for user and system reads and updates. I figured having the ratio between reads and updates might be useful: if it takes more updates to maintain an index than the reads it serves, you might want to investigate its cost-benefit.

Some caveats:

  • If your server just rebooted, these numbers probably won't mean a lot; wait for at least a full day to allow time for the indexes to be read or written to.
  • Also, don't forget about periodic events like end-of-month processing. You don't want a phone call at 12:15 am on the first of the month when it turns out that index was needed after all.
  • Just because it costs more to maintain an index doesn't mean it's not still worth having. The cost of maintaining the index is probably spread out over a long period of time; the benefit to a waiting user of a query that takes two seconds instead of five minutes to run might outweigh that cost. It depends on your particular environment.
  • Note that indexes that support primary key constraints are excluded, as I'm pretty sure I'm not going to ever want to drop them!

Finally, consider disabling any under-used indexes you find, rather than just dropping them. If it turns out the index was important, you'll still have to rebuild the index, but at least the DDL code will be there already.
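A minimal sketch of that disable / re-enable cycle, using hypothetical names:

-- A minimal sketch: disable a suspect index rather than dropping it.
-- IX_Orders_CustomerId and dbo.Orders are hypothetical names.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders DISABLE

-- If it turns out the index was needed after all, rebuilding it re-enables it:
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD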

-- Display unused and under-used indexes. 

DECLARE @dbid INT = DB_ID(DB_NAME())

; WITH cteUser AS
(
    SELECT object_id,
           index_id,
           user_seeks + user_scans + user_lookups   AS 'User Reads',
           user_updates                             AS 'User Updates'
      FROM sys.dm_db_index_usage_stats   
     WHERE database_id = @dbid
)
,
cteSystem AS
(
    SELECT object_id,
           index_id,
           system_seeks + system_scans + system_lookups   AS 'System Reads',
           system_updates                                 AS 'System Updates'
      FROM sys.dm_db_index_usage_stats   
     WHERE database_id = @dbid
)
,
cteTotal AS
(
    SELECT u.object_id,
           u.index_id,
           [User Reads]   + [System Reads]     AS 'Total Reads',
           [User Updates] + [System Updates]   AS 'Total Updates'
      FROM cteUser     u
      JOIN cteSystem   s
        ON u.object_id = s.object_id
       AND u.index_id  = s.index_id
)
,
cteReadToUpdateRatio AS
(
    SELECT object_id,
           index_id,
           CONVERT(NVARCHAR,
                  CONVERT(MONEY, ISNULL(
                                        CAST([Total Reads] AS REAL)
                                        /
                                        NULLIF([Total Updates], 0.0)
                                        , [Total Reads]
                                        )
                               , 1
                         )
                  )   AS 'Read-to-Update Ratio'
                  
      FROM cteTotal
)
SELECT OBJECT_NAME(i.object_id)               AS 'Table Name',
       i.name                                 AS 'Index Name',
       u.[User Reads],
       u.[User Updates],
       s.[System Reads],
       s.[System Updates],
       t.[Total Reads],
       t.[Total Updates],
       r.[Read-to-Update Ratio],
       t.[Total Reads] + t.[Total Updates]    AS 'TOTAL READS + UPDATES'
  FROM cteUser                        u
  JOIN cteSystem                      s
    ON u.object_id = s.object_id
   AND u.index_id  = s.index_id
  JOIN cteTotal                       t
    ON u.object_id = t.object_id
   AND u.index_id  = t.index_id
  JOIN cteReadToUpdateRatio           r
    ON u.object_id = r.object_id
   AND u.index_id  = r.index_id
  JOIN sys.indexes                    i
    ON u.object_id = i.object_id
   AND u.index_id  = i.index_id
  JOIN sys.objects                    o
    ON u.object_id = o.object_id
 WHERE OBJECTPROPERTY(o.object_id, 'IsUserTable') = 1
   AND i.is_primary_key = 0
   --AND (t.[Total Reads] = 0                                  -- These definitely need to be dropped. 
   -- OR CAST(r.[Read-to-Update Ratio] AS REAL) < 1.0)         -- These may need to be dropped. 
 ORDER BY CAST(r.[Read-to-Update Ratio] AS REAL)

Friday, April 1, 2011

ISNULL or COALESCE? Sometimes it Really Does Matter

After seeing some poor performance in a query that I couldn't figure out, I ended up reading a lot of discussions about the "ISNULL / COALESCE" debate. Some people feel one is faster than the other, and some people like COALESCE because it's ANSI standard. (And they hold these opinions very strongly.)

I always thought the two were identical (when just two values are involved, of course), but it turns out that they have a difference that can affect performance. The value that ISNULL returns is typed to be the same as the type of the first argument. The value that COALESCE returns is typed to be the same as the argument with the highest data type precedence.
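A minimal sketch that makes the difference visible:

-- A minimal sketch: ISNULL types its result as the first argument (VARCHAR(5) here),
-- so the replacement value gets truncated; COALESCE does not.
DECLARE @Short VARCHAR(5) = NULL

SELECT ISNULL(@Short,   'abcdefghij')   AS 'ISNULL Result',     -- 'abcde'
       COALESCE(@Short, 'abcdefghij')   AS 'COALESCE Result'    -- 'abcdefghij'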

Adam Machanic says: "What does this have to do with query performance? Sometimes, when using ISNULL or COALESCE as part of a predicate, a user may end up with a data type mismatch that is not implicitly convertable and which therefore causes a table scan or other less-than-ideal access method to be used."

Data type mismatches can cause table scans? I had totally forgotten about that. Makes sense though: when the types don't match, SQL Server has to CONVERT the column in the predicate, which rules out an index seek and means every row has to be evaluated.

That's what was causing the poor performance I was seeing. I wouldn't advocate a global search-and-replace, but I will be looking more closely at the COALESCE statements I come across.

Monday, February 7, 2011

Monitor the Progress of Long-Running Events

I never knew you could monitor the progress of long-running processes like DBCC CHECKDB, backups, shrinking files, rebuilding indexes, etc. until I read an offhand comment by Kalen Delaney (which I can't find now). Now I keep this script open in an SSMS window about 90% of my day!

-- Displays the progress of several kinds of commands.  See BOL.
; with cte1 as
(
    select command                              as command,
           percent_complete   /  100.0          as percent_complete, 
           total_elapsed_time / 1000.0 / 60.0   as elapsed_minutes
      from sys.dm_exec_requests
     where percent_complete > 0.0
)
, cte2 as
(
    select command                                                               as command,
           cast(percent_complete * 100.0 as float(6))                            as percent_complete,
           cast(elapsed_minutes as int)                                          as elapsed_minutes,
           cast((elapsed_minutes / percent_complete) - elapsed_minutes as int)   as remaining_minutes
      from cte1
)
select command                                                           as 'Command',
       percent_complete                                                  as '% Complete', 
       elapsed_minutes                                                   as 'Elap Min',
       remaining_minutes                                                 as 'Left Mins',
       cast(dateadd(minute, remaining_minutes, getdate()) as nvarchar)   as 'ETA'
  from cte2
 order by percent_complete desc

The output looks like this:

Command             % Complete    Elap Min    Left Mins    ETA
DbccFilesCompact      31.62873          15           33    Feb  6 2011 11:50PM

Monday, January 31, 2011

Removing All Duplicate Tabs, Linefeeds, Returns, and Spaces

The trace tables created by SQL Profiler are nice and all, but it's a pain to search through the TextData column.  The code below uses a brute force method to replace all contiguous whitespace characters with a single space.  This makes it possible to search the TextData column without differences in whitespace getting in the way.

    -- Removing All Duplicate Tabs, Linefeeds, Returns, and Spaces. 

    DECLARE @RowsAffected INT = 0
    SET NOCOUNT ON
    
    -- Create all 16 (4^2) Tab, Linefeed, Return, and Space combinations.
    DECLARE @PairTT VARCHAR(2) = CHAR( 9) + CHAR( 9)    
    DECLARE @PairTL VARCHAR(2) = CHAR( 9) + CHAR(10)    
    DECLARE @PairTR VARCHAR(2) = CHAR( 9) + CHAR(13)    
    DECLARE @PairTS VARCHAR(2) = CHAR( 9) + CHAR(32)    
    DECLARE @PairLT VARCHAR(2) = CHAR(10) + CHAR( 9)    
    DECLARE @PairLL VARCHAR(2) = CHAR(10) + CHAR(10)    
    DECLARE @PairLR VARCHAR(2) = CHAR(10) + CHAR(13)    
    DECLARE @PairLS VARCHAR(2) = CHAR(10) + CHAR(32)    
    DECLARE @PairRT VARCHAR(2) = CHAR(13) + CHAR( 9)
    DECLARE @PairRL VARCHAR(2) = CHAR(13) + CHAR(10)
    DECLARE @PairRR VARCHAR(2) = CHAR(13) + CHAR(13)
    DECLARE @PairRS VARCHAR(2) = CHAR(13) + CHAR(32)
    DECLARE @PairST VARCHAR(2) = CHAR(32) + CHAR( 9)
    DECLARE @PairSL VARCHAR(2) = CHAR(32) + CHAR(10)
    DECLARE @PairSR VARCHAR(2) = CHAR(32) + CHAR(13)
    DECLARE @PairSS VARCHAR(2) = CHAR(32) + CHAR(32)
                                                    
    WHILE 1=1
    BEGIN
        SELECT @RowsAffected = COUNT(*) 
          FROM dbo.TraceTableReportPermanent
         WHERE TextData LIKE '%' + @PairTT + '%'
            OR TextData LIKE '%' + @PairTL + '%'
            OR TextData LIKE '%' + @PairTR + '%'
            OR TextData LIKE '%' + @PairTS + '%'
            OR TextData LIKE '%' + @PairLT + '%'
            OR TextData LIKE '%' + @PairLL + '%'
            OR TextData LIKE '%' + @PairLR + '%'
            OR TextData LIKE '%' + @PairLS + '%'
            OR TextData LIKE '%' + @PairRT + '%'
            OR TextData LIKE '%' + @PairRL + '%'
            OR TextData LIKE '%' + @PairRR + '%'
            OR TextData LIKE '%' + @PairRS + '%'
            OR TextData LIKE '%' + @PairST + '%'
            OR TextData LIKE '%' + @PairSL + '%'
            OR TextData LIKE '%' + @PairSR + '%'
            OR TextData LIKE '%' + @PairSS + '%'

        IF @RowsAffected > 0
            UPDATE dbo.TraceTableReportPermanent
               SET TextData = REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(
                              REPLACE(TextData, @PairTT, CHAR(32)),
                                                @PairTL, CHAR(32)), 
                                                @PairTR, CHAR(32)), 
                                                @PairTS, CHAR(32)),
                                                @PairLT, CHAR(32)),
                                                @PairLL, CHAR(32)),
                                                @PairLR, CHAR(32)),
                                                @PairLS, CHAR(32)),
                                                @PairRT, CHAR(32)),
                                                @PairRL, CHAR(32)),
                                                @PairRR, CHAR(32)),
                                                @PairRS, CHAR(32)),
                                                @PairST, CHAR(32)),
                                                @PairSL, CHAR(32)),
                                                @PairSR, CHAR(32)),
                                                @PairSS, CHAR(32))
             WHERE TextData LIKE '%' + @PairTT + '%'
                OR TextData LIKE '%' + @PairTL + '%'
                OR TextData LIKE '%' + @PairTR + '%'
                OR TextData LIKE '%' + @PairTS + '%'
                OR TextData LIKE '%' + @PairLT + '%'
                OR TextData LIKE '%' + @PairLL + '%'
                OR TextData LIKE '%' + @PairLR + '%'
                OR TextData LIKE '%' + @PairLS + '%'
                OR TextData LIKE '%' + @PairRT + '%'
                OR TextData LIKE '%' + @PairRL + '%'
                OR TextData LIKE '%' + @PairRR + '%'
                OR TextData LIKE '%' + @PairRS + '%'
                OR TextData LIKE '%' + @PairST + '%'
                OR TextData LIKE '%' + @PairSL + '%'
                OR TextData LIKE '%' + @PairSR + '%'
                OR TextData LIKE '%' + @PairSS + '%'
        ELSE
            BREAK
             
        RAISERROR('Collapsed whitespace pairs in %d rows of dbo.TraceTableReportPermanent',
                   10, 1, @RowsAffected) WITH NOWAIT
    END
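
Once the whitespace is collapsed, searching TextData becomes a simple LIKE. A minimal sketch; the search string is just an example:

-- A minimal sketch: with whitespace collapsed, formatting differences no longer hide matches.
SELECT TextData
  FROM dbo.TraceTableReportPermanent
 WHERE TextData LIKE '%FROM dbo.Customers%'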