
Sunday, 14 June 2015

SQL Server Triggers – Which fields have been updated? – Part 2

In the last post (SQL Server Triggers – Which fields have been updated? – Part 1) we explored various methods to determine which columns in a table were being updated.  All those methods relied on a SQL trigger being fired.  Once fired, if a specific column in the table was updated we could carry out an appropriate action, such as inserting an audit record into another table.  However, what if the UPDATE statement was executed against the table, but the actual values of the fields were not changed?  Consider the following example:

CREATE TABLE MyTable (Col1 INT IDENTITY(1,1), Col2 varchar(10), Col3 INT)
GO

CREATE TRIGGER dbo.tr_MyTable ON dbo.MyTable  AFTER UPDATE
AS
BEGIN
       PRINT 'An Update has occurred'
END
GO

INSERT INTO MyTable
SELECT 'a',7 UNION ALL
SELECT 'b',8 UNION ALL
SELECT 'c',9 UNION ALL
SELECT 'd',10
GO

UPDATE MyTable
SET Col3 = 10 WHERE Col2 = 'd'

The UPDATE statement above has overwritten Col3 with the same value it already held; the table data has not actually changed.  Nonetheless, because an UPDATE statement was executed, the AFTER UPDATE trigger fires and outputs the print statement.  In fact, even if the UPDATE statement had not affected a single row, the trigger would still fire and output the print statement.
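For example, run an UPDATE whose WHERE clause matches no rows at all; the trigger still fires once for the statement:

```sql
-- No row has Col2 = 'z', so zero rows are affected,
-- yet the AFTER UPDATE trigger still fires and prints its message
UPDATE MyTable
SET Col3 = 99 WHERE Col2 = 'z'
```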

This is often not the desired behaviour – I don’t want to be inserting audit records if nothing has actually changed.  Depending on what logic has been implemented in the trigger and how often it is fired, this could have an adverse impact on performance.  Therefore the first improvement we can make is to ensure nothing happens if no records have been updated.  To do this we use the @@ROWCOUNT global variable (for more on this variable see my post on SQL Server Global Variables):

ALTER TRIGGER dbo.tr_MyTable ON dbo.MyTable  AFTER UPDATE
AS
BEGIN
       IF @@ROWCOUNT = 0 RETURN;

       PRINT 'An Update has occurred'
       --Perform UPDATE logic here
END
GO

This trigger will now immediately return, doing nothing, if no records have been updated.
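With the guard in place, an UPDATE that matches no rows now exits the trigger silently:

```sql
-- Zero rows affected, so @@ROWCOUNT is 0 as the trigger starts
-- and the trigger returns immediately without printing anything
UPDATE MyTable
SET Col3 = 99 WHERE Col2 = 'z'
```

Note that @@ROWCOUNT must be read in the very first statement of the trigger, before any other statement resets it.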

Next we need to ensure we only take action if field values have actually changed.  To do this we can make use of the special inserted and deleted tables.  These are temporary tables created and managed automatically by SQL Server.  While we cannot perform any DDL operations on them, they contain the same columns as the table the trigger is defined on, and we can refer to those columns directly in the code.  When a DML trigger fires, the inserted table is populated with the newly inserted records and the deleted table with the removed records.  For an UPDATE, the pre-update version of each row lands in the deleted table and the post-update version in the inserted table.  Therefore, if we want to know whether previous values have actually been replaced with new, different values, we can compare the contents of these two tables:

ALTER TRIGGER dbo.tr_MyTable ON dbo.MyTable  AFTER UPDATE
AS
BEGIN
       IF @@ROWCOUNT = 0 RETURN;

       IF EXISTS(    SELECT *
                     FROM          inserted i
                     INNER JOIN    deleted d
                             ON    i.Col1 = d.Col1
                     WHERE         d.Col2 <> i.Col2
                            OR     d.Col3 <> i.Col3
                )
       BEGIN
              PRINT 'An Update has occurred'
       END
END
GO

In the above trigger, inside the IF EXISTS check, we join the inserted and deleted tables together on Col1, the identity column.  Because these tables are effectively the “before” and “after” state of the data, we are only interested in rows where the values in the remaining fields do not match, indicating that data has changed in those fields.  This logic is carried out by the WHERE clause.  Only if such rows exist do we then perform the desired UPDATE logic, in this case the PRINT statement.


The above trigger will now only do anything if data has actually changed in the underlying table.  This can save a lot of time and unnecessary processing of data on those occasions when an UPDATE has occurred but no overall change has actually taken place.
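One caveat worth noting: the <> comparisons in the trigger evaluate to UNKNOWN when either side is NULL, so a change from NULL to a value (or back again) would slip through undetected.  If the columns are nullable, the WHERE clause needs NULL-safe comparisons; a sketch of how the EXISTS check could be extended:

```sql
-- NULL-safe change detection: also treat NULL -> value and value -> NULL as changes
IF EXISTS(    SELECT *
              FROM          inserted i
              INNER JOIN    deleted d
                      ON    i.Col1 = d.Col1
              WHERE         d.Col2 <> i.Col2
                     OR    (d.Col2 IS NULL     AND i.Col2 IS NOT NULL)
                     OR    (d.Col2 IS NOT NULL AND i.Col2 IS NULL)
                     OR     d.Col3 <> i.Col3
                     OR    (d.Col3 IS NULL     AND i.Col3 IS NOT NULL)
                     OR    (d.Col3 IS NOT NULL AND i.Col3 IS NULL)
         )
```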

Tuesday, 27 May 2014

SQL Server Parameter Sniffing - Slow Running Queries

Parameter sniffing is a problem that can occasionally creep into code just when everything seems fine. A stored procedure that ran nice and fast yesterday takes much longer today, then is fast again the next day, with nothing having changed on the server or database. It can be difficult to track down why.

The answer could be Parameter Sniffing.

When a stored procedure is compiled, it compiles using (or "sniffing") the parameter values supplied at the time of invocation. It uses those values to determine the optimal execution plan for the proc. Simply put, if you later call the proc with a different parameter value, it will still use the execution plan determined by the first set of values, regardless of whether a different plan would have been more appropriate for the new value.

For example, if I have a table containing the values from 0 to 99, and then 100 rows with the value 999:

/* Create table */
CREATE TABLE DummyData ( col1 INT )
DECLARE @Counter INT

/* Insert values 0-99 */
SET @Counter = 0
WHILE @Counter < 100
BEGIN
    INSERT INTO DummyData ( col1 ) VALUES ( @Counter )
    SET @Counter = @Counter + 1
END

/* Insert 100 rows with value 999 */
SET @Counter = 0
WHILE @Counter < 100
BEGIN
    INSERT INTO DummyData ( col1 ) VALUES ( 999 )
    SET @Counter = @Counter + 1
END



Next create a procedure to select values from this table:

CREATE PROC SelectDummyData ( @DummyValue INT )
AS
BEGIN
    SELECT * FROM DummyData WHERE col1 = @DummyValue
END

Now generate the Execution Plan for this procedure, passing a parameter of 999, and look at the "estimated number of rows":
EXEC SelectDummyData @DummyValue = 999


The row count is 100, which is what we would expect. But re-run the execute statement, this time with @DummyValue = 5, and the Estimated Number of Rows is still 100, when we would expect it to be 1.

The reason it is still expecting 100 rows is because the execution plan was determined when the proc was first compiled, when we used a value of 999.  It is still using the same execution plan, and therefore the same number of rows is expected, regardless of the input value supplied.

The impact of this oddity in this example is negligible, but for a complex query it can result in significantly slower performance.
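One way to observe this while testing: sp_recompile marks the procedure so that its next execution compiles a fresh plan, which then reflects the newly sniffed value:

```sql
-- Drop the cached plan; the next call compiles against its own parameter
EXEC sp_recompile 'SelectDummyData'

-- Re-running now produces a plan estimating 1 row rather than 100
EXEC SelectDummyData @DummyValue = 5
```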

If your query is complex it might be easier to generate the execution plan in XML ( SET SHOWPLAN_XML ON ) and then search the XML for something like "ParameterCompiledValue =".

To avoid parameter sniffing issues from occurring there are a number of options that can be considered:

1. Recompile the stored proc every time it is executed:
EXEC SelectDummyData @DummyValue =  1
WITH RECOMPILE 

This will ensure that the proc uses the supplied  parameters to build the best execution plan every time, but the overhead is the constant recompilation. This may be acceptable if there is sufficient gain in query performance.

Generating the execution plan on the above script, using various parameters, shows the correct number of rows being estimated each time.
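Rather than relying on every caller to remember WITH RECOMPILE, the hint can also be declared once in the procedure definition, so every execution recompiles automatically:

```sql
-- Recompile on every execution, regardless of how the proc is called
ALTER PROC SelectDummyData ( @DummyValue INT )
WITH RECOMPILE
AS
BEGIN
    SELECT * FROM DummyData WHERE col1 = @DummyValue
END
```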

If the parameter sniffing relates to a particular set of queries, then recompilation can be specified for individual queries instead:

SELECT * FROM DummyData WHERE col1 = @DummyValue
OPTION ( RECOMPILE )
2. Optimize the query/ies for a specific parameter value:


ALTER PROC SelectDummyData ( @DummyValue INT )
AS
BEGIN
    SELECT * FROM DummyData WHERE col1 = @DummyValue
    OPTION ( OPTIMIZE FOR ( @DummyValue = 1 ))
END

With this option we can force the query plan to be based on the same known value every time it is compiled (e.g. during maintenance tasks). This option may work if we know the optimal value for our system. The downside, however, is that the plan may still perform poorly for other values, but that may well be tolerable.

3. Avoid sniffing altogether, by using a local variable inside the proc:

ALTER PROC SelectDummyData ( @DummyValue INT )
AS
BEGIN
    DECLARE @LocalDummyValue INT
    SET @LocalDummyValue = @DummyValue
    SELECT * FROM DummyData WHERE col1 = @LocalDummyValue
END

Here there is no possibility of parameter sniffing, and the execution plan will be based on the table's statistics. Note, however, that the plan will be built using the statistics' density vector instead of the histogram, which is less accurate. Therefore it is not guaranteed to be the best plan for all possible values.
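From SQL Server 2008 onwards, the same density-based behaviour can be requested directly with a query hint, avoiding the local-variable workaround:

```sql
-- OPTIMIZE FOR UNKNOWN builds the plan from the density vector,
-- ignoring the sniffed parameter value, just like the local-variable approach
ALTER PROC SelectDummyData ( @DummyValue INT )
AS
BEGIN
    SELECT * FROM DummyData WHERE col1 = @DummyValue
    OPTION ( OPTIMIZE FOR UNKNOWN )
END
```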

Conclusion
Parameter sniffing has the potential to cause queries to run with widely varying performance. In order to mitigate this problem consider modifying the code based on the options presented above. Hopefully this will help keep the queries and stored procedures performing well wherever they are used.

More useful info on this topic is available here:

Degremont Michel
http://blogs.technet.com/b/mdegre/archive/2012/03/19/what-is-parameter-sniffing.aspx

Turgay Sahtiyan
http://blogs.msdn.com/b/turgays/archive/2013/09/10/parameter-sniffing-problem-and-workarounds.aspx