SQL Banana: June 2015

In the last post (SQL Server Triggers – Which fields have been updated? – Part 1) we explored various methods to determine which columns in a table were being updated. All those methods relied on a SQL trigger being fired. Once fired, if a specific column in the table was updated we could carry out an appropriate action, such as inserting an audit record into another table. However, what if the UPDATE statement was executed against the table, but the actual values of the fields were not changed? Consider the following example:

CREATE TABLE MyTable (Col1 INT IDENTITY(1,1), Col2 varchar(10), Col3 INT)

CREATE TRIGGER dbo.tr_MyTable ON dbo.MyTable AFTER UPDATE

BEGIN

PRINT 'An Update has occurred'

END

INSERT INTO MyTable

SELECT 'a',7 UNION ALL

SELECT 'b',8 UNION ALL

SELECT 'c',9 UNION ALL

SELECT 'd',10

UPDATE MyTable

SET Col3 = 10 WHERE Col2 = 'd'

The UPDATE statement above has overwritten Col3 with the same value as it had before. The table data has not actually been changed. Nonetheless, because an UPDATE statement has been executed, the AFTER UPDATE Trigger is fired and will output the print statement. In fact even if the UPDATE statement had not affected a single row, the trigger would still have been fired, outputting the print statement.

This is often not the desired behaviour – I don’t want to be inserting audit records if nothing has actually changed. Depending on what logic has been implemented in the trigger and how often it is fired, this could have an adverse impact on performance. Therefore the first improvement we can make is to ensure nothing happens if no records have been updated. To do this we use the @@ROWCOUNT global variable (for more on this variable see my post on SQL Server Global Variables):

ALTER TRIGGER dbo.tr_MyTable ON dbo.MyTable AFTER UPDATE

BEGIN

IF @@ROWCOUNT = 0 RETURN;

PRINT 'An Update has occurred'

--Perform UPDATE logic here

END

This trigger will now immediately return, doing nothing, if no records have been updated.

Next we need ensure we only take action if field values have actually changed. To do this we can make use of the special inserted and deleted tables. These are temporary tables created and managed automatically by SQL Server. While we cannot perform any DDL operations on these tables, they contain identical columns to the original table the trigger is created on, and we can refer directly to these columns in the code. When a DML trigger fires, these two tables are populated with the deleted and inserted records from the underlying table respectively. If records are updated then the original pre-updated record arrives in the deleted table and the post-updated record in the inserted table. Therefore, if we want to know if previous values have actually been updated with new, different values we can compare the contents of these two tables:

ALTER TRIGGER dbo.tr_MyTable ON dbo.MyTable AFTER UPDATE

BEGIN

IF @@ROWCOUNT = 0 RETURN;

IF EXISTS( SELECT *

FROM inserted i

INNER JOIN deleted d

ON i.Col1 = d.Col1

WHERE d.col2 <> i.col2

OR d.col3 <> i.col3

)

BEGIN

PRINT 'An Update has occurred'

END

In the above trigger, inside the IF EXISTS function, we are joining the inserted and deleted tables together on Col1, the identity column. Because these tables are effectively the “before” and “after” state of the data, we are only interested where the values in the remaining fields do not match, indicating that data has changed in those fields. This is logic is carried out by the WHERE clause. Only if such rows exist do we then perform the desired UPDATE logic, in this case the PRINT statement.

The above trigger will now only do anything if data has actually changed in the underlying table. This has the potential of saving a lot of time and unnecessary processing of data on those occasions when an UPDATE has occurred but no overall change has actually taken place.

When updating a table it can often be very useful to know not just that an update has taken place, but exactly which columns have been updated. For example, in an HR department you may have an employee table that contains customer names and bank account numbers. If the bank account number for an employee changes, either through a front end application, a DML query or any other method, it should be audited – by which we mean an audit table should be inserted with a corresponding record. While if an employee middle name is changed, an audit is not required.

Below we’ll look at a number of ways for testing for specific column updates, and in Part 2 we’ll look at determining if values have indeed actually changed or not – they could have been updated with the same value as before.

UPDATE()

One way of determing which column has been updated is by using an AFTER UPDATE trigger. Within this trigger we could use the UPDATE() function to detect the column that has been updated and apply the audit logic accordingly:

CREATE TRIGGER tr_Employee ON Employee AFTER UPDATE

IF ( UPDATE (BankAccountNo))

BEGIN

INSERT INTO EmployeeAudit ( Column1, Column2,... ModifiedBy, ModifiedDate)

SELECT inserted.Column1, inserted.Column2,... USER_NAME(), GETDATE()

END

But what happens if we have additional columns, some requiring audit (such as SortCode, NationalInsuranceNo etc), and some not (eg marital status)? Or perhaps the non-sensitive columns don’t require audit but do require some other logic, eg an additional table to be amended in line with this one. We could append multiple IF UPDATE() statements:

IF UPDATE(BankAccountNo)

BEGIN

--apply audit logic

END

IF UPDATE (MaritalStatus)

BEGIN

--apply other logic

END

--next IF…

Or we could combine the groups of logic together into fewer statements:

IF (UPDATE(BankAccountNo) OR UPDATE(SortCode))

BEGIN

--apply audit logic

END

IF (UPDATE(MaritalStatus) OR UPDATE(MiddleName)

BEGIN

--apply other logic

END

COLUMNS_UPDATED()

If you like working with binary and bitwise logic, COLUMNS_UPDATED() can be used to achieve similar results and much less code. COLUMNS_UPDATED() returns a varbinary bit pattern indicating which columns in a table have been updated. From BOL:

COLUMNS_UPDATED returns one or more bytes that are ordered from left to right, with the least significant bit in each byte being the rightmost. The rightmost bit of the leftmost byte represents the first column in the table; the next bit to the left represents the second column, and so on. COLUMNS_UPDATED returns multiple bytes if the table on which the trigger is created contains more than eight columns, with the least significant byte being the leftmost.

So for a simple four column table, when COLUMNS_UPDATED() is used in the trigger it may return a binary value of 1010. This would indicate that columns two and four (as listed in syscolumns) have been updated, but one and three have not. We can therefore use the value returned by this function in our trigger to determine if a specific combination of columns in our table have been updated. However in order to do this we need to first convert the binary number into decimal.

As each column (right to left) represents an incremental increase in the power of 2 (ie, right most column is 2^0, next to the left is 2^1, then 2^2), a binary value of 1010 can be translated to decimal as:

Column Name	Col4	Col3	Col2	Col1
Column Position	4	3	2	1
COLUMNS_UPDATED()	1	0	1	0
Bin to dec conversion	2^3	2^2	2^1	2^0
Decimal value	8	0	2	0

Binary 1010 = ( 0 x 2^0) + (1 x 2^1) + (0 x 2^2) + (1 x 2^3) = 0 + 2 + 0 + 8 = 10

When working with binary we need to use SQL Server bitwise operators, in this case ‘&’ (Bitwise AND) to compare two values or we may get unexpected results. The ‘&’ function looks at each bit (that is each 1/0) in the first binary number and compares it to the corresponding bit in the second number. If the first bit in the first number is 1 AND the second bit in the second number is 1, then the first bit of the output is 1. If either bits are 0 then the output is zero. Then it moves onto the second bit in both numbers and starts again. The output number can be converted back to decimal again. In the first table below, the binary number 1010 is ‘AND-ed’ with itself and the result is shown. In the second table it is ‘AND-ed’ with a different number.

	Col4	Col3	Col2	Col1	Decimal
Binary Value 1	1	0	1	0	10
Binary Value 2	1	0	1	0	10
Output of ‘AND’	1	0	1	0	10

	Col4	Col3	Col2	Col1	Decimal
Binary Value 1	1	0	1	0	10
Binary Value 2	0	1	1	0	6
Output of ‘AND’	0	0	1	0	2

The above tables show that using the bitwise AND operator: 10 & 10 = 10, 10 & 6 = 2. Therefore to we can say that in our table if COLUMNS_UPDATED() = 10 we know that columns two and four have been updated.

The resulting trigger for this example would look like this:

CREATE TRIGGER tr_Employee ON Employee AFTER UPDATE

IF (COLUMNS_UPDATED() & 10 = 10)

BEGIN

--apply audit logic

END

Note that the IF statement above will return TRUE if columns two and four or more are updated, not just two and four alone.

Advantages of this approach are that for a very wide table, the number of IF statements, or alternatively the number of conditions within the IF statements could be huge and time consuming to write, while using COLUMNS_UPDATED() reduces it to just one short line. However, using this function will mean that the functionality of the code is heavily dependent on the order of the columns in the table. If the table is dropped and recreated with columns reordered, the trigger may no longer function as expected. In addition, while being concise, this code is more difficult to understand and debug. Therefore I would recommend the earlier approaches unless absolutely necessary.

In Part 2 we’ll look at how to analyse the records that have been updated and determine if any update has actually been made at all – an Update trigger can be fired even if no values have actually changed.

SQL Banana

Sunday 14 June 2015

SQL Server Triggers – Which fields have been updated? – Part 2

Monday 8 June 2015

SQL Server Triggers – Which fields have been updated? – Part 1