Saturday 8 December 2012

Reporting Services Report – Changing Column Names, Changing Table Names and sql PIVOT

The other day I was creating some SSRS reports. For each data feed in an ETL process, rejected rows were diverted into error tables, one error table per feed (e.g. error_feed1, error_feed2). As each feed was different, so too were the column names and metadata of the error tables. To allow users to review and correct these records I needed to build Reporting Services reports on each of the tables.


Initially it looked like I would need a different report for each feed: when you bind an SSRS report object to a dataset, the column names of the dataset have to remain constant or the report will fail, hence one report per error table. As I was dealing with dozens of feeds, the prospect of dozens of very similar reports was not appealing.

All these reports would be almost identical; the only differences were the column names and the table names. I was sure there must be an easier way. I googled around and found several useful suggestions. Generally they followed the idea of unpivoting the columns into rows in the dataset and then using an SSRS matrix object. For example:
error_feed1

Id  Col1  Col2  Col3  Col4
--  ----  ----  ----  ----
1   W     X     Y     Z

Would become:

ID  measure  value
--  -------  -----
1   Col1     W
1   Col2     X
1   Col3     Y
1   Col4     Z

Using a matrix you would put the ID column on the rows, the measure column on the columns (the cross-tab section), and value in the data section of the matrix object. Now it wouldn't matter if the column names changed, or if new columns were added or old ones removed from the source table. The 3 columns output by the unpivot query would remain and the matrix report would adapt accordingly. The UNPIVOT command to do the above would look like this:

SELECT id, measure, value
FROM (SELECT id, col1, col2, col3, col4 FROM error_feed1) p
UNPIVOT (value FOR measure IN (col1, col2, col3, col4)) AS unpvt
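
To try the statement without touching a real error table, something like the following could be run in a scratch session. It is only a sketch: the temp table #error_feed1 and the VARCHAR(10) column types are assumptions made for the demo, chosen because UNPIVOT needs every unpivoted column to have the same data type and length.

```sql
-- Throwaway copy of the example table. All value columns share one type,
-- which UNPIVOT requires.
CREATE TABLE #error_feed1 (
    id   INT,
    col1 VARCHAR(10),
    col2 VARCHAR(10),
    col3 VARCHAR(10),
    col4 VARCHAR(10)
);

INSERT INTO #error_feed1 (id, col1, col2, col3, col4)
VALUES (1, 'W', 'X', 'Y', 'Z');

-- One output row per (source row, column) pair.
SELECT id, measure, value
FROM (SELECT id, col1, col2, col3, col4 FROM #error_feed1) p
UNPIVOT (value FOR measure IN (col1, col2, col3, col4)) AS unpvt;

DROP TABLE #error_feed1;
```

Note that UNPIVOT silently drops NULL values, so a column that is NULL for a given row produces no output row for that pair.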


The columns are now dynamic, which solves half the problem. But the query still uses specific column and table names, meaning this metadata needs to be known in advance and hardcoded into the SSRS dataset query. That brings us back to our original problem: we cannot hardcode these values because they are constantly changing.

To get around this I decided to have the dataset be the result of a stored procedure. That gives me greater flexibility in manipulating the data, so long as I return a result set to SSRS at the end, and always with the same column names.

The proc accepts one parameter, the feed name, to be supplied by the user running the report using a standard SSRS drop-down.

The proc itself makes use of the sys.objects and sys.columns catalog views to get the full list of columns for any given table:

SELECT      c.name
FROM        sys.columns c
INNER JOIN  sys.objects o
      ON    c.object_id = o.object_id
WHERE       o.type = 'U'
      AND   o.name = @TableName
ORDER BY    c.column_id

name
------------
id
col1
col2
col3
col4
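
The feed drop-down mentioned above could be populated from the same catalog views. The query below is a sketch of one way to do it, assuming every error table is a user table whose name starts with error_ (as in error_feed1 and error_feed2); the TableName alias is made up for the example.

```sql
-- Assumption: every error table is a user table named error_<feed>.
-- [_] escapes the underscore, which LIKE would otherwise treat as a
-- single-character wildcard.
SELECT   t.name AS TableName
FROM     sys.tables t
WHERE    t.name LIKE 'error[_]%'
ORDER BY t.name;
```

Using this as the "available values" query for the report parameter also means the user can only pick names that really exist as tables.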



Once the table name has been supplied (as a parameter in the SSRS report), dynamic SQL can be used to query the catalog views and build a string containing the required UNPIVOT statement, with all the relevant column names for the given table.
 
Once the string variable is populated with the SQL script it is executed, returning a result set of just 3 columns: the same 3 columns (ID, measure and value) regardless of the table being queried. The actual code of the proc is below:
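
The listing here is a minimal sketch of such a procedure, rebuilt from the description in this post rather than copied from the original. The procedure name usp_GetFeedErrors, the assumption that every error table has a key column called id, and the cast of every other column to NVARCHAR(4000) (so that UNPIVOT sees a single data type) are all choices made for illustration. It uses FOR XML PATH to concatenate the column names, which works on SQL Server 2005 and later.

```sql
-- Sketch only: the procedure name, the id key column and the
-- NVARCHAR(4000) cast are assumptions, not the original code.
CREATE PROCEDURE dbo.usp_GetFeedErrors
    @TableName SYSNAME              -- e.g. 'error_feed1', from the SSRS drop-down
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @castList NVARCHAR(MAX),  -- CAST([col1] AS NVARCHAR(4000)) AS [col1], ...
            @colList  NVARCHAR(MAX),  -- [col1], [col2], ...
            @sql      NVARCHAR(MAX);

    -- Build both lists from the catalog views, skipping the id column.
    SELECT @castList = STUFF((
               SELECT ', CAST(' + QUOTENAME(c.name)
                      + ' AS NVARCHAR(4000)) AS ' + QUOTENAME(c.name)
               FROM   sys.columns c
               INNER JOIN sys.objects o ON c.object_id = o.object_id
               WHERE  o.type = 'U' AND o.name = @TableName AND c.name <> 'id'
               ORDER BY c.column_id
               FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, ''),
           @colList = STUFF((
               SELECT ', ' + QUOTENAME(c.name)
               FROM   sys.columns c
               INNER JOIN sys.objects o ON c.object_id = o.object_id
               WHERE  o.type = 'U' AND o.name = @TableName AND c.name <> 'id'
               ORDER BY c.column_id
               FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '');

    -- Unknown table (or no columns besides id): return an empty result set
    -- with the usual three columns so the report's field list still matches.
    IF @colList IS NULL
    BEGIN
        SELECT CAST(NULL AS INT)            AS id,
               CAST(NULL AS SYSNAME)        AS measure,
               CAST(NULL AS NVARCHAR(4000)) AS value
        WHERE  1 = 0;
        RETURN;
    END

    -- QUOTENAME guards the table and column names spliced into the string.
    SET @sql = N'SELECT id, measure, value'
             + N' FROM (SELECT id, ' + @castList
             + N' FROM ' + QUOTENAME(@TableName) + N') p'
             + N' UNPIVOT (value FOR measure IN (' + @colList + N')) AS unpvt;';

    EXEC sp_executesql @sql;
END
```

It could then be called with, for example, EXEC dbo.usp_GetFeedErrors @TableName = 'error_feed1'. Because the column list is only ever built from names found in sys.columns, a parameter value that does not match a real table produces the empty result set rather than arbitrary SQL.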




This result set is all that the SSRS dataset would ever see, and the column names would always be the same 3 named columns, regardless of the feed selected by the user. Setting up the SSRS matrix object in the manner suggested above would then display the contents of the table as normal, effectively doing a PIVOT to counter the UNPIVOT done in the stored procedure.

We now need only a single SSRS report to display data from any database table the user selects from the feed list in the parameter drop-down. That is much simpler than dozens of different reports, or dozens of datasets and playing with visibility settings.