Auditing Your Database Changes

SQL Server can capture Database level Trigger information using XML.¬† I personally use this on my home PC’s because

  1. I’m poor and can’t afford one of those amazing DB Source control products that are out there on the market.
  2. I’m forgetful and always forget to check in¬† Source Control data.
  3. I don’t trust anyone else to remember either ūüôā

So in an attempt to track what i do on my databases I have implemented the following.

I create a Database called Logs (imaginative) that  I place on all my instances of SQL Server.  Within this database I create all those fun DBA type tables that mean during meetings you can casually say, all our indexes are optimized, or we have never changed that.  This Audit Table is one of them.
CREATE TABLE [dbo].[SchemaEventLog](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[DatabaseName] [varchar](120) NOT NULL,
[EventType] [varchar](120) NOT NULL,
[SchemaName] [varchar](120) NOT NULL,
[ObjectName] [varchar](120) NOT NULL,
[ChangeDate] [datetime] NOT NULL,
[ChangeUser] [varchar](120) NOT NULL,
[CommandText] [varchar](max) NULL,


I then run in the model database several scripts all based on the following:

CREATE trigger [change_Index] on database
for create_Index, alter_Index, drop_index
set nocount on
declare @data xml
set @data = EVENTDATA()

insert into Logs.dbo.SchemaEventLog(DatabaseName,eventType,schemaName, ObjectName, changeUser, CommandText)

values(@data.value(‘(/EVENT_INSTANCE/DatabaseName)[1]’, ‘varchar(120)’)
, @data.value(‘(/EVENT_INSTANCE/EventType)[1]’, ‘varchar(120)’)
, @data.value(‘(/EVENT_INSTANCE/SchemaName)[1]’, ‘varchar(120)’)
, @data.value(‘(/EVENT_INSTANCE/ObjectName)[1]’, ‘varchar(120)’)
, @data.value(‘(/EVENT_INSTANCE/LoginName)[1]’, ‘varchar(120)’)
, @data.value(‘(/EVENT_INSTANCE/TSQLCommand)[1]’, ‘varchar(max)’) )

These capture and record in the Logs db all changes to indexes, tables, stored procedures and functions.  When I create a new database as the triggers are placed in the new database due to them being in the model database.

The Trigger makes a call to the EVENTDATA() for the action this returns an XML Blob like


The information that I’m interested in is taken via the .value xml command and placed in the audit table using a normal insert.¬† you could take all the information from the xml but I Like to limit it to a generic set.

It is possibly to extend the functionality of the triggers to include other object modifications but i normally find the key ones to use are tables, indexes, stored procedures and functions but this can be extended beyond these to almost any object.

There is a clean up script that I created that removes any entries that are older than 1 month (3 months/a year etc. depending on the DB use)  but this is a simple delete mechanism, run off a daily SQL Agent maintenance Job using the date predicate in the table.

Posted in Uncategorized | Leave a comment

Recursive XML branches

One of the main features of XML is that it handles hierarchical data structures.  I agree that so does SQL either using the hierarchy variable or creating a Recursive CTE.  but nothing compares to the simplicity  of the XML format to show data.  the following snippet of XML is a Geographical  script I occasionally use.  The Highest level of the Tree is the Country Data,  Then branched off this is Region information and bellow this is the town/village information.
The position in the tree is pivotal to the Area that it describes. however the data that each level is related to the branch above.  i.e. Cardiff is in Wales.

<Areas AreaName=”Wales”>

<Area AreaName=”South Wales” GridRef=”123994″>

<Area AreaName=”Cardiff” GridRef=”332233″/>

<Area AreaName=”Pontypool” GridRef=”545676″/>

<Area AreaName=”Newport” GridRef=”123322″/>

<Area AreaName=”Cwmbran” GridRef=”123432″/>



Now this is only a small fragment of a much larger document (and made up data ).  but the examples should still work.
If I were to search for a Specific Town/City (specific to the location in the XML document) with an AreaName of ‘Cardiff” I could use

SELECT @XML.value(‘(/Areas/Area/Area[@AreaName =”Cardiff”]/@GridRef)[1]’,varchar(50)’)
This would specifically find the 2nd Area Level of Cardiff.¬† But if I didn’t care what level the area was at Town or Region or even country I could change the query to…

SELECT @XML.value(‘(//Area[@AreaName=”Cardiff”]/@GridRef)[1],’varchar(50)’)

This query will return the grid reference for the first Torfean Element that it encounters regardless of the level.

This is an expensive query and should be considered very carefully before being used in a production environment.

Posted in Uncategorized, XML | Leave a comment

Enforcing XML Document Structure

by default the xml variable supports a content structure of xml this is very useful if you are constructing an xml blob.¬† It works for typed and untyped code, and it has it’s place.

The following code shows an xml query that would work if the data passed the schema checks.

Content XML Query

This query should pass

Though the following query manadates the need for a single root node and would fail.

document xml image

this query will fail

in the case where you are processing data and you have the data laid out in a schema you might want to constrain the incoming information to a single document.  If two xml blobs are passed in to be processed but both blobs meet the schema criteria your dataload may fail.  or worse it could pass correctly, without warning you of issues.

The code below would pass as it fulfils the schema matching requirement and also has a single root node.

this query will pass

A very simple solution to what could be a awkward fix after the fact.

Posted in XML, XML Schemas | Tagged , | Leave a comment

Schema visibility using xml_schema_namespace

One of the issues I’ve faced is the constant referencing and rereferencing of XML Schemas that are stored against a database.¬† the xml_schema_namespace dmv helps with this.¬† it takes 2 parameters.¬† the schema of the database object XML Schema and Name of the XML Schema (and as I mentioned Schema too many times here is an example).


This returns a list of the contents of the schema, similar results can be returned by scripting CREATE AS … on the XML schema but this command returns purely the XML schema with out CREATE XML SCHEMA AS …. and can the query can be used in the snippet manager of your choice.

I extended this capability for my own purposes, using a cursor to retrieve each instance of an xml schema in the database (note: db specific command not server) and return it in one large xml blob. This is part of a longer term project that i am working on but i thought it might be useful to blog about this specific part of it.

--Author: Phil Quinn
--Date: 5th October 2012
--Contact: Twitter @SQLPippy
-- Email:
-- http:\\

DECLARE @AllSchemas varchar(max) =”
DECLARE @SchemaNameVar varchar(50)
DECLARE @XMLSchemaNameVar varchar(100)

DECLARE schema_cursor CURSOR FOR
SELECT AS SchemaName
, AS XMLSchemaName
FROM sys.xml_schema_collections XS
INNER JOIN sys.schemas S ON s.schema_id = XS.schema_id
WHERE <> ‘sys’
–get a list of all xml schema’s and including DB Schema name

OPEN schema_cursor

FETCH NEXT FROM schema_cursor
INTO @SchemaNameVar, @XMLSchemaNameVar

–use @AllSchemas as a varchar as the xml var does not have a + operator
–could use and insert variable but that will be version 2 ūüôā
SET @AllSchemas=@AllSchemas + CAST(xml_schema_namespace(@SchemaNameVar,@XMLSchemaNameVar) AS VARCHAR(max))
— Get the next Schema.
FETCH NEXT FROM schema_cursor
INTO @SchemaNameVar, @XMLSchemaNameVar
CLOSE schema_cursor;
DEALLOCATE schema_cursor;
–Close and deallocate cursor

–Transfer back from a varchar to a XML variable

Posted in SQL, XML, XML Schemas | Leave a comment

modifying data in XML a really clunky solution. (no I mean REALLY Clunky)

If you are trying to replace the value of an element that does not exist if can be problematic, i have not found a nice way to do this, if any one does please let me know.

So to get from

<blob><a href=””/><span>Microsoft</span></blob>


<a href=””>Microsoft</a>

I’ve used this piece of very clunky code.

DECLARE @xml XML = N'<blob><a href=""/><span>Microsoft</span></blob>'
SET @xml.modify('
insert <a href="empty">Empty</a> into (/blob)[1]')
SET @xml.modify('replace value of (/blob/a/text())[1] with (/blob/span/text())[1]')
SET @xml.modify('replace value of (/blob/a/@href)[2] with (/blob/a/@href)[1]')
SET @xml.modify('delete (/blob/a)[1]')
SET @xml.modify('delete (/blob/span)[1]')

Posted in XML | 2 Comments

being silly with spatial data

Okay, a fun post with no educational value (sort of). My Adopted Twin, also a Database geek, has been under the weather recently so to cheer her up I have been playing with spatial data, under her orders to create this game. After reading Purple Frogs spatial data blog ( I’ve created a simple game for her.

It is basically Rock, Paper, Scissors however I added the “famous” Big Bang Theory version ‘Lizzard’, ‘Spock’ to it as well ( and just when you thought it couldn’t get geekier).

So the break down of it is a random number generator that creates a number between 0 and 2 (or 4 with RPSLS) inclusive and then a vector line art pattern based on the selected number is created in to the Table variable and finally the selected picture is shown in the spatial tab.


(created as a doc file but should be opened up in SQL Server)


Posted in spatial data, SQL | 2 Comments

Schema validation using Reg Ex

One of the most powerful tools that XML in SQL Server brings to the table is Regular expresions. In XML Schemas Regular Expressions are an easy method to validate your data without having to bolt on complex CLR functions to you database.

Consider a UK Postcode they are constructed in a known pattern.

MK13 3XZ

The first part of a post code is either 1 or 2 letters in the range of a to z or (to be pedantic, as with databases you have to) A to Z.¬† This is followed by a number in the range of 1¬† to 99 or for use of regex’s 2 numbers in the range of 1 to 9.¬† The two parts of thepostcode should be seperated with a space.

The second part of the postcode is constructed of a single number in the range 0 to 9 and then two letters a to z or A to Z.

Admittedly there are other ways to use sql server to confirm that the data is a valid post code my favourite being running a query against all valid post codes but for a first parse a regEx could be the way forward.

A RegEx for a postcode could be “[a-zA-Z]{1,2}[0-9]{2} [0-9][a-zA-Z]{2}”

There are lots of articles on the web for the construction of RegEx’s and this is a small blog article so i won’t go into the deconstruction of the regEx here.

The regular expresion can be put into the XML schema in the following fashion

N'<xsd:schema xmlns:xsd="">
  <xsd:element name="Address">
        <xsd:element name="Postcode" minOccurs="0">
         <xsd:restriction base="xsd:string">
          <xsd:pattern value="[a-zA-Z]{1,2}[0-9]{2} [0-9][a-zA-Z]{2}"/>
SET @XML = N'<Address><Postcode>NP32 4Df</Postcode></Address>'

The above XML can be loaded into the Schema’d variable. and be validated. however the xml fragment of

SET @XML = N'<Address><Postcode>3NP2 4Df</Postcode></Address>'

will automatically be rejected.

In summary you can can use the xml regex to automatically validate data with a known sting format before the data is processed (i.e. in the loading stage of SQL Server).  This can be useful if you are trying to load large volumes of data from xml or you want to keep all you validation conditions in one area.

Posted in SQL, XML | Leave a comment

XML .value simple syntax

There are several ways to get .value to extract an item of information from xml.  in the following article I will show some of the syntax and reasons why some methods are better than others.

All the calls in this post will all reference the following blob.  This is a simple blob for demonstration purposes only.








The Following 3 methods of .value calls all work and will all bring back the same value (which is the first person element in the blob)

DECLARE @xml xml = N'<xmlblob><personlist><person>Phil</person><person>Stina</person></personlist></xmlblob>’

SELECT @xml.value(‘(xmlblob/personlist/person)[1]’,‘varchar(50)’)

¬†SELECT @xml.value(‘(xmlblob/personlist/person/text())[1]’,‘varchar(50)’)

¬†SELECT @xml.value(‘(xmlblob[1]/personlist[1]/person[1])’,‘varchar(50)’)

 The first query is potentially the kind you would want to have if you were using typed xml via a schema,  however in this case without a schema the second query is far better especially with reference to the Query plans generated. this is due to the /text() on the end of the element name that directs the xml to be processed as text.

100% to untyped xml, 0% for typed xml

The third option is interesting as it is the slowest to run under the above conditions and would normally be avoided however the interesting thing with this syntax is that when other elements are accessed the query is seemingly scaled.

The following queries look at the second element in the person list.

SELECT @xml.value(‘(xmlblob/personlist/person)[2]’,‘varchar(50)’)

¬†SELECT @xml.value(‘(xmlblob/personlist/person/text())[2]’,‘varchar(50)’)

¬†SELECT @xml.value(‘xmlblob[1]/personlist[1]/person[2]’,‘varchar(50)’)

  Still the second method remains the best with the third following closely behind. 

 Improvements can be made to the third query by using the text() method as in the second query, but the QP cost is still above that of the second query.

Posted in SQL, XML | Leave a comment

Reducing size of a nonclustered index with Filtering

SQL Server 2008 has a very nice new feature that enables a partial index to be created based on a specific where clause.

I have recently used¬†this as an¬†solution checking a table that contained an isSent flag. by creating a non-clustered index¬†on the¬†condition of¬†‘isSent’ = 0 (where 0 means not sent) I was able to reduce the scans on the clustered index. previously this index was scanned every time an email delivery system was¬†polled.¬† This lead to some considerable savings.

The three examples below are taken from the same query that was run with different mechanisms. Each Table is identical with data having 96% of rows having the¬†‘isSent’ flag set to 1.¬† there were approx. 200000 rows in the table.¬†In each example only the non clustered indexes changed.

the first table emailqueuev1 had no nonclustered index index so the table was scanned on each search, the second¬†had a non-clustered index created on the ‘isSent’ field and contained a reference to each row in the clustered index. The¬†third contains a filtered non-clustered index this only contained rows from the clustered index that fitted the condition ‘isSent’ = 0

The following queries were then run on the seperate tables.

Select * from emailqueuev1 where isSent = 0
Select * from emailqueuev2 where isSent = 0
Select * from emailqueuev3 where isSent = 0


As you can see the first index scans the entire table due to the size of this table it means there is a query cost 20 times the other 2 methods. The second seeks the non-clustered index and the third also seeks the nonclustered index both of these methods are fairly light weight.  There is a small improvement between the second and third queries in processing.  This improvement will only get better in time as the size of the clustered index increases

The major saving is from storage space.  The second index is comprised of 203 pages of data.  This could increase dramatically in size if you created included columns within the index


The third Index is comprised though of only 2 pages (16kb) of data.

The beauty of the filtered index is once the isSent flag is changed the reference is removed from this small index.  So it never gets too large.

Posted in database performance, filtered indexes, non-clustered Indexes, SQL, Uncategorized | Leave a comment