XML .value simple syntax

There are several ways to use .value to extract an item of information from XML. In this article I will show some of the syntax and the reasons why some methods are better than others.

All the calls in this post reference the following blob. It is a simple blob for demonstration purposes only.

<xmlblob>

<personlist>

<person>Phil</person>

<person>Stina</person>

</personlist>

</xmlblob>

The following three .value calls all work and all bring back the same value (the first person element in the blob).

DECLARE @xml xml = N'<xmlblob><personlist><person>Phil</person><person>Stina</person></personlist></xmlblob>'

SELECT @xml.value('(xmlblob/personlist/person)[1]','varchar(50)')

SELECT @xml.value('(xmlblob/personlist/person/text())[1]','varchar(50)')

SELECT @xml.value('(xmlblob[1]/personlist[1]/person[1])','varchar(50)')

The first query is potentially the kind you would want if you were using typed XML via a schema. In this case, however, there is no schema, and the second query is far better, especially with regard to the query plans generated. This is due to the /text() at the end of the path, which asks for the text node directly instead of leaving SQL Server to work out the value of the whole element.
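As a small sketch of why text() matters on untyped XML: the value of an element is the concatenation of every text node beneath it, whereas text() picks out a single text node. The nick element below is invented purely for illustration and is not part of the blob used in this post.

```sql
-- Hypothetical mixed-content fragment; <nick> is an invented element.
DECLARE @mixed xml = N'<person>Phil<nick>P</nick></person>';

-- Element value: all descendant text nodes are concatenated.
SELECT @mixed.value('(person)[1]', 'varchar(50)') AS element_value;      -- PhilP

-- text() node: only the first text node directly under <person>.
SELECT @mixed.value('(person/text())[1]', 'varchar(50)') AS text_value;  -- Phil
```

Without a schema, the engine cannot rule out this kind of nested content, which is one plausible reason the plan for the first query carries extra work.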

100% to untyped xml, 0% for typed xml

The third option is interesting. It is the slowest to run under the above conditions and would normally be avoided. However, with this syntax the cost appears to scale differently when other elements are accessed.

The following queries look at the second element in the person list.

SELECT @xml.value('(xmlblob/personlist/person)[2]','varchar(50)')

SELECT @xml.value('(xmlblob/personlist/person/text())[2]','varchar(50)')

SELECT @xml.value('xmlblob[1]/personlist[1]/person[2]','varchar(50)')

The second method still remains the best, with the third following closely behind.

The third query can be improved by adding the text() node test, as in the second query, but its query plan cost is still above that of the second query.
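A minimal sketch of the third syntax with text() added, assuming the same blob as above. The outer [1] is my addition: .value needs an expression that is guaranteed to return a single item, and text() on its own is not.

```sql
DECLARE @xml xml = N'<xmlblob><personlist><person>Phil</person><person>Stina</person></personlist></xmlblob>';

-- Positional predicates on every step, then the text node of the second person.
-- The wrapping (...)[1] keeps the expression a static singleton.
SELECT @xml.value('(xmlblob[1]/personlist[1]/person[2]/text())[1]', 'varchar(50)');  -- Stina
```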

Posted in SQL, XML

Reducing size of a nonclustered index with Filtering

SQL Server 2008 has a very nice new feature, the filtered index, which enables a partial index to be created based on a specific WHERE clause.

I recently used this as a solution for checking a table that contains an isSent flag. By creating a non-clustered index filtered on the condition isSent = 0 (where 0 means not sent), I was able to reduce the scans on the clustered index. Previously the clustered index was scanned every time the email delivery system was polled. This led to some considerable savings.

The three examples below are taken from the same query, run against three tables. The tables hold identical data, with 96% of rows having the isSent flag set to 1, and there were approximately 200,000 rows in each table. Only the non-clustered indexes differ between the examples.

The first table, emailqueuev1, had no non-clustered index, so the table was scanned on each search. The second had a non-clustered index on the isSent field, which contained a reference to every row in the clustered index. The third had a filtered non-clustered index, which contained only the rows from the clustered index that met the condition isSent = 0.
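The setup might look roughly like the sketch below. Only the table names and the isSent column come from this post. The other columns, the dbo schema and the index names are assumptions made for illustration.

```sql
-- Hypothetical schema: id and recipient are invented columns.
CREATE TABLE dbo.emailqueuev1 (
    id        int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    recipient nvarchar(256) NOT NULL,
    isSent    bit NOT NULL DEFAULT (0)
);

CREATE TABLE dbo.emailqueuev2 (
    id        int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    recipient nvarchar(256) NOT NULL,
    isSent    bit NOT NULL DEFAULT (0)
);

CREATE TABLE dbo.emailqueuev3 (
    id        int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    recipient nvarchar(256) NOT NULL,
    isSent    bit NOT NULL DEFAULT (0)
);

-- v1: no non-clustered index at all.

-- v2: ordinary non-clustered index, one entry per row in the table.
CREATE NONCLUSTERED INDEX IX_emailqueuev2_isSent
    ON dbo.emailqueuev2 (isSent);

-- v3: filtered non-clustered index, entries only for unsent rows.
CREATE NONCLUSTERED INDEX IX_emailqueuev3_isSent_unsent
    ON dbo.emailqueuev3 (isSent)
    WHERE isSent = 0;
```

The only difference between the v2 and v3 index definitions is the WHERE clause, which is what keeps the third index small.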

The following queries were then run on the separate tables.

Select * from emailqueuev1 where isSent = 0
Select * from emailqueuev2 where isSent = 0
Select * from emailqueuev3 where isSent = 0

 

The first query scans the entire table, and because of the size of the table its query cost is 20 times that of the other two methods. The second and third queries both seek their non-clustered index, and both are fairly lightweight. There is a small improvement in processing between the second and third queries. This improvement will only grow over time as the size of the clustered index increases.

The major saving is in storage space. The second index comprises 203 pages of data. This could increase dramatically if you added included columns to the index.

The third index, though, comprises only 2 pages (16 KB) of data.
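One way to check index sizes like these is to read the page counts from sys.dm_db_partition_stats. This is a sketch under assumptions: the tables are assumed to live in the dbo schema, and used_page_count includes non-leaf pages, so the figures may not match the ones quoted above exactly.

```sql
-- Pages used by each index on the second and third tables (8 KB per page).
SELECT  OBJECT_NAME(i.object_id)     AS table_name,
        i.name                       AS index_name,
        i.type_desc,
        i.has_filter,                -- 1 for a filtered index
        SUM(ps.used_page_count)      AS used_pages,
        SUM(ps.used_page_count) * 8  AS used_kb
FROM    sys.indexes AS i
JOIN    sys.dm_db_partition_stats AS ps
        ON  ps.object_id = i.object_id
        AND ps.index_id  = i.index_id
WHERE   i.object_id IN (OBJECT_ID(N'dbo.emailqueuev2'),
                        OBJECT_ID(N'dbo.emailqueuev3'))
GROUP BY i.object_id, i.name, i.type_desc, i.has_filter;
```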

The beauty of the filtered index is that once the isSent flag is changed, the row's entry is removed from this small index, so it never gets too large.
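A small sketch of that behaviour, using a throwaway table whose name, columns and index name are all invented for illustration. Comparing row_count for the filtered index before and after the update should show the entry dropping out.

```sql
-- Hypothetical demo table and index; not one of the tables from this post.
CREATE TABLE dbo.emailqueue_demo (
    id     int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    isSent bit NOT NULL DEFAULT (0)
);

CREATE NONCLUSTERED INDEX IX_emailqueue_demo_unsent
    ON dbo.emailqueue_demo (isSent)
    WHERE isSent = 0;

INSERT INTO dbo.emailqueue_demo (isSent) VALUES (0), (0), (1);

-- Rows held in the filtered index before the update.
SELECT ps.row_count
FROM   sys.indexes AS i
JOIN   sys.dm_db_partition_stats AS ps
       ON ps.object_id = i.object_id AND ps.index_id = i.index_id
WHERE  i.object_id = OBJECT_ID(N'dbo.emailqueue_demo')
  AND  i.name = N'IX_emailqueue_demo_unsent';

-- Mark one email as sent; it no longer satisfies isSent = 0.
UPDATE dbo.emailqueue_demo SET isSent = 1 WHERE id = 1;

-- Rows held in the filtered index after the update.
SELECT ps.row_count
FROM   sys.indexes AS i
JOIN   sys.dm_db_partition_stats AS ps
       ON ps.object_id = i.object_id AND ps.index_id = i.index_id
WHERE  i.object_id = OBJECT_ID(N'dbo.emailqueue_demo')
  AND  i.name = N'IX_emailqueue_demo_unsent';
```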

Posted in database performance, filtered indexes, non-clustered Indexes, SQL, Uncategorized