Optimizing queries using XML in SQL Server

-> I recently worked on tuning a query that scans through XML data.
-> Below is the query,


set statistics io on
SELECT t.COL1 FROM Table1 t WHERE COL3=N'1' or COL3=N'10' or COL3=N'102'
or COL3=N'103' or COL3=N'104' or COL3=N'105' or COL3=N'106' or COL3=N'11'
or COL3=N'12' or COL3=N'120' or COL3=N'121' or COL3=N'122' or COL3=N'123'
or COL3=N'124' or COL3=N'125' or COL3=N'126' or COL3=N'127' or COL3=N'129'
or COL3=N'13' or COL3=N'130' or COL3=N'131' or COL3=N'132' or COL3=N'133'
or COL3=N'134' or COL3=N'135' or COL3=N'136' or COL3=N'137' or COL3=N'138'
or COL3=N'139' or COL3=N'14' or COL3=N'141' or COL3=N'143' or COL3=N'144'
or COL3=N'145' or COL3=N'146' or COL3=N'147' or COL3=N'148' or COL3=N'149'
or COL3=N'15' or COL3=N'150' or COL3=N'151' or COL3=N'152' or COL3=N'153'
or COL3=N'154' or COL3=N'155' or COL3=N'156' or COL3=N'157' or COL3=N'158'
or COL3=N'159' or COL3=N'16' or COL3=N'160' or COL3=N'161' or COL3=N'162'
or COL3=N'163' or COL3=N'164' or COL3=N'166' or COL3=N'167' or COL3=N'168'
or COL3=N'169' or COL3=N'17' or COL3=N'170' or COL3=N'171' or COL3=N'173'
or COL3=N'174' or COL3=N'175' or COL3=N'176' or COL3=N'177' or COL3=N'178'
or COL3=N'179' or COL3=N'18' or COL3=N'184' or COL3=N'185' or COL3=N'186'
or COL3=N'187' or COL3=N'188' or COL3=N'189' or COL3=N'19' or COL3=N'190'
or COL3=N'191' or COL3=N'192' or COL3=N'193' or COL3=N'194' or COL3=N'195'
or COL3=N'196' or COL3=N'197' or COL3=N'198' or COL3=N'199' or COL3=N'2'
or COL3=N'20' or COL3=N'200' or COL3=N'201' or COL3=N'202' or COL3=N'203'
or COL3=N'204' or COL3=N'205' or COL3=N'206' or COL3=N'207' or COL3=N'208'
or COL3=N'209' or COL3=N'21' or COL3=N'210' or COL3=N'211' or COL3=N'212'
or COL3=N'213' or COL3=N'214' or COL3=N'215' or COL3=N'216' or COL3=N'217'
or COL3=N'218' or COL3=N'22' or COL3=N'227' or COL3=N'228' or COL3=N'229'
or COL3=N'23' or COL3=N'230' or COL3=N'233' or COL3=N'234' or COL3=N'235'
or COL3=N'236' or COL3=N'237' or COL3=N'238' or COL3=N'239' or COL3=N'24'
or COL3=N'240' or COL3=N'241' or COL3=N'242' or COL3=N'25' or COL3=N'26'
or COL3=N'27' or COL3=N'28' or COL3=N'29' or COL3=N'3' or COL3=N'30'
or COL3=N'309' or COL3=N'31' or COL3=N'32' or COL3=N'33' or COL3=N'34'
or COL3=N'35' or COL3=N'36' or COL3=N'37' or COL3=N'38' or COL3=N'39'
or COL3=N'4' or COL3=N'40' or COL3=N'41' or COL3=N'42' or COL3=N'43'
or COL3=N'44' or COL3=N'45' or COL3=N'46' or COL3=N'47' or COL3=N'48'
or COL3=N'49' or COL3=N'5' or COL3=N'51' or COL3=N'52' or COL3=N'53'
or COL3=N'54' or COL3=N'55' or COL3=N'56' or COL3=N'57' or COL3=N'58'
or COL3=N'59' or COL3=N'6' or COL3=N'62' or COL3=N'64' or COL3=N'66'
or COL3=N'67' or COL3=N'68' or COL3=N'69' or COL3=N'7' or COL3=N'70'
or COL3=N'71' or COL3=N'72' or COL3=N'73' or COL3=N'74' or COL3=N'76'
or COL3=N'77' or COL3=N'78' or COL3=N'79' or COL3=N'8' or COL3=N'80'
or COL3=N'81' or COL3=N'82' or COL3=N'83' or COL3=N'84' or COL3=N'85'
or COL3=N'86' or COL3=N'87' or COL3=N'88' or COL3=N'89' or COL3=N'9'
or COL3=N'90' or COL3=N'91' or COL3=N'92' or COL3=N'93' or COL3=N'94'
or COL3=N'95' or COL3=N'96' or COL3=N'97' or COL3=N'98'
ORDER BY COL6 ,COL5
,ISNULL(COL2.value('(/row/c9)[1]',  'VARCHAR(256)'), CHAR(1)) ,COL1
set statistics io off

-> Object Table1 definition,


CREATE TABLE [dbo].[Table1](
            [COL1] [varchar](255) NOT NULL,
            [COL2] [xml] NULL,
            [COL3]  AS ([dbo].[udTable1_C4]([COL2])) PERSISTED,
            [COL4]  AS ([dbo].[udTable1_C2]([COL2])) PERSISTED,
            [COL5]  AS ([dbo].[udTable1_C8]([COL2])) PERSISTED,
            [COL6]  AS ([dbo].[udTable1_C1]([COL2])) PERSISTED,
            [COL7]  AS ([dbo].[udTable1_C3]([COL2])) PERSISTED,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
            [COL1] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

-> Execution plan for the above query,
Blog46_1

-> Checking the “Table Valued Function” operator from the execution plan,
Blog46_2

-> We are spending most of our time reading the XML data from Table Table1.

(3227544 row(s) affected)
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Table1’. Scan count 1, logical reads 2097889, physical reads 0, read-ahead reads 0, lob logical reads 424529, lob physical reads 0, lob read-ahead reads 36.

-> Lets introduce a new computed column using below code to avoid the costly read on XML column,


CREATE FUNCTION [dbo].[udTable1_C9](@COL2 XML)
RETURNS nvarchar(20)
WITH SCHEMABINDING
BEGIN
RETURN ISNULL(@COL2.value('(/row/c9)[1]',  'VARCHAR(256)'), CHAR(1))
END
GO

ALTER TABLE dbo.Table1 ADD COL8  AS ([dbo].[udTable1_C9]([COL2])) PERSISTED
GO

CREATE INDEX IX_TABLE1_C3_C6_C5_C8_C1 ON Table1 (COL3) INCLUDE (COL6,COL5,COL8,COL1)
GO

-> Executing below query. The only change to this query is that we are using the computed column in order by.


set statistics io on
SELECT t.COL1 FROM Table1 t WHERE COL3=N'1' or COL3=N'10' or COL3=N'102'
or COL3=N'103' or COL3=N'104' or COL3=N'105' or COL3=N'106' or COL3=N'11'
or COL3=N'12' or COL3=N'120' or COL3=N'121' or COL3=N'122' or COL3=N'123'
or COL3=N'124' or COL3=N'125' or COL3=N'126' or COL3=N'127' or COL3=N'129'
or COL3=N'13' or COL3=N'130' or COL3=N'131' or COL3=N'132' or COL3=N'133'
or COL3=N'134' or COL3=N'135' or COL3=N'136' or COL3=N'137' or COL3=N'138'
or COL3=N'139' or COL3=N'14' or COL3=N'141' or COL3=N'143' or COL3=N'144'
or COL3=N'145' or COL3=N'146' or COL3=N'147' or COL3=N'148' or COL3=N'149'
or COL3=N'15' or COL3=N'150' or COL3=N'151' or COL3=N'152' or COL3=N'153'
or COL3=N'154' or COL3=N'155' or COL3=N'156' or COL3=N'157' or COL3=N'158'
or COL3=N'159' or COL3=N'16' or COL3=N'160' or COL3=N'161' or COL3=N'162'
or COL3=N'163' or COL3=N'164' or COL3=N'166' or COL3=N'167' or COL3=N'168'
or COL3=N'169' or COL3=N'17' or COL3=N'170' or COL3=N'171' or COL3=N'173'
or COL3=N'174' or COL3=N'175' or COL3=N'176' or COL3=N'177' or COL3=N'178'
or COL3=N'179' or COL3=N'18' or COL3=N'184' or COL3=N'185' or COL3=N'186'
or COL3=N'187' or COL3=N'188' or COL3=N'189' or COL3=N'19' or COL3=N'190'
or COL3=N'191' or COL3=N'192' or COL3=N'193' or COL3=N'194' or COL3=N'195'
or COL3=N'196' or COL3=N'197' or COL3=N'198' or COL3=N'199' or COL3=N'2'
or COL3=N'20' or COL3=N'200' or COL3=N'201' or COL3=N'202' or COL3=N'203'
or COL3=N'204' or COL3=N'205' or COL3=N'206' or COL3=N'207' or COL3=N'208'
or COL3=N'209' or COL3=N'21' or COL3=N'210' or COL3=N'211' or COL3=N'212'
or COL3=N'213' or COL3=N'214' or COL3=N'215' or COL3=N'216' or COL3=N'217'
or COL3=N'218' or COL3=N'22' or COL3=N'227' or COL3=N'228' or COL3=N'229'
or COL3=N'23' or COL3=N'230' or COL3=N'233' or COL3=N'234' or COL3=N'235'
or COL3=N'236' or COL3=N'237' or COL3=N'238' or COL3=N'239' or COL3=N'24'
or COL3=N'240' or COL3=N'241' or COL3=N'242' or COL3=N'25' or COL3=N'26'
or COL3=N'27' or COL3=N'28' or COL3=N'29' or COL3=N'3' or COL3=N'30'
or COL3=N'309' or COL3=N'31' or COL3=N'32' or COL3=N'33' or COL3=N'34'
or COL3=N'35' or COL3=N'36' or COL3=N'37' or COL3=N'38' or COL3=N'39'
or COL3=N'4' or COL3=N'40' or COL3=N'41' or COL3=N'42' or COL3=N'43'
or COL3=N'44' or COL3=N'45' or COL3=N'46' or COL3=N'47' or COL3=N'48'
or COL3=N'49' or COL3=N'5' or COL3=N'51' or COL3=N'52' or COL3=N'53'
or COL3=N'54' or COL3=N'55' or COL3=N'56' or COL3=N'57' or COL3=N'58'
or COL3=N'59' or COL3=N'6' or COL3=N'62' or COL3=N'64' or COL3=N'66'
or COL3=N'67' or COL3=N'68' or COL3=N'69' or COL3=N'7' or COL3=N'70'
or COL3=N'71' or COL3=N'72' or COL3=N'73' or COL3=N'74' or COL3=N'76'
or COL3=N'77' or COL3=N'78' or COL3=N'79' or COL3=N'8' or COL3=N'80'
or COL3=N'81' or COL3=N'82' or COL3=N'83' or COL3=N'84' or COL3=N'85'
or COL3=N'86' or COL3=N'87' or COL3=N'88' or COL3=N'89' or COL3=N'9'
or COL3=N'90' or COL3=N'91' or COL3=N'92' or COL3=N'93' or COL3=N'94'
or COL3=N'95' or COL3=N'96' or COL3=N'97' or COL3=N'98'
ORDER BY COL6 ,COL5 ,ISNULL(COL8, CHAR(1)) ,COL1
set statistics io off

-> Below execution plan uses the index created. It no longer performs a read on the XML column,
Blog46_3

-> Execution stats below for the query,

(3227544 row(s) affected)
Table ‘Table1’. Scan count 202, logical reads 20644, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

-> Reads are greatly reduced and the query performs comparably good. All good till now as far as scalar promotion is concerned. Lets look at issues that this computed column can cause,

1) All queries using Table1 will be serial after computed column with scalar function is defined on Table1.
2) Parallel rebuild on this table will not be possible if SQL Server Enterprise edition is used.
3) More space used by the computed column and indexes.
4) Many more….

-> It is very important to test with important workloads before using computed columns in production.

-> In my case the script cannot be rewritten and this query is very important for our business. Further Table1 had computed columns with scalar function before I created this, so went with this choice.

-> I would have definitely thought twice if this was the first computed column for Table1.

Thank You,
Vivek Janakiraman

Disclaimer:
The views expressed on this blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights

Advertisements