for xml path reverse??
-
Friday, November 04, 2005 3:15 PMjust started playing with xml datatype, and I'm running into a performance issue...
I have to query an xml column which was populated by a 'for xml path' statement, and get the values back into relational tables...
select
DeletedData.value('(/row/ListingID)[1]','int') as ListingID,
DeletedData.value('(/row/ListingTypeID)[1]','int') as ListingTypeID, DeletedData.value('(/row/EventID)[1]','int') as EventID,
DeletedData.value('(/row/UserID)[1]','uniqueidentifier') as
etc.......
............
............
where DeletedData.value('(/row/ListingID)[1]','int') = x
Performance slows down considerably as the number of values retreived in the select increases which is understandable since it looks like it traverses for every value...
Is there a way to do a 'for xml path' reverse into a table variable without explicitly retreiving every value?
thanks.
All Replies
-
Friday, November 04, 2005 5:43 PMDo you have an XML Index? If so, what secondary XML Indexes do you have?
There are a couple of things you can try doing.
Is your data untyped (meaning there is no associated XML Schema Collection)? If so, then you should rewrite your path expressions to look like this:
(/row/ListingID/text())[1]
Also, I would recommend changing your where clause to use the XML datatype exist() method, this will maximize the effectiveness of your XML Indexes.
where DeletedData.exist('/row/ListingID/text()[.=sql:variable("@x")]') = 1 -
Friday, November 04, 2005 8:01 PM
Can you give a better repro? Do you expect to get more than one row or only ever get one row? Why do you use FOR XML PATH instead of the table variable in the first place?
Also, as a performance hint: You may want to use
where 1= col.exist('/row/ListingID/text()[. = sql:column("x")]')
which can give you better performance than doing the cast into SQL and then the comparison.
Best regards
Michael -
Monday, November 07, 2005 6:35 PMhi..
rewriting the expression as (/row/ListingID/text())[1] improved performance by about 25%...
changing the where clause to
where DeletedData.exist('/row/ListingID/text()[.=sql:variable("@x")]') = 1 didn't make any difference...
adding a for path index made very little difference ( < 5%)
CREATE
PRIMARY XML INDEX idx_DeletedData on audit (DeletedData)
CREATE XML INDEX idx_DeletedDataPath on audit (DeletedData) USING XML INDEX idx_DeletedData FOR PATH
Do you expect to get more than one row or only ever get one row? Why do you use FOR XML PATH instead of the table variable in the first place?
we have generic data audit triggers that look like this:
insert audit select tablename, (select * for xml path from inserted), (select * from deleted for xml path).... etc.
the select described above is used to get the audit trail of changes to a row.
we have a large development effort going on, and using genric triggers seemed like a perfect way to audit data in an enviroment where number of tables and table schema changes on a daily basis without having to change triggers and audit tables... Once we stabilize the schema we might move to a more sophisticated strategy.. I'd prefer not to since I really like this solution, but if getting an audit tral of 100 rows takes 20-30 seconds, i might have to...
thanks! -
Monday, November 07, 2005 10:33 PMThanks for testing it. Did you try the WHERE clause rewrite with the PATH index together?
If so, and you have a reasonable amount of data, can you please contact me in email (mrys at the usual microsoft com domain).
Thanks
Michael -
Monday, November 07, 2005 10:36 PMHow selective is the variable @X? If it is highly selective, then you may want to consider also creating a VALUE index on the XML Index. This will allow QO to select a plan in which we seek for the value and then match the path.
-
Tuesday, November 15, 2005 6:29 AM
Take a look at the optimization described under "Merging multiple value() method executions for indexed XML" in the XML optimizations whitepaper at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sqloptxml.asp.
The optimization can apply to your case:
1) When it is written as nodes()/value() combination
2) You use attributes instead of subelements (of <row>). If this is an option, please rerun the experiments and let us know the performance you observe.
Thank you,
Shankar
Program Manager
Microsoft SQL Server -
Thursday, November 24, 2005 5:52 PMPugV
Let me know if you still have the performance issue. There's a way to get close to what you want with a better performance. The best is to write to Eugene dot Kogan at Microsoft dot com and I'll reply to the forum.
Best regards,
Eugene Kogan
Technical Lead,
Microsoft SQL Server -
Monday, November 28, 2005 7:10 PMsorry guys, was away for a while, I will do some more benchmarking next week and get back to you.
thanks a lot for everyone's help!

