Answered by:
Changing quazi-XML into different output

Question
-
Hi everybody,
I am wondering if C# or SQL Server may be better to convert XML like text into a different way.
Here are two row samples we have in our table in the Text type of the column:
<CurrentState>
<guests><guest_no>3004001</guest_no><parent_no>34000001</parent_no><addit_no>0</addit_no><addit_no2>0</addit_no2><trans_no>0</trans_no><guestgroup></guestgroup><salute></salute><first_name>LAURA</first_name><last_name>STONE</last_name><mid_name></mid_name><suffix></suffix><e_mail></e_mail><birth_date></birth_date><gender></gender><mug_shot></mug_shot><mug_date></mug_date><notes></notes><e_message></e_message><check_bx1>false</check_bx1><check_bx2>false</check_bx2><check_bx3>false</check_bx3><check_bx4>false</check_bx4><check_bx5>false</check_bx5><check_bx6>false</check_bx6><check_bx7>false</check_bx7><check_bx8>false</check_bx8><check_bx9>false</check_bx9><check_bx10>false</check_bx10><check_bx11>false</check_bx11><check_bx12>false</check_bx12><check_bx13>false</check_bx13><check_bx14>false</check_bx14><check_bx15>false</check_bx15><number_1>0</number_1><number_2>0</number_2><number_3>0</number_3><number_4>0.00</number_4><number_5>0.00</number_5><text_1></text_1><text_2></text_2><text_3></text_3><text_4></text_4><text_5></text_5><text_6></text_6><text_7></text_7><text_8></text_8><memo_1></memo_1><memo_2></memo_2><memo_3></memo_3><date_1></date_1><datetime_1></datetime_1><web_user></web_user><web_pswd></web_pswd><importid></importid><import2nd></import2nd><importir></importir><importeml></importeml><guest_2nd>0</guest_2nd><salute2></salute2><firstname2></firstname2><lastname2></lastname2><midname2></midname2><suffix2></suffix2><salespoint>TICKET</salespoint><operator>ADMIN</operator><date_time>04/11/2008 10:03:36 AM</date_time><last_mod>0</last_mod><gfwdstatus>0</gfwdstatus><guest_id>0</guest_id><relation>0</relation><role_no>0</role_no><acct_name></acct_name><vipcode>0</vipcode><height>0</height><weight>0</weight><height_m>0</height_m><weight_m>0</weight_m><no_mail>false</no_mail><no_email>false</no_email><no_phone>false</no_phone><addr_pref>0</addr_pref></guests></CurrentState>
-------------------
--Record N0
details:<LiabilityText>Standard Season Pass Liability Text goes here...</LiabilityText>We need to change the above into something like this (I used a different ID that's why output doesn't match):
Changed:
CurrentState:
guests:
guest_no: 5000001
parent_no: 5000001
first_name: MECHELE
last_name: LRE
birth_date: 01/20/1966 12:00:00 AM
importid: SIMP5000001
import2nd: S2ND5000001
importir: SIR5000001
importeml: SEML5000001
firstname2: JIM
lastname2: LRE
salespoint: LISA
operator: ADMIN
date_time: 03/17/2008 03:59:39 PM
last_mod: 6
gfwdstatus: 1So, the idea is to remove all empty values and output the rest using the above format and the number of spaces before row equal the tag nesting level. I have VFP function that performs the above display and my challenge is to either convert it to C# code or attempt to get that string formatted properly from the SQL Server directly. There is also a possibility of having bad XML data (tags not closed / matched) in some rows.
What do you think?
Thanks in advance.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog- Edited by Naomi N Tuesday, March 5, 2013 5:59 PM
Tuesday, March 5, 2013 5:56 PM
Answers
-
Hi Naomi,
Try this
DECLARE @x1 XML = ' <CurrentState> <guests><guest_no>5000001</guest_no><parent_no>5000001</parent_no> <addit_no>0</addit_no><addit_no2>0</addit_no2><trans_no>0</trans_no> <guestgroup></guestgroup><salute></salute><first_name>MEC</first_name> <last_name>LAR</last_name><mid_name></mid_name><suffix></suffix><e_mail></e_mail> <birth_date>01/20/1966 12:00:00 AM</birth_date><gender></gender> <mug_shot></mug_shot><notes></notes><e_message></e_message> <number_2>0</number_2> <number_3>0</number_3><number_4>.00</number_4><number_5>.00</number_5> <text_1></text_1><text_2></text_2><text_3></text_3><text_4></text_4><text_5></text_5><text_6> </text_6><text_7></text_7><text_8></text_8><memo_1></memo_1><memo_2></memo_2><memo_3></memo_3> <web_user></web_user><web_pswd></web_pswd><importid>SIMP5000001</importid> <import2nd>S2ND5000001</import2nd><importir>SIR5000001</importir> <importeml>SEML5000001</importeml><guest_2nd>0</guest_2nd><salute2></salute2> <firstname2>JIM</firstname2><lastname2>LAR</lastname2> <midname2></midname2><suffix2></suffix2> <salespoint>LISA</salespoint><operator>ADMIN</operator> <date_time>03/17/2008 03:59:39 PM</date_time> <last_mod>6</last_mod><gfwdstatus>1</gfwdstatus> <guest_id>0</guest_id><relation>0</relation> <role_no>0</role_no><acct_name></acct_name> <vipcode>0</vipcode><height>0</height> <weight>0</weight><height_m>0</height_m> <weight_m>0</weight_m><no_mail>false</no_mail><no_email>false</no_email> <no_phone>false</no_phone><addr_pref>0</addr_pref></guests></CurrentState>"; ' DECLARE @Count INT, @totalCount INT, @ElementName VARCHAR(30), @ElementValue VARCHAR(30) -- counter variables SELECT @Count = 1, @totalCount = @x1.value('count(/CurrentState/guests/*)','INT') DECLARE @TMP TABLE(Name VARCHAR(1000),Value VARCHAR(1000)) -- loop WHILE @Count <= @totalCount BEGIN SELECT @ElementName = @x1.value( 'local-name((/CurrentState/guests/*[position()=sql:variable("@Count")])[1])', 'VARCHAR(30)'), @ElementValue = @x1.value( '(/CurrentState/guests/*[position()=sql:variable("@Count")])[1]', 'VARCHAR(30)') INSERT @TMP SELECT @ElementName,@ElementValue --PRINT @ElementName + ':' + @ElementValue --PRINT '' -- increment the counter variable SELECT @Count = @Count + 1 END SELECT * FROM @TMP WHERE Value <> ''
The below link is for just removing XML tags
http://sathyadb.blogspot.in/2012/10/sql-server-remove-xml-tags-from-xml.html
Thanks & Regards, sathya
- Edited by SathyanarrayananS Wednesday, March 6, 2013 12:59 PM
- Marked as answer by Yubo. Zhang Monday, March 11, 2013 4:28 PM
Wednesday, March 6, 2013 12:54 PM
All replies
-
While removing empty tags is not a problem as also are the spaces, I think the challenge are your incomplete tags. I'm not sure right now without digging deeper into it, but I think there are two solutions:
- Use a SGML parser, it should be able to handle this.
- Use a RegEx clean up run before parsing it as XML.
Tuesday, March 5, 2013 8:22 PM -
After struggling with it for almost the whole day, I got the following code
For every expert, there is an equal and opposite expert. - Becker's Law
My blogWednesday, March 6, 2013 5:41 AM -
Hi Naomi,
Try this
DECLARE @x1 XML = ' <CurrentState> <guests><guest_no>5000001</guest_no><parent_no>5000001</parent_no> <addit_no>0</addit_no><addit_no2>0</addit_no2><trans_no>0</trans_no> <guestgroup></guestgroup><salute></salute><first_name>MEC</first_name> <last_name>LAR</last_name><mid_name></mid_name><suffix></suffix><e_mail></e_mail> <birth_date>01/20/1966 12:00:00 AM</birth_date><gender></gender> <mug_shot></mug_shot><notes></notes><e_message></e_message> <number_2>0</number_2> <number_3>0</number_3><number_4>.00</number_4><number_5>.00</number_5> <text_1></text_1><text_2></text_2><text_3></text_3><text_4></text_4><text_5></text_5><text_6> </text_6><text_7></text_7><text_8></text_8><memo_1></memo_1><memo_2></memo_2><memo_3></memo_3> <web_user></web_user><web_pswd></web_pswd><importid>SIMP5000001</importid> <import2nd>S2ND5000001</import2nd><importir>SIR5000001</importir> <importeml>SEML5000001</importeml><guest_2nd>0</guest_2nd><salute2></salute2> <firstname2>JIM</firstname2><lastname2>LAR</lastname2> <midname2></midname2><suffix2></suffix2> <salespoint>LISA</salespoint><operator>ADMIN</operator> <date_time>03/17/2008 03:59:39 PM</date_time> <last_mod>6</last_mod><gfwdstatus>1</gfwdstatus> <guest_id>0</guest_id><relation>0</relation> <role_no>0</role_no><acct_name></acct_name> <vipcode>0</vipcode><height>0</height> <weight>0</weight><height_m>0</height_m> <weight_m>0</weight_m><no_mail>false</no_mail><no_email>false</no_email> <no_phone>false</no_phone><addr_pref>0</addr_pref></guests></CurrentState>"; ' DECLARE @Count INT, @totalCount INT, @ElementName VARCHAR(30), @ElementValue VARCHAR(30) -- counter variables SELECT @Count = 1, @totalCount = @x1.value('count(/CurrentState/guests/*)','INT') DECLARE @TMP TABLE(Name VARCHAR(1000),Value VARCHAR(1000)) -- loop WHILE @Count <= @totalCount BEGIN SELECT @ElementName = @x1.value( 'local-name((/CurrentState/guests/*[position()=sql:variable("@Count")])[1])', 'VARCHAR(30)'), @ElementValue = @x1.value( '(/CurrentState/guests/*[position()=sql:variable("@Count")])[1]', 'VARCHAR(30)') INSERT @TMP SELECT @ElementName,@ElementValue --PRINT @ElementName + ':' + @ElementValue --PRINT '' -- increment the counter variable SELECT @Count = @Count + 1 END SELECT * FROM @TMP WHERE Value <> ''
The below link is for just removing XML tags
http://sathyadb.blogspot.in/2012/10/sql-server-remove-xml-tags-from-xml.html
Thanks & Regards, sathya
- Edited by SathyanarrayananS Wednesday, March 6, 2013 12:59 PM
- Marked as answer by Yubo. Zhang Monday, March 11, 2013 4:28 PM
Wednesday, March 6, 2013 12:54 PM -
Thanks, but this is not going to work for the generic case I am after. I can not predict the way the details column is saved and what information is in there. In my first message I showed samples of two rows which used completely different content. Although most likely for the method I am working on the content of the row will be like the first row showed.
For every expert, there is an equal and opposite expert. - Becker's Law
My blogWednesday, March 6, 2013 1:03 PM