Force IE to send URLs in UTF-8

  • Question

  • I have an issue with how Internet Explorer handles UTF-8 characters in URLs. My site fails to complete the request when the "Send UTF-8 URLs" option in IE's Internet Options is unchecked.

    When I click a link on my site that opens in a new page, I get a 404 error page. Links that open in the same page (without target="_blank") work.

    Tracing the packets, I saw that the working UTF-8 request looks like this:
       GET /%EB%89%B4%EC%8A%A4/ HTTP/1.1\r\n

    The invalid request that gets the 404 response looks like this:
     GET /%B4%BA%BD%BA/ HTTP/1.1\r\n

       

    The requested URL is /뉴스. Using this tool, I found that the first, working request is "뉴스" encoded in UTF-8. The second is the same string encoded in the system default encoding, which is EUC-KR.
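
    The difference between the two requests can be checked in a few lines of JavaScript. This is a minimal sketch: `encodeURIComponent` always percent-encodes as UTF-8, and `decodeURIComponent` rejects escape sequences that are not valid UTF-8, which is the case for the EUC-KR form. The helper name `isValidUtf8Escape` is made up for illustration.

```javascript
// "뉴스" percent-encoded as UTF-8, which matches the working request.
const utf8Path = "/" + encodeURIComponent("뉴스") + "/";
console.log(utf8Path); // /%EB%89%B4%EC%8A%A4/

// Hypothetical helper: true if the escapes in `path` decode as UTF-8.
function isValidUtf8Escape(path) {
  try {
    decodeURIComponent(path);
    return true;
  } catch (e) {
    // decodeURIComponent throws URIError on byte sequences that are not UTF-8.
    if (e instanceof URIError) return false;
    throw e;
  }
}

console.log(isValidUtf8Escape("/%EB%89%B4%EC%8A%A4/")); // true
console.log(isValidUtf8Escape("/%B4%BA%BD%BA/"));       // false
```

    A server that decodes request paths strictly as UTF-8 has no way to interpret the second form, which is consistent with the 404.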

    The W3C i18n checker gave me these results:

    Character encoding
    HTTP Content-Type: UTF-8 (Content-Type: text/html; charset=UTF-8)
    Byte order mark (BOM): No
    Meta tag: UTF-8 (<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>)

    HTML tag: ko (<html xmlns="http://www.w3.org/1999/xhtml" lang="ko" xml:lang="ko">)
    HTTP Content-Language: None found

    Default direction: LTR (by default)

    Class & id names
    Non-ascii class or id names: None
    Non-NFC class or id names: None

    Request headers
    Accept-Language: en-US, en (Accept-Language: en-US,en;q=0.8)
    Accept-Charset: ISO-8859-1, utf-8, * (Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3)



    So the issue is that when opening links in new tabs, **IE encodes the URL in the system default encoding**, which breaks the site. I am looking for a way **to make IE send those URLs as UTF-8** for users who have that option turned off. Is there a meta tag I can add to do this, do I have to use some JavaScript to rewrite all of the URLs, or is there some other solution?
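
    One variant of the JavaScript idea can be sketched as follows. It rewrites each link so that non-ASCII characters are already percent-encoded as UTF-8, leaving the browser nothing to re-encode. This is a sketch under assumptions, not a confirmed fix: the function names `toAsciiHref` and `rewriteLinks` are hypothetical, and whether IE then sends the pre-encoded URL unchanged with "Send UTF-8 URLs" unchecked would need to be tested. It is meant for paths and query strings only; a non-ASCII host name would need IDNA/Punycode rather than percent-encoding.

```javascript
// Percent-encode only the non-ASCII runs of an href as UTF-8.
// Existing escapes such as %EB are plain ASCII and are left untouched,
// so applying the function twice does not double-encode.
// Example: toAsciiHref("/뉴스/") returns "/%EB%89%B4%EC%8A%A4/".
function toAsciiHref(href) {
  return href.replace(/[^\x00-\x7F]+/g, function (run) {
    return encodeURIComponent(run);
  });
}

// Rewrite every <a href> in the given document (browser only).
function rewriteLinks(doc) {
  var links = doc.getElementsByTagName("a");
  for (var i = 0; i < links.length; i++) {
    var raw = links[i].getAttribute("href");
    if (raw) {
      links[i].setAttribute("href", toAsciiHref(raw));
    }
  }
}

// Only touch the DOM when one exists, so the helper can also be used elsewhere.
if (typeof document !== "undefined") {
  rewriteLinks(document);
}
```

    Emitting the links already percent-encoded from the server would have the same effect without depending on script execution.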

    I searched extensively online and found many resources about UTF-8, but no solution for this situation, where IE reverts to the system default encoding.

    The weird thing is that this works only some of the time. If I open a link in a new tab, get the error, and close the tab, the error comes up every time I try again. If I open a new tab, get the error, and open another one without closing the first, the request may be served. The pattern is consistent: every second click (sometimes every third) works, but after closing the tab following a failure, it always fails.

    Thanks!


    • Edited by elAhmo Sunday, April 28, 2013 5:45 PM formatting
    Sunday, April 28, 2013 5:41 PM

Answers


  • I was just curious to see what you're struggling with, and decided to try it myself.

    I guess your URL (뉴스) is a local intranet site? I looked up its meaning on Google. It means "news". OK, very good. But I couldn't get a DNS hit on it, any way I tried. Silly me. Anyway, I set the IE10 options the way you explained, then composed the following .html test page on my desktop and opened it. Check it and you'll see I also partly percent-encoded msdn.com in a link.

    The percent-encoded msdn.com link works fine for me, even though it has target="_blank", which you explained was triggering the problem. No problem for me; I tried it several times. The only issue I had is that it kept opening the links in a new window, even though I have IE set to open links in new tabs. But that's not the problem you're discussing.

    Also, I guessed that a meta element with application/x-www-form-urlencoded might have some positive effect. It didn't seem to do anything.




    <html>

    <head>
    <meta http-equiv="Content-type" content="application/x-www-form-urlencoded; charset=UTF-8"/>
    </head>

    <body>
    <a href="http://%EB%89%B4%EC%8A%A4/" target="_blank">UTF-8 encoded http://뉴스/</a>

    <br><br><br>

    <a href="http://%6D%73%64%6e.com" target="_blank">percent encoded msdn (dot com)</a>
    </body>

    </html>


    Sunday, April 28, 2013 11:49 PM