Save Webpage as MHT is changing all relative hyperlinks to absolute links


  • I've been seeing this problem in IE 7, 8 and 9, with Windows XP and Windows 7.

    I have a web page with relative links and anchor tags that all get changed to absolute links when I save the page as an MHT.

    I also noticed the Content-Location: header gets generated with the full URL of the file, which, based on other reading, is what causes the links to be transformed.

    Interestingly, when I use MHT plugins in Firefox, it's fine: the relative links remain unchanged.

    I tried adding a <base href=""/> tag in the header, but that did nothing.

    What is Explorer doing to mess up the file? And more importantly, how do I get it to stop?

    Wednesday, November 13, 2013 9:03 PM
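To illustrate the transformation described in the question, here is a minimal Python sketch of standard URL resolution, which is what a relative href turns into once it is resolved against the page's full URL. The base URL below is an illustrative value standing in for the generated Content-Location header, and `resolve` is a hypothetical helper; this shows the general rule only, not IE's actual implementation.

```python
from urllib.parse import urljoin

# Assumed value of the Content-Location header (illustrative host and path).
content_location = "http://johndoe-pc/pages/page1.html"


def resolve(href: str, base: str = content_location) -> str:
    """Resolve an href against the base URL using standard URL resolution."""
    return urljoin(base, href)


# A relative link to a sibling page becomes a full URL on the original server.
relative_link = resolve("page2.html")

# A fragment-only anchor becomes the full page URL plus the fragment.
anchor_link = resolve("#section2")

print(relative_link)
print(anchor_link)
```

Once an anchor has been expanded this way inside the saved file, following it is no longer obviously a same-page scroll, which matches the symptom reported here.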


    It occurred to me that I should be able to script something that would rely on the relative url encoding of page1 and page2.html.  That would demonstrate your point NH.  But before I attempted it, I first looked into the Firefox DOM for that <a> element to see how the UnMHT encoding was represented there.  To my surprise, it is this:

    <a target="_self" onclick="event.preventDefault();'page2.html','_self')" href="http://johndoe-pc/pages/page1.html">click this</a>

    It looks exactly equivalent to how IE encoded its mht of that page.  (see very first codeblock above).  So, sorry.  I was afraid I offended you, and wanted to make up for it by demonstrating your complaint.  But now I've further contradicted it.  Anyway, I hope this has been interesting for you.  I learned something from the exercise.

    Saturday, November 16, 2013 5:40 PM

    Wednesday, November 13, 2013 11:43 PM
  • In your code, the 2nd link was inside a JavaScript call, so it was probably just interpreted as regular text.

    The main problem is that an MHT file is supposed to be self-contained. If it is emailed outside the company, the server is inaccessible, so those links won't work.

    The addition of a Content-Location header is probably part of the problem as well. I've found no hook within IE to prevent that from being generated.

    I simply want IE not to change the page, and to keep the same content I see when I look at it in the browser.

    Thursday, November 14, 2013 2:20 PM
  • I did. Was told my question was too complex and to post it elsewhere.
    Thursday, November 14, 2013 4:17 PM
  • I don't see an issue with self-contained. Treat it like "save complete webpage".

    The biggest problem is the hrefs that contain only an anchor getting fully expanded. There is zero reason for it. Clicking on an anchor link is supposed to be just a scroll action, but with the full URL behind it in the MHT, it causes the browser to fetch that page, which will break once you are outside the local network. If you have the Content-Location header, why do all that extra work to expand the links?

    UnMHT is working fine, even with the Content-Location header, because it does not manipulate the hrefs in the page content. When I view an UnMHT-created file in IE, IE prepends the Content-Location to the anchor and tries to load the original page. And lots of complaints.

    Thursday, November 14, 2013 5:57 PM
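Since no IE setting for this has come up in the thread, one possible workaround is to post-process the saved HTML and turn the expanded links back into relative ones. The Python sketch below is only an illustration under several assumptions: it works on the already decoded HTML text (the HTML part of an MHT file is often quoted-printable encoded, so it would need decoding first), it only handles double-quoted `href` attributes, it assumes the Content-Location URL has a path, and `relativize_hrefs` is a hypothetical helper name, not part of any tool mentioned here.

```python
import re
from urllib.parse import urldefrag

# Matches double-quoted href attributes only (an assumption of this sketch).
HREF_RE = re.compile(r'href="([^"]*)"')


def relativize_hrefs(html: str, content_location: str) -> str:
    """Rewrite hrefs that point at the saved page or its directory as relative."""
    page_url, _ = urldefrag(content_location)
    # Directory of the saved page, e.g. "http://host/pages/".
    base_dir = page_url.rsplit("/", 1)[0] + "/"

    def fix(match: "re.Match[str]") -> str:
        target, fragment = urldefrag(match.group(1))
        if target == page_url and fragment:
            # Same-page anchor: keep only the fragment so it scrolls again.
            return f'href="#{fragment}"'
        if target.startswith(base_dir) and len(target) > len(base_dir):
            # Page under the same directory: keep the path relative to it.
            rel = target[len(base_dir):]
            suffix = f"#{fragment}" if fragment else ""
            return f'href="{rel}{suffix}"'
        # Anything else (other hosts, other directories) is left untouched.
        return match.group(0)

    return HREF_RE.sub(fix, html)


sample = (
    '<a href="http://johndoe-pc/pages/page1.html#top">top</a>'
    '<a href="http://johndoe-pc/pages/page2.html">next</a>'
    '<a href="http://example.com/x">external</a>'
)
print(relativize_hrefs(sample, "http://johndoe-pc/pages/page1.html"))
```

Whether IE then treats the rewritten links as relative still depends on how it applies the Content-Location header when opening the file, so this would need testing against a real saved page.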