Windows Search not indexing some HTML mail messages

  • Question

  • I was surprised to find that searches failed to find messages (.eml files) containing terms which I knew to be there. Examining the source of these quoted-printable-encoded messages, I see that (a) the text/plain part contains only part of the message, and (b) most non-alphanumeric characters in the text/html part have been replaced by their ASCII codes in QP format. Here's an example (an alert from an Answers forum). There may be a reason why spaces, quotes and angle brackets are converted when dashes, slashes, commas and colons aren't, but it escapes me:
    <td>=3Cfont=20color=3D=22#333333=22=20face=3D=22Verdana=22=20size=3D=222=22=3E=3Cp=3EAnother=20user,=20=3Ca=20href=3D=22http://social.answers.microsoft.com/Profile/en-US=3Fuser=3DSantosh+Y=22=3ESantosh=20Y=3C/a=3E,=20has=20replied=20to=20a=20thread=20you=20have=20subscribed=20to=20in=20the=20=3Ca=20href=3D=22http://social.answers.microsoft.com/Forums/en-US/w7programs/threads/=22=3E=20Programs=3C/a=3E=20Forum.=3Cbr=3E=3Cbr=3EThread=20Title=3Ca=20href=3D=22http://social.answers.microsoft.com/Forums/en-US/w7programs/thread/2deccfdb-fe9c-4f60-8558-61791c511dd6/=22=3E=20What=20is=20the=20folder=20where=20the=20Windows=20Live=20Mail=20Contacts=20are=20stored=3F=3C/a=3E=3Cbr=3E=20Started=20by:=20=3Ca=20href=3D=22http://social.answers.microsoft.com/Profile/en-US=3Fuser=3Dhealee=22=3Ehealee=3Ca/=3E=3Cbr=3E=3Cbr=3EReply:=3C ...
    It's perhaps understandable that the indexer doesn't pick up the highlighted terms if it doesn't understand QP, but can anyone say whether this is in fact a shortcoming in the indexer? What about Base64-encoded messages with no text part?
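
For anyone who wants to see what an indexer would have to recover from the text/html part, here is a minimal Python sketch that decodes a short excerpt of the sample above and strips the markup. It only illustrates the QP decoding step and says nothing about how Windows Search actually processes .eml files. The names `qp_fragment`, `TextExtractor` and `qp_html_to_text` are made up for this example.

```python
import quopri
from html.parser import HTMLParser

# Short excerpt of the quoted-printable HTML quoted in the question.
qp_fragment = (
    b"=3Cp=3EAnother=20user,=20=3Ca=20href=3D=22http://social.answers."
    b"microsoft.com/Profile/en-US=3Fuser=3DSantosh+Y=22=3ESantosh=20Y=3C/a=3E,"
    b"=20has=20replied=20to=20a=20thread"
)


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects only the text between HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def qp_html_to_text(raw: bytes) -> str:
    """Decode quoted-printable bytes, then drop the HTML markup."""
    # Assumption: the excerpt is plain ASCII once the =XX escapes are decoded.
    html = quopri.decodestring(raw).decode("ascii")
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


decoded_html = quopri.decodestring(qp_fragment).decode("ascii")
text = qp_html_to_text(qp_fragment)

print(decoded_html)  # the HTML as a mail client would render it
print(text)          # the words a full-text search would need to see
```

The phrase "Santosh Y" only appears once the `=20` escapes are decoded; in the raw bytes it is stored as `Santosh=20Y`. Python's standard `email` package applies the same kind of decoding according to each part's Content-Transfer-Encoding header, Base64 included.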

    Noel
    • Changed type Leo Huang Thursday, October 21, 2010 8:44 AM
    • Changed type iFiredogMVP Thursday, October 21, 2010 10:11 AM
    Monday, October 18, 2010 5:54 PM