How to stop extracted metadata title being shown in search results for a list item attachment (PDF)


  • I am using a Content Search Web Part to display the contents of an announcement list on the home page of an Intranet.

    Sometimes, when an announcement has an attachment, the out-of-the-box managed property Title returns unexpected results. Looking into this, I determined that the value displayed for Title is the Title document property of the Word document from which the attached PDF was created.

    The problem affects not just my Content Search Web Part but also the standard search results page. An example of what might be seen:

    Announcement Title in List: Test New Announcement

    Title returned in Search Results: Microsoft Word - Newsletter template.docx

    This occurs when a PDF file is attached and the Word document it was created from had the above title in its properties.

    I could accept this as somewhat logical if I were indexing a document in a library. But I am indexing an announcement, and it makes no sense that its Title should be overridden by a property extracted from an attachment.

    I tried removing the mapping between the Title managed property and the crawled property MetadataExtractorTitle, but this made absolutely no difference after I triggered a full re-index. My experience with SharePoint on-premises has been similar: changing the order or mapping of crawled properties on the Title managed property seems to have no effect on search results, despite what the documentation might say!

    Any suggestions on how I can get around this are most welcome!

    Sunday, October 9, 2016 9:34 PM

All replies

  • I don't think you can change that setting in O365. As you've identified, the attachment's title is being crawled as a crawled property, which is then mapped to the managed property for the title. In that mapping it has a higher priority than the other values, which means it overrides your other title values. Fixing it on-prem would be simple enough: remove the crawled property from the managed property, or lower its priority, and do a full crawl.

    However, O365 search doesn't give you access to modify those settings; they are locked away from us.

    You could hack it to work by using one of the user-defined properties at the site collection level instead, but it's not a nice fix.
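
    A rough sketch of how that workaround might look from the consuming side, in Python. It assumes a spare managed property (RefinableString00 is used here purely as a hypothetical choice) has been mapped at the site collection level to the crawled property of the list's own Title column (assumed to be ows_Title), and that results are read through the search REST endpoint /_api/search/query. The helper names and the example site URL are illustrative only, not part of any SharePoint library.

```python
from urllib.parse import quote

# Assumption: a spare managed property mapped to the list's own Title column.
# The property name is a hypothetical choice; use whichever one you mapped.
CUSTOM_TITLE_PROPERTY = "RefinableString00"


def build_search_url(site_url, query_text, select_properties):
    """Build a search REST query URL that asks for specific managed properties."""
    props = ",".join(select_properties)
    return (
        f"{site_url.rstrip('/')}/_api/search/query"
        f"?querytext='{quote(query_text)}'"
        f"&selectproperties='{quote(props, safe=',')}'"
    )


def display_title(row):
    """Prefer the custom property; fall back to Title when it is empty or missing."""
    custom = (row.get(CUSTOM_TITLE_PROPERTY) or "").strip()
    return custom or row.get("Title", "")


if __name__ == "__main__":
    # A result row shaped like the example in the question.
    row = {
        "Title": "Microsoft Word - Newsletter template.docx",
        CUSTOM_TITLE_PROPERTY: "Test New Announcement",
    }
    print(display_title(row))
    print(build_search_url(
        "https://contoso.sharepoint.com",  # hypothetical tenant URL
        "ContentClass:STS_ListItem_Announcements",
        ["Title", CUSTOM_TITLE_PROPERTY],
    ))
```

    In a Content Search Web Part the equivalent step would presumably be to point the display template's title slot at the custom property in the web part's property mappings rather than at Title. The fallback to Title keeps items without the custom value from showing a blank heading.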

    Monday, October 10, 2016 8:32 AM