Display search relevancy

  • Question

  • Is there any way to display the search relevancy as a percentage (e.g. 70% relevant) for each search result in People and All Sites?
    Chittaranjan. Consultant Enterprise Search Products.
    Wednesday, January 5, 2011 11:19 PM

All replies

  • If you search against the QR server directly, you get a max rank for the result set, and you could calculate a percentage from each hit's rank relative to the max rank. Unfortunately, this value does not seem to be passed on when searching against FS4SP.

    That said, relevancy is very subjective, and a purely mathematical approach will not feel right to your users. I've worked with other systems where hit #1 had only 54% relevancy from a mathematical point of view, yet it was still the best and correct hit. So I used a logarithmic function to upscale all the values closer to 100.

    What you could do is check the rank of the first hit and the last hit on the current page, and also the total number of hits for that particular query. The first hit should be displayed as somewhere close to 100%, as it is the most relevant hit you managed to return. Then create a distribution based on the gap between the hits on your page and the total number of hits.

    Instead of a %, you could use 1-5 stars. You get less granularity, but you remove yourself from the math and end up with a much better UI element.
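
    As a rough sketch of the star idea, a relevance fraction (however it was derived) could be bucketed into 1-5 stars. The names StarRating and ToStars are hypothetical, and the five equal-width buckets are an illustrative assumption, not something FAST defines:

```csharp
using System;

public static class StarRating
{
    // Maps a relevance fraction in [0, 1] to a whole number of stars in [1, 5].
    // Equal-width buckets are an assumption made for illustration only.
    public static int ToStars(double fraction)
    {
        // Clamp so NaN or out-of-range input cannot yield 0 or 6 stars.
        if (double.IsNaN(fraction) || fraction < 0) fraction = 0;
        if (fraction > 1) fraction = 1;

        // Ceiling splits (0, 1] into five buckets; a fraction of 0 still gets one star.
        return Math.Max(1, (int)Math.Ceiling(fraction * 5));
    }
}
```

    With this mapping a fraction of 0.35 lands in the second bucket (2 stars) and 0.77 in the fourth (4 stars).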

    Good luck :)

    Regards,
    Mikael Svenson 


    Search Enthusiast - MCTS SharePoint/WCF4/ASP.Net4
    http://techmikael.blogspot.com/ - http://www.comperiosearch.com/
    • Proposed as answer by tarjeieo Tuesday, January 18, 2011 1:26 PM
    Wednesday, January 12, 2011 8:14 PM
  • Mikael

    Could you detail the logarithmic function you used previously?

    Thanks in advance

    Steve Schaff Discover Technologies, LLC
    Thursday, September 8, 2011 3:35 PM
  • Hi SSchaff,

    Here's a sample written in C#:

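    using System;

    // Fraction of the max score reached by this hit.
    double p = 740.0 / 2100;
    Console.WriteLine("Original: " + p);
    // Log base 100 of the percentage pulls mid-range values up towards 1.
    p = Math.Log(p * 100, 100);
    Console.WriteLine("Upscale: " + p);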
    


    2100 is the max score, and 740 is the score of a certain hit.

    This yields the following output:

    Original: 0,352380952380952
    Upscale: 0,773506212498528

    So instead of saying the hit was 35% relevant we say it's 77%. If you are to apply this with FAST for SharePoint there is one problem: I don't see the maxscore value being passed on to the end-user objects or XML we have access to. It's present from the QR server, and also in some internal components in SharePoint, but it's lost along the way :(

    Therefore you have no way of knowing what the score of hit #1 is measured against.
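
    One caveat with the log upscale in the sample above: when a hit scores below 1% of the max score, Math.Log(p * 100, 100) goes negative, and a score of 0 gives negative infinity. A minimal sketch with guards added follows. The names RelevanceScale and Upscale are hypothetical, and clamping the result to [0, 1] is an assumption rather than part of the original sample:

```csharp
using System;

public static class RelevanceScale
{
    // Returns log base 100 of (score / maxScore * 100), clamped to [0, 1].
    // The clamping is an assumption added for safety; the original sample has none.
    public static double Upscale(double score, double maxScore)
    {
        // Non-positive input would make the logarithm undefined or infinite.
        if (maxScore <= 0 || score <= 0) return 0;

        double p = score / maxScore;
        double scaled = Math.Log(p * 100, 100);

        // Below 1% of max the logarithm is negative; above max it exceeds 1.
        return Math.Min(1.0, Math.Max(0.0, scaled));
    }
}
```

    RelevanceScale.Upscale(740, 2100) gives the same roughly 0.77 as the sample, while a score of 10 against a max of 2100 is reported as 0 instead of a negative number.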

    I also want to reiterate that using a % is not very good UI-wise. You have no real way of matching the relevancy of the user's intention to the score produced by the search system.

    Regards,
    Mikael Svenson 


    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    Thursday, September 8, 2011 8:19 PM
  • What is max rank and how is it calculated?

    If, for a query, max rank = i, then there should always be one record in the result with rank = i.

    That means one record will always have 100% relevance. So we could calculate relevance based on the highest rank in the result (treating it as equal to max rank). Would that be a correct way to calculate relevance?

    But I have tested on the QR server, and there is never a record with a rank equal to max rank. So where does this value come from?

    Thanks!

    Monday, September 12, 2011 1:40 PM
  • Hi aggrawal,

    I love hard questions :) The FAST ESP docs describe maxrank as:

    MAXRANK is a theoretical maximum rank for a document for a specific query (if the document contained all the query terms close to each other, early in the document, in all the important fields, etc.).

    In practice the best document in the result set will usually have a rank score much lower than MAXRANK.

    If you relate this to the default rank model (where date is a very important factor), it translates to something like: if you query with several words, the words are present in the title, and the document was created in the last 5 minutes, then you can get close to MAXRANK.

    And of course, you can treat the first hit as 100%, but would that "feel" right to the users if the top hit is not that good? This is why I think mathematical scores just don't work: they can never convey the right number relative to the user's intention and context.
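
    For reference, treating the first hit as 100% could be sketched as below. TopHitRelative and ToPercentOfTop are hypothetical names, and the input is assumed to be just the rank values of the hits in one result set. Note that the output only says how hits compare to each other, not how good the top hit actually is:

```csharp
using System;
using System.Linq;

public static class TopHitRelative
{
    // Expresses each rank as a percentage of the highest rank in the same result set.
    public static double[] ToPercentOfTop(double[] ranks)
    {
        if (ranks == null || ranks.Length == 0) return new double[0];

        double top = ranks.Max();

        // If no hit has a positive rank, report every hit as 0%.
        if (top <= 0) return new double[ranks.Length];

        // Negative ranks are treated as 0 so percentages stay within [0, 100].
        return ranks.Select(r => Math.Max(0, r) / top * 100.0).ToArray();
    }
}
```

    Ranks of 1200, 600 and 300 come out as 100%, 50% and 25%, regardless of whether 1200 is anywhere near the theoretical maximum.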

    Conclusion: it's impossible to calculate a proper relevance score for a query, but you can probably fake something in order to make the user feel good about the results ;-)

    Question back: Why would you want to display a percentage? Why doesn't Google or Bing use percentage scores?

    Regards,
    Mikael Svenson 


    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    Monday, September 12, 2011 6:15 PM
  • Hi Mikael,

    MaxRank and numerical scores are needed when the relevancy from FAST will be used outside the context of the user search experience in SharePoint. Here are two scenarios:

    1. A resource manager is trying to quickly find promising candidates for an opening and passes resumes to others for review. Those reviewers should not waste their time studying poor matches, especially when FAST could have told the resource manager the matches were poor and saved them the trouble.

    2. An application uses FAST search scores along with other numerical factors in recommending the best people for projects. This depends on high scores only being given by FAST to good matches. 

    Regards,

    Rick Beesley

    Product Manager, Deloitte Touche Tohmatsu Limited

    Monday, October 24, 2011 6:42 PM
  • Hi Rick,

    Good input on the discussion :)

    Say for your scenario #1, the max rank is 15000, and your top result has a score of 1200. Would you return this at all for your case? Is the hit still a good and valuable one, or should it be discarded as it's only 8% relevant?

    If you discard it, then I agree, as you only want results above a certain threshold. You would also need to tune both your rank model and your queries to match your search scenario, in order to be able to act on the theoretical max rank for a certain query.
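
    If the max rank were available (for example when querying the QR server directly, since the FS4SP APIs do not expose it), the threshold idea could be sketched like this. RankThreshold and KeepAbove are hypothetical names, and any cut-off value passed in is arbitrary and would have to be tuned against the rank model:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RankThreshold
{
    // Keeps only the ranks whose share of maxRank is at least minFraction.
    // Assumes maxRank has been obtained separately; FS4SP does not return it publicly.
    public static List<double> KeepAbove(IEnumerable<double> ranks, double maxRank, double minFraction)
    {
        if (ranks == null) throw new ArgumentNullException(nameof(ranks));
        if (maxRank <= 0) throw new ArgumentOutOfRangeException(nameof(maxRank));

        return ranks.Where(r => r / maxRank >= minFraction).ToList();
    }
}
```

    With a max rank of 15000 and a cut-off of 0.5, the hit scoring 1200 from the example (8%) is dropped, while a hit scoring 9000 (60%) would be kept.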

    A better approach to avoid poor matches is perhaps to craft 'quality' FQL queries to ensure you don't get bad results. That is certainly not an easy task, and it requires trial and error, but it is perhaps the only solution, as we can't get the max rank back with the current APIs.

    Regards,
    Mikael Svenson 

     


    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    Monday, October 24, 2011 7:05 PM
  • What do you think of running a custom developed app on the QR server to which we would submit our search queries and from which we would get only MaxRank?

    By having it return only the MaxRank, I think we would address the security concern regarding the potential release of indexed content to unauthorized users. If it releases no content, then we don't have to worry about who it releases it to.

    So when running a query from the separate app, we'd query this custom program for MaxRank and we'd separately query the Discovery service for the results, dividing each Rank by MaxRank within our app.

    Would that be a good solution in your opinion?

    Regards,

    Rick Beesley

    Monday, October 24, 2011 7:29 PM
  • Hi Rick,

    It would certainly work, and I have thought about it. The number is getting sent over to the Query SSA, but it's not passed on publicly, which would have made this a lot easier.

    By default, security is in place, and I'm not sure whether maxrank is returned when all results are filtered away. If that is the case, you have to pick up the security token on your query so you can reuse it to grab the XML output yourself, much like I do in fs4splogger.codeplex.com.

    The question is how the architecture would look. Should you still query via the Query SSA and get the maxrank from the outside, or only query your own service? It certainly complicates matters with one extra query and a proxy service to query the QR server. It might work to call the "QRProxy Service" directly, if you find out how to set up your client against it.

    And then, would this be a supported scenario? :) From my point of view this is way too much effort for getting a percentage. I would rather execute some sample queries, record the scores, and use them in my UI. Yes, it will be static and not totally correct, but it is easier to implement, and again, the number will only make sense if you know how the query is matched against your rank model.

    Regards,
    Mikael Svenson 

     


    Search Enthusiast - SharePoint MVP/WCF4/ASP.Net4
    http://techmikael.blogspot.com/
    Monday, October 24, 2011 7:58 PM