none
How to index to minimize SORT cost

    Question

  • Hello, I have this table:

     TopScores (Username char(255), Score int, DateAdded datetime2) which will have a lot of rows.

    I run the following query (code for an SP) against it to get the top 5 high scorers, and the score for a particular Username preceded by the person directly above them in position and the person below:

    with Rankings
    as
    (
    select Row_Number() over (order by Score desc, DateAdded desc) as Pos, --if score same, latest date higher
    Username,
    Score
    from TopScores
    )     
    select top 5 Pos, Username, Score from Rankings
    union all
    SELECT Pos , Username , Score From Rankings
    WHERE Pos BETWEEN
    (SELECT Pos FROM Rankings WHERE Username = @User) - 1
    AND ( SELECT Pos FROM Rankings WHERE Username = @User) + 1

    I had to index the table so I added clustered: ci_TopScores(Username) first and nonclustered: nci_TopScores(Dateadded, Score). 

    Query plan showed that clustered was completely ignored (before I created the nonclustered I tested and it was used by the query), and logical reads were more (as compared to a table scan without any index).

    Sort was the highest costing operator. So I adjusted indexes to clustered: ci_TopScores(Score desc, Dateadded desc) and nonclustered: nci_TopScores(Username).

    Still sort costs the same. Nonclustered: nci_TopScores(Username) is completely ignored again.

    How can I avoid the high cost of sort and index this table effectively?

    Sunday, February 02, 2014 9:49 AM

Answers

  • Without DDL, we need to assume lots of things about your system. It would be helpful if you can share your DDL when you ask specific question.

    Few suggestions:

    1. Your table does not look normalized. Instead of username you can have userid(assume it will be a int or bigint type) from your user (master) table in TopScores table. Create clustered on UserID for TopScores table.

    2. Split your queries into two

          a. one would get the POS for a Specific User from TopScores that you passed. Push these records into a temp table.

          b. Another one, to get top 5 POC based on Score and Dateadded in sorted way. Here you can expect SORT operator. But I assume the above approach would be much faster than the earlier. You may give try for it.

    Sunday, February 02, 2014 11:09 AM

All replies

  • Without DDL, we need to assume lots of things about your system. It would be helpful if you can share your DDL when you ask specific question.

    Few suggestions:

    1. Your table does not look normalized. Instead of username you can have userid(assume it will be a int or bigint type) from your user (master) table in TopScores table. Create clustered on UserID for TopScores table.

    2. Split your queries into two

          a. one would get the POS for a Specific User from TopScores that you passed. Push these records into a temp table.

          b. Another one, to get top 5 POC based on Score and Dateadded in sorted way. Here you can expect SORT operator. But I assume the above approach would be much faster than the earlier. You may give try for it.

    Sunday, February 02, 2014 11:09 AM
  • Thank you very much.
    Tuesday, February 18, 2014 5:27 AM