Unique Count with ROW_NUMBER()

# Unique Count with ROW_NUMBER()

• Tuesday, February 26, 2013 3:28 PM

I am trying to create a unique counting column with ROW_NUMBER() and here is my scenario. I have an EmployeeId INT, DOS DATE, SupervisorId INT, ProblemCode VARCHAR(10), TransID INT. I want to uniquely count the EmployeeId, DOS and SupervisorId only if the ProblemCode is different and the Supervisor is different. I want to use ROW_NUMBER() OVER(PARTITION BY EmployeeId, DOS, ?? ORDER BY TransId) within an inner query and then filter on that column = 1. For example, the count below should be 2 because the new supervisor has a problem code that is different from the first supervisor.

Record 1:
EmployeeId = 1
DOS = 20120101
SupervisorId = 10
ProblemCode = f25

Record 2:
EmployeeId = 1
DOS = 20120101
SupervisorId = 10
ProblemCode = f10

Record 3:
EmployeeId = 1
DOS = 20120101
SupervisorId = 10
ProblemCode = f01

Record 4:
EmployeeId = 1
DOS = 20120101
SupervisorId = 26
ProblemCode = f01

Record 5:
EmployeeId = 1
DOS = 20120101
SupervisorId = 26
ProblemCode = f55

• Edited by Tuesday, February 26, 2013 3:29 PM
•

### All Replies

• Tuesday, February 26, 2013 3:38 PM
Moderator

Try something like this:

```SELECT * from (SELECT ..., ROW_NUMBER() OVER(PARTITION BY EmployeeId, DOS, SupervisorId, ProblemCode ORDER BY TransId) rn

FROM yourtable ) t
WHERE rn>1
```

• Tuesday, February 26, 2013 3:44 PM

Since ProblemCode and SupervisorId are in PARTITION BY, it attributes a 1 to every row, making the count 5.

• Edited by Tuesday, February 26, 2013 3:44 PM
•
• Tuesday, February 26, 2013 3:55 PM

Please explain a little more.  If you remove the ?? from your example, what is wrong with the output?  Your description implies you want to count whenever SupervisorId or ProblemCode change, so the example should produce the correct results.
• Tuesday, February 26, 2013 4:01 PM

If i remove the ?? then the total unique count will be 1 because the ROW_NUMBER() will give me 1-5 since there are 5 records that have the same EmployeeId and DOS and i only count the records that have a 1 in the ROW_NUMBER() column. For example, today I have an employee that goes to Supervisor 10 and that supervisor documents 3 problem codes (which counts as one visit). On that same day the employee goes to Supervisor 26. Supervisor 26 document one of the same ProblemCodes that supervisor 10 did and one new ProblemCode. I want to ignore the first problem code documented by supervisor 26, but count the new ProblemCode because that is considered "a new visit". Does that make more sense? So for each record the ROW_COUNT() should look like this...

1

2

3

4

1

• Edited by Tuesday, February 26, 2013 4:02 PM
• Edited by Tuesday, February 26, 2013 4:02 PM
• Edited by Tuesday, February 26, 2013 4:03 PM
•
• Tuesday, February 26, 2013 8:36 PM

I think i figured the problem out, i have tested it on a few examples, i am not sure how fast this can run with a lot of data. Here is the basic layout: The innermost query finds all the ProblemCodes for each employee on that specific day. I include the SupervisorId in the query to find which Supervisors have a value of one and return those SupervisorIds in my second inner query. Then if i want to sum i need a 3rd query. I am not a fan of nesting queries but i could not think of another way.

SELECT
SUM(Visit)
FROM(
SELECT
i.SupervisorId
,Visit
FROM
(
SELECT SupervisorId, ROW_NUMBER() OVER(PARTITION BY EmployeeID, DOS, ProblemCode ORDER BY TranID) AS Visit FROM EmployeeTracking
) AS i
WHERE i.Visit = 1
GROUP BY
i.SupervisorID,
i.Visit
) AS o

• Edited by Tuesday, February 26, 2013 8:36 PM
• Proposed As Answer by Wednesday, February 27, 2013 10:26 AM
• Marked As Answer by Wednesday, February 27, 2013 4:35 PM
•