none
Agregating the increasing number of unique objects for each day? RRS feed

  • Question

  • I have a logfile that looks somewhat like below. I would like to list, for each day, How many unique IP-node there is in the list. Not per day, rather aggregated from the beginning. Has anyone else done this 

    Time IP Node Date
    5/4/2016 6:39 10.37.31.220 5/4/2016
    5/4/2016 6:39 10.37.31.220 5/4/2016
    5/4/2016 6:40 10.37.31.220 5/4/2016
    5/4/2016 8:02 164.4.92.226 5/4/2016
    5/4/2016 8:02 164.4.92.226 5/4/2016
    5/4/2016 4:23 164.4.92.226 5/4/2016
    5/4/2016 4:23 164.4.92.226 5/4/2016
    5/4/2016 2:55 164.4.244.91 5/4/2016
    5/3/2016 7:41 164.4.92.226 5/3/2016
    5/3/2016 4:01 164.4.92.226 5/3/2016
    5/3/2016 4:01 164.4.92.226 5/3/2016
    5/3/2016 7:47 164.4.190.224 5/3/2016
    5/3/2016 7:48 164.4.190.224 5/3/2016
    5/2/2016 4:29 10.37.29.192 5/2/2016
    5/2/2016 1:40 164.4.180.142 5/2/2016
    5/2/2016 2:11 164.4.180.142 5/2/2016
    5/2/2016 4:29 10.37.29.192 5/2/2016
    5/2/2016 7:26 138.103.66.117 5/2/2016
    5/2/2016 7:42 164.4.92.226 5/2/2016
    5/2/2016 7:44 164.4.92.226 5/2/2016

    It's quite easy to list unique IP nodes per day by using a pivot table and count distinct count of IP Nodes (if the data is within a data model). This would look like below: 

    Row Labels Distinct Count of IP Node
    5/2/2016 4
    5/3/2016 2
    5/4/2016 3
    Grand Total 7

    But I would like to list unique IP Nodes between 5/2/2016 and 5/3/2016, as well as between 5/2/2016 and 5/4/2016 as described below: 

    Row Labels Distinct Count of IP Node Unique IP Nodes from start
    5/2/2016 4 4
    5/3/2016 2 5
    5/4/2016 3 7
    Grand Total 7

    Any idea of how to do this, either using power query or just a pivot table





    • Edited by erapade Wednesday, May 4, 2016 2:38 PM
    Wednesday, May 4, 2016 2:00 PM

Answers

  • Hi @erapade

    Try this code

    let
        Source = Excel.CurrentWorkbook(){[Name="LogTbl"]}[Content],
        ChType = Table.TransformColumnTypes(Source,{{"Time", type datetime}, {"IP Node", type text}, {"Date", type date}}),
        DistinctPerDay = Table.Group(ChType, {"Date"}, {{"Distinct Count of IP Node", each List.Count(List.Distinct([IP Node])), type number}}),
        SortDateAsc = Table.Sort(DistinctPerDay,{{"Date", Order.Ascending}}),
        DistinctAllDays = Table.AddColumn(SortDateAsc, "Unique IP Nodes from start", (x) => List.Count(List.Distinct(Table.SelectRows(ChType, each [Date] <= x[Date])[IP Node])))
    in
        DistinctAllDays

    "LogTbl" is a table from your post above

    Regards

    • Marked as answer by erapade Wednesday, May 4, 2016 4:32 PM
    Wednesday, May 4, 2016 3:03 PM

All replies

  • Hi @erapade

    Try this code

    let
        Source = Excel.CurrentWorkbook(){[Name="LogTbl"]}[Content],
        ChType = Table.TransformColumnTypes(Source,{{"Time", type datetime}, {"IP Node", type text}, {"Date", type date}}),
        DistinctPerDay = Table.Group(ChType, {"Date"}, {{"Distinct Count of IP Node", each List.Count(List.Distinct([IP Node])), type number}}),
        SortDateAsc = Table.Sort(DistinctPerDay,{{"Date", Order.Ascending}}),
        DistinctAllDays = Table.AddColumn(SortDateAsc, "Unique IP Nodes from start", (x) => List.Count(List.Distinct(Table.SelectRows(ChType, each [Date] <= x[Date])[IP Node])))
    in
        DistinctAllDays

    "LogTbl" is a table from your post above

    Regards

    • Marked as answer by erapade Wednesday, May 4, 2016 4:32 PM
    Wednesday, May 4, 2016 3:03 PM
  • Thanks a lot Bill. You're my Hero for today (and the week end as well)

    Confirmed working

    Wednesday, May 4, 2016 4:32 PM