none
How to generate random number based on the row value RRS feed

  • Question

  • I'm trying to create a demo database by multiplying the data by a random number.

    I wanted to generate a column of random numbers between 0 and 1, but for each HeaderID, the random number would be the same. Because the data needs to be balanced after randomized.

    Here's a sample of the data, for each HeaderID, the DR and CR amount needs to net to 0.

    HEADERID     DR       CR
    16548 45133.03
    16548 45133.03
    16606 504.51
    16606 504.51
    17249 155270.4
    17249 32812.5
    17249 43750
    17249 43750
    17249 34957.92

    The Rand Column is the list I want to generate

    

    I'm not sure how to do this in PowerQuery. Also the volume of the data is large (~5m rows). I'm looking for an efficient way to do this. 

    If there are other better ways of doing this, please feel free to suggest!


    • Edited by qshng Wednesday, November 2, 2016 4:15 PM
    Wednesday, November 2, 2016 4:15 PM

Answers

  • Hi Imke :-)

    What do you think about this? ;-)  #"Added Costom" step has manually written, the rest steps are directly from UI

    (This is only first shot)

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"HEADERID", Int64.Type}, {"DR", type number}, {"CR", type number}}),
        #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
        #"Sorted Rows" = Table.Sort(#"Added Index",{{"HEADERID", Order.Ascending}, {"Index", Order.Ascending}}),
        #"Grouped Rows" = Table.Group(#"Sorted Rows", {"HEADERID"}, {{"tbl", each _, type table}, {"minIDX", each List.Min([Index]), type number}}),
        #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", (x) => Table.AddColumn(x[tbl], "Rand", each if [Index]=x[minIDX] then Number.Random() else null )),
        #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
        #"Expanded {0}" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"HEADERID", "DR", "CR", "Index", "Rand"}, {"HEADERID", "DR", "CR", "Index", "Rand"}),
        #"Filled Down" = Table.FillDown(#"Expanded {0}",{"Rand"}),
        #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"})
    in
        #"Removed Columns"
    Best regards :-)

    Thursday, November 3, 2016 12:04 PM
  • Hi Bill :-)

    That looks nice!

    I've also experienced problems with the random number being "unrandomnized", like described here:

    https://social.technet.microsoft.com/Forums/en-US/81ca82f3-fa9f-4e49-935b-33b42b257e41/random-number-being-changed-to-the-same-number-for-all-rows?forum=powerquery

    So if you're running into that problem, an additional index after the random-number-generation would do the trick.


    Imke Feldmann TheBIccountant.com

    Thursday, November 3, 2016 1:37 PM
    Moderator

All replies

  • Problem with random numbers in PQ seems to be that they are going to be recalculated with every step, so it's difficult to carry them with you along the query steps.

    I'd recommend to create a lookup-table with your random-numbers that you export to csv using DAX studio and then reimport & join that to your tables.


    Imke Feldmann TheBIccountant.com

    Wednesday, November 2, 2016 7:22 PM
    Moderator
  • Hi Imke :-)

    What do you think about this? ;-)  #"Added Costom" step has manually written, the rest steps are directly from UI

    (This is only first shot)

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"HEADERID", Int64.Type}, {"DR", type number}, {"CR", type number}}),
        #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
        #"Sorted Rows" = Table.Sort(#"Added Index",{{"HEADERID", Order.Ascending}, {"Index", Order.Ascending}}),
        #"Grouped Rows" = Table.Group(#"Sorted Rows", {"HEADERID"}, {{"tbl", each _, type table}, {"minIDX", each List.Min([Index]), type number}}),
        #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", (x) => Table.AddColumn(x[tbl], "Rand", each if [Index]=x[minIDX] then Number.Random() else null )),
        #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
        #"Expanded {0}" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"HEADERID", "DR", "CR", "Index", "Rand"}, {"HEADERID", "DR", "CR", "Index", "Rand"}),
        #"Filled Down" = Table.FillDown(#"Expanded {0}",{"Rand"}),
        #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"})
    in
        #"Removed Columns"
    Best regards :-)

    Thursday, November 3, 2016 12:04 PM
  • Hi Bill :-)

    That looks nice!

    I've also experienced problems with the random number being "unrandomnized", like described here:

    https://social.technet.microsoft.com/Forums/en-US/81ca82f3-fa9f-4e49-935b-33b42b257e41/random-number-being-changed-to-the-same-number-for-all-rows?forum=powerquery

    So if you're running into that problem, an additional index after the random-number-generation would do the trick.


    Imke Feldmann TheBIccountant.com

    Thursday, November 3, 2016 1:37 PM
    Moderator