none
Check for Anomalies after Performing Divsion RRS feed

  • Question

  • I am doing a custom column: Total Sales / Total Quantity (for 6 million rows of data.) to get a Unit Price column

    I want to add a data validation check so I can filter out anomalies from the Unit Price column (checking for Error, NaN, negative and positive infinity types). I know how to do these checks individually by adding calculated columns to check for each type, but I'd like to know if they can all be implemented together by adding code to the Advanced Editor.

    I've added what my Advanced Editor looks like below. I highlighted in bold the code where I did checks for Error and NaN Values by adding Custom Columns.

    let
        Source = Folder.Files("C:\Users\khizer.daar\Desktop\Combined Sell Out Files\Carrefour\2019"),
    TargetName = "Table10",
        
        FilteredOutHiddenFiles = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
        BinariesOnly = Table.SelectColumns(FilteredOutHiddenFiles, {"Content"}),
        ExtractedData = Table.TransformColumns(BinariesOnly,
            {"Content", each
                let
                    ExcelContent = Excel.Workbook(_, null, true),
                    SelectedTarget = Table.SelectRows(ExcelContent,
                        each ([Kind] = "Table") and ([Item] = TargetName)),
                    DataOnly = Table.SelectColumns(SelectedTarget, {"Data"})
                in
                    Table.ExpandTableColumn(DataOnly, "Data", Table.ColumnNames(DataOnly[Data]{0})),
                Table.Type
            }
        ),
        CombinedTables = Table.Combine(ExtractedData[Content]),
        #"Inserted Merged Column" = Table.AddColumn(CombinedTables, "Merged", each Text.Combine({Text.From([#"Total Quantity "], "en-US"), Text.From([Total Quantity], "en-US")}, ""), type text),
        #"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"Merged"}),
        #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Removed Columns", {{"Total Quantity", type text}, {"Total Quantity ", type text}}, "en-US"),{"Total Quantity", "Total Quantity "},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
        #"Renamed Columns" = Table.RenameColumns(#"Merged Columns",{{"Merged", "Total Quantity"}}),
        #"Merged Columns1" = Table.CombineColumns(Table.TransformColumnTypes(#"Renamed Columns", {{"Total Sales ", type text}, {"Total Sales", type text}}, "en-US"),{"Total Sales ", "Total Sales"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
        #"Renamed Columns1" = Table.RenameColumns(#"Merged Columns1",{{"Merged", "Total Sales"}}),
        #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns1",{"Year", "Week Start", "Week End", "Week", "Dept", "Dept Name", "Section", "Section Name", "Family", "Family name", "Sub Family", "Sub Family Name", "Brand Code", "Brand Principle", "Brand Name", "SupplierNo", "SupplierName", "Item Code", "Item Bar Code", "Item Name", "MKT Code", "MKT Name", "0002 - GAJM Qty Hypermarket", "0003 - GDER Qty Hypermarket", "0004 - GAIR Qty Hypermarket", "0005 - GRAK Qty Hypermarket", "0006 - GMAR Qty Hypermarket", "0007 - GJIM Qty Hypermarket", "0008 - GSHA Qty Hypermarket", "0009 - GSHI Qty Hypermarket", "0010 - GCEN Qty Hypermarket", "0011 - GMOE Qty Hypermarket", "0012 - GBWD Qty Hypermarket", "0013 - GSRK Qty Hypermarket", "0014 - GDAL Qty Hypermarket", "0015 - GMIR Qty Hypermarket", "0016 - GMDN Qty Hypermarket", "0017 - GBAN Qty Hypermarket", "0018 - GSFJ Qty Hypermarket", "0060 - GFCC Qty Hypermarket", "0061 - GDRF Qty Hypermarket", "0062 - GBJM Qty Hypermarket", "0069 - GMEA Qty Hypermarket", "0072 - GDFC Qty Hypermarket", "0073 - GWAF Qty Hypermarket", "0080 - GAIN Qty Hypermarket", "0098 - GGUR Qty Hypermarket", "7001 - GIBM Qty Hypermarket", "7002 - GYAS Qty Hypermarket", "7003 - GDRG Qty Hypermarket", "Total QTY Hypermarket", "Total Sales Hypermarket", "0806 - GXFAV Qty Supermarket", "0808 - GXIRS Qty Supermarket", "0810 - GXRML Qty Supermarket", "0812 - GXSIL Qty Supermarket", "0816 - GXRAN Qty Supermarket", "0817 - GXDTC Qty Supermarket", "0818 - GXZWY Qty Supermarket", "0819 - GXTHD Qty Supermarket", "0822 - GXSUN Qty Supermarket", "0825 - GXSPR Qty Supermarket", "0855 - GXOAS Qty Supermarket", "0857 - GXBUH Qty Supermarket", "0860 - GXFRA Qty Supermarket", "0861 - GXHDB Qty Supermarket", "0862 - GXPLD Qty Supermarket", "0863 - GXNHT Qty Supermarket", "0865 - GXMAC Qty Supermarket", "0866 - GXARM Qty Supermarket", "0868 - GXTCM Qty Supermarket", "0869 - GXFRD Qty Supermarket", "0870 - GXNTT Qty Supermarket", "0872 - GXUAQ Qty Supermarket", "0874 - GXBRJ Qty Supermarket", "0877 - GXQOZ Qty Supermarket", "0878 - GXDIA Qty Supermarket", "0881 - GXSLV Qty Supermarket", "0883 - GXWSL Qty Supermarket", "0884 - GXSOR Qty Supermarket", "0885 - GXSEF Qty Supermarket", "0886 - GXMTR Qty Supermarket", "0887 - GXRHA Qty Supermarket", "0889 - GXKHN Qty Supermarket", "0890 - GXMGB Qty Supermarket", "0891 - GXJUR Qty Supermarket", "0892 - GXMRJ Qty Supermarket", "0893 - GXDIP Qty Supermarket", "0894 - GXNAS Qty Supermarket", "0895 - GXTOR Qty Supermarket", "2110 - GCBUN Qty Supermarket", "2111 - GCIBS Qty Supermarket", "2112 - GCDNT Qty Supermarket", "2113 - GCHDA Qty Supermarket", "2114 - GCFLH Qty Supermarket", "2115 - GCDXT Qty Supermarket", "2116 - GXDJA Qty Supermarket", "2117 - GCSHG Qty Supermarket", "2118 - GXHIA Qty Supermarket", "2120 - UXUNI Qty Supermarket", "2121 - UXMKT Qty Supermarket", "2122 - UXBSH Qty Supermarket", "2123 - UXBJN Qty Supermarket", "2124 - UXMAR Qty Supermarket", "2125 - UXFAH Qty Supermarket", "2126 - UXSPT Qty Supermarket", "2127 - GCNOV Qty Supermarket", "2128 - GXBJK Qty Supermarket", "2130 - GCETR Qty Supermarket", "2132 - GXDMC Qty Supermarket", "2134 - GCBGT Qty Supermarket", "7100 - GXRCH Qty Supermarket", "7101 - GXTEC Qty Supermarket", "7102 - GCGAT Qty Supermarket", "7103 - GXKHA Qty Supermarket", "7104 - GXPAR Qty Supermarket", "7105 - GXJPK Qty Supermarket", "7106 - GXDIS Qty Supermarket", "7107 - GXGOR Qty Supermarket", "7108 - GXRAM Qty Supermarket", "7111 - GXBDR Qty Supermarket", "Total QTY Supermarket", "Total Sales Supermarket", "Total Quantity", "Total Sales", "2145 - GXSRY Qty Supermarket", "0064 - GTAL Qty Hypermarket", "0065 - GMSD Qty Hypermarket", "0824 - GXLOS Qty Supermarket", "0085 - GCLM Qty Hypermarket", "0809 - GXMGT Qty Supermarket"}),
        #"Filtered Rows" = Table.SelectRows(#"Reordered Columns", each ([Week] <> null)),
        #"Unpivoted Other Columns1" = Table.UnpivotOtherColumns(#"Filtered Rows", {"Total Sales", "Total Quantity", "Total Sales Supermarket", "Total QTY Supermarket", "Total Sales Hypermarket", "Total QTY Hypermarket", "MKT Name", "MKT Code", "Item Name", "Item Bar Code", "Item Code", "SupplierName", "SupplierNo", "Brand Name", "Brand Principle", "Brand Code", "Sub Family Name", "Sub Family", "Family name", "Family", "Section Name", "Section", "Dept Name", "Dept", "Week", "Week End", "Week Start", "Year"}, "Attribute", "Value"),
        #"Renamed Columns2" = Table.RenameColumns(#"Unpivoted Other Columns1",{{"Value", "Volume"}, {"Item Bar Code", "Item Barcode"}}),
        #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns2",{{"Item Barcode", Int64.Type}}),
        #"Renamed Columns3" = Table.RenameColumns(#"Changed Type",{{"Week Start", "Date"}}),
        #"Added Custom" = Table.AddColumn(#"Renamed Columns3", "Month", each Date.ToText([Date],"MMMM")),
        #"Reordered Columns1" = Table.ReorderColumns(#"Added Custom",{"Year", "Month", "Date", "Week End", "Week", "Dept", "Dept Name", "Section", "Section Name", "Family", "Family name", "Sub Family", "Sub Family Name", "Brand Code", "Brand Principle", "Brand Name", "SupplierNo", "SupplierName", "Item Code", "Item Barcode", "Item Name", "MKT Code", "MKT Name", "Total QTY Hypermarket", "Total Sales Hypermarket", "Total QTY Supermarket", "Total Sales Supermarket", "Total Quantity", "Total Sales", "Attribute", "Volume"}),
        #"Renamed Columns6" = Table.RenameColumns(#"Reordered Columns1",{{"Date", "Start Date"}, {"Week End", "End Date"}}),
        #"Split Column by Delimiter" = Table.SplitColumn(#"Renamed Columns6", "Attribute", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Attribute.1", "Attribute.2", "Attribute.3", "Attribute.4", "Attribute.5"}),
        #"Merged Columns2" = Table.CombineColumns(#"Split Column by Delimiter",{"Attribute.1", "Attribute.2", "Attribute.3"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Merged"),
        #"Removed Columns1" = Table.RemoveColumns(#"Merged Columns2",{"Attribute.4", "Attribute.5"}),
        #"Merged Queries" = Table.NestedJoin(#"Removed Columns1", {"Item Barcode"}, PGMasterFile, {"Item Barcode"}, "PGMasterFile", JoinKind.LeftOuter),
        #"Expanded PGMasterFile" = Table.ExpandTableColumn(#"Merged Queries", "PGMasterFile", {"01. Subsector", "02. Subcategory", "03. Manufacturer", "04. Brand", "05. Subbrand", "06. Form", "07. Segment", "08. Variant", "09. Size", "10. Packsize", "11. Basicsize", "12. ProductName", "Promotion"}, {"01. Subsector", "02. Subcategory", "03. Manufacturer", "04. Brand", "05. Subbrand", "06. Form", "07. Segment", "08. Variant", "09. Size", "10. Packsize", "11. Basicsize", "12. ProductName", "Promotion"}),
        #"Changed Type1" = Table.TransformColumnTypes(#"Expanded PGMasterFile",{{"Total Quantity", type number}, {"Total Sales", type number}}),
        #"Added Custom1" = Table.AddColumn(#"Changed Type1", "Custom", each try [Total Sales]/[Total Quantity] otherwise 0),
        #"Renamed Columns4" = Table.RenameColumns(#"Added Custom1",{{"Custom", "Unit Price"}}),
        #"Added Custom2" = Table.AddColumn(#"Renamed Columns4", "Custom", each (if Number.IsNaN([Unit Price]) then false else true)),
        #"Filtered Rows1" = Table.SelectRows(#"Added Custom2", each ([Custom] = true)),
        #"Inserted Multiplication" = Table.AddColumn(#"Filtered Rows1", "Multiplication", each [Unit Price] * [Volume], type number),
        #"Renamed Columns5" = Table.RenameColumns(#"Inserted Multiplication",{{"Multiplication", "Value"}}),
        #"Reordered Columns2" = Table.ReorderColumns(#"Renamed Columns5",{"Year", "Month", "Start Date", "End Date", "Week", "Dept", "Dept Name", "Section", "Section Name", "Family", "Family name", "Sub Family", "Sub Family Name", "Brand Code", "Brand Principle", "Brand Name", "SupplierNo", "SupplierName", "Item Code", "Item Barcode", "Item Name", "MKT Code", "MKT Name", "Total QTY Hypermarket", "Total Sales Hypermarket", "Total QTY Supermarket", "Total Sales Supermarket", "Total Quantity", "Total Sales", "Merged", "01. Subsector", "02. Subcategory", "03. Manufacturer", "04. Brand", "05. Subbrand", "06. Form", "07. Segment", "08. Variant", "09. Size", "10. Packsize", "11. Basicsize", "12. ProductName", "Promotion", "Unit Price", "Volume", "Value", "Custom"}),
        #"Renamed Columns7" = Table.RenameColumns(#"Reordered Columns2",{{"Merged", "Branch"}}),
        #"Merged Queries1" = Table.NestedJoin(#"Renamed Columns7", {"Branch"}, BranchMap, {"Branch"}, "BranchMap", JoinKind.LeftOuter),
        #"Expanded BranchMap" = Table.ExpandTableColumn(#"Merged Queries1", "BranchMap", {"Cristal Ball Store Key", "Branch", "Store Number", "Transmed Name", "Cristal Ball Name", "City", "Store Type", "Region", "Channel", "Profile", "ProfileGen"}, {"Cristal Ball Store Key", "Branch.1", "Store Number", "Transmed Name", "Cristal Ball Name", "City", "Store Type", "Region", "Channel", "Profile", "ProfileGen"}),
        #"Removed Columns2" = Table.RemoveColumns(#"Expanded BranchMap",{"Branch.1", "Dept", "Dept Name", "Section", "Section Name", "Family", "Family name", "Sub Family", "Sub Family Name", "Brand Code", "Brand Principle", "Brand Name", "SupplierNo", "Item Code", "MKT Code", "MKT Name", "Total QTY Hypermarket", "Total Sales Hypermarket", "Total QTY Supermarket", "Total Sales Supermarket"}),
        #"Renamed Columns8" = Table.RenameColumns(#"Removed Columns2",{{"Custom", "NaN Error Validation"}, {"Transmed Name", "Branch Name (TM)"}, {"Cristal Ball Name", "Branch Name (P&G)"}, {"Cristal Ball Store Key", "Store Key (P&G)"}, {"Store Number", "Store Number (TM)"}})
    in
        #"Renamed Columns8"



    Wednesday, December 18, 2019 7:12 AM

Answers

  • Colin gave the explanation 3 years ago: try otherwise doesn't catch error

    You can't use try because a value (NaN) is returned, and not an error. You have to test for the value like: if Number.IsNaN([miles] / [hour]) then null else [miles] / [hour].

    Also, any non-zero value of miles with a zero value for hour does not generate an error, but a different value, i.e. infinity.

    Then you can try:

    #"Filtered Rows1" = Table.SelectRows(
    	Table.AddColumn(
    		#"Changed Type1", 
    		"Unit Price", 
    		each 
    			try [Total Sales]/[Total Quantity] 
    			otherwise 0
    		),
    	each not Number.IsNaN([Unit Price]) and Number.Abs([Unit Price])<>#infinity
    )

    Wednesday, December 18, 2019 11:59 AM

All replies

  • You could replace these steps:

       #"Added Custom1" = Table.AddColumn(#"Changed Type1", "Custom", each try [Total Sales]/[Total Quantity] otherwise 0),
        #"Renamed Columns4" = Table.RenameColumns(#"Added Custom1",{{"Custom", "Unit Price"}}),
        #"Added Custom2" = Table.AddColumn(#"Renamed Columns4", "Custom", each (if Number.IsNaN([Unit Price]) then false else true)),
        #"Filtered Rows1" = Table.SelectRows(#"Added Custom2", each ([Custom] = true)),

    with this step:

        #"Filtered Rows1" = Table.SelectRows(Table.AddColumn(#"Changed Type1", "Unit Price", each try [Total Sales]/[Total Quantity] otherwise 0), each not Number.IsNaN([Unit Price]))

    Wednesday, December 18, 2019 7:28 AM
  • what would it look like if I also did the check for positive and negative infinity?
    Wednesday, December 18, 2019 7:32 AM
  • Colin gave the explanation 3 years ago: try otherwise doesn't catch error

    You can't use try because a value (NaN) is returned, and not an error. You have to test for the value like: if Number.IsNaN([miles] / [hour]) then null else [miles] / [hour].

    Also, any non-zero value of miles with a zero value for hour does not generate an error, but a different value, i.e. infinity.

    Then you can try:

    #"Filtered Rows1" = Table.SelectRows(
    	Table.AddColumn(
    		#"Changed Type1", 
    		"Unit Price", 
    		each 
    			try [Total Sales]/[Total Quantity] 
    			otherwise 0
    		),
    	each not Number.IsNaN([Unit Price]) and Number.Abs([Unit Price])<>#infinity
    )

    Wednesday, December 18, 2019 11:59 AM
  • Since you have 6 million rows, this approach may be better:
    First you remove the rows that will generate infinity or NaN or null and only then you make the division
    (also check Is "try ... otherwise" provides equal performance in comparison with checking the circumstanses leading to an error?)
    Still, I would keep the try ... otherwise,  just in case of anything else

    #"Filtered Rows1" = Table.AddColumn(
    	Table.SelectRows(
    		#"Changed Type1",
    		each 
    			not List.Contains({null,"",0},[Total Quantity]) and 
    			not List.Contains({null,""},[Total Sales])
    	)
    	"Unit Price", 
    	each try [Total Sales]/[Total Quantity] otherwise 0
    )


    • Edited by anthony34 Wednesday, December 18, 2019 1:07 PM
    Wednesday, December 18, 2019 1:01 PM