none
Weird pattern to normalize RRS feed

  • Question

  • Hi guys, 

    Please, I have a table with numbers like this:

    1000
    -1000.00
    1000.00
    1.000.00
    1000
    -1,000,00
    1,000,00

    Is there a way to normalize it?

    I've tried splitting in different ways, but anything is working.


    Monday, March 6, 2017 7:14 PM

Answers

  • Based on the example data I adjusted my code:

    1. Determine the number of digits after the last comma or dot (null if none).
    2. Determine the base divisor as 0 if digits = null or digits = 3 (then the dot or comma is regarded thousand separator), otherwise base divisor = digits.
    3. Divisor = Number.Power(10,digits).

    The remainder is the same as my previous solution.

    let
        Source = Excel.CurrentWorkbook(){[Name="Tabela1"]}[Content],
        #"Removed Other Columns" = Table.SelectColumns(Source,{"Original values"}),
        #"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Original values", "Valor"}}),
        AddedDigitsAfterLastDotOrComma = Table.AddColumn(#"Renamed Columns", "Digits", each Text.Length([Valor])-List.Max(Text.PositionOfAny([Valor],{",","."},Occurrence.All))-1),
        AddedBaseDivisor = Table.AddColumn(AddedDigitsAfterLastDotOrComma, "Divisor", each if [Digits] = null then 0 else if [Digits] = 3 then 0 else [Digits]),
        TransformedDivisor = Table.TransformColumns(AddedBaseDivisor,{{"Divisor", each Number.Power(10, _), type number}}),
        RemovedCommas = Table.ReplaceValue(TransformedDivisor,",","",Replacer.ReplaceText,{"Valor"}),
        RemovedDots = Table.ReplaceValue(RemovedCommas,".","",Replacer.ReplaceText,{"Valor"}),
        Typed = Table.TransformColumnTypes(RemovedDots,{{"Valor", Int64.Type}, {"Divisor", Int64.Type}}),
        AddedNormalized = Table.AddColumn(Typed, "Normalized", each [Valor] / [Divisor], type number)
    in
        AddedNormalized

    • Marked as answer by Felipe Vaz Wednesday, March 8, 2017 12:53 PM
    Tuesday, March 7, 2017 9:32 PM

All replies

  • Did you try Decimal.From(text column)?
    Monday, March 6, 2017 7:47 PM
  • Are these numbers from mixed locales? What number does 1.000.00 represent? Is it the same as 1,000,00 but in a different format? If so, what number does 1,000,00 represent?

    Perhaps even more to the point - what output do you want to see for each of the numbers provided?

    Monday, March 6, 2017 7:50 PM
  • Are these numbers from mixed locales? What number does 1.000.00 represent? Is it the same as 1,000,00 but in a different format? If so, what number does 1,000,00 represent?

    Perhaps even more to the point - what output do you want to see for each of the numbers provided?

    Hi Colin,

    They represent Brazil currency. Instead of comma, brazillian currency uses dots for thousand and comma for decimals.

    I cannot control the pattern they are filled, so is there a way to normalize it?

    Tuesday, March 7, 2017 1:07 PM
  • Did you try Decimal.From(text column)?

    Hi Gerald,

    I've tried 

    = try Number.FromText([Valor]) otherwise Decimal.From([Valor])

    It does not work, this is getting rid of the decimals ("." or ","), like:

    17862.37, in brazillian currency should be 17862,37, but that formula are resulting in 1786237 (get rid of the ".")

    Image here: https://1drv.ms/f/s!As8YC7wZr8RKirQMsYegvqYLhxcuWw

    Tuesday, March 7, 2017 1:32 PM
  • My suggestion would be:

    1. Determine a divisor as 100 if there is a dot or a comma at the third last position, otherwise 1.

    2. Remove all dots and commas from the values.

    3. Make the value numeric and divide by the divisor.

    Note this requires any minus sign to be in front of the value.

    let
        Source = Excel.CurrentWorkbook(){[Name="Input"]}[Content],
        AddedDivisor = Table.AddColumn(Source, "Divisor", each if Text.PositionOf(",.",Text.At([Valor],Text.Length([Valor])-3)) >= 0 then 100 else 1),
        RemovedCommas = Table.ReplaceValue(AddedDivisor,",","",Replacer.ReplaceText,{"Valor"}),
        RemovedDots = Table.ReplaceValue(RemovedCommas,".","",Replacer.ReplaceText,{"Valor"}),
        Typed = Table.TransformColumnTypes(RemovedDots,{{"Valor", Int64.Type}, {"Divisor", Int64.Type}}),
        AddedNormalized = Table.AddColumn(Typed, "Normalized", each [Valor] / [Divisor], type number),
        RemovedColumns = Table.RemoveColumns(AddedNormalized,{"Valor", "Divisor"})
    in
        RemovedColumns

    Tuesday, March 7, 2017 2:13 PM
  • Are these numbers from mixed locales? What number does 1.000.00 represent? Is it the same as 1,000,00 but in a different format? If so, what number does 1,000,00 represent?

    Perhaps even more to the point - what output do you want to see for each of the numbers provided?

    Hi Colin,

    They represent Brazil currency. Instead of comma, brazillian currency uses dots for thousand and comma for decimals.

    I cannot control the pattern they are filled, so is there a way to normalize it?

    A simple question. What does a number with two decimal indicators mean (1,000,00). For currencies that use the period as a decimal indicator, that number would be 1.000.00, which is meaningless.

    Tuesday, March 7, 2017 2:25 PM
  • Are these numbers from mixed locales? What number does 1.000.00 represent? Is it the same as 1,000,00 but in a different format? If so, what number does 1,000,00 represent?

    Perhaps even more to the point - what output do you want to see for each of the numbers provided?

    Hi Colin,

    They represent Brazil currency. Instead of comma, brazillian currency uses dots for thousand and comma for decimals.

    I cannot control the pattern they are filled, so is there a way to normalize it?

    A simple question. What does a number with two decimal indicators mean (1,000,00). For currencies that use the period as a decimal indicator, that number would be 1.000.00, which is meaningless.

    1,000,00

    1000

    1.000.00

    1,000.00

    One thousand

    It should be here in brazil 1000,00 or 1.000,00


    Tuesday, March 7, 2017 2:28 PM
  • The simplest approach would be to convert to a number and format the number in the client tool (Excel, for example). This way, the numbers will be formatted based on regional setting, and you can use the numbers in calculations. If the decimal place is insignificant (you don't provide an example to suggest that it is), then one way to normalize would be:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,{{"Valor", type text}}),
        AddedCustom = Table.AddColumn(ChangedType, "Custom", each Number.FromText(List.Accumulate(List.Select(Text.SplitAny([Valor], ".,"), each _ <> "00"), "", (accum, curr)=> accum & curr)))
    in
        AddedCustom

    This code splits by text into a list based on the decimal or comma separator, removes the insignificant "00" digits, and reassembles the list into a number.

    Tuesday, March 7, 2017 4:19 PM
  • My suggestion would be:

    1. Determine a divisor as 100 if there is a dot or a comma at the third last position, otherwise 1.

    2. Remove all dots and commas from the values.

    3. Make the value numeric and divide by the divisor.

    Note this requires any minus sign to be in front of the value.

    let
        Source = Excel.CurrentWorkbook(){[Name="Input"]}[Content],
        AddedDivisor = Table.AddColumn(Source, "Divisor", each if Text.PositionOf(",.",Text.At([Valor],Text.Length([Valor])-3)) >= 0 then 100 else 1),
        RemovedCommas = Table.ReplaceValue(AddedDivisor,",","",Replacer.ReplaceText,{"Valor"}),
        RemovedDots = Table.ReplaceValue(RemovedCommas,".","",Replacer.ReplaceText,{"Valor"}),
        Typed = Table.TransformColumnTypes(RemovedDots,{{"Valor", Int64.Type}, {"Divisor", Int64.Type}}),
        AddedNormalized = Table.AddColumn(Typed, "Normalized", each [Valor] / [Divisor], type number),
        RemovedColumns = Table.RemoveColumns(AddedNormalized,{"Valor", "Divisor"})
    in
        RemovedColumns

    Hi,

    I implemented the code, but does not work. Almost there!!!! See row 24...

    Sample: https://1drv.ms/f/s!As8YC7wZr8RKirQMsYegvqYLhxcuWw

    Tuesday, March 7, 2017 5:10 PM
  • The simplest approach would be to convert to a number and format the number in the client tool (Excel, for example). This way, the numbers will be formatted based on regional setting, and you can use the numbers in calculations. If the decimal place is insignificant (you don't provide an example to suggest that it is), then one way to normalize would be:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,{{"Valor", type text}}),
        AddedCustom = Table.AddColumn(ChangedType, "Custom", each Number.FromText(List.Accumulate(List.Select(Text.SplitAny([Valor], ".,"), each _ <> "00"), "", (accum, curr)=> accum & curr)))
    in
        AddedCustom

    This code splits by text into a list based on the decimal or comma separator, removes the insignificant "00" digits, and reassembles the list into a number.

    Hi Colin, thanks!!

    Does not work, see the sample with your name in the link below

    Sample: https://1drv.ms/f/s!As8YC7wZr8RKirQMsYegvqYLhxcuWw

    PS: the decimals are needed, sorry I didnt say before!


    • Edited by Felipe Vaz Tuesday, March 7, 2017 5:20 PM I forgot some words
    Tuesday, March 7, 2017 5:19 PM
  • Try using the custom function below.  It will not work if the number to be normalized has 3 decimal digits, since the function will process that as an integer, but currency data usually comes with 2 or 4 decimal positions so it shouldn't be that bad.

    // fnNormalizeNumber
    ( WeirdNumber as text, optional DecimalDelimiter as nullable text ) as number => 
    let
    	Delimiter = 
    		if DecimalDelimiter is null then "." else DecimalDelimiter,
    	Trim = 
    		Text.Remove( WeirdNumber, " " ),
        Split = 
    		Text.SplitAny( Trim, ",." ),
        CountPortions = 
    		List.Count( Split ),
    	DecimalPosition = 
    		List.PositionOf( 
    			List.Transform( 
    				List.Skip( Split, 1 ), 
    				each Text.Length( _ ) <> 3 
    			), 
    			true 
    		) 
    		+ 1,
    	AddDelimiter = 
    		if CountPortions = 1 or DecimalPosition = 0 
    		then Split 
    		else List.InsertRange( Split, DecimalPosition, { Delimiter } ),
    	ReCombine = 
    		Number.FromText( Text.Combine( AddDelimiter ) )
    in
    	ReCombine

    I tested it using this query that uses your original sample data and some extra values:

    // NormalizeNumbers
    let
        YourTable = 
    		Table.FromList(
    			Text.Split( 
    "1.000.
    -1000.00 
    1000.00
     1.000.00
    -1,000,00
    1,000
    10000.0 0
    10 000
    10.000
    10.00
    10
    .00
    0
    55.50
    9.999.9999", 
    		        	"#(lf)" 
    			), 
    			Splitter.SplitByNothing(), 
    			{ "WeirdPattern" }
    		),
        Normalize = Table.TransformColumns( YourTable, { "WeirdPattern", fnNormalizeNumber, type number } )
    in
        Normalize

    I hope it helps.

    Tuesday, March 7, 2017 6:32 PM
  • Based on the table provided in the link, I think that the simplest solution is based on suggestions from my previous post plus a new query:

    1) Check if there is a decimal in the third-to-last position of the value. If so replace with comma, otherwise leave value as is.
    2) Change the type of the new column to number and locale to Brazilian Portuguese ("pt-BR").

    let
        Source = Excel.CurrentWorkbook(){[Name="Tabela2_2"]}[Content],
        AddedCustom = Table.AddColumn(
                          Source, 
                          "Valor Normalizado", 
                          each 
                            try 
                               if Text.Middle([Valor], Text.Length([Valor]) -3, 1) = "." then 
                                  Text.ReplaceRange([Valor], Text.Length([Valor])-3, 1, ",") 
                               else [Valor] 
                            otherwise [Valor]),
        ChangedTypeWithLocale = Table.TransformColumnTypes(AddedCustom, {{"Valor Normalizado", type number}}, "pt-BR")
    in
        ChangedTypeWithLocale

    Assuming that your Windows Regional Settings are set for Brazil, in Excel you will see the correct decimal indicator for the values, but you may not see the thousands separator. Either change the column format to a number with the thousands separator, or use the Accounting format, whichever makes more sense for the values. You could send this data to anyone in the world, and it will show properly for his/her locale.

    Tuesday, March 7, 2017 7:17 PM
  • Based on the example data I adjusted my code:

    1. Determine the number of digits after the last comma or dot (null if none).
    2. Determine the base divisor as 0 if digits = null or digits = 3 (then the dot or comma is regarded thousand separator), otherwise base divisor = digits.
    3. Divisor = Number.Power(10,digits).

    The remainder is the same as my previous solution.

    let
        Source = Excel.CurrentWorkbook(){[Name="Tabela1"]}[Content],
        #"Removed Other Columns" = Table.SelectColumns(Source,{"Original values"}),
        #"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Original values", "Valor"}}),
        AddedDigitsAfterLastDotOrComma = Table.AddColumn(#"Renamed Columns", "Digits", each Text.Length([Valor])-List.Max(Text.PositionOfAny([Valor],{",","."},Occurrence.All))-1),
        AddedBaseDivisor = Table.AddColumn(AddedDigitsAfterLastDotOrComma, "Divisor", each if [Digits] = null then 0 else if [Digits] = 3 then 0 else [Digits]),
        TransformedDivisor = Table.TransformColumns(AddedBaseDivisor,{{"Divisor", each Number.Power(10, _), type number}}),
        RemovedCommas = Table.ReplaceValue(TransformedDivisor,",","",Replacer.ReplaceText,{"Valor"}),
        RemovedDots = Table.ReplaceValue(RemovedCommas,".","",Replacer.ReplaceText,{"Valor"}),
        Typed = Table.TransformColumnTypes(RemovedDots,{{"Valor", Int64.Type}, {"Divisor", Int64.Type}}),
        AddedNormalized = Table.AddColumn(Typed, "Normalized", each [Valor] / [Divisor], type number)
    in
        AddedNormalized

    • Marked as answer by Felipe Vaz Wednesday, March 8, 2017 12:53 PM
    Tuesday, March 7, 2017 9:32 PM
  • Based on the example data I adjusted my code:

    1. Determine the number of digits after the last comma or dot (null if none).
    2. Determine the base divisor as 0 if digits = null or digits = 3 (then the dot or comma is regarded thousand separator), otherwise base divisor = digits.
    3. Divisor = Number.Power(10,digits).

    The remainder is the same as my previous solution.

    let
        Source = Excel.CurrentWorkbook(){[Name="Tabela1"]}[Content],
        #"Removed Other Columns" = Table.SelectColumns(Source,{"Original values"}),
        #"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Original values", "Valor"}}),
        AddedDigitsAfterLastDotOrComma = Table.AddColumn(#"Renamed Columns", "Digits", each Text.Length([Valor])-List.Max(Text.PositionOfAny([Valor],{",","."},Occurrence.All))-1),
        AddedBaseDivisor = Table.AddColumn(AddedDigitsAfterLastDotOrComma, "Divisor", each if [Digits] = null then 0 else if [Digits] = 3 then 0 else [Digits]),
        TransformedDivisor = Table.TransformColumns(AddedBaseDivisor,{{"Divisor", each Number.Power(10, _), type number}}),
        RemovedCommas = Table.ReplaceValue(TransformedDivisor,",","",Replacer.ReplaceText,{"Valor"}),
        RemovedDots = Table.ReplaceValue(RemovedCommas,".","",Replacer.ReplaceText,{"Valor"}),
        Typed = Table.TransformColumnTypes(RemovedDots,{{"Valor", Int64.Type}, {"Divisor", Int64.Type}}),
        AddedNormalized = Table.AddColumn(Typed, "Normalized", each [Valor] / [Divisor], type number)
    in
        AddedNormalized

    WELL DONE!! \0/\0/\0/\0/

    Thanks, the code works perfectly. I am very thankful!

    This workaround was amazing, I learned a lot in this case!

    Regards,

    Felipe

    Wednesday, March 8, 2017 12:55 PM