BinaryFormatter and byte[] - bug or feature?

Question
-
Hi,
I ran into an issue with the BinaryFormatter in PowerShell, and it took me a couple of hours to understand **why**, as I never saw it happen in C#!
I am not really sure if this is the expected behavior, so I am trying to figure out if this is a feature or a bug!
Here it is...
For this example, I used a text file, called aaa, with 10K+ bytes. And yes, in this example, size does matter!
Also, serializing a file is not the goal; it is just an easy way to get a sizable array of data of different data types.
PS C:\temp> dir aaa

    Directory: C:\temp

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       2017-10-20     16:40          13542 aaa
Loading and serializing the file looks trivial:
PS C:\temp> $x=gc .\aaa
PS C:\temp> $x.GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Object[]                                 System.Array

PS C:\temp> $x[0].GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     String                                   System.Object

PS C:\temp> $x.Length
33
PS C:\temp> $ms=New-Object System.IO.MemoryStream; try{ date; (New-Object System.Runtime.Serialization.Formatters.Binary.BinaryFormatter).Serialize( $ms, $x); date; $ms.Length }Finally{ $ms.Close() }
Monday, Nov 20, 2017 15:01:45
Monday, Nov 20, 2017 15:01:45
93333
So, the serialization of the object[] happens in an instant; no issues so far.
However, when I import the file as a byte[]:
PS C:\temp> $x=gc .\aaa -Encoding Byte
PS C:\temp> $x.Length
13542
PS C:\temp> $x.GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Object[]                                 System.Array

PS C:\temp> $ms=New-Object System.IO.MemoryStream; try{ date; (New-Object System.Runtime.Serialization.Formatters.Binary.BinaryFormatter).Serialize( $ms, $x); date; $ms.Length }Finally{ $ms.Close() }
Monday, Nov 20, 2017 15:03:01
Monday, Nov 20, 2017 15:06:57
35557612
Not only is the serialized image 381 times bigger than the text one, it also takes almost 4 minutes to serialize!
If, however, I force the cast to byte[]:
PS C:\temp> $x=[byte[]]$x
PS C:\temp> $x.GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Byte[]                                   System.Array

PS C:\temp> $ms=New-Object System.IO.MemoryStream; try{ date; (New-Object System.Runtime.Serialization.Formatters.Binary.BinaryFormatter).Serialize( $ms, $x); date; $ms.Length }Finally{ $ms.Close() }
Monday, Nov 20, 2017 15:07:58
Monday, Nov 20, 2017 15:07:58
13570
It gives a more acceptable size and is done in an instant!
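The cast workaround above can be wrapped up roughly as follows. This is only a minimal sketch, assuming Windows PowerShell 5.1 (where `Get-Content -Encoding Byte` and BinaryFormatter are both available). The helper name `Get-SerializedLength` and the temporary file `aaa-sample.bin` are hypothetical, introduced here for illustration and not part of the original session.

```powershell
# Hypothetical helper (an assumption, not from the original post): serializes a
# value with BinaryFormatter into a MemoryStream and returns the stream length.
function Get-SerializedLength {
    param([Parameter(Mandatory = $true)] [object] $Value)

    $ms = New-Object System.IO.MemoryStream
    try {
        $formatter = New-Object System.Runtime.Serialization.Formatters.Binary.BinaryFormatter
        $formatter.Serialize($ms, $Value)
        $ms.Length
    }
    finally {
        $ms.Close()
    }
}

# Create a small sample file so the sketch is self-contained (hypothetical path).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'aaa-sample.bin'
[System.IO.File]::WriteAllBytes($path, [byte[]](0..255))

# Reading with -Encoding Byte yields an Object[], as the session above shows.
$raw = Get-Content $path -Encoding Byte

# Force the cast to a real byte[] before handing the data to the formatter.
$bytes = [byte[]]$raw

Get-SerializedLength -Value $bytes

# Clean up the sample file.
Remove-Item $path
```

On PowerShell 6 and later, `-Encoding Byte` is replaced by `-AsByteStream`, and BinaryFormatter is obsolete in modern .NET, so this sketch is aimed at Windows PowerShell only.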
I thought it might be due to boxing: in the first byte[] load, $x was an object[], so it is probably boxing each byte in an object. So I decided to box not a byte[] but an int[], expecting similar results. However...
PS C:\temp> $x=[object[]][int[]]$x
PS C:\temp> $x.GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Object[]                                 System.Array

PS C:\temp> $x[0].GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Int32                                    System.ValueType

PS C:\temp> $ms=New-Object System.IO.MemoryStream; try{ date; (New-Object System.Runtime.Serialization.Formatters.Binary.BinaryFormatter).Serialize( $ms, $x); date; $ms.Length }Finally{ $ms.Close() }
Monday, Nov 20, 2017 15:08:38
Monday, Nov 20, 2017 15:08:38
81279
This one looks fine!
I ran many more experiments, and it looks like whenever the data is a byte[], the system runs wild and the serialization takes forever.
I definitely need to force the cast to byte[] to get normal behavior. However, any other type (an array of any value type other than byte, or of a class) works as expected.
Am I missing something?
Best regards,
José
Monday, November 20, 2017 2:32 PM
All replies
-
PowerShell is not C# and cannot be made to work like C#.
To read a file as bytes we would do this:
$a = Get-Content <filename> -Raw
Now you have bytes and not an array of strings.
\_(ツ)_/
Monday, November 20, 2017 3:06 PM -
I don't pretend to completely understand your smart posts, but I wonder if PowerShell Core would work better for you. They just put out a release candidate. https://github.com/PowerShell/PowerShell/releases
- Edited by JS2010 Monday, November 20, 2017 3:55 PM
Monday, November 20, 2017 3:44 PM -
Hi jrv,
I know PS and C# are different, but this is about the way the language uses the .NET Framework. Also, the issue is not about the file read; as I stated before, that was only used to help simulate the issue.
I found this accidentally when serializing an abstract object: sometimes it would take no time, other times several minutes.
I found out that if you have an object that happens to be a byte[] but is presented as object, this will happen; however, with any other array of value types (int, short, double, whatever) it works correctly.
Also, the size of the MemoryStream in the different scenarios makes me believe that in this particular case, not sure why, it is boxing the values... but that is just a wild guess.
I am just checking if someone knows the reason for this strange behavior.
About the -Raw argument: it does not necessarily give you a byte[].
PS C:\temp> (gc .\aaa -Raw).gettype()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     String                                   System.Object

PS C:\temp> ([byte[]](gc .\aaa -Raw)).gettype()
Cannot convert value " ...
while
PS C:\temp> (gc .\aaa -Encoding Byte).gettype()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Object[]                                 System.Array

PS C:\temp> ([byte[]](gc .\aaa -Encoding Byte)).gettype()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Byte[]                                   System.Array
Regards,
José
- Edited by José M. Nobre Monday, November 20, 2017 5:52 PM
Monday, November 20, 2017 5:50 PM -
Hi JS2010,
I will take a look; however, it is an RC, so for production purposes it is still baking :)
Regards,
José
Monday, November 20, 2017 5:51 PM -
Hi,
This is a quick note to let you know that I am currently performing research on this issue and will get back to you as soon as possible. I appreciate your patience.
If you have any updates during this process, please feel free to let me know.
Best Regards,
Albert Ling
Please remember to mark the replies as answers if they help.
If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com
Tuesday, November 21, 2017 12:25 PM -
Hi Albert,
Thanks for your update; I'll wait for the results (for the time being, I bypassed it with a workaround).
Regards,
José
Tuesday, November 21, 2017 6:08 PM -
Hi Albert,
Did you get any feedback on this?
Regards,
José
Monday, December 4, 2017 5:33 PM