Set Encoding for selected files

  • Question

  • Hi There, 

    I am trying to write a script that searches folders and subfolders for files with a certain name and then sets their encoding to UTF-8 instead of UTF-8 with BOM. I can search for and find the files fine, but setting the encoding without deleting the content of the files is proving a challenge. I want to keep the content of each file as it is and change only the encoding. See the code below for my progress so far. I hope you can offer some guidance.

    $Files = Get-ChildItem -Recurse -Path 'D:\MyFolder\SubFolder' | Where {$_.name -eq 'MyFile.txt' } 

    Foreach($File in $Files)
    {
    Get-Content -Encoding Unknown | Set-Content -Encoding utf8 # This is the part that just does not work
    }


    Friday, June 21, 2019 4:07 PM

Answers

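  • In Windows PowerShell, -Encoding UTF8 really means UTF-8 with a BOM. For text that contains only 7-bit characters, ASCII output is byte-for-byte identical to UTF-8 with no BOM, so writing ASCII drops the BOM. Be aware that -Encoding ASCII replaces any non-ASCII character (accents, for example) with "?".

    (Get-Content $file) | Set-Content $file -Encoding ASCII

    Also note that your code, as posted, does nothing useful: neither Get-Content nor Set-Content is given a file path.

    This is the correct way to replace a file: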

    Get-ChildItem D:\MyFolder\SubFolder -Include myfile.txt -Recurse |
        ForEach-Object {
            $fi = $_.FullName
            # Parentheses read the whole file first, so it is closed before being overwritten.
            (Get-Content $fi) | Out-File $fi -Encoding ASCII
        }
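
    If the files might contain characters outside 7-bit ASCII, writing them as ASCII would turn those characters into "?". A minimal sketch of an alternative, assuming the files are currently UTF-8 and that going through the .NET System.Text.UTF8Encoding class is acceptable. The function name Convert-ToUtf8NoBom is made up for illustration and is not part of PowerShell. In PowerShell 6 and later, Set-Content -Encoding utf8NoBOM should achieve the same result directly.

```powershell
# Hypothetical helper (not a built-in cmdlet): rewrite a UTF-8 file without a BOM.
function Convert-ToUtf8NoBom {
    param([Parameter(Mandatory)][string]$Path)

    # ReadAllText detects a leading BOM and leaves it out of the returned string.
    $text = [System.IO.File]::ReadAllText($Path)

    # $false tells UTF8Encoding not to emit the EF BB BF preamble when writing.
    $utf8NoBom = New-Object System.Text.UTF8Encoding $false
    [System.IO.File]::WriteAllText($Path, $text, $utf8NoBom)
}

# Usage with the folder and file name from the question.
# Pass FullName because .NET methods do not follow PowerShell's current location.
# Get-ChildItem D:\MyFolder\SubFolder -Include MyFile.txt -Recurse |
#     ForEach-Object { Convert-ToUtf8NoBom -Path $_.FullName }
```

    Accented and other non-ASCII characters survive this round trip, which is the main reason to prefer it over -Encoding ASCII when the content is not known to be plain 7-bit text.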


    \_(ツ)_/



    • Edited by jrv Friday, June 21, 2019 4:57 PM
    • Marked as answer by Tottenham Tony Friday, June 21, 2019 5:43 PM
    Friday, June 21, 2019 4:55 PM

All replies

  • That's great, it did exactly what I wanted. Much appreciated.
    Friday, June 21, 2019 5:44 PM
  • Yes, utf8 and utf8-bom are the same thing. ASCII is like UTF-8, except there's no BOM and it can't handle special characters like accents. You can check the encoding of a file in Notepad while doing a Save As.
    Friday, June 21, 2019 8:01 PM
  • Yes, utf8 and utf8-bom are the same thing. ASCII is like UTF-8, except there's no BOM and it can't handle special characters like accents. You can check the encoding of a file in Notepad while doing a Save As.

    No. UTF-8 is without a BOM and UTF-8 BOM is with a BOM. Microsoft tools have long written UTF-8 with a BOM and simply never changed the name. For text limited to 7-bit characters, UTF-8 without a BOM is byte-identical to ASCII. UTF-8 BOM is not recommended by the IETF and is defined for completeness, among other things.

    To sort this out read the following: https://en.wikipedia.org/wiki/Byte_order_mark
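
    To make the relationship between ASCII and BOM-less UTF-8 concrete, here is a small sketch that compares the bytes each encoding produces. It only uses the .NET System.Text encoding classes. The variable names are illustrative, and the accented character is built with [char]0xE9 so the example does not depend on how the script file itself is encoded.

```powershell
# Compare how ASCII and BOM-less UTF-8 turn the same text into bytes.
$plain    = 'cafe'
$accented = 'caf' + [char]0xE9   # "café": one character outside 7-bit ASCII

$ascii = [System.Text.Encoding]::ASCII
$utf8  = New-Object System.Text.UTF8Encoding $false

# For 7-bit text the two byte sequences are the same.
$plainSame = ($ascii.GetBytes($plain) -join ',') -eq ($utf8.GetBytes($plain) -join ',')

# ASCII cannot represent the accented character and substitutes '?'.
$roundTrip = $ascii.GetString($ascii.GetBytes($accented))

# UTF-8 keeps the character, encoded as two bytes.
$utf8Length = $utf8.GetBytes($accented).Length

"Plain text identical: $plainSame"    # True
"ASCII round trip:     $roundTrip"    # caf?
"UTF-8 byte count:     $utf8Length"   # 5
```

    So the two only coincide while the text stays within 7-bit characters. Once accents or other non-ASCII characters appear, ASCII output loses data.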

    Just remember: when a system requires UTF-8 and the text is plain 7-bit, send it ASCII. When a system requires UTF-8 with a BOM, use UTF8 in Windows PowerShell to generate the BOM.
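
    To check which variant a given file actually is, one option is to look at its first three bytes. A rough sketch, assuming the UTF-8 BOM is the byte sequence EF BB BF. Test-Utf8Bom is a hypothetical helper name, not a built-in cmdlet.

```powershell
# Hypothetical helper: report whether a file starts with the UTF-8 BOM (EF BB BF).
function Test-Utf8Bom {
    param([Parameter(Mandatory)][string]$Path)

    # Reads the whole file for simplicity; fine for small text files.
    $bytes = [System.IO.File]::ReadAllBytes($Path)

    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}

# Usage with the path from the question (pass a full path):
# Test-Utf8Bom -Path 'D:\MyFolder\SubFolder\MyFile.txt'
```

    Running it before and after a conversion is a quick way to confirm that the BOM is really gone.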


    \_(ツ)_/

    Friday, June 21, 2019 9:26 PM