Unzipping duplicate files in separate folders

  • Question

  • Hi,

    I wonder if someone could help please,

    I have a task of unzipping over 5k zipped files. The following command does this; however, some of the zipped files have duplicate content. I would like to have all duplicate content unzipped into a separate folder. Can someone advise how this can be achieved?

    Thank you

    Rick

    Function ZipEverything($src, $dest)
    {
        Add-Type -AssemblyName System.IO.Compression.FileSystem
        $zps = Get-ChildItem $src -Filter *.zip

        foreach ($zp in $zps)
        {
            # Use FullName: "$src + $zp" depends on how the FileInfo object is
            # converted to a string, which differs between PowerShell versions
            $all = $zp.FullName
            [System.IO.Compression.ZipFile]::ExtractToDirectory($all, $dest)
        }
    }

    ZipEverything -src "C:\Support Team\Extract PS Script\Source\" -dest "U:\Support Team\Monthly Reports\Extract PS Script\Target\"

    # Open Target folder
    explorer "C:\Support Team\Extract PS Script\Target\"

    Wednesday, October 16, 2019 9:40 AM

Answers

  • First please fix your original post and post the code correctly using the code posting tool provided on the edit bar.

    Here is how to read a zip:

    [System.IO.Compression.ZipFile]::OpenRead('c:\scripts\scrips.zip').Entries.Name
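
OpenRead keeps the zip file open until the returned archive is disposed, which matters when looping over thousands of zips. A minimal sketch that lists the entries of one zip and then releases the file; the function name Get-ZipEntryName is made up for illustration, and the path should be a full path because .NET does not follow the PowerShell working directory:

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: returns the entry names stored in one zip file.
function Get-ZipEntryName {
    param([Parameter(Mandatory)][string]$Path)

    $archive = [System.IO.Compression.ZipFile]::OpenRead($Path)
    try {
        # FullName includes any folder path inside the archive; Name is only the leaf
        $archive.Entries | ForEach-Object { $_.FullName }
    }
    finally {
        # Release the file handle even if enumeration fails
        $archive.Dispose()
    }
}
```

Usage would look like `Get-ZipEntryName -Path 'c:\scripts\scrips.zip'`.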


    \_(ツ)_/

    Wednesday, October 16, 2019 10:35 AM

All replies

  • The code does nothing useful for your request. You will need to unzip everything then compare every file with every other file.

    There are programs that can do what you ask.  The code you have cannot do it, although you could generate the hash for each file and add it to a dictionary, then test each new file against the dictionary.  If the hash is already in the dictionary, send the file to a different folder.

    With your code this would have to be done against each batch of files.

    Please don't expect anyone here to write this script for you.  It is more than a dozen lines and would require access to your zips to test.


    \_(ツ)_/

    Wednesday, October 16, 2019 9:52 AM
  • See:
    help get-filehash -online
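
A rough sketch of the hash-and-dictionary idea from the earlier reply, using Get-FileHash. It assumes the zips have already been extracted into a staging folder; the function names and the three folder parameters are hypothetical, and it has not been run against the real archives:

```powershell
# Hypothetical helper: returns a path in $Folder that does not exist yet,
# appending " (1)", " (2)", ... to the file name when needed.
function Get-FreePath {
    param([string]$Folder, [string]$Name)
    $base = [System.IO.Path]::GetFileNameWithoutExtension($Name)
    $ext  = [System.IO.Path]::GetExtension($Name)
    $candidate = Join-Path $Folder $Name
    $i = 1
    while (Test-Path -LiteralPath $candidate) {
        $candidate = Join-Path $Folder ('{0} ({1}){2}' -f $base, $i, $ext)
        $i++
    }
    $candidate
}

# Hypothetical helper: the first file seen with a given content hash goes to
# $Unique, every later file with the same hash goes to $Duplicates.
function Move-DuplicateContent {
    param(
        [Parameter(Mandatory)][string]$Staging,
        [Parameter(Mandatory)][string]$Unique,
        [Parameter(Mandatory)][string]$Duplicates
    )

    $null = New-Item -ItemType Directory -Path $Unique, $Duplicates -Force
    $seen = @{}   # hash -> path of the first file with that content

    # Seed with files already in $Unique so each new batch is tested against earlier ones
    foreach ($f in Get-ChildItem -LiteralPath $Unique -File) {
        $seen[(Get-FileHash -LiteralPath $f.FullName -Algorithm SHA256).Hash] = $f.FullName
    }

    foreach ($file in Get-ChildItem -LiteralPath $Staging -File -Recurse) {
        $hash = (Get-FileHash -LiteralPath $file.FullName -Algorithm SHA256).Hash
        if ($seen.ContainsKey($hash)) {
            $target = Get-FreePath -Folder $Duplicates -Name $file.Name
        }
        else {
            $target = Get-FreePath -Folder $Unique -Name $file.Name
            $seen[$hash] = $target
        }
        Move-Item -LiteralPath $file.FullName -Destination $target
    }
}
```

Files are flattened into the two target folders, so a name clash between different contents is resolved by the numeric suffix rather than by overwriting.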


    \_(ツ)_/

    Wednesday, October 16, 2019 9:55 AM
  • I have been looking into try/catch. Could the catch block pick up the files that fail to extract and send them to the correct directory?

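    Function UnzipEverything($src, $dest)
    {
       Add-Type -AssemblyName System.IO.Compression.FileSystem
       $zps = Get-ChildItem $src -Filter *.zip

       foreach ($zp in $zps)
       {
           $all = $zp.FullName
           try {
               [System.IO.Compression.ZipFile]::ExtractToDirectory($all, $dest)
           }
           catch {
               # Extraction failed (for example, a file already exists in $dest):
               # extract this zip's entries into a "duplicate" subfolder instead.
               # Placing that folder under $dest is an assumption.
               $archive = [System.IO.Compression.ZipFile]::OpenRead($all)
               try {
                   foreach ($entry in $archive.Entries)
                   {
                       if (-not $entry.Name) { continue }   # skip directory entries
                       $destinationPath = [System.IO.Path]::GetFullPath([System.IO.Path]::Combine($dest, 'duplicate', $entry.FullName))
                       $null = New-Item -ItemType Directory -Path (Split-Path $destinationPath) -Force
                       [System.IO.Compression.ZipFileExtensions]::ExtractToFile($entry, $destinationPath, $true)
                   }
               }
               finally { $archive.Dispose() }
           }
       }
    }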

    Wednesday, October 16, 2019 10:03 AM
  • Are you talking about duplicate files or duplicate file names? Your post asked about duplicate files.

    To detect duplicate file names you could just list the names in each zip and test them.

    Try/Catch will abort the extraction of a zip on the first error.
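
The name check suggested above can be done without extracting anything. A minimal sketch, assuming all zips sit directly in one source folder; the function name Find-DuplicateZipEntryName and the output property names are invented for illustration:

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: reports entry names that occur more than once
# across the zip files in $Source.
function Find-DuplicateZipEntryName {
    param([Parameter(Mandatory)][string]$Source)

    $rows = foreach ($zip in Get-ChildItem -Path $Source -Filter *.zip -File) {
        $archive = [System.IO.Compression.ZipFile]::OpenRead($zip.FullName)
        try {
            foreach ($entry in $archive.Entries) {
                if ($entry.Name) {   # directory entries have an empty Name
                    [pscustomobject]@{ Zip = $zip.Name; Entry = $entry.FullName }
                }
            }
        }
        finally { $archive.Dispose() }
    }

    # Group-Object compares names case-insensitively by default
    $rows | Group-Object -Property Entry | Where-Object Count -gt 1 |
        ForEach-Object {
            [pscustomobject]@{ Entry = $_.Name; Zips = @($_.Group.Zip) }
        }
}
```

Each result names one colliding entry and the zips that contain it, which is enough to decide up front which zips need separate handling.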


    \_(ツ)_/

    Wednesday, October 16, 2019 10:30 AM
  • Hi,

    Was your issue resolved?

    If you resolved it using our solution, please "mark it as answer" to help other community members find the helpful reply quickly.

    If you resolve it using your own solution, please share your experience and solution here. It will be very beneficial for other community members who have similar questions.

    If not, please reply and tell us the current situation so that we can provide further help.

    Best Regards,

    Lee


    Friday, November 8, 2019 1:43 PM

  • 0) The final destination for the files is Folder B.

    1) Unzip the file to Folder A.

    2) Loop through Folder A and compare each file to the files in Folder B. If a match is found, move the duplicate to Folder C.

    3) When the loop in step 2 completes, move all remaining files from A to B.
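
The steps above could be sketched roughly as follows. This is an untested outline under two assumptions: "compare" is read as a file name match (swap in a Get-FileHash comparison to match on content), and files are flattened into B and C. The function name and parameter names are hypothetical:

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper following the steps above.
# $FolderA = scratch area, $FolderB = final destination, $FolderC = duplicates.
function Expand-ZipWithDuplicates {
    param(
        [Parameter(Mandatory)][string]$Source,
        [Parameter(Mandatory)][string]$FolderA,
        [Parameter(Mandatory)][string]$FolderB,
        [Parameter(Mandatory)][string]$FolderC
    )

    $null = New-Item -ItemType Directory -Path $FolderB, $FolderC -Force

    foreach ($zip in Get-ChildItem -Path $Source -Filter *.zip -File) {
        # Step 1: unzip into an empty scratch folder so extraction cannot collide
        if (Test-Path -LiteralPath $FolderA) {
            Remove-Item -LiteralPath $FolderA -Recurse -Force
        }
        [System.IO.Compression.ZipFile]::ExtractToDirectory($zip.FullName, $FolderA)

        foreach ($file in Get-ChildItem -LiteralPath $FolderA -File -Recurse) {
            if (Test-Path -LiteralPath (Join-Path $FolderB $file.Name)) {
                # Step 2: name already present in B, so treat it as a duplicate.
                # The zip name is prefixed so duplicates from different zips do not clash.
                $target = Join-Path $FolderC ('{0}_{1}' -f $zip.BaseName, $file.Name)
            }
            else {
                # Step 3: everything else moves on to B
                $target = Join-Path $FolderB $file.Name
            }
            Move-Item -LiteralPath $file.FullName -Destination $target
        }
    }
}
```

Pass full paths for all four folders, since the .NET zip methods resolve relative paths against the process working directory rather than the current PowerShell location. If one zip holds the same file name more than twice, the later moves into Folder C would report an error instead of overwriting.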

    • Proposed as answer by ComputerScott Friday, November 8, 2019 9:26 PM
    Friday, November 8, 2019 9:26 PM