none
Is it possible to unzip a .xlsm file without corrupting it? RRS feed

  • Question

  • Hello.

    I have had some success at actioning the following sequence of steps over a .xlsx file:
    1) Rename it to .zip.
    2) Unzip it.
    3) Make changes across the unzipped XML files with a search and replace tool.
    4) Rezip it.
    5) Rename it back to .xlsx.
    6) Open it up in Excel, without loss of integrity (and with the replacements from the search and replace tool intact).

    Does anyone know if this is possible with a .xlsm file? If I eliminate step 3) (i.e., in order to round-trip without making any edits, just to prove that round-tripping works), Excel tells me that the file has been corrupted when I try to open it up.

    While on the subject, does anyone know of a tool or method of automating the above 6 steps, so that I can action it over all my Excel files?

    Many thanks.

    Keith

    Tuesday, March 1, 2011 6:58 PM

Answers

  • BTW: Install 7-zip if you want to do it manual
     
    Use the free zip program 7-Zip
    Why ? : Much easier, because you not have to change the extension and can edit the file in the zip.

    http://www.7-zip.org/

    After you install 7-Zip you can do this to change a xml file in it.

    1: Right click on your workbook icon on your desktop
    2: Choose 7-Zip > Open archive
    3: Right click on the file named “customUI.xml” in the “userCustomization” folder and choose "Edit"
        Or select the file and press F4
    4: Edit the file and save and close the “customUI.xml“ file
    5: Say YES to update the archive




     
     
     

    Regards Ron de Bruin
    http://www.rondebruin.nl/tips.htm


    "R. Keith Howard" wrote in message news:548e2842-b419-4657-bfcf-3dfa9d5a15be...

    Hello.

    I have had some success at actioning the following sequence of steps over a ..xlsx file:
    1) Rename it to .zip.
    2) Unzip it.
    3) Make changes across the unzipped XML files with a search and replace tool.
    4) Rezip it.
    5) Rename it back to .xlsx.
    6) Open it up in Excel, without loss of integrity (and with the replacements from the search and replace tool intact).

    Does anyone know if this is possible with a .xlsm file? If I eliminate step 3) (i.e., in order to round-trip without making any edits, just to prove that round-tripping works), Excel tells me that the file has been corrupted when I try to open it up.

    While on the subject, does anyone know of a tool or method of automating the above 6 steps, so that I can action it over all my Excel files?

    Many thanks.

    Keith

    • Marked as answer by Sally Tang Monday, March 14, 2011 1:59 AM
    Saturday, March 5, 2011 2:59 PM
  • The .xlsx and .xlsm files behave the same in this respect.  I suspect you had an error in your process.  In particular, make sure that no new files were created while making changes and that the changes were made properly.

    You can use many zip/unzip tools, including Java's jar tool (cross platform, client or server).

    Example:
    Start with a spreadsheet such as updateTest.xlsm.
    Create a subdirectory such as updateTest and cd to that.
    Run
      $ jar xvf ../updateTest.xlsm
    Then run
      $ jar cvfM ../updateTest01.xlsm *

    The above example leaves all the xml files in the subdirectory and the resulting updateTest01.xlsm opens in Excel without warnings.

    Note: the “M” option for jar is essential to avoid adding a manifest.  The “v” option is optional.

    It is possible to modify the xml files between the above two steps, but again, make sure there are no new files and make changes properly.  The file [Content_Types].xml tracks the expected files, so these and only these should be present.

    For changing the content, here is an excellent article: Editing Data in an Excel 2007 Open XML File with VBA (http://msdn.microsoft.com/en-us/library/dd819387(v=office.12).aspx)

    In summary, you can edit numbers in your xl/worksheets/sheetX.xml files, but strings are centralized in xl/sharedStrings.xml.  Cells with strings include an index into the sharedString.xml table.  Note that there is a bug in the above article: any new strings must be added to the end or else they will throw off the indexing for other references.

    As a hack, you can include dummy strings in your spreadsheet and then update sharedStrings.xml.

    Note that with the java.util.zip package, you can create the files and zip files programmatically.  This could be better for large numbers of updates.  See http://java.sun.com/developer/technicalArticles/Programming/compression/.  Other tools and approaches can be used as well.
    • Proposed as answer by MSDNCommenter Wednesday, July 27, 2011 8:47 PM
    • Marked as answer by R. Keith Howard Wednesday, August 3, 2011 12:38 PM
    Wednesday, July 27, 2011 8:43 PM
  • Hello Ron,

    After a couple of months of searching, I found the answer. There is a program called Funduc Replace Studio Professional which enables searching and replacing over all files in a path (including directories). It can search and replace within Office 2010 archive files (e.g., .docx and xlsx files).

    So I no longer need to write the code that I was envisaging when I first asked the question months ago.

    I recommend this product.

    Thanks for your help.

    Kind regards,

    Keith

    Saturday, August 6, 2011 5:19 PM

All replies

  • Hello Keith, If you leave out the unzipping and rezipping it will work for you. 1) Add .zip to the end of the XLSM file name 2) View components in Windows Explorer 3) Close Windows Explorer 4) Rename the file removing the .zip When you rezip it, it may not be compiled correctly which will give you the message. There are some 3rd party tools available. 1) Go to www.bing.com 2) Search 'Office XML editor' Best Regards, Chad
    Chad Mathiason
    Friday, March 4, 2011 5:45 PM
    Moderator
  • Hello Keith, If you leave out the unzipping and rezipping it will work for you. 1) Add .zip to the end of the XLSM file name 2) View components in Windows Explorer 3) Close Windows Explorer 4) Rename the file removing the .zip When you rezip it, it may not be compiled correctly which will give you the message. There are some 3rd party tools available. 1) Go to www.bing.com 2) Search 'Office XML editor' Best Regards, Chad
    Chad Mathiason
    • Marked as answer by Sally Tang Monday, March 14, 2011 1:58 AM
    • Unmarked as answer by R. Keith Howard Wednesday, August 3, 2011 12:18 PM
    Friday, March 4, 2011 5:45 PM
    Moderator
  • See the code in my QAT add-in, it will give you some ideas
     
    See this part on the page
     
    Copy a customized QAT from one workbook to another(Add-in)

     
     
     

    Regards Ron de Bruin
    http://www.rondebruin.nl/tips.htm


    "R. Keith Howard" wrote in message news:548e2842-b419-4657-bfcf-3dfa9d5a15be...

    Hello.

    I have had some success at actioning the following sequence of steps over a ..xlsx file:
    1) Rename it to .zip.
    2) Unzip it.
    3) Make changes across the unzipped XML files with a search and replace tool.
    4) Rezip it.
    5) Rename it back to .xlsx.
    6) Open it up in Excel, without loss of integrity (and with the replacements from the search and replace tool intact).

    Does anyone know if this is possible with a .xlsm file? If I eliminate step 3) (i.e., in order to round-trip without making any edits, just to prove that round-tripping works), Excel tells me that the file has been corrupted when I try to open it up.

    While on the subject, does anyone know of a tool or method of automating the above 6 steps, so that I can action it over all my Excel files?

    Many thanks.

    Keith

    Saturday, March 5, 2011 2:52 PM
  • BTW: Install 7-zip if you want to do it manual
     
    Use the free zip program 7-Zip
    Why ? : Much easier, because you not have to change the extension and can edit the file in the zip.

    http://www.7-zip.org/

    After you install 7-Zip you can do this to change a xml file in it.

    1: Right click on your workbook icon on your desktop
    2: Choose 7-Zip > Open archive
    3: Right click on the file named “customUI.xml” in the “userCustomization” folder and choose "Edit"
        Or select the file and press F4
    4: Edit the file and save and close the “customUI.xml“ file
    5: Say YES to update the archive




     
     
     

    Regards Ron de Bruin
    http://www.rondebruin.nl/tips.htm


    "R. Keith Howard" wrote in message news:548e2842-b419-4657-bfcf-3dfa9d5a15be...

    Hello.

    I have had some success at actioning the following sequence of steps over a ..xlsx file:
    1) Rename it to .zip.
    2) Unzip it.
    3) Make changes across the unzipped XML files with a search and replace tool.
    4) Rezip it.
    5) Rename it back to .xlsx.
    6) Open it up in Excel, without loss of integrity (and with the replacements from the search and replace tool intact).

    Does anyone know if this is possible with a .xlsm file? If I eliminate step 3) (i.e., in order to round-trip without making any edits, just to prove that round-tripping works), Excel tells me that the file has been corrupted when I try to open it up.

    While on the subject, does anyone know of a tool or method of automating the above 6 steps, so that I can action it over all my Excel files?

    Many thanks.

    Keith

    • Marked as answer by Sally Tang Monday, March 14, 2011 1:59 AM
    Saturday, March 5, 2011 2:59 PM
  • Hello Ron.

    Thanks for the reply.

    The problem is that I need to be able to execute bulk search and replace operations over all .xlsx or .docx files on the network in one go. I have an editor that can do this, but it operates over xml files. Therefore, I need to unzip all .xlsx files on the network, then search and replace over all the related .xml files, then rezip.

    As I understand it, 7 Zip cannot do this.

    See my problem?

    Kind regards,

    Keith

    Tuesday, April 5, 2011 6:22 PM
  • Hello Chad.

    Could I ask you to please look at Ron de Bruin's responses to my question below, and my reply to his response? I don't really understand your response, but maybe if you read my response to Ron, you will get a better picture of what I am trying to accomplish.

    Many thanks.

    Keith

    Tuesday, April 5, 2011 6:24 PM
  • See my other reply that have a link to my QAT add-in
    This add-in change stuff in the zip
     
    See the code in it
     

    Regards Ron de Bruin
    http://www.rondebruin.nl/tips.htm


    "R. Keith Howard" wrote in message news:7c36dc2f-ba0f-4d18-997a-43d5991f3a2b...

    Hello Ron.

    Thanks for the reply.

    The problem is that I need to be able to execute bulk search and replace operations over all .xlsx or .docx files on the network in one go. I have an editor that can do this, but it operates over xml files. Therefore, I need to unzip all .xlsx files on the network, then search and replace over all the related .xml files, then rezip.

    As I understand it, 7 Zip cannot do this.

    See my problem?

    Kind regards,

    Keith

    Tuesday, April 5, 2011 6:33 PM
  • Thanks again Ron.

    That's interesting, but, as I understand it, it does not solve my problem. I need to be able to execute search and replace operations over dozens or hundreds of .xlsx files in one bulk operation.

    Kind regards,

    Keith

    Tuesday, April 5, 2011 6:42 PM
  • You can create a loop through all your files
    In the add-in you have some example code so try it if it is working for you.
    If it works you can create a loop through all your files
     
    But maybe there is a better solution ???
     

    Regards Ron de Bruin
    http://www.rondebruin.nl/tips.htm


    "R. Keith Howard" wrote in message news:6e4914c1-73a4-44b3-a42b-ee0594e63662...

    Thanks again Ron.

    That's interesting, but, as I understand it, it does not solve my problem. I need to be able to execute search and replace operations over dozens or hundreds of .xlsx files in one bulk operation.

    Kind regards,

    Keith

    Tuesday, April 5, 2011 6:58 PM
  • Thanks Ron.

    I'll study the code and see if I can figure out how to implement that loop.

    Also, I agree with you; there is probably a better solution, but I am struggling to figure out where to find it or who to ask.

    Cheers,

    Keith

    Tuesday, April 5, 2011 7:18 PM
  • The .xlsx and .xlsm files behave the same in this respect.  I suspect you had an error in your process.  In particular, make sure that no new files were created while making changes and that the changes were made properly.

    You can use many zip/unzip tools, including Java's jar tool (cross platform, client or server).

    Example:
    Start with a spreadsheet such as updateTest.xlsm.
    Create a subdirectory such as updateTest and cd to that.
    Run
      $ jar xvf ../updateTest.xlsm
    Then run
      $ jar cvfM ../updateTest01.xlsm *

    The above example leaves all the xml files in the subdirectory and the resulting updateTest01.xlsm opens in Excel without warnings.

    Note: the “M” option for jar is essential to avoid adding a manifest.  The “v” option is optional.

    It is possible to modify the xml files between the above two steps, but again, make sure there are no new files and make changes properly.  The file [Content_Types].xml tracks the expected files, so these and only these should be present.

    For changing the content, here is an excellent article: Editing Data in an Excel 2007 Open XML File with VBA (http://msdn.microsoft.com/en-us/library/dd819387(v=office.12).aspx)

    In summary, you can edit numbers in your xl/worksheets/sheetX.xml files, but strings are centralized in xl/sharedStrings.xml.  Cells with strings include an index into the sharedString.xml table.  Note that there is a bug in the above article: any new strings must be added to the end or else they will throw off the indexing for other references.

    As a hack, you can include dummy strings in your spreadsheet and then update sharedStrings.xml.

    Note that with the java.util.zip package, you can create the files and zip files programmatically.  This could be better for large numbers of updates.  See http://java.sun.com/developer/technicalArticles/Programming/compression/.  Other tools and approaches can be used as well.
    • Proposed as answer by MSDNCommenter Wednesday, July 27, 2011 8:47 PM
    • Marked as answer by R. Keith Howard Wednesday, August 3, 2011 12:38 PM
    Wednesday, July 27, 2011 8:43 PM
  • Hello Ron,

    After a couple of months of searching, I found the answer. There is a program called Funduc Replace Studio Professional which enables searching and replacing over all files in a path (including directories). It can search and replace within Office 2010 archive files (e.g., .docx and xlsx files).

    So I no longer need to write the code that I was envisaging when I first asked the question months ago.

    I recommend this product.

    Thanks for your help.

    Kind regards,

    Keith

    Saturday, August 6, 2011 5:19 PM
  • This is an old question but still relevant if anyone is Googling this issue. The key to zipping/unzipping is to make sure you don't include the parent directory - make sure that the _rels, docProps, xl etc. are all added to the zip file directly. Props to StackOverflow for the tip: http://stackoverflow.com/questions/11082278/how-to-properly-assemble-a-valid-xlsx-file-from-its-internal-sub-components
    Wednesday, December 10, 2014 6:20 PM