locked
Excel files containing embedded graphics fail to import into Power Query RRS feed

  • Question

  • Hello. We receive dozens of files from dozens of our business partners on a daily basis. Each of these files needs to be processed and consolidated into a daily revenue report.  Some of our business partners are sending us an Excel file that contains a graphic of their company logo embedded in the spreadsheet. This is preventing me from importing these files into PQ for processing with the following error:

    DataFormat.Error: The input couldn't be recognized as a valid Excel document.
    Details:
        Binary

    My temporary work-around is to have the end user open these files, then save them as .csv files.  However, this is time-consuming and we would prefer to allow the end user to simply save the files "as-is". 

    I've searched for a solution online but I have been unable to find one that works for our situation. I've been using PQ for over a year now, but I'm not a programmer so I am not familiar with advanced concepts such as writing custom M code or functions, etc. Any ideas or suggestions would be deeply appreciated!

    Tuesday, June 26, 2018 4:02 PM

Answers

  • It has nothing to do with the image. The styles element in this Excel file is malformed when specifying number formats. Here's what it looks like:

    <numFmts count="3">
      <numFmt numFmtId="164" formatCode="mm/dd/yyyy"/>
      <numFmt numFmtId="165" formatCode="_-* #,##0_-;-* #,##0_-;_-* &quot;-&quot;??_-;_-@_-"/>
      <numFmt numFmtId="166" formatCode="\$* #,##0.00"/>
      <numFmt numFmtId="165" formatCode="_-* #,##0_-;-* #,##0_-;_-* &quot;-&quot;??_-;_-@_-"/>
      <numFmt numFmtId="166" formatCode="\$* #,##0.00"/>
    </numFmts>

    Do you know what software this customer is using to write the XLSX file? It seems unlikely to be Excel.

    • Marked as answer by Rocket Dog Tuesday, July 3, 2018 12:51 PM
    Thursday, June 28, 2018 7:15 PM

All replies

  • I can't reproduce this with either XLS or XLSX format and would be surprised if it was the embedded image which caused the problem. Maybe it's something else about the file. Do you have a sample you can share with us?
    Wednesday, June 27, 2018 2:27 PM
  • Hi Curt.  Thanks for offering to help.  If I strip out the graphic and resave the file as any other format (.xls, xlsx, .csv) it imports fine. I tried posting a link to the file on DropBox but this site says I can't post links until my account is verified (no idea how I get "verified"?)


    • Edited by Rocket Dog Wednesday, June 27, 2018 4:10 PM
    Wednesday, June 27, 2018 4:09 PM
  • Try stringing this together into a URL to access the sample file I'm having trouble with:

    http:
    //
    bit.
    ly/
    2KqJ5Td

    Thursday, June 28, 2018 1:36 PM
  • It has nothing to do with the image. The styles element in this Excel file is malformed when specifying number formats. Here's what it looks like:

    <numFmts count="3">
      <numFmt numFmtId="164" formatCode="mm/dd/yyyy"/>
      <numFmt numFmtId="165" formatCode="_-* #,##0_-;-* #,##0_-;_-* &quot;-&quot;??_-;_-@_-"/>
      <numFmt numFmtId="166" formatCode="\$* #,##0.00"/>
      <numFmt numFmtId="165" formatCode="_-* #,##0_-;-* #,##0_-;_-* &quot;-&quot;??_-;_-@_-"/>
      <numFmt numFmtId="166" formatCode="\$* #,##0.00"/>
    </numFmts>

    Do you know what software this customer is using to write the XLSX file? It seems unlikely to be Excel.

    • Marked as answer by Rocket Dog Tuesday, July 3, 2018 12:51 PM
    Thursday, June 28, 2018 7:15 PM
  • I do not know how the file is being generated, but I can ask them.  I do know that when I delete the graphic and resave the file it loads into PQ just fine.  So maybe the process of resaving the file is correcting the malformed styles element?
    Thursday, June 28, 2018 8:40 PM
  • Yes. In general, Excel is fairly lenient about accepting files which aren't compliant with the OpenXML specification, and will correct them during save. It's much harder for us to do that because we don't have the same deep knowledge of the file format. If this kind of error is caused by fairly commonly-used software, we may be able to work around it.
    Thursday, June 28, 2018 9:04 PM
  • I have a request out to our partner to find out how they're generating these files. I'll post their response here as soon as I hear back from them. Thanks for your insight!

    Thursday, June 28, 2018 9:09 PM
  • I think I just confirmed your hypothesis.  I opened and resaved the file in the same .xlsx format without deleting the graphic, and I was able to import it into PQ.  So indeed there is something else causing the problem, which is likely the malformed styles element you detected. The act of resaving the file is restoring the style element, thus enabling me to import it into PQ. I'll be back with the answer from our partner on how they are generating these files once I hear from them.
    Thursday, June 28, 2018 9:20 PM