2007+ Word XML language conflicts when interacting with other programs

  • Question

    Is there a conflict between the XML format used by Microsoft Office Word 2007 and 2010 and the formats used by a supplier who works in:
    - Open Office
    - Google Docs
    - Lotus Symphony
    - iWork

    For example, within our company we use Word 2007+ for all of our business contracts. The companies that we contract with sometimes use Open Office, Google Docs, etc. During the contracting process, files are transferred back and forth between us and them many times.

    My question is, what data, if any, is either changed or lost in the conversion process amongst these programs?  

    Thank you in advance for your help.

    Brent

    Monday, January 21, 2013 8:31 PM

Answers

  • Roughly speaking...

    1. For several years, there have been 2 main ISO standards for word processing documents. One is called the "Office Open XML" (OOXML) format, which has been the "native format" used by Microsoft Word since Word 2007. You might think of it as ".docx". The other is often called the Open Document Format (ODF) and has been the "native format" for Open Office for rather longer. You might see that as ".odt".

    2. I don't know which format Google Docs (Google Drive) uses internally, because you never get to see it. But it allows you to download OOXML, ODF and PDF, for example. I have no idea what Lotus Symphony supports now. The current version of iWork's "Pages" program uses its own XML-based format but can import/export OOXML and ODF among others.

    3. The two formats are both XML, but are completely different. In some cases, the differences are primarily to do with the way things are expressed in the XML. e.g., if you have a document with a piece of bold text, and save it in ODF format from Open Office, you can reasonably expect that Word 2010 (say) will open the file and display the same text as bold. In other cases, the problem is that the "feature sets" of the different word processors are different. For example, Word has a feature called Content Controls. The OOXML format provides for content controls. But as far as I am aware, Open Office does not "understand" content controls, and ODF does not have any way to specify that a document contains one. So if you save a document containing a content control from Word in ODF format, the feature will be lost when you save. But even if you save in OOXML format, when you open that file in Open Office, it will probably just display the text in the control, not the control itself. The same problem exists the other way around - Open Office has features that Word does not. Incidentally, not every modern version of Word has a complete implementation of content controls either - Mac Word 2011 does not, and the "Web" version of Word does not, either.

    4. Word has a feature that lets you define the default save format. Word also has a feature (a valuable one IMO) that tries to tell you what you will lose if you save using a given format. But I cannot tell you more precisely what is lost, whether because the format does not have support for a particular feature or because the different word processors do not. It's difficult to keep up! There may be some useful tables somewhere. What I would try to do is:

     a. keep things as simple as possible

     b. try to distinguish what is used by you and your collaborators as a "work format" from what is used as a "publication format". e.g., it doesn't really matter that you use complicated stuff such as Content Controls in Word if no-one else needs to edit it. But when you "release" your document, if you send it as a .docx and someone opens it in Open Office, they may not see your stuff. So it is perhaps better always to "release" or "publish" as PDF, for example

     c. try to agree at project startup whether any given project is "really" an OOXML project or "really" an ODF project (if either), probably depending on who you are working with or delivering to, and stick to that format through the document development process. i.e. if it's an "ODF" project, never save as .docx, always as .odt. That said, people used to understand why you had to agree on standards in this kind of area. I'm not sure a lot of them do any more.
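
    If you want a quick check before sending a file out, something along the following lines might help. It is only a rough sketch in Python, not anything Word or Open Office provides: the function names detect_format and count_content_controls are my own, and it relies on the assumptions that a .docx is a zip package whose main body lives in word/document.xml, that content controls appear there as w:sdt elements, and that an .odt package carries a "mimetype" entry. It only looks at the main document body, so controls in headers or footers would be missed.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml in a .docx package.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Value of the "mimetype" entry in an ODF text document (.odt).
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def detect_format(source):
    """Return "ooxml", "odf" or "unknown" for a path or binary file object.

    Hypothetical helper: it only inspects which entries the zip package has.
    """
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        if "word/document.xml" in names:
            return "ooxml"
        if "mimetype" in names:
            mimetype = package.read("mimetype").decode("ascii", "replace")
            if mimetype.strip() == ODT_MIMETYPE:
                return "odf"
    return "unknown"


def count_content_controls(source):
    """Count w:sdt elements (content controls) in the main body of a .docx.

    Hypothetical helper: headers, footers and footnotes are not scanned.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return sum(1 for _ in root.iter(f"{{{W_NS}}}sdt"))
```

    For an "ODF project" you could run detect_format over each incoming file and flag anything that is not "odf"; for a .docx that is about to go to an Open Office user, a non-zero count_content_controls result is a hint that some of the document may not survive the trip.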


    Peter Jamieson

    • Marked as answer by Max Meng Tuesday, January 29, 2013 9:50 AM
    Tuesday, January 22, 2013 11:41 AM