Sunday, 26 August 2007

MS OOXML and ECMA 376 are a Sham

Microsoft, in response to recent pushes by government entities that want an open standard for office document formats, has documented OOXML and submitted it to ECMA and ISO as an office document format standard. Unfortunately, the proposed standard has been carefully crafted by Microsoft to provide the marketing buzzword benefit of having an approved standard without actually conferring any of the benefits of open standards. It does this in two ways: (A) the standard is too complex to implement from scratch, and (B) complying with the standard will not confer interoperability with MS Office, which is the sole purported benefit of this standard over the already existing ODF standard.

I'll mainly discuss the second point. Let me be clear: MS Office is not, nor will it be, interoperable with a theoretical OOXML implementation. Here's the crux of the matter: the OOXML document submitted to the ECMA and ISO standards processes does not describe what Microsoft Office implements, nor will it ever. The submission has been carefully crafted to obscure this. The goal is to make a standard with the following properties:

  • The standard is impossible in practice to implement from scratch
  • The standard appears to specify the MS Office document format
  • The standard fails to specify MS Office document format
Microsoft is well seasoned at playing the standards game. They've learned from their battles over various web standards like HTML, CSS, and DOM that you can have all the benefits of being proprietary while appearing to conform to an open standard, if you "almost" conform to it. The first order model is conformity, while the second order model is intentional non-conformity. This is a brilliant tactic in politicized settings, since it allows people who want to claim conformity to do so. It allows MS to lobby successfully because they can demonstrate, to first order, compliance to officials who do not have the time or expertise to look deeper. When you pitch something you know to be false because you designed it to be false, I call it a sham.

A number of researchers have been documenting the discrepancies between OOXML and what MS Office actually uses. Stéphane Rodriguez is one such researcher. Here is my interpretation of some of his findings, focusing on three glaring examples explored by Mr. Rodriguez:
  1. Proprietary floating point operations. The number Excel stores in its file format differs from what was typed into the cell; it has been transformed by unspecified proprietary floating point operations. For example, the proper way to express "12345.12345" in MS Office file formats can be verified to be <v>12345.123449999999</v>, which is not based on an open standard. If you enter <v>12345.12344</v>, Excel will not treat this as if you had entered "12345.12345" in the formula.
  2. VML. VML is a proprietary format for drawings. OOXML does not specify it, yet MS Office requires it, as it is pervasive in Word, Excel and PowerPoint documents. MS calls it "deprecated" but uses it extensively.
  3. Proprietary date formats. When you enter a date literal into a cell in Excel, a string representation of that date is serialized into the XML. Much like the case with floating point operations, the meaning of this string is defined by a proprietary, undisclosed standard.
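
Readers who want to probe the number example in item 1 for themselves can start from a small sketch like the one below. It assumes Python floats are IEEE 754 binary64 doubles and formats them with 17 significant digits; whether Excel follows exactly this rule is an assumption of the sketch, not something established above. The helper names `seventeen_digit` and `same_double` are hypothetical and exist only for this illustration.

```python
def seventeen_digit(text: str) -> str:
    """Parse a decimal string as a binary64 double and render it
    with 17 significant digits (enough to round-trip any double)."""
    return format(float(text), ".17g")


def same_double(a: str, b: str) -> bool:
    """Return True if two decimal strings parse to the same double."""
    return float(a) == float(b)


typed = "12345.12345"          # what the user types into the cell
stored = "12345.123449999999"  # what item 1 reports in the <v> element

# Compare the stored string with a 17-digit rendering of the typed value.
print(seventeen_digit(typed))

# Do the typed and stored strings denote the same double?
print(same_double(typed, stored))

# Does the shorter string from item 1 denote the same double as the typed one?
print(same_double(typed, "12345.12344"))
```

If the first printed string equals the stored value, then the stored value is what a 17-digit rendering of the nearest double would produce, and anyone evaluating the objection in item 1 should take that into account. The check says nothing about the VML or date examples.
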
These examples show that OOXML simply does not document what MS Office does. A key milestone for creating an open standard should be that at least two separate parties have built distinct implementations and demonstrated a working interchange of data. Calling something an open standard when this cannot possibly happen is a sham. OOXML is a sham. For more information, see the grokdoc summary of OOXML objections and the compendium of objections from nooxml.org.

Posted by spout at 12:26 PM in the internet, web, web 2.0 and beyond