Ben Langhinrichs


Genii Weblog

Civility in critiquing the ideas of others is no vice. Rudeness in defending your own ideas is no virtue.

Wed 12 Jul 2006, 06:37 PM
This post was inspired by a comment/question from Paul Ryan (#477.7) regarding an earlier post.  Paul says:
That said, honestly, I didn't find this topic particularly interesting, except to note that ODF consists entirely of XML, something that I hadn't quite registered before. Therefore, and as your example clearly demonstrates, ODF files are going to be very fat compared with proprietary formats like Notes rich-text or Microsoft formats.

Digressive musing...what XML really needs is a widely-used, maybe even compulsory, compression component to combat the bloat problem. Maybe there is something like that out there, and I'm just not aware of it.

As a lawyer might say, Paul, asked and answered.  Yes, XML is fairly heavy, and the way ODF is implemented is even heavier than it would need to be, although still not as heavy as Office Open XML (OOXML) seems to be.

So, the obvious answer is to compress the whole thing, which is just what both ODF and Office Open XML do, and even in almost exactly the same way.  When you see an ODF file such as ThisDoc.odt, you are really seeing a zipped repository with several files inside it.  Technically, it is even more specific than a zip file: it is a "JAR file", which is to say exactly the same format as a Java Archive package.  There are usually several files and subdirectories in such a package, although the only required files are the META-INF\manifest.xml and the content.xml file, which describe the contents of the JAR file and the content of the document, respectively.
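If you want to see this for yourself, here is a minimal Python sketch that opens an ODF file as the zip package it is and reports each entry with its size before and after compression. The function names are my own for illustration, and ThisDoc.odt is just the example name used above. Note that inside the package, entry names use forward slashes (META-INF/manifest.xml).

```python
import zipfile


def describe_package(path):
    """Return (name, uncompressed_bytes, compressed_bytes) for each entry."""
    with zipfile.ZipFile(path) as package:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in package.infolist()
        ]


def print_package(path):
    """Print one line per entry so the effect of compression is visible."""
    for name, raw, packed in describe_package(path):
        print(f"{name:40} {raw:>10} -> {packed:>10} bytes")


# Hypothetical usage, assuming ThisDoc.odt exists in the current directory:
# print_package("ThisDoc.odt")
```

On a typical text document you would expect content.xml to shrink considerably, since XML markup is very repetitive, but the exact ratio depends on the document.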

But is it any good knowing this?  Sure, it makes clear why ODF files are not as humungous as they might otherwise be, since zip compression does fairly well on repetitive XML text, but what else is it good for?

Well, for one thing, it is good for extracting images.  Unlike a Word .doc file or a Notes rich text field, if you want all the images included in an ODF file, you can simply rename the .odt to .zip and unzip the graphics files.  You can also alter the content.xml file by hand or with some other utility and re-zip it, so long as no encryption is set up on the JAR file.  This is like fiddling with DXL, except it is more reasonably structured.
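Both tricks can be scripted without renaming anything. The following is a rough sketch in Python, not a polished tool: the function names and the list of image extensions are my own assumptions, and it assumes an unencrypted package. Entries are copied in their original order with their original compression settings, which matters because the ODF package conventions expect the mimetype entry, when present, to come first and be stored uncompressed.

```python
import os
import zipfile

# Assumption: images are recognized purely by file extension.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".svg")


def extract_images(odt_path, out_dir):
    """Copy every image entry in the package into out_dir; return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with zipfile.ZipFile(odt_path) as package:
        for info in package.infolist():
            if info.filename.lower().endswith(IMAGE_EXTENSIONS):
                target = os.path.join(out_dir, os.path.basename(info.filename))
                with open(target, "wb") as handle:
                    handle.write(package.read(info))
                saved.append(target)
    return saved


def rewrite_content(src_path, dst_path, transform):
    """Write a copy of the package with content.xml passed through transform.

    transform takes the XML as a str and returns the new XML as a str.
    All other entries are copied unchanged, in the same order.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            data = src.read(info)
            if info.filename == "content.xml":
                data = transform(data.decode("utf-8")).encode("utf-8")
            # Reusing the ZipInfo keeps the entry name and compression type.
            dst.writestr(info, data)


# Hypothetical usage:
# extract_images("ThisDoc.odt", "images")
# rewrite_content("ThisDoc.odt", "Edited.odt",
#                 lambda xml: xml.replace("Bye!", "Cheers!"))
```

A plain string replace on content.xml is crude and can break the markup if the text you search for spans formatting tags, so for anything serious an XML parser would be the safer choice.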

So, for what it is worth, there you have it.

Copyright 2006 Genii Software Ltd.


Wed 12 Jul 2006, 11:05 AM
I'd never make a good spy, at least not in England, even if I ever could learn an English accent.  For you see, when I get to the end of a conversation with someone in England, I almost always remember to say "Cheers!" instead of "Bye!", but it just doesn't feel... finished.  So, inevitably, although I tell myself not to, I add a small, "Bye!" after the "Cheers!".  Arghh!!!!


Wed 12 Jul 2006, 12:30 AM
Well, it is a small thing, but as I reported on July 5th, Genii Software joined the ODF Alliance.  According to Bob Sutor, Google has decided to follow our lead (OK, my conclusion, not Bob's).  See his post.  It makes sense, given Google's acquisition of Writely.
