Thursday, 30 August 2007

ISO: Standards For Sale - No Standards for Standards

It turns out that Microsoft has admitted, after being caught red-handed, to buying votes in Sweden for its proposed OOXML standard. See the Groklaw article for details.

Sweden has invalidated its vote, concluded that there isn't time to re-vote, and abstained. The net result is that a NO vote was turned into an ABSTAIN, netting MS half a point. If ISO approves this standard, I will lose all respect for ISO, as it will mean that there are no standards for standards.

Posted by spout at 3:22 PM in the internet, web, web 2.0 and beyond

Sunday, 26 August 2007

MS OOXML and ECMA 376 are a Sham

Microsoft, in response to recent pushes by government entities that want an open standard for office document formats, has documented OOXML and submitted it to ECMA and ISO as an office document format standard. Unfortunately, the proposed standard has been carefully crafted by Microsoft to provide the marketing buzzword benefit of having an approved standard without actually conferring any of the benefits of open standards. It does this in two ways: (A) the standard is too complex to implement from scratch, and (B) complying with the standard will not confer the benefit of interoperability with MS Office, which is the sole purported benefit of this standard beyond the already existing ODF standard.

I'll discuss mainly the second point. Let me be clear: MS Office is not, nor will it be, interoperable with a theoretical OOXML implementation. Here's the crux of the matter: the OOXML document submitted to the ECMA and ISO standards processes does not describe what Microsoft Office implements, nor will it ever. The submission has been carefully crafted to obscure this. The goal is to intentionally make a standard with the following properties:

  • The standard is impossible in practice to implement from scratch
  • The standard appears to specify the MS Office document format
  • The standard fails to specify MS Office document format
Microsoft is well seasoned at playing the standards game. They've learned from their battles over various web standards like HTML, CSS, and DOM that you can have all the benefits of being proprietary while appearing to conform to an open standard, if you "almost" conform to it. The first order model is conformity, while the second order model is intentional non-conformity. This is a brilliant tactic in politicized settings, since it allows people who want to claim conformity to do so. It allows MS to lobby successfully because they can demonstrate, to first order, compliance to officials who do not have the time or expertise to look deeper. When you pitch something you know to be false because you designed it to be false, I call it a sham.

A number of researchers have been documenting the discrepancies between OOXML and what MS Office actually uses. Stéphane Rodriguez is one such researcher. Here is my interpretation of some of his findings. I focus on three glaring examples explored by Mr. Rodriguez.
  1. Proprietary floating point operations. Excel stores numbers in its file format that differ from what was typed into the cell, transformed by unspecified proprietary floating point operations. For example, the proper way to express "12345.12345" in MS Office file formats can be verified to be <v>12345.123449999999</v>, which is not based on an open standard. If you enter <v>12345.12344</v>, Excel will not treat this as if you had entered "12345.12345" in the formula.
  2. VML. VML is a proprietary format for drawings. It is not specified by OOXML and is required by MS Office, as it is pervasive in Word, Excel and PowerPoint documents. MS calls it "deprecated" but uses it extensively.
  3. Proprietary Date formats. When you enter a date literal into a cell in Excel, a string representation of that date is serialized into the XML. Much like the case with floating point operations, the meaning of this string is defined by a proprietary, undisclosed standard.
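The floating point example above can be probed outside of Excel. Here is a minimal sketch, under an assumption of mine that nothing quoted here confirms: that the value in the file is what you get by holding the typed number in a 64-bit binary double and writing it out with 12 decimal places. The class and method names are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class StoredValueProbe {

    /** Renders a double's exact binary value, rounded to the given number of decimal places. */
    static String render(double value, int decimals) {
        // new BigDecimal(double) captures the exact binary value, not the typed literal.
        return new BigDecimal(value).setScale(decimals, RoundingMode.HALF_EVEN).toPlainString();
    }

    /** True when the serialized text parses back to the same double as the typed text. */
    static boolean sameNumber(String serialized, String typed) {
        return Double.parseDouble(serialized) == Double.parseDouble(typed);
    }

    public static void main(String[] args) {
        // Assumption: 12 decimal places, chosen to match the digits in the example above.
        System.out.println(render(12345.12345, 12));
        System.out.println(sameNumber("12345.123449999999", "12345.12345"));
        System.out.println(sameNumber("12345.12344", "12345.12345"));
    }
}
```

If that assumption is right, a second implementation cannot take the digits in the file at face value. It has to reproduce the same parse-and-round behavior to recover what the user typed, and that behavior would need to be written down in the standard.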
These examples show that OOXML simply does not document what MS Office does. A key milestone for creating an open standard should be that at least two separate parties have constructed distinct implementations and demonstrated a working interchange of data. Calling something an open standard when this cannot possibly happen is a sham. OOXML is a sham. For more information, see the Grokdoc summary of OOXML objections and a compendium of objections from nooxml.org.

Posted by spout at 12:26 PM in the internet, web, web 2.0 and beyond

Friday, 24 August 2007

The Case Against Software Patents

A patent is a government-sanctioned monopoly that our Constitution authorizes the government to give to inventors. The exclusive rights given to patent holders are intended as a form of quid pro quo, inducing inventors to disclose to the public the details of their novel creations. The purpose of the limited-duration monopoly is to benefit the public through greater access to inventions which otherwise might be kept secret. In many cases, patents live up to this ideal, and are a wise investment by the public. Software patents are a glaring failure, for a variety of reasons which I wish to discuss.

A monopoly is generally considered a bad thing, and to the extent that a patent is wrongly granted in an individual instance or to a class of creations that do not warrant it, the harm to the public is severe. So, I present the case against software patents in these terms. There is a vast body of legal praxis in the area of patents, and I should note that I am not a lawyer, but I emphatically reject the notion that one must be trained in the profession to weigh in. In fact, I don't think lawyers have really added much to the discussion of whether software patents are a good thing or not for society. It is primarily a political question, and only slightly a legal one. The constitutional purpose of patents is abundantly straightforward, and software patents are not a net win by that measure. Oddly, it looks to me like the legal questions have actually been decided by the Supreme Court in the correct way, and have gotten muddled and contorted by the Federal Circuit.

Software patents have flaws that can be classified along three lines:

  1. Foundations. Software is an expression of ideas, not an invention. Patents should not apply.
  2. Utility. A software patent grant does not, cannot, and will never be capable of achieving the public benefit necessary to justify it. In the arena of software, patents stifle innovation more than they reward it.
  3. Practicality. The patent process cannot make proper determinations with regard to the standards for patentability within the arena of software patents, nor is it likely to be able to within reasonable budgetary bounds.
Let's review these in detail.

Foundations. We start with a look at the history of patents in software. Many people are rather surprised to learn that standing Supreme Court precedent is that "for use in programming conventional general-purpose digital computers ... a series of mathematical calculations or mental steps and does not constitute a patentable 'process' within the meaning of the Patent Act". Gottschalk v. Benson, 409 U.S. 63 (1972). It's really difficult to put it any plainer than that.

Eventually, the Supreme Court upheld a patent that happened to use some software as part of the invention. This case was Diamond v. Diehr, 450 U.S. 175 (1981). James Diehr invented a process for molding and curing rubber that happened to use a computer to achieve real time solutions to equations to control the heating process. The physical/chemical process was a novel invention and the Court ruled the necessity of using software to implement it was not a disqualification for the patent.

These two cases should tell us everything we need to know about when a patent can contain software. Most software is not part of a process where the process itself is a novel invention. Unfortunately, the Federal Circuit Court of Appeals has completely destroyed the common sense approach by filling in the area between the two Supreme Court precedents with blathering nonsense. The unique construction of the Court system for patents contributes to the failure to fix the mistakes. For most judicial matters, we have multiple circuit courts that pass judgements, and if one court does something strange (often it's the Ninth Circuit) you'll get a split between Circuits and the Supreme Court can resolve the split. From a legal quality view, this is a much more reliable system. Unfortunately, we've bought into this bogus idea that "patents are special" and we've created a single point of failure in the Federal Circuit, and it has failed by upholding software patents despite well-reasoned guidance from above that should form the basis for rejecting most of them.

As described in Gottschalk, you don't invent mathematics or algorithms. Software is predominantly an expression of an algorithm. The guy who invents computer parts or a new combination of them deserves a patent. People who twiddle the bits on a machine invented by someone else deserve only a copyright for their pattern of twiddled bits. Only in the rare circumstance where the algorithm is used to control a physical device, generally by interacting with a hardware controller of some kind, is there an actual physical component present that might qualify for a patent.

As Thomas Jefferson put it, "it is the invention of the machine itself, which is to give a patent right, and not the application of it to any particular purpose" (letter to Isaac McPherson, Monticello, 1813). He's saying the machine, taken as a whole, must be something new. If you invent a new way to use a hardware graphics card in a computer, or some other part of it that controls something physical, you might meet this standard of having a new machine. But taking the same old PC and writing a program on it that makes it output something different simply is not an invention. It cannot be novel and it is not supposed to be patentable. The idea is so painfully obvious it's hard to understand how the entire legal patent profession has been so boneheaded about this over the last 20 years.

Utility. Software is a form of speech, and that's why it's protected by copyright. The cost of creating, reproducing, and distributing software is really quite low, once we obtain a computer. This is so true that one of the leading software development models, open source development, generally seeks no compensation other than attribution for the software license itself. Of course, many companies continue with proprietary licensing models, as is their right, but the point is there isn't much need to seek to catalyze innovation in the software market by granting patents.

To the extent it's important to secure to software creators the rights to recoup their invested hours, the just market rewards for the fruits of labor are completely protected by copyright. Since software is speech, the body of precedent for securing original speech to its author has a well-reasoned, vibrant, and enforceable set of laws backing it. Simply put, there is no lack of financial reward for software creators that needs patents to solve it.

Moreover, the public disclosure benefit motivating the patent grant really achieves nothing. Since software is expression, there is a vibrant body of free code floating around for the public to benefit from. Open source alone does more for public access to ideas than a software patent system ever could. Worse, since software is also copyrighted even when it's patented, you can't actually use the disclosed ideas in the executable form even when the patent expires, since this violates the copyright. We do not need double protection for software creators.

Practicality. Since the rules for granting copyrights are much more economical and the market demand for most software doesn't support it, most software authors choose not to seek patents. This systematically deprives the patent process of the very body of prior art that it needs to determine whether a new submission is novel. While the existence of prior art should be sufficient to have a Court invalidate a patent, in practice the cost of litigation and the cost of searching for prior art create economics where the decision to contest a wrongly granted patent is too expensive to justify.

Compounding the problem is the incompetence with which the USPTO searches for software prior art. I've seen examples where prior art can be found by typing the patent title into Google. The standard for obviousness has been polluted to such an extent that you could probably patent removing the fuzz from your navel near a computer and you would get a grant. That the obviousness of the Amazon 1-click patent is debated is a travesty. If it outrages people as obvious, it's obvious. Worse, since many of the software creations during the "golden age" of computing, the 1970s and 1980s, happened when it was believed that software was normally not patentable, there was no industry effort to document, or even preserve, prior art. Given this and the economics above, it is fundamentally impractical to demonstrate affordably that a work isn't novel.

This reality has several effects. Some people knowingly file "stupid patents" and rely on the fact that it isn't affordable for anyone to contest the patent. The stupid patent, once granted, will be used to extract fees that depend more on the cost of litigation than the value of the invention.

Large companies often become targets of such patent extortion. Very large companies typically seek to patent their software ideas in large volumes not because they want to exercise the right to exclusivity, but simply to have a defensive weapon. Such companies almost never engage each other in patent litigation, because it would become a form of mutual assured destruction. Unfortunately, patent holding companies seem to be the fatal flaw here. Patent litigation such as over the Blackberry or the Eolas claim against Internet Explorer has no adequate defense. It's disgusting to see companies whose sole reason to exist is patent extortion. It is only a matter of time until some well known and respected innovator is forced out of business by bogus patent claims. Software patents are not rewarding inventors, they are rewarding litigation specialists who produce nothing of value to the economy.

Wrap Up. In conclusion, software patents are foundationally wrong, cannot achieve any public benefit, and in practice drive economic waste and fail to reward anyone who deserves to be rewarded. Since the legal system has abandoned common sense, Congress should step in and pass legislation that makes software not patentable unless it's part of a physical machine which is, taken as a whole, novel.

Posted by spout at 7:05 PM in the internet, web, web 2.0 and beyond

Tuesday, 14 August 2007

Excluding Unfixables from Eclipse Problem View

I find myself using Eclipse for most of my development now. I still bounce back and forth to a text editor (I love jEdit) for writing XML and to an interactive scripting language environment, but for Java projects the Eclipse tool stack is really valuable.

The problem view in particular is extremely useful. It will show you errors and warnings throughout the project, and with one click take you right to the line where the problem is. I try very hard to refer to it a lot and to fix all the errors and all the warnings that it surfaces. Warnings are a little harder, and sometimes this means doing things like putting in @SuppressWarnings("unchecked") when mixing generics and pre-generics code. I also clean up my imports (Ctrl+Shift+O) a lot more. The reason I'm so aggressive with these is that if I keep the list short, then new things that pop onto it are much more visible, which helps me prevent them from becoming bigger problems.
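For the generics case, here is a minimal sketch of what I mean. The class and method names are hypothetical, and legacyNames() just stands in for some pre-generics library call. The idea is to keep the suppression on the smallest method that needs it, so the rest of the file still gets checked.

```java
import java.util.ArrayList;
import java.util.List;

public class LegacyBridge {

    /** Stand-in for pre-generics code: returns a raw List. (Hypothetical, for illustration.) */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List legacyNames() {
        List names = new ArrayList();
        names.add("alpha");
        names.add("beta");
        return names;
    }

    /** Generic-friendly wrapper; the suppression is confined to this one method. */
    @SuppressWarnings("unchecked")
    static List<String> names() {
        // Unchecked conversion from raw List to List<String>: safe only because
        // we know legacyNames() puts nothing but Strings in the list.
        return legacyNames();
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Callers of names() then see an ordinary List<String>, and the warning never reaches the problem view.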

One thing that makes this difficult is generated code, or imported 3rd party files. For a long time, I couldn't figure out how to exclude files or directories from the scope of problem detection and validation. It turns out this is extremely easy and can be refined all the way down to the file level.

You do this by using working sets. The trick is to define a similar working set for problem detection scope to the one you typically use under "Select Working Set". Do this via Window->Working Sets->Edit, or any other way to access working sets. You will leave most things from your selected working set in the problem detection working set, except for the few things you want to exclude. Then in the problem view, click "configure the filters to be applied to this view", and then click the select button under the "on working set:" radio button. You can then change to your problem detection scope working set and voila, unfixable things can be excluded by keeping them out of the working set.

Posted by spout at 1:04 PM in stuff about java

Sunday, 5 August 2007

Java Timezone Wrong on Fedora

How's this for annoying:
bash$ groovy -e "println new java.util.Date()"
Mon Aug 06 01:40:33 EDT 2007

bash$ date
Mon Aug  6 00:40:35 CDT 2007

bash$ /usr/sbin/zdump /etc/localtime
/etc/localtime  Mon Aug  6 00:41:45 2007 CDT

bash$ cat /etc/timezone
America/Chicago
This is on Fedora. Java seems to have decided I'm on the east coast, and stubbornly refuses to look at any of the many sources where it could determine that I'm in the central timezone.

To save you some trouble, here's the solution:

bash$ export TZ="US/Central"
bash$ groovy -e "println new java.util.Date()"
Mon Aug 06 00:49:02 CDT 2007
To make this automatic, I added the export TZ line to /etc/profile.d/java.sh, which is where I set my JAVA_HOME, ANT_HOME, GROOVY_HOME and so on.
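If you want to see what the JVM itself thinks before and after setting TZ, a small diagnostic along these lines may help. It is only a sketch: the class and helper names are made up, and it uses nothing beyond the standard java.util and java.text classes. It prints the TZ environment variable, the user.timezone system property, and the JVM default zone, then formats one fixed instant in two named zones so the difference is visible regardless of the machine's settings.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class TimeZoneCheck {

    /** Formats an instant the way Date.toString() does, but in an explicitly named zone. */
    static String formatIn(String zoneId, Date when) {
        SimpleDateFormat fmt = new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(when);
    }

    /** Builds a Date from UTC fields so the demo does not depend on the JVM default zone. */
    static Date utc(int year, int month, int day, int hour, int minute, int second) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.US);
        cal.clear();
        cal.set(year, month - 1, day, hour, minute, second); // Calendar months are zero-based
        return cal.getTime();
    }

    public static void main(String[] args) {
        System.out.println("TZ env var    : " + System.getenv("TZ"));
        System.out.println("user.timezone : " + System.getProperty("user.timezone"));
        System.out.println("JVM default   : " + TimeZone.getDefault().getID());

        Date when = utc(2007, 8, 6, 5, 49, 2);
        System.out.println(formatIn("America/Chicago", when));
        System.out.println(formatIn("US/Eastern", when));
    }
}
```

Passing -Duser.timezone=America/Chicago on the java command line should be another way to pin the zone for a single JVM, though I haven't tried it on this box.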

Posted by spout at 7:48 PM in stuff about java

Saturday, 4 August 2007

Particles

There's an interesting new health/safety issue that's surfaced. As covered in many places, it turns out that many office printers emit very small toner particles that some allege are a health risk, and printer makers defend as harmless. So is this a case of the big nasty Corporations trying to play dumb about the serious health side effects of their products, or is it sensationalist media outlets and plaintiffs' lawyers trying to create hysteria and get rich? Or both? Or neither? Or some shade of gray? It's impossible to know up front, but I guarantee you that the plaintiffs' lawyers will be massing to take a bite out of HP and other printer makers, who will defend themselves vigorously.

I'm surprised that this is new, actually. Somebody out there who works in a clean room must have asked about printer emissions. I recall that when I worked at Applied Materials, some types of printer were approved for use in the clean room. I also recall that smokers have particles all over their clothes that come off -- 3rd hand smoke, if you will.

I don't see any reason to assume without evidence that printer particles are any worse than other kinds of inert particles. Paper itself releases white particulates that are left over from the cutting process. Cells in your body, and especially in the lungs, have pretty reasonable defenses called macrophages against ordinary particles and other "debris". You'd have to go through something comparable to what a coal miner does over years of exposure to have problems.

Not all particles are created equal, of course. Asbestos is a notable example at the other extreme. The problem is that with large jury verdicts on the line over the outcome, the science inevitably becomes politicized, and appeals to emotions are exploited on both sides. The burden of scientific proof has to lie on whoever is trying to make the claim. Also, even if you prove the claim, you should have to prove the fix is better than the disease. DDT has undoubtedly harmed the health of hundreds or perhaps thousands of people, yet it was also a powerful weapon in combating mosquito-borne malaria, which has killed tens of millions of people. It isn't clear from a strict utilitarian viewpoint that we are safer without it. There are many ironies here: some people claim that had asbestos not been banned, then the World Trade Center towers, which used it up to the 40th floor, might have stood a little longer. 10 minutes might have saved several hundred more lives.

There's a problem with the human psyche that we are predisposed to take the side of the person at risk in front of us. Even greater nonsense has been levied in the name of safety for the anonymous masses. I'm surprised we haven't actually started bubble-wrapping children. When I was a little kid, there was a big push to wear seatbelts. Now it isn't enough to have kids ride in a car-seat, it has to be installed in the proper location within the car (away from those dangerous air-bags) and in the proper direction (facing backwards until 1 year and 24 pounds). Am I opposed to car seats? No, I use them every time. I just don't see how we ever as a society decide not to take the next step in the long chain that eventually leads to bubble-wrapping our kids 24/7. Hopefully this journey will take long enough that I'll die in the meantime from one of the officially safe ways to die.

We also have extreme difficulty understanding what is required to prove causality, and we often confuse it with correlation. Everybody together now: "correlation is not causality". Dow Corning found that out in their silicone breast implant litigation, which caused the company to fold under the weight of large verdicts that were sustained on appeal. How do we get a legal system that's capable of defeating this tactic by plaintiffs' lawyers? To prove that A causes B, find all the people that have A and B and litigate. Something like 1% of women have lupus. Something like 1% had silicone breast implants. So out of 100 million women of the appropriate age, you'd expect about 10 thousand to have both. The plaintiffs' lawyers found a couple hundred of those women and destroyed a company that had marketed a product with no provable safety problem. The accepted scientific opinion now is that there is no clear evidence of a causal link between silicone breast implants and systemic disease.
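The back-of-the-envelope number above is easy to check. Here is a tiny sketch, assuming the two conditions are statistically independent (which is exactly the "no causal link" hypothesis) and using my own rough 1% figures. The class and method names are made up.

```java
public class OverlapEstimate {

    /** Expected number of people with both traits if the traits are independent. */
    static long expectedWithBoth(long population, double rateA, double rateB) {
        // Under independence, P(A and B) = P(A) * P(B).
        return Math.round(population * rateA * rateB);
    }

    public static void main(String[] args) {
        // 100 million women, roughly 1% with lupus, roughly 1% with implants.
        System.out.println(expectedWithBoth(100_000_000L, 0.01, 0.01));
    }
}
```

That comes to 10,000 women with both by chance alone, so finding a couple hundred such plaintiffs proves nothing either way.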

What do we really know about printers and particles? Not much. I know that I no longer get the newspaper (which would tell me about bridges collapsing during ordinary use), partly because I find the ink occasionally discolors my hands, but mostly because it's a pain to have to continually throw them out. I do feel a little guilty, as now the volume of stuff I recycle is lower. :-] My advice to you: don't print this blog. Instead read it on the screen, where no one alleges there are any safety risks, or at least with all this distraction due to printer pollution, I've forgotten what they are. Absence of evidence is not evidence of absence. Absence of allegation, on the other hand, IS an allegation of absence.

I better stop writing, my eyes kind of hurt. Either that's because I had lasik a month ago and my eyes have contracted one of the several complications that arise in 0.2% of treatments, or it's because video screens actually are more risky than printers, or maybe my eyes are just dry. I hope there are no particles in my eye drops.

Posted by spout at 11:37 AM in the internet, web, web 2.0 and beyond

Thursday, 2 August 2007

Blog Comment Policy

I've chosen to allow comments, but only moderated comments. This comes after much consideration. Joel Spolsky, of Joel on Software, has a good restatement of Dave Winer's argument for why comments on blogs are not a good thing. I mostly agree, and I will set my moderation standard pretty high. I'll only pass through comments that I think add significantly to the conversation: things that would get a "(Score:5, Informative)" if they were Slashdot comments. Otherwise, I might respond by email. This all has the added benefit of reducing and simplifying my spam fighting tasks.

Sooner or later somebody will get ticked and complain that I censored them, blah, blah, blah. Yes, I censored you. I censored you from spouting your drivel using the web server I pay for because I deem your drivel to pollute my own. Don't like it? I don't care. Get your own blog. Feel free to send angry follow-up emails, so that I can stroke my delusions of self-importance when I ignore you dismissively.

Posted by spout at 5:15 PM in the internet, web, web 2.0 and beyond

SyntaxHighlighting CSS issue

It turns out there's a slight problem with the CSS from the syntax highlighting library I discussed in my previous post. More precisely, something in blojsom has a negative interaction with it. The issue is rather prominent if I turn off the "nocontrols" directive. Viz:
public interface EqualityHelper {
    
    public EqualityHelper forObject(Object baseObject);
    
    public boolean isEqual(Object other);
    
    public int getHash();

}
Notice how the green line isn't continuous. It should look like this, which is the same except not wrapped by blojsom. This is obviously some kind of CSS interaction issue, but I'm not enough of an uber web monkey to see the problem immediately.
Posted by spout at 4:24 PM in the internet, web, web 2.0 and beyond

Wednesday, 1 August 2007

SyntaxHighlighting for Code

I found a nice syntax highlighting JavaScript/CSS library. It's called SyntaxHighlighter, by Alex Gorbatchev. Here's an example of highlighting Java code:
public interface EqualityHelper {
    
    public EqualityHelper forObject(Object baseObject);
    
    public boolean isEqual(Object other);
    
    public int getHash();

}
It supports syntax coloring with different dialects. It looks very straightforward to create your own as well. Here's one that highlights XML, to show the code that created the above. In order to make this work, you've got to get some CSS and JavaScript entries into your HTML. There is a JavaScript file for each "brush", which highlights one kind of code. In Blojsom, I added these via its template management capabilities to the head and footer templates. Here are the relevant ones I've used on this page:

Posted by spout at 8:34 PM in the internet, web, web 2.0 and beyond