Thursday, April 3, 2008

The Cathedral and the Bizarre

Good post here. Its thesis: in the near future the software industry will look a lot like today's energy business, with its mix of regulated and unregulated businesses. The regulated side is analogous to open source, with its predictable, steady stream of income; the unregulated side is analogous to highly vertical, niche software products that are closed, closely guarded, and expensive.

Interestingly, the software industry as we know it grew up with unregulated businesses, and only in the last decade has open source created credible alternatives. I think the future looks similar to the energy industry: large technology companies will have a mix of regulated and unregulated businesses that maximizes the advantages of both. For standard, widely-used technologies, open source "regulation" makes sense because it lowers development costs and provides a predictable, standards-based subscription business. For niche and high-end software, companies will still expect a substantial return on their development cost, and will therefore protect that IP and sell it at a premium until competition makes that impossible. The most successful of these integrated companies will be careful not to exploit the community, and will be respected for being transparent about which parts of their business are "regulated" and which are not.

I certainly see this as the future of Microsoft, which is unlikely to go completely open any time soon.

Wednesday, April 2, 2008

A Public API and Policy Would Really Help

If you have ever worked with VMS or Windows at the API level you are familiar with the concept of a "public API" and the concept of an API lifecycle. A public API is one that consumer code can count on. Some code is "for internal use only" while other code is meant specifically for "public consumption." The public API is a contract with the outside world. An API lifecycle is a contract between the API and its consumers that provides a level of backward compatibility and stability: API calls never simply change or disappear. They are deprecated and then supported for some period of time before actually being removed. DEC had a really strong policy: DEC API calls would first be deprecated; after a number of releases they would be removed from the documentation but continue to be supported; and finally, after yet another number of releases, they would be removed from the source base. Customers were given plenty of warning and time to port their consumer code.
Example:
Windows NT has a low-level DLL called ntdll.dll. All of the actual system calls live in that library, but Microsoft made no commitment to anyone calling it directly that those calls would keep working from one patch or upgrade to the next. Developers were told specifically to avoid ntdll.dll and instead to rely on the WIN32 API. The ntdll.dll calls were at liberty to change as often as Microsoft saw fit, while the WIN32 library signatures had to stay consistent for customers. Microsoft then took on the responsibility of mapping the stable WIN32 calls onto the more volatile ntdll.dll signatures whenever an underlying call needed to change.
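
The same split is easy to express in Java. Here is a minimal sketch, with entirely hypothetical class and method names (nothing below is Alfresco's or Microsoft's actual API): a public facade whose signatures are the contract, delegating to an internal class that is free to change between releases.

    // Internal class: package-private, free to change between releases.
    // Consumers are told never to call it directly.
    final class InternalContentStore {

        // This signature has changed across releases and may change again.
        static byte[] fetch(String nodeId, boolean resolveLinks) {
            // ... volatile, implementation-specific logic ...
            return new byte[0];
        }
    }

    // Public facade: these signatures are the contract with the outside world.
    public final class ContentService {

        private ContentService() {
        }

        /** Public API: stable across releases. */
        public static byte[] readContent(String nodeId) {
            // Map the stable public call onto whatever the internal layer currently expects.
            return InternalContentStore.fetch(nodeId, true);
        }
    }

When the internal signature changes, only the facade's body is touched; every caller of readContent keeps working.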

Many open source projects do not follow this practice. Some do; I believe Apache Struts, for example, deprecates a call for two major releases and removes it on the third. As far as I know Alfresco does not have such a policy, but I really wish it did. Enterprise customers expect reliability. They demand that versions come out at a slower, more digestible pace, and I would bet they also expect some stability and policy around backward compatibility of the API.
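
A policy like that is cheap to spell out in the code itself. A sketch of what a deprecate-then-remove cycle might look like in Java (the class, method names, and version numbers are made up for illustration):

    import java.util.Collections;
    import java.util.List;

    public final class SearchService {

        private SearchService() {
        }

        /**
         * @deprecated Deprecated in 3.0 in favour of {@link #query(String, int)}.
         *             Still supported throughout 3.x; scheduled for removal in 4.0.
         */
        @Deprecated
        public static List<String> search(String luceneQuery) {
            // Keep the old entry point working by delegating to its replacement.
            return query(luceneQuery, Integer.MAX_VALUE);
        }

        /** Replacement introduced in 3.0. */
        public static List<String> query(String queryText, int maxResults) {
            // ... real implementation would go here ...
            return Collections.<String>emptyList();
        }
    }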

Today I installed an AMP into 2.9B (community) that is fully functional in 2.1E (enterprise) but does not work properly in 2.9B. The source of the problem is a class that was moved from one package to another. This sort of refactoring happens all the time in source code. It's really important and it has to happen -- but we need to maintain some backward compatibility at the same time.
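
The fix does not have to be heavyweight. One common approach, sketched below with hypothetical package names (not the actual Alfresco packages involved), is to leave a deprecated shim in the old package that simply extends the relocated class, so existing add-ons keep resolving for a release or two:

    // File: org/example/repo/content/ContentTransformer.java
    // The class in its new home after the refactoring.
    package org.example.repo.content;

    public class ContentTransformer {
        public String transform(String input) {
            return input.toUpperCase();
        }
    }

    // File: org/example/content/ContentTransformer.java
    // Deprecated shim left in the old package so existing AMPs still resolve the old name.
    package org.example.content;

    /**
     * @deprecated Moved to {@link org.example.repo.content.ContentTransformer};
     *             this shim will be removed after two more releases.
     */
    @Deprecated
    public class ContentTransformer extends org.example.repo.content.ContentTransformer {
    }

Callers compiled against the old package keep working, the deprecation warning tells them where to go, and the shim can be deleted on schedule.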

It may be a semi-valid criticism of this post that I am trying to install something that worked in the enterprise version but breaks in the community version, given that the community version is a sandbox / lab environment. I would, however, disagree. The labs release cannot be entirely free to evolve without any consideration for backward compatibility if we expect to build a community willing to contribute plug-ins, add-ons, and the like. At OSBC, many of the executives, when asked why their developers were taking from open source more than giving back, responded that it is simply too difficult to package and maintain contributions. That problem is only compounded if you cannot count on compatibility for some reasonable time frame.

There are downsides to maintaining backward compatibility. Backward compatibility code is cruft and can, over time, become dead weight on the system that impedes innovation. This can be managed, to some degree, by proper factoring and packaging of the code, along with policies that strike a balance: don't try to remain backward compatible for more than a reasonable duration. Give customers the ability to port their code forward over the course of one or two upgrades, but no more.

Stability and predictability go beyond release schedules. If we want to encourage development outside the walls of Alfresco we have to extend some stability to the source code as well. Developers want to know exactly which API calls are public. They also want fair warning when public calls change, and a sufficient amount of time to port their work.
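
One lightweight way to make that explicit is to mark the supported surface directly in the source. Alfresco has no such marker that I am aware of; the annotation below is purely a hypothetical sketch of the idea:

    package org.example.api;

    import java.lang.annotation.Documented;
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    /**
     * Marks a class or method as part of the public, supported API.
     * Anything without this annotation is internal and may change without notice.
     */
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    public @interface PublicAPI {

        /** The release in which this element became part of the contract. */
        String since();
    }

A marker like this also gives documentation tools and build checks something concrete to key off of when deciding what outside developers are allowed to depend on.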