Versioning - The Next Big Thing

In the web development world, anyways. So, in the grand scheme of things, maybe not a huge deal to anyone else. Versioning is going to be one of the biggest problems and opportunities there is in web development, and it's going to take us at least five years to get it right.

Actually, let me admit up front that five years is a shot in the dark, and optimistic to boot. If people keep hanging out with bondage and discipline languages like Java and C# that are still catching up to language and framework developments from the '90s, it'll take us more like ten years. (Attention Lisp Weenies: Yes, I know you solved every problem forty years ago for certain values of "solved" and "problem" while the rest of us were getting work done.) Not only is versioning a difficult technical problem, it will be difficult to educate programmers in what it is, how it works, and why you'll wish you had used it about a year after you decided it was too much work.

I'm writing to help get the ball rolling on the process of solving this problem and publicizing it. So maybe it's about time I go into what the problem is instead of just yammering on, eh?

Because your website is served from your machines, you can update it every few days or hours. This is almost pedantically obvious, but it's a big change from having to press CDs, ship them to a store, and wait for them to go home with customers -- it's even a big change from posting a new version online for customers to download and install.

The web's faster pace means that updating, say, your database model to add or remove fields is a common occurrence rather than something your InstallShield wizard does every 1-2 years. It's vital that changes be streamlined and safeguarded. What kind of defaults get set when you add fields? How do you save data when you lose fields? How does the programmer make sure the Right Thing happens? If you have the fields Name and Address and want to combine them into a single MailingAddress field, you're not just deleting two columns and adding one: the existing data has to be carried over into the new field before the old ones can go away.
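To make the Name/Address example concrete, here is a minimal sketch of the data-carrying part of that change. It assumes SQLite (via Python's standard sqlite3 module), a hypothetical customer table, and a made-up rule for building the new value (name, newline, address); none of that comes from a real project, it's only there to show that the migration has to pick a default and handle missing data.

```python
import sqlite3

def merge_name_and_address(conn):
    """Add mailing_address and fill it from the existing name/address columns.

    Table and column names are hypothetical; the join rule is an assumption.
    """
    with conn:  # the with-block commits when the body succeeds
        # The default answers "what do new fields get set to?" for rows we miss.
        conn.execute(
            "ALTER TABLE customer ADD COLUMN mailing_address TEXT NOT NULL DEFAULT ''"
        )
        # Carry the old data over; coalesce/trim cope with missing values.
        conn.execute(
            "UPDATE customer SET mailing_address = "
            "trim(coalesce(name, '') || char(10) || coalesce(address, ''), char(10))"
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO customer (name, address) VALUES (?, ?)",
    [("Ada Lovelace", "12 St James's Square"), ("No Address", None)],
)
merge_name_and_address(conn)
rows = conn.execute("SELECT mailing_address FROM customer ORDER BY id").fetchall()
```

Note that the old columns are still there afterwards. Dropping them is a separate, riskier change.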

Not only is it complex to change your schema, getting those changes into production is not easy. The script that updates definitions and migrates data (which we tested somewhere other than production, right?) needs to placate the multiple web and database servers used for load-balancing. If you update the database server first, the application code on the web server shouldn't break because it didn't get the word about the new schema.

I see too many problems with trying to get multiple servers to update to a new version at the Exact Same Instant. I think schema changes are going to have to be a four-step process: a backwards-compatible first step applied to the database (add columns, loosen restrictions), a code update (with testing!), a backwards-incompatible database change (drop columns, tighten restrictions), and finally another code update to remove outdated usage.
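Here's a rough sketch of how those four steps might look for the Name/Address change, again assuming SQLite and a hypothetical customer table. The function names (step1_expand, read_v2, write_v2, step3_contract, read_v3) are invented for illustration, and the final step rebuilds the table rather than dropping columns only so the sketch doesn't depend on a particular SQLite version.

```python
import sqlite3

def step1_expand(conn):
    # Step 1, backwards-compatible: old code simply ignores the new nullable column.
    with conn:
        conn.execute("ALTER TABLE customer ADD COLUMN mailing_address TEXT")

def read_v2(conn, customer_id):
    # Step 2 code: prefer the new column, fall back to the old ones.
    mailing, name, address = conn.execute(
        "SELECT mailing_address, name, address FROM customer WHERE id = ?",
        (customer_id,),
    ).fetchone()
    return mailing if mailing is not None else f"{name}\n{address}"

def write_v2(conn, customer_id, name, address):
    # Step 2 code: keep writing the old columns so servers still running
    # the old code keep working during the rollout.
    with conn:
        conn.execute(
            "UPDATE customer SET name = ?, address = ?, mailing_address = ? WHERE id = ?",
            (name, address, f"{name}\n{address}", customer_id),
        )

def step3_contract(conn):
    # Step 3, backwards-incompatible: backfill rows the new code never touched,
    # then rebuild the table without the old columns and with a tighter constraint.
    with conn:
        conn.execute(
            "UPDATE customer SET mailing_address = name || char(10) || address "
            "WHERE mailing_address IS NULL"
        )
        conn.execute(
            "CREATE TABLE customer_new (id INTEGER PRIMARY KEY, mailing_address TEXT NOT NULL)"
        )
        conn.execute("INSERT INTO customer_new SELECT id, mailing_address FROM customer")
        conn.execute("DROP TABLE customer")
        conn.execute("ALTER TABLE customer_new RENAME TO customer")

def read_v3(conn, customer_id):
    # Step 4 code: the fallback to the old columns is gone.
    return conn.execute(
        "SELECT mailing_address FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada', '1 Main St')")
conn.execute("INSERT INTO customer VALUES (2, 'Bob', '2 Side St')")
conn.commit()

step1_expand(conn)
before = read_v2(conn, 1)             # not yet backfilled, so this uses the fallback
write_v2(conn, 2, "Bob", "9 New Rd")  # new code fills the new column as it goes
step3_contract(conn)
after = [read_v3(conn, 1), read_v3(conn, 2)]
```

The point of the ordering is that at every moment, whichever code version a web server happens to be running can still talk to whichever schema the database has.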

Jacob Kaplan-Moss, one of the creators of Django, just opened the discussion of versioning in Django (with links to prior art). For a web framework with an object-relational mapper, versioning is an important feature.

In addition to databases, web APIs need versioning. Adam Kelsey has opened that discussion, but so far the solutions are pretty rough and I don't have anything to add to the discussion.

If you'd like to read more about how web development differs from other development, Steve Yegge wrote an excellent article titled It's Not Software coining the term "servware" to emphasize that web developers need to recognize the ways our code ends up so different. (It's easy to turn this topic into rah-rah "We're so special we don't have to learn from history" bullshit, but nobody wants that.)

Hey, if you can't take a little heat -- and if you can't even capitalize or sign your name to stand behind your words -- don't read my posts, you won't ever like them.

To everyone else: I've got several posts planned about features and techniques that popular languages have picked up from more esoteric ones and how learning them broadens a programmer. Feel free to mail me (ph@malaprop.org) if there are any topics you'd like to see specifically covered.

Sure, I'll write about it in a later article. I haven't used C#, so I'll have to compare to Java or just describe Python positives. In a nutshell, Python has really nice introspection and abstraction that are well-suited to poking around in objects (that'd be of different types in C#/Java).

The dig at Java and C# was mostly friendly razzing; I wasn't looking to start a "my language is better than your language!" flamewar. And I've been online too long to be bothered by flamers.

Borrowing from other languages

the majority of what's good about most popular languages around right now comes from borrowing a large number of ideas from lisp.

Actually, part of the goal behind CodeSnipers was to get people from different backgrounds/architectures together and start kicking around ideas. I think that *most* languages have valuable concepts to offer other languages. The difficulty is finding the commonalities.

Of course, then there's Perl. ;) (Just trying to get you going Nola.)

yeah yeah

Go on about Perl... just wait till they come after you ..

Flickr, Basecamp, etc

Personally, I've been fascinated by groups such as Flickr and Basecamp which claim to push out new versions (up to?) numerous times/day. It implies that there are numerous Unit Tests, Integration Tests, etc throughout the process which are applied and heeded prior to each deployment.

That's a long way from where many environments/projects are...

Yeah, it is a big difference. It's common that software has no automated testing whatsoever, and that's a really lousy standard. I didn't realize how useful automated tests were until I'd worked on a project with them and then a very large project without them.

I've come around to Michael C. Feathers's view in the book "Working Effectively with Legacy Code": legacy code is simply code without tests.

On a mailing list about an upcoming conference they were talking about this same thing.. some of the links referenced were:

www.DatabaseRefactoring.com

http://martinfowler.com/articles/evodb.html

I have that database book checked out on my Safari account but have yet to really dive into it.