Lesson Learned

Project Post Mortems

Another couple of days and the current cycle comes to an end: We are indeed releasing the next iteration of a product. Of course it doesn't end there, as most software products don't ever really finish. No matter though, it is a nice time to take a step back, pause and evaluate how things went this time around.

It's time for a post mortem of sorts.

We're 90% Done!

We're 90% complete with X and we'll be done in no time!

We all know it never happens that way, yet we hear it regularly, most of us have said it, and some of us have said it more than once.

Why does this sort of thing pervade software development?

Good software has users

If you build software, manage a project, or direct an IT division, take a moment to answer the following question: Who are your users? If an answer eludes you, read on.

Good software has users. Understanding this simple statement may make the difference between solving imaginary and real problems. Information Technology, at its core, is an effort to pair technology solutions with domain problems, where a problem may range from generating reports to architecting massive systems. No matter the scope, technology solutions should be informed by the needs of users.

The secret family split in Windows code page functions

My earlier post "Strange case of two system locale ANSI charsets" discussed the confusion between the default system locale (GetACP, Language for non-Unicode Programs) and the default user locale (setlocale, Standards and Formats). There I mentioned a problem with setting the system code page in C/C++ using setlocale, but that is only the first clue in what reveals a secret split in the family of locale-based charset functions.

Micro ISV Mistake #3

This is the third in a series of posts documenting mistakes I made over the last few years as I started my company, developed our products and tried to sell them online. The previous posts are still available if you missed them: Mistake #1, challenge from Keith, Mistake #2.

Of all the mistakes I had on my list when I started this series, I was dreading writing about this one the most. It doesn’t really lend itself well to a hard and fast rule, has a lot to do with timing, prioritisation, and moderation, and involves discouraging something that is undeniably important. Despite this, it has to be on the list: over the years it’s sucked away a lot of my time and energy, and you might need someone to tell you that you can take it too far.

Outsourced Offshoring!?

Everyone seems to hate outsourcing, but I'm attempting to draw a distinction here. There is outsourcing which I define as "sending work outside your organization" and there is offshoring which I define as "sending work overseas". Both have some potential upsides for the organization in terms of focusing on core skills, benefiting from a larger development team than they have in house, and a variety of other aspects. Both have some potential downsides too. I'm not going into this right now. As a mISV owner, I am all for outsourcing because we couldn't exist without it. I think the jury is still out on offshoring.

I came across an article recently detailing the writer's experience visiting Singapore and Mumbai for a recent wedding and I found the whole thing fascinating.

In order to send work offshore, a certain amount of groundwork must be laid. First, the organization must be able to write a solid specification. Next, someone needs to act as an intermediary especially when the work hours are completely different. Then someone has to evaluate the work stateside. Etc, etc, etc. You get the idea.

It seems that certain offshoring destinations have faced a great deal of wage inflation due to demand for skilled people, and the work is starting to move elsewhere. And it makes sense: once an organization jumps the hurdles to make offshoring work in the first place, moving it somewhere else becomes nearly trivial.

Therefore, with the further expansion and improvement of global communication systems, it's impossible to predict where this work will end up going in the long term. The primary destinations at this point seem to be Pakistan, India, and China, but Ukraine, Romania, and more rural areas of the others are already picking up steam.

A guide for changing programming languages

Many projects will encounter a point of frustration with some aspect of Information Technology architecture, and developers may clamor for a new technical solution. Changing programming languages may involve a significant rewrite of the software, which is a bad idea most of the time. However, if you are faced with a compelling technology, how should a Project Manager approach the transition? Two personal examples may shed some light on the decision whether or not to change programming languages.

Whether Double-Byte Is ANSI

I'm telling this story because a senior manager flippantly dismissed my careful analysis of an issue, and it got me a little riled up. For about a year as a side project I have been working with people in Japan and Taiwan on converting a Delphi-written Windows tool to Unicode as well as localizing it to Japanese (I am a C++ programmer but picked up Delphi during this project). It is a project that requires clarity on the character set issues involved.

We had an independent expert who audited the Delphi program and cautioned us among other things that we "cannot assume that the non-Unicode character string is always ANSI. When ANSI code page is specified but the non-Unicode character string contains double-byte characters, the two bytes of each double-byte character will be separated." This person was implying that double-byte was different from ANSI in that context.

The glaring problem with his point is that, in Windows terminology, double-byte is ANSI! Plain and simple. Well, read on...

On Windows, ANSI character sets are those that can act as the default non-Unicode system locale character set including double-byte character sets (DBCS). Win32 APIs involving strings generally have an A and a W version (e.g. SetWindowTextA and SetWindowTextW), where the A stands for ANSI and the W stands for Wide Character Unicode. The GetACP Win32 API returns the ANSI code page that the A APIs operate in, which is double-byte (e.g. 932) on PCs configured for Far Eastern code pages.
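As a rough illustration of the point about GetACP, here is a minimal Python sketch that asks Windows for the ANSI code page the A APIs operate in. GetACP is the real Win32 function, reached here through Python's ctypes; the helper name ansi_code_page is made up for this example, and on a non-Windows machine it simply returns None.

```python
import sys


def ansi_code_page():
    """Return the ANSI code page reported by Win32 GetACP, or None off Windows.

    ansi_code_page is a hypothetical helper name used only for this sketch.
    """
    if sys.platform != "win32":
        return None  # GetACP only exists on Windows
    import ctypes  # imported here so the sketch still loads elsewhere
    return ctypes.windll.kernel32.GetACP()


if __name__ == "__main__":
    # On a PC whose system locale is Japanese this would be expected to
    # report 932; on a Western European system, 1252.
    print(ansi_code_page())
```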

Thinking ANSI does not include double-byte is understandable because the popular usage of the term "ANSI" assumes that ANSI character sets are single-byte (SBCS). This is due to the origin of the Microsoft usage of the term "ANSI" in the default Western code page 1252 based on a single-byte character set drafted by the American National Standards Institute (ANSI).

It was a misnomer from the beginning because Windows-1252 was not approved by ANSI and it turned out different from the ISO Standard 8859-1. But the term ANSI on the Microsoft platform went on to encompass all of the Windows single-byte character sets in which the lower 128 values are ASCII and the upper 128 vary according to different international sets like Cyrillic.

The Microsoft usage of the term ANSI made a further leap to encompass double-byte character sets. Why is not perfectly clear, but presumably for practical reasons because it was the only term that was handy. DBCS characters can be either 1 or 2 bytes long (don't be fooled by the name "double-byte"; they are actually multi-byte, or MBCS), and the 1 byte characters in the lower 128 are ASCII. An ASCII string is exactly the same when it is represented in any Windows (dare I say ANSI) SBCS or DBCS, but Wide Char Unicode is always different. So it is very convenient for the A and W Windows APIs mentioned above to carry the double-byte character sets under the A label. Another practical reason might be that the ANSI character sets were distinguished from OEM character sets having to do with DOS and hardware, and since OEM included the double-byte sets it was convenient for ANSI to include them too.
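The byte-level claims above are easy to check outside of Windows. The following is only a sketch: it assumes Python's cp1252 and cp932 codecs are close enough stand-ins for Windows code pages 1252 and 932, and it uses UTF-16 little-endian to play the role of Wide Char Unicode.

```python
# An ASCII string has identical bytes in a single-byte and a double-byte
# "ANSI" code page, but not in Wide Char (UTF-16) form.
ascii_text = "Hello"
sbcs = ascii_text.encode("cp1252")     # Western single-byte code page
dbcs = ascii_text.encode("cp932")      # Japanese double-byte code page
wide = ascii_text.encode("utf-16-le")  # what the W APIs would carry

print(sbcs == dbcs)  # True
print(wide)          # every character padded out to two bytes

# "Double-byte" is really multi-byte: in code page 932 the ASCII letter
# takes one byte while each kanji takes two.
mixed = "A日本"
byte_lengths = [len(ch.encode("cp932")) for ch in mixed]
print(byte_lengths)  # [1, 2, 2]
```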

There is no dispute about whether Microsoft includes the double-byte character sets among its ANSI character sets. The Microsoft list of Code-Page Identifiers marks the far eastern DBCS code pages (932, 936, 949, 950) as ANSI. The use of "ANSI/OEM" merely indicates that these are also OEM code pages, not that "ANSI/OEM" is some kind of special ANSI (these code pages are listed more clearly as ANSI & OEM on this page).

But there seems to always be fuzziness about this issue. One MSDN article holds back by saying "DBCS can be thought of as the ANSI character set for some Asian versions of Microsoft Windows". Another Microsoft article actually describes "Windows ANSI" as different from double-byte sets due to the lingering association of the ANSI misnomer with the single-byte sets. But the fact is that across Win32 programming, ANSI is always the name for the alternative to UNICODE, and ANSI functions always support the double-byte character sets.

Now to tell about the experience that is the reason for this article, we have to go back to the independent expert's assertion that you cannot assume ANSI for non-Unicode strings. Well I responded that "double-byte character sets such as Shift-JIS are ANSI and since the ANSI strings used in [the program] are based on machine locale you can assume ANSI locale encoding."

But I was chastised by a senior manager (the President of the Japanese subsidiary) who said that "I am a business guy but even I know that what Ben wrote [is wrong]."

Taking sides with the independent expert, the senior manager (showing an instinctive grasp of the subject despite having picked the wrong battle) went on to state: "ANSI does not always mean multi-byte safe. By specifying an ANSI code page the code can't know whether it is pure ANSI (single-byte) or ANSI/OEM (code page 932 - Japanese). ANSI encoding is single-byte by default which may cause double-byte problems. For conversion when the source is in Japanese Shift-JIS, if you just specify "ANSI" (not code page 932 - Shift-JIS), the code will separate the two bytes and do byte-based conversion to Unicode (2 bytes to 4 bytes instead of the correct way - 2 bytes to 2 bytes)."

His statement shows "ANSI confusion" that is understandable considering what I explained above, and points to what could be a legitimate concern if the wrong ANSI code page was applied. However, ultimately this particular disagreement does not depend on the definition of ANSI; the expert made an incorrect assertion that double-byte strings needed to be handled differently in the Delphi program in question. There may be other development situations (none that I know of) where something like this is a concern, but I tested it out and the functions in question worked correctly with the far eastern double-byte as the default ANSI code page.
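The legitimate part of that concern, applying the wrong ANSI code page, can be shown in a few lines. This is a sketch in Python rather than the Delphi functions in question, and it again assumes the cp932 and cp1252 codecs behave like the corresponding Windows code pages.

```python
# Two kanji stored as Shift-JIS (code page 932): four bytes in total.
sjis_bytes = "日本".encode("cp932")

# Converting with the code page the bytes were written in keeps each
# two-byte pair together as one character.
right = sjis_bytes.decode("cp932")

# Converting with a single-byte code page treats every byte as its own
# character, which is the "separated bytes" problem described above.
wrong = sjis_bytes.decode("cp1252")

print(len(sjis_bytes))                 # 4
print(len(right), len(wrong))          # 2 4
print(len(right.encode("utf-16-le")))  # 4 bytes of UTF-16: 2 bytes per kanji
print(len(wrong.encode("utf-16-le")))  # 8 bytes of UTF-16: 4 bytes per kanji
```

When the code page used for conversion is the machine's own ANSI code page and the strings were written under that same code page, the first path is the one taken.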

So do you want the final final answer to the question of whether double-byte is ANSI? The answer is definitely yes, at least in terms of Windows programming, but with regard to discussions with senior management you must allow for variations in the use of the terminology remembering the fact that ANSI is a misnomer in this case anyway.

Micro ISV Mistake #2

This is the second in a series of posts on common Micro ISV mistakes. Without repeating my entire introduction from the first article, these mistakes are based on lessons learned over the last few years as I started my company, developed our products, and then tried to sell them. Discussion is, as always, welcomed and encouraged in the comments.

Mistake #2 is trying to be all things to all people.

This is a very easy mistake to make. Even if you don’t consciously choose to make a general purpose product from the outset, it creeps up on you. You start with a simple idea, identify a neat little market, plan your development, and know exactly where you’re going. Then you spot another market which you could also serve with just a little extra development, or you decide that if you spent a little more time, you wouldn’t need to limit yourself to that niche; you could attack the entire market. Even if you don’t think of it yourself, someone, an advisor, friend, relative, customer, or potential sales partner, will suggest it to you. It can be seductive and can trick you into feeling like a genius: you’ve gone from a small market to a massive market and you’re just getting started. You’ve actually made a terrible mistake.

Micro ISV Mistake #1

I’ve decided to kick off my regular posting here on CodeSnipers with a short series on common Micro ISV mistakes. In making my initial list, I’ve stuck to mistakes I’ve made myself in the last few years. I’m not ruling out the possibility of adding to the list as the series and discussion go on, but if that does happen I might be relying on observations; I’m not promising to test drive every mistake for you.

Some of you may argue that not all of my “mistakes” are really mistakes, and you might have a good point, but I firmly believe that in my situation each was the wrong thing to do, so I don’t have any qualms about calling any of them mistakes. If you do have a different perspective, I hope you’ll stay and discuss it in the comments.