I sat on the curb many a summer day growing up in suburban New Jersey listening to Yankee games and maybe biking up to the five and dime hoping to buy a pack of baseball cards with Bobby Richardson, or better yet the entire Yankee team inside. Although I couldn't have asked for a more peaceful and fun way to grow up, that Yankee emphasis on stardom got me a little twisted on what it took to create success. Was it really having the stars making the big plays that made the difference?

I never followed sports closely enough to dig deeper into that question, but I know that in software, patience and perseverance play a much bigger role than being a cowboy coder. This is especially true when it comes to software maintenance.

Good programmers have little trouble fixing most software bugs. “Just show me what is wrong, and I’ll fix it,” we like to say, demonstrating all at once our technical prowess, our humble desire to save humanity and our constant need to gain attention.

All that goes out the window, though, when the user complains about a phantom problem. If you can’t repeat it, if you have no documentation about it, how are you going to fix it? And how are you not going to look like a goat to your client or boss when you tell them there is nothing you can do?

In the old days, we had plenty of excuses for phantom problems. “Must have been an electrical spike.” “Maybe one of the keypunch cards was crumpled.” Programmers had a ready store of such comments to keep the user placated until the real software bug could be found.

The old hardware problems may have had their day, but there are still many ways phantom errors can crop up. Since first implementing our staffing software on SQL Server 6.0 in 1994, we’ve marshaled it through many Microsoft service packs and major releases, and each one brought its anomalies.

For example, SQL Server’s stored procedure cache has gone through many phases of evolution, each one delivering a new algorithm for making programs run more efficiently. This gets pretty long-winded to explain, but briefly: depending on the state of the data and the procedure cache, SQL Server can execute the same program in different ways, and quite often with less-than-desired results.

Performance means everything for my clients because some of them process thousands of checks in a single hour, time and again. They get a competitive advantage over the nationals and others because they can produce their payroll checks at will. One national company, for example, can only process checks overnight, making it difficult to get temporary workers their checks on time. Those temporary workers go instead to our clients because they know they can get their checks earlier and more reliably.

We need to be ready whenever a performance issue hits, and we’ve built up a quiver of performance improvement arrows, including ones that attack phantom performance problems. These include quick ways of rewriting SQL statements to use CTEs and ways of altering execution paths. DBCC DROPCLEANBUFFERS empties the clean pages from SQL Server’s buffer pool, so the next run has to read its data fresh from disk. DBCC FREEPROCCACHE clears the cached execution plans, so SQL Server has to compile new ones and may execute the same statements differently. Custom trace tools we’ve developed allow long-running processes to leave clues about performance bottlenecks.
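To make the trace-tool idea concrete, here is a minimal sketch of how a long-running process might leave timing clues behind. It is not our actual tooling, and it is written in Python purely for illustration; the names `StepTrace`, `step` and `slowest`, along with the step labels in the usage lines, are all hypothetical.

```python
import time
from contextlib import contextmanager


class StepTrace:
    """Collects (step name, elapsed seconds) pairs for one long-running process.

    Hypothetical helper for illustration only; not the author's real trace tool.
    """

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the trace can be checked without real delays.
        self._clock = clock
        self.records = []

    @contextmanager
    def step(self, name):
        """Time the wrapped block and record it, even if the block raises."""
        start = self._clock()
        try:
            yield
        finally:
            self.records.append((name, self._clock() - start))

    def slowest(self, n=3):
        """Return the n slowest recorded steps, slowest first."""
        return sorted(self.records, key=lambda r: r[1], reverse=True)[:n]


if __name__ == "__main__":
    trace = StepTrace()
    # The sleeps stand in for real work such as queries or check printing.
    with trace.step("load timesheets"):
        time.sleep(0.01)
    with trace.step("print checks"):
        time.sleep(0.03)
    for name, seconds in trace.slowest():
        print(f"{name}: {seconds:.3f}s")
```

The point of a tool like this is that the clues are already there when a phantom slowdown is reported: you can see which step ate the time without having to reproduce the problem first.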

Phantom problems have altered the art of software maintenance. As systems mature and become mission-critical, a lot of the cleverness goes not into creating eye-popping features but into behind-the-scenes improvements of the entire system, and into the patience and perseverance of making problems repeatable.

As for the Yankees, I'm no expert, but the same things seem to apply. Despite loading their deck with the most expensive talent ever assembled, they didn't even make it to the playoffs this last year. It was Tampa Bay who, although not winning the World Series, made it there on the strength of their inglorious bump and grind of evolving the ecosystem of the entire team.
