Technical debt is a great term, originally coined by Ward Cunningham, to convey the reality of future problems brought about by making decisions with an eye to short-term gains instead of long-term correctness. This is not a new concept, but before Ward, it had never been applied to software development. (Martin Fowler reports that the term was used in Ward's report at the OOPSLA conference in 1992.) Let's say that again in case you missed it.
Technical Debt: future problems brought about by making decisions with an eye to short-term gains instead of long-term correctness.
This exact concept was used years ago in the advertisements for Fram motor oil filters. Every car owner knows that they need to replace the oil and filter in their car on a regular basis or they will eventually experience engine problems. The advertisements in question stated that using inferior oil filters (naturally, anything that wasn't sold under the Fram brand name) would eventually cause the same problems. At the end of the commercial the mechanic looks at the camera and invites you to "pay me now or pay me later". This is how you avoid mechanical debt; take small amounts of appropriate action now, or take massive (and expensive) reparative actions later.
The financial world has had this concept from the beginning of money (or at least the lending thereof). Debt is a very real thing for many people and it's something that gets dramatically worse the more you fail to address it. The definition of worse can vary of course. For some "worse" means having a credit card declined, or a car repossessed, or a house repossessed, or a business declared insolvent. And then there are some who operate outside of the realm of the legal, who will be more than happy to break your kneecaps when your debt exceeds the agreed repayment amounts.
One aspect that most forms of debt share is the personal pain (sometimes literal pain if you borrow from stereotypical Italian gentlemen named Vinny) from failing to address that debt. When you ignore it long enough it has a way of getting your attention, often to the exclusion of everything else. While this is generally quite unpleasant, it does effectively concentrate the mind on efforts to bear down on the debt and start making it go away.
A big problem with technical debt is that the personal pain is often not applied to the one making the decisions and who, by all rights, should be experiencing it. I'm talking about Information Technology managers here. While not a few bad decisions are made by programmers, the clear and massive majority of them are made by managers. The problems and pain of the technical debt are then felt by the programmers. The irony in the situation is that those same programmers have likely warned against exactly the situation that they now find themselves placed in.
I have seen technical debt accrued in almost every company that I have worked for. There seems to be a slavish adherence to the concept of "first mover advantage". This would be lovely if the concept worked, but the general public seems to be learning to place a premium on correct over fast. Sadly, the memo hasn't made it to Information Technology management yet. Consider Microsoft's Vista operating system. I understand from reading technology blogs that Vista was released because Microsoft was fed up with the computer press asking when it was going to be ready (Really, what's the rush? Doesn't everyone take five or six years to release a new operating system version?). Microsoft picked a date and shipped it "no matter what". The result was a stack of bugs tenuously piled up into the shape of an operating system.
I'm sure some are reading this and saying words to the effect of "but you have to spend money to make money", meaning you have to accrue some debt to make money. I call "nonsense" on that. The only way to get out of debt is to spend less than you earn. It also happens to be the only reliable way to not get into debt in the first place.
Technical debt is completely avoidable. You can run your project in a debt-free fashion. I've done it and thereby feel that I have the right to refuse to hear that it's impossible. The trick is knowing what's right, sticking to your guns and doing it. Lather, rinse, repeat, as the shampoo instructions say.
WHEN IT'S DONE
A good example of running a technical project debt free may be had by watching almost any open-source project run by the Apache Software Foundation. The Apache folks have a reputation for high quality work. Part of this comes from the natural tendency of good programmers to want to work on Apache projects. Another part of the quality equation comes from the peer review that naturally happens in an open-source project where everyone can see all of the code and is free to examine and comment on it. The biggest part of the reason for the quality of the Apache projects comes from their standard answer to the most common question received on their mailing lists.
The most common question that I remember seeing when I frequented the Apache Struts mailing list was "when will it be done?" A few months after any release I remember seeing this question being asked about the next version or point release. As regular as clockwork this question would come up time and time again. The impressive thing was that the answer was always the same. The answer was always "when it's done!"
Even a superficial consideration of the question leads to the conclusion that it's the only answer that can realistically be given. If a project team sets out to perform a certain unit of work, then they either perform it or they don't. It's a binary issue: that unit of work is delivered or it isn't. While I understand that life brings surprises and plans can change, the work is still either done or it isn't. Sometimes those surprises are bigger than expected, the plans have to change and the unit of work is modified. That will affect the estimated completion date (remember, an estimate is only a wild guess in a suit), but the work is still done only when it's finished. Even changing scope does not eliminate the binary nature of the matter.
All Apache projects are done when they're done. Period. Apache projects carry a heavy expectation of excellence. I know that even back in the 1.0 days of the Struts web framework, I was able to rely upon it totally for the system that I was the tech lead over. In fact we even went to production with a beta version of Struts 1.1 as the quality was so high that I couldn't see any reason not to use it while I waited for the final 1.1 release.
The Apache projects know what the right thing is and they stick to their guns. They know that they pick a unit of work to perform. They work on it until it is finished to their satisfaction and only then do they release it to the waiting world.
Sadly, the concept of "it's done when it's done" is foreign to the modern Information Technology manager. Modern Information Technology projects are all driven by deadlines; arbitrary deadlines at that. I'm working on just such a project now, fixing a compliance issue with our widget sales. We're already out of compliance, but the management thinking was that it should be fixed by the end of the year so that we'd be compliant next year. There was no examination of whether that was doable by a single developer who is still getting used to the system in question, a system for which there are no tests, so who knows whether anything gets broken? Management says proceed because it's more important to be seen to be fixing the issue even if it's a hurried fix that might in turn need to be fixed.
I don't think I could begin to count the number of timeframe-driven decisions that I have witnessed or have heard details of from sources that I trust. These decisions have a long history of turning out badly, but because the pain is felt by the programmers, the managers find them most agreeable.
This is the root cause of technical debt: Information Technology management making bad decisions based on a desire to get things done "at the speed of business" and then not feeling any of the follow-up pain of those decisions. And the punditry wonder why there are falling numbers of programmers. I know that I've advised my little geeklets to not even think of working in Corporate America and especially not Information Technology. Even just last week I was chatting with co-workers about Corporate Escape Routes; just how do we tunnel out of the cubes that are our prison cells?
So, how does a deadline-driven approach to project management, the way that it's implemented in Corporate America today, cause technical debt?
Let me start by saying that deadlines aren't all bad, especially if they are determined by a careful examination of the amount of work that needs to be done and the availability of skilled programmers to do it. This is not how it's done in Corporate America today, so we'll skip straight to discussing "pick any two".
PICK ANY TWO
Information Technology is still a relatively new discipline, but it has been around long enough to have had classic project management principles applied to it. These principles tell us that there are three aspects to a project and that two can be tightly controlled at any time with the third varying depending upon the decisions you make on the two you choose to fix.
The three aspects of an Information Technology project are Scope, Time and Quality. Some people use the term Cost instead of Time, but Cost tends to vary in direct proportion to Time, so with the time obsession of Corporate America, it seems more appropriate to call that aspect Time. The interplay of these aspects is seen by way of tradeoffs. The more rigidly you fix any two of the aspects, the more the third one is left to vary.
A good example of an engineering tradeoff would be NASA. These folks put people into space and bring them back again alive and well. NASA fixes Quality at its maximum and Scope doesn't change because otherwise there's no point in the mission. This leaves Time/Cost as the variable. Of course, being a government agency, it seems like Time/Cost is the last thing they worry about anyway!
Now, in most Information Technology projects, and when I say most I mean every one that I've been on and just about every one I've heard of from my contacts, the two fixed aspects are Time and Scope. As a motivational speaker that I heard many years ago said, "Every project starts out with a deadline and a name."
Even the Scope is not usually that well defined. Hence the large number of incidents of scope creep. It's not really scope creep ... it's more that the project was started before they knew what they wanted. This is the usual behavior of Information Technology management. They aren't sure what they want, but they're deadly certain about when they want it. To get a change in the planned project completion date usually requires a presidential declaration, delivered by pink pigs flying in formation with white unicorns.
This is why it's so dangerous to offer estimates to managers, because once they've heard a date inside the timeframe they wanted to hear, they stick to it like the proverbial limpet. I know that I've offered estimates to managers and have had the number halved right in front of me. Usually they say something to the effect that they had to do that as their manager wouldn't accept an estimate that large.
So with the Time aspect being cast in concrete and the Scope staying still at best and increasing under normal conditions, the only other aspect left that can vary is Quality. And the universal observation is that Quality will always vary downwards. This fits with the law of conservation of energy. If Time is fixed and Scope tends to increase, then the only direction that Quality can go is down.
This concept does not seem to fit with the worldview of the average Information Technology manager. Funnily enough, everyone else seems to get it. I particularly like the way that the U.S. Navy SEALs express it: "Fast is slow. Slow is fast." When everything is melting down around you and you need something done right, then slow down, deliberately slow down and concentrate on doing the action slowly and correctly the way you would have done it in training and let the muscle memory take over. It will be done right (for whatever your definition of "it" is).
Information Technology managers are the reverse of the SEALs. When a problem occurs they switch into panic mode. Everyone is expected to stay late. Senior managers tend to start hovering outside the cubicles of those performing the fixes. Hourly status meetings get called and are conducted in stand-up fashion outside the fixer's cube.
BON APPÉTIT
I'm going to wrap up with an example currently being worked on by others while I relax at my local Starbucks and enjoy a cup of coffee. The core pricing module for our widgets is broken for the introduction of a new widget. It is blowing up when pricing is requested for that widget. I'm in the process of coming up to speed on this module myself and so I know that there are zero unit tests in the code base. I have written tests for all of the code in the area that I'm working on, but these have not been promoted into the main code base yet. The lack of unit tests means that there are no areas that can be considered bug free. So the bug could be anywhere. (Where there are good unit tests, bugs have far fewer places to hide!)
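To make the point about tests concrete, here is a minimal sketch of the kind of check I mean. Everything in it is an assumption for illustration: the PriceCalculator class, its total method and the pricing rule are invented and are not the real pricing module. It also uses a hand-rolled check helper so that it runs with nothing but the JDK; on a real project you would reach for a proper test framework.

```java
import java.math.BigDecimal;

// Hypothetical pricing rule, invented for illustration only.
final class PriceCalculator {
    /** Returns unit price times quantity; rejects inputs that can never be valid. */
    static BigDecimal total(BigDecimal unitPrice, int quantity) {
        if (unitPrice == null) {
            throw new IllegalArgumentException("unitPrice must not be null");
        }
        if (quantity < 0) {
            throw new IllegalArgumentException("quantity must not be negative");
        }
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }
}

final class PriceCalculatorTest {
    // Minimal stand-in for a test framework assertion.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("FAILED: " + description);
        }
    }

    public static void main(String[] args) {
        // compareTo ignores scale, so 59.97 and 59.970 would count as equal.
        check(PriceCalculator.total(new BigDecimal("19.99"), 3)
                .compareTo(new BigDecimal("59.97")) == 0, "three widgets");
        check(PriceCalculator.total(new BigDecimal("19.99"), 0).signum() == 0,
                "zero quantity costs nothing");

        // The failure case gets a test too, so nobody quietly removes the guard.
        boolean rejected = false;
        try {
            PriceCalculator.total(null, 1);
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }
        check(rejected, "null price is rejected");
        System.out.println("all checks passed");
    }
}
```

Once an area has checks like these, a blow-up somewhere else at least tells you where not to look first.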
It would seem that the problem involves a NullPointerException, so the chances are good that some domain object is being incorrectly initialized. Unfortunately no one knows which one it is because none of our domain objects have any kind of guard conditions for their input values. Any of our domain objects can be in any condition. There are no guarantees that they are ever in a valid state.
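For anyone wondering what a guard condition looks like, here is a small sketch under stated assumptions. The Widget class, its sku and basePrice fields and the validation rules are all hypothetical; our real domain objects look nothing like this, which is rather the point. The idea is simply that the constructor refuses bad input, so any Widget that exists is known to be in a valid state.

```java
import java.math.BigDecimal;
import java.util.Objects;

// Hypothetical domain object: the name and fields are illustrative only.
final class Widget {
    private final String sku;
    private final BigDecimal basePrice;

    Widget(String sku, BigDecimal basePrice) {
        // Guard conditions: reject bad input at construction time, so that
        // a Widget that exists is always in a valid state.
        this.sku = Objects.requireNonNull(sku, "sku must not be null");
        this.basePrice = Objects.requireNonNull(basePrice, "basePrice must not be null");
        if (sku.trim().isEmpty()) {
            throw new IllegalArgumentException("sku must not be blank");
        }
        if (basePrice.signum() < 0) {
            throw new IllegalArgumentException("basePrice must not be negative");
        }
    }

    String sku() {
        return sku;
    }

    BigDecimal basePrice() {
        return basePrice;
    }
}

final class WidgetGuardDemo {
    public static void main(String[] args) {
        Widget ok = new Widget("W-100", new BigDecimal("19.99"));
        System.out.println(ok.sku() + " costs " + ok.basePrice());

        try {
            new Widget("W-101", null);
        } catch (NullPointerException e) {
            // The failure happens where the mistake was made, and the
            // message names the offending field.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With guards like these the null still causes an exception, but it is thrown where the bad value enters the system rather than deep inside a pricing calculation some time later.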
The decisions to have no tests and no guard conditions to force data integrity are the result of previous bad decisions motivated by the desire to get stuff done quickly. These decisions have consequences, which we know as Technical Debt, and those consequences have grown large teeth and claws and have developed a taste for untested code. In this core pricing module the untested code stretches as far as the eye can see.
Bon Appétit, Mr. Consequence!