How to Debug Programs
Have you ever felt totally helpless when faced with a computer-related problem? Something used to work right, but doesn’t work now. Why can’t it just work like it used to? Why do computers have to be so impersonal, so complicated and so dumb?
I get it. I know what you feel like. I have spent the majority of my life hopelessly frustrated by computers. Despite being a professional programmer, I still have this almost insurmountable fear that I won’t be able to fix something when faced with a problem. Where do these problems come from?
Computers are adept at doing exactly what they have been told to do. They are so good at it that, like a perfectly obedient three-year-old, they follow their instructions to the letter, regardless of whether you told them to do the right thing. Computers will faithfully annihilate your data or send that email before you finished writing it. They pop up messages on your screen (as their authors intended, of course), and force you to make decisions (Do you want to restart now or in 10 minutes?). Alas, computers are merely the messenger. The reason for the heartache is that their authors took shortcuts, or didn’t think things all the way through.
Okay, so what does that mean to you? It means one really big thing that you aren’t going to like: you are going to have to learn how a computer works. I know you don’t want to, and that it will take a lot of time, but it’s really the only answer to why your computer doesn’t work. Though trite, I really have taken to heart Francis Bacon’s “Knowledge is Power”. Knowing is ability. Understanding why something doesn’t work is the only sure-fire way to be able to fix it.
If You’re a Programmer:
That’s all cool, but really, why doesn’t my computer work? Let me give you a recent example of a program I was debugging. I was adding a test to a suite of existing tests. Upon adding my test, another test began to spuriously fail. Now, what could have gone wrong? Some ideas:
- Perhaps the other test was flaky? Maybe the existing tests fail periodically and I was unlucky. This is a typical first guess, because it blames someone else’s code rather than my own. But running the failing test with my change in place resulted in a 100% failure rate for their test (mine passed with flying colors). This was not it.
- Perhaps the system under test (i.e., the code that I was writing) was actually broader in scope than I thought? This guess expands on the first. I was pretty confident my change had nothing to do with the rest of the code, but obviously some of my assumptions were invalid. (By the way, just phrasing it like this is the key to figuring out why programs don’t work. If you can’t enumerate your assumptions, you’re lost!) After looking extensively at the code in question, I was confident my change could not possibly have broken the other test.
- Fire up the debugger. If you can’t guess what’s wrong, the next step is to walk through the code with the debugger. If tests are written properly, they should set up initial conditions, set expectations, and do nothing else. Fortunately for me they do exactly this. I walked through the failing test checking the locals at each step. All I’s were dotted and T’s crossed. Strangely, the test passed at the end of debugging. What?!
- Okay we’re on to something. The test, when run by itself, passes. When run as part of the suite, it fails. What could possibly cause something like that? Maybe the test order matters? Maybe it’s a timing thing? Maybe the random number generator acts wonky when called out of order? We have a clue, and we can try these things out.
- The one common thing in the guesses above is that they point to the problem being in shared state. Something about the test suite means that the tests interact with each other, even though good test practice says that tests should be independent. What’s a common thing that tests in a suite share? Their setup, of course!
- Here is where the problem happens. The test suite sets up a database for each test to use, but because database setup is expensive, it only does it once for the suite. The database was allocating unique IDs from one sequence, while some of the tests were using static, hard-coded IDs. With the addition of my new test, a tipping point was reached: the dynamically allocated IDs finally collided with the static IDs, resulting in a spurious test failure. The solution was to change all the static-ID tests to create their IDs from the same domain, by using the sequence generator.
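To make that last step concrete, here is a minimal sketch of the collision. It is not the real test suite: FakeDatabase, the hard-coded ID of 5, and the helper names are all made up for illustration, and the real database was certainly more involved. The idea it shows is only this: one shared sequence, one hard-coded ID, and a suite that grows until the two meet.

```python
import itertools


class FakeDatabase:
    """Hypothetical stand-in for the database shared by the whole suite."""

    def __init__(self):
        self._sequence = itertools.count(1)  # hands out 1, 2, 3, ...
        self.rows = {}

    def next_id(self):
        return next(self._sequence)

    def insert(self, row_id, value):
        if row_id in self.rows:
            raise KeyError(f"duplicate id {row_id}")
        self.rows[row_id] = value


def suite_passes(dynamic_tests, static_id=5):
    """Run `dynamic_tests` sequence-based tests, then one test with a hard-coded ID."""
    db = FakeDatabase()  # set up once and shared, like the expensive real setup
    try:
        for n in range(dynamic_tests):
            db.insert(db.next_id(), f"dynamic-{n}")
        db.insert(static_id, "static")  # the test that assumes its ID is free
        return True
    except KeyError:
        return False


def fixed_suite_passes(dynamic_tests):
    """Same suite, but the formerly static test also asks the sequence for its ID."""
    db = FakeDatabase()
    try:
        for n in range(dynamic_tests):
            db.insert(db.next_id(), f"dynamic-{n}")
        db.insert(db.next_id(), "formerly static")
        return True
    except KeyError:
        return False


print(suite_passes(0))        # static test run by itself
print(suite_passes(4))        # suite before the new test was added
print(suite_passes(5))        # one more test: the sequence reaches the static ID
print(fixed_suite_passes(5))  # every ID comes from the same sequence
```

In this toy version the static test passes alone and passes with four neighbors, then fails the moment a fifth test pushes the sequence up to its hard-coded ID. That matches the symptoms in the story: fine by itself, fine for a long time, broken by a seemingly unrelated addition.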
Notice how I intentionally left the punchline to the very end? It turns out that isn’t the important part. The important part is how you narrow down the possibilities. There are a zillion things that can go wrong, but very few that really do go wrong.
If You’re NOT a Programmer:
Okay, so you know a little bit about computers. You aren’t dumb, you just want to get stuff done (or hurry up and watch cat videos). I won’t judge you; computers are spectacular working machines and time wasters. So what’s wrong?
(At this point I grasped for computer problems that were not easy to solve, but solvable nonetheless. If the following problem seems obvious, it’s because you’re too smart.)
So you want to play cat videos? But, for some reason, the volume is always kind of low. It didn’t used to be low, but now it is, and you kind of feel stuck not knowing where to start. Let’s walk through the thought process that results in fixing this:
- Ask Google. Google is incredibly smart, and it’s unlikely you are the first person to have problems with low volume. Ask your question in the plainest way possible, without trying to be overly clear. A lot of people ask their questions in a colloquial manner that assumes there is a person on the other side of their screen to answer them. Google is made to work like this.
- Okay, so you looked at the first page of Google results, and there weren’t any good leads. There’s no hope of finding anything useful on the second page, so abandon that plan and stare deep into the screen in front of you. What do you see? Which of those things might be volume related?
- Assuming you are on YouTube, there is a little sound icon on the video player. The knob on your actual speakers is turned up pretty high, so why is there a superfluous extra control there? (For a complicated set of reasons, but you’ll have to trust me on this one.)
- Hmm. That’s at maximum, but the volume still seems kinda low. But, if YouTube has a volume control, maybe there could be other places on the computer that affect the volume. Looking around, there’s another volume-looking thing in the bottom right-hand corner. It’s the Windows volume controller.
- Clicking on that, it only shows one volume slider thingy, and that’s at maximum too. However, whenever sound plays, the volume slider blinks. It only ever seems to go about halfway up the slider. Maybe that’s a clue?
- Before clicking anywhere else on the screen, notice there’s a button called “Mixer” underneath the volume slider. I think I know mixers can change the volume, so let’s give it a click! That opens up another window with a lot more sliders, not all of them at maximum! What’s more, the Chrome slider is only about halfway up. Sliding this all the way up returns the volume to max, and it sounds right again. Hooray!
Again, I have no idea what computer you have, or if those menus really exist for you, or if you are even using Windows. Those don’t matter. (They do, but they’re beside the point.) The idea is to explore; to find out how computer sound works.
If You’re a Human Being:
Figuring out why a computer is acting wonky is an exercise in patience. Working with computers is not particularly hard, but you can’t give up when faced with a problem you have no idea how to fix. (It is for this reason that I have great respect for Kindergarten teachers, as I would lose my mind trying to deal with a bunch of children.)
And, if that doesn’t work, take a break. Think about things. Go get a cup of coffee, or go for a walk. Something about staring at a screen or listening to music mutes your mind. When you come back, you’ll have new ideas about how to approach the unapproachable.