Over the new year holiday time I had a chance to get away from it all, and snuck up to Finland to sit in a lodge on the Gulf of Finland, sip coffee, take saunas and read. I brought along a few books, the only programming one being Brian W. Kernighan and Rob Pike's "The Practice of Programming."
Cabin: woke up like this. 😂 😍 pic.twitter.com/spr130gFzR (katharine jarmul, @kjam, January 3, 2017)
I received the book as a loan from a long-time mentor, who helped me first learn how to write production-ready code. I remember reading it in 2008 and having difficulty understanding all the concepts. As I moved from city to city, I always thought I should probably mail it back, or perhaps read it again first, then mail it back...
Practice of Programming: The Book
The book is 18 years old. It covers C programming. It handles issues like signed versus unsigned integers, piping data between systems with mismatched byte order and a few other topics that affect neither my programming nor that of most of the folks I know. Why reread it?
Brian W. Kernighan and Rob Pike should need no introduction, but here is one in case you are like me and getting older and dependent on Google. Kernighan is a contributor to the C programming language and co-author of the famous book, "The C Programming Language". He worked at Bell Labs with Rob Pike, famous in his own right for developing numerous parts of the Unix system we all know and love today; and the whole Go language thing... #nbd.
What gems still held my attention, 18 years after they were published and nearly 9 years after I was first handed the book? Many more than you might think. Here are a few:
Chapter 5 is devoted solely to debugging and has many informative sections, including tips on finding patterns, rubber ducking (but with a teddy bear instead), analyzing data to help find programming bugs, and how to solve "non-reproducible" errors. The section that is truly timeless is 5.7 Other People's Bugs, which valiantly takes on how to find, manage and report other programmers' errors.
Including this tidbit:
If you think that you have found a bug in someone else's program, the first step is to make absolutely sure it is a genuine bug, so you don't waste the author's time and lose your own credibility.
From someone who has written and helped fix many bugs, this resonated. Especially when it seems the standard today is to simply report a GitHub issue and let the author(s) and contributors figure it out. If most of us spent an extra day debugging the issue, we might even fix it ourselves (we have the source code) or at least present a well-proven test case for the author(s) to help alleviate the burden on open-source maintainers.
In that vein, Kernighan and Pike write:
Finally, put yourself in the shoes of the person who receives your report. You want to provide the owner with as good a test case as you can manage. It's not very helpful if the bug can be demonstrated only with large inputs, or an elaborate environment, or multiple supporting files. Strip the test down to a minimal and self-contained case. Include other information that could possibly be relevant, like the version of the program itself, and of the compiler, operating system and hardware.
I feel like a checklist of these points should be required before submitting bug reports. A kind of Joel Test for error reporting.
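As a rough idea of what the "other information" part of such a checklist could look like for a Python tool, here is a minimal sketch using the standard library's `platform` module. The function name `bug_report_environment` and the placeholder program name and version are my own assumptions for illustration, not anything from the book.

```python
import platform


def bug_report_environment(program_name, program_version):
    """Collect the environment details a maintainer is likely to ask for."""
    return {
        "program": f"{program_name} {program_version}",
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": f"{platform.system()} {platform.release()}",
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    # "mytool" and "1.4.2" are placeholder values for illustration.
    for key, value in bug_report_environment("mytool", "1.4.2").items():
        print(f"{key}: {value}")
```

Pasting that output under a stripped-down test case covers most of what the passage asks for.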
Chapter 6 is devoted to testing. As a fan of testing (even for your data!), I found this chapter stood out, not just for its methodical evaluation of how, when and why to write tests, but also for its use of data validation (!!) and test automation (!!!). The fact that good developers still have to justify including these types of tests in their test suites (or convince managers and higher-ups that the tests are necessary at all) is a sad and telling reflection of our priorities and (non)adherence to lessons learned long ago.
I especially liked this passage:
It is important to test your own code: don't assume that some testing organization or user will find things for you. But it's easy to delude yourself about how carefully you are testing, so try to ignore the code and think of the hard cases, not the easy ones. To quote Don Knuth describing how he creates tests for the TEX formatter, "I get into the meanest, nastiest frame of mind that I can manage, and I write the nastiest [testing] code I can think of; then I turn around and embed that in even nastier constructions that are almost obscene."
I literally spit my coffee out when reading this bit, imagining the coders of the world finding their worst selves and attacking their product with vigor and malice. But it IS great advice. How many times have I written the obvious test instead of devoting a few hours or a day to figuring out how to break my own code? 2
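To make the "nasty frame of mind" concrete, here is a small sketch in Python. The `truncate` helper is hypothetical (I made it up for this example). The point is the list of cases, which pokes at boundaries and degenerate inputs instead of the one obvious call.

```python
def truncate(text, width):
    """Shorten text to at most `width` characters, marking cuts with '...'."""
    if width < 0:
        raise ValueError("width must be non-negative")
    if len(text) <= width:
        return text
    if width <= 3:
        # No room for text plus an ellipsis, so cut hard.
        return text[:width]
    return text[: width - 3] + "..."


def run_nasty_cases():
    """Run boundary and degenerate inputs; return how many checks passed."""
    cases = [
        (("", 0), ""),              # empty input, zero width
        (("abc", 0), ""),           # nothing may be kept
        (("abcdef", 3), "abc"),     # width too small for an ellipsis
        (("abcdef", 4), "a..."),    # smallest width that fits one
        (("abcdef", 5), "ab..."),   # one under the text length
        (("abcdef", 6), "abcdef"),  # exactly at the limit
        (("na\u00efve caf\u00e9", 8), "na\u00efve..."),  # non-ASCII text
    ]
    for args, expected in cases:
        assert truncate(*args) == expected, (args, truncate(*args))
    try:
        truncate("abc", -1)
    except ValueError:
        pass
    else:
        raise AssertionError("negative width should be rejected")
    return len(cases) + 1


if __name__ == "__main__":
    print(run_nasty_cases(), "checks passed")
```

The happy-path test would have been a single line. Most of the bugs I would expect in a function like this live in the other seven.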
The final chapter that struck me as still very much applicable today was Chapter 8 on Portability. This was a surprise, as I assumed the portability issues in 1999 didn't reflect any I might have seen as a developer. Grrllll, was I wrong...
I can't even begin to explain my joy and amusement at turning the page and reading this:
If one lives in the United States, it's easy to forget that English is not the only language, ASCII not the only character set, $ not the only currency symbol, dates can be written with the day first, times can be based on a 24-hour clock, and so on.
The data errors, report misunderstandings and general grief I have seen in my career due to these misconceptions (sometimes my own, of course) are too many for me to recount. Additionally, the fact that we still debate the need for internationalization of smaller tools, or even our own websites, is interesting to note, given that an 18-year-old book already outlines internationalization as a requirement.
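The day-first versus month-first problem is easy to show in a few lines of Python. This is only an illustrative sketch, and the `raw` value is a made-up example of what might turn up in a report export.

```python
from datetime import datetime

raw = "03/01/2017"  # hypothetical value from a report export

# The same string, read with two equally plausible conventions.
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()

print(as_month_first.isoformat())  # 2017-03-01
print(as_day_first.isoformat())    # 2017-01-03

# Writing dates out in ISO 8601 form removes the guesswork for the next reader.
print(datetime(2017, 1, 3).date().isoformat())  # 2017-01-03
```

Both parses succeed without complaint, which is exactly why this kind of error survives all the way into a report.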
Beyond internationalization, Kernighan and Pike touch upon portability for different environments, and elaborate on the pitfalls of massive if/else or switch statements in compilers or setup configuration files. Their warning against modifying source for one particular install was succinct and useful:
When you modify a program to adapt to a new environment, don't begin by making a copy of the entire program. Instead, adapt the existing source. You will probably need to make changes to the main body of the code, and if you edit a copy, before long you will have divergent versions. As much as possible, there should only be a single source for a program; if you find you need to change something to port to a particular environment, find a way to make the change work everywhere.
Finally, something I think we have caught up to (although should still remember)! Version control, generalization (when useful) and open-source libraries eating the world. Hooray us!
Other fun (to me) notes
- An entire section on self-generating code and ideas for better code written by machines.
- Seeing print("%s", str) and doing a double-take to make sure I was not reading Python.
- A paragraph outlining (very politely) how ridiculous it is that we still need to support carriage returns (\r) despite the fact that computers have no carriages.
- Learning that "big endian" is a reference to Jonathan Swift's Gulliver's Travels.
- Code to roll your own RegEx parser in C.
- Telnetting from machine to machine to copy files and using checksum (sum) to test if the copy was properly performed.
- A still semi-functional TCL and Perl script to scrape the web. See footnote for the code.3
- Checking your email with grep
Where did I save that mail from Bob?
% grep '^From:.* [email protected]' mail/*
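The copy-then-checksum habit from the list above still holds up. Here is a minimal Python sketch of the same idea. Note that it swaps the book's `sum` utility for SHA-256 from `hashlib`, and the helper names `file_digest` and `copy_and_verify` are my own, for illustration only.

```python
import hashlib
import os
import shutil
import tempfile


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file, then confirm both copies hash to the same value."""
    shutil.copyfile(source, destination)
    return file_digest(source) == file_digest(destination)


if __name__ == "__main__":
    # Throwaway files in a temporary directory, for demonstration only.
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "original.txt")
        dst = os.path.join(workdir, "copy.txt")
        with open(src, "wb") as handle:
            handle.write(b"hello from 1999\n")
        print(copy_and_verify(src, dst))
```

For a copy between two machines, you would compute the digest on each side and compare the two strings, much as the book does with `sum`.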
Granted, some of the content in this book was merely fun review for me, and several themes are problems of a different era, but I found it remarkably relevant given its age. We often talk about books even a year old as outdated, but this one made me reconsider how easy it is to treat every new thing as just that: NEW. Most often, these are the same programming paradigms the folks at Bell Labs had been working on since the '80s.
Moral of the story: Never too old to (re)read a good book.
Oh, and, Ryan... I'm sending your book back! Thanks for the loan! 😇
Debating doing a series on some of these older but still relevant texts. If this post is interesting to you, please let me know! ↩
Check out the unmodified 18-year-old code as a Gist. The exact usage from the book is: geturl.tcl $1 | unhtml.pl | fmt.awk. I couldn't get piping to work with my current setup, but the scripts still worked using tclsh and perl as a series of commands (granted, most sites reject or don't respond to HTTP/1.0 requests without headers anymore... 😏) ↩