# Reflections on a year of reading Knuth

Want to feel stupid? Read Knuth! Want to get smarter? Read Knuth! For a long time, I've seen other programmers write in awed, hushed tones about the multi-volume series of books written by Donald Knuth titled "The Art of Computer Programming" (TAOCP). As somebody who genuinely enjoys reading computer programming books — I actually lost track of time reading Richard Steven's "TCP/IP Illustrated" many years ago — when I consistently hear so many positive things about a book or a series of books, it makes it onto my to-read list. I finally got around to picking up a copy of the first three volumes of TAOCP little over a year ago, and I just finished the first volume, having averaged about 30 minutes a day of reading or working problems.

These books are written college-textbook style: a few pages of exposition followed by some
sample exercises to help you solidify the concepts presented. These books include a *lot*
of sample exercises — in some cases, there are more pages of exercises than explanatory
material. Almost all of them have answers included in the back of the book if you get stuck.

I worked, or at least attempted to work, every single problem in the first volume. In some cases I settled for just understanding what the question was trying to ask for. In some cases I failed even to accomplish that (don't judge me until you try it yourself). Each problem is assigned a difficulty rating from 0-50 where 0 is trivial and 50 is "unsolved research problem" (in the first edition, Fermat's last theorem was listed as a 50, but since Andrew Wiles proved it, it's bumped down to a 45 in the current edition). I think I was able to solve most of the problems rated < 20 — it was hit and miss beyond that. Most of the problems fell into the 20-30 difficulty range, but Knuth's idea of "difficult" is subjective, and problems that he considers to be of average difficulty ended up stretching my comparatively tiny brain painfully. I've never climbed Mount Everest, but I imagine the whole ordeal feels similar: painful while you're going through it, but triumphant when you reach the pinnacle.

The first third or so of the book is pure math: specifically, discrete math. As a computer
programmer by trade and by education, I've always had a tumultuous relationship with math. When I was an
undergraduate studying computer science, I dutifully took all the required math courses, including three
semesters of brain-crushing calculus. At the time, I viewed advanced, college-level math as sort of a "dedication test":
I didn't see it as practical, or useful, or applicable to my field of study, but instead as a weed-out program to see who was serious
about proving their worthiness for this degree. The degree was a prize to be obtained, but you
had to jump through some hoops to obtain it. Although
I've since come around and accepted that math is useful — and I'm damned glad I did spend
a year and a half learning calculus — Knuth manages in this book to underscore, over and over again, that
math is practical and eminently applicable to the field of computer science. This was by far the
hardest section of the book for me: Knuth considers a proof by induction, for example, a triviality,
but I personally still have to stop and think to produce one. Some of the problems depend on
what he calls "higher math", which is usually calculus with a bit of linear algebra sprinkled in,
but always expertly applied. I muscled through,
trying to solve each exercise. I got pretty lost in the chapter on generating functions, but I
learned a *lot* from the section as a whole. There were plenty of topics in here that I
considered myself familiar and proficient with, like number theory and binomial coefficients, but
Knuth had more to teach me than I ever would have imagined.

After about a hundred or so pages of pure math, he finally gets around to the computer programming part and starts by presenting a hypothetical computer architecture and associated assembler language in which all of the code samples in the remainder of the book are written. It seems that a lot of the people who complain about this series of books complain specifically about his decision to present code samples in assembler — and not just any assembler, a pretend one that he made up! Knuth gets a little defensive about that decision in the preface, pointing out that if he had written the book using the popular high-level language of the time, all of the examples in the first edition would have been in Fortran, then re-written in C for the second edition and again in Java for the third. I'm actually glad that he presented everything in a made-up, "ideal" assembler language: after having spent time learning how to code in it and understanding how it worked, there was no "magic" behind any of the examples. There are freely available emulators for the MIX assembly language (I used MDK), so you can (and I did) execute the sample programs and test out your answers to the sample exercises.

The later parts of the book were easier going for me, once I got the hang of Knuth's MIX assembler,
but that's not to say that they were easy! For one thing, MIX is based on ancient assembler
concepts. I learned assembler in the late 80's on a 6502, so I had at least some notion of registers and overflow
flags and such, but MIX is closer to, say, the IBM 700 series (that predated even System/360!).
The first edition of the first book was written in 1963 and follows the state of the art for
computer architecture at the time.
There's a *lot* you're going to have to learn about this architecture and development style
if you're younger than about 60.
Knuth thinks nothing of writing self-modifying code, for instance, and doesn't even call it out
as a "trick" when he does it; to him, it seems perfectly natural. Everything operates in "words"
of five (six-bit!) bytes and a lot of the instruction set is dedicated to shifting and masking
these words in ways that we would use AND/OR and XOR to accomplish in a modern computer. Knuth
has since updated the architecture
and created MMIX for the later volumes of the book, and has stated that he plans to go back and
rewrite the first three volumes to be use MMIX as well, but for now, everything in the current
edition of the first three volumes is still in MIX.

Finally, after spending half the book covering the requisite mathematics and presenting the fictitious platform on which the remaining examples are shown, he starts to cover what you might consider "basic" programming techniques like memory allocation, stacks, queues, linked lists, trees and garbage collection. I thought I knew these concepts inside and out and that there was nothing else I could possibly be taught — how wrong I was! Again, I was in awe of just how deep Knuth could manage to go down each rabbit hole to examine the limits to which all of these concepts can be pushed. Have you ever wondered how many different binary trees there are? Or how garbage collection works when there's almost no memory left for it to use? Here you'll find the distillation of 50 years of research on the boundaries of computer science.

I've seen it suggested that these books be treated as reference books rather than read cover to cover. I'm inclined to disagree. Supposing that you had a copy on your bookshelf, and you were trying to implement a garbage collection routine. You could glance through the table of contents, see that garbage collection is discussed in section 2.3.5, and flip to the corresponding page. But if you did, you'd discover that the text refers back to some of the mathematical results that were presented and proven in previous sections (sometimes even in the exercises of the previous sections), and all of the examples are written out in MIX assembler. To make much sense of it as a reference work, you'd have to at least work through the MIX assembler description and some of the math sections. All in all, I think this book was written to be read all the way through, but also in a way that you could easily refer back to it years later, after you'd absorbed it all once.

I've also seen people mention that they read it, but only skimmed the exercises. I guess everybody is different, but I can't fathom trying to ingest all or even most of this without working through the end-of-chapter exercises. In many cases — maybe even most — I found myself still confused after the end of the chapter, but that working through the exercises solidified the concepts that I was still unclear on in a very controlled way. Quite a few sections are nothing but proofs of complex theorems, and the exercises are there to show you what the theorems are actually useful for. Part of Knuth's brilliance is in leading you down a path of learning such that you walk away with a deep understanding of the material, and actually retain it months or even years later: but you have to follow along with him to get the most out of it. I'd go so far to say that most of the valuable content of the book is in the exercises: in spite of the page count, most of it isn't actually published!

I definitely plan to finish reading the series. It may well be 2020 before I'm finished, but in the end I think it will be well worth the time. No, there are no "marketable skills" to be obtained here. There's no discussion on threading or security. There's nothing that you'll learn from Knuth that you can list on your resume. Nobody is hiring MIX programmers (if they were, I wouldn't apply anyway). It may even well be that I could have just as profitably spent the time I spent reading Knuth's "Art of Computer Programming" doing sudoku puzzles, or crosswords, or reading novels or even reading comic books, although I do feel like I learned a lot and am a better programmer for having completed it.

xerxenka, 2017-01-04If you liked the Math preliminaries in Vol 1 there's a draft of 4B out now on Knuth's site with 'Mathematical Preliminaries Redux' another 100 or so pages of new material covering martingales and probability.

Josh, 2017-01-17xerxenka, 2017-05-28TAOCP is also the only book(s) I am aware of that cover historical references of all material and even has exercises for them, like 7.2.1.7 which has a very thorough chapter on the history of generating combinatorial patterns "Make a patch to Takakazu Seki's 17th century algorithm so that it will complete".

Victor, 2017-08-17xerxenka, 2017-09-02To satisfy the second requirement, search for "15-213 schedule CMU" to get the class page for Intro to Computer Systems which has free recorded lectures on youtube that cover bit level representation like Two's Complement, X86-64 assembly language, stack frames, cache memories (Knuth goes into the principle of locality in later volumes) other program optimization, and basic architecture so you can understand his MIX arch. Like Josh explains don't expect to do this at any sort of fast pace, I started ~6 years ago and I just this year got to the latest published fascicle as Knuth sends you off on many tangents each chapter with his additional book/paper recommendations.