Find the Bug: Errata

Escape Escapade (p. 15)
Equal Sign Embarrassment (p. 22)
Subscript Sloppiness (p. 52)
Line Number Loopiness (p. 56)
Kanji Confusion (p. 71)
Backspace Blunder (p. 71)
Hint Harmfulness (p. 74)
Capitalization Collapse (p. 78)
Initialization Imperfection (p. 97)
Sorting Stumble (p. 140)
Loop Lameness (p. 146)
Database Disaster (p. 187)
Blank String Blooper (p. 232)
Parenthesis Problem (p. 255)
State Table Slipup (p. 256)

 

p. 15: Chapter 2 (Tips on Walking Through Code)

Backslash characters in C character literals also need to be escaped, so the line

if (buffer[p] == '\') {

should be

if (buffer[p] == '\\') {

This is an especially ignominious oversight, since the snippet of code in question is supposed to be checking for escape characters.

(5/8/06 - Thanks to David Malone)

 

p. 22: Chapter 2 (Tips on Walking Through Code)

Proof that you need to check every bit of code, not just the bits in the main programs. The discussion of the "implied else" has a line of code

if (x = 5) {

which of course should be

if (x == 5) {

This is about the most basic error you can make in C.

(4/25/05 - Thanks to Robert Schaefer)

 

p. 52: Chapter 3 (C), Program 4 (Memory Copy)

In the description of the test_buffer array in the hints, I repeat the subscripts 0 through 2 instead of continuing on with 3 through 5. For example, the first hint currently reads:

test_buffer[0] == 0x01
test_buffer[1] == 0x02
test_buffer[2] == 0x03
test_buffer[0] == 0x04
test_buffer[1] == 0x05
test_buffer[2] == 0x06

but it should actually read:

test_buffer[0] == 0x01
test_buffer[1] == 0x02
test_buffer[2] == 0x03
test_buffer[3] == 0x04
test_buffer[4] == 0x05
test_buffer[5] == 0x06

And the second hint (which uses the same array) is also wrong.

(8/13/06 - Thanks to Marc Cohen)

 

p. 56: Chapter 3 (C), Program 5 (Parse a String into Substrings)

In the explanation of the two possible errors, the first bullet point states "curr_character remains set from the assignment on line 35." The assignment is really on line 24 (it's the same assignment that needs to move up to around line 15 to fix the bug to begin with).

(6/19/07 - Thanks to Paolo Bonzini)

 

p. 71: Chapter 3 (C), Program 10 (Kanji Backspace)

The program incorrectly describes the format of lead and trailing bytes in a DBCS string. It is incorrect to assume that a character with the high bit off is always a trailing byte; rather, the range of lead and trailing bytes is dependent on the specific code page. In Windows, this can be tested with the IsDBCSLeadByte() function.

The program is technically correct given the description stating that having the high bit on makes a byte a lead byte, but to work correctly on real DBCS strings, the description of what constitutes a lead and trailing byte should be changed. Also, line 9 that reads:

if ((*(current-1)) & 0x80) {

should read: 

if (IsDBCSLeadByte(*(current-1))) {

and line 18 which reads:

if (((*location) & 0x80) == 0) {

should read:

if (!IsDBCSLeadByte(*location)) {
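To illustrate why a high-bit test alone cannot classify bytes, here is a small Python sketch. Shift-JIS is chosen only as one example code page (the book's program does not name one), and high_bit_set is a hypothetical helper, not something from the book.

```python
# Shift-JIS is used purely as an example code page (an assumption).
halfwidth_a = "\uff71".encode("shift_jis")  # half-width katakana A: one byte
so = "\u30bd".encode("shift_jis")           # full-width katakana SO: two bytes


def high_bit_set(byte):
    """Hypothetical helper mirroring the (x & 0x80) test in the program."""
    return (byte & 0x80) != 0


# A single-byte character whose only byte has the high bit on,
# so "high bit on" does not imply "lead byte".
single_byte_with_high_bit = len(halfwidth_a) == 1 and high_bit_set(halfwidth_a[0])

# A two-byte character whose trailing byte has the high bit off.
trail_has_high_bit = high_bit_set(so[1])
```

Under these assumptions, the first case is a byte that the high-bit test would wrongly treat as a lead byte, which is the situation a code-page-aware check such as IsDBCSLeadByte() is meant to handle.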

(11/1/04 - Thanks to Larry Osterman)

 

p. 71: Chapter 3 (C), Program 10 (Kanji Backspace)

The program does not work correctly if string_start is equal to current on entry. It also doesn't work if current does not point anywhere within string_start, but the particular case of them being equal should either be checked for, or else noted in the description as not an input that the program has to worry about.

When first written the program had such a check, but at one point during the authoring of the book I was attempting to conform all the programs to 43 or 44 lines so they would fit on one page, and such extra checks were removed.

(12/19/04 - Thanks to Joshua Bell)

 

p. 74: Chapter 3 (C), Program 10 (Kanji Backspace)

Further proof that Kanji Backspace is the hidden reef on which all programmers eventually capsize: the test_string in both hints does not match up with the descriptions of how the code misbehaves. The descriptions both discuss the test_string as if the second byte were an H byte. So the fix would be to change:

test_string[1] == 0x68

to something that initialized it to an H byte, such as:

test_string[1] == 0xe8

The text description of the string, which begins, "it contains an HL character..." is correct as the text stands now, but given the change above would also need to change since the first character would now be an HH character.

(8/22/06 - Thanks to Anantha Devarajan)

 

p. 78: Chapter 4 (Python), Brief Summary of Python

In the string slicing example at the top of the page, it states that the resulting value of newname will be "Tjon123". In fact, it will be "TJon123" (the J is capitalized).

(5/8/06 - Thanks to David Malone)

 

p. 97: Chapter 4 (Python), Program 2 (Find a Substring)

If outerString is empty, outerLen will be set to 0 on line 14 and range(outerLen) on line 18 will be empty, which means that i remains unassigned based on how Python handles for loop variables. As a result the return statement on line 34 generates an exception.

One solution is to initialize i to 0 at the beginning of the program.
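A minimal Python sketch of the behavior described above. The function names are hypothetical and the loop body is reduced to nothing; only the handling of the loop variable matters here.

```python
def last_index_unassigned(outer_string):
    """The loop variable is never bound when the range is empty."""
    outer_len = len(outer_string)
    for i in range(outer_len):
        pass
    return i  # raises UnboundLocalError if outer_string is empty


def last_index_initialized(outer_string):
    """Same loop, with i initialized before the loop as suggested."""
    i = 0
    outer_len = len(outer_string)
    for i in range(outer_len):
        pass
    return i
```

Calling last_index_unassigned("") raises an exception at the return statement, while last_index_initialized("") returns 0.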

(4/20/05 - Thanks to Demetrios Biskinis)

 

p. 140: Chapter 5 (Java), Program 4 (Draw a Triangle on the Screen, Part II)

The code on lines 29-38, which orders the three points by x-coordinate, has a bug. It won't work if pt[1].x is equal to pt[2].x and both are less than pt[0].x. The if() on lines 29-30, which is supposed to figure out if pt[1].x is the lowest point:

if ((pt[1].x < pt[2].x) &&
    (pt[1].x < pt[0].x)) {

won't work in this case, because the result will be false even though we do want to swap pt[1] with pt[0]; the code instead should read:

if ((pt[1].x <= pt[2].x) &&
    (pt[1].x < pt[0].x)) {

The if on lines 32-33, which appears to have the same error, is actually OK because the first if() will catch the pt[1].x == pt[2].x case.
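The tie case can be checked with a short Python sketch. This is a hypothetical reconstruction of the ordering step that works on bare x values, not the book's Java code; the second and third comparisons are assumptions about what the rest of lines 29-38 do.

```python
import operator


def order_by_x(xs, first_cmp):
    """Order three x values; first_cmp is the pt[1]-versus-pt[2] comparison."""
    p = list(xs)
    if first_cmp(p[1], p[2]) and p[1] < p[0]:
        p[0], p[1] = p[1], p[0]      # pt[1] is lowest: swap with pt[0]
    elif p[2] < p[1] and p[2] < p[0]:
        p[0], p[2] = p[2], p[0]      # pt[2] is lowest: swap with pt[0]
    if p[2] < p[1]:
        p[1], p[2] = p[2], p[1]      # order the remaining two (assumed step)
    return p


tie_case = [5, 2, 2]                               # pt[1].x == pt[2].x < pt[0].x
with_strict = order_by_x(tie_case, operator.lt)    # original comparison
with_or_equal = order_by_x(tie_case, operator.le)  # corrected comparison
```

With the strict comparison neither branch fires and the values are left unordered; with <= the first branch handles the tie.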

This is the first really bad erratum reported, in that this bug is precisely the kind I was trying to include in the book, and it would be completely reasonable for someone reading the program to think they had found the bug if they found this one. In fact, Suggestion #1 even recommends double-checking this bit of code. So if you found this bug instead of the one that I inserted intentionally, give yourself full credit.

(2/22/05 - Thanks to Ian Spence)

 

p. 146: Chapter 5 (Java), Program 6 (Check if a List Has a Loop)

The introductory text for this bug states that "A list has a loop if there is some ListNode in it for which node.next is equal to head." This is too specific; a list has a loop if there is some ListNode in it for which node.next points to any predecessor of itself.

The code itself is fine.
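To make the difference between the two definitions concrete, here is a Python sketch. It uses Floyd's two-pointer technique, which is not necessarily the approach taken in the book's Java program, and the ListNode class and function names are stand-ins for illustration.

```python
class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None


def has_loop(head):
    """Floyd's cycle detection: true for a loop back to any earlier node."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


def some_node_points_to_head(head):
    """The too-specific definition from the introductory text."""
    seen = set()
    node = head
    while node is not None and id(node) not in seen:
        seen.add(id(node))
        if node.next is head:
            return True
        node = node.next
    return False


# a -> b -> c -> b: a loop that never returns to the head.
a, b, c = ListNode("a"), ListNode("b"), ListNode("c")
a.next, b.next, c.next = b, c, b
```

For this list, has_loop(a) is true even though no node's next is equal to head.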

(4/26/05 - Thanks to Ralf Holly)

 

p. 187: Chapter 6 (Perl), Program 4 (SimpleDatabase)

The various regular expressions used in the program to match input lines (in the code on lines 4, 9, 12, 17, and 20) do not anchor their match to the beginning of the line. Thus the input line:

put shmooGET Smith

will be interpreted as a "get" command, not a "put" (and similarly for other commands). The solution is to begin each regular expression with "^\s*", which indicates the beginning of the line followed by any number of spaces.

The match for "dump" on line 12 should also end with "\s*$" to anchor it to the end of the line; otherwise a line like "dumpster" would be interpreted as a "dump".
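The effect of anchoring can be sketched with Python's re module. The patterns below are hypothetical stand-ins, not the regular expressions from the book's Perl program, and case-insensitive matching is assumed because the example input uses "GET".

```python
import re

# Hypothetical patterns for illustration only.
UNANCHORED_GET = re.compile(r"get\s+(\w+)", re.IGNORECASE)
ANCHORED_GET = re.compile(r"^\s*get\s+(\w+)", re.IGNORECASE)
START_ONLY_DUMP = re.compile(r"^\s*dump", re.IGNORECASE)
FULLY_ANCHORED_DUMP = re.compile(r"^\s*dump\s*$", re.IGNORECASE)

line = "put shmooGET Smith"

# Without "^\s*" the pattern finds "GET Smith" in the middle of a put line.
misread_as_get = UNANCHORED_GET.search(line) is not None
read_as_get_when_anchored = ANCHORED_GET.search(line) is not None

# Without "\s*$" the dump pattern accepts "dumpster".
dumpster_start_only = START_ONLY_DUMP.search("dumpster") is not None
dumpster_fully_anchored = FULLY_ANCHORED_DUMP.search("dumpster") is not None
```

Only the anchored versions reject both the embedded "GET" and the "dumpster" line.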

This is the second bug reported that really could have been the bug that I meant to put in there--in fact I like it better than the bug I did put in.

(5/7/05 - Thanks to Michael Heyman)

 

p. 232: Chapter 7 (x86 Assembly Language), Program 3 (Join Strings With a Delimiter)

The program has an error in how it checks whether we are still processing the first input string, and thus don't need to add a delimiter to the output buffer. The code on line 22 does this check:

cmp word ptr [edx-2], 0 ; length is 0, first string

but in fact, if the input has several empty strings to start (which should have delimiters between them), then the output length will remain 0 while those are processed.

The solution is non-trivial; since you can't use the output length to tell if this is the first string, you need to add a flag of some sort which is set around line 13 (outside of the nextstring loop), checked on line 22, and then always cleared around line 35 (before jumping back to the beginning of the loop).
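The two approaches can be compared in a short Python sketch. This models only the logic, not the assembly program; the function names and the string-based output buffer are assumptions made for illustration.

```python
def join_using_length(strings, delimiter):
    """Treats "output is still empty" as "this is the first string"."""
    out = ""
    for s in strings:
        if len(out) != 0:
            out += delimiter
        out += s
    return out


def join_using_flag(strings, delimiter):
    """Tracks the first string with an explicit flag instead."""
    out = ""
    first = True          # set once, before the loop
    for s in strings:
        if not first:     # checked where the length test used to be
            out += delimiter
        out += s
        first = False     # always cleared before the next iteration
    return out
```

For the input ["", "", "a"] with "," as the delimiter, the length-based version produces "a" while the flag-based version produces ",,a".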

This is the third "real" bug, another one that someone could easily find and think that it was the intended bug.

(6/20/07 - Thanks to Paolo Bonzini)

 

p. 255: Chapter 7 (x86 Assembly Language), Program 9 (Check if Parentheses Match in Source Code)

The program does not check for the paren depth (stored in ecx) going negative. Thus, an input such as:

)(

will return 1 in eax, indicating that the parentheses match, when really they don't. The fix would be to add a check between lines 34 and 35:

cmp ecx, 0
jl done
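A stripped-down Python sketch of the same idea follows. It ignores comments and string literals, which the real program has to handle, and it assumes that the early exit should report a mismatch by returning 0.

```python
def parens_match_unchecked(text):
    """Only looks at the final depth, so ")(" appears balanced."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
    return 1 if depth == 0 else 0


def parens_match_checked(text):
    """Bails out as soon as the depth goes negative."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:   # the added check
                return 0
    return 1 if depth == 0 else 0
```

Here parens_match_unchecked(")(") returns 1, while parens_match_checked(")(") returns 0.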

(2/3/05 - Thanks to Jamie Jason)

 

p. 256: Chapter 7 (x86 Assembly Language), Program 9 (Check if Parentheses Match in Source Code)

The state table does not correctly handle a comment that looks like:

/* comment **/

When it sees the first * at the end of the comment, it correctly moves from state 3 to state 4, where it is looking for a /; but when it sees the * right after that, it moves back to state 3 rather than remain in state 4. As a result, it doesn't "see" the end of the /* */ comment. The fix is to change the initialization of startable from:

* = "0324352"

to:

* = "0324452"

so that it remains in state 4 when it sees the second *.
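The effect of the one-character change can be traced with a small Python sketch. Only the two "*" rows come from the text above; the handling of "/" and of all other characters is an assumption made so the example can run, and it covers only states 0, 1, 3, and 4.

```python
STAR_ROW_ORIGINAL = "0324352"
STAR_ROW_FIXED = "0324452"


def final_state(text, star_row):
    """Run a simplified state machine and return the state it ends in."""
    state = 0
    for ch in text:
        if ch == "*":
            # Next state on "*", indexed by the current state (from the errata).
            state = int(star_row[state])
        elif ch == "/":
            # Assumed: 0 -> 1 (possible comment start), 4 -> 0 (comment closed).
            state = {0: 1, 4: 0}.get(state, state)
        else:
            # Assumed: 1 -> 0 (not a comment after all), 4 -> 3 (still inside).
            state = {1: 0, 4: 3}.get(state, state)
    return state


comment = "/* comment **/"
end_original = final_state(comment, STAR_ROW_ORIGINAL)
end_fixed = final_state(comment, STAR_ROW_FIXED)
```

Under these assumptions the original row ends in state 3, still inside the comment, while the fixed row ends back in state 0.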

(2/17/07 - Thanks to Mark Schaal)

 

(c) 2005 Adam Barr