Extended Example: Two Full Debugging Sessions

Now that we’ve looked at R’s debugging tools, let’s try using them to find and fix code problems. We’ll begin with a simple example and then move on to a more complicated one.

13.3.6.1 Debugging Finding Runs of Ones

First recall our extended example of finding runs of 1s in Chapter 2. Here is a buggy version of the code:

1  findruns <- function(x,k) {
2     n <- length(x)
3     runs <- NULL
4     for (i in 1:(n-k)) {
5        if (all(x[i:i+k-1]==1)) runs <- c(runs,i)
6     }
7     return(runs)
8  }

Let’s try it on a small test case:

> source("findruns.R")

> findruns(c(1,0,0,1,1,0,1,1,1),2)
[1] 3 4 6 7

The function was supposed to report runs at indices 4, 7, and 8, but it found some indices that it shouldn’t have and missed some as well. Something is wrong. Let’s enter the debugger and take a look around.

> debug(findruns)

> findruns(c(1,0,0,1,1,0,1,1,1),2)

debugging in: findruns(c(1, 0, 0, 1, 1, 0, 1, 1, 1), 2)
debug at findruns.R#1: {
    n <- length(x)
    runs <- NULL
    for (i in 1:(n - k)) {
        if (all(x[i:i + k - 1] == 1))
            runs <- c(runs, i)
    }
    return(runs)
}
attr(,"srcfile")
findruns.R

So, according to the principle of confirmation, let’s first make sure our test vector was received correctly:

Browse[2]> x

[1] 1 0 0 1 1 0 1 1 1

So far, so good. Let’s step through the code a bit. We hit n a couple of times to single-step through the code.

Browse[2]> n
debug at findruns.R#2: n <- length(x)
Browse[2]> n
debug at findruns.R#3: runs <- NULL
Browse[2]> print(n)
[1] 9

Note that after each single step, R tells us which statement would be the next one to execute. In other words, at the time we executed print(n), we had not yet executed the assignment of NULL to runs.

Note, too, that although normally you can print out the value of a variable by simply typing its name, we could not do so here for our variable n, because n is also the abbreviation for the debugger’s next command. Thus, we needed print().

At any rate, we found that the length of our test vector was 9, confirming what we knew. Now, let’s single-step some more, getting into the loop.

Browse[2]> n
debug at findruns.R#4: for (i in 1:(n - k)) {
    if (all(x[i:i + k - 1] == 1))
        runs <- c(runs, i)
}
Browse[2]> n
debug at findruns.R#4: i
Browse[2]> n
debug at findruns.R#5: if (all(x[i:i + k - 1] == 1)) runs <- c(runs, i)

Since k is 2 (that is, we are checking for runs of length 2), the if() statement should be checking the first two elements of x, which are (1,0).

Let’s confirm:

Browse[2]> x[i:i + k - 1]

[1] 0

So, it did not confirm. Let’s check that we have the correct subscript range, which should be 1:2. Is it?

Browse[2]> i:i + k - 1
[1] 2

Also wrong. Well, how about i and k? They should be 1 and 2, respectively. Are they?

Browse[2]> i
[1] 1
Browse[2]> k
[1] 2

Well, those do confirm. Thus, our problem must be with the expression i:i + k - 1. After some thought, we realize there is an operator precedence problem there, and we correct it to i:(i + k - 1).
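The precedence issue can be reproduced outside the debugger. The following is a minimal sketch, using the same values of i and k as in the session above; the variable names buggy and fixed are introduced here only for illustration. In R, the colon operator binds more tightly than binary + and -, so i:i + k - 1 is evaluated as (i:i) + k - 1.

```r
i <- 1
k <- 2

# ':' has higher precedence than '+' and '-',
# so this parses as (i:i) + k - 1, a single number
buggy <- i:i + k - 1

# Parentheses force the arithmetic to happen before the sequence is built
fixed <- i:(i + k - 1)

print(buggy)  # [1] 2
print(fixed)  # [1] 1 2
```

When in doubt about how R groups an expression involving the colon operator, adding explicit parentheses costs nothing and removes the ambiguity.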

Is it okay now?

> source("findruns.R")

> findruns(c(1,0,0,1,1,0,1,1,1),2)
[1] 4 7

No, as mentioned, it should be (4,7,8).

Let’s set a breakpoint inside the loop and take a closer look.

> setBreakpoint("findruns.R",5)
/home/nm/findruns.R#5:
 findruns step 4,4,2 in <environment: R_GlobalEnv>
> findruns(c(1,0,0,1,1,0,1,1,1),2)
findruns.R#5
Called from: eval(expr, envir, enclos)
Browse[1]> x[i:(i+k-1)]
[1] 1 0

Good, we’re dealing with the first two elements of the vector, so our bug fix is working so far. Let’s look at the second iteration of the loop.

Browse[1]> c
findruns.R#5
Called from: eval(expr, envir, enclos)
Browse[1]> i
[1] 2
Browse[1]> x[i:(i+k-1)]
[1] 0 0

That’s right, too. We could go another iteration, but instead, let’s look at the last iteration, a place where bugs frequently arise in loops. So, let’s add a conditional breakpoint, as follows:

findruns <- function(x,k) {
   n <- length(x)
   runs <- NULL
   for (i in 1:(n-k)) {
      if (all(x[i:(i+k-1)]==1)) runs <- c(runs,i)
      if (i == n-k) browser()  # break in last iteration of loop
   }
   return(runs)
}

And now run it again.

> source("findruns.R")

> findruns(c(1,0,0,1,1,0,1,1,1),2)

Called from: findruns(c(1, 0, 0, 1, 1, 0, 1, 1, 1), 2)
Browse[1]> i
[1] 7

This shows the last iteration was for i = 7. But the vector is nine elements long, and k = 2, so our last iteration should be i = 8. Some thought then reveals that the range in the loop should have been written as follows:

for (i in 1:(n-k+1)) {

By the way, note that the breakpoint that we set using setBreakpoint() is no longer valid, now that we’ve replaced the old version of the object findruns.

Subsequent testing (not shown here) indicates the code now works.
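For reference, here is a sketch of the function with both fixes applied (the parenthesized subscript range and the corrected loop bound), followed by a few checks of the kind the text alludes to. The checks themselves are not from the original session; they are assumptions about what reasonable follow-up tests might look like, and the sketch assumes k is no larger than length(x).

```r
findruns <- function(x, k) {
  n <- length(x)
  runs <- NULL
  # last valid starting index is n - k + 1
  for (i in 1:(n - k + 1)) {
    # parentheses make the subscript range i, i+1, ..., i+k-1
    if (all(x[i:(i + k - 1)] == 1)) runs <- c(runs, i)
  }
  return(runs)
}

x <- c(1, 0, 0, 1, 1, 0, 1, 1, 1)
print(findruns(x, 2))  # [1] 4 7 8
print(findruns(x, 3))  # [1] 7
```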

Let’s move on to a more complex example.

13.3.6.2 Debugging Finding City Pairs

Recall our code in Section 3.4.2, which found the pair of cities with the closest distance between them. Here is a buggy version of that code:

1  # returns the minimum value of d[i,j], i != j, and the row/col attaining
2  # that minimum, for square symmetric matrix d; no special policy on
3  # ties;
5  mind <- function(d) {
6     n <- nrow(d)
7     # add a column to identify row number for apply()
8     dd <- cbind(d,1:n)
9     wmins <- apply(dd[-n,],1,imin)
10    # wmins will be 2xn, 1st row being indices and 2nd being values
11    i <- which.min(wmins[1,])
12    j <- wmins[2,i]
13    return(c(d[i,j],i,j))
14 }
16 # finds the location, value of the minimum in a row x
17 imin <- function(x) {
18    n <- length(x)
19    i <- x[n]
20    j <- which.min(x[(i+1):(n-1)])
21    return(c(j,x[j]))
22 }

Let’s use R’s debugging tools to find and fix the problems.

We’ll run it first on a small test case:

> source("cities.R")
> m <- rbind(c(0,12,5),c(12,0,8),c(5,8,0))
> m
     [,1] [,2] [,3]
[1,]    0   12    5
[2,]   12    0    8
[3,]    5    8    0
> mind(m)
Error in mind(m) : subscript out of bounds

Not an auspicious start! Unfortunately, the error message doesn’t tell us where the code blew up. But the debugger will give us that information:

> options(error=recover)
> mind(m)
Error in mind(m) : subscript out of bounds

Enter a frame number, or 0 to exit

1: mind(m)

Selection: 1
Called from: eval(expr, envir, enclos)
Browse[1]> where
where 1: eval(expr, envir, enclos)
where 2: eval(quote(browser()), envir = sys.frame(which))
where 3 at cities.R#13: function ()
{
    if (.isMethodsDispatchOn()) {
        tState <- tracingState(FALSE)
...

Okay, so the problem occurred in mind() rather than imin() and in particular at line 13. It still could be the fault of imin(), but for now, let’s deal with the former.

NOTE There is another way we could have determined that the blowup occurred on line 13. We would enter the debugger as before but probe the local variables. We could reason that if the subscript bounds error had occurred at line 9, then the variable wmins would not have been set, so querying it would give us an error message like Error: object 'wmins' not found. On the other hand, if the blowup occurred on line 13, even j would have been set.

Since the error occurred with d[i,j], let’s look at those variables:

Browse[1]> d
     [,1] [,2] [,3]
[1,]    0   12    5
[2,]   12    0    8
[3,]    5    8    0
Browse[1]> i
[1] 2
Browse[1]> j
[1] 12

This is indeed a problem: d only has three columns, yet j, a column subscript, is 12.

Let’s look at the variable from which we gleaned j, wmins:

Browse[1]> wmins
     [,1] [,2]
[1,]    2    1
[2,]   12   12

If you recall how the code was designed, column k of wmins is supposed to contain information about the minimum value in row k of d. So here wmins is saying that in the first row (k = 1) of d, (0,12,5), the minimum value is 12, occurring at index 2. But it should be 5 at index 3. So, something went wrong with this line:

wmins <- apply(dd[-n, ], 1, imin)
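As a reminder of why the per-row results end up in columns, here is a small sketch of how apply() lays out its output when the applied function returns a vector of length 2. The helper pos_and_val() is hypothetical and is not part of the cities code; unlike imin(), it looks at the whole row, diagonal included.

```r
m <- rbind(c(0, 12, 5), c(12, 0, 8), c(5, 8, 0))

# hypothetical helper: position and value of the smallest entry in a row
pos_and_val <- function(r) c(which.min(r), min(r))

# one call per row of m; each length-2 result becomes a COLUMN of res
res <- apply(m, 1, pos_and_val)

print(dim(res))  # [1] 2 3
print(res[, 2])  # [1] 2 0
```

So column k of the result describes row k of the input, with positions in the first row of the result and values in the second.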

There are several possibilities here. But since ultimately imin() is called, we can check them all from within that function. So, let’s set the debug status of imin(), quit the debugger, and rerun the code.

Browse[1]> Q
> debug(imin)
> mind(m)
debugging in: FUN(newX[, i], ...)
debug at cities.R#17: {
    n <- length(x)
    i <- x[n]
    j <- which.min(x[(i + 1):(n - 1)])
    return(c(j, x[j]))
}
...

So, we’re in imin(). Let’s see if it properly received the first row of dd, which should be (0,12,5,1).

Browse[4]> x
[1]  0 12  5  1

It’s confirmed. This seems to indicate that the first two arguments to apply() were correct and that the problem is instead within imin(), though that remains to be seen.

Let’s single-step through, occasionally typing confirmational queries:

Browse[2]> n
debug at cities.r#17: n <- length(x)
Browse[2]> n
debug at cities.r#18: i <- x[n]
Browse[2]> n
debug at cities.r#19: j <- which.min(x[(i + 1):(n - 1)])
Browse[2]> n
debug at cities.r#20: return(c(j, x[j]))
Browse[2]> print(n)
[1] 4
Browse[2]> i
[1] 1
Browse[2]> j
[1] 2

Recall that we designed our call which.min(x[(i + 1):(n - 1)]) to look only at the above-diagonal portion of this row. This is because the matrix is symmetric and because we don’t want to consider the distance between a city and itself.

But the value j = 2 does not confirm. The minimum value in (0,12,5) is 5, which occurs at index 3 of that vector, not index 2. Thus, the problem is in this line:

j <- which.min(x[(i + 1):(n - 1)])

What could be wrong?

After taking a break, we realize that although the minimum value of (0,12,5) occurs at index 3 of that vector, that is not what we asked which.min() to find for us. Instead, that i + 1 term means we asked for the index of the minimum in (12,5), which is 2.

We did ask which.min() for the correct information, but we failed to use it correctly, because we do want the index of the minimum in (0,12,5). We need to adjust the output of which.min() accordingly, as follows:

j <- which.min(x[(i+1):(n-1)])
k <- i + j
return(c(k,x[k]))
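The offset adjustment can be checked by hand on the first row of dd. This is a minimal sketch using the values seen in the debugging session; the intermediate variable sub is introduced here only for illustration.

```r
x <- c(0, 12, 5, 1)   # first row of dd: three distances plus the row number
n <- length(x)
i <- x[n]             # row number, here 1

sub <- x[(i + 1):(n - 1)]  # above-diagonal part of the row: 12 5
j <- which.min(sub)        # position within sub, not within x
k <- i + j                 # shift back to a position within x

print(j)     # [1] 2
print(k)     # [1] 3
print(x[k])  # [1] 5
```

The general point is that which.min() applied to a subvector reports a position relative to the start of that subvector, so the number of skipped elements (here i) has to be added back.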

We make the fix and try again.

> mind(m)
Error in mind(m) : subscript out of bounds

Enter a frame number, or 0 to exit

1: mind(m)

Selection:

Oh no, another bounds error! To see where the blowup occurred this time, we issue the where command as before, and we find it was at line 13 again. What about i and j now?

Browse[1]> i
[1] 1
Browse[1]> j
[1] 5

The value of j is still wrong; it cannot be larger than 3, as we have only three columns in this matrix. On the other hand, i is correct. The overall minimum value in dd is 5, occurring in row 1, column 3.

So, let’s check the source of j again, the matrix wmins:

Browse[1]> wmins
     [,1] [,2]
[1,]    3    3
[2,]    5    8

Well, there are the 3 and 5 in column 1, just as should be the case.

Remember, column 1 here contains the information for row 1 in d, so wmins is saying that the minimum value in row 1 is 5, occurring at index 3 of that row, which is correct.

After taking another break, though, we realize that while wmins is correct, our use of it isn’t. We have the rows and columns of that matrix mixed up.

This code:

i <- which.min(wmins[1,])
j <- wmins[2,i]

should be like this:

i <- which.min(wmins[2,])
j <- wmins[1,i]

After making that change and re-sourcing our file, we try it out.

> mind(m)
[1] 5 1 3

This is correct, and subsequent tests with larger matrices worked, too.
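Putting the two fixes together, the corrected code might look like the following sketch. It applies only the changes found in this session (the index offset in imin() and the swapped rows of wmins in mind()). The 4x4 matrix q is a made-up test case, not one from the original text, and the sketch assumes d has at least three rows, since with two rows dd[-n,] would collapse to a vector.

```r
# returns the minimum value of d[i,j], i != j, and the row/col attaining it,
# for a square symmetric matrix d
mind <- function(d) {
  n <- nrow(d)
  dd <- cbind(d, 1:n)                # extra column carries the row number
  wmins <- apply(dd[-n, ], 1, imin)  # row 1: column indices, row 2: values
  i <- which.min(wmins[2, ])         # row of d holding the smallest value
  j <- wmins[1, i]                   # column where that value occurs
  return(c(d[i, j], i, j))
}

# finds the location and value of the above-diagonal minimum in a row x
imin <- function(x) {
  n <- length(x)
  i <- x[n]
  j <- which.min(x[(i + 1):(n - 1)])
  k <- i + j                         # convert back to a column index of d
  return(c(k, x[k]))
}

m <- rbind(c(0, 12, 5), c(12, 0, 8), c(5, 8, 0))
print(mind(m))  # [1] 5 1 3

# hypothetical larger test case
q <- rbind(c(0, 3, 7, 2), c(3, 0, 1, 9), c(7, 1, 0, 4), c(2, 9, 4, 0))
print(mind(q))  # [1] 1 2 3
```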
