Extended Example: A Class for Storing Upper-Triang- 123docz.net

Now it’s time for a more involved example, in which we will write an R class

"ut"for upper-triangular matrices. These are square matrices whose elements below the diagonal are zeros, such as shown in Equation 9.1.

⎛

⎝1 5 12 0 6 9 0 0 2

⎞

⎠ (9.1)

Our motivation here is to save storage space (though at the expense of a little extra access time) by storing only the nonzero portion of the matrix.

NOTE The R class"dist"also uses such storage, though in a more focused context and with- out the class functions we have here.

The componentmatof this class will store the matrix. As mentioned, to save on storage space, only the diagonal and above-diagonal elements will be stored, in column-major order. Storage for the matrix (9.1), for instance, consists of the vector (1,5,6,12,9,2), and the componentmathas that value.

We will include a componentixin this class, to show where inmatthe various columns begin. For the preceding case,ixisc(1,2,4), meaning that column 1 begins atmat[1], column 2 begins atmat[2], and column 3 begins at mat[4]. This allows for handy access to individual elements or columns of the matrix.

The following is the code for our class.

1 # class "ut", compact storage of upper-triangular matrices

3 # utility function, returns 1+...+i

4 sum1toi <- function(i) return(i*(i+1)/2)

6 # create an object of class "ut" from the full matrix inmat (0s included)

7 ut <- function(inmat) {

8 n <- nrow(inmat)

9 rtrn <- list() # start to build the object

10 class(rtrn) <- "ut"

11 rtrn$mat <- vector(length=sum1toi(n))

12 rtrn$ix <- sum1toi(0:(n-1)) + 1

13 for (i in 1:n) {

14 # store column i

15 ixi <- rtrn$ix[i]

16 rtrn$mat[ixi:(ixi+i-1)] <- inmat[1:i,i]

17 }

18 return(rtrn)

19 }

21 # uncompress utmat to a full matrix

22 expandut <- function(utmat) {

23 n <- length(utmat$ix) # numbers of rows and cols of matrix

24 fullmat <- matrix(nrow=n,ncol=n)

25 for (j in 1:n) {

26 # fill jth column

27 start <- utmat$ix[j]

28 fin <- start + j - 1

29 abovediagj <- utmat$mat[start:fin] # above-diag part of col j

30 fullmat[,j] <- c(abovediagj,rep(0,n-j))

31 }

32 return(fullmat)

33 }

35 # print matrix

36 print.ut <- function(utmat)

39 # multiply one ut matrix by another, returning another ut instance;

40 # implement as a binary operation

41 "%mut%" <- function(utmat1,utmat2) {

42 n <- length(utmat1$ix) # numbers of rows and cols of matrix

43 utprod <- ut(matrix(0,nrow=n,ncol=n))

44 for (i in 1:n) { # compute col i of product

45 # let a[j] and bj denote columns j of utmat1 and utmat2, respectively,

46 # so that, e.g. b2[1] means element 1 of column 2 of utmat2

47 # then column i of product is equal to

48 # bi[1]*a[1] + ... + bi[i]*a[i]

49 # find index of start of column i in utmat2

50 startbi <- utmat2$ix[i]

51 # initialize vector that will become bi[1]*a[1] + ... + bi[i]*a[i]

52 prodcoli <- rep(0,i)

53 for (j in 1:i) { # find bi[j]*a[j], add to prodcoli

54 startaj <- utmat1$ix[j]

55 bielement <- utmat2$mat[startbi+j-1]

56 prodcoli[1:j] <- prodcoli[1:j] +

57 bielement * utmat1$mat[startaj:(startaj+j-1)]

58 }

59 # now need to tack on the lower 0s

60 startprodcoli <- sum1toi(i-1)+1

61 utprod$mat[startbi:(startbi+i-1)] <- prodcoli

62 }

63 return(utprod)

64 }

Let’s test it.

> test function() {

utm1 <- ut(rbind(1:2,c(0,2))) utm2 <- ut(rbind(3:2,c(0,1))) utp <- utm1 %mut% utm2 print(utm1)

print(utm2) print(utp)

utm1 <- ut(rbind(1:3,0:2,c(0,0,5))) utm2 <- ut(rbind(4:2,0:2,c(0,0,1))) utp <- utm1 %mut% utm2

print(utm1) print(utm2) print(utp) }

> test() [,1] [,2]

[1,] 1 2 [2,] 0 2

[,1] [,2]

[1,] 3 2 [2,] 0 1

[,1] [,2]

[1,] 3 4 [2,] 0 2

[,1] [,2] [,3]

[1,] 1 2 3 [2,] 0 1 2 [3,] 0 0 5

[,1] [,2] [,3]

[1,] 4 3 2 [2,] 0 1 2 [3,] 0 0 1

[,1] [,2] [,3]

[1,] 4 5 9 [2,] 0 1 4 [3,] 0 0 5

Throughout the code, we take into account the fact that the matrices involved have a lot of zeros. For example, we avoid multiplying by zeros sim- ply by not adding terms to sums when the terms include a 0 factor.

Theut()function is fairly straightforward. This function is aconstructor, which is a function whose job it is to create an instance of the given class, eventually returning that instance. So in line 9, we create a list that will serve as the body of the class object, naming itrtrnas a reminder that this will be the class instance to be constructed and returned.

As noted earlier, the main member variables of our class will bematand idx, implemented as components of the list. Memory for these two components is allocated in lines 11 and 12.

The loop that follows then ﬁlls inrtrn$matcolumn by column and assignsrtrn$idxelement by element. A slicker way to do thisforloop would be to use the rather obscurerow()andcol()functions. Therow()function takes a matrix input and returns a new matrix of the same size, but with each element replaced by its row number. Here’s an example:

> m

[,1] [,2]

[1,] 1 4 [2,] 2 5 [3,] 3 6

> row(m) [,1] [,2]

[1,] 1 1 [2,] 2 2 [3,] 3 3

Thecol()function works similarly.

Using this idea, we could replace theforloop inut()with a one-liner:

rtrn$mat <- inmat[row(inmat) <= col(inmat)]

Whenever possible, we should exploit vectorization. Take a look at line 12, for example:

rtrn$ix <- sum1toi(0:(n-1)) + 1

Sincesum1toi()(which we deﬁned on line 4) is based only on the vectorized functions"*"()and"+"(),sum1toi()itself is also vectorized. This allows us to applysum1toi()to a vector above. Note that we used recycling as well.

We want our"ut"class to include some methods, not just variables. To this end, we have included three methods:

• Theexpandut()function converts from a compressed matrix to an ordi- nary one. Inexpandut(), the key lines are 27 and 28, where we usertrn$ix to determine where inutmat$matthejthcolumn of our matrix is stored.

That data is then copied to thejthcolumn offullmatin line 30. Note the use ofrep()to generate the zeros in the lower portion of this column.

• Theprint.ut()function is for printing. This function is quick and easy, usingexpandut(). Recall that any call toprint()on an object of type"ut"

will be dispatched toprint.ut(), as in our test cases earlier.

• The"%mut%"()function is for multiplying two compressed matrices (with- out uncompressing them). This function starts in line 39. Since this is a binary operation, we take advantage of the fact that R accommodates user-deﬁned binary operations, as described in Section 7.12, and implement our matrix-multiply function as%mut%.

Let’s look at the details of the"%mut%"()function. First, in line 43, we allocate space for the product matrix. Note the use of recycling in an unusual context. The ﬁrst argument ofmatrix()is required to be a vector of a length compatible with the number of speciﬁed rows and columns, so the 0 we provide is recycled to a vector of lengthn2. Of course,rep()could be used instead, but exploiting recycling makes for a bit shorter, more ele- gant code.

For both clarity and fast execution, the code here has been written around the fact that R stores matrices in column-major order. As mentioned in the comments, our code then makes use of the fact that columniof the

product can be expressed as a linear combination of the columns of the ﬁrst factor. It will help to see a speciﬁc example of this property, shown in Equa- tion 9.2.

⎛

⎝ 1 2 3 0 1 2 0 0 5

⎞

⎠

⎛

⎝ 4 3 2 0 1 2 0 0 1

⎞

⎠=

⎛

⎝ 4 5 9 0 1 4 0 0 5

⎞

⎠ (9.2)

The comments say that, for instance, column 3 of the product is equal to the following:

⎛

⎝ 1 00

⎞

⎠+ 2

⎛

⎝ 2 10

⎞

⎠+ 1

⎛

⎝ 3 25

⎞

⎠ Inspection of Equation 9.2 conﬁrms the relation.

Couching the multiplication problem in terms of columns of the two input matrices enables us to compact the code and to likely increase the speed. The latter again stems from vectorization, a beneﬁt discussed in detail in Chapter 14. This approach is used in the loop beginning at line 53. (Arguably, in this case, the increase in speed comes at the expense of readability of the code.)

Extended Example: A Class for Storing Upper-Triangular

Preview of Some Important R Data Structures

Extended Example: Regression Analysis of Exam Grades