Control flow, programming, and data generation

Excerpt from: CRC, Using R and RStudio for Data Management, Statistical Analysis, and Graphics, 2nd edition (pages 70-73)

4.1.1 Looping

Example: 11.2

x = numeric(k)   # create placeholder
for (i in 1:length(x)) {
   x[i] = rnorm(1)   # this is slow and inefficient!
}

or (preferably)

x = rnorm(k) # this is far better

Note: Most tasks in R that could be written as a loop run dramatically faster when they are expressed as a vector operation (as in the second and preferred option above). Examples of situations where loops are particularly useful can be found in 11.1.2 and 11.2. The along.with option for seq() and the seq_along() function can also be helpful.
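The two options above can be compared directly. The following is a minimal sketch, assuming an illustrative length k = 5 and an arbitrary seed (neither comes from the text); it also shows why seq_along() can be a safer loop index than 1:length(x).

```r
k = 5   # illustrative length (assumed)

# Loop version: fill a preallocated vector one element at a time
set.seed(42)
x_loop = numeric(k)
for (i in seq_along(x_loop)) {
  x_loop[i] = rnorm(1)
}

# Vectorized version: a single call generates all k values
set.seed(42)
x_vec = rnorm(k)

same = identical(x_loop, x_vec)   # TRUE: same values, given the same seed

# seq_along() is safer than 1:length() when a vector may be empty
empty = numeric(0)
idx_colon = 1:length(empty)    # 1 0  (a loop body would run twice)
idx_along = seq_along(empty)   # integer(0)  (a loop body is skipped)
```

Both versions draw the same values here, so the choice between them is one of speed and clarity rather than of results.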

More information on control structures for looping and conditional processing such as while and repeat can be found in help(Control).
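As a brief illustration of the other two looping constructs, the sketch below uses made-up counters (total, i, and n are illustrative names, not from the text).

```r
# while: the condition is tested before each pass
total = 0
i = 0
while (total < 10) {
  i = i + 1
  total = total + i
}
# after the loop: i is 4 and total is 10 (1 + 2 + 3 + 4)

# repeat: loops until an explicit break
n = 1
repeat {
  n = n * 2
  if (n > 100) break
}
# after the loop: n is 128
```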

4.1.2 Conditional execution

Examples: 6.6.6 and 8.7.7

if (expression1) { expression2 }

or

if (expression1) { expression2 } else { expression3 }

or

ifelse(expression, x, y)

Note: The if statement, with or without else, tests a single logical statement; it is not an elementwise (vector) function. If expression1 evaluates to TRUE, then expression2 is evaluated. The ifelse() function operates on vectors: it evaluates the expression given as expression and returns, element by element, x where it is TRUE and y otherwise (see comparisons, A.4.2). An expression can include multi-command blocks of code (in curly brackets). The switch() function may also be useful for more complicated tasks.
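The difference between the three constructs can be seen in a small sketch. The vector x and the helper center() are hypothetical names introduced only for illustration.

```r
x = c(-2, 0, 3)

# ifelse() is elementwise: one result per element of x
sign_label = ifelse(x < 0, "negative", "non-negative")
# "negative" "non-negative" "non-negative"

# if () expects a single TRUE or FALSE, so reduce the vector first
if (any(x < 0)) {
  msg = "at least one negative value"
} else {
  msg = "no negative values"
}

# switch() selects among several alternatives by name;
# the final unnamed argument is the fallback
center = function(v, type) {
  switch(type,
         mean   = mean(v),
         median = median(v),
         stop("unknown type: ", type))
}
med = center(x, "median")   # 0
```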

4.1.3 Sequence of values or patterns

Example: 10.1.3

It is often useful to generate a variable consisting of a sequence of values (e.g., the integers from 1 to 100) or a pattern of values (1 1 1 2 2 2 3 3 3). This might be needed to generate a variable consisting of a set of repeated values for use in a simulation or graphical display.

As an example, we demonstrate generating data from a linear regression model of the form:

$$E[Y \mid X_1, X_2] = \beta_0 + \beta_1 X_1 + \beta_2 X_2, \quad \mathrm{Var}(Y \mid X) = 9, \quad \mathrm{Corr}(X_1, X_2) = 0.$$

# generate
seq(from=i1, to=i2, length.out=nvals)
seq(from=i1, to=i2, by=1)
seq(i1, i2)
i1:i2

rep(value, times=nvals)

or

rep(value, each=nvals)

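Note: The seq function creates a vector of length nvals if the length.out option is specified. If the by option is included instead, the length is approximately (i2-i1)/byval, where byval is the value supplied to by (more precisely, floor((i2-i1)/byval) + 1 for a positive step). The i1:i2 operator is equivalent to seq(from=i1, to=i2, by=1) when i1 is no larger than i2. The rep function repeats value, which can be a scalar, vector, or list; for a scalar value the result has length nvals. The each option repeats each element of value nvals times before moving to the next element. The default is times, which repeats the whole of value nvals times.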
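A few concrete calls may make these rules easier to check. The values below are illustrative only.

```r
s1 = seq(from=0, to=1, length.out=5)   # 0.00 0.25 0.50 0.75 1.00
s2 = seq(from=1, to=10, by=3)          # 1 4 7 10: floor((10-1)/3) + 1 = 4 values
same = identical(seq(2, 6), 2:6)       # TRUE

r1 = rep(c("a", "b"), times=3)         # "a" "b" "a" "b" "a" "b"
r2 = rep(c("a", "b"), each=3)          # "a" "a" "a" "b" "b" "b"
r3 = rep(c(1, 2, 3), 2)                # 1 2 3 1 2 3 (times is the default)
```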

The following code implements the model described above for n = 200.

> n = 200
> x1 = rep(c(0,1), each=n/2)   # x1 resembles 0 0 0 ... 1 1 1
> x2 = rep(c(0,1), n/2)        # x2 resembles 0 1 0 1 ... 0 1
> beta0 = -1; beta1 = 1.5; beta2 = .5;
> rmse = 3
> table(x1, x2)
   x2
x1   0  1
  0 50 50
  1 50 50
> y = beta0 + beta1*x1 + beta2*x2 + rnorm(n, mean=0, sd=rmse)
> lm(y ~ x1 + x2)
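The simulated design can be checked against the model statement. The sketch below repeats the steps above with an arbitrary seed (an assumption added for reproducibility; the text does not set one) and inspects the fit. The residual standard deviation of 3 corresponds to Var(Y|X) = 9.

```r
set.seed(2024)                    # arbitrary seed, only for reproducibility
n = 200
x1 = rep(c(0,1), each=n/2)
x2 = rep(c(0,1), n/2)
r12 = cor(x1, x2)                 # 0: the design is balanced
y = -1 + 1.5*x1 + 0.5*x2 + rnorm(n, mean=0, sd=3)
fit = lm(y ~ x1 + x2)
est = coef(fit)                   # estimates of beta0, beta1, beta2
sigma_hat = summary(fit)$sigma    # residual standard error, an estimate of sd=3
```

The estimates will vary with the seed, so no particular values are implied here.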

4.1.4 Perform an action repeatedly over a set of variables

It is often necessary to perform a given function for a series of variables. Here, the square of each of a list of variables is calculated as an example.


l1 = c("x1", "x2", ..., "xk")
l2 = c("z1", "z2", ..., "zk")
for (i in 1:length(l1)) {
   assign(l2[i], eval(as.name(l1[i]))^2)
}

Note: It is not straightforward to refer to objects without evaluating those objects. Assignments to R objects named by character strings can be made using the assign() function. Here, eval() is applied, somewhat non-obviously, to the symbol produced when the string in l1 is coerced by as.name(). This allows the objects named in the character vectors l1 and l2 to be read and assigned (see help(assign), eval(), and substitute()).
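A concrete version with three variables is sketched below. The input values are made up, and get() is used in place of eval(as.name()) as a simpler lookup by name; this is an alternative, not the form shown above. The second half shows a list-based approach that avoids assign() entirely.

```r
x1 = 1:3; x2 = c(2, 4); x3 = 10      # illustrative inputs (assumed)
l1 = c("x1", "x2", "x3")
l2 = c("z1", "z2", "z3")
for (i in seq_along(l1)) {
  assign(l2[i], get(l1[i])^2)        # get() looks an object up by name
}
z2                                   # 4 16

# keeping the variables in a named list avoids assign() altogether
vars = list(x1=x1, x2=x2, x3=x3)
squares = lapply(vars, function(v) v^2)
squares$x3                           # 100
```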

4.1.5 Grid of values

Example: 12.8

It may be useful to generate all combinations of two or more vectors.

> expand.grid(x1=1:3, x2=c("M", "F"))
  x1 x2
1  1  M
2  2  M
3  3  M
4  1  F
5  2  F
6  3  F

Note: The expand.grid() function takes two or more vectors or factors and returns a data frame with one row per combination. The first factor varies fastest.
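The structure of the result can be inspected as follows. This is a short sketch using the same inputs as the example above; grid and grid2 are illustrative names.

```r
grid = expand.grid(x1=1:3, x2=c("M", "F"))
cls = class(grid)           # "data.frame"
nr = nrow(grid)             # 6, i.e., 3 * 2 combinations
first = grid$x1             # 1 2 3 1 2 3: the first argument varies fastest

# character inputs become factors unless stringsAsFactors=FALSE
fac = is.factor(grid$x2)    # TRUE
grid2 = expand.grid(x1=1:3, x2=c("M", "F"), stringsAsFactors=FALSE)
chr = is.character(grid2$x2)   # TRUE
```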

4.1.6 Debugging

browser() # create a breakpoint

debug(function) # enter the debugger when function called

Note: When a function flagged for debugging is called, the function can be executed one statement at a time. At the prompt, commands can be entered (n for next, c for continue, where for traceback, Q for quit) or expressions can be evaluated (see browser() and trace()). A debugging environment is available within RStudio. The debugger may be invoked by setting a breakpoint by clicking to the left of the line number in a script, or pressing Shift+F9. Profiling of the execution of expressions can be undertaken using the Rprof() function (see also summaryRprof() and tracemem()). RStudio provides a series of additional debugging tools.
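Flagging and unflagging a function can be sketched without entering the debugger. The function sq_sum() below is hypothetical and is never called while flagged, so the code also runs in a non-interactive session.

```r
# hypothetical function used only to illustrate the calls
sq_sum = function(v) {
  s = sum(v^2)
  s
}

debug(sq_sum)                    # flag the function: the next call enters the browser
flagged = isdebugged(sq_sum)     # TRUE
undebug(sq_sum)                  # remove the flag
cleared = !isdebugged(sq_sum)    # TRUE

# debugonce(sq_sum) would flag the function for a single call only
```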

4.1.7 Error recovery

try(expression, silent=FALSE)
stopifnot(expr1, ..., exprk)

Note: The try() function runs the given expression and traps any errors that may arise (displaying them on the standard error output device unless silent=TRUE). The function geterrmessage() can be used to retrieve the most recent error message. The stopifnot() function evaluates the given expressions and signals an error unless all of them are true (see stop() and message()).
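A minimal sketch of these functions in use follows. The helper safe_log() is a hypothetical name introduced for illustration, and returning NA on failure is just one possible recovery choice.

```r
# try() returns an object of class "try-error" when the expression fails
res = try(stop("something failed"), silent=TRUE)
failed = inherits(res, "try-error")     # TRUE
last_msg = geterrmessage()              # text of the most recent error

# execution continues after a trapped error
safe_log = function(v) {
  out = try(log(v), silent=TRUE)
  if (inherits(out, "try-error")) NA else out
}
ok = safe_log(1)       # 0
bad = safe_log("a")    # NA

# stopifnot() signals an error unless every condition is TRUE
check = try(stopifnot(is.numeric(1), 2 < 1), silent=TRUE)
check_failed = inherits(check, "try-error")   # TRUE
```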
