Use of global variables is a subject of controversy in the programming community. Obviously, the question raised by the title of this section cannot be answered in any formulaic way, as it is a matter of personal taste and style.
Nevertheless, most programmers would probably consider the outright banning of global variables, which is encouraged by many teachers of programming, to be overly rigid. In this section, we will explore the possible value of globals in the context of the structures of R. Here, the term global variable, or just global, will be used to include any variable located higher in the environment hierarchy than the level of the given code of interest.
The use of global variables in R is more common than you may have guessed. You might be surprised to learn that R itself makes very substantial use of globals internally, both in its C code and in its R routines. The superassignment operator <<-, for instance, is used in many of the R library functions (albeit typically in writing to a variable just one level up in the environment hierarchy). Threaded code and GPU code, which are used for writing fast programs (as described in Chapter 16), tend to make heavy use of global variables, which provide the main avenue of communication between parallel actors.
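As a small illustration of this broad sense of "global," here is a minimal sketch in which the superassignment operator writes to a variable exactly one level up in the environment hierarchy. The names make_counter(), increment(), and count are hypothetical, chosen only for this illustration; they do not come from R's library code.

```r
# Hypothetical example: the counter's state lives one level above increment()
make_counter <- function() {
   count <- 0                # local to make_counter(), but "global" relative to increment()
   increment <- function() {
      count <<- count + 1    # writes to count in the enclosing environment
      count
   }
   increment
}

count <- 100                 # an unrelated variable, one level further up
counter <- make_counter()
counter()
counter()
result <- counter()          # value returned by the third call
```

Note that the outer count is untouched, since <<- stops at the first enclosing environment in which it finds a variable of that name.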
Now, to make our discussion concrete, let’s return to the earlier example from Section 7.7:
f <- function(lxxyy) { # lxxyy is a list containing x and y
   ...
   lxxyy$x <- ...
   lxxyy$y <- ...
   return(lxxyy)
}

# set x and y
lxy$x <- ...
lxy$y <- ...
lxy <- f(lxy)
# use new x and y
... <- lxy$x
... <- lxy$y
As noted earlier, this code might be a bit unwieldy, especially if x and y are themselves lists.
By contrast, here is an alternate pattern that uses globals:
f <- function() {
   ...
   x <<- ...
   y <<- ...
}

# set x and y
x <- ...
y <- ...
f() # x and y are changed in here
# use new x and y
... <- x
... <- y
Arguably, this second version is much cleaner, being less cluttered and not requiring manipulation of lists. Cleaner code is usually easier to write, debug, and maintain.
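To see the two patterns side by side in runnable form, here is a minimal sketch in which the elided computations are filled in with arbitrary arithmetic. The function names f_list() and f_glob(), and the particular operations, are our own assumptions for illustration and are not part of the original example.

```r
# Version 1: pass in and return a list
f_list <- function(lxxyy) {  # lxxyy is a list containing x and y
   lxxyy$x <- lxxyy$x * 2    # stand-in for the elided computation
   lxxyy$y <- lxxyy$y + 1
   return(lxxyy)
}
lxy <- list(x = 1:3, y = 10)
lxy <- f_list(lxy)

# Version 2: use globals and the superassignment operator
f_glob <- function() {
   x <<- x * 2               # same stand-in computations, written to the globals
   y <<- y + 1
}
x <- 1:3
y <- 10
f_glob()                     # x and y are changed in here

# both versions should end with the same values
same <- identical(lxy$x, x) && identical(lxy$y, y)
```

The two versions compute the same thing; the difference lies only in how much list bookkeeping the caller must do.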
It is for these reasons—avoiding clutter and simplifying the code—that we chose to use globals, rather than to return lists, in the DES code earlier in this chapter. Let’s explore that example further.
We had two global variables (both lists, encapsulating various information): sim, associated with the library code, and mm1glbls, associated with our M/M/1 application code. Let’s consider sim first.
Even many programmers who have reservations about using globals agree that such variables may be justified if they are truly global, in the sense that they are used broadly in the program. This is the case for sim in our DES example. It is used both in the library code (in schedevnt(), getnextevnt(), and dosim()) and in our M/M/1 application code (in mm1reactevnt()). The latter access to sim is on a read-only basis in this particular instance, but it could involve writes in some applications. A common example of such writes is when an event needs to be canceled. This might arise in modeling a “whichever comes first” situation; two events are scheduled, and when one of them occurs, the other must be canceled.
So, using sim as a global seems justified. Nevertheless, if we were bound and determined to avoid using globals, we could have placed sim as a local within dosim(). This function would then pass sim as an argument to all of the functions mentioned in the previous paragraph (schedevnt(), getnextevnt(), and so on), and each of these functions would return the modified sim. Line 94, for example, would change from this:
reactevnt(head)
to this:
sim <- reactevnt(head)
We would then need to add a line like the following to our application-specific function mm1reactevnt():
return(sim)
We could do something similar with mm1glbls, placing a variable called, say, appvars as a local within dosim(). However, if we did this with sim as well, we would need to place them together in a list so that both would be returned, as in our earlier example function f(). We would then have the lists-within-lists clutter described earlier (or rather, lists within lists within lists in this case).
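To give a feel for that clutter, here is a heavily simplified sketch of the globals-free design. Everything in it is hypothetical: the functions dosim_nog() and reactevnt_nog(), the combined list state, and the fields currtime and njobs are stand-ins invented for illustration, not the chapter's actual DES code.

```r
# Hypothetical, simplified stand-in for an application-specific reaction function
reactevnt_nog <- function(state, head) {
   # state is a list holding both sim and appvars, themselves lists
   state$sim$currtime <- head$evnttime
   state$appvars$njobs <- state$appvars$njobs + 1
   return(state)             # both lists must travel back together
}

# Hypothetical, simplified stand-in for dosim(), with sim and appvars as locals
dosim_nog <- function(evnttimes) {
   state <- list(sim = list(currtime = 0),
                 appvars = list(njobs = 0))
   for (tm in evnttimes) {
      head <- list(evnttime = tm)
      state <- reactevnt_nog(state, head)   # reassign on every call
   }
   return(state)
}

out <- dosim_nog(c(0.5, 1.2, 3.1))
```

Every function that touches the simulation state must accept the combined list and hand it back, and every access goes through two levels of $.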
On the other hand, critics of the use of global variables counter that the simplicity of the code comes at a price. They worry that it may be difficult during debugging to track down locations at which a global variable changes value, since such a change could occur anywhere in the program.
This seems to be less of a concern in view of our modern text editors and integrated development tools (the original article calling for avoiding use of globals was published in 1970!), which can be used to find all instances of a variable. However, it should be taken into consideration.
Another concern raised by critics involves situations in which a function is called in several unrelated parts of the overall program using different values. For example, consider using our example f() function in different parts of our program, each call with its own values of x and y, rather than just a single value of each, as assumed earlier. This could be solved by setting up vectors of x and y values, with one element for each instance of f() in your program. You would lose some of the simplicity of using globals, though.
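One way this vector idea might look is sketched below. The index argument i and the arithmetic are our own assumptions, added only to make the sketch runnable.

```r
# Global vectors with one element per instance of f() in the program
x <- c(1, 10)
y <- c(2, 20)

f <- function(i) {       # i identifies which instance is calling (hypothetical argument)
   x[i] <<- x[i] + 1     # stand-in computations on this instance's own slot
   y[i] <<- y[i] * 2
}

f(1)                     # first part of the program
f(2)                     # an unrelated part, with its own x and y
```

Each call site now has to know and pass its own index, which is the loss of simplicity just mentioned.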
The above issues apply generally, not just to R. However, for R there is an additional concern for globals at the top level, as the user will typically have lots of variables there. The danger is that code that uses globals may accidentally overwrite an unrelated variable with the same name.
This can be avoided easily, of course, by simply choosing long, very application-specific names for globals in your code. But a compromise is also available in the form of environments, such as the following for the DES example.
Within dosim(), the line
sim <<- list()
would be replaced by
assign("simenv",new.env(),envir=.GlobalEnv)
This would create a new environment, pointed to by simenv at the top level. It would serve as a package in which to encapsulate our globals. We would access them via get() and assign(). For instance, the lines
if (is.null(sim$evnts)) {
   sim$evnts <<- newevnt
in schedevnt() would become
if (is.null(get("evnts",envir=simenv))) {
   assign("evnts",newevnt,envir=simenv)
Yes, this is cluttered too, but at least it is not complex like lists of lists of lists. And it does protect against unwittingly writing to an unrelated variable the user has at the top level. Using the superassignment operator still yields the least cluttered code, but this compromise is worth considering.
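Here is a small runnable sketch of the environment approach. The functions dosim_env() and schedevnt_env() are hypothetical, stripped-down stand-ins for dosim() and schedevnt(), and the explicit initialization of evnts to NULL is our own assumption, made so that the first get() has something to find.

```r
# An unrelated variable the user happens to have at the top level
sim <- "user data, unrelated to the simulation"

# Hypothetical stand-in for dosim(): create the environment that holds our globals
dosim_env <- function() {
   assign("simenv", new.env(), envir = .GlobalEnv)
   assign("evnts", NULL, envir = simenv)   # assumed initialization
}

# Hypothetical stand-in for schedevnt(): add an event to the list in simenv
schedevnt_env <- function(newevnt) {
   if (is.null(get("evnts", envir = simenv))) {
      assign("evnts", newevnt, envir = simenv)
   } else {
      assign("evnts", rbind(get("evnts", envir = simenv), newevnt),
             envir = simenv)
   }
}

dosim_env()
schedevnt_env(c(evnttime = 1.5, evnttype = 1))
schedevnt_env(c(evnttime = 0.7, evnttype = 2))
```

After these calls, the events sit inside simenv, while the user's own sim at the top level is left alone.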
As usual, there is no single style of programming that produces the best solution in all applications. The globals approach is another option to consider for your programming tool kit.