Professional Perl Programming (Wrox, 2001), Part 3

Chapter 6: Structure, Flow, and Control

Summary

We started this chapter by exploring the basic structures of Perl. We covered statements, declarations, expressions, and blocks, and looked in particular at the facilities provided by blocks. We covered Perl's conditional statements, if, else, elsif, and unless. We also looked in detail at how to create loops with for and foreach, and how to create conditional loops with while, until, do, do while, and do until. The chapter also covered how to control the execution of loops with next, last, redo, and continue. Finally, the chapter covered the goto statement as well as map and grep.

Chapter 7: Subroutines

Subroutines are autonomous blocks of code that function like miniature programs and can be executed from anywhere within a program. Because they are autonomous, we can reuse them simply by calling them more than once.

There are two types of subroutine, named and anonymous. Most subroutines are of the 'named' persuasion. Anonymous subroutines do not have a name by which they can be called, but are stored and accessed through a code reference. Since a code reference is a scalar value, it can be passed as a parameter to other subroutines.

The use of subroutines is syntactically the same as the use of Perl's own built-in functions. We can use them in a traditional function-oriented syntax (with parentheses), or treat them as named list operators. Indeed, we can override and replace the built-in functions with our own definitions, provided as subroutines, by means of the use subs pragma.

Subroutines differ from ordinary bare blocks in that they can be passed a list of parameters to process. This list appears inside subroutines as the special variable @_, from which the list of passed parameters (also known as arguments) can be extracted.
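As a minimal sketch of how parameters arrive in @_, the following example copies its arguments out of @_ and reports what it received. The subroutine name describe_args and its variable names are illustrative assumptions, not taken from the original text:

```perl
#!/usr/bin/perl
# args.pl - illustrative sketch; 'describe_args' is a hypothetical name
use warnings;
use strict;

sub describe_args {
    # @_ holds every value passed by the caller, as one flat list
    my ($first, @rest) = @_;
    my $count = scalar @_;
    return "got $count argument(s), first is '$first', rest is '@rest'";
}

# prints: got 4 argument(s), first is 'testing', rest is '1 2 3'
print describe_args("testing", 1, 2, 3), "\n";
```

Copying @_ into named lexical variables at the top of the subroutine, as done here, is a common convention that keeps the rest of the body readable.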
Because the passed parameters take the form of a list, any subroutine can automatically read in an arbitrary number of values, but conversely the same flattening problem that affects lists placed inside other lists also affects the parameters fed to subroutines. The flexibility of the parameter passing mechanism can also cause problems if we want to define the type and quantity of parameters that a subroutine will accept. Perl allows us to define this with an optional prototype, which, if present, allows Perl to do compile-time syntax checking on how our subroutines are called.

Subroutines, like bare blocks, may return either a scalar or a list value to the calling context. This allows them to be used in expressions just as any other Perl value is. The way this value is used depends on the context in which the subroutine is called.

Declaring and Calling Subroutines

Subroutines are declared with the sub keyword. When Perl encounters sub in a program it stops executing statements directly, and instead creates a subroutine definition that can then be used elsewhere. The simplest form of subroutine definition is the explicit named subroutine:

    sub mysubroutine {
        print "Hello subroutine! \n";
    }

We can call this subroutine from Perl with:

    # call a subroutine anywhere
    mysubroutine ();

In this case we are calling the subroutine without passing any values to it, so the parentheses are empty. To pass in values we supply a list to the subroutine. Note how the subroutine parentheses resemble a list constructor:

    # call a subroutine with parameters
    mysubroutine ("testing", 1, 2, 3);

Of course, just because we are passing values into the subroutine does not mean that the subroutine will use them. In this case the subroutine entirely ignores anything we pass to it. We'll cover passing values in more detail shortly.
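The flattening of arguments and the effect of calling context described above can be sketched as follows. This is a small illustration under assumed names (count_args and context are hypothetical); it relies on the built-in wantarray function, which reports the context of the current call:

```perl
#!/usr/bin/perl
# flatten.pl - illustrative sketch; all subroutine names are hypothetical
use warnings;
use strict;

# report how many values arrived in @_
sub count_args {
    return scalar @_;
}

# report the context in which the subroutine was called
sub context {
    return wantarray ? "list" : "scalar";
}

my @sizes = (1, 2, 3);

# the array is flattened into the argument list: four values, not two
print count_args(@sizes, "four"), "\n";   # prints 4

my @in_list   = context();   # list assignment gives list context
my $in_scalar = context();   # scalar assignment gives scalar context
print "$in_list[0] $in_scalar\n";         # prints "list scalar"
```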
In Perl it does not matter if we define the subroutine before or after it is used. It is not necessary to predeclare subroutines. When Perl encounters a subroutine call it does not recognize, it searches all the source files that have been included in the program for a suitable definition, and then executes it. However, defining or predeclaring the subroutine first allows us to omit the parentheses and use the subroutine as if it were a list operator:

    # call a previously defined subroutine without parentheses
    mysubroutine;
    mysubroutine "testing", 1, 2, 3;

Note that calling subroutines without parentheses alters the precedence rules that control how their arguments are evaluated, which can cause problems, especially if we try to use a parenthesized expression as the first argument. If in doubt, use parentheses.

We can also use the old-style & code prefix to call a subroutine. In modern versions of Perl (that is, anything from Perl 5 onwards) this is strictly optional, but older Perl programs may contain statements like:

    # call a Perl subroutine using the old syntax
    &mysubroutine;
    &mysubroutine();

The ampersand has the property of causing Perl to ignore any previous definitions or declarations for the purposes of syntax, so parentheses are mandatory if we wish to pass in parameters. It also has the effect of ignoring the prototype of a subroutine, if one has been defined. Without parentheses, the ampersand also has the unusual property of providing the subroutine with the same @_ array that the calling subroutine received, rather than creating a new one. In general, the ampersand is optional and, in these modern and enlightened times, it is usually omitted for simple subroutine calls.

Anonymous Subroutines and Subroutine References

Less common than named subroutines, but just as valid, are anonymous subroutines. As their name suggests, anonymous subroutines do not have a name.
Instead they are used as expressions, which return a code reference to the subroutine definition. We can store the reference in a scalar variable (or as an element of a list or a hash value) and then refer to it through the scalar:

    my $subref = sub {print "Hello anonymous subroutine";};

In order to call this subroutine we use the ampersand prefix. This instructs Perl to call the subroutine whose reference this is, and return the result of the call:

    # call an anonymous subroutine
    &$subref;
    &$subref ("a parameter");

This is one of the few places that an ampersand is still used. However, even here it is not required; we can also say:

    $subref->();
    $subref->("a parameter");

These two variants are nearly, but not quite, identical: &$subref; (with no parentheses) passes the current @_ array, if any, directly into the called subroutine, as we briefly mentioned earlier, whereas $subref->() passes an empty list. Prototypes do not distinguish the two forms, because they are only applied to calls to named subroutines written without the ampersand; neither &$subref(...) nor $subref->(...) is subject to them. (We cover both of these points later in the chapter.)

We can generate a subroutine reference from a named subroutine using the backslash operator:

    my $subref = \&mysubroutine;

This is more useful than one might think, because we can pass a subroutine reference into another subroutine as a parameter. The following simple example demonstrates a subroutine taking a subroutine reference and a list of values, and returning a new list generated from calling the subroutine on each value of the passed list in turn:

    #!/usr/bin/perl
    # callsub.pl
    use warnings;
    use strict;

    sub do_list {
        my ($subref, @in) = @_;
        # map returns the list of results directly
        return map { $subref->($_) } @in;
    }

    sub add_one {
        return $_[0]+1;
    }

    $, = ",";
    print do_list (\&add_one, 1, 2, 3);   # prints 2,3,4

Some Perl functions (notably sort) also accept an anonymous subroutine reference as an argument.
We do not supply an ampersand in this case because sort wants the code reference, not the result of calling it. Here is a sort program that demonstrates the different ways we can supply sort with a subroutine. The anonymous subroutine appearing last will not work with Perl 5.005:

    #!/usr/bin/perl
    # sortsub.pl
    use warnings;
    use strict;

    # a list to sort
    my @list = (3, 4, 2, 5, 6, 9, 1);

    # directly with a block (cmp compares the values as strings)
    print sort {$a cmp $b} @list;

    # with a named subroutine
    sub sortsub {
        return $a cmp $b;
    }
    print sort sortsub @list;

    # with an anonymous subroutine
    my $sortsubref = sub {return $a cmp $b;};
    print sort $sortsubref @list;

Of course, since we can get a code reference for an existing subroutine we could also have said:

    $sortsubref = \&sortsub;

The advantage of using the anonymous subroutine is that we can change the subroutine that sort uses elsewhere in the program, for example:

    # define anonymous subroutines for different sort types:
    my $numericsort = sub {$a <=> $b};
    my $stringsort = sub {$a cmp $b};
    my $reversenumericsort = sub {$b <=> $a};

    # now select a sort method
    $sortsubref = $numericsort;

The disadvantage of this technique is that unless we take care to write and express our code clearly, it can be very confusing to work out what is going on, since without running the code it may not always be possible to tell which subroutine is being executed where. We can use print $subref to print out the address of the anonymous subroutine, but this is not nearly as nice to read as a subroutine name.

It is also possible to turn an anonymous subroutine into a named one, by assigning it to a typeglob. This works by manipulating the symbol table to invent a named code reference that Perl thereafter sees as a subroutine definition. This leads to the possibility of determining the actual code supported by a subroutine name at run time, which is handy for implementing things like state machines.
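A minimal sketch of the typeglob assignment just described might look like the following. The names greet and %styles are hypothetical and chosen only for illustration, and the selection of an implementation is hard-coded where a real program would decide at run time:

```perl
#!/usr/bin/perl
# globsub.pl - illustrative sketch; 'greet' and %styles are hypothetical names
use warnings;
use strict;

# two candidate implementations, held as anonymous subroutines
my %styles = (
    formal => sub { return "Good day, $_[0]." },
    casual => sub { return "Hi, $_[0]!" },
);

# pick one at run time (hard-coded here for the sake of the example)
my $style = 'casual';

# assigning a code reference to a typeglob installs it under that name
*greet = $styles{$style};

# parentheses are needed: 'greet' was not known at compile time
print greet("world"), "\n";   # prints "Hi, world!"
```

Once the assignment has run, greet behaves like any other named subroutine in the current package.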
Assigning subroutines to typeglobs will be covered more fully in 'Manipulating the Symbol Table Directly' in Chapter 8.

Strict Subroutines and the 'use strict subs' Pragma

The strict pragma has three components, refs, vars, and subs. The subs component affects how Perl interprets unqualified (that is, not quoted or otherwise identified by the syntax) words or 'barewords' when it encounters them in the code. Without strict subroutines in effect, Perl will allow a bareword and will interpret it as if it were in single quotes:

    $a = bareword;
    print $a;   # prints "bareword";

The problem with this code is that we might later add a subroutine called bareword, at which point the above code suddenly turns into a function call. Indeed, if we have warnings enabled, we will get a warning to that effect:

    Unquoted string "bareword" may clash with future reserved word at ...

Strict subroutines is intended to prevent us from using barewords in a context where they are ambiguous and could be confused with subroutines. To enable them, use one of the following:

    use strict;        # enables strict refs, vars, and subs
    use strict subs;   # enables strict subs only

Now any attempt to use a bareword will cause Perl to generate a fatal error:

    Bareword "bareword" not allowed while "strict subs" in use at ...

Ironically, the second example contains the illegal bareword subs. It works because at the point Perl parses the pragma it is not yet in effect. Immediately afterwards, barewords are not permitted, so to switch off strict subs again we would have to use either quotes or a quoting operator like qw:

    no strict 'subs';
    no strict q(subs);
    no strict qw(subs);

Predeclaring Subroutines

Perl allows subroutines to be called in two alternative syntaxes: functions with parentheses or list operators. This allows subroutines to be used as if they were one of Perl's built-in list operator functions such as print or read (neither of which requires parentheses).
The list operator syntax is only valid if Perl has already seen either the subroutine definition or a declaration of the subroutine. The following subroutine call is not legal, because the subroutine has not yet been defined:

    debug "This is a debug message";   # ERROR: no parentheses

    # rest of program

    sub debug {
        print STDERR @_, "\n";
    }

The intention here is to create a special debug statement, which works just like the print statement, but prints to standard error rather than standard out, and automatically adds a newline. Because we want it to work like print in all other respects we would prefer to omit the parentheses if we choose to, since print allows us to do that. To make this work, we predeclare the subroutine before using it:

    # predeclare subroutine 'debug'
    sub debug;

    debug "This is a debug message";   # no error

    # rest of program

    sub debug {
        print STDERR @_, "\n";
    }

Subroutines are also predeclared if we import them from another package (see Chapter 10 for more on packages), as in:

    use mypackage qw(mysubroutine);

Because use is processed at compile time, the imported name is already known when Perl parses the code that follows. This holds whether we name the subroutine in the use statement or the package exports it by default through Exporter; naming it explicitly simply makes its origin obvious to the reader. If in doubt, we can always just stick to parentheses.

Overriding Built-in Functions

Another way to predeclare subroutines is with the use subs pragma. This not only predeclares the subroutine, but also allows us to override Perl's existing built-in functions and replace them with our own. We can access the original built-in function with the CORE:: prefix.
For example, here is a replacement version of the srand function, which issues a warning if we use srand without arguments in Perl 5.004 or greater (see Appendix C for more on the srand function):

    #!/usr/bin/perl
    # srandcall.pl
    use warnings;
    use strict;
    use subs qw(srand);

    sub srand {
        if ($] >= 5.004 and not @_) {
            warn "Unqualified call to srand redundant in Perl $]";
        } else {
            # call the real srand via the CORE package
            CORE::srand @_;
        }
    }

Now if we use srand without an argument and the version of Perl is 5.004 or greater, we get a warning. If we supply an argument we are assumed to know what we are doing and are supplying a suitably random value.

Subroutines like this are generally useful in more than one program, so we might want to put this definition into a separate module and use it whenever we want to override the default srand:

    #!/usr/bin/perl
    # mysrand.pm
    package mysrand;

    use strict;
    use vars qw(@ISA @EXPORT @EXPORT_OK);
    use Exporter;

    @ISA = qw(Exporter);
    @EXPORT = qw(mysrand);
    @EXPORT_OK = qw(srand);

    sub mysrand {
        if ($] >= 5.004 and not @_) {
            warn "Unqualified call to srand redundant in Perl $]";
        } else {
            # call the real srand via the CORE package
            CORE::srand @_;
        }
    }

    use subs qw(srand);
    sub srand {&mysrand;};   # pass @_ directly to mysrand

    1;   # a module must return a true value when loaded

This module, which we would keep in a file called mysrand.pm to match the package name, exports the function mysrand automatically, and the overriding srand function only if we ask for it.

    use mysrand;                # import 'mysrand' (exported by default)
    use mysrand qw(mysrand);    # import 'mysrand' by name
    use mysrand qw(srand);      # override 'srand'

We'll talk about packages, modules, and exporting subroutines in Chapter 10.

The Subroutine Stack

Whenever Perl calls a subroutine, it pushes the details of the subroutine call onto an internal stack.
The stack holds the context of each subroutine, including the parameters that were passed to it in the form of the @_ array, ready to be restored when the call to the next subroutine returns. The number of subroutine calls that the program is currently in is known as the 'depth' of the stack. Calling subroutines are higher in the stack, and called subroutines are lower.

This might seem academic, and to a large extent it is, but Perl allows us to access the calling stack ourselves with the caller function. At any given point we are at the 'bottom' of the stack, and can look 'up' to see the contexts stored on the stack by our caller, its caller, and so on, all the way back to the top of the program. This can be handy for all kinds of reasons, but most especially for debugging.

In a purely scalar context, caller returns the name of the package from which the subroutine was called, and undef if there was no caller. Note that this does not require that the call came from inside another subroutine; it could just as easily be from the main program. In a list context, caller returns the package name, the source file, and the line number from which we were called. This allows us to write error traps in subroutines like:

    sub mysub {
        my ($pkg, $file, $line) = caller;
        die "Called with no parameters at $file line $line" unless @_;
    }

If we pass a numeric argument to caller, it looks back up the stack the requested number of levels, and returns a longer list of information: the same three items, followed by the name of the subroutine that was called and several further details. This level can of course be '0' (in which case the subroutine named is us), so to get everything that Perl knows about the circumstances surrounding the call to our subroutine we can write:

    my @caller_info = caller 0;   # or caller(0), if we prefer

This returns a whole slew of items into the list, which may or may not be defined depending on the circumstances.
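To make the two forms of caller concrete, here is a small sketch. The subroutine names whereami and outer are hypothetical, chosen for illustration; the exact file name and line number printed depend on how the script is saved and laid out:

```perl
#!/usr/bin/perl
# whereami.pl - illustrative sketch; 'whereami' and 'outer' are hypothetical names
use warnings;
use strict;

sub whereami {
    # package, file, and line of the code that called us
    my ($pkg, $file, $line) = caller;
    # with an argument, element 3 is the fully qualified name of the called sub
    my $sub = (caller 0)[3];
    return "$sub called from package $pkg at $file line $line";
}

sub outer {
    return whereami();
}

print outer(), "\n";
```

The list returned by caller 0 contains further items beyond the four used here.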
They are, in order:

package: the package of the caller
filename: the source file of the caller
line: the line number in the source file
subroutine: the subroutine that was called (that is, us). If the frame is an eval rather than a subroutine call then this is set to (eval)
hasargs: this is true if the subroutine was given its own @_ array for the call (that is, it was not called with the &name; form)
wantarray: the value of wantarray for the call, see 'Returning Values' later in the chapter
evaltext: the text inside the eval that caused the subroutine to be called, if the subroutine was called by eval
is_require: true if a require or use caused the eval
hints: compilation details, internal use only
bitmask: compilation details, internal use only

In practice, only the first four items (package, filename, line, and subroutine) are of much use to us. Called with no arguments, caller returns just the first three of these, and with an argument of 0 the subroutine named is ourselves. Either way we do not get the name of the calling subroutine, so we have to extract that from further up the stack:

    # get the name of the calling subroutine, if there was one
    $callingsub = (caller 1)[3];

Or, more legibly:

    ($pkg, $file, $line, $callingsub) = caller 1;

Armed with this information, we can create more informative error messages that report errors with respect to the caller. For example:

[...]

... difference between this subroutine and fibonacci1:

    #!/usr/bin/perl
    # fib3.pl
    use warnings;
    use strict;

    sub fibonacci3 {
        my ($count, $aref) = @_;

        unless ($aref) {
            # first call - initialize
            $aref = [1,1];
            $count -= scalar(@{$aref});
        }

        if ($count--) {
            my $next = $aref->[-1] + $aref->[-2];
            push @{$aref}, $next;
            @_ = ($count, $aref);
            goto &fibonacci3;
        } else {
            return wantarray ? @{$aref} : $aref->[-1];
        }
    }

    # calculate...
... the name):

    #!/usr/bin/perl
    # attr.pl
    use warnings;
    use strict;

    {
        package Testing;
        use Alias;
        no strict 'vars';   # to avoid declaring vars

        sub new {
            return bless {
                count => [3, 2, 1],
                message => 'Liftoff!',
            }, shift;
        }

        sub change {
            # define @count and $message locally
            attr(shift);   # this relies on 'shift' being a hash reference
            @count = (1, 2, 3);
            $message = 'Testing, Testing';
        }
    }

...

... undefined:

    sub testing {
        (@count, $message) = @_;
        print "@_";
    }   # ERROR

    testing(1, 2, 3, "Testing");
    # results in @count = (1, 2, 3, "Testing") and $message = undef

If we can define all our subroutines like this we won't have anything to worry about, but if we want to pass more than one list we still have a problem ...

... illustrate this, the following actually works just fine, the array notwithstanding:

    print volume(@size, 4, 9);   # displays 3 * 4 * 9 == 108

We have not supplied three scalars, but we have supplied three values that can be interpreted as scalars, and that's what counts to Perl.

We can also use @ and % in prototype ...

... does not, as it might suggest, mean that the subroutine requires a reference to a scalar, array, or hash variable. Instead, it causes Perl to require a variable instead of merely a value. It also causes Perl to automatically pass the variable as a reference:

    #!/usr/bin/perl
    # varproto.pl
    use warnings;
    use strict;

    sub capitalize (\$) {
        ${$_[0]} = ucfirst (lc ${$_[0]});
    }

    my $country = "england";
    capitalize ...
... place the result in a variable passed as the fourth argument, if one is supplied:

    sub volume ($$$;\$) {
        my $volume = $_[0] * $_[1] * $_[2];
        ${$_[3]} = $volume if defined $_[3];
    }

And here is how we could call it:

    volume(1, 4, 9, $result);   # $result ends up holding 36

Disabling Prototypes

All aspects of a subroutine's prototype are disabled if we call it using the old-style prefix &. This can occasionally ...

... like, including mis-spelled ones. Another problem is that it does not learn from the past; each time we call a non-existent subroutine, Perl looks for it, fails to find it, then calls AUTOLOAD. It would be more elegant to define the subroutine so that next time it is called, Perl finds it. The chances are that if we use it once, we'll use it again. To do that, we just need to create a suitable anonymous subroutine ...

... AUTOLOAD subroutines that define subroutines on-the-fly. Autoloading is quite handy in functional programming, but much more useful in modules and packages. Accordingly we cover it in more depth in Chapter 10.

Passing Parameters

Basic Perl subroutines do not have any formal way of defining their arguments. We say 'basic' because ...

...

    $message = "Testing";
    @count = (1, 2, 3);
    testing ($message, @count);   # calls 'testing' see below

The array @count is flattened with $message in the @_ array created for this subroutine call, so as far as the subroutine is concerned the following call is actually identical:

    testing ("Testing", 1, 2, 3);

In many cases this is exactly what we need. To read the subroutine ...
... recursion, for example

Both approaches suffer from the problem that Perl generates a potentially large call stack. If we try to calculate a sufficiently large sequence then Perl will run out of room to store this stack and will fail with an error message:

    Deep recursion on subroutine "main::fibonacci2" at ...

Some languages ...