Perl: The Complete Reference
FUNDAMENTALS
Chapter 4: Variables and Data

Arrays

An array is just a set of scalars. It's made up of a list of individual scalars that are stored within a single variable. You can refer to each scalar within that list using a numerical index. You can use arrays to store any kind of list data, from the days of the week to a list of all the lines in a file. Creating individual scalars for each of these is cumbersome, and in the case of the file contents, impossible to prepare for. What happens if the input file has 100 lines instead of 10? The answer is to use an array, which can be dynamically sized to hold any number of different values.

Creation

Array variables are prefixed with the @ sign and are populated using either parentheses or the qw operator. For example:

    @array = (1, 2, 'Hello');
    @array = qw/This is an array/;

The second line uses the qw// operator, which returns a list of strings, separating the delimited string by white space. In this example, this leads to a four-element array; the first element is 'This' and the last (fourth) is 'array'. Because any white space separates the elements, you can use newlines within the specification:

    @days = qw/Monday
               Tuesday
               Sunday/;

We can also populate an array by assigning each value individually:

    $array[0] = 'Monday';
    $array[6] = 'Sunday';

However, you should avoid using square brackets to create a normal array. The line

    @array = [1, 2, 'Hello'];

initializes @array with only one element, a reference to the array contained in the square brackets. We'll be looking at references in Chapter 10.

Extracting Individual Indices

When extracting individual elements from an array, you must prefix the variable with a dollar sign (to signify that you are extracting a scalar value) and then append the element index within square brackets after the name.
For example:

    @shortdays = qw/Mon Tue Wed Thu Fri Sat Sun/;
    print $shortdays[1];

Array indices start at zero, so in the preceding example we've actually printed "Tue". You can also give a negative index, in which case you select the element from the end, rather than the beginning, of the array. This means that

    print $shortdays[0];    # Outputs Mon
    print $shortdays[6];    # Outputs Sun
    print $shortdays[-1];   # Also outputs Sun
    print $shortdays[-7];   # Outputs Mon

Remember:

■ Array indices start at zero, not one, when working forward; for example:

    @days = qw/Monday Tuesday Sunday/;
    print "First day of week is $days[0]\n";

■ Array indices start at -1 for the last element when working backward.

The use of $[, which changes the lowest index of an array, is heavily deprecated, so the preceding rules should always apply.

Be careful when extracting elements from an array using a calculated index. If you are supplying an integer, then there shouldn't be any problems with resolving that to an array index (provided the index exists). If it's a floating point value, be aware that Perl always truncates the value toward zero, as if the index were passed through the int function. If you want to round to the nearest integer instead, use sprintf. This is easily demonstrated; the script

    @array = qw/a b c/;
    print("Array 8/5 (int) is: ", $array[8/5], "\n");
    print("Array 8/5 (float) is: ", $array[sprintf("%1.0f",(8/5))], "\n");

generates

    Array 8/5 (int) is: b
    Array 8/5 (float) is: c

The bare 8/5, which equates to 1.6, is interpreted as 1 in the former statement, but 2 in the latter.

Slices

You can also extract a "slice" from an array; that is, you can select more than one item from an array in order to produce another array.

    @weekdays = @shortdays[0,1,2,3,4];

The specification for a slice must be a list of valid indices, either positive or negative, each separated by a comma.
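The indexing rules described so far (zero-based and negative indices, truncation of fractional indices, and the @ prefix for slices) can be pulled together in one short sketch. The variable names $first, $last, $truncated and $rounded are illustrative only and do not come from the text; the data mirrors the @shortdays example.

```perl
use strict;
use warnings;

# Sample data mirroring the @shortdays example in the text.
my @shortdays = qw/Mon Tue Wed Thu Fri Sat Sun/;

# Illustrative variable names (assumptions, not from the book).
my $first     = $shortdays[0];       # counting forward starts at zero
my $last      = $shortdays[-1];      # counting backward starts at -1
my $truncated = $shortdays[8 / 5];   # 1.6 is truncated to index 1
my $rounded   = $shortdays[ sprintf("%.0f", 8 / 5) ];   # 1.6 rounds to index 2

# A slice uses the @ prefix because the result is a list, not a scalar.
my @weekdays = @shortdays[0, 1, 2, 3, 4];

print "first=$first last=$last truncated=$truncated rounded=$rounded\n";
print "weekdays=@weekdays\n";
```

Run as a script, this should print Mon, Sun, Tue and Wed for the four scalars, followed by the five weekday names.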
To save typing, you can also use the range operator:

    @weekdays = @shortdays[0..4];

Ranges can also be mixed with individual indices in the same list:

    @somedays = @shortdays[0..2,5,6];

Note that we're accessing the array using an @ prefix. This is because the return value that we want is another array, not a scalar. If you try accessing multiple values using $array you won't get a slice (the index list collapses to its last value, so at most one element is looked up), and the mistake is only reported if you switch warnings on:

    $ perl -we 'print $ARGV[2,3];' Fred Bob Alice
    Multidimensional syntax $ARGV[2,3] not supported at -e line 1.
    Useless use of a constant in void context at -e line 1.
    Use of uninitialized value in print at -e line 1.

Single Element Slices

Be careful when using single element slices. The statement

    print @array[1];

is no different than

    print $array[1];

except that the former returns a single element list, while the latter returns a single scalar. This can be demonstrated more easily using the fragment

    @array[1] = <DATA>;

which actually reads in all the remaining information from the DATA filehandle, but assigns only the first record read from the filehandle to the second element of the array.

Size

The size of an array can be determined using scalar context on the array. The returned value will be the number of elements in the array:

    @array = (1,2,3);
    print "Size: ",scalar @array,"\n";

The value returned will always be the physical size of the array, not the number of valid elements. You can demonstrate this, and the difference between scalar @array and $#array, using this fragment:

    @array = (1,2,3);
    $array[50] = 4;
    print "Size: ",scalar @array,"\n";
    print "Max Index: ", $#array,"\n";

This should return

    Size: 51
    Max Index: 50

There are only four elements in the array that contain information, but the array is 51 elements long, with a highest index of 50.

Hashes

Hashes are an advanced form of array. One of the limitations of an array is that the information contained within it can be difficult to get to.
For example, imagine that you have a list of people and their ages. We could store that information in two arrays, one containing the names and the other their ages:

    @names = qw/Martin Sharon Rikke/;
    @ages = (28,35,29);

Now when we want to get Martin's age, we just access index 0 of the @ages array. Furthermore, we can print out all the people's ages by printing out the contents of each array in sequence:

    for($i=0;$i<@names;$i++)
    {
        print "$names[$i] is $ages[$i] years old\n";
    }

But how would you print out Rikke's age if you were only given her name, rather than her location within the @names array? The only way would be to step through @names until we found Rikke, and then look up the corresponding age in the @ages array. This is fine for the three-element array listed here, but what happens when that array becomes 30, 300, or even 3000 elements long? If the person we wanted was at the end of the list, we'd have to step through 3000 items before we got to the information we wanted.

The hash solves this, and numerous other problems, very neatly by allowing us to access that @ages array not by an index, but by a scalar key. Because it's a scalar, that key could be built from almost anything (bear in mind that hash keys are always stored as strings, so a reference used as a key is stringified), but for this particular problem it would make sense to make it the person's name:

    %ages = ('Martin' => 28,
             'Sharon' => 35,
             'Rikke' => 29,);

Now when we want to print out Rikke's age, we just access the value within the hash using Rikke's name as the key:

    print "Rikke is $ages{Rikke} years old\n";

The process works on 3000 element hashes just as easily as it does on 3:

    print "Eileen is $ages{Eileen} years old\n";

We don't have to step through the list to find what we're looking for; we can just go straight to the information. Perl's hashes are also more efficient than those supported by most other languages.
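To make the contrast concrete, here is a minimal sketch that looks up the same age both ways. The helper name age_from_arrays is hypothetical and introduced purely for illustration; the data is the three-person example from the text.

```perl
use strict;
use warnings;

my @names = qw/Martin Sharon Rikke/;
my @ages  = (28, 35, 29);

# Parallel arrays: step through @names until the name matches,
# then use the same index into @ages.
# (age_from_arrays is a hypothetical helper, not part of the book.)
sub age_from_arrays {
    my ($wanted) = @_;
    for my $i (0 .. $#names) {
        return $ages[$i] if $names[$i] eq $wanted;
    }
    return;    # name not found
}

# Hash: the name itself is the key, so no stepping is needed.
# (%ages and @ages are separate variables in Perl.)
my %ages = (Martin => 28, Sharon => 35, Rikke => 29);

print "Arrays: Rikke is ", age_from_arrays('Rikke'), " years old\n";
print "Hash:   Rikke is $ages{Rikke} years old\n";
```

Both lines report the same age; the difference is that the array version has to examine every name before Rikke, while the hash version goes straight to the value.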
Although it is possible to end up with a super-large hash that takes a long time to locate its values, you would probably need tens or hundreds of thousands of entries before that becomes noticeable. If you are working with that level of information though, consider using a DBM file; see Chapter 13 for more information.

Creation

Hashes are created in one of two ways. In the first, you assign a value to a named key on a one-by-one basis:

    $ages{Martin} = 28;

In the second, you use a list, which is converted by taking individual pairs from the list: the first element of the pair is used as the key, and the second, as the value. For example,

    %hash = ('Fred' , 'Flintstone', 'Barney', 'Rubble');

For clarity, you can use => as an alias for , to indicate the key/value pairs:

    %hash = ('Fred' => 'Flintstone',
             'Barney' => 'Rubble');

When specifying the key for a hash element, you can avoid using quotes within the braces, as long as the key is a simple word:

    $ages{Martin} = 28;

However, if the contents are a more complex term, they will need to be quoted:

    $ages{'Martin-surname'} = 'Brown';

You can also use the - operator in front of a word, although this makes the key include the leading - sign as part of the key:

    %hash = (-Fred => 'Flintstone', -Barney => 'Rubble');
    print $hash{-Fred};

For single-letter strings, however, this will raise a warning; use single quotes to explicitly define these arguments.

Extracting Individual Elements

You can extract individual elements from a hash by specifying the key for the value that you want within braces:

    print $hash{Fred};

Care needs to be taken when embedding strings and/or variables that are made up of multiple components. The following statements are identical, albeit with a slight performance trade-off for the former method:

    print $hash{$fred . $barney};
    print $hash{"$fred$barney"};

When using more complex hash keys, use sprintf:

    print $hash{sprintf("%s-%s:%s",$a,$b,$c)};

You can also use numerical values to build up your hash keys; the values just become strings. If you are going to use this method, then you should use sprintf to enforce a fixed format for the numbers to prevent minor differences from causing you problems. For example, when formatting time values, it's better to use

    $hash{sprintf("%02d%02d",$hours,$min)};

than

    $hash{$hours . $min};

With the former, a time of five minutes past five is always stored under the key '0505' instead of '55'.

Extracting Slices

You can extract slices out of a hash just as you can extract slices from an array. You do, however, need to use the @ prefix because the return value will be a list of corresponding values:

    %hash = (-Fred => 'Flintstone', -Barney => 'Rubble');
    print join("\n",@hash{-Fred,-Barney});

Using $hash{-Fred, -Barney} would return nothing, because the two names are joined into a single key that does not exist in the hash.

Extracting Keys, Values, or Both

You can get a list of all of the keys from a hash by using keys:

    %ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
    print "The following are in the DB: ",join(', ',keys %ages),"\n";

You can also get a list of the values using values:

    %ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
    print "The following are in the DB: ",join(', ',values %ages),"\n";

These can be useful in loops when you want to print all of the contents of a hash:

    foreach $key (keys %ages)
    {
        print "$key is $ages{$key} years old\n";
    }

The problem with both these functions is that on large hashes (such as those attached to external databases), we can end up with very large memory-hungry temporary lists. You can get round this by using the each function, which returns key/value pairs.
Unlike keys and values, the each function returns only one pair for each invocation, so we can use it within a loop without worrying about the size of the list returned in the process:

    while (($key, $value) = each %ages)
    {
        print "$key is $value years old\n";
    }

The order used by keys, values, and each is specific to each hash, and that order can't be guaranteed. Also note that with each, if you use it once outside of a loop, the next invocation will return the next item in the list. You can reset this "counter" by evaluating the entire hash, which is actually as simple as

    sort keys %hash;

Checking for Existence

If you try to access a key/value pair from a hash that doesn't exist, you'll normally get the undefined value, and if you have warnings switched on, then you'll get a warning generated at run time. You can get around this by using the exists function, which returns true if the named key exists, irrespective of what its value might be:

    if (exists($ages{$name}))
    {
        print "$name is $ages{$name} years old\n";
    }
    else
    {
        print "I don't know the age of $name\n";
    }

Sorting/Ordering

There is no simple way to guarantee that the order in which a list of keys, values, or key/value pairs is returned will always be the same. In fact, it's best not even to rely on the order between two sequential evaluations:

    print(join(', ',keys %hash),"\n");
    print(join(', ',keys %hash),"\n");

If you want to guarantee the order, use sort, as, for example:

    print(join(', ',sort keys %hash),"\n");

If you're accessing a hash a number of times and want to use the same order, consider creating a single array to hold the sorted sequence, and then use the array (which will remain in sorted order) to iterate over the hash.
For example:

    my @sortorder = sort keys %hash;
    foreach my $key (@sortorder)
    {
        print "$key: $hash{$key}\n";
    }

Size

You get the size (that is, the number of elements) from a hash by using scalar context on either keys or values:

    print "Hash size: ",scalar keys %hash,"\n";

Don't use each, as in a scalar context it returns the next key from the hash, not a count of the key/value pairs, as you might expect.

If you evaluate a hash in scalar context, then it returns a string that describes the current storage statistics for the hash. This is reported as "used/total" buckets. The buckets are the storage containers for your hash information, and the detail is only really useful if you want to know how Perl's hashing algorithm is performing on your data set. If you think this might concern you, then check my Debugging Perl title, which details how hashes are stored in Perl and how you can improve the algorithm for specific data sets (see Appendix C for more information).

Lists

Lists are really a special type of array. Essentially, a list is a temporary construct that holds a series of values. The list can be "hand" generated using parentheses and the comma operator,

    @array = (1,2,3);

or it can be the value returned by a function or variable when evaluated in list context:

    print join(',', @array);

Here, the @array is being evaluated in list context because the join function is expecting a list (see Chapter 6 for more information on contexts).

Merging Lists (or Arrays)

Because a list is just a comma-separated sequence of values, you can combine lists together:

    @numbers = (1,3,(4,5,6));

The embedded list just becomes part of the main list. This also means that we can combine arrays together:

    @numbers = (@odd,@even);

Functions that return lists can also be embedded to produce a single, final list:

    @numbers = (primes(),squares());

[...]
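The flattening behaviour described under Merging Lists can be checked with a short sketch. The contents of @odd and @even and the body of squares() are assumptions made up for illustration; the text names these but does not define them.

```perl
use strict;
use warnings;

# Illustrative data: the text names @odd and @even but gives no contents.
my @odd  = (1, 3, 5);
my @even = (2, 4, 6);

# Inner parentheses do not create a nested structure:
# the embedded list is flattened into the outer one.
my @nested = (1, 3, (4, 5, 6));

# Arrays placed in a list are flattened the same way.
my @numbers = (@odd, @even);

# A stand-in for a list-returning function (hypothetical body).
sub squares { return map { $_ * $_ } 1 .. 3 }

my @all = (@odd, squares());

print "nested has ", scalar(@nested), " elements: @nested\n";
print "numbers: @numbers\n";
print "all: @all\n";
```

The first line should report five elements, confirming that (4,5,6) did not become a single nested item. To keep a list as one element you would store a reference instead, which is the square-bracket form mentioned under array creation.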
... of all the offsets of the last successful submatches from the last regular expression. Note that this contains the offset to the first character following the match, not the location of the match itself. This is the equivalent of the value returned by the pos function. The first index, $+[0], is the offset to the end of the entire match. Therefore, $+[1] is the location where $1 ends, $+[2], where $2 ends ...

$CHILD_ERROR: The status returned by the last external command (via backticks or system) or the last pipe close. This is the value returned by wait, so the true return value is $? >> 8, and $? & 127 is the number of the signal received by the process, if appropriate.

$] $OLD_PERL_VERSION ...

$^C $COMPILING: The value of the internal flag associated with the -c switch ... construct

$^T $BASETIME: The time at which the script started running, defined as the number of seconds since the epoch.

$^V $PERL_VERSION: The current revision, version, and subversion of the currently executing Perl interpreter. Specified as a v-string literal.

$VERSION: The variable accessed to determine whether a given package matches the acceptable version when the module ...

... list of the files that have been included via do, require, or use. The key is the file you specified, and the value is the actual location of the imported file.

$^I: The value of the inplace-edit extension (enabled via the -i switch on the command line). True if inplace edits are currently enabled, false otherwise.

$^M: The size of the emergency pool reserved for use by Perl and the die function when Perl runs ...

... list of all the offsets to the beginning of the last successful submatches from the last regular expression. The first index, $-[0], is the offset to the start of the entire match. Therefore, $-[1] is the offset of the start of $1, $-[2] is the offset of the start of $2, and so on.

$. $NR $INPUT_LINE_NUMBER: The current input line number of the last file from which you read. This can be either the keyboard or an external file or other filehandle ...

... to the output channel. This is set to "\f" by default.

$@ $EVAL_ERROR: The error message returned by the Perl interpreter when Perl has been executed via the eval function. If empty (false), then the last eval call executed successfully.

$% $FORMAT_PAGE_NUMBER format_page_number HANDLE EXPR ...

$$ $PID $PROCESS_ID: The process number of the Perl interpreter executing the current script.

$< $UID $REAL_USER_ID: The real ID of the user currently ...

... using the null filehandle in the angle operator.

$ARGV: The name of the current file when reading from the default filehandle.

@ARGV: The @ARGV array contains the list of the command line arguments supplied to the script. Note that the first value, at index zero, is the first argument, not the name of the script.

ARGVOUT: The special filehandle used to send output to a new file when processing the ARGV ...

... values as LIST

The third format allows for exceptions. If the expression evaluates to true, then the first block is executed; otherwise (else), the second block is executed:

    if ($date == $today)
    {
        print "Happy Birthday!\n";
    }
    else
    {
        print "Happy Unbirthday!\n";
    }

The fourth form allows for additional tests if the first expression does not return true. The elsif can be ...

... chapter, in the "The continue Block" section

until Loops

The inverse of the while loop is the until loop, which evaluates the conditional expression and reiterates over the loop only when the expression returns false. Once the expression returns true, the loop ends. In the case of a do…until loop, the conditional expression is only evaluated at the end of the code block. In an until (EXPR) BLOCK loop, the expression ...

... with the outside world.

Token        Value
__LINE__     The current line number within the current file.
__FILE__     The name of the current file.
__PACKAGE__  The name of the current package. If there is no current package, then it returns the undefined value.
__END__      Indicates the end of the script (or interpretable Perl) within a file before the physical end of file.
__DATA__     As for __END__, except that it also indicates the start of the DATA filehandle that can be opened with the open, therefore allowing you to embed script and data into the same ...