
Mastering Unix Shell Scripting, part 4



Chapter 7: Monitoring System Load

Solaris

    # sar 10 4

    SunOS wilma 5.8 Generic i86pc    07/29/02

    23:01:55    %usr    %sys    %wio   %idle
    23:02:05       1       1       0      98
    23:02:15      12      53       0      35
    23:02:25      15      67       0      18
    23:02:35      21      59       0      21

    Average       12      45       0      43

Now let's look at the average of the samples directly.

    # sar 10 4 | grep Average
    Average       12      45       0      43

What Is the Common Denominator?

With the sar command the only common denominator is that we can always grep on the word "Average." As with the iostat command, the fields vary between some Unix flavors. We can use a similar case statement to extract the correct fields for each Unix flavor, as shown in Listing 7.3.

    OS=$(uname)

    case $OS in
    AIX|HP-UX|SunOS)
        F1=2
        F2=3
        F3=4
        F4=5
        echo "\nThe Operating System is $OS\n"
        ;;
    Linux)
        F1=3
        F2=4
        F3=5
        F4=6
        echo "\nThe Operating System is $OS\n"
        ;;
    *)
        echo "\nERROR: $OS is not a supported operating system\n"
        echo "\n\t EXITING \n"
        exit 1
        ;;
    esac

Listing 7.3 Case statement for the sar fields of data.

Notice in Listing 7.3 that a single case statement sets up the environment for the shell script to select the correct fields from the sar command for each of the four Unix flavors. If the Unix flavor is not in the list, then the user receives an error message before the script exits with a return code of 1. Later we will cover the entire shell script.

Syntax for vmstat

The name of the vmstat command stands for virtual memory statistics. Using the vmstat command, we can get a lot of data about the system, including memory, paging space, page faults, and CPU statistics. We are concentrating on the CPU statistics in this chapter, so let's stay on track. The vmstat command also allows us to take direct samples over intervals for a specific time period. The vmstat command does not do any averaging for us; however, we are going to stick with two intervals. The first interval reports the average of the system load since the last system reboot, as with the iostat command. The last line contains the most current sample.
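
The field selection that Listing 7.3 sets up can be sketched as a pair of small functions. This is a minimal sketch under stated assumptions, not the chapter's own script: the function names sar_fields and sar_average are hypothetical, the field numbers are the ones given in Listing 7.3, and the field numbers are handed to awk with -v instead of the F1 through F4 shell variables.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the chapter).

# Print the four sar field positions for a given OS name,
# using the values from Listing 7.3.
sar_fields() {
    case $1 in
        AIX|HP-UX|SunOS) echo "2 3 4 5" ;;
        Linux)           echo "3 4 5 6" ;;
        *)               return 1 ;;
    esac
}

# Read sar output on stdin and print the four values
# from the "Average" line for the OS named in $1.
sar_average() {
    fields=$(sar_fields "$1") || {
        echo "ERROR: $1 is not a supported operating system" >&2
        return 1
    }
    # Split the field list into the positional parameters $1..$4.
    set -- $fields
    grep Average | awk -v a="$1" -v b="$2" -v c="$3" -v d="$4" \
        '{ print $a, $b, $c, $d }'
}
```

With the Solaris sample shown earlier, `sar 10 4 | sar_average "$(uname)"` would print the four averaged values on one line. Keeping the field table in one function means a new Unix flavor needs only one extra case branch.
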
Let's look at the output of the vmstat command for each of our Unix flavors: AIX, HP-UX, Linux, and Solaris.

AIX

    [root:yogi]@/scripts# vmstat 30 2
    kthr     memory             page              faults        cpu
     r  b   avm   fre  re  pi  po  fr  sr  cy  in    sy    cs  us sy id wa
     0  0 23936   580   0   0   0   0   2   0 103  2715   713   8 25 67  0
     1  0 23938   578   0   0   0   0   0   0 115  9942  2730  24 76  0  0

The last line of output is what we are looking for. This is the average of the CPU load over the length of the interval. We want just the last four columns in the output. The fields that we want to extract for AIX are in positions $14, $15, $16, and $17.

HP-UX

    # vmstat 30 2
    procs           memory                   page                    faults       cpu
     r  b  w    avm   free   re  at  pi  po  fr  de  sr   in    sy   cs  us sy id
     0 39  0   8382    290  122  26   2   0   0   0   3  128  2014  146  14 21 65
     1 40  0   7532    148  345  71   0   0   0   0   0  108  5550  379  29 43 27

The HP-UX vmstat output is a long string of data. Notice that for the CPU data HP-UX supplies only three values: user part, system part, and the CPU idle time. The fields that we want to extract are in positions $16, $17, and $18.

Linux

    # vmstat 30 2
    procs                  memory      swap      io     system        cpu
     r  b  w  swpd  free  buff  cache  si  so  bi  bo   in   cs  us sy id
     2  0  0   244  1088  1676  21008   0   0   1   0  127   72   1  1 99
     3  0  0   244  1132  1676  21008   0   0   0   1  212  530  37 23 40

Like HP-UX, the Linux vmstat output for CPU activity has three fields: user part, system part, and the CPU idle time. The fields that we want to extract are in positions $14, $15, and $16.

Solaris

    # vmstat 30 2
    procs memory page disk faults cpu
    r b w swap free re mf pi po fr de sr cd f0 s0 in sy cs us sy id
    0 0 0 558316 33036 57 433 2 0 0 0 0 0 0 0 0 111 500 77 2 8 90
    0 0 0 556192 29992 387 2928 0 0 0 0 0 1 0 0 0 155 2711 273 14 60 26

As with HP-UX and Linux, the Solaris vmstat output for CPU activity consists of the last three fields: user part, system part, and the CPU idle time. Each data row in this sample has 22 fields, so the three CPU values are in positions $20, $21, and $22.

What Is the Common Denominator?

There are at least two common denominators for the vmstat command output between the Unix flavors.
The first is that the CPU data is in the last fields. On AIX the data is in the last four fields, with the added I/O wait state. HP-UX, Linux, and Solaris do not list the wait state. The second common factor is that the data is always on a row that is entirely numeric. Again, we need a case statement to parse the correct fields for the command output. Take a look at Listing 7.4.

    OS=$(uname)

    case $OS in
    AIX)
        F1=14
        F2=15
        F3=16
        F4=17
        echo "\nThe Operating System is $OS\n"
        ;;
    HP-UX)
        F1=16
        F2=17
        F3=18
        F4=1 # This "F4=1" is bogus and not used for HP-UX
        echo "\nThe Operating System is $OS\n"
        ;;
    Linux)
        F1=14
        F2=15
        F3=16
        F4=1 # This "F4=1" is bogus and not used for Linux
        echo "\nThe Operating System is $OS\n"
        ;;
    SunOS)
        F1=20
        F2=21
        F3=22
        F4=1 # This "F4=1" is bogus and not used for SunOS
        echo "\nThe Operating System is $OS\n"
        ;;
    *)
        echo "\nERROR: $OS is not a supported operating system\n"
        echo "\n\t EXITING \n"
        exit 1
        ;;
    esac

Listing 7.4 Case statement for the vmstat fields of data.

Notice in Listing 7.4 that the F4 variable gets a valid assignment only on the AIX match. For HP-UX, Linux, and Solaris, the F4=1 assignment makes F4 point to the $1 field, a value that is extracted but never used. This bogus assignment is made so that we do not need a special vmstat command statement for each operating system. You will see how this works in detail in the scripting section.

Scripting the Solutions

Each of the techniques presented is slightly different in execution and output. Some options need to be timed over an interval for a user-defined amount of time, measured in seconds. We can get an immediate load measurement using the uptime command, but the sar, iostat, and vmstat commands require the user to specify a period of time to measure over and the number of intervals to sample the load.
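
The two common denominators named for vmstat (CPU data in the last fields, on an all-numeric row) can be sketched as follows. This is a hedged illustration, not the chapter's script: the function name vmstat_cpu is hypothetical, the field positions come from Listing 7.4, and instead of the bogus F4=1 placeholder the sketch passes awk a list of three or four positions.

```shell
#!/bin/sh
# Hypothetical helper (name is not from the chapter).
# Reads vmstat output on stdin and prints the CPU fields of the
# last all-numeric row for the OS named in $1.
vmstat_cpu() {
    case $1 in
        AIX)   fields="14 15 16 17" ;;   # us sy id wa
        HP-UX) fields="16 17 18" ;;      # us sy id
        Linux) fields="14 15 16" ;;      # us sy id
        SunOS) fields="20 21 22" ;;      # us sy id
        *)     return 1 ;;
    esac

    # Drop header rows (anything containing a letter) and blank lines,
    # keep only the most recent sample, then print the listed fields.
    grep -E -v '[a-zA-Z]|^$' | tail -n 1 | awk -v list="$fields" '
        {
            n = split(list, f, " ")
            out = ""
            for (i = 1; i <= n; i++)
                out = out (i > 1 ? " " : "") $(f[i])
            print out
        }'
}
```

A call such as `vmstat 30 2 | vmstat_cpu "$(uname)"` would print one line of CPU percentages. Passing a variable-length list is one design choice; the chapter's approach of always printing four fields keeps a single awk statement at the cost of one unused value.
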
If you enter the sar, iostat, or vmstat commands without any arguments, then the statistics presented are an average since the last system reboot. Because we want current statistics, the scripts must supply a period of time to sample. For iostat and vmstat we are always going to initialize the INTERVAL variable to 2. The first line of output is measured since the last system reboot, and the second line is the current data that we are looking for. Let's look at each of these commands in separate shell scripts in the following sections.

Using uptime to Measure the System Load

The uptime command gives one of the best indicators of the system load. The last columns of the output represent the average length of the run queue over the last 1, 5, and 15 minutes. A run queue is where jobs wanting CPU time line up for their turn for some processing time in the CPU. The priority of the process, or on some systems a thread, has a direct influence on how long a job has to wait in line before getting more CPU time. On Unix a numerically lower priority value is more favorable: the lower the priority value, the more CPU time; the higher the priority value, the less CPU time.

The uptime command always reports an average of the length of the run queue. The threshold trigger value that you set will depend on the normal load of your system. My little C-10 AIX box starts getting very slow when the run queue hits 2, but the S-80 at work typically runs with a run queue value over 8 because it is a multiprocessor machine running a terabyte database. With these differences in acceptable run queue levels, you will need to tailor the threshold level for notification on a machine-by-machine basis.

Scripting with the uptime Command

Scripting the uptime solution is a short shell script, and the response is immediate. As you remember from the "Syntax" section, we had to follow the floating load statistics as the time since the last reboot moved from minutes, to hours, and even days after the machine was rebooted.
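
As an alternative to tracking the floating field position, the first load value can be located by its label, and the threshold test can be done on the decimal values directly. The sketch below is illustrative only: the function names are hypothetical, and it assumes the uptime output contains the text "load average:" (or "load averages:") followed by the numbers, which should be checked on each platform.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the chapter).

# Read one line of uptime output on stdin and print the first
# load average value. ASSUMPTION: the line contains the label
# "load average:" or "load averages:" before the numbers.
first_load_average() {
    sed -n 's/.*load average[s]*: *\([0-9.]*\).*/\1/p'
}

# Succeed (exit status 0) when load $1 is greater than or
# equal to threshold $2; awk handles the decimal comparison.
load_exceeds() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load + 0 >= max + 0) }'
}
```

Typical use would be `LOAD=$(uptime | first_load_average)` followed by `load_exceeds "$LOAD" 2.00 && echo "WARNING: System load has reached $LOAD"`. Comparing the decimals in awk avoids truncating 2.97 to 2 before the test.
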
The good thing is that the floating fields are consistent across the Unix flavors studied in this book. Let's look at the uptime_loadmon.ksh shell script shown in Listing 7.5.

    #!/bin/ksh
    #
    # SCRIPT: uptime_loadmon.ksh
    # AUTHOR: Randy Michael
    # DATE: 07/26/2002
    # REV: 1.0.P
    # PLATFORM: AIX, HP-UX, Linux, and Solaris
    #
    # PURPOSE: This shell script uses the "uptime" command to
    #          extract the most current load average data. There
    #          is a special need in this script to determine
    #          how long the system has been running since the
    #          last reboot. The load average field "floats"
    #          during the first 24 hours after a system restart.
    #
    # set -x # Uncomment to debug this shell script
    # set -n # Uncomment to check script syntax without any execution
    #
    ###################################################
    ############# DEFINE VARIABLES HERE ###############
    ###################################################

    MAXLOAD=2.00
    typeset -i INT_MAXLOAD=$MAXLOAD

    # Find the correct field to extract based on how long
    # the system has been up, or since the last reboot.

    if uptime | grep day | grep min >/dev/null
    then
        FIELD=11
    elif uptime | grep day | grep hrs >/dev/null
    then
        FIELD=11
    elif uptime | grep day >/dev/null
    then
        FIELD=10
    elif uptime | grep min >/dev/null
    then
        FIELD=9
    else
        FIELD=8
    fi

    ###################################################
    ######## BEGIN GATHERING STATISTICS HERE ##########
    ###################################################

    echo "\nGathering System Load Average using the \"uptime\" command\n"

    # This next command statement extracts the latest
    # load statistics no matter what the Unix flavor is.

    LOAD=$(uptime | sed s/,//g | awk '{print $'$FIELD'}')

    # We need an integer representation of the $LOAD
    # variable to do the test for the load going over
    # the set threshold defined by the $INT_MAXLOAD
    # variable

    typeset -i INT_LOAD=$LOAD

    # If the current load has exceeded the threshold then
    # issue a warning message. The next step always shows
    # the user what the current load and threshold values
    # are set to.

    ((INT_LOAD >= INT_MAXLOAD)) && echo "\nWARNING: System load has reached ${LOAD}\n"

    echo "\nSystem load value is currently at ${LOAD}"
    echo "The load threshold is set to ${MAXLOAD}\n"

Listing 7.5 uptime_loadmon.ksh shell script listing.

There are two statements that I want to point out in Listing 7.5 (they are highlighted in boldface in the printed listing). First, notice the LOAD= statement. To make the variable assignment we use command substitution, defined by the VAR=$(command statement) notation. In the command statement we execute the uptime command and pipe the output to a sed statement. This sed statement removes all of the commas (,) from the uptime output. We need to take this step because the load statistics are comma separated. Once the commas are removed, the remaining output is piped to the awk statement, which extracts the correct field. That field is defined at the top of the shell script by the FIELD variable, based on how long the system has been running.

In this awk statement notice how we find the positional parameter that the $FIELD variable is pointing to. If you try to use the syntax $$FIELD, the result is the current process ID ($$) and the word FIELD. To get around this little problem of directly accessing what a variable is pointing to, we close the single-quoted awk program just after the dollar sign, let the shell expand $FIELD, and then reopen the single quotes, so that awk sees {print $8}:

    # The $8 field holds the value 34.
    FIELD=8

    # Wrong usage
    echo $$FIELD
    3243FIELD

    # Correct usage, inside the awk statement
    awk '{print $'$FIELD'}'
    34

Notice that the latter usage is correct, and the actual result is the value of the $8 field, which is currently 34.
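
The same field lookup can also be written without splicing shell quotes into the awk program, by handing the field number to awk as a variable. This is a sketch of an alternative, not the chapter's method; the function name print_field is hypothetical, while the -v option itself is standard awk.

```shell
#!/bin/sh
# Hypothetical helper (name is not from the chapter).
# Print field number $1 of every line read on stdin.
print_field() {
    # -v assigns the shell value to the awk variable f before
    # the program runs, so $f is "the field that f points to".
    awk -v f="$1" '{ print $f }'
}

FIELD=8
# Equivalent to the quote-splicing form: awk '{print $'$FIELD'}'
echo "a b c d e f g 34" | print_field "$FIELD"
```

Both forms produce the same result; the -v form keeps the awk program in a single pair of quotes, which can be easier to read when several fields are involved.
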
In effect, the $'$FIELD' construct tells us the value of what a pointer is pointing to: it yields the field that the FIELD variable refers to. You will see other uses of this technique as we go through this chapter.

The second command statement that I want to point out is the test of the INT_LOAD value against the INT_MAXLOAD value, which are integer values of the LOAD and MAXLOAD variables. If INT_LOAD is equal to, or has exceeded, INT_MAXLOAD, then we use a logical AND (&&) to echo a warning to the user's screen. Using the logical AND saves a little code and is faster than an if...then...else statement.

You can see the uptime_loadmon.ksh shell script in action in Listings 7.6 and 7.7.

    # ./uptime_loadmon.ksh

    Gathering System Load Average using the "uptime" command

    System load value is currently at 1.86
    The load threshold is set to 2.00

Listing 7.6 Script in action under "normal" load.

Listing 7.6 shows the uptime_loadmon.ksh shell script in action on a machine that is under a normal load. Listing 7.7 shows the same machine under an excessive load (at least, it is excessive for this little machine).

    # ./uptime_loadmon.ksh

    Gathering System Load Average using the "uptime" command

    WARNING: System load has reached 2.97

    System load value is currently at 2.97
    The load threshold is set to 2.00

Listing 7.7 Script in action under "excessive" load.

This is about all there is to using the uptime command. Let's move on to the sar command.

Using sar to Measure the System Load

Most Unix flavors have sar data collection set up by default. This sar data is presented when the sar command is executed without any switches. The data that is displayed is automatically collected at scheduled intervals throughout the day and compiled into a report at day's end. By default, the system keeps a month's worth of data available for online viewing. This is great for seeing the basic trends of the machine as it is loaded through the day.
If we want to collect data at a specific time of day for a specific period of time, then we need to add the number of seconds for each interval and the total number of intervals to the sar command. The final line in the output is an average of all of the previous sample intervals.

This is where our shell script comes into play. By using a shell script with the times and intervals defined, we can take samples of the system load over small or large increments of time without interfering with the system's collection of sar data. This can be a valuable tool for things like taking hundreds of small incremental samples as a development application is being tested. Of course, this technique can also help in troubleshooting just about any application. Let's look at how we script the solution.

Scripting with the sar Command

For each of our Unix flavors the sar command produces four CPU load statistics. The outputs vary somewhat, but the basic idea remains the same. In each case, we define an INTERVAL variable specifying the total number of samples to take and a SECS variable to define the total number of seconds for each sample interval. Notice that we used the variable SECS as opposed to SECONDS. We do not want to use the variable SECONDS because it is a Korn shell built-in variable used for timing in a shell. As I stated in the introduction, this book uses variable names in uppercase so the reader will quickly know that the code is referencing a variable; however, in the real world you may want to use the lowercase version of the variable name. It really would not matter here because we are defining the variable value and then using it within the same second, hopefully.

The next step in this shell script is to define which positional fields we need to extract to get the sar data for each of the Unix operating systems. For this step we use a case statement using the uname command output to define the fields of data.
It turns out that AIX, HP-UX, and SunOS operating systems all have the sar data located in the $2, $3, $4, and $5 positions. Linux differs in this respect, with the sar data residing in the $3, $4, $5, and $6 positions. In each case, these field numbers are assigned to the F1, F2, F3, and F4 variables inside the case statement.

Let's look at the sar_loadmon.ksh shell script in Listing 7.8 and cover the remaining details at the end.

    #!/bin/ksh
    #
    # SCRIPT: sar_loadmon.ksh
    # AUTHOR: Randy Michael
    # DATE: 07/26/2002
    # REV: 1.0.P
    # PLATFORM: AIX, HP-UX, Linux, and Solaris
    #
    # PURPOSE: This shell script takes multiple samples of the CPU
    #          usage using the "sar" command. The average of
    #          sample periods is shown to the user based on the
    #          Unix operating system that this shell script is
    #          executing on. Different Unix flavors have differing
    #          outputs and the fields vary too.
    #
    # REV LIST:
    #
    #
    # set -n # Uncomment to check the script syntax without any execution
    # set -x # Uncomment to debug this shell script
    #
    ###################################################
    ############# DEFINE VARIABLES HERE ###############
    ###################################################

    SECS=30      # Defines the number of seconds for each sample
    INTERVAL=10  # Defines the total number of sampling intervals
    OS=$(uname)  # Defines the Unix flavor

    ###################################################
    ##### SETUP THE ENVIRONMENT FOR EACH OS HERE ######
    ###################################################

    # These "F-numbers" point to the correct field in the
    # command output for each Unix flavor.

    case $OS in
    AIX|HP-UX|SunOS)
        F1=2
        F2=3
        F3=4
        F4=5
        echo "\nThe Operating System is $OS\n"
        ;;
    Linux)
        F1=3
        F2=4
        F3=5
        F4=6
        echo "\nThe Operating System is $OS\n"
        ;;
    *)
        echo "\nERROR: $OS is not a supported operating system\n"
        echo "\n\t EXITING \n"
        exit 1
        ;;

Listing 7.8 sar_loadmon.ksh shell script listing. (continues)

[...]...
    ... $INTERVAL

    AIX yogi 1 5 000125604800    07/31/02

    19:24:00    %usr    %sys    %wio   %idle
    19:24:30       0       1       1      98
    19:25:00       4      15      13      68
    19:25:30      26      28      40       6
    19:26:00      13      12      11      64
    19:26:30      16      44       0      39
    19:27:00      27      73       0       0
    19:27:30      20      48       2      30
    19:28:00       5       6       9      80
    19:28:30      11       9       5      75
    19:29:00       9      18       0      73

    Average       13      26       8      53

The previous output is produced by the first part of the sar command statement. Then, all of this output is piped to the next...

    ... shown to the user based
    # on the Unix operating system that this shell script is
    # executing on. Different Unix flavors have differing
    # outputs and the fields vary too.
    #
    # REV LIST:
    #
    #
    # set -n # Uncomment to check the script syntax without any execution
    # set -x # Uncomment to debug this shell script
    #
    ###################################################...

Listing 7.10 iostat_loadmon.ksh shell script listing. (continues)

    ... Unix flavor

    ###################################################
    ##### SETUP THE ENVIRONMENT FOR EACH OS HERE ######
    ###################################################

    # These "F-numbers" point to the correct field in the
    # command output for each Unix flavor.

    case $OS in
    AIX|HP-UX)
        SWITCH='-t'
        F1=3
        F2=4
        F3=5
        F4=6
        echo "\nThe Operating System is $OS\n"
        ;;
    Linux|SunOS)
        SWITCH='-c'
        F1=1
        F2=2
        F3=3
        F4=4...

... following output:

    23.15  0.00  26.09  50.76
    31.77  0.00  21.79  46.44

This brings us to the next addition to the iostat command statement in the shell script. This is where we add the awk part of the statement using the F1, F2, F3, and F4 variables, as shown here.

    iostat $SWITCH $SECS $INTERVAL | egrep -v '[a-zA-Z]|^$' \
        | awk '{print $'$F1', $'$F2', $'$F3', $'$F4'}'

This is the same code that we covered in the last...
    ... HP-UX has only three relative columns in the output
        F1=16
        F2=17
        F3=18
        F4=1 # This "F4=1" is bogus and not used for HP-UX
        echo "\nThe Operating System is $OS\n"
        ;;
    Linux) # Linux has only three relative columns in the output
        F1=14
        F2=15
        F3=16
        F4=1 # This "F4=1" is bogus and not used for Linux
        echo "\nThe Operating System is...

Listing 7.12 vmstat_loadmon.ksh shell script listing. (continues)

... the main purpose of this command anyway. Let's look at the vmstat script.

Scripting with the vmstat Command

When you look at this shell script for vmstat you will think that you just saw this shell script in the last section. Most of these two shell scripts are the same, with only minor exceptions. Let's look at the vmstat_loadmon.ksh shell script in Listing 7.12 and cover the differences in detail at the...

Listing 7.11 iostat_loadmon.ksh shell script in action. (continued)

Notice that the output is in the same format as the sar script output. This is all there is to the iostat shell script. Let's now move on to the vmstat solution.

Using vmstat to Measure the System Load

The vmstat shell script uses the exact same technique as the iostat shell script in the previous section. Only AIX produces four fields of output; the remaining Unix...

    ... wait while gathering statistics

    User part is 14%
    System part is 54%
    Idle time is 31%

Listing 7.13 vmstat_loadmon.ksh shell script in action. (continued)

Notice that the Solaris output shown in Listing 7.13 does not show the I/O wait state. This information is available only on AIX for the vmstat shell script. The output format is the same as the last few shell scripts. It is up to you how you want to use...
    ... '{print $'$F1', $'$F2', $'$F3', $'$F4'}' \
        | while read FIRST SECOND THIRD FOURTH

We really need to look at the statement one pipe at a time. In the very first part of the statement we take the sample(s) over the defined number of intervals. Consider the following statement and output:

    SECS=30
    INTERVAL=10

    # sar $SECS $INTERVAL

    AIX yogi 1 5 000125604800
    19:24:00
    19:24:30
    19:25:00
    19:25:30
    19:26:00
    19:26:30...

... section that we defined the field numbers and assigned them to the F1, F2, F3, and F4 variables, which in our case results in F1=2, F2=3, F3=4, and F4=5. Using the following extension to our previous command we get the following statement:

    sar $SECS $INTERVAL | grep Average \
        | awk '{print $'$F1', $'$F2', $'$F3', $'$F4'}'

Notice that we continued the command statement on the next line...
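
The last stage of the pipeline, the while read loop that receives the four extracted values, might look something like the sketch below. This is an assumption-laden illustration because the chapter's own loop body is not shown in this excerpt: the function name report_cpu is hypothetical, the "User part", "System part", and "Idle time" labels follow the script output quoted in the chapter, and the "I/O wait state is" label is an assumed wording.

```shell
#!/bin/sh
# Hypothetical helper (name is not from the chapter).
# Reads lines of "usr sys wio idle" values on stdin and prints
# one labeled percentage per line for each input line.
report_cpu() {
    while read FIRST SECOND THIRD FOURTH
    do
        printf 'User part is %s%%\n' "$FIRST"
        printf 'System part is %s%%\n' "$SECOND"
        # ASSUMPTION: label wording for the third field.
        printf 'I/O wait state is %s%%\n' "$THIRD"
        printf 'Idle time is %s%%\n' "$FOURTH"
    done
}
```

In the sar pipeline this function would take the place of the loop, for example `sar $SECS $INTERVAL | grep Average | awk '{print $'$F1', $'$F2', $'$F3', $'$F4'}' | report_cpu`.
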

Date posted: 14/08/2014, 16:20
