UNIX Unleashed, System Administrator's Edition, part 8

237 awilson 0 0 0 0 1 0 0 0 0 2 0
273 wdwood 0 0 0 0 1 0 0 0 0 1 0

The definitions of the columns in this report are as follows:

UID
    The user's identification number.
LOGIN NAME
    The user's name.
CPU prime/non prime
    The amount of CPU time the user's programs required. This is rounded up to the nearest minute.
KCORE prime/non prime
    The amount of memory per minute used to run the programs. This is rounded up to the nearest kilobyte.
CONNECT prime/non prime
    Total time the user was actually connected to the system.
DISK BLOCKS
    The number of disk blocks used.
# OF PROCS
    The number of processes the user executed.
# OF SESS
    The number of sessions the user incurred by logging in.
# DISK SAMPLES
    The number of times acctdusg or diskusg was run to accumulate the average number of DISK BLOCKS.
FEE
    The total amount of usage charges assessed against the user for this given period.

Daily Command Summary Report and Total Command Summary Report

The Daily Command Summary Report is found in the /var/adm/acct/nite directory. It is an ASCII file called daycms and can be viewed with any available text viewer or editor.
$ cat /var/adm/acct/nite/daycms
TOTAL COMMAND SUMMARY
COMMAND NUMBER TOTAL TOTAL TOTAL MEAN MEAN HOG CHARS BLOCKS
NAME CMDS KCOREMIN CPU-MIN REAL-MIN SIZE-K CPU-MIN FACTOR TRNSFD READ
TOTALS 82 12.68 0.06 21.91 209.92 0.00 0.28 6.636e+06 0.00
man 1 7.56 0.02 1.68 440.00 0.02 1.02 5.566e+06 0.00
vi 1 2.24 0.02 0.53 121.00 0.02 3.49 71936.00 0.00
ls 5 1.15 0.01 0.02 108.15 0.00 68.33 117144.00 0.00
fgrep 14 0.39 0.00 0.01 124.17 0.00 42.86 286776.00 0.00
tail 14 0.36 0.00 0.02 126.82 0.00 18.03 142744.00 0.00
bsh 6 0.28 0.00 0.01 99.27 0.00 27.50 49410.00 0.00
ps 1 0.21 0.00 0.00 137.00 0.00 66.67 19696.00 0.00
ftpd 1 0.20 0.00 0.81 155.00 0.00 0.16 41576.00 0.00
sendmail 1 0.12 0.00 0.00 468.00 0.00 100.00 13744.00 0.00
fwtmp 3 0.07 0.00 0.00 143.00 0.00 25.00 35840.00 0.00
more 2 0.05 0.00 2.41 195.00 0.00 0.01 30144.00 0.00
pg 3 0.03 0.00 14.78 28.50 0.00 0.01 61232.00 0.00
ksh 2 0.00 0.00 0.00 0.00 0.00 0.00 18360.00 0.00
rm 6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
accton 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
acctwtmp 2 0.00 0.00 0.00 0.00 0.00 0.00 128.00 0.00
egrep 14 0.00 0.00 0.00 0.00 0.00 0.00 143976.00 0.00
grep 1 0.00 0.00 0.00 0.00 0.00 0.00 23976.00 0.00
dspmsg 2 0.00 0.00 0.00 0.00 0.00 0.00 8214.00 0.00
sh 1 0.00 0.00 1.64 0.00 0.00 0.02 5058.00 0.00

The Total Command Summary Report looks like the preceding report with one exception. It is usually a monthly summary showing the totals accumulated since the last month or the last execution of monacct. The Total Command Summary Report is found in the /var/adm/acct/sum directory. It is an ASCII file called cms and can be viewed with any available text viewer or editor.

The definitions of the columns in this report are as follows:

COMMAND NAME
    The name of the command.
NUMBER CMDS
    The total number of times the command has been executed.
TOTAL KCOREMIN
    The total cumulative kilobyte segments of memory used by the command, per minute of run time.
TOTAL CPU-MIN
    The total processing time in minutes.
TOTAL REAL-MIN
    The total real (elapsed) time in minutes.
MEAN SIZE-K
    The mean of TOTAL KCOREMIN divided by the number of executions.
MEAN CPU-MIN
    The mean of the total processing time in minutes divided by the number of executions.
HOG FACTOR
    The total processing time divided by elapsed time. This is the utilization ratio of the system.
CHARS TRNSFD
    The total number of characters transferred by reads and writes to the filesystem.
BLOCKS READ
    The total number of physical block reads and writes.

Daily Systems Accounting Summary Report

This report is generated by the runacct command via cron. It is found in the /var/adm/acct/sum directory in a file whose name has the format rprt{MMDD}, and it summarizes the daily activity of the system. It is an ASCII file and can be viewed with any available text viewer or editor. An example of this report follows:

$ cat /var/adm/acct/sum/rprt0510
Sat May 10 21:41:50 EST 1997 DAILY REPORT FOR AIX Page 1

from Sat May 10 21:27:28 EST 1997
to   Sat May 10 21:41:46 EST 1997
1 openacct
1 runacct
1 acctcon1

TOTAL DURATION: 14 MINUTES
LINE    MINUTES  PERCENT  # SESS  # ON  # OFF
lft0    14       100      1       1     1
pts/0   14       100      1       1     1
pts/1   14       100      1       1     1
pts/2   14       100      1       1     1
pts/3   14       100      1       1     1
TOTALS  72                5       5     5

Sat May 10 21:41:50 EST 1997 DAILY USAGE REPORT FOR AIX Page 1

     LOGIN    CPU    CPU    KCORE  KCORE  CONNECT CONNECT DISK   FEES # OF  # OF # DISK
UID  NAME     PRIME  NPRIME PRIME  NPRIME PRIME   NPRIME  BLOCKS      PROCS SESS SAMPLES
0    TOTAL    0      0      0      7      0       72      5      0    216   5    4
0    root     0      0      0      1      0       72      1      0    28    5    1
2    bin      0      0      0      0      0       0       1      0    0     0    1
4    adm      0      0      0      6      0       0       0      0    188   0    0
100  guest    0      0      0      0      0       0       1      0    0     0    1
200  servdir  0      0      0      0      0       0       2      0    0     0    1

Sat May 10 21:41:48 EST 1997 DAILY COMMAND SUMMARY Page 1

TOTAL COMMAND SUMMARY
COMMAND NUMBER TOTAL TOTAL TOTAL MEAN MEAN HOG CHARS BLOCKS
NAME CMDS KCOREMIN CPU-MIN REAL-MIN SIZE-K CPU-MIN FACTOR TRNSFD READ
TOTALS 216 6.91 0.05 7.42 132.59 0.00 0.70 1.707e+07 4094.00
diskusg 1 3.73 0.03 0.11 142.00 0.03 23.22 1.625e+07 4094.00
bsh 15 1.28 0.01 0.24 129.47 0.00 4.20 49810.00 0.00
awk 6 0.27 0.00 0.00 175.67 0.00 37.50 48971.00 0.00
ls 6 0.25 0.00 0.00 136.14 0.00 87.50 37662.00 0.00
tail 6 0.13 0.00 0.01 96.80 0.00 18.52 61176.00 0.00
sendmail 1 0.12 0.00 0.00 462.00 0.00 100.00 13744.00 0.00
dspmsg 18 0.11 0.00 0.00 212.00 0.00 33.33 74259.00 0.00
cat 17 0.11 0.00 0.00 136.00 0.00 60.00 1615.00 0.00
acctcms 4 0.10 0.00 0.00 201.00 0.00 50.00 65520.00 0.00
fgrep 6 0.09 0.00 0.00 84.25 0.00 36.36 122904.00 0.00
sort 7 0.09 0.00 0.03 112.00 0.00 3.03 15302.00 0.00
acctmerg 6 0.08 0.00 0.00 105.00 0.00 37.50 9064.00 0.00
vi 1 0.07 0.00 0.11 127.00 0.00 0.48 17912.00 0.00
egrep 6 0.06 0.00 0.00 124.00 0.00 33.33 61704.00 0.00
chown 14 0.06 0.00 0.00 48.80 0.00 35.71 15988.00 0.00
grep 3 0.06 0.00 0.01 121.00 0.00 3.70 7971.00 0.00
date 9 0.05 0.00 0.00 198.00 0.00 50.00 169.00 0.00
acctprc1 1 0.04 0.00 0.00 74.00 0.00 66.67 23672.00 0.00
acctcon1 1 0.04 0.00 0.00 142.00 0.00 50.00 7883.00 0.00
uniq 3 0.04 0.00 0.02 142.00 0.00 1.16 6269.00 0.00
ypcat 2 0.03 0.00 0.00 129.00 0.00 6.25 58000.00 0.00
more 1 0.03 0.00 6.82 101.00 0.00 0.00 11384.00 0.00
pr 5 0.02 0.00 0.00 94.00 0.00 14.29 26094.00 0.00
rm 7 0.02 0.00 0.00 82.00 0.00 9.09 5058.00 0.00
lsuser 1 0.01 0.00 0.01 19.00 0.00 7.14 4600.00 0.00
sed 7 0.00 0.00 0.01 0.00 0.00 0.00 30745.00 0.00
fwtmp 4 0.00 0.00 0.00 0.00 0.00 10.00 10270.00 0.00
getopt 3 0.00 0.00 0.00 0.00 0.00 0.00 48.00 0.00
chmod 15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
acctwtmp 2 0.00 0.00 0.00 0.00 0.00 0.00 128.00 0.00
uname 1 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00
wtmpfix 1 0.00 0.00 0.00 0.00 0.00 0.00 3072.00 0.00
mv 7 0.00 0.00 0.00 0.00 0.00 0.00 5058.00 0.00
acctcon2 1 0.00 0.00 0.00 0.00 0.00 100.00 660.00 0.00
accton 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
df 1 0.00 0.00 0.00 0.00 0.00 0.00 733.00 0.00
basename 2 0.00 0.00 0.00 0.00 0.00 0.00 23.00 0.00
expr 1 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00
cp 19 0.00 0.00 0.00 0.00 0.00 0.00 10012.00 0.00
wc 1 0.00 0.00 0.00 0.00 0.00 0.00 1203.00 0.00
acctprc2 1 0.00 0.00 0.00 0.00 0.00 0.00 12736.00 0.00
acctdisk 1 0.00 0.00 0.00 0.00 0.00 0.00 339.00 0.00
ln 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Sat May 10 21:41:48 EST 1997 MONTHLY TOTAL COMMAND SUMMARY Page 1

TOTAL COMMAND SUMMARY
COMMAND NUMBER TOTAL TOTAL TOTAL MEAN MEAN HOG CHARS BLOCKS
NAME CMDS KCOREMIN CPU-MIN REAL-MIN SIZE-K CPU-MIN FACTOR TRNSFD READ
TOTALS 1771 281.68 1.22 706.08 231.12 0.00 0.17 1.423e+08 4094.00
dtterm 2 136.19 0.24 81.83 566.58 0.12 0.29 333760.00 0.00
man 11 79.59 0.18 4.64 431.09 0.02 3.98 5.915e+07 0.00
bsh 135 7.74 0.06 13.13 124.32 0.00 0.47 705187.00 0.00
find 2 6.64 0.21 0.74 31.00 0.11 28.78 13764.00 0.00
lsuser 24 5.85 0.08 0.28 72.03 0.00 29.21 117880.00 0.00
ksh 27 4.60 0.03 189.72 178.53 0.00 0.01 637811.00 0.00
crash 1 3.89 0.02 3.19 237.00 0.02 0.51 2.635e+07 0.00
diskusg 1 3.73 0.03 0.11 142.00 0.03 23.22 1.625e+07 4094.00
acctcom 18 3.30 0.05 0.09 62.18 0.00 62.01 839000.00 0.00
telnet 2 3.25 0.05 79.10 65.62 0.02 0.06 417920.00 0.00
errpt 8 3.13 0.01 0.03 267.40 0.00 41.67 2.129e+06 0.00
telnetd 5 2.40 0.06 107.89 40.19 0.01 0.06 762960.00 0.00
tail 102 2.31 0.02 0.26 99.48 0.00 8.77 1.04e+06 0.00
fgrep 102 2.03 0.02 0.05 113.25 0.00 37.30 2.089e+06 0.00
ls 75 2.00 0.02 0.04 116.15 0.00 45.21 603775.00 0.00
vi 16 1.46 0.01 37.87 116.81 0.00 0.03 330912.00 0.00
more 35 1.21 0.01 26.19 125.30 0.00 0.04 518424.00 0.00
awk 25 0.96 0.01 0.04 175.33 0.00 14.09 141725.00 0.00
dspmsg 98 0.74 0.00 0.01 166.24 0.00 50.00 493457.00 0.00
uniq 50 0.69 0.01 0.55 115.74 0.00 1.09 142133.00 0.00
grep 33 0.59 0.00 0.30 126.22 0.00 1.54 237104.00 0.00
ps 6 0.59 0.01 0.03 86.88 0.00 25.49 104784.00 0.00
rm 75 0.59 0.01 0.04 75.23 0.00 17.75 404784.00 0.00
file 13 0.52 0.00 0.01 110.83 0.00 36.00 461053.00 0.00
sort 59 0.52 0.01 0.55 90.45 0.00 1.04 280988.00 0.00
sendmail 8 0.48 0.00 0.00 463.00 0.00 44.44 109952.00 0.00
date 96 0.46 0.00 0.01 194.89 0.00 33.33 21536.00 0.00
acctcms 15 0.42 0.00 0.01 179.22 0.00 45.00 267903.00 0.00
acctcon1 16 0.41 0.00 0.01 142.45 0.00 22.92 180989.00 0.00
egrep 102 0.40 0.00 0.02 102.33 0.00 16.13 1.049e+06 0.00
chown 49 0.38 0.01 0.01 55.54 0.00 49.06 155734.00 0.00
sh 26 0.34 0.00 4.22 131.70 0.00 0.06 156818.00 0.00
ftpd 1 0.33 0.00 57.30 212.00 0.00 0.00 184320.00 0.00
acctprc2 3 0.30 0.00 0.01 130.00 0.00 39.13 91430.00 0.00
lslpp 4 0.28 0.00 0.03 135.75 0.00 8.25 108864.00 0.00
sadc 4 0.28 0.01 0.08 24.09 0.00 13.92 2.402e+07 0.00
strings 7 0.27 0.00 0.00 95.00 0.00 61.11 76242.00 0.00
cp 84 0.26 0.00 0.02 166.67 0.00 7.79 28181.00 0.00
cat 66 0.26 0.00 0.01 163.33 0.00 27.27 93983.00 0.00
sed 44 0.25 0.00 0.28 117.75 0.00 0.74 245587.00 0.00
acctprc1 4 0.24 0.00 2.77 115.25 0.00 0.08 167963.00 0.00
mv 36 0.21 0.00 11.51 81.50 0.00 0.02 61310.00 0.00
servdir. 4 0.17 0.00 0.19 79.87 0.00 1.10 2876.00 0.00
ibm.psa 4 0.16 0.00 0.05 105.17 0.00 3.12 60576.00 0.00
termdef 8 0.14 0.00 0.00 182.00 0.00 50.00 80960.00 0.00
acctmerg 24 0.14 0.00 0.01 74.14 0.00 17.50 49956.00 0.00
chmod 65 0.13 0.00 0.00 121.50 0.00 33.33 50596.00 0.00
pr 20 0.12 0.00 0.01 155.33 0.00 6.52 83028.00 0.00
ibmelh.p 4 0.12 0.00 0.03 76.50 0.00 5.88 41760.00 0.00
uname 13 0.09 0.00 0.01 89.00 0.00 15.38 91318.00 0.00

Sat May 10 21:41:49 EST 1997 LAST LOGIN Page 1

00-00-00 guest
00-00-00 jwpierce
97-05-09 enwilson
00-00-00 lpd
97-05-09 bmwood
97-05-09 bmwood2
00-00-00 nuucp
97-05-09 sawood
97-05-10 root
00-00-00 servdir
97-05-09 tswilson

Summary

This chapter explained the basics of UNIX system accounting, provided a list of commands and their respective definitions, and supplied configuration procedures for HP-UX 10.X and IBM AIX 4.2. The system accounting directory structure was discussed, and accounting reports were defined and generated.
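
As a worked check on the HOG FACTOR column in the command summary reports, the following minimal sketch recomputes it from the TOTALS row of the daycms sample. The function name hog_factor is hypothetical. The multiplication by 100 is an assumption inferred from the sample figures (0.06 CPU minutes over 21.91 elapsed minutes is reported as 0.28, not 0.0028), and printing 0.00 when the elapsed time is zero is simply a choice made here.

```shell
#!/bin/sh
# hog_factor CPU_MIN REAL_MIN
# Prints total CPU time divided by elapsed time, scaled by 100,
# with two decimals. The x100 scaling is an assumption inferred
# from the sample reports, not a documented rule.
hog_factor() {
    awk -v cpu="$1" -v real="$2" 'BEGIN {
        if (real + 0 == 0) { print "0.00"; exit }   # avoid dividing by zero
        printf "%.2f\n", (cpu / real) * 100
    }'
}

# TOTALS row of the daycms sample: 0.06 CPU-MIN, 21.91 REAL-MIN.
hog_factor 0.06 21.91
```

The result is close to, but not exactly, the 0.28 printed in the report, presumably because the report computes the ratio from unrounded totals while only two-decimal values are available here.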
This information may be used by a number of individuals to:

● Establish an equitable charge-back system
● Monitor overall system resource usage
● Provide a basis for establishing resource quota requirements
● Provide information to management for cost justification
● Forecast

This chapter has provided basic information for the systems administrator to implement systems accounting. Used properly, the information can be of great value in helping to manage current resources in a fair manner for all users and processes. This information will also help justify, from a cost basis, future computer purchases.

©Copyright, Macmillan Computer Publishing. All rights reserved.

- 22 -
Performance Monitoring

By Ronald Rose; edited by Chris Byers

Chapter 21, "System Accounting," teaches about the UNIX accounting system, and the tools that the accounting system provides. Some of these utilities and reports give you information about system utilization and performance. Some of these can be used when investigating performance problems.

In this portion of the book, you will learn all about performance monitoring. There is a series of commands that enable system administrators, programmers, and users to examine each of the resources that a UNIX system uses. By examining these resources you can determine if the system is operating properly or poorly. More important than the commands themselves, you will also learn strategies and procedures that can be used to search for performance problems. Armed with both the commands and the overall methodologies with which to use them, you will understand the factors that are affecting system performance, and what can be done to optimize them so that the system performs at its best.
Although this chapter is helpful for users, it is particularly directed at new system administrators who are actively involved in keeping the system they depend on healthy, or trying to diagnose what has caused its performance to deteriorate.

This chapter introduces several new tools to use in your system investigations. The sequence of the chapter is not based on particular commands. It is instead based on the steps and the strategies that you will use during your performance investigations. In other words, the chapter is organized to mirror the logical progression that a system administrator uses to determine the state of the overall system and the status of each of its subsystems.

You will frequently start your investigations by quickly looking at the overall state of the system load, as described in the section "Monitoring the Overall System Status." To do this you see how the commands uptime and sar can be used to examine the system load and the general level of Central Processing Unit (CPU) loading. You also see how tools such as SunOS's perfmeter can be helpful in gaining a graphic, high-level view of several components at once.

Next, in the section "Monitoring Processes with ps," you learn how ps can be used to determine the characteristics of the processes that are running on your system. This is a natural next step after you have determined that the overall system status reflects a heavier-than-normal loading. You will learn how to use ps to look for processes that are consuming inordinate amounts of resources and the steps to take after you have located them.

After you have looked at the snapshot of system utilization that ps gives you, you may well have questions about how to use the memory or disk subsystems.
So, in the next section, "Monitoring Memory Utilization," you learn how to monitor memory performance with tools such as vmstat and sar, and how to detect when paging and swapping have become excessive (thus indicating that memory must be added to the system).

In the section "Monitoring Disk Subsystem Performance," you see how tools such as iostat, sar, and df can be used to monitor disk Input/Output (I/O) performance. You will see how to determine when your disk subsystem is unbalanced and what to do to alleviate disk performance problems.

After the section on disk I/O performance is a related section on network performance. (It is related to the disk I/O discussion because of the prevalent use of networks to provide extensions of local disk service through such facilities as NFS.) Here you learn to use netstat, nfsstat, and spray to determine the condition of your network.

This is followed by a brief discussion of CPU performance monitoring, and finally a section on kernel tuning. In this final section, you will learn about the underlying tables that reside within the UNIX operating system and how they can be tuned to customize your system's UNIX kernel and optimize its use of resources.

You have seen before in this book that the diversity of UNIX systems makes it important to check each vendor's documentation for specific details about their particular implementation. The same thing applies here as well. Furthermore, modern developments such as symmetric multiprocessor support and relational databases add new characteristics and problems to the challenge of performance monitoring. These are touched on briefly in the discussions that follow.

Performance and Its Impact on Users

Before you get into the technical side of UNIX performance monitoring, there are a few guidelines that can help system administrators avoid performance problems and maximize their overall effectiveness.
All too typically, the UNIX system administrator learns about performance when there is a critical problem with the system. Perhaps the system is taking too long to process jobs or is far behind on the number of jobs that it normally processes. Perhaps the response times for users have deteriorated to the point where users are becoming distracted and unproductive (which is a polite way of saying frustrated and angry!). In any case, if the system isn't actually failing to help its users attain their particular goals, it is at least failing to meet their expectations.

It may seem obvious that when user productivity is being affected, money and time, and sometimes a great deal of both, are being lost. Simple measurements of the amount of time lost can often provide the cost justification for upgrades to the system. In this chapter you learn how to identify which components of the system are the best candidates for such an upgrade. (If you think people were unhappy to begin with, try talking to them after an expensive upgrade has produced no discernible improvement in performance!)

Often, it is only when users begin complaining that people begin to examine the variables that are affecting performance. This in itself is somewhat of a problem. The system administrator should have a thorough understanding of the activities on the system before users are affected by a crisis. The administrator should know the characteristics of each group of users on the system. This includes the type of work that they submit while they are present during the day, as well as the jobs that are to be processed during the evening. What is the size of the CPU requirement, the I/O requirement, and the memory requirement of the most frequently occurring and/or the most important jobs? What impact do these jobs have on the networks connected to the machine? Also important is the time-sensitivity of the jobs, the classic example being payrolls that must be completed by a given time and date.
These profiles of system activity and user requirements can help the system administrator acquire a holistic understanding of the activity on the system. That knowledge will not only be of assistance if there is a sudden crisis in performance, but also if there is a gradual erosion of it.

Conversely, if the system administrator has not compiled a profile of the various user groups, and examined the underlying loads that they impose on the system, the administrator will be at a serious disadvantage in an emergency when it comes to figuring out where all the CPU cycles, or memory, have gone. This chapter examines the tools that can be used to gain this knowledge, and demonstrates their value.

Finally, although all users may have been created equal, the work of some users inevitably will have more impact on corporate profitability than the work of other users. Perhaps, given UNIX's academic heritage, running the system in a completely democratic manner should be the goal of the system administrator. However, the system administrator will sooner or later find out, either politely or painfully, who the most important and the most influential groups are. This set of characteristics should also somehow be factored into the user profiles the system administrator develops before the onset of crises, which by their nature obscure the reasoning process of all involved.

Introduction to UNIX Performance

While the system is running, UNIX maintains several counters to keep track of critical system resources. The relevant resources that are tracked are the following:

● CPU utilization
● Buffer usage
● Disk I/O activity
● Tape I/O activity
● Terminal activity
● System call activity
● Context switching activity
● File access utilization
● Queue activity
● Interprocess communication (IPC)
● Paging activity
● Free memory and swap space
● Kernel memory allocation (KMA)
● Kernel tables
● Remote file sharing (RFS)

By looking at reports based on these counters you can determine how the three major subsystems are performing.
These subsystems are the following:

CPU
    The CPU processes instructions and programs. Each time you submit a job to the system, it makes demands on the CPU. Usually, the CPU can service all demands in a timely manner. However, there is only so much available processing power, which must be shared by all users and the internal programs of the operating system, too.
Memory
    Every program that runs on the system makes some demand on the physical memory on the machine. Like the CPU, it is a finite resource. When the active processes and programs that are running on the system request more memory than the machine actually has, paging is used to move parts of the processes to disk and reclaim their memory pages for use by other processes. If further shortages occur, the system may also have to resort to swapping, which moves entire processes to disk to make room.
I/O
    The I/O subsystem(s) transfers data into and out of the machine. I/O subsystems comprise devices such as disks, printers, terminals/keyboards, and other relatively slow devices, and are a common source of resource contention problems. In addition, there is a rapidly increasing use of network I/O devices. When programs are doing a lot of I/O, they can get bogged down waiting for data from these devices. Each subsystem has its own limitations with respect to the bandwidth that it can effectively use for I/O operations, as well as its own peculiar problems.

Performance monitoring and tuning is not always an exact science. In the displays that follow, there is a great deal of variety in the system/subsystem loadings, even for the small sample of systems used here. In addition, different user groups have widely differing requirements. Some users will put a strain on the I/O resources, some on the CPU, and some will stress the network. Performance tuning is always a series of trade-offs. As you will see, increasing the kernel size to alleviate one problem may aggravate memory utilization.
Increasing NFS performance to satisfy one set of users may reduce performance in another area and thereby aggravate another set of users. The goal of the task is often to find an optimal compromise that will satisfy the majority of user and system resource needs.

Monitoring the Overall System Status

The examination of specific UNIX performance monitoring techniques begins with a look at three basic tools that give you a snapshot of the overall performance of the system. After getting this high-level view, you will normally proceed to examine each of the subsystems in detail.

Monitoring System Status Using uptime

One of the simplest reports that you use to monitor UNIX system performance measures the number of processes in the UNIX run queue during given intervals. It comes from the command uptime. It is both a high-level view of the system's workload and a handy starting place when the system seems to be performing slowly. In general, processes in the run queue are active programs (that is, not sleeping or waiting) that require system resources. Here is an example:

% uptime
2:07pm up 11 day(s), 4:54, 15 users, load average: 1.90, 1.98, 2.01

The useful parts of the display are the three load-average figures. The 1.90 load average was measured over the last minute. The 1.98 average was measured over the last 5 minutes. The 2.01 load average was measured over the last 15 minutes.

TIP: What you are usually looking for is the trend of the averages. This particular example shows a system that is under a fairly consistent load. However, if a system is having problems, but the load averages seem to be declining steadily, then you may want to wait a while before you take any action that might affect the system and possibly inconvenience users. While you are doing some ps commands to determine what caused the problem, the imbalance may correct itself.

NOTE: uptime has certain limitations.
For example, high-priority jobs are not distinguished from low-priority jobs although their impact on the system can be much greater.

Run uptime periodically and observe both the numbers and the trend. When there is a problem it will often show up here, and tip you off to begin serious investigations. As system loads increase, more demands will be made on your memory and I/O subsystems, so keep an eye out for paging, swapping, and disk inefficiencies. System loads of 2 or 3 usually indicate light loads. System loads of 5 or 6 are usually medium-grade loads. Loads above 10 are often heavy loads on large UNIX machines. However, there is wide variation among types of machines as to what constitutes a heavy load. Therefore, the best technique is the one mentioned above: sample your system regularly until you have your own reference for light, medium, and heavy loads.

Monitoring System Status Using perfmeter

Because the goal of this first section is to give you the tools to view your overall system performance, a brief discussion of graphical performance meters is appropriate. Sun Solaris users are provided with an OpenWindows XView tool called perfmeter, which summarizes overall system performance values in multiple dials or strip charts. Strip charts are the default. Not all UNIX systems come with such a handy tool. That's too bad because in this case a picture is worth, if not a thousand words, at least 30 or 40 man pages. In this concise format, you get information about the system resources shown in Table 22.1:

Table 22.1. System resources and their descriptions.
Resources  Description
cpu        Percent of CPU being utilized
pkts       Ethernet activity, in packets per second
page       Paging, in pages per second
swap       Jobs swapped per second
intr       Number of device interrupts per second
disk       Disk traffic, in transfers per second
cntxt      Number of context switches per second
load       Average number of runnable processes over the last minute
colls      Collisions per second detected on the Ethernet
errs       Errors per second on receiving packets

The charts of the perfmeter are not a source for precise measurements of subsystem performance, but they are graphic representations of them. However, the chart can be very useful for monitoring several aspects of the system at the same time. When you start a particular job, the graphics can demonstrate the impact of that job on the CPU, on disk transfers, and on paging. Many developers like to use the tool to assess the efficiency of their work for this very reason. Likewise, system administrators use the tool to get valuable clues about where to start their investigations. As an example, when faced with intermittent and transitory problems, glancing at a perfmeter and then going directly to the proper display may increase the odds that you can catch in the act the process that is degrading the system.

The scale value for the strip chart changes automatically when the chart refreshes to accommodate increasing or decreasing values on the system. You add values to be monitored by clicking the right mouse button and selecting from the menu. From the same menu you can select properties, which will let you modify what the perfmeter is monitoring, the format (dials/graphs, direction of the displays, and solid/lined display), remote/local machine choice, and the frequency of the display.

You can also set a ceiling value for a particular strip chart. If the value goes beyond the ceiling value, this portion of the chart will be displayed in red.
Thus, a system administrator who knows that someone is periodically running a job that eats up all the CPU or memory can set a ceiling that signals when the job may be running again. The system administrator can also use this to monitor the condition of critical values from several feet away from the monitor. If he or she sees red, other users may be seeing red, too.

The perfmeter is a utility provided with SunOS. You should check your own particular UNIX operating system to determine if similar performance tools are provided.

Monitoring System Status Using sar -q

If your machine does not support uptime, there is an option for sar that can provide the same type of quick, high-level snapshot of the system. The -q option reports the average queue length and the percentage of time that the queue is occupied.

% sar -q 5 5
07:28:37 runq-sz %runocc swpq-sz %swpocc
[...]

[...]

Here is an example:

% sar -v 5 5
18:51:12  proc-sz   ov  inod-sz    ov  file-sz  ov  lock-sz
18:51:17  122/4058  0   3205/4000  0   488/0    0   11/0
18:51:22  122/4058  0   3205/4000  0   488/0    0   11/0
18:51:27  122/4058  0   3205/4000  0   488/0    0   11/0
18:51:32  122/4058  0   3205/4000  0   488/0    0   11/0
18:51:37  122/4058  0   3205/4000  0   488/0    0   11/0

Since all the ov fields are 0, you can see that the system tables are healthy for this [...]

[...] 34
0 3 0 869384 29660 0 0 0 0 0 0 0 0 4 63
0 2 0 869432 29704 0 0 0 0 0 0 0 4 3 64
0 3 0 869448 29696 0 0 0 0 0 0 0 0 3 65
0 3 0 869384 29684 0 0 0 0 0 0 0 1 3 68
0 2 0 869188 29644 0 0 0 2 2 0 0 2 3 65
0 3 0 869176 29612 0 0 0 0 0 0 0 0 3 61
0 2 0 869156 29600 0 0 0 0 0 0 0 0 3 69
s5: 12 15 11 13 18 10 16 8
faults (in sy): 366 1396, 514 10759, 490 2458, 464 2528, 551 2555, 432 2495, 504 2527, 438 15820
cs: 675 [...]

[...]

kbytes   used     avail    capacity  Mounted on
38111    21173    13128    62%       /
246167   171869   49688    78%       /usr
0        0        0        0%        /proc
0        0        0        0%        /dev/fd
860848   632      860216   0%        /tmp
188247   90189    79238    53%       /home
492351   179384   263737   40%       /opt
77863    47127    22956    67%       /home/met

From this display you can see the following information (all entries are in KB):

kbytes
    Total size of usable space in file system (size is adjusted [...]
used, avail, capacity
    [...]

[...] Name: unix.o, vhwb_nextset, _intr_flag_table
[18]|3758124096|   0|NOTY |LOCL |0 |1
[19]|3758121436|   0|NOTY |LOCL |0 |1
[20]|3758121040|   0|NOTY |LOCL |0 |1
[21]|3758121340|   0|NOTY |LOCL |0 |1
[22]|3758124768|   0|NOTY |LOCL |0 |1
[23]|3758124144|   0|NOTY |LOCL |0 |1
[24]|3758124796|   0|NOTY |LOCL |0 |1
[25]|3758116924|   0|NOTY |LOCL |0 |1
[26]|3758121100| 132|NOTY |LOCL |0 |1
[27]|3758118696|   0|NOTY [...]

[...] any parameters:

% netstat
TCP
Local Address  Remote Address  Swind  Send-Q  Rwind  Recv-Q  State
AAA1.1023      bbb2.login      8760   0       8760   0       ESTABLISHED
AAA1.listen    Cccc.32980      8760   0       8760   0       ESTABLISHED
AAA1.login     Dddd.1019       8760   0       8760   0       ESTABLISHED
AAA1.32782     AAA1.32774      16384  0       16384  0       ESTABLISHED

In the report, the important field is the Send-Q field, which indicates the [...]

[...] directory. Next, you move the present /stand/system file to /stand/system.prev; then you can move the modified file /stand/build/system to /stand/system. Then you move the currently running kernel /stand/vmunix to /stand/vmunix.prev, and then move the new kernel, /stand/build/vmunix.test, into place in /stand/vmunix (i.e., mv /stand/build/vmunix.test /stand/vmunix). The final step is to reboot the machine [...]

[...]
                                                                  cpu
tin tout  Kps tps serv  Kps tps serv  Kps tps serv  Kps tps serv  us sy wt id
  0   26    8   1   57   36   4   20   77  34   24   31  12   30  14  9 47 30
  0   51    0   0    0    0   0    0  108  54   36    0   0    0  14  7 78  0
  0   47   72  10  258    0   0    0  102  51   38    0   0    0  15  9 76  0
  0   58    5   1    9    1   1   23  112  54   33    0   0    0  14  8 77  1
  0   38    0   0    0   25   0   90  139  70   17    9   4   25  14  8 73  6
  0   43    0   0    0  227  10   23  127  62   32   45  21   20  20 15 65  0

The first line of the report shows the statistics [...]

[...]
 F S  UID   PID  PPID   C PRI NI     ADDR    SZ    WCHAN TTY        TIME COMD
19 T    0     0     0  80   0 SY e00ec978     0          ?          0:01 sched
19 S    0     2     0  80   0 SY f5735000     0 e00eacdc ?          0:05 pageout
 8 S 1001  1382     1  80  40 20 f5c6a000  1227 e00f887c console    0:02 mailtool
 8 S 1001  1386     1  80  40 20 f60ed000   819 e00f887c console    0:28 perfmete
 8 S 1001 28380 28377  80  40 20 f67c0000  5804 f5cfd146 ?         85:02 sqlturbo
 8 S 1001 28373     1  80  40 20 f63c6000  1035 f63c61c8 ?          0:07 cdrl_mai
 8 S 1001 28392     1  80  40 20 f67ce800  1035 f67ce9c8 ?          0:07 cdrl_mai
 8 S 1001 28391 28388  80  40 20 f690a800  5804 f60dce46 ?        166:39 sqlturbo
 8 S 1001 28361     1  80  60 20 f67e1000 30580 e00f887c ?        379:35 mhdms
 8 S 1001 28360     1  80  40 20 f68e1000 12565 e00f887c ?        182:22 mhharris
 8 O 1001 10566 10512  19  70 20 f6abb800   152          pts/14     0:00 ps
 8 S 1001 28388     1  80  40 20 f6384800   216 f60a0346 ?         67:51 db_write
 8 S 1000  7750  7749  80  40 20 f6344800  5393 f5dad02c pts/2     31:47 tbinit
 8 O 1001  9538  9537  80  81 22 f6978000  5816          ?        646:57 sqlturbo
 8 S 1033  3735  3734 164  40 20 f63b8800   305 f60e0d46 pts/9      0:00 ksh
 8 S 1033  5228  5227  80  50 20 f68a8800   305 f60dca46 pts/7      0:00 ksh
 8 S 1001 28337     1  80  99 20 f6375000 47412 f63751c8 ?       1135:50 velox_ga

The following are tips for using ps to determine why system performance [...]
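
In this excerpt the ps listing breaks off before the author's tips for reading it. As a minimal sketch only, and not the author's procedure, one way to look for processes that have consumed an inordinate amount of CPU is to rank the rows by the TIME column. The function name rank_by_cpu_time is hypothetical. The script assumes the long-format column set seen in the listing (F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME COMD), assumes TIME has the form MM:SS, and uses four rows from the listing with the stray spaces inside numbers removed.

```shell
#!/bin/sh
# Rank long-format ps rows by cumulative CPU time, largest first.
# Assumptions: TIME is the next-to-last field in MM:SS form, the
# command name is the last field, and PID is the fourth field.
rank_by_cpu_time() {
    awk '
        $1 == "F" { next }                      # skip the header row
        {
            # Count fields from the end because WCHAN can be blank.
            if (split($(NF - 1), t, ":") != 2) next
            printf "%d %s %s\n", t[1] * 60 + t[2], $4, $NF
        }
    ' | sort -k1,1nr
}

# Prints one line per process: CPU seconds, PID, command.
rank_by_cpu_time <<'EOF'
 F S  UID   PID  PPID  C PRI NI     ADDR    SZ    WCHAN TTY      TIME COMD
 8 S 1001 28380 28377 80  40 20 f67c0000  5804 f5cfd146 ?       85:02 sqlturbo
 8 S 1001 28373     1 80  40 20 f63c6000  1035 f63c61c8 ?        0:07 cdrl_mai
 8 O 1001 10566 10512 19  70 20 f6abb800   152          pts/14   0:00 ps
 8 S 1001 28361     1 80  60 20 f67e1000 30580 e00f887c ?      379:35 mhdms
EOF
```

Reading fields from the end of each row is a deliberate choice: a running process (state O) has an empty WCHAN column, so counting from the left would pick up the wrong field on those rows.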

Date posted: 14/08/2014, 02:22
