Scheduling: The Multi-Level Feedback Queue
In this chapter, we’ll tackle the problem of developing one of the most well-known approaches to scheduling, known as the Multi-Level Feedback Queue (MLFQ). The Multi-Level Feedback Queue (MLFQ) scheduler was first described by Corbato et al. in 1962 [C+62] in a system known as the Compatible Time-Sharing System (CTSS), and this work, along with later work on Multics, led the ACM to award Corbato its highest honor, the Turing Award. The scheduler has subsequently been refined throughout the years to the implementations you will encounter in some modern systems.

The fundamental problem MLFQ tries to address is two-fold. First, it would like to optimize turnaround time, which, as we saw in the previous note, is done by running shorter jobs first; unfortunately, the OS doesn’t generally know how long a job will run for, exactly the knowledge that algorithms like SJF (or STCF) require. Second, MLFQ would like to make a system feel responsive to interactive users (i.e., users sitting and staring at the screen, waiting for a process to finish), and thus minimize response time; unfortunately, algorithms like Round Robin reduce response time but are terrible for turnaround time. Thus, our problem: given that we in general do not know anything about a process, how can we build a scheduler to achieve these goals? How can the scheduler learn, as the system runs, the characteristics of the jobs it is running, and thus make better scheduling decisions?
THE CRUX:
HOW TO SCHEDULE WITHOUT PERFECT KNOWLEDGE?
How can we design a scheduler that both minimizes response time for interactive jobs while also minimizing turnaround time without a priori knowledge of job length?
TIP: LEARN FROM HISTORY
The multi-level feedback queue is an excellent example of a system that learns from the past to predict the future. Such approaches are common in operating systems (and many other places in Computer Science, including hardware branch predictors and caching algorithms). Such approaches work when jobs have phases of behavior and are thus predictable; of course, one must be careful with such techniques, as they can easily be wrong and drive a system to make worse decisions than it would have with no knowledge at all.
8.1 MLFQ: Basic Rules
To build such a scheduler, in this chapter we will describe the basic algorithms behind a multi-level feedback queue; although the specifics of many implemented MLFQs differ [E95], most approaches are similar.

In our treatment, the MLFQ has a number of distinct queues, each assigned a different priority level. At any given time, a job that is ready to run is on a single queue. MLFQ uses priorities to decide which job should run at a given time: a job with higher priority (i.e., a job on a higher queue) is chosen to run.

Of course, more than one job may be on a given queue, and thus have the same priority. In this case, we will just use round-robin scheduling among those jobs.

Thus, the key to MLFQ scheduling lies in how the scheduler sets priorities. Rather than giving a fixed priority to each job, MLFQ varies the priority of a job based on its observed behavior. If, for example, a job repeatedly relinquishes the CPU while waiting for input from the keyboard, MLFQ will keep its priority high, as this is how an interactive process might behave. If, instead, a job uses the CPU intensively for long periods of time, MLFQ will reduce its priority. In this way, MLFQ will try to learn about processes as they run, and thus use the history of the job to predict its future behavior.
Thus, we arrive at the first two basic rules for MLFQ:
• Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).
• Rule 2: If Priority(A) = Priority(B), A & B run in RR.
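To make these two rules concrete, here is a minimal sketch in Python of how the scheduler might pick the next job to run; the three-queue setup and the names (queues, pick_next) are invented purely for illustration and are not from any real kernel.

    from collections import deque

    NUM_QUEUES = 3   # illustrative; real MLFQs vary widely in queue count

    # queues[0] is the topmost (highest-priority) queue
    queues = [deque() for _ in range(NUM_QUEUES)]

    def pick_next():
        """Rule 1: scan from the highest-priority queue downward and run the
        first job found. Rule 2: each queue is a FIFO; the caller pops a job
        from the front and re-appends it to the back when its slice ends, so
        equal-priority jobs share the CPU in round-robin fashion."""
        for q in queues:
            if q:
                return q.popleft()
        return None   # nothing is ready to run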
If we were to put forth a picture of what the queues might look like at a given instant, we might see something like the following (Figure 8.1). In the figure, two jobs (A and B) are at the highest priority level, while job C is in the middle and Job D is at the lowest priority. Given our current knowledge of how MLFQ works, the scheduler would just alternate time slices between A and B because they are the highest priority jobs in the system; poor jobs C and D would never even get to run — an outrage!

[Figure 8.1: MLFQ Example (eight queues, from Q1 at the lowest priority up to Q8 at the highest)]
Of course, just showing a static snapshot of some queues does not really give you an idea of how MLFQ works. What we need is to understand how job priority changes over time. And that, in a surprise only to those who are reading a chapter from this book for the first time, is exactly what we will do next.
8.2 Attempt #1: How To Change Priority
We now must decide how MLFQ is going to change the priority level of a job (and thus which queue it is on) over the lifetime of a job. To do this, we must keep in mind our workload: a mix of interactive jobs that are short-running (and may frequently relinquish the CPU), and some longer-running “CPU-bound” jobs that need a lot of CPU time but where response time isn’t important. Here is our first attempt at a priority-adjustment algorithm:
• Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
• Rule 4a: If a job uses up an entire time slice while running, its priority is reduced (i.e., it moves down one queue).
• Rule 4b: If a job gives up the CPU before the time slice is up, it stays at the same priority level.
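As a rough sketch of these three rules in the same toy Python style (the Job fields and handler names are invented for illustration; a real scheduler would of course be driven by timer interrupts and I/O events):

    from collections import deque
    from dataclasses import dataclass

    NUM_QUEUES = 3        # e.g., a three-queue scheduler, as in Figure 8.2
    TIME_SLICE_MS = 10    # assume a 10 ms quantum at every level, for now

    @dataclass
    class Job:
        name: str
        priority: int = 0   # 0 is the highest priority (topmost queue)

    queues = [deque() for _ in range(NUM_QUEUES)]

    def on_arrival(job):
        """Rule 3: a newly arriving job starts at the highest priority."""
        job.priority = 0
        queues[0].append(job)

    def on_slice_expired(job):
        """Rule 4a: the job used its entire time slice, so it moves down one queue."""
        job.priority = min(job.priority + 1, NUM_QUEUES - 1)
        queues[job.priority].append(job)

    def on_voluntary_yield(job):
        """Rule 4b: the job gave up the CPU early (e.g., to wait for I/O),
        so it stays at its current priority level."""
        queues[job.priority].append(job)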
Example 1: A Single Long-Running Job
Let’s look at some examples. First, we’ll look at what happens when there has been a long-running job in the system. Figure 8.2 shows what happens to this job over time in a three-queue scheduler.
[Figure 8.2: Long-running Job Over Time (queues Q2, Q1, Q0)]
As you can see in the example, the job enters at the highest priority (Q2). After a single time-slice of 10 ms, the scheduler reduces the job’s priority by one, and thus the job is on Q1. After running at Q1 for a time slice, the job is finally lowered to the lowest priority in the system (Q0), where it remains. Pretty simple, no?
Example 2: Along Came A Short Job
Now let’s look at a more complicated example, and hopefully see how MLFQ tries to approximate SJF. In this example, there are two jobs: A, which is a long-running CPU-intensive job, and B, which is a short-running interactive job. Assume A has been running for some time, and then B arrives. What will happen? Will MLFQ approximate SJF for B?

Figure 8.3 plots the results of this scenario. A (shown in black) is running along in the lowest-priority queue (as would any long-running CPU-intensive jobs); B (shown in gray) arrives at time T = 100, and thus is inserted into the highest queue; as its run-time is short (only 20 ms), B completes before reaching the bottom queue, in two time slices; then A resumes running (at low priority).

[Figure 8.3: Along Came An Interactive Job (queues Q2, Q1, Q0)]
From this example, you can hopefully understand one of the major goals of the algorithm: because it doesn’t know whether a job will be a short job or a long-running job, it first assumes it might be a short job, thus giving the job high priority. If it actually is a short job, it will run quickly and complete; if it is not a short job, it will slowly move down the queues, and thus soon prove itself to be a long-running, more batch-like process. In this manner, MLFQ approximates SJF.
Example 3: What About I/O?
Let’s now look at an example with some I/O. As Rule 4b states above, if a process gives up the processor before using up its time slice, we keep it at the same priority level. The intent of this rule is simple: if an interactive job, for example, is doing a lot of I/O (say by waiting for user input from the keyboard or mouse), it will relinquish the CPU before its time slice is complete; in such case, we don’t wish to penalize the job and thus simply keep it at the same level.
Figure 8.4 shows an example of how this works, with an interactive job B (shown in gray) that needs the CPU only for 1 ms before performing an I/O competing for the CPU with a long-running batch job A (shown in black). The MLFQ approach keeps B at the highest priority because B keeps releasing the CPU; if B is an interactive job, MLFQ further achieves its goal of running interactive jobs quickly.

[Figure 8.4: A Mixed I/O-intensive and CPU-intensive Workload]
Problems With Our Current MLFQ
We thus have a basic MLFQ. It seems to do a fairly good job, sharing the CPU fairly between long-running jobs, and letting short or I/O-intensive interactive jobs run quickly. Unfortunately, the approach we have developed thus far contains serious flaws. Can you think of any?

(This is where you pause and think as deviously as you can.)
First, there is the problem of starvation: if there are “too many” interactive jobs in the system, they will combine to consume all CPU time, and thus long-running jobs will never receive any CPU time (they starve). We’d like to make some progress on these jobs even in this scenario.
Second, a smart user could rewrite their program to game the scheduler. Gaming the scheduler generally refers to the idea of doing something sneaky to trick the scheduler into giving you more than your fair share of the resource. The algorithm we have described is susceptible to the following attack: before the time slice is over, issue an I/O operation (to some file you don’t care about) and thus relinquish the CPU; doing so allows you to remain in the same queue, and thus gain a higher percentage of CPU time. When done right (e.g., by running for 99% of a time slice before relinquishing the CPU), a job could nearly monopolize the CPU.

Finally, a program may change its behavior over time; what was CPU-bound may transition to a phase of interactivity. With our current approach, such a job would be out of luck and not be treated like the other interactive jobs in the system.
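To make the attack concrete, here is a rough sketch in Python of what such a gaming loop might look like from the application’s side; the 10 ms quantum and the scratch-file path are invented for illustration, and the timing would have to be tuned to the actual scheduler.

    import os
    import time

    QUANTUM = 0.010   # assume a 10 ms time slice (hypothetical value)

    # Under Rules 4a/4b, a job that yields before its slice expires keeps its
    # priority. So: burn almost the whole slice on real work, then issue a
    # throwaway I/O so the yield looks "voluntary". Runs forever by design.
    scratch = open("/tmp/dont_care.txt", "w")
    while True:
        start = time.monotonic()
        while time.monotonic() - start < 0.99 * QUANTUM:
            pass                          # stand-in for CPU-bound work
        scratch.write("x")                # tiny I/O just before the slice ends
        scratch.flush()
        os.fsync(scratch.fileno())        # block briefly, relinquishing the CPU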
8.3 Attempt #2: The Priority Boost
Let’s try to change the rules and see if we can avoid the problem of starvation. What could we do in order to guarantee that CPU-bound jobs will make some progress (even if it is not much)?
The simple idea here is to periodically boost the priority of all the jobs in the system. There are many ways to achieve this, but let’s just do something simple: throw them all in the topmost queue; hence, a new rule:
• Rule 5: After some time period S, move all the jobs in the system to the topmost queue.
Our new rule solves two problems at once. First, processes are guaranteed not to starve: by sitting in the top queue, a job will share the CPU with other high-priority jobs in a round-robin fashion, and thus eventually receive service. Second, if a CPU-bound job has become interactive, the scheduler treats it properly once it has received the priority boost.
Let’s see an example. In this scenario, we just show the behavior of a long-running job when competing for the CPU with two short-running interactive jobs. Two graphs are shown in Figure 8.5. On the left, there is no priority boost, and thus the long-running job gets starved once the two short jobs arrive; on the right, there is a priority boost every 50 ms (which is likely too small of a value, but used here for the example), and thus we at least guarantee that the long-running job will make some progress, getting boosted to the highest priority every 50 ms and thus getting to run periodically.

[Figure 8.5: Without (Left) and With (Right) Priority Boost]
Of course, the addition of the time period S leads to the obvious question: what should S be set to? John Ousterhout, a well-regarded systems researcher [O11], used to call such values in systems voo-doo constants, because they seemed to require some form of black magic to set them correctly. Unfortunately, S has that flavor. If it is set too high, long-running jobs could starve; too low, and interactive jobs may not get a proper share of the CPU.
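As a sketch of how Rule 5 might be wired into the toy Python scheduler above (the 50 ms boost period is taken from the example; the function and variable names are invented for illustration):

    import time
    from collections import deque

    NUM_QUEUES = 3
    BOOST_PERIOD_S = 0.050   # "S" from Rule 5; 50 ms, as in the example above

    queues = [deque() for _ in range(NUM_QUEUES)]   # queues[0] = topmost
    last_boost = time.monotonic()

    def maybe_boost():
        """Rule 5: every S seconds, move every job back to the topmost queue."""
        global last_boost
        now = time.monotonic()
        if now - last_boost < BOOST_PERIOD_S:
            return
        last_boost = now
        for q in queues[1:]:           # drain every lower queue...
            while q:
                job = q.popleft()
                job.priority = 0
                queues[0].append(job)  # ...and re-insert each job at the top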
8.4 Attempt #3: Better Accounting
We now have one more problem to solve: how to prevent gaming of our scheduler? The real culprits here, as you might have guessed, are Rules 4a and 4b, which let a job retain its priority by relinquishing the CPU before the time slice expires. So what should we do?

The solution here is to perform better accounting of CPU time at each level of the MLFQ. Instead of forgetting how much of a time slice a process used at a given level, the scheduler should keep track; once a process has used its allotment, it is demoted to the next priority queue. Whether it uses the time slice in one long burst or many small ones does not matter. We thus rewrite Rules 4a and 4b to the following single rule:
• Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced (i.e., it moves down one queue).
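A minimal sketch of the accounting behind the new Rule 4, again in the toy Python setting (the allotment value and field names are invented for illustration):

    from collections import deque
    from dataclasses import dataclass

    NUM_QUEUES = 3
    ALLOTMENT_MS = 10    # total CPU time a job may consume at one level (illustrative)

    @dataclass
    class Job:
        name: str
        priority: int = 0   # 0 = topmost queue
        used_ms: int = 0    # CPU time consumed so far at the current level

    queues = [deque() for _ in range(NUM_QUEUES)]

    def charge(job, ran_ms):
        """Rule 4: charge every run against the allotment, no matter how short.
        Once the allotment is exhausted, demote the job and reset its account."""
        job.used_ms += ran_ms
        if job.used_ms >= ALLOTMENT_MS:
            job.priority = min(job.priority + 1, NUM_QUEUES - 1)
            job.used_ms = 0
        queues[job.priority].append(job)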
Let’s look at an example. Figure 8.6 shows what happens when a workload tries to game the scheduler with the old Rules 4a and 4b (on the left) as well as the new anti-gaming Rule 4 (on the right). Without any protection from gaming, a process can issue an I/O just before a time slice ends and thus dominate CPU time. With such protections in place, regardless of the I/O behavior of the process, it slowly moves down the queues, and thus cannot gain an unfair share of the CPU.

[Figure 8.6: Without (Left) and With (Right) Gaming Tolerance]
8.5 Tuning MLFQ And Other Issues
A few other issues arise with MLFQ scheduling. One big question is how to parameterize such a scheduler. For example, how many queues should there be? How big should the time slice be per queue? How often should priority be boosted in order to avoid starvation and account for changes in behavior? There are no easy answers to these questions, and thus only some experience with workloads and subsequent tuning of the scheduler will lead to a satisfactory balance.
For example, most MLFQ variants allow for varying time-slice length across different queues. The high-priority queues are usually given short time slices; they are comprised of interactive jobs, after all, and thus quickly alternating between them makes sense (e.g., 10 or fewer milliseconds). The low-priority queues, in contrast, contain long-running jobs that are CPU-bound; hence, longer time slices work well (e.g., 100s of ms). Figure 8.7 shows an example in which two long-running jobs run for 10 ms at the highest queue, 20 in the middle, and 40 at the lowest.

[Figure 8.7: Lower Priority, Longer Quanta]
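One way to express this kind of tuning in the toy Python scheduler is simply to make the quantum (and, if desired, the allotment) a per-queue parameter; the 10/20/40 ms quanta below mirror the example, while the allotment column is purely illustrative.

    # Per-level tuning knobs, highest priority first.
    LEVELS = [
        {"quantum_ms": 10, "allotment_ms": 20},   # top queue: interactive-friendly
        {"quantum_ms": 20, "allotment_ms": 40},   # middle queue
        {"quantum_ms": 40, "allotment_ms": 80},   # bottom queue: CPU-bound jobs
    ]

    def quantum_for(priority):
        """Time slice for a job at a given queue level (0 = highest)."""
        return LEVELS[priority]["quantum_ms"]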
TIP: AVOID VOO-DOO CONSTANTS (OUSTERHOUT’S LAW)
Avoiding voo-doo constants is a good idea whenever possible. Unfortunately, as in the example above, it is often difficult. One could try to make the system learn a good value, but that too is not straightforward. The frequent result: a configuration file filled with default parameter values that a seasoned administrator can tweak when something isn’t quite working correctly. As you can imagine, these are often left unmodified, and thus we are left to hope that the defaults work well in the field. This tip brought to you by our old OS professor, John Ousterhout, and hence we call it Ousterhout’s Law.
The Solaris MLFQ implementation — the Time-Sharing scheduling class, or TS — is particularly easy to configure; it provides a set of tables that determine exactly how the priority of a process is altered throughout its lifetime, how long each time slice is, and how often to boost the priority of a job [AD00]; an administrator can muck with this table in order to make the scheduler behave in different ways. Default values for the table are 60 queues, with slowly increasing time-slice lengths from 20 milliseconds (highest priority) to a few hundred milliseconds (lowest), and priorities boosted around every 1 second or so.
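To illustrate the table-driven style (this is not the actual Solaris TS table; the levels, values, and field meanings below are invented for illustration), such a configuration might amount to a lookup table indexed by priority level:

    # A toy dispatch table in the spirit of a table-driven scheduler: one row per
    # priority level, giving the quantum and the level a job moves to when it
    # burns its whole quantum or returns from sleeping. Higher level = higher
    # priority here, with shorter quanta at the top, as described above.
    DISPATCH_TABLE = {
        # level: (quantum_ms, level_after_expiration, level_after_sleep)
        0:  (200,  0, 10),
        10: (160,  0, 20),
        20: (120, 10, 30),
        30: ( 80, 20, 40),
        40: ( 40, 30, 50),
        50: ( 20, 40, 59),
        59: ( 20, 50, 59),
    }

    def on_expiration(level):
        """Demote according to the table when a job uses its whole quantum."""
        return DISPATCH_TABLE[level][1]

    def on_wakeup(level):
        """Raise priority according to the table when a job returns from sleep."""
        return DISPATCH_TABLE[level][2]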
Other MLFQ schedulers don’t use a table or the exact rules described in this chapter; rather, they adjust priorities using mathematical formulae. For example, the FreeBSD scheduler (version 4.3) uses a formula to calculate the current priority level of a job, basing it on how much CPU the process has used [LM+89]; in addition, usage is decayed over time, providing the desired priority boost in a different manner than described herein. See Epema’s paper for an excellent overview of such decay-usage algorithms and their properties [E95].
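As a generic sketch of the decay-usage idea (this is not the FreeBSD 4.3 formula; both constants and the field names are made up), each job’s recent CPU usage might be charged as it runs and periodically decayed, with its priority derived from the decayed usage:

    from dataclasses import dataclass

    NUM_QUEUES = 3
    DECAY = 0.5          # fraction of usage kept at each decay tick (made up)
    USAGE_WEIGHT = 0.1   # how strongly usage pushes priority down (made up)

    @dataclass
    class Job:
        name: str
        priority: int = 0       # 0 = highest priority
        cpu_usage: float = 0.0  # decayed measure of recent CPU time, in ms

    def charge_usage(job, ran_ms):
        """Accumulate recent CPU usage as the job runs."""
        job.cpu_usage += ran_ms

    def decay_tick(jobs):
        """Periodically (say, once per second) decay usage and recompute priority:
        heavier recent CPU users land on lower-priority (higher-numbered) queues,
        while jobs that have been idle drift back toward the top."""
        for job in jobs:
            job.cpu_usage *= DECAY
            job.priority = min(int(job.cpu_usage * USAGE_WEIGHT), NUM_QUEUES - 1)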
Finally, many schedulers have a few other features that you might encounter. For example, some schedulers reserve the highest priority levels for operating system work; thus typical user jobs can never obtain the highest levels of priority in the system. Some systems also allow some user advice to help set priorities; for example, by using the command-line utility nice you can increase or decrease the priority of a job (somewhat) and thus increase or decrease its chances of running at any given time. See the man page for more.
8.6 MLFQ: Summary
We have described a scheduling approach known as the Multi-Level Feedback Queue (MLFQ). Hopefully you can now see why it is called that: it has multiple levels of queues, and uses feedback to determine the priority of a given job. History is its guide: pay attention to how jobs behave over time and treat them accordingly.
TIP: USE ADVICE WHERE POSSIBLE
As the operating system rarely knows what is best for each and every process of the system, it is often useful to provide interfaces to allow users or administrators to provide some hints to the OS. We often call such hints advice, as the OS need not necessarily pay attention to it, but rather might take the advice into account in order to make a better decision. Such hints are useful in many parts of the OS, including the scheduler (e.g., with nice), memory manager (e.g., madvise), and file system (e.g., informed prefetching and caching [P+95]).
The refined set of MLFQ rules, spread throughout the chapter, are reproduced here for your viewing pleasure:
• Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).
• Rule 2: If Priority(A) = Priority(B), A & B run in RR.
• Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
• Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced (i.e., it moves down one queue).
• Rule 5: After some time period S, move all the jobs in the system to the topmost queue.
MLFQ is interesting for the following reason: instead of demanding a priori knowledge of the nature of a job, it observes the execution of a job and prioritizes it accordingly. In this way, it manages to achieve the best of both worlds: it can deliver excellent overall performance (similar to SJF/STCF) for short-running interactive jobs, and is fair and makes progress for long-running CPU-intensive workloads. For this reason, many systems, including BSD UNIX derivatives [LM+89, B86], Solaris [M06], and Windows NT and subsequent Windows operating systems [CS97], use a form of MLFQ as their base scheduler.