19
QUALITY OF SERVICE PROVISIONING
FOR LONG-RANGE-DEPENDENT
REAL-TIME TRAFFIC
ABDELNASER ADAS
Conexant, Inc., Newport Beach, CA 92660
AMARNATH MUKHERJEE
Knoltex Corporation, San Jose, CA 95157
19.1 INTRODUCTION
19.1.1 Overview
Network support for variable bit rate (VBR) video needs to consider (1) properties of
the workload induced (e.g., significant autocorrelations into far lags and heterogeneous
marginal distributions), and (2) application-specific bounds on delay-jitter and
statistical cell-loss probabilities. This chapter presents a quality-of-service (QoS)
solution for such traffic at each multiplexing point in a network. Heterogeneity in
both offered workload and quality-of-service requirements is considered. The
network is assumed to be cell-switched with virtual circuits (VCs), similar to
ATM networks. Chapter 16 discusses an alternative approach to provisioning for
long-range-dependent (LRD) traffic. See also the work of Heyman and Lakshman
(Chapter 12) and Li and Li (Chapter 13).
19.1.2 Correlated Traffic and Its Implications
Studies on a range of video applications indicate that there exists a slowly decaying
autocorrelation structure in the underlying stochastic processes [3, 7, 8, 12, 19, 21].
Self-Similar Network Traffic and Performance Evaluation, Edited by Kihong Park and Walter Willinger
Copyright © 2000 by John Wiley & Sons, Inc.
Print ISBN 0-471-31974-0 Electronic ISBN 0-471-20644-X
Providing guarantees on maximum-delay, delay-jitter, and cell-loss probabilities (or
some other measure of cell loss) in the presence of such traffic is nontrivial,
especially if the coefficient of variation of the marginal distribution (or the
distribution tail) is large. This is because such traffic significantly increases queue
length statistics at a multiplexer [2, 5, 8, 16–18, 20].
As an illustration, consider the performance of a finite-buffer queue. Let the
arrival process be a fractionally differenced autoregressive moving average process:
$$(1 - \phi_1 B)\,\Delta^d X_t = \epsilon_t,$$
where
$\phi_1$ ($0 < \phi_1 < 1$) represents an exponentially decaying autocorrelation component;
$d$ ($0 < d < \frac{1}{2}$) represents a hyperbolically decaying autocorrelation component;
$\{\epsilon_t\}$ is white noise, that is, uncorrelated; and
$B$ is the backward shift operator, that is, $B X_t = X_{t-1}$.
The correlation structure of $X_t$ can be controlled by changing $\phi_1$ and $d$. Figure
19.1 shows the mean cell loss versus number of frame buffers for three input
correlation structures with approximately the same coefficient of variation ($\approx 0.24$).
A frame buffer is the maximum number of cells that can be transmitted by the output
channel in a given time interval (frame time).
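A sample path of such an arrival process can be generated approximately. The sketch below is illustrative only and is not the authors' generator: it assumes that the fractional differencing operator is $(1 - B)^d$ (the usual reading of a fractionally differenced ARIMA(1, d, 0) model), truncates the infinite moving-average expansion of $(1 - B)^{-d}$ at a hypothetical `n_terms` coefficients, and uses Gaussian white noise. The function names are made up for this example.

```python
import random

def frac_diff_weights(d, n_terms):
    """Coefficients psi_j of (1 - B)^(-d) = sum_j psi_j B^j, truncated."""
    psi = [1.0]
    for j in range(1, n_terms):
        # Recurrence: psi_j = psi_{j-1} * (j - 1 + d) / j
        psi.append(psi[-1] * (j - 1 + d) / j)
    return psi

def farima_1d0(n, phi1, d, n_terms=500, seed=0):
    """Approximate path of (1 - phi1*B)(1 - B)^d X_t = eps_t (assumed form)."""
    rng = random.Random(seed)
    psi = frac_diff_weights(d, n_terms)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + n_terms)]
    x, prev = [], 0.0
    for t in range(n_terms, n + n_terms):
        # Fractionally integrated noise: y_t = sum_j psi_j * eps_{t-j}
        y = sum(psi[j] * eps[t - j] for j in range(n_terms))
        # AR(1) filter: X_t = phi1 * X_{t-1} + y_t
        prev = phi1 * prev + y
        x.append(prev)
    return x
```

Setting `d` close to 0 gives a short-memory AR(1) stream, while raising `d` toward 1/2 slows the decay of the autocorrelations. A real workload model would also need to shift and scale the output to match a target mean and coefficient of variation.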
Fig. 19.1 Mean number of cells dropped for different dependency structures in the workload.
(a) Mean utilization 0.9; (b) mean utilization 0.8. The model form used is a
fractionally differenced ARIMA$(1, d, 0)$ process [13]: $(1 - \phi_1 B)\,\Delta^d X_t = \epsilon_t$.
The solid line showing a slow decay is for a long-memory input traffic sequence.
The dashed line in the middle is for traffic with short memory. The line with the
smallest mean cell loss corresponds to a near white noise stream. Note that mean cell
loss decays very slowly with increasing buffer size for traffic with a slowly decaying
autocorrelation function. Qualitatively similar observations have been reported
elsewhere [2, 5, 8, 16–18, 20].
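The kind of experiment summarized in Fig. 19.1 can be sketched with a simple discrete-time queue. The code below is an assumption-laden illustration, not the simulator used for the figure: time advances one frame at a time, `capacity` cells are served per frame, and the buffer holds `frame_buffers * capacity` cells, following the definition of a frame buffer given earlier. The function name and arguments are hypothetical.

```python
def mean_cells_dropped(arrivals, capacity, frame_buffers):
    """Mean cells dropped per frame for a finite-buffer queue.

    arrivals[t] is the number of cells arriving in frame t,
    capacity is the number of cells the output channel sends per frame,
    frame_buffers is the buffer size in units of one frame's worth of cells.
    """
    buffer_cells = frame_buffers * capacity
    backlog, dropped = 0, 0
    for a in arrivals:
        # Serve up to `capacity` cells of the backlog plus new arrivals.
        backlog = max(backlog + a - capacity, 0)
        # Whatever does not fit in the buffer is lost.
        overflow = max(backlog - buffer_cells, 0)
        dropped += overflow
        backlog -= overflow
    return dropped / len(arrivals)
```

Feeding this function arrival sequences with the same marginal distribution but different correlation structures, and sweeping `frame_buffers`, would give curves of the same general kind as those in the figure.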
19.1.3 Summary of the Proposed Architecture
This chapter introduces a per-virtual-circuit (per-VC) framing structure and a
pseudo-earliest-due-date cell dispatcher to provide guaranteed delay-jitter bounds.
Heterogeneous jitter bounds are supported through software-controlled frame sizes,
which may be set independently for each VC. The framing structure is a generalization
of per-link framing introduced by Golestani. The proposed framing structure
eliminates the need to correct for phase mismatches between incoming frames and outgoing
frames, which is necessary in per-link framing. This results in a reduction in the end-to-end delay
bound and buffer requirements, and a simpler implementation.
Strong autocorrelations typically seen in video traffic make equivalent bandwidth
computations for heterogeneous cell-loss bounds intractable. To address this, the
framing strategy is combined with an active cell-discard mechanism with prioritized
cell-dropping, the latter utilizing the history of dropped cells and target cell-loss
bounds for each VC. Upper bounds on the equivalent bandwidth needed to support a
given workload with a target quality of service are developed. These are validated
through numerical and simulation results from variable bit rate MPEG-I video
traces.
A high-level view of the proposed architecture is as follows.
1. A Framing Structure on a Per-VC Basis. To provide heterogeneous delay-jitter
bounds, a framing structure is induced on VCs, similar to that in Golestani [9–11].
(Differences between the two approaches are described later.) Consider a virtual
circuit (VC), $i$, with a desired delay-jitter bound $M_i$. The frame structure splits time
for this VC into juxtaposed intervals of length $M_i$ at each multiplexing point. Cells
from VC $i$ that arrive in a given frame at a multiplexer are buffered, and not
transmitted until the beginning of the next frame time. If sufficient capacity is
available to transmit these cells in the next interval, all cells arrive at the next hop
within an interval of length $M_i$. This way, they are guaranteed to meet a delay-jitter
bound of $M_i$. Also, if $H_i$ is the number of hops for VC $i$, and $D_i$ is a bound on the
one-way propagation and processing delay, all cells that make it to the receiver are
guaranteed an end-to-end delay bound of $H_i M_i + D_i$.
2. Priority Scheduling. Cells at an output queue that are ready to go contend for
bandwidth and have competing delay-jitter bounds and cell-loss probability
bounds. A priority scheduler addresses these concerns. For delay-jitter bounds, the
scheduler follows an earliest due-date principle with modifications to enhance
algorithmic efficiency. For cell-loss bounds, it uses a minimum guaranteed capacity,
$C_i$ cells/frame for VC $i$, with the rest of the cells, if any, scheduled on a
nonguaranteed basis. $C_i$ is based on (i) the marginal distribution of #cells/frame, (ii)
the maximum acceptable probability of cell loss in a frame, and (iii) the equivalent
bandwidth of all VCs in this jitter class. The $C_i$ are computed by the equivalent-bandwidth
unit described in Item 4 below.
3. An Active Cell-Discard Unit. If there are excess cells left over from a frame at
the multiplexer after the corresponding frame time is over, it is likely that this is due
to persistence in the arrival process as suggested by the solid lines in Fig. 19.1. These
cells are likely to cause increased delay for cells in successive frames. Since buffering
does not reduce cell loss significantly for persistent traffic, we may elect to either toss
them right away or mark them as low-priority cells and discard them on demand.
The active cell-discard unit reclaims (or marks as old) cells that do not get
transmitted in their frame time. It is activated at the end of each frame.
An important side effect of using the active cell-discard unit is that it simplifies
computation of equivalent bandwidth for correlated traffic (especially for heterogeneous
cell-loss bounds), while achieving high utilization through statistical
multiplexing.
4. Equivalent Bandwidth Computations. Algorithms for computing upper
bounds on equivalent bandwidths are developed. They address heterogeneous cell-loss
probabilities and heterogeneous jitter classes.
The computation decomposes traffic by jitter requirement. All connections
requiring a given delay-jitter are grouped into a class. For each jitter class, let
$\epsilon_k$ ($k = 0, 1, 2, \ldots$) be the desired mean cell-loss ratio of a subset of connections $k$.
An iterative algorithm approximates the total capacity needed to meet $\{\epsilon_k\}$. Also,
virtual capacities $C_k$ ($k = 0, 1, 2, \ldots$) are computed. All groups of connections
specifying $\epsilon_k$ are guaranteed a bandwidth of $C_k$ in every frame time if they need it.
However, unused portions of virtual capacities are available to other connections.
5. Miscellaneous
• Frames may be implemented on a per-VC basis. Frames from different VCs
need not be synchronized.
• Per-VC framing and active cell-discard may be implemented efficiently
through associative matching of cell tags, similar to that in processor pipelines
(Section 19.2.2).
• The frame size for each VC is software settable, so delay-jitter bounds may be
negotiated over a continuum (at the granularity of cell transmission time). Also,
unlike per-link framing (see Section 19.1.4), the frame size of a given VC is
not constrained by frame sizes of other active VCs.
• The minimum capacity guarantee per frame, $C_i$, for each VC $i$ provides
protection from misbehaving or malfunctioning VCs.
• A call admission unit will use the equivalent bandwidth algorithms to
determine if a specific call can be admitted without violating quality-of-service
guarantees of other calls (or, if an important call must be admitted, which calls
to disconnect). The call-admission unit is beyond the scope of this chapter.
19.1.4 Relationship with Stop-and-Go Queueing
Per-VC framing has been derived from Stop-and-Go Queueing described in
Golestani [9–11]. The primary enhancements are as follows.
1. Framing is induced on a per-VC basis instead of a per-output-link basis; see
Figs. 19.2 and 19.3. Per-VC framing eliminates the need for correcting for
phase mismatches between incoming frames and outgoing frames at a multiplexer
and significantly simplifies its implementation. As we shall see in
Section 19.3, per-VC framing also reduces the maximum queueing delay by
half and cuts buffer requirements by one-third at a switch, while retaining the
same delay-jitter bound as per-link framing.
Fig. 19.2 Arriving frames and departing frames when framing is induced on a per-link basis.
The phase mismatch between arriving and departing frames is corrected through delay
circuits.
Fig. 19.3 Arriving and departing frames for VC $i$ when framing is induced on a per-VC
basis. Frames of different VCs need not be synchronized.
2. Once cells from a frame become active (i.e., not dormant, waiting for their
next frame time), they compete with active cells from other VCs for the output
link. The algorithms that decide on which active cells to transmit and when,
and which cells to drop, are necessitated by the need to meet heterogeneous
cell-loss bounds and heterogeneous delay-jitter bounds simultaneously. They
also provide a firewall across connections (protection from misbehaving
sources). These algorithms are new.
In Golestani [9], the objective was to support no-loss transmission with
heterogeneous delay-jitter bounds. The latter were integral multiples of the
smallest jitter bound supported. Golestani showed that a preemptive priority
scheduler with highest priority to the smallest jitter class could meet all jitter
bounds if sufficient capacity was available. Golestani [11] also presented a
solution that allowed for cell losses for a single jitter class (fixed delay-jitter
bound).
In the general case of meeting heterogeneous delay-jitter bounds with
potential cell losses, however, the scheduler needs to follow (i) an earliest due-date
principle and (ii) a cell-drop policy that takes into account current
observations on dropped cells per VC, and heterogeneous cell-loss bounds
across VCs. See Section 19.2.3.
3. The original Stop-and-Go Queueing requirement that a traffic stream declare
its $(r, T)$-smooth parameters (see footnote 1) is dropped. This trades a lossless
network for higher utilization. For a long-memory input stream, the average rate over a
small interval, $T$, can be significantly higher (or lower) than its overall average
rate, so $r$ would need to be the peak rate for lossless transmission, which would
result in significantly lower utilization. In the current proposal, cell losses,
while allowed, will be reduced through statistical multiplexing across virtual
circuits and controlled through equivalent bandwidth computations.
4. No-loss transmission can be guaranteed in the proposed architecture if desired;
see Section 19.4.3. However, the emphasis is on efficient statistical multiplexing
that can also guarantee specified cell-loss bounds.
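To make the $(r, T)$-smoothness requirement of item 3 concrete, the following sketch checks whether a recorded stream satisfies it: in every window of length $T$, the number of bits must not exceed $rT$. The function name is hypothetical, and the sketch assumes that time is slotted, that `T` is an integer number of slots, and that windows are aligned at multiples of `T`.

```python
def is_rt_smooth(bits_per_slot, r, T):
    """True if each aligned window of T slots carries at most r*T bits."""
    for start in range(0, len(bits_per_slot), T):
        if sum(bits_per_slot[start:start + T]) > r * T:
            return False
    return True
```

For example, the bursty stream `[5, 0, 0, 0]` has an average of 1.25 bits per slot, yet with `T = 2` it is not smooth for `r = 2`, because the first window alone carries 5 bits. Making it smooth at that time scale would require `r` of at least 2.5, which is the effect that pushes `r` toward the peak rate for long-memory traffic.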
19.1.5 Outline
The rest of this chapter is organized as follows. Section 19.2 presents the proposed
architecture. It includes (1) per-VC framing with active cell-discard and (2) cell
dispatching to meet the heterogeneous delay-jitter and cell-loss guarantees for
heterogeneous VCs (with heterogeneous marginal distributions and autocorrelation
structures). Section 19.3 presents maximum-delay bound, delay-jitter bound, and
buffer requirements for per-VC framing and compares the results with per-link
framing. Section 19.4 addresses upper bounds on equivalent bandwidth needed to
meet heterogeneous delay-jitter requirements and heterogeneous cell-loss probability
bounds, presents numerical and simulation examples, and shows that loss-free
transmission may be achieved for desired VCs. Section 19.5 presents related
work. Section 19.6 presents our conclusions.
Footnote 1: An $(r, T)$-smooth stream was defined as one where the average bit rate over a time interval $T$ did not
exceed $r$. Equivalently, the number of bits over the interval from $nT$ to $(n + 1)T$ did not exceed $rT$, for all integer $n$.
19.2 PROPOSED ARCHITECTURE
19.2.1 Framing on a Per-Virtual-Circuit Basis Versus Per-Link Basis
Enforcing framing on a per-link basis [9–11] results in a phase mismatch at a switch
between arriving frames on input links and departing frames on output links. This
phase mismatch is due to different propagation delays on different input links. As
shown in Fig. 19.2, the arriving frames on input link 1 and departing frames on the
output link have a phase mismatch of $\theta_{1d}$, while the arriving frames on input link 2
have a phase mismatch with respect to the output link of $\theta_{2d}$.
To correct for a phase mismatch, additional delay circuitry is necessary. Also, the
admissible set of frame sizes is constrained. For example, all frame sizes are
considered integer multiples of a base frame size in Golestani [9]. A simpler
approach is to adopt per-VC framing, without concern for what the frame sizes are,
and whether or not the frames from different VCs are synchronized with respect to
each other; see Fig. 19.3. As we will show in Section 19.3, per-VC framing, in
conjunction with active cell-discard and an appropriate scheduler, retains the
advantages of per-link framing, while improving on performance bounds and
functional flexibility. For example, if VC $i$'s frame size is $M_i$, and the number of
hops is $H_i$, per-VC framing provides the same delay-jitter bound, $M_i$, as per-link
framing, a reduction in maximum-delay bound by an amount $H_i M_i$, and a maximum
buffer requirement that is one-third lower. Also, in conjunction with the cell
dispatcher described in Section 19.2.3, it guarantees heterogeneous cell-loss
bounds for correlated traffic. Functional flexibility includes the ability to set and
modify admissible jitter classes at run time, and not be constrained to an integer
multiple of a base frame size.
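As a worked illustration of these bounds, the sketch below evaluates the per-VC end-to-end delay bound $H_i M_i + D_i$ stated in Section 19.1.3 and, for comparison, a per-link figure obtained simply by adding back the stated reduction of $H_i M_i$. The function names are hypothetical, and the per-link expression is an inference from the two statements in the text, not a formula quoted from Section 19.3.

```python
def per_vc_delay_bound(hops, frame_time, prop_delay):
    """End-to-end delay bound H_i * M_i + D_i (all in the same time unit)."""
    return hops * frame_time + prop_delay

def per_link_delay_bound(hops, frame_time, prop_delay):
    """Assumed per-link bound: the per-VC bound plus the H_i * M_i reduction."""
    return per_vc_delay_bound(hops, frame_time, prop_delay) + hops * frame_time

# Example: 5 hops, 10 ms frames, 20 ms propagation and processing delay.
vc_bound = per_vc_delay_bound(5, 10, 20)      # 5*10 + 20 = 70 ms
link_bound = per_link_delay_bound(5, 10, 20)  # 70 + 5*10 = 120 ms
```

In this example the queueing component drops from 100 ms to 50 ms, which matches the statement in Section 19.1.4 that per-VC framing halves the maximum queueing delay.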
Hardware support for efficient and flexible implementation of per-VC framing
with active cell-discard is discussed next.
19.2.2 Implementation of Per-Virtual-Circuit Framing with Active Cell
Discard
The objective is to induce a framing structure on top of cells of a given VC, and for
the multiplexer to actively discard (or mark as old) cells that are not served during
their assigned frame time.
In order to allow for flexibility of application-specified jitter bounds, the
frame time should be software settable (e.g., it may be negotiated during connection-open).
It should then be set to the connection's delay-jitter tolerance. One may
allow for adjusting the frame time during the lifetime of a VC, if desired.
Issues that need to be addressed for per-VC framing and active cell-discard are as
follows.
1. Frame Identification Across Nodes (where Nodes Refer to Switches and End
Points). Cells transmitted during the $t$th frame ($t = 0, 1, 2, \ldots$) by a node must be
recognized as belonging to frame $t$ by the next downstream node. An alternating bit
sequence number distinguishing cells in adjacent frames is sufficient if the sequence
number is generated at the transmitter. Old cells, if implemented, will be marked by
the first multiplexer where a jitter deadline is missed.
2. Frame-clock Generation. For the $i$th VC, one needs a step-down counter,
initialized under software control, to the maximum number of cells that constitutes
its frame time. Let this number be $M_i$. The counter is to be fed with a clock that runs
at the speed of cell transmission at the output link. On each clock cycle (at cell
granularity), the counter must count down one tick until it hits zero. At this point, it
will need to generate a frame-clock signal and reset itself to $M_i$.
3. Cell Tagging. A cell arriving during frame $t$ for VC $i$ will not be eligible for
service until frame $t + 1$ for the same VC. It is, therefore, assigned a state, dormant,
on arrival. See Fig. 19.4(a). When the next frame-clock signal arrives, the cell is
ready to be transmitted, so its state needs to be changed to active. If it still remains in
the queue when the following frame-clock signal arrives, it is old, and now there are
two possibilities. One strategy is simply to discard the cell and reclaim its buffer. A
second strategy is to change its state to old and keep it eligible for transmission on a
best-effort basis. In ATM networks, cells need to be delivered in sequence, so it
might be simplest to discard the old cells.
To simplify the discussion for what happens next, let us assume that active cells
that are not transmitted in their frame time are discarded. Then, at any given time,
cells belonging to frames $t$ and $t + 2$ will never be simultaneously present at the
multiplexer output queue, and all that is necessary is to distinguish between cells of
frames $t$ and $t + 1$. A single bit, therefore, suffices to distinguish between active and
dormant cells.
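The frame-clock generation of item 2 can be modeled in software as a step-down counter. The class below is a minimal sketch of that behavior, assuming one call to `tick()` per cell transmission time; the class and method names are hypothetical, and a hardware implementation would of course look different.

```python
class FrameClock:
    """Step-down counter that fires once every `frame_cells` cell times."""

    def __init__(self, frame_cells):
        self.frame_cells = frame_cells  # M_i, set under software control
        self.count = frame_cells

    def tick(self):
        """Advance one cell time; return True when the frame clock fires."""
        self.count -= 1
        if self.count == 0:
            self.count = self.frame_cells  # reset to M_i
            return True
        return False
```

With `FrameClock(3)`, the signal fires on every third tick. Changing `frame_cells` at run time corresponds to renegotiating the jitter bound of the VC.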
Fig. 19.4 Implementing per-VC framing with active cell-discard.
Assume that during frame $t$, dormant cells are represented by a 0 and active cells
have been marked 1 in the previous cycle. On a new cell arrival, the multiplexer
needs to attach to it a tag identifying its VC and its frame number (in this case 0), set
its valid bit to 1, and forward it to the output queue. See Fig. 19.4(b). The valid bit's
function is to help discard cells, similar to the action of flushing a cache memory on
a context switch. In a fast cell-switched VC network, a switch would implement a
tagging scheme for VC identifiers anyway, so the additional circuitry needed is small.
On the next frame clock, the entire output queue would be fed with two logical
signals, one to deactivate the active cells that did not get transmitted during their
allotted frame time (due to lack of available capacity), and one to activate the
dormant cells. See Fig. 19.4(c). Both of these can be achieved by associatively
matching cell tags with an identifier representing the appropriate VC and its state.
The primary difference between this and off-the-shelf content-addressable memories
is that more than one match is likely, especially for dormant cells. On a match,
active cells mark themselves invalid by setting their valid bits to 0; the dormant
cells move to the active state and are ready to be transmitted. At this point, they
move under the control of the cell dispatcher, which must decide on a strategy that is
consistent with the overall goals of delay-jitter and statistical cell-loss bounds.
A convenient model for the buffer memory organization is to view it as a set of
logical queues, one per VC, with a sequence number distinguishing active and
dormant cells. All old cells may potentially be grouped into one logical queue, as
discussed below.
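The tagging and frame-clock actions described above can be mimicked in software. The sketch below is only a model of the logic: the loop stands in for the parallel associative match that the hardware would perform, old cells are discarded rather than kept for best-effort service, and the names `Cell`, `OutputQueue`, `arrive`, `frame_clock`, and `active` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    vc: int                # VC identifier tag
    frame_bit: int         # 0 = dormant, 1 = active
    valid: bool = True     # cleared to discard the cell
    payload: object = None

class OutputQueue:
    def __init__(self):
        self.cells = []

    def arrive(self, vc, payload=None):
        """New arrivals are tagged dormant (frame bit 0) and valid."""
        self.cells.append(Cell(vc, 0, True, payload))

    def frame_clock(self, vc):
        """Apply both frame-clock signals to every cell tagged with `vc`."""
        discarded = 0
        for cell in self.cells:
            if cell.vc != vc or not cell.valid:
                continue
            if cell.frame_bit == 1:
                cell.valid = False   # active but unsent: invalidate
                discarded += 1
            else:
                cell.frame_bit = 1   # dormant becomes active
        # Reclaim the buffers of invalidated cells.
        self.cells = [c for c in self.cells if c.valid]
        return discarded

    def active(self, vc):
        return [c for c in self.cells if c.vc == vc and c.frame_bit == 1]
```

A cell therefore survives at most two frame clocks of its own VC: the first activates it, and the second discards it if the dispatcher has not removed it in the meantime.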
19.2.3 Cell Dispatcher
The cell dispatcher is responsible for (1) scheduling and (2) transmitting active and
(potentially) old cells. Dormant cells are not within its purview. From the dispatcher's
perspective, the active cells for each VC are assumed to be logically organized as
a queue (see Fig. 19.5(a)). The old cells (implemented optionally) are organized
either as separate queues or as a single queue. In either case, they are served on a
best-effort basis and may be reclaimed before they are served to accommodate new
cell arrivals.
The dispatcher consists of two concurrent units, a scheduler and a transmitter.
The scheduler allocates cell times to active cells of individual VCs and decides
which cells are to be dropped if contentions for capacity arise. The transmitter
transmits them (and old cells if all active queues are empty and old cells are waiting).
The scheduler will guarantee transmission of at least $C_i$ cells/frame for connection
$i$ ($i = 1, \ldots, K$), where $K$ is the number of active VCs at the multiplexer. The
computation of the $C_i$ is based on cell-loss requirements for different VCs and their
marginal distributions, and is presented in Section 19.4. The scheduler and the
transmitter share a circular buffer that represents channel allocations in the future.
This circular buffer is presented below as a linear array for convenience of
exposition. Let this data structure be called channel_image. Each entry channel_image[n]
records the ID of one VC. If channel_image[n] equals $i$, the transmitter will
transmit from the head of the active queue corresponding to VC $i$ at time $n$. This will
be modified below after the basic algorithm is presented.
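The basic transmitter behavior just described can be sketched as follows. This is an illustrative model only, covering the unmodified rule: `channel_image` is a Python list used as a circular buffer, an empty slot is represented by `None`, and the function name and the way queues are passed in are assumptions made for the example.

```python
from collections import deque

def transmit_slot(channel_image, n, active_queues, old_queue):
    """Return the cell sent in slot n, or None if the slot stays idle."""
    slot = n % len(channel_image)   # circular buffer indexing
    vc = channel_image[slot]
    channel_image[slot] = None      # the slot is consumed
    if vc is not None and active_queues.get(vc):
        # Transmit from the head of the active queue of the marked VC.
        return active_queues[vc].popleft()
    if all(not q for q in active_queues.values()) and old_queue:
        # Old cells are served only when every active queue is empty.
        return old_queue.popleft()
    return None
```

Here `active_queues` maps each VC identifier to a `deque` of its active cells, and `old_queue` is a single `deque` of old cells, matching the single-queue option mentioned above.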
The scheduler is activated on every new frame activation, that is, on a frame
clock. Let the new frame activation be at time $n$. (See Fig. 19.5(b).) Let the
corresponding VC be $i$, the frame length (jitter bound) be $M_i$, the number of cells
in the current active frame be $m_i$, and the minimum number of cells guaranteed to be
transmitted from this VC in this frame be $C_i$. The channel_image array records the action
to be taken by the transmitter in future slots. The scheduler either marks the slots in
channel_image with a VC identifier or leaves them empty. If it does mark a slot,
it also records whether the transmission is to be guaranteed or not-guaranteed. If a
slot is marked not-guaranteed, it may be reclaimed at some point in the future to
serve a different VC (as described in Section 19.2.3.1).
The scheduler's task is as follows. Assume that it is activated on VC $i$'s frame
clock. The time window in the future over which the $m_i$ active cells need to be
transmitted is $[n + 1, n + M_i]$. (The $n$th slot is kept aside for the transmitter to
begin transmitting.) The scheduler follows the following algorithm.
(a) Beginning with $n + M_i$, going down to $n + 1$, the scheduler attempts to find
the largest $k_i \le m_i$ slots that are empty in channel_image[$n + M_i$]
through channel_image[$n + 1$], and marks each with the current VC, $i$.
(b) if (k_i <= C_i) {
        guaranteed_cells[i] = k_i;
        not_guaranteed[i] = 0;
    } else {
        guaranteed_cells[i] = C_i;
        not_guaranteed[i] = k_i - C_i;
    }
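Steps (a) and (b) can be put together in a short sketch. It is a hedged illustration of the two steps as stated, not the complete scheduler: it assumes `channel_image` is a list longer than the frame length and indexed modulo its length, represents empty slots by `None`, and returns the two counts instead of recording the guaranteed or not-guaranteed status in each slot. The function name and parameters are hypothetical.

```python
def schedule_frame(channel_image, n, vc, M, m, C):
    """Steps (a) and (b) for VC `vc`, whose frame became active at slot n.

    M: frame length in slots, m: active cells in the frame,
    C: guaranteed cells per frame.
    Returns (guaranteed_cells, not_guaranteed).
    """
    size = len(channel_image)
    k = 0
    # (a) Scan from slot n + M down to n + 1, claiming up to m empty slots.
    for slot in range(n + M, n, -1):
        if k == m:
            break
        if channel_image[slot % size] is None:
            channel_image[slot % size] = vc
            k += 1
    # (b) Up to C of the claimed slots are guaranteed; the rest are not.
    if k <= C:
        return k, 0
    return C, k - C
```

Scanning from the far end of the window places cells as late as their deadline allows, which leaves the earlier slots free for frames that are activated later but have tighter jitter bounds.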
Fig. 19.5 Cell dispatcher's view. (a) Queue of active cells for each VC plus a queue of old
cells. This is used by the transmitter unit. (b) Channel_image. This is shared by the
scheduler and the transmitter units.