Document: Network Traffic and Performance Evaluation, P19 (docx)

19 QUALITY OF SERVICE PROVISIONING FOR LONG-RANGE-DEPENDENT REAL-TIME TRAFFIC

ABDELNASER ADAS
Conexant, Inc., Newport Beach, CA 92660

AMARNATH MUKHERJEE
Knoltex Corporation, San Jose, CA 95157

Self-Similar Network Traffic and Performance Evaluation, Edited by Kihong Park and Walter Willinger. Copyright © 2000 by John Wiley & Sons, Inc. Print ISBN 0-471-31974-0; Electronic ISBN 0-471-20644-X.

19.1 INTRODUCTION

19.1.1 Overview

Network support for variable bit rate (VBR) video needs to consider (1) properties of the workload induced (e.g., significant autocorrelations into far lags and heterogeneous marginal distributions), and (2) application-specific bounds on delay-jitter and statistical cell-loss probabilities. This chapter presents a quality-of-service (QoS) solution for such traffic at each multiplexing point in a network. Heterogeneity in both offered workload and quality-of-service requirements is considered. The network is assumed to be cell-switched with virtual circuits (VCs), similar to ATM networks. Chapter 16 discusses an alternative approach to provisioning for long-range-dependent (LRD) traffic. See also the work of Heyman and Lakshman (Chapter 12) and Li and Li (Chapter 13).

19.1.2 Correlated Traffic and Its Implications

Studies on a range of video applications indicate that there exists a slowly decaying autocorrelation structure in the underlying stochastic processes [3, 7, 8, 12, 19, 21]. Providing guarantees on maximum delay, delay-jitter, and cell-loss probabilities (or some other measure of cell loss) in the presence of such traffic is nontrivial, especially if the coefficient of variation of the marginal distribution (or the distribution tail) is large. This is because such traffic significantly increases queue length statistics at a multiplexer [2, 5, 8, 16–18, 20].
As an illustration, consider the performance of a finite-buffer queue. Let the arrival process be a fractionally differenced autoregressive moving average process:

$(1 - \phi_1 B)\,\Delta^d X_t = \epsilon_t,$

where

• $\phi_1$ ($0 < \phi_1 < 1$) represents an exponentially decaying autocorrelation component;
• $d$ ($0 < d < \tfrac{1}{2}$) represents a hyperbolically decaying autocorrelation component;
• $\{\epsilon_t\}$ is white noise, that is, uncorrelated; and
• $B$ is the backward shift operator, that is, $B X_t = X_{t-1}$.

The correlation structure of $X_t$ can be controlled by changing $\phi_1$ and $d$. Figure 19.1 shows the mean cell loss versus the number of frame buffers for three input correlation structures with approximately the same coefficient of variation (about 0.24). A frame buffer is the maximum number of cells that can be transmitted by the output channel in a given time interval (frame time).

Fig. 19.1 Mean number of cells dropped for different dependency structures in the workload. (a) Mean utilization = 0.9; (b) mean utilization = 0.8. The model form used is a fractionally differenced ARIMA$(1, d, 0)$ process [13]: $(1 - \phi_1 B)\,\Delta^d X_t = \epsilon_t$.

The solid line showing a slow decay is for a long-memory input traffic sequence. The dashed line in the middle is for traffic with short memory. The line with the smallest mean cell loss corresponds to a near white noise stream. Note that mean cell loss decays very slowly with increasing buffer size for traffic with a slowly decaying autocorrelation function. Qualitatively similar observations have been reported elsewhere [2, 5, 8, 16–18, 20].

19.1.3 Summary of the Proposed Architecture

This chapter introduces a per-virtual-circuit (per-VC) framing structure and a pseudo-earliest-due-date cell dispatcher to provide guaranteed delay-jitter bounds. Heterogeneous jitter bounds are supported through software-controlled frame sizes, which may be set independently for each VC.
The framing structure is a generalization of per-link framing introduced by Golestani. The proposed framing structure eliminates the need to correct for phase mismatches between incoming frames and outgoing frames, which is necessary in per-link framing. This results in a reduction in the end-to-end delay bound and buffer requirements, and a simpler implementation.

Strong autocorrelations typically seen in video traffic make equivalent bandwidth computations for heterogeneous cell-loss bounds intractable. To address this, the framing strategy is combined with an active cell-discard mechanism with prioritized cell-dropping, the latter utilizing the history of dropped cells and target cell-loss bounds for each VC. Upper bounds on the equivalent bandwidth needed to support a given workload with a target quality of service are developed. These are validated through numerical and simulation results from variable bit rate MPEG-I video traces.

A high-level view of the proposed architecture is as follows.

1. A Framing Structure on a Per-VC Basis. To provide heterogeneous delay-jitter bounds, a framing structure is induced on VCs, similar to that in Golestani [9–11]. (Differences between the two approaches are described later.) Consider a virtual circuit (VC), $i$, with a desired delay-jitter bound $M_i$. The frame structure splits time for this VC into juxtaposed intervals of length $M_i$ at each multiplexing point. Cells from VC $i$ that arrive in a given frame at a multiplexer are buffered, and not transmitted until the beginning of the next frame time. If sufficient capacity is available to transmit these cells in the next interval, all cells arrive at the next hop within an interval of length $M_i$. This way, they are guaranteed to meet a delay-jitter bound of $M_i$. Also, if $H_i$ is the number of hops for VC $i$, and $D_i$ is a bound on the one-way propagation and processing delay, all cells that make it to the receiver are guaranteed an end-to-end delay bound of $H_i M_i + D_i$.

2. Priority Scheduling.
Cells at an output queue that are ready to go contend for bandwidth and have competing delay-jitter bounds and cell-loss probability bounds. A priority scheduler addresses these concerns. For delay-jitter bounds, the scheduler follows an earliest due-date principle with modifications to enhance algorithmic efficiency. For cell-loss bounds, it uses a minimum guaranteed capacity, $C_i$ cells/frame for VC $i$, with the rest of the cells, if any, scheduled on a nonguaranteed basis. $C_i$ is based on (i) the marginal distribution of the number of cells per frame, (ii) the maximum acceptable probability of cell loss in a frame, and (iii) the equivalent bandwidth of all VCs in this jitter class. The $C_i$ are computed by the equivalent-bandwidth unit described in Item 4 below.

3. An Active Cell-Discard Unit. If there are excess cells left over from a frame at the multiplexer after the corresponding frame time is over, it is likely that this is due to persistence in the arrival process, as suggested by the solid lines in Fig. 19.1. These cells are likely to cause increased delay for cells in successive frames. Since buffering does not reduce cell loss significantly for persistent traffic, we may elect to either toss them right away or mark them as low-priority cells and discard them on demand. The active cell-discard unit reclaims (or marks as old) cells that do not get transmitted in their frame time. It is activated at the end of each frame. An important side effect of using the active cell-discard unit is that it simplifies computation of equivalent bandwidth for correlated traffic (especially for heterogeneous cell-loss bounds), while achieving high utilization through statistical multiplexing.

4. Equivalent Bandwidth Computations. Algorithms for computing upper bounds on equivalent bandwidths are developed. They address heterogeneous cell-loss probabilities and heterogeneous jitter classes. The computation decomposes traffic by its jitter requirements.
All connections requiring a given delay-jitter are grouped into a class. For each jitter class, let $E_k$ ($k = 0, 1, 2, \ldots$) be the desired mean cell-loss ratio of a subset of connections $k$. An iterative algorithm approximates the total capacity needed to meet $\{E_k\}$. Also, virtual capacities $C_k$ ($k = 0, 1, 2, \ldots$) are computed. All groups of connections specifying $E_k$ are guaranteed a bandwidth of $C_k$ in every frame time if they need it. However, unused portions of virtual capacities are available to other connections.

5. Miscellaneous

• Frames may be implemented on a per-VC basis. Frames from different VCs need not be synchronized.
• Per-VC framing and active cell-discard may be implemented efficiently through associative matching of cell tags, similar to that in processor pipelines (Section 19.2.2).
• The frame size for each VC is software settable, so delay-jitter bounds may be negotiated over a continuum (at the granularity of cell transmission time). Also, unlike per-link framing (see Section 19.1.4), the frame size of a given VC is not constrained by the frame sizes of other active VCs.
• The minimum capacity guarantee per frame, $C_i$ for each VC $i$, provides protection from misbehaving or malfunctioning VCs.
• A call admission unit will use the equivalent bandwidth algorithms to determine if a specific call can be admitted without violating quality-of-service guarantees of other calls (or, if an important call must be admitted, which calls to disconnect). The call-admission unit is beyond the scope of this chapter.

19.1.4 Relationship with Stop-and-Go Queueing

Per-VC framing has been derived from Stop-and-Go Queueing described in Golestani [9–11]. The primary enhancements are as follows.

1. Framing is induced on a per-VC basis instead of a per-output-link basis; see Figs. 19.2 and 19.3.
Per-VC framing eliminates the need for correcting for phase mismatches between incoming frames and outgoing frames at a multiplexer and significantly simplifies its implementation. As we shall see in Section 19.3, per-VC framing also reduces the maximum queueing delay by half and cuts buffer requirements by one-third at a switch, while retaining the same delay-jitter bound as per-link framing.

Fig. 19.2 Arriving frames and departing frames when framing is induced on a per-link basis. The phase mismatch between arriving and departing frames is corrected through delay circuits.

Fig. 19.3 Arriving and departing frames for VC $i$ when framing is induced on a per-VC basis. Frames of different VCs need not be synchronized.

2. Once cells from a frame become active (i.e., not dormant, waiting for their next frame time), they compete with active cells from other VCs for the output link. The algorithms that decide which active cells to transmit and when, and which cells to drop, are necessitated by the need to meet heterogeneous cell-loss bounds and heterogeneous delay-jitter bounds simultaneously. They also provide a firewall across connections (protection from misbehaving sources). These algorithms are new. In Golestani [9], the objective was to support no-loss transmission with heterogeneous delay-jitter bounds. The latter were integral multiples of the smallest jitter bound supported. Golestani showed that a preemptive priority scheduler with highest priority to the smallest jitter class could meet all jitter bounds if sufficient capacity was available. Golestani [11] also presented a solution that allowed for cell losses for a single jitter class (fixed delay-jitter bound).
In the general case of meeting heterogeneous delay-jitter bounds with potential cell losses, however, the scheduler needs to follow (i) an earliest due-date principle and (ii) a cell-drop policy that takes into account current observations on dropped cells per VC and heterogeneous cell-loss bounds across VCs. See Section 19.2.3.

3. The original Stop-and-Go Queueing requirement that a traffic stream declare its $(r, T)$-smooth parameter (see footnote 1) is dropped. This gives up a lossless network in exchange for higher utilization. For a long-memory input stream, the average rate over a small interval, $T$, can be significantly higher (or lower) than its overall average rate, so $r$ would need to be the peak rate for lossless transmission and would result in very low utilization. In the current proposal, cell losses, while allowed, will be reduced through statistical multiplexing across virtual circuits and controlled through equivalent bandwidth computations.

4. No-loss transmission can be guaranteed in the proposed architecture if desired; see Section 19.4.3. However, the emphasis is on efficient statistical multiplexing that can also guarantee specified cell-loss bounds.

19.1.5 Outline

The rest of this chapter is organized as follows. Section 19.2 presents the proposed architecture. It includes (1) per-VC framing with active cell-discard and (2) cell dispatching to meet the heterogeneous delay-jitter and cell-loss guarantees for heterogeneous VCs (with heterogeneous marginal distributions and autocorrelation structures). Section 19.3 presents the maximum-delay bound, delay-jitter bound, and buffer requirements for per-VC framing and compares the results with per-link framing. Section 19.4 addresses upper bounds on the equivalent bandwidth needed to meet heterogeneous delay-jitter requirements and heterogeneous cell-loss probability bounds, presents numerical and simulation examples, and shows that loss-free transmission may be achieved for desired VCs. Section 19.5 presents related work.
Section 19.6 presents our conclusions.

Footnote 1. An $(r, T)$-smooth stream was defined as one where the average bit rate over a time interval $T$ did not exceed $r$. Equivalently, the number of bits over the interval from $nT$ to $(n+1)T$ did not exceed $rT$, for all integer $n$.

19.2 PROPOSED ARCHITECTURE

19.2.1 Framing on a Per-Virtual-Circuit Basis Versus Per-Link Basis

Enforcing framing on a per-link basis [9–11] results in a phase mismatch at a switch between arriving frames on input links and departing frames on output links. This phase mismatch is due to different propagation delays on different input links. As shown in Fig. 19.2, the arriving frames on input link 1 and departing frames on the output link have a phase mismatch of $\theta_{1d}$, while the arriving frames on input link 2 have a phase mismatch with respect to the output link of $\theta_{2d}$. To correct for a phase mismatch, additional delay circuitry is necessary. Also, the admissible set of frame sizes is constrained. For example, all frame sizes are considered integer multiples of a base frame size in Golestani [9].

A simpler approach is to adopt per-VC framing, without concern for what the frame sizes are, and whether or not the frames from different VCs are synchronized with respect to each other; see Fig. 19.3. As we will show in Section 19.3, per-VC framing, in conjunction with active cell-discard and an appropriate scheduler, retains the advantages of per-link framing, while improving on performance bounds and functional flexibility. For example, if VC $i$'s frame size is $M_i$, and the number of hops is $H_i$, per-VC framing provides the same delay-jitter bound, $M_i$, as per-link framing, a reduction in the maximum-delay bound by an amount $H_i M_i$, and a maximum buffer requirement that is one-third lower. Also, in conjunction with the cell dispatcher described in Section 19.2.3, it guarantees heterogeneous cell-loss bounds for correlated traffic.
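As a rough worked example of the delay figures quoted above, the per-VC bound $H_i M_i + D_i$ and the stated reduction of $H_i M_i$ can be combined to back out the corresponding per-link bound. The per-link expression below is inferred from those two statements rather than quoted from Section 19.3, and the numerical values (5 hops, a 10 ms frame, 20 ms of propagation and processing delay) are purely illustrative assumptions.

```latex
\begin{align*}
  \text{per-VC bound}   &= H_i M_i + D_i \\
  \text{per-link bound} &= (H_i M_i + D_i) + H_i M_i = 2 H_i M_i + D_i \\[4pt]
  \text{Assume } H_i = 5,\; M_i = 10\ \text{ms},\; D_i = 20\ \text{ms}: \quad
  \text{per-VC}   &= 5 \cdot 10 + 20 = 70\ \text{ms} \\
  \text{per-link} &= 2 \cdot 5 \cdot 10 + 20 = 120\ \text{ms}
\end{align*}
```

Under these assumed values the queueing component drops from 100 ms to 50 ms, which is consistent with the statement that per-VC framing halves the maximum queueing delay.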
Functional flexibility includes the ability to set and modify admissible jitter classes at run time, and not be constrained to an integer multiple of a base frame size. Hardware support for efficient and flexible implementation of per-VC framing with active cell-discard is discussed next.

19.2.2 Implementation of Per-Virtual-Circuit Framing with Active Cell Discard

The objective is to induce a framing structure on top of cells of a given VC, and for the multiplexer to actively discard (or mark as old) cells that are not served during their assigned frame time. In order to allow for flexibility of application-specified jitter bounds, the frame time should be software settable (e.g., it may be negotiated during connection-open). It should then be set to the connection's delay-jitter tolerance. One may allow for adjusting the frame time during the lifetime of a VC, if desired. Issues that need to be addressed for per-VC framing and active cell-discard are as follows.

1. Frame Identification Across Nodes (where Nodes Refer to Switches and End Points). Cells transmitted during the $t$th frame ($t = 0, 1, 2, \ldots$) by a node must be recognized as belonging to frame $t$ by the next downstream node. An alternating bit sequence number distinguishing cells in adjacent frames is sufficient if the sequence number is generated at the transmitter. Old cells, if implemented, will be marked by the first multiplexer where a jitter deadline is missed.

2. Frame-Clock Generation. For the $i$th VC, one needs a step-down counter, initialized under software control to the maximum number of cells that constitutes its frame time. Let this number be $M_i$. The counter is to be fed with a clock that runs at the speed of cell transmission at the output link. On each clock cycle (at cell granularity), the counter must count down one tick until it hits zero. At this point, it will need to generate a frame-clock signal and reset itself to $M_i$.

3. Cell Tagging.
A cell arriving during frame $t$ for VC $i$ will not be eligible for service until frame $t+1$ for the same VC. It is, therefore, assigned a state, dormant, on arrival. See Fig. 19.4(a). When the next frame-clock signal arrives, the cell is ready to be transmitted, so its state needs to be changed to active. If it still remains in the queue when the following frame-clock signal arrives, it is old, and now there are two possibilities. One strategy is simply to discard the cell and reclaim its buffer. A second strategy is to change its state to old and keep it eligible for transmission on a best-effort basis. In ATM networks, cells need to be delivered in sequence, so it might be simplest to discard the old cells.

To simplify the discussion of what happens next, let us assume that active cells that are not transmitted in their frame time are discarded. Then, at any given time, cells belonging to frames $t$ and $t+2$ will never be simultaneously present at the multiplexer output queue, and all that is necessary is to distinguish between cells of frames $t$ and $t+1$. A single bit, therefore, suffices to distinguish between active and dormant cells.

Fig. 19.4 Implementing per-VC framing with active cell-discard.

Assume that during frame $t$, dormant cells are represented by a 0 and active cells have been marked 1 in the previous cycle. On a new cell arrival, the multiplexer needs to attach to it a tag identifying its VC and its frame number (in this case 0), set its valid bit to 1, and forward it to the output queue. See Fig. 19.4(b). The valid bit's function is to help discard cells, similar to the action of flushing a cache memory on a context switch. In a fast cell-switched VC network, a switch would implement a tagging scheme for VC identifiers anyway, so the additional circuitry needed is small.
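The frame clock and the dormant/active/discard life cycle described so far can be sketched in software as follows. This is only an illustrative model under stated assumptions: the class and method names (`Cell`, `VCFramer`, `arrive`, `tick`, `transmit`) are hypothetical, old cells are assumed to be discarded rather than kept on a best-effort basis, and a plain loop stands in for the associative tag matching that a hardware implementation would use.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DORMANT, ACTIVE = 0, 1  # single-bit frame tag: 0 = dormant, 1 = active


@dataclass
class Cell:
    vc: int               # VC identifier tag
    tag: int = DORMANT    # cells are dormant on arrival
    valid: bool = True    # cleared when the cell is discarded


@dataclass
class VCFramer:
    """Hypothetical per-VC frame clock and cell tagger (illustrative only)."""
    vc: int
    frame_len: int                      # M_i, in cell transmission times
    counter: int = field(init=False)    # step-down counter
    queue: List[Cell] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.counter = self.frame_len

    def arrive(self) -> None:
        """Tag a newly arrived cell as dormant and valid, then enqueue it."""
        self.queue.append(Cell(self.vc))

    def tick(self) -> bool:
        """Advance one cell time; return True when the frame clock fires."""
        self.counter -= 1
        if self.counter > 0:
            return False
        self.counter = self.frame_len   # reset to M_i
        self._on_frame_clock()
        return True

    def _on_frame_clock(self) -> None:
        for cell in self.queue:
            if cell.tag == ACTIVE:
                cell.valid = False      # missed its frame time: discard
            else:
                cell.tag = ACTIVE       # dormant -> active
        # Reclaim the buffers of invalidated cells.
        self.queue = [c for c in self.queue if c.valid]

    def transmit(self) -> Optional[Cell]:
        """Remove and return the oldest active cell, or None if none is eligible."""
        for i, cell in enumerate(self.queue):
            if cell.tag == ACTIVE:
                return self.queue.pop(i)
        return None
```

In this sketch a cell that arrives during one frame cannot be returned by `transmit` until `tick` has fired once, and it disappears if it is still queued when `tick` fires a second time, which mirrors the single-bit scheme in the text.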
On the next frame clock, the entire output queue would be fed with two logical signals, one to deactivate the active cells that did not get transmitted during their allotted frame time (due to lack of available capacity), and one to activate the dormant cells. See Fig. 19.4(c). Both of these can be achieved by associatively matching cell tags with an identifier representing the appropriate VC and its state. The primary difference between this and off-the-shelf content-addressable memories is that more than one match is likely, especially for dormant cells. On a match, active cells mark themselves invalid by setting their valid bits to 0; the dormant cells move to the active state and are ready to be transmitted. At this point, they move under the control of the cell dispatcher, which must decide on a strategy that is consistent with the overall goals of delay-jitter and statistical cell-loss bounds.

A convenient model for the buffer memory organization is to view it as a set of logical queues, one per VC, with a sequence number distinguishing active and dormant cells. All old cells may potentially be grouped into one logical queue, as discussed below.

19.2.3 Cell Dispatcher

The cell dispatcher is responsible for (1) scheduling and (2) transmitting active and (potentially) old cells. Dormant cells are not within its purview. From the dispatcher's perspective, the active cells for each VC are assumed to be logically organized as a queue (see Fig. 19.5(a)). The old cells (implemented optionally) are organized either as separate queues or as a single queue. In either case, they are served on a best-effort basis and may be reclaimed before they are served to accommodate new cell arrivals.

The dispatcher consists of two concurrent units, a scheduler and a transmitter. The scheduler allocates cell times to active cells of individual VCs and decides which cells are to be dropped if contentions for capacity arise.
The transmitter transmits them (and old cells if all active queues are empty and old cells are waiting). The scheduler will guarantee transmission of at least $C_i$ cells/frame for connection $i$ ($i = 1, \ldots, K$), where $K$ is the number of active VCs at the multiplexer. The computation of the $C_i$ is based on cell-loss requirements for different VCs and their marginal distributions, and is presented in Section 19.4.

The scheduler and the transmitter share a circular buffer that represents channel allocations in the future. This circular buffer is presented below as a linear array for convenience of exposition. Let this data structure be called channel_image. channel_image[n] records the ID of one VC. If channel_image[n] equals $i$, the transmitter will transmit from the head of the active queue corresponding to VC $i$ at time $n$. This will be modified below after the basic algorithm is presented.

The scheduler is activated on every new frame activation, that is, on a frame clock. Let the new frame activation be at time $n$. (See Fig. 19.5(b).) Let the corresponding VC be $i$, the frame length (jitter bound) be $M_i$, the number of cells in the current active frame be $m_i$, and the minimum number of cells guaranteed to be transmitted from this VC in this frame be $C_i$. channel_image records the action to be taken by the transmitter in future slots. The scheduler either marks the slots in channel_image with a VC identifier or leaves them empty. If it does mark a slot, it also records whether the transmission is to be guaranteed or not-guaranteed. If a slot is marked not-guaranteed, it may be reclaimed at some point in the future to serve a different VC (as described in Section 19.2.3.1).

The scheduler's task is as follows. Assume that it is activated on VC $i$'s frame clock. The time window in the future over which the $m_i$ active cells need to be transmitted is $[n+1, n+M_i]$. (The $n$th slot is kept aside for the transmitter to begin transmitting.)
The scheduler uses the following algorithm.

(a) Beginning with $n + M_i$, going down to $n + 1$, the scheduler attempts to find the largest $k_i \le m_i$ slots that are empty in channel_image[$n + M_i$] through channel_image[$n + 1$], and marks each with the current VC, $i$.

(b) if (k_i <= C_i) {
        guaranteed_cells[i] = k_i;
        not_guaranteed[i] = 0;
    } else {
        guaranteed_cells[i] = C_i;
        not_guaranteed[i] = k_i - C_i;
    }

Fig. 19.5 Cell dispatcher's view. (a) Queue of active cells for each VC plus a queue of old cells. This is used by the transmitter unit. (b) Channel_image. This is shared by the scheduler and the transmitter units.
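Steps (a) and (b) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `schedule_frame`, the `EMPTY` marker, and the modulo indexing used to model the circular buffer are all assumptions. The sketch returns only the two counts from step (b); it does not record a per-slot guaranteed/not-guaranteed flag, and it does not cover what happens when fewer than $m_i$ empty slots are found, since that part of the algorithm lies beyond the steps shown here.

```python
from typing import List, Optional, Tuple

EMPTY = None  # hypothetical marker for an unallocated slot


def schedule_frame(channel_image: List[Optional[int]], n: int, vc: int,
                   M: int, m: int, C: int) -> Tuple[int, int]:
    """Run steps (a) and (b) for VC `vc`, whose frame clock fires at slot n.

    M is the frame length M_i, m the number of active cells m_i, and C the
    guaranteed minimum C_i. channel_image is treated as circular, so slot s
    is stored at index s % len(channel_image).
    Returns (guaranteed_cells, not_guaranteed).
    """
    size = len(channel_image)
    if M >= size:
        raise ValueError("frame window must fit inside the circular buffer")

    # Step (a): scan slots n+M down to n+1 and claim up to m empty ones.
    k = 0
    for slot in range(n + M, n, -1):
        if k == m:
            break
        idx = slot % size
        if channel_image[idx] is EMPTY:
            channel_image[idx] = vc
            k += 1

    # Step (b): split the k claimed slots into guaranteed and not guaranteed.
    if k <= C:
        return k, 0
    return C, k - C


# Usage: slot 5 is already held by VC 7; VC 1 has 3 active cells and C_1 = 2.
image: List[Optional[int]] = [EMPTY] * 16
image[5] = 7
guaranteed, not_guaranteed = schedule_frame(image, n=2, vc=1, M=4, m=3, C=2)
```

Scanning from the far end of the window toward slot $n+1$ places each VC's cells as late as its jitter bound allows, which leaves the nearer slots free for frames with tighter deadlines that activate later.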
