CHANNEL CODING

Figure 3.6: Trellis diagram for the $(1 + D^2,\ 1 + D + D^2)$ convolutional code.

[...] be depicted by successively appending such transitions, as shown in Figure 3.6. This is called a trellis diagram. Given a defined initial state of the shift register (usually the all-zero state), each code word is characterized by a sequence of certain transitions. We call this a path in the trellis. In Figure 3.6, the path corresponding to the data word 1000 0111 0100 and the code word 11 01 11 00 00 11 10 01 10 00 01 11 is drawn with bold lines for the transitions in the trellis. In this example, the last $m = 2$ data bits are zero and, as a consequence, the final state in the trellis is the all-zero state. It is common practice to start and to stop in the all-zero state because this helps the decoder. This can easily be achieved by appending $m$ zeros, the so-called tail bits, to the useful bit stream.

State diagrams

One can also characterize the encoder by its states and inputs and the corresponding transitions, as depicted in part (a) of Figure 3.7 for the code under consideration. This is known as a Mealy automaton. To evaluate the free distance of a code, it is convenient to cut open the state diagram, as depicted in part (b) of Figure 3.7. Each path (code word) that starts in the all-zero state and comes back to that state can be visualized by a sequence of states that starts at the all-zero state on the left and ends at the all-zero state on the right. We look at the coded bits in the labeling $b_i / c_{1i} c_{2i}$ and count the bits that have the value one. This count is just the Hamming distance between the code word corresponding to that sequence and the all-zero code word. From the diagram, one can easily obtain the smallest distance $d_{\text{free}}$ to the all-zero code word. For the code of our example, the minimum distance corresponds to the sequence of transitions 00 → 10 → 01 → 00 and turns out to be $d_{\text{free}} = 5$.
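The encoding example and the free distance quoted above can be checked with a short sketch. It assumes the state is stored as the pair $(b_{i-1}, b_{i-2})$ and that the free distance is found by a shortest-path search over the cut-open state diagram; the function names `step`, `encode` and `free_distance` are illustrative and not part of the original text.

```python
from heapq import heappush, heappop

# Generator taps of (1 + D^2, 1 + D + D^2); entry j multiplies b_{i-j}.
G = ((1, 0, 1), (1, 1, 1))
M = 2  # encoder memory m


def step(state, bit):
    """One trellis transition; state = (b_{i-1}, b_{i-2})."""
    window = (bit,) + state
    out = tuple(sum(g * w for g, w in zip(taps, window)) % 2 for taps in G)
    return window[:M], out


def encode(bits):
    """Encode a data word, starting in the all-zero state."""
    state, code = (0,) * M, []
    for b in bits:
        state, out = step(state, b)
        code.append("".join(map(str, out)))
    return " ".join(code), state


def free_distance():
    """Smallest Hamming weight of a path that leaves and re-enters state 00."""
    zero = (0,) * M
    start, out = step(zero, 1)            # leave the all-zero state with b = 1
    heap, best = [(sum(out), start)], {start: sum(out)}
    while heap:
        weight, state = heappop(heap)
        if state == zero:                 # first return to 00 is the lightest
            return weight
        if weight > best.get(state, float("inf")):
            continue
        for bit in (0, 1):
            nxt, out = step(state, bit)
            w = weight + sum(out)
            if w < best.get(nxt, float("inf")):
                best[nxt] = w
                heappush(heap, (w, nxt))


data = [int(c) for c in "100001110100"]
code_word, final_state = encode(data)
print(code_word)
print(final_state, free_distance())
```

Because the last two data bits are zero, `final_state` is the all-zero state, in line with the role of the tail bits described above.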
The alternative sequence 00 → 10 → 11 → 01 → 00 has the distance $d = 6$. All other sequences include loops that produce higher distances.

From the state diagram, we may also find the so-called error coefficients $c_d$. These error coefficients are multiplicative coefficients that relate the probability $P_d$ of an error event of distance $d$ to the corresponding bit error probability. To obtain $c_d$, we have to count all the nonzero data bits of all error paths of distance $d$ to the all-zero code word. Using $P(A_1 \cup A_2) \le P(A_1) + P(A_2)$, we obtain the union bound

$$P_b \le \sum_{d = d_{\text{free}}}^{\infty} c_d P_d$$

for the bit error probability. The coefficients $c_d$ for most relevant codes can be found in textbooks. The error event probability $P_d$, for example, for antipodal signaling is given by Equation (3.2) for the AWGN channel and by Equation (3.4) for the Rayleigh fading channel.

Figure 3.7: State diagram (Mealy automaton) for the $(1 + D^2,\ 1 + D + D^2)$ convolutional code, parts (a) and (b).

Catastrophic codes

The state diagram also enables us to find a class of encoders called catastrophic encoders that must be excluded because they have the undesirable property of error propagation: if there is a closed loop in the state diagram where all the coded bits $c_{1i} c_{2i}$ are equal to zero, but at least one data bit $b_i$ equals one, then there exists a path of infinite length with an infinite number of ones in the data, but with only a finite number of ones in the code word. As a consequence, a finite number of channel bit errors may lead to an infinite number of errors in the data, which is certainly a very undesirable property. An example of a catastrophic encoder is the one characterized by the generators $(3, 6)_{\text{oct}} = (1 + D,\ D + D^2)$, which is depicted in Figure 3.8.

Figure 3.8: Example of a catastrophic convolutional encoder.
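The catastrophic behaviour of the $(1 + D,\ D + D^2)$ encoder can be made visible numerically. The following sketch assumes an unterminated all-one data word (no tail bits) and compares the code word weight with that of the $(1 + D^2,\ 1 + D + D^2)$ encoder; the helper names `encode` and `weight` are illustrative only.

```python
def encode(bits, gens):
    """Rate-1/n feedforward encoder; gens[v][j] multiplies b_{i-j} (mod 2)."""
    memory = len(gens[0]) - 1
    padded = [0] * memory + list(bits)          # all-zero initial state
    code = []
    for i in range(memory, len(padded)):
        window = padded[i - memory:i + 1][::-1]  # b_i, b_{i-1}, ..., b_{i-m}
        code.append(tuple(sum(g * w for g, w in zip(taps, window)) % 2
                          for taps in gens))
    return code


def weight(code):
    """Number of ones in the code word."""
    return sum(sum(pair) for pair in code)


CATASTROPHIC = ((1, 1, 0), (0, 1, 1))   # (1 + D, D + D^2)
REFERENCE = ((1, 0, 1), (1, 1, 1))      # (1 + D^2, 1 + D + D^2)

for n in (10, 100, 1000):
    ones = [1] * n
    print(n, weight(encode(ones, CATASTROPHIC)), weight(encode(ones, REFERENCE)))
```

For the catastrophic generators the weight stays constant however long the all-one data word becomes, whereas for the reference code it grows with the length of the data word.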
Once in the state 11, the all-one input sequence will be encoded to the all-zero code word.

Punctured convolutional codes

Up to now, we have only considered convolutional codes of rate $R_c = 1/n$. There are two possibilities to obtain $R_c = k/n$ codes. The classical one is to use $k$ parallel shift registers and combine their outputs. This, however, makes the implementation more complicated. A simpler and more flexible method called puncturing is usually preferred in practical communication systems. We explain it by means of the example of an $R_c = 1/2$ code that can be punctured to obtain an $R_c = 2/3$ code. The encoder produces two parallel encoded data streams $\{c_{1,i}\}_{i=0}^{\infty}$ and $\{c_{2,i}\}_{i=0}^{\infty}$. The first data stream will be left unchanged. From the other data stream every second bit will be discarded, that is, only the bits with even time index $i$ will be multiplexed to the serial code word and then transmitted. Instead of the original code word

$$c_{10}\, c_{20}\, c_{11}\, c_{21}\, c_{12}\, c_{22}\, c_{13}\, c_{23}\, c_{14}$$

the punctured code word

$$c_{10}\, c_{20}\, c_{11}\, \square\, c_{12}\, c_{22}\, c_{13}\, \square\, c_{14}$$

will be transmitted. Here we have indicated the punctured bits $c_{21}$ and $c_{23}$ by $\square$. At the receiver, the puncturing positions must be known. A soft decision (e.g. MLSE) receiver has metric values $\mu_{\nu i}$ as inputs that correspond to the encoded bits $c_{\nu i}$. The absolute value of $\mu_{\nu i}$ is an indicator for the reliability of the bit. Punctured bits can be regarded as bits with reliability zero. Thus, the receiver has to add dummy receive bits at the punctured positions of the code word and assign them the metric values $\mu_{\nu i} = 0$.

Recursive systematic convolutional encoders

Recursive systematic convolutional (RSC) encoders have become popular in the context of parallel concatenated codes and turbo decoding (see below). For every nonsystematic convolutional (NSC) $R_c = 1/n$ encoder, one can find an equivalent RSC encoder that produces the same code (i.e. the same set of code words) with a different relation between the data word and the code word. It can be constructed in such a way that the first of the $n$ parallel encoded bit streams of the code word is systematic, that is, it is identical to the data word. As an example, we consider the $R_c = 1/2$ convolutional code of Figure 3.4 that can be written in compact power series notation as

$$\begin{pmatrix} c_1(D) \\ c_2(D) \end{pmatrix} = b(D) \begin{pmatrix} 1 + D^2 \\ 1 + D + D^2 \end{pmatrix}.$$

Figure 3.9: Inversion circuit for the generator polynomial $1 + D^2$, parts (a) and (b).

The upper branch, which corresponds to the generator polynomial $g_1(D) = 1 + D^2$ and to the shift register circuit depicted in part (a) of Figure 3.9, defines a one-to-one map from the set of all data words to itself. One can easily check that the inverse is given by the recursive shift register circuit depicted in part (b) of Figure 3.9. This can be described by the formal power series

$$g_1^{-1}(D) = \left(1 + D^2\right)^{-1} = 1 + D^2 + D^4 + D^6 + \cdots$$

This power series description of feedback shift registers is formally the same as the description of linear systems in digital signal processing$^5$, where the delay is usually denoted by $e^{-j\omega}$ instead of $D$. The shift register circuits of Figure 3.9 invert each other. Thus, $g_1^{-1}(D)$ is a one-to-one mapping between bit sequences. As a consequence, combining the convolutional encoder with that recursive shift register circuit as depicted in part (a) of Figure 3.10 leads to the same set of code words. This circuit is equivalent to the one depicted in part (b) of Figure 3.10. This RSC encoder with generator polynomials $(5, 7)_{\text{oct}}$ can formally be written as

$$\begin{pmatrix} c_1(D) \\ c_2(D) \end{pmatrix} = \tilde{b}(D) \begin{pmatrix} 1 \\ \dfrac{1 + D + D^2}{1 + D^2} \end{pmatrix},$$

where the bit sequences are related by $\tilde{b}(D) = (1 + D^2)\, b(D)$.

$^5$ In signal processing, we have an interpretation of $\omega$ as a (normalized) frequency, which has no meaning for convolutional codes. Furthermore, here all additions are modulo 2. However, all formal power series operations are the same.

Figure 3.10: Recursive convolutional encoder, parts (a) and (b).

For a general $R_c = 1/n$ convolutional code, we have the NSC encoder given by the generator polynomial vector

$$\mathbf{g}(D) = \begin{pmatrix} g_1(D) \\ \vdots \\ g_n(D) \end{pmatrix}.$$

The equivalent RSC encoder is given by the generator vector

$$\tilde{\mathbf{g}}(D) = \begin{pmatrix} 1 \\ g_2(D)/g_1(D) \\ \vdots \\ g_n(D)/g_1(D) \end{pmatrix}.$$

The bit sequence $b(D)$ encoded by $\mathbf{g}(D)$ results in the same code word as the bit sequence $\tilde{b}(D) = g_1(D)\, b(D)$ encoded by $\tilde{\mathbf{g}}(D) = g_1^{-1}(D)\, \mathbf{g}(D)$, that is,

$$\mathbf{c}(D) = b(D)\, \mathbf{g}(D) = \tilde{b}(D)\, \tilde{\mathbf{g}}(D).$$

An MLSE decoder finds the most likely code word, which is uniquely related to one data word under the NSC encoder and to another data word under the RSC encoder. As a consequence, one may use the same decoder for both and then relate the sequences as described above. But note that this is true only for a decoder that makes decisions about sequences. It is not true for a decoder that makes bitwise decisions like the MAP decoder.

3.2.2 MLSE for convolutional codes: the Viterbi algorithm

Let us consider a convolutional code with memory $m$ and a finite sequence of $K$ input data bits $\{b_k\}_{k=1}^{K}$. We denote the coded bits as $c_i$. We assume that the corresponding trellis starts and ends in the all-zero state. In our notation, the tail bits are included in $\{b_k\}_{k=1}^{K}$, that is, there are only $K - m$ bits that really carry information.

Although the following discussion is not restricted to that case, we first consider the concrete case of antipodal (BPSK) signaling, that is, transmit symbols $x_i = (-1)^{c_i} \in \{\pm 1\}$ written as a vector $\mathbf{x}$, and a real discrete AWGN channel given by

$$\mathbf{y} = \mathbf{x} + \mathbf{n},$$

where $\mathbf{y}$ is the vector of receive symbols and $\mathbf{n}$ is the real AWGN vector with components $n_i$ of variance

$$\sigma^2 = \mathrm{E}\left\{n_i^2\right\} = \frac{N_0}{2 E_S}.$$

Here, we have normalized the noise by the symbol energy $E_S$. We know from the discussion in Subsection 1.3.2 that, given a fixed receive vector $\mathbf{y}$, the most probable transmit sequence $\mathbf{x}$ for this case is the one that maximizes the correlation metric given by the scalar product

$$\mu(\mathbf{x}) = \mathbf{y} \cdot \mathbf{x}. \tag{3.30}$$

For an $R_c = 1/n$ convolutional code, the code word consists of $nK$ encoded bits, and the metric can be written as a sum

$$\mu(\mathbf{x}) = \sum_{k=1}^{K} \mu_k \tag{3.31}$$

of metric increments

$$\mu_k = \mathbf{y}_k \cdot \mathbf{x}_k$$

corresponding to the $K$ time steps $k = 1, \ldots, K$ of the trellis. Here $\mathbf{x}_k$ is the vector of the $n$ symbols $x_i$ that correspond to the encoded bits for the time step number $k$ where the bit $b_k$ is encoded, and $\mathbf{y}_k$ is the vector of the corresponding receive symbols.

The task now is to find the vector $\mathbf{x}$ that maximizes the metric given by Equation (3.31), thereby exploiting the special trellis structure of a convolutional code. We note that the following treatment is quite general and is by no means restricted to the special case of the AWGN metric given by Equation (3.30). For instance, any metric that is given by expressions like Equations (3.19) to (3.21) can be written as Equation (3.31). Thus, a priori information about the bits can also be included in a straightforward manner by the expressions presented in Subsection 3.1.5, see also (Hagenauer 1995).

For a reasonable sequence length $K$, it is not possible to find the vector $\mathbf{x}$ by exhaustive search because this would require a computational effort that is proportional to $2^K$. But, owing to the trellis structure of convolutional codes, this is not necessary. We consider two code words $\mathbf{x}$ and $\hat{\mathbf{x}}$ with corresponding paths merging at a certain time step $k$ in a common state $s_k$ (see Figure 3.11). Assume that for both paths the accumulated metrics, that is, the sums of all metric increments up to that time step,

$$\Lambda_k = \sum_{i=1}^{k} \mu_i$$

for $\mathbf{x}$ and

$$\hat{\Lambda}_k = \sum_{i=1}^{k} \hat{\mu}_i$$

for $\hat{\mathbf{x}}$, have been calculated.

Figure 3.11: Transition where the paths $\mathbf{x}$ and $\hat{\mathbf{x}}$ merge.

Because the two paths merge at time step $k$ and will be identical for the whole future,

$$\mu(\hat{\mathbf{x}}) - \mu(\mathbf{x}) = \hat{\Lambda}_k - \Lambda_k$$

holds and we can already make a decision between both paths. Assume $\mu(\hat{\mathbf{x}}) - \mu(\mathbf{x}) > 0$. Then, $\hat{\mathbf{x}}$ is more likely than $\mathbf{x}$, and we can discard $\mathbf{x}$ from any further considerations.
This fact allows us to sort out unlikely paths before the final decision, and thus an effort that grows exponentially with time can be avoided.

The algorithm that does this is the Viterbi algorithm, and it works as follows: starting from the initial state, the metric increments $\mu_k$ for all transitions between all the states $s_{k-1}$ and $s_k$ are calculated recursively and added to the accumulated metrics $\Lambda_{k-1}$. Then, for the two transitions with the same new state $s_k$, the values of $\Lambda_{k-1} + \mu_k$ are compared. The larger value will serve as the new accumulated metric $\Lambda_k = \Lambda_{k-1} + \mu_k$, and the other one will be discarded. Furthermore, a pointer will be stored, which points from $s_k$ to the preceding state corresponding to the larger metric value. Thus, going from left to right in the trellis diagram, for each time instant $k$ and for all possible states, the algorithm executes the following steps:

1. Calculate the metric increments $\mu_k$ for all the $2 \cdot 2^m$ transitions between all the $2^m$ states $s_{k-1}$ and all the $2^m$ states $s_k$ and add them to the $2^m$ accumulated metric values $\Lambda_{k-1}$ corresponding to the states $s_{k-1}$.
2. For all states $s_k$, compare the values of $\Lambda_{k-1} + \mu_k$ for the two transitions ending at $s_k$, select the one that is the maximum, and then set $\Lambda_k = \Lambda_{k-1} + \mu_k$, which is the accumulated metric of that state.
3. Place a pointer to the state $s_{k-1}$ that is the most likely preceding state for that transition.

Then, when all these calculations and assignments have been done, we start at the end of the trellis and trace back the pointers that indicate the most likely preceding states. This procedure finally leads us to the most likely path in the trellis.

3.2.3 The soft-output Viterbi algorithm (SOVA)

The soft-output Viterbi algorithm (SOVA) is a relatively simple modification of the Viterbi algorithm that makes it possible to obtain additional soft reliability information for the hard decision bits provided by the MLSE.
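Since the SOVA builds directly on it, the three steps of the Viterbi algorithm from Subsection 3.2.2 can be sketched as follows for the $(1 + D^2,\ 1 + D + D^2)$ code with BPSK symbols and the correlation metric. This is a minimal illustration under the assumption that the trellis is terminated with $m$ tail bits; the names `step`, `encode` and `viterbi` are hypothetical, and accumulated metrics are kept in a dictionary rather than in a fixed array of $2^m$ values.

```python
G = ((1, 0, 1), (1, 1, 1))  # taps of (1 + D^2, 1 + D + D^2)
M = 2                        # memory m, so there are 2**M states
N = len(G)                   # n coded symbols per time step


def step(state, bit):
    """Return (next state, BPSK symbols x = (-1)**c) for one transition."""
    window = (bit,) + state
    coded = [sum(g * w for g, w in zip(taps, window)) % 2 for taps in G]
    return window[:M], [(-1) ** c for c in coded]


def encode(bits):
    state, x = (0,) * M, []
    for b in bits:
        state, symbols = step(state, b)
        x.extend(symbols)
    return x


def viterbi(y, n_steps):
    """Maximize the correlation metric y . x over all terminated paths."""
    zero = (0,) * M
    acc = {zero: 0.0}                       # accumulated metrics per state
    pointers = []                           # per step: state -> (previous state, bit)
    for k in range(n_steps):
        y_k = y[N * k: N * (k + 1)]
        new_acc, ptr = {}, {}
        for prev, metric in acc.items():
            for bit in (0, 1):
                nxt, x_k = step(prev, bit)
                cand = metric + sum(a * b for a, b in zip(y_k, x_k))  # step 1
                if nxt not in new_acc or cand > new_acc[nxt]:         # step 2
                    new_acc[nxt] = cand
                    ptr[nxt] = (prev, bit)                            # step 3
        acc = new_acc
        pointers.append(ptr)
    state, bits = zero, []                  # trace back from the all-zero state
    for ptr in reversed(pointers):
        state, bit = ptr[state]
        bits.append(bit)
    return bits[::-1]


data = [int(c) for c in "100001110100"]     # last m = 2 bits are tail bits
y = [float(v) for v in encode(data)]
y[3] = -y[3]                                # one hard channel error
decoded = viterbi(y, len(data))
print(decoded == data)
```

The dictionary `acc` initially contains only the all-zero state, which enforces the defined initial state; the trace-back starts from the all-zero state, which enforces the termination by the tail bits.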
By construction, the Viterbi algorithm is a sequence estimator, not a bit estimator. Thus, it does not provide reliability information about the bits corresponding to the sequence. However, it can provide us with information about the reliability of the decision between two sequences. Let $\mathbf{x}$ and $\hat{\mathbf{x}}$ be two possible transmit sequences. Then, according to Equation (3.18), the conditional probability that the respective sequence has been transmitted given that $\mathbf{y}$ has been received is

$$P(\mathbf{x}|\mathbf{y}) = C \exp\left(\frac{1}{\sigma^2}\, \mu(\mathbf{x})\right)$$

for $\mathbf{x}$ and

$$P(\hat{\mathbf{x}}|\mathbf{y}) = C \exp\left(\frac{1}{\sigma^2}\, \mu(\hat{\mathbf{x}})\right)$$

for $\hat{\mathbf{x}}$. Now assume that $\hat{\mathbf{x}}$ is the maximum likelihood sequence obtained by the Viterbi algorithm. If one could be sure that one of the two sequences $\mathbf{x}$ or $\hat{\mathbf{x}}$ is the correct one (and not any other one), then $\Pr(\hat{\mathbf{x}}|\mathbf{y}) = 1 - \Pr(\mathbf{x}|\mathbf{y})$ and the LLR for a correct decision would be given by

$$L(\hat{\mathbf{x}}) = \log \frac{P(\hat{\mathbf{x}}|\mathbf{y})}{P(\mathbf{x}|\mathbf{y})} = \frac{1}{\sigma^2} \left( \mu(\hat{\mathbf{x}}) - \mu(\mathbf{x}) \right), \tag{3.32}$$

that is, the metric difference is a measure for the reliability of the decision between the two sequences. We note that this LLR is conditioned on the event that one of the two paths is the correct one.

We now consider a data bit $\hat{b}_k$ at a certain position in the bit stream corresponding to the ML sequence $\hat{\mathbf{x}}$ estimated by the Viterbi algorithm$^6$. The goal now is to gain information about the reliability of this bit by looking at the reliability of the decisions between $\hat{\mathbf{x}}$ and other sequences $\mathbf{x}^{(\beta)}$ whose paths merge with the ML path at some state $s_k$. Any decision in favor of $\hat{\mathbf{x}}$ instead of the alternative sequence $\mathbf{x}^{(\beta)}$ with a bit $b_k^{(\beta)}$ is only relevant for that bit decision if $b_k^{(\beta)} \neq \hat{b}_k$. Thus, we can restrict our consideration to the relevant sequences $\mathbf{x}^{(\beta)}$. Each of the relevant alternative paths labeled by the index $\beta$ is the source of a possible erroneous decision in favor of $\hat{b}_k$ instead of $b_k^{(\beta)}$. We define a random error bit $e_k^{(\beta)}$ that takes the value $e_k^{(\beta)} = 1$ for an erroneous decision in favor of $\hat{b}_k$ instead of $b_k^{(\beta)}$ and $e_k^{(\beta)} = 0$ otherwise.
We write $L_k^{(\beta)} = L\left(e_k^{(\beta)} = 0\right)$ for the L-values of the error bits. By construction, it is given by

$$L_k^{(\beta)} = \frac{1}{\sigma^2} \left( \mu(\hat{\mathbf{x}}) - \mu(\mathbf{x}^{(\beta)}) \right).$$

Note that $L_k^{(\beta)} > 0$ holds because $\hat{b}_k$ belongs to the maximum likelihood path that is by definition more likely than any other.

$^6$ The same arguments apply if we consider a symbol $\hat{x}_i$ of the transmit sequence.

It is important to note that all the corresponding probabilities are conditional probabilities because in any case it is assumed that one of the two sequences $\hat{\mathbf{x}}$ or $\mathbf{x}^{(\beta)}$ is the correct one. Furthermore, we only consider paths that merge directly with the ML path. Therefore, all paths that are discarded after comparing them with a path other than the ML path are not considered. It is possible (but not very likely in most cases) that the correct path is among these discarded paths. This rare event has been excluded in our approximation. We further assume that the random error bits $e_k^{(\beta)}$ are statistically independent. All the random error bits $e_k^{(\beta)}$ together result in an error bit $e_k$ that is assumed to be given by the modulo-2 sum

$$e_k = \bigoplus_{\text{relevant } \beta} e_k^{(\beta)}.$$

We further write $L_k = L(e_k = 0)$ for the L-value of the resulting error bit. Using Equation (3.14), the L-value for this resulting error bit is approximately given by

$$L_k \approx \min_{\text{relevant } \beta} L_k^{(\beta)} = \min_{\text{relevant } \beta} \frac{1}{\sigma^2} \left( \mu(\hat{\mathbf{x}}) - \mu(\mathbf{x}^{(\beta)}) \right),$$

where we have used Equation (3.32). It is intuitively simple to understand that this is a reasonable reliability information about the bit $\hat{b}_k$. We consider all the sequence decisions that are relevant for the decision of this bit. Then, according to the intuitively obvious rule that a chain is as strong as its weakest link, we assign the smallest of those sequence reliabilities as the bit reliability.
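The quality of the minimum approximation can be illustrated numerically. Equation (3.14) is not reproduced in this excerpt; the sketch below assumes it is the standard rule for the L-value of a modulo-2 sum of independent bits, $L = 2\,\mathrm{artanh}\left(\tanh(L_1/2)\tanh(L_2/2)\right)$. The function names and the three reliability values are hypothetical.

```python
import math


def boxplus(l1, l2):
    """Exact L-value of the modulo-2 sum of two independent bits."""
    return 2.0 * math.atanh(math.tanh(l1 / 2.0) * math.tanh(l2 / 2.0))


def combine_exact(l_values):
    """Fold the tanh rule over all relevant path reliabilities."""
    result = l_values[0]
    for value in l_values[1:]:
        result = boxplus(result, value)
    return result


def combine_min(l_values):
    """SOVA approximation for positive L-values: the weakest link decides."""
    return min(l_values)


path_reliabilities = [4.0, 1.5, 7.0]   # hypothetical values of L_k^(beta) > 0
exact = combine_exact(path_reliabilities)
approx = combine_min(path_reliabilities)
print(exact, approx)
```

For positive L-values the exact result never exceeds the minimum, so under the stated assumption the approximation is slightly optimistic, and the gap shrinks when one path reliability is much smaller than all the others.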
Now, in the Viterbi algorithm, the reliability information about the merging paths has to be stored for each state in addition to the accumulated metric and the pointer to the most likely preceding state. Then the reliability of the bits of the ML path will be calculated. First, they will all be initialized with $+\infty$, that is, practically speaking, with a very large number. Then, for each relevant decision between two paths, this value will be updated, that is, the old reliability will be replaced by the reliability of the path decision if the latter is smaller. To do this, every path corresponding to any sequence $\mathbf{x}^{(\beta)}$ that has been discarded in favor of the ML sequence $\hat{\mathbf{x}}$ has to be traced back to a point where both paths merge.

We finally note that the reliability information can be assigned to the transmit symbols $x_i \in \{\pm 1\}$ (i.e. the signs corresponding to the bits of the code word) as well as to the data bit itself.

3.2.4 MAP decoding for convolutional codes: the BCJR algorithm

To obtain LLR information about bits rather than about sequences, the bitwise MAP receiver of Equation (3.23) has to be applied instead of an MLSE. This equation cannot be applied directly because it would require an exhaustive search through all code words. For a convolutional code, the exhaustive search for the MLSE can be avoided in the Viterbi algorithm by making use of the trellis structure. For the MAP receiver, the exhaustive search can be avoided in the BCJR (Bahl, Cocke, Jelinek, Raviv) algorithm (Bahl et al. 1974). In contrast to the SOVA, it provides us with the exact LLR value for a bit, not just an approximate one. The price for this exact information is the higher complexity. The BCJR algorithm has been known for a long time, but it became very popular only with its widespread application in turbo decoding.

We consider a vector of data bits $\mathbf{b} = (b_1, \ldots, b_K)^T$ encoded to a code word $\mathbf{c}$ and transmitted with symbols $x_k$.
Given a receive symbol sequence $\mathbf{y} = (y_1, \ldots, y_N)^T$, we want to calculate the LLR for a data bit $b_k$ given as

$$L(b_k = 0|\mathbf{y}) = \log \frac{\sum_{\mathbf{b} \in B_k^{(0)}} P(\mathbf{b}|\mathbf{y})}{\sum_{\mathbf{b} \in B_k^{(1)}} P(\mathbf{b}|\mathbf{y})}. \tag{3.33}$$

Here, $B_k^{(0)}$ is the set of those vectors $\mathbf{b} \in B$ for which $b_k = 0$ and $B_k^{(1)}$ is the set of those for which $b_k = 1$. We assume that the bit $b_k$ is encoded during the transition between the states $s_{k-1}$ and $s_k$ of a trellis. For each time instant $k$, there are $2^m$ such transitions corresponding to $b_k = 0$ and $2^m$ transitions corresponding to $b_k = 1$. Each probability term $P(\mathbf{b}|\mathbf{y})$ in the numerator or denominator of Equation (3.33) can be written as the conditional probability $P(s_k s_{k-1}|\mathbf{y})$ for the transition between two states $s_{k-1}$ and $s_k$. Since the denominator in

$$P(s_k s_{k-1}|\mathbf{y}) = \frac{p(\mathbf{y}, s_k s_{k-1})}{p(\mathbf{y})}$$

cancels out in Equation (3.33), we can consider the joint probability density function $p(\mathbf{y}, s_k s_{k-1})$ instead of the conditional probability $P(s_k s_{k-1}|\mathbf{y})$.

Figure 3.12: Transition.

We now decompose the receive symbol vector into three parts: we write $\mathbf{y}_k^-$ for those receive symbols corresponding to time instants earlier than the transition between the states $s_{k-1}$ and $s_k$. We write $\mathbf{y}_k$ for those receive symbols corresponding to time instants at the transition between the states $s_{k-1}$ and $s_k$. And we write $\mathbf{y}_k^+$ for those receive symbols corresponding to time instants later than the transition between the states $s_{k-1}$ and $s_k$. Thus, the receive vector may be written as

$$\mathbf{y} = \begin{pmatrix} \mathbf{y}_k^- \\ \mathbf{y}_k \\ \mathbf{y}_k^+ \end{pmatrix}$$

(see Figure 3.12), and the probability density may be written as

$$p(\mathbf{y}, s_k s_{k-1}) = p(\mathbf{y}_k^+ \mathbf{y}_k \mathbf{y}_k^- s_k s_{k-1}).$$

If no confusion arises, we dispense with the commas between vectors. Using the definition of conditional probability, we modify the right-hand side and get

$$p(\mathbf{y}, s_k s_{k-1}) = p(\mathbf{y}_k^+ | \mathbf{y}_k \mathbf{y}_k^- s_k s_{k-1})\; p(\mathbf{y}_k \mathbf{y}_k^- s_k s_{k-1}),$$

[...]
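For a very short block, Equation (3.33) can be evaluated directly by the exhaustive search that the BCJR algorithm is designed to avoid. The sketch below is not the BCJR algorithm; it only illustrates the definition of the bitwise LLR. It assumes equally likely data words, so that $P(\mathbf{b}|\mathbf{y})$ is proportional to $\exp(\mu(\mathbf{x})/\sigma^2)$ for the AWGN channel with BPSK, and it uses the $(1 + D^2,\ 1 + D + D^2)$ code with two tail bits. All names and numerical values are illustrative.

```python
import math
from itertools import product

G = ((1, 0, 1), (1, 1, 1))  # taps of (1 + D^2, 1 + D + D^2)
M = 2                        # memory m


def encode_bpsk(bits):
    """Map data bits to BPSK symbols x = (-1)**c, starting in state 00."""
    padded = [0] * M + list(bits)
    x = []
    for i in range(M, len(padded)):
        window = padded[i - M:i + 1][::-1]       # b_i, b_{i-1}, b_{i-2}
        for taps in G:
            c = sum(g * w for g, w in zip(taps, window)) % 2
            x.append((-1) ** c)
    return x


def bitwise_llr(y, n_info, sigma2):
    """L(b_k = 0 | y) by summing over all terminated data words."""
    num = [0.0] * n_info                          # sums over words with b_k = 0
    den = [0.0] * n_info                          # sums over words with b_k = 1
    for info in product((0, 1), repeat=n_info):
        b = list(info) + [0] * M                  # append the tail bits
        mu = sum(yi * xi for yi, xi in zip(y, encode_bpsk(b)))
        p = math.exp(mu / sigma2)                 # P(b|y) up to a common factor
        for k, bit in enumerate(info):
            if bit == 0:
                num[k] += p
            else:
                den[k] += p
    return [math.log(n / d) for n, d in zip(num, den)]


sigma2 = 0.5
sent = [1, 0, 1, 1]
y = [float(v) for v in encode_bpsk(sent + [0] * M)]
y[2] = -0.2 * y[2]                                # one weak symbol with wrong sign
llr = bitwise_llr(y, len(sent), sigma2)
hard = [0 if value > 0 else 1 for value in llr]
print(hard, llr)
```

The effort of this search grows as $2^{K-m}$, which is why it is only feasible for toy examples.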
[...] addition of the bit tuples. Multiplication is defined as the multiplication of polynomials and reduction modulo $\alpha^3 + \alpha + 1$. The addition table is then given by

+  1  2  3  4  5  6  7
1  0  3  2  5  4  7  6
2  3  0  1  6  7  4  5
3  2  1  0  7  6  5  4
4  5  6  7  0  1  2  3
5  4  7  6  1  0  3  2
6  7  4  5  2  3  0  1
7  6  5  4  3  2  1  0

and the multiplication table by

1  2  3  4  5  6  7
2  4  6  3  1  7  5
3  6  5  7  4  1  2
4  3  7  6  2  5  1
5  1  4  2  7  3  6
6  7  1  5  3  2  4
7  5  2  1  6  4  3

We can visualize the multiplicative structure of GF(8) [...]

[...] elements 0, 1, 2, 3, 4, 5, 6 is defined by the addition table

+  1  2  3  4  5  6
1  2  3  4  5  6  0
2  3  4  5  6  0  1
3  4  5  6  0  1  2
4  5  6  0  1  2  3
5  6  0  1  2  3  4
6  0  1  2  3  4  5

and the multiplication table

1  2  3  4  5  6
2  4  6  1  3  5
3  6  2  5  1  4
4  1  5  2  6  3
5  3  1  6  4  2
6  5  4  3  2  1

Note that every field element must occur exactly once in each column or row of the multiplication table to ensure the existence of a multiplicative [...]

[...] with the construction of the RS(7, 5, 3) code over GF(8). We want to encode $K = 5$ useful data symbols $A_i$, $i = 0, 1, 2, 3, 4$, to a code word of length $N = 7$. The polynomial $A(x) = A_0 + A_1 x + A_2 x^2 + A_3 x^3 + A_4 x^4$ of degree 4 cannot have more than four zeros. Thus, $a_j = A(\alpha^j)$ cannot be zero for more than four values of $j$. Then the time domain vector $\mathbf{a} = (a_0, a_1, a_2, a_3, a_4, a_5, a_6)^T$ has at [...]

[...] either $|x|$ [...] $|y|$ or $|x|$ [...] $|y|$.

Theory and Applications of OFDM and CDMA, Henrik Schulze and Christian Lüders, 2005 John Wiley & Sons, Ltd

4 OFDM

4.1 General Principles

4.1.1 The concept of multicarrier transmission

Let us consider a digital transmission scheme with linear carrier modulation (e.g. M-PSK or M-QAM) and a symbol duration denoted by $T_S$. Let $B$ be the occupied bandwidth. Typically, $B$ is of the order of $T_S^{-1}$, for example, $B = (1 + \alpha)\, T_S^{-1}$ for raised-cosine pulses with rolloff factor $\alpha$. For a transmission [...]

[...] matched to the channel and parallelize the given data stream in an appropriate way.

Figure 4.1: The multicarrier concept (time-frequency plane for serial and for parallel symbol duration).

There are two possible ways to look at (and to implement) this idea of multicarrier transmission [...]

[...] baseband and to the center frequency $f_c$ in the passband, with negative $k$ for the lower sideband and positive $k$ for the upper sideband. For reasons of symmetry, we may then choose the number of carriers to be $K + 1$, where $K$ is an even integer, and [...]

Figure 4.2: Block diagram for multicarrier transmission: Version 1.

[...] Equation (4.1). Such a time-frequency-dependent phase rotation does not change the performance, so both methods can be regarded as equivalent. However, the second point of view (the filter bank) is closer to implementation, especially for the case of OFDM, where the filter bank is just an FFT, as will be shown later. In the following discussion, we will refer to the second point of view.

4.1.2 OFDM as multicarrier [...]

[...] With

$$f_k = k\, \frac{1 + \alpha}{T_S},$$

we define $g_k(t) = e^{j 2\pi f_k t}\, g_0(t)$ and $g_{kl}(t) = g_k(t - l T_S)$. Since these pulses are strictly separated in frequency for different $k$, it is obvious that the condition (4.3) is fulfilled. This multicarrier modulation setup is depicted in Figure 4.4.

Figure 4.4: Multicarrier spectrum for $\alpha = 0.5$ and $\alpha = 0$, with (a) $B = (1 + \alpha)/T_S$ and (b) $B = 1/T_S$.

In this figure, we have replaced the raised-cosine [...]

[...] time interval shifted by $l T_S$. We note that for this narrow-sense OFDM the two concepts of Figures 4.2 and 4.3 are equivalent because $f_k = 1/T = 1/T_S$ holds. This property will be lost when a guard interval is introduced (see Subsection 4.1.4). The power density spectrum of an OFDM signal for $K + 1 = 97$ subcarriers is depicted in Figure 4.6. On the linear scale, it looks very similar to a rectangular spectrum [...]

[...] the linear scale is indeed a very flattering presentation of the OFDM spectrum. Note that [...]

Figure 4.6: The power density spectrum of an OFDM signal on a linear scale (a) and on a logarithmic scale (b). Horizontal axis: normalized frequency $fT$; vertical axis in (b): power spectrum [dB].