METHOD OF AND APPARATUS FOR IMPROVING COMPUTATIONAL EFFICIENCY IN SEQUENCE ESTIMATION
The present invention relates to a method of and apparatus for reducing the computational load in estimating the most likely received sequence of symbols in a telecommunications system where coding during transmission places constraints of the sequence of symbols that is transmitted.
In the real world, signals propagating between a transmitter and receiver are subject to interference, noise, and multipath distortion. Multipath distortion arises because the signal may undergo multiple reflections. This can result in the received signal being the sum of the transmitted signal and a time delayed (i.e. earlier transmitted) version of the signal
Received Signal{t) = transmitted signal(t - direct path) + a transmitted signal(t - multipathl) +β transmitted signal(t - multipath2 )+ ....
where α and β are complex coefficients.
Techniques have been devised to compensate for multipath distortion and to allow it to be characterized. Commonly this is done by inserting a known training sequence into the center of each data burst. The training sequence can be detected and used to characterise the path. Once this characterization has been done known equalisation techniques can be then used.
Looking at the multipath problem in more detail, it can be seen that the symbol received by a receiver is the result of convolving the transmitted signal with the impulse response of the transmission channel.
For a sampled data system we can consider the signals in their discrete time notation. For simplicity the channel response will be assumed to be time invariant although in practice the channel response may be slowly time varying.
The received signal can be represented by
y[n]= ∑({k] x[n-k] k
and in general y, c and x are complex numbers.
This can be seen in the example shown in Figure 1 which schematically illustrates a transmission path from a transmitter 1 to a receiver 2. In this simplified example a direct path 4 exists but a further path 6, introducing multipath distortion, also exists. The reflector 8 that gives rise to the multipath distortion may have a surface with a complex dielectric coefficient such that the multipath signal has a phase change and amplitude change with respect to the direct signal as well as a time delay.
Figure 2 shows a constellation diagram for a transmission scheme having eight individual symbols equally spaced around a unit circle. The effect of multipath distortion can be seen if, for simplicity, we suppose that the multipath delay introduces a delay equivalent to twice the inter symbol period of the transmission scheme and the reflector introduces a 90° phase shift and reflects 50% of the signal towards the receiver.
Suppose that a sequence of 3 symbols, SO, S4 and S5 are transmitted from the transmitter. The real components of the transmitted symbols would be
Real sequence = 1, — 1, - 0.71
and the imaginary sequence would be
Imaginary sequence = Oj, Oj, - 0.7 Ij
The sequence received from the reflector is equivalent to multiplying each of these sequence streams by 0.5j where j is the imaginary operator, and delaying it by two symbols
So the component from the real sequence becomes
0j 0j, 0.5j, - 0.5j, -0.36j
and the component due to the imaginary sequence becomes
0, 0, 0, 0, +0.36
adding these back together, the real component becomes
+1, - 1, -0.71, 0, 0.36
and the imaginary component is
Oj, Oj, {-0.7 j + OJj), - 0.5 j, - 0.36 j
Thus, the original burst of symbols, labeled Tl, T2 and T3 in Figure 3 gives rise to five received symbols Rl to R5 when received at the receiver. It can be seen that the delayed version of the first transmitted symbol interferes with the reception of the third symbol causing the received symbol R3 to be displaced from its true position in the constellation diagram as marked T3. Similarly any subsequent transmitted symbols can be interfered with by received symbols from the multipath direction, R4 and R5.
According to a first aspect of the present invention, there is provided a method of estimating a sequence of symbols, where the symbols are subject to multipath distortion in a transmission channel, the method comprising pre-calculating a plurality of products of a transmitted symbol and channel coefficients.
It is thus possible to use knowledge that some of the intermediate calculations used in estimation of the received signals will be repeatedly used, and hence pre-calculation of these intermediate calculations delivers savings in the computational burden.
Preferably the multipath distortion is compensated for using an equalisation technique.
According to a second aspect of the present invention there is provided an apparatus for estimating a sequence of symbols where the symbols are subject to multipath distortion in a transmission channel, wherein the apparatus is arranged to pre-calculate a plurality of products of a transmitted symbol and channel coefficients.
Preferably the apparatus is an EDGE enabled receiver.
According to a third aspect of the present invention there is provided a computer program for causing a programmable data processor to perform the method as claimed in claim 1.
The present invention will be described, by way of example, in which:
Figure 1 schematically illustrates a situation in which multipath distortion can occur;
Figure 2 illustrates a constellation diagram used in a telecommunication transmission scheme;
Figure 3 schematically illustrates the effect of multipath distortion on signals received at a receiver, the signals being represented on a constellation diagram;
Figure 4 is a circuit diagram of a convolutional encoder;
Figure 5 is a diagram of the state transitions and output symbols for the convolutional encoder shown in Figure 4;
Figure 6 shows a first step in a trellis decoding scheme;
Figure 7 shows a second step in a trellis decoding scheme;
Figure 8 shows a third step in a trellis decoding scheme; and
Figure 9 is a schematic diagram of a receiver constituting an embodiment of the present invention.
The Viterbi algorithm is well known for its use in decoding data streams which have been subjected to forward error correction in order to enhance communication performance in the presence of the noise and distortion, such as multipath distortion, arising from the communications channel. Forward error correction can be performed using convolutional encoding.
Convolutional encoding can be performed verily easily in hardware. It is generally performed using a series of shift registers in conjunction with the modulo 2 adders. The modulo 2 adders can be implemented as a cascade of exclusive - OR gates.
Viterbi decoding is a complex process, so it is useful to consider a simple example.
Convolutional Encoding
A simple convolutional encoder is in Figure 4. The encoder comprises two delay elements 10 and 12 connected in series. Each delay element introduces a delay of 1 clock cycle T which is also commonly denoted as Z"1. The undelayed input, the input delayed by Z"1 and a copy of the input delayed by Z"2 (2 delays) is summed by a first modulo 2 adder 14. A second modulo 2 adder receives the undelayed input and a copy of the input delayed by Z"2.
This encoder has a constraint length of 3, and encodes the K = 3 (7,5) convolutional code where the bracketed numbers 7 and 5 represent the code generator polynomials 111 and 101 which correspond to the shift register connections.
The encoder has an output of two bits for every encoded bit.
The operation of this encoder in response to an input signal train can be analysed in terms of a timing diagram. From such analysis it can be determined that the encoder operates as a state machine. At each clock signal the state machine transits from one predefined state to another, and on entering a state asserts a new output as a function of the input signal.
For the circuit shown in Figure 4, the state transition table is:
Next state if
Current state Input = 0 Input = 1
00 (SO) 00 10
01 (Sl) 00 10
10 (S2) 01 11
11 (S3) 01 11
And the output from the circuit is defined by the following output table
Output if
Current state Input = 0 Input = 1
00 00 11
01 11 00
10 10 01
11 01 10
These transition tables and output tables can be represented by a trellis as shown in Figure 5. On each arrow from one state the designation a/bc represents as "a" the input which is 1 or 0, and "be" represents the output from the circuit. Thus Figure 5 re- represents the data in the transition and output tables.
Consider an input sequence 01011, it will give rise to an output sequence 00 11 10 00 01. This can be seen since, stating with an initialized encoder (full of O's) such that it starts in state 00, the input being 0 causes an output "00" and the next state to be state 00. Next the input becomes as 1 so the output becomes "11" and the next state is 10, and so on.
Consider now the process of decoding the encoded sequence. We start a time t = 0 with an initialized decoder in the state "00", then the only bit pair (output symbol) that can be output from this state is "00" or "11".
The Viterbi decoding algorithm keeps a running total of a metric used to measure the "distance" or degree of mismatch between the symbol pair that was actually received and the possible symbol pairs that could have been received. In this example a simple metric will be used, namely the sum of the bit difference.
It needs to be remembered that the symbol or bit pairs output from the convolutional encoder are, for the purposes of transmitting them in the communications channel mapped onto an appropriate symbol in the constellation diagram. In this example considered here the convolutional encoder can output "00", "01", "10" and "11" and hence only four symbols would be required to represent these possible outputs. However more complex schemes exist for both convolutional encoders and constellation diagrams but the principles exemplified here are capable of extension to the more complex schemes.
At a receiver the symbols of the constellation diagram are mapped back to the bit sequences output by the encoder.
Figure 6 is a graphical representation of the decoding process. From the initialised state at t = 0, the decoder receives a first symbol "00" at t = 1. It knows that the only valid output symbols from the initial state are 00 and 11 , and hence it can compare the expected symbols with the received signals to calculate an error metric. The difference between "00" and "00" is zero, where the difference between "11" and "00" is 2 because two bits are different. We also know that receiving "1 1" from the encoder causes a jump from state 00 to state 10 because its input would have been a "1". Thus the diagram in Figure 6 also shows two possible states at time t = 1.
At time t = 2 (Figure 7) the decoder receives a symbol "11". The "allowed" symbol pairs for this transition are:
if in state 00, 00 if the input is 0, and 11 if the input is 1 with the next state being 00 if the input is 0, and 10 if the input is 1 ; if in state 10, 10 if the input is 0, and 01 if the input is 1 with the next state being 01 if the input is zero, and 11 if the input is 1.
These can be entered onto the decision graph, as shown in Figure 7.
It is instructive to consider the next state, at time t = 3, as shown in Figure 8.
for state 00 the next allowable state is 00 if the input to the encoder was 0, and the received symbol would be "00". However the other next allowable state is 10 if the input to the encoder was 1, and the output symbol would have been 11 ; for state 01, the next allowable state is 00 if the input signal to the encoder was 0, and the received symbol would be 11. However the alternative allowable next state is state 10 if the input to the encoder was
1, in which case the symbol received would be 00; for state 10 the next allowable state is 01 if the input to the encoder was 0, in which case the received symbol would be "10", whereas an
alternative allowable transition would be state 11 if the input to the encoder 1, in which case the received symbol would be 01; finally for state 11, the next allowable transition is to state 01 if the input to the encoder is zero, in which case the received symbol would be 01, or to state 1 1 if the input to the encoder was 1 , in which case the received symbol would be 10.
By following the lowest cost path through the "trellis" we can tell that the decoded bits, thus far, are 010.
Each position in the constellation diagram corresponds to one of the states in the decoding trellis (thus Figures 5 to 8 need expanding to uniquely identify eight states so as to match up with Figure 2).
In the example of decoding the received symbols given above, each received symbol was assumed to lie exactly at the appropriate place for it on the constellation diagram. However, by referring back to Figure 3 it can be seen that multipath distortion provides a mechanism in which the received symbols become displaced from their ideal position in the constellation diagram.
Consequently the input supplied to the Viterbi decoder represents the convolution of the transmitted signal and the impulse response of the transmission path.
The channel impulse response can be estimated by known techniques and consequently the estimation of the channel impulse response need not be described here. However the known techniques involve the transmission of a predefined training sequence. The information content (and hence symbol sequence) is known and hence the channel state can be estimated.
During data reception a list of possible previous states is used to generate a set of anticipated signals which are supplied to the decoder. In the case of signals undergoing multipath (or similar) distortion it is desirable to include the effects of the multipath distortion during the generation of the anticipated signals.
To perform these calculations the contributions from each of the possible previous states are convolved with the channel impulse response to produce a contribution to the anticipated signal. This is normally done one state at a time. A comparison is then made between the anticipated signal and the received signal to derive a measurement of the difference between these signals.
Of course, the convolution due to multipath interference has be done for all possible transmitted signals.
If h is an array of T channel impulse response taps (i.e. an array which represents quantised delays applied to the transmitted signal), c is array of S ideal transmitted symbols (i.e. the symbols on the constellation diagram), and x is an array consisting of a path of T possible transmitted signals, the signal b that would be anticipated by this path is given by
This signal is compared with the received signal for each path in turn, and the signal that produces the closest match is the one that is selected.
During the Viterbi decoding process, the decoder keeps a running list of possible states and the symbols that would be expected when making a transition from one state to another. Hence the Viterbi decoding process inherently keeps a representation of the signal at different points in time, or as another way of looking at it, with various time delays applied.
In, for example, a receiver having five taps and arranged to work with a transmission scheme having eight symbols it is possible for the decoder to examine all of the possible sequences of the transmitted signals.
In practice each one of the eight symbols is decoded as a sequence of three bits, and the output of the decoder is a bit stream.
Consequently in order to account for multipath distortion it is necessary to re-map the possible transmitted bit sequences back onto the symbols that would have been transmitted in order to encode that bit sequence. The symbols are in a sequence as determined by the bit sequence. The symbol sequence is then presented to the delay calculator which applies the various delays that occur in the real transmission channel, and each respective delay tap is then multiplied by the estimate of the channel response coefficients to derive a value b which corresponds to the value of the next signal (or current signal) that the receiver is expected to receive. It follows that if the estimate of the transmitted bit sequence as determined by the decoder exactly matches that as encoded by the transmitter and if the channel impulse response was exactly modeled with the receiver than the estimate for the symbol "b" would exactly match that which was received by the receiver. Since these conditions are unlikely to be fulfilled the receiver merely notes those bit sequences which give rise to the closest match or matches between the estimated signal b and the actual signal at the relevant moment in time.
Although there are ways to reduce the amount of computation, the central part of processing a Viterbi algorithm is still evaluating the sum of the products /?[/] c[x\i] ] . This evaluation consists of repeated fetch-multiply-accumulate operations. Although all three of these operations are done quickly in modern digital signal processor, the time taken to complete all these operations can still be significant.
For an 8psk equalizer, such as that used in an EDGE receiver for EGPRS decoding, a large number of paths is needed to get a good result. Even with a reduced-state equalizer, it is still computationally intensive. For example, a Viterbi processor with 5 taps on the channel impulse response, and 8 possible states, the calculation of each path requires 5 fetch-multiply-accumulate operations, and there are 85 paths to calculate - a total of 32,768 paths, or 163,840 fetch-multiply-accumulate instructions. Each path is given by
h [θ] c [x[θ]]+ h[l] c [x[l]]+h [2]c[x[2]]+ h [s]c [x[i]]+ h [4]c [x [4]) .
where
ψ[ø]] represents the estimate of the 0th transmitted symbol;
c[χ[l]] represents the estimate of the 1st transmitted symbol;
and so on
In the improved method of calculation according to an embodiment of the present invention the amount of computation required in a Viterbi decoder for communications is reduced by pre-calculating some of the steps. Since the value of h[a] c[x[b]] is always constant for a given tap "a" and symbol "b", two or more pre-calculated values are stored in an array which has as many dimensions as the number of steps to be calculated.
Alternatively, further reductions in computation may be obtained by storing additional tables and summing the contents depending on the data - until the limit is reached where there is one table with as many dimensions as the number of taps. In that case there is no addition either - simply a fetch of the appropriate path from the table.
Consider a receiver, such as an EDGE receiver, as shown in Figure 9 which contains an RF front end 2, a demodulator where the channel state correction is implemented with five taps giving five time delay versions of the assumed transmitted sequence that, when convolved with the channel impulse response, gives rise to the received sequence of symbols.
The impulse response for the channel can be assumed to have been estimated. Thus the value
h [a] c[x[b ]]
is always constant for a given tap and symbol b
Thus, rather than calculating this value on the fly each time it is required, the value can be pre-calculated and need only be updated if it is determined that channel's impulse response has changed. The pre-calculated values can be stored in a rewritable memory 8.
In an embodiment of the invention there are tables hx_01 and hx_234, defined as
and
The computation for a single path is
hx_0l[x[0]][x[l]]+ hx_ 234 [x[2]][x[3]][x[4}]
which requires two fetches, two accumulates, and no multiplications. So the total number of operations has changed from 163840 (comprising fetch, multiply and accumulate operations to 65536 (comprising just fetch and accumulate operations). This is a saving of 60% even if the multiplications take the same time as other operations, and considerably more than 60% if the multiplies take significantly more time as is the case with most processors.
Alternative implementations are possible. For example the number of possible solutions that need to be tested can be reduced by assuming that earlier symbols have already been decoded with a high degree of confidence. This can be illustrated by considering a decoder that operates on six symbols. Each symbol has eight possibilities so in a full trellis decode the possible number of solutions or paths is 8 , or 262144 paths. However, if the system designer had sufficient confidence, he could consider only the first two symbols as being uncertain and the remaining four symbols as given. In this case only 82 = 64 paths need to be calculated because all 64 paths start with the same sequence of four symbols. This is known as "reduced state" Viterbi. E-GPRS equalization in a handset may implement the full trellis only between the first few taps, then a small number of paths for the next few taps then finally only one path for the last few taps. The main computation effort is in the full trellis computation; then comes the small number of paths, then finally the single path. In one example for an EDGE demodulator for a channel impulse response with 7 taps, the full trellis will be computed for the first two taps, two paths for the next three taps, and one path for the final two taps.
A lookup table of hx_01 is used to compute the full trellis: this saves two complex multiplications and a complex add, which corresponds to about 75% of the computation effort.
A lookup table of hx_234 is used for the next 3 taps, saving the same amount of effort for a smaller task. This saves about 15% of the computation effort.
Finally the last two taps are computed "longhand", because there is no advantage to doing them any other way.