RU2012051C1

RU2012051C1 - Device for fast fourier transform

Info

Publication number: RU2012051C1
Authority: RU
Inventors: Александр Анатольевич Оленев; Николай Иванович Червяков; Виктор Васильевич Бережной; Николай Павлович Микула; Юрий Иванович Николаев
Original assignee: Александр Анатольевич Оленев; Николай Иванович Червяков; Виктор Васильевич Бережной; Николай Павлович Микула; Юрий Иванович Николаев
Priority date: 1991-07-08
Filing date: 1991-07-08
Publication date: 1994-04-30

Abstract

FIELD: computer engineering. SUBSTANCE: device has residue computation unit 1, n arithmetic units 2.1-2. n, coefficient storage unit 5, result recovery unit 6, multiplexer 4. Additional arithmetic unit 2 and corresponding functional connections are introduced to accomplish the goal of invention. Single processor with rounding channel is used for implementation of algorithm for computing convolution together with procedure for rounding. EFFECT: simplified design. 3 dwg

Description

Изобретение относится к области цифровой обработки сигналов и может быть использовано при реализации алгоритмов спектрального анализа, цифровой фильтрации устройств с заданными временными характеристиками на специализированных вычислительных устройствах. The invention relates to the field of digital signal processing and can be used in the implementation of spectral analysis algorithms, digital filtering of devices with specified time characteristics on specialized computing devices.

Предлагаемое устройство в своей работе использует процесс цифровой фильтрации, заключающийся в выполнении свертки
Y_s=

X_jω_j, где Х_j - входные отсчеты;
ω_j - некоторые коэффициенты, и о выполнении округления полученного значения
Y

\binom{*}{s}

=

, где К - величина, показывающая во сколько раз необходимо сократить диапазон представления y_s.The proposed device in its work uses the digital filtering process, which consists in the implementation of convolution
Y _s =

X _j ω _j , where X _j are input samples;
ω _j are some coefficients, and about the rounding off of the obtained value
Y

\binom{*}{s}

=

, where K is a value showing how many times it is necessary to reduce the range of representation y _s .

С помощью устройства становятся возможными распараллеливание выполняемых арифметических операций, реализация их табличным способом, реализация округления и образное преобразование на том же устройстве, на котором выполняется процесс вычисления свертки. Последнее является немаловажным, так как процедура округления и преобразования в позиционную систему счисления требует аппаратурных затрат, эквивалентных выполнению операций свертки. Обычно достаточно использовать 5-7 оснований, чтобы процедура вычисления свертки была длиннее всего на 3-5 шагов. Однако за счет выполнения операций табличным или матричным способом общее время выполнения свертки гораздо меньше, чем на позиционном процессоре практически при тех же затратах. Using the device, it becomes possible to parallelize arithmetic operations, implement them in a tabular manner, implement rounding and image conversion on the same device on which the convolution calculation process is performed. The latter is important, since the procedure of rounding and conversion to a positional number system requires hardware costs equivalent to performing convolution operations. Usually it’s enough to use 5-7 bases so that the convolution calculation procedure is only 3-5 steps longer. However, due to the performance of operations in a tabular or matrix manner, the total convolution time is much shorter than on a positional processor at almost the same cost.

На фиг. 1 приведена функциональная схема устройства для быстрого преобразования Фурье; на фиг. 2 - вариант выполнения арифметичесокго блока; на фиг. 3 - вариант выполнения (n+1)-го арифметического блока. In FIG. 1 shows a functional diagram of a device for fast Fourier transform; in FIG. 2 is an embodiment of an arithmetic block; in FIG. 3 - embodiment of the (n + 1) -th arithmetic block.

Устройство содержит (фиг. 1) блок 1 определения вычетов, арифметические блоки 2.1-2(n+1), входную шину 3 данных, мультиплексор 4, блок 5 хранения коэффициентов, блок 6 восстановления результата, шину 7 адреса коэффициентов, управляющие шины 8, 9, 10, выходную шину 11 данных. The device contains (Fig. 1) a block 1 for determining residues, arithmetic blocks 2.1-2 (n + 1), an input data bus 3, a multiplexer 4, a coefficient storage unit 5, a result recovery unit 6, a coefficient address bus 7, control buses 8, 9, 10, output data bus 11.

В состав i-го (i=

) арифметического блока (фиг. 2) входят блок 12 коммутации, блок 13 умножения, сумматор 14, регистр 15, выходная шина 16 блока 2, общая управляющая шина 17 блоков 12 коммутаций для арифметических блоков 2. р-1-2. n и управляющие шины регистров 15. i, шина 18 обнуления, шина 19 тактовых импульсов записи.The composition of the i-th (i =

) arithmetic unit (Fig. 2) includes a switching unit 12, a multiplication unit 13, an adder 14, a register 15, an output bus 16 of a unit 2, a common control bus 17 of the switching units 12 for arithmetic units 2. p-1-2. n and the control bus of the registers 15. i, bus 18 zeroing, bus 19 clock pulses recording.

Арифметический блок 2. (n+1) (фиг. 3) влкючает блок 20 умножения, сумматор 21, регистр 22, блок 23 хранения констант, при этом блок 20 умножения, сумматор 21 и регистр 22 имеют такое же построение, как и в блоке 2. i, отличие составляет блок 23 хранения констант с управляющей шиной 24, являющейся частью адреса. Arithmetic block 2. (n + 1) (Fig. 3) includes a multiplication block 20, an adder 21, a register 22, a constant storage unit 23, while the multiplication unit 20, the adder 21 and the register 22 have the same construction as in the block 2. i, the difference is the constant storage unit 23 with the control bus 24, which is part of the address.

Блок 1 определения представляет собой n блоков определения вычетов поступающих отсчетов по соответствующим основаниям р_i(i=

) которые могут быть реализованы табличными методами на базе ППЗУ. В этом случае входные коды отсчетов являются адресами соответствующего кода вычетов, записанного в ППЗУ на этапе программирования. Аналогично поступает и блок 5 хранения коэффициентов, состоящий из n+1 блоков 5.1-5. n хранения коэффициентов по соответствующим основаниям р_i(i=

), а блок 5. n+1 блок хранит остатки от деления

, например

= 3. Требуемый коэффициент из n ППЗУ выбирается по адресу, поступающему по шине 7 адреса коэффициентов или шине 10.Block 1 definition represents n blocks determining the deduction of incoming samples on the corresponding grounds p _i (i =

) which can be implemented by tabular methods based on the ROM. In this case, the input sample codes are the addresses of the corresponding residue code recorded in the ROM at the programming stage. The coefficient storage unit 5, consisting of n + 1 blocks 5.1-5, does the same. n storage coefficients on the corresponding grounds p _i (i =

), and block 5. n + 1 block stores the remainder of division

, eg

= 3. The required coefficient of n EPROMs is selected at the address received via the coefficient address bus 7 or the bus 10.

Блоки 13. i умножения, сумматоры 14. i(i=

), выполняющие соответствующие арифметические действия умножения, сложения по соответствующим основаниям p_i(i=

) могут быть реализвоаны табличным методом на базе ППЗУ. В этом случае операнды для выполнения соответствующего арифметического действия будут адресом результата арифметического действия.Blocks 13. i multiplications, adders 14. i (i =

) performing the corresponding arithmetic operations of multiplication, addition on the corresponding bases p _i (i =

) can be implemented by the tabular method based on the ROM. In this case, the operands for performing the corresponding arithmetic operation will be the address of the result of the arithmetic operation.

Блок 12. i коммутации для арифметических блоков 2. р+1-2. n является известным функциональным устройством, где выбор входной шины определяется номером на шине 17, например потенциал логического "0" на шине 17 всегда коммутирует первый вход на выход блока 12. i коммутации, а потенциал логической "1" - второй. Block 12. i switching for arithmetic blocks 2. p + 1-2. n is a known functional device, where the choice of the input bus is determined by the number on the bus 17, for example, the potential of the logical "0" on the bus 17 always switches the first input to the output of the block 12. i switching, and the potential of the logical "1" is the second.

Блок 23 хранения констант может быть реализован с помощью ППЗУ. В этом случае операции, записанные в регистре 22 при появлении разрешающего сигнала на 24 входе, являются адресом результата арифметического действия. The constant storage unit 23 may be implemented using an EEPROM. In this case, the operations recorded in the register 22 when the enable signal appears at the 24th input are the address of the result of the arithmetic operation.

Известным цифровым устройством является и мультиплексор 4. В зависимости от кода на адресной шине 9 управления один из входов коммутируется на выход. A well-known digital device is the multiplexer 4. Depending on the code on the address bus 9 control one of the inputs is switched to the output.

Блок 6 восстановления результата реализует преобразование конечного результата процессов свертки и округления из непозиционной системы счисления в позиционную - он также может быть реализован табличным методом на базе ППЗУ. В этом случае входные данные являются адресом результата данного процесса преобразования. Считывание информации с блока 6 восстановления результата разрешается подачей списка на управляющий вход 8. The result recovery block 6 implements the conversion of the final result of the convolution and rounding processes from a non-positional number system to a positional one - it can also be implemented by the tabular method based on the ROM. In this case, the input is the address of the result of this conversion process. Reading information from the block 6 recovery result is allowed by submitting a list to the control input 8.

Устройство работает следующим образом. The device operates as follows.

В исходном состоянии регистры 15. i обнулены, что производится подачей сигнала на шину 18. На управоляющие входы блоков 12. i коммутации (i=

) подается сигнал, под действием которого происходит коммутация блоков 12. i коммутации на соответствующие выходы данных блоков.In the initial state, the registers 15. i are reset, which is done by applying a signal to the bus 18. To the control inputs of the blocks 12. i switching (i =

) a signal is supplied, under the action of which the blocks 12 are switched. i switching to the corresponding outputs of these blocks.

Под действием кода первого входного отсчета Х1 через время выборки адреса на n выходах блока 1 определения вычетов появляются коды вычетов входного отсчета Х₁ по соответствующим модулям: p_i(i=

)

Х₁

.Under the action of the code of the first input sample X1, after the address is sampled at the n outputs of the residue determination unit 1, the residue codes of the input sample X ₁ appear on the corresponding modules: p _i (i =

)

X ₁

.

Через некоторое время, определямое задержкой на блоках коммутации, данные коды поступают на вход блока 13. i умножения. К этому времени на выходе блоков 5. i появляется значение первого коэффициента W₁, так как одновременно с коммутацией подается сигнал на вход 7, подается адрес первого коэффициента. Через время срабатывания блока 13. i умножения на его выходах появляется результат умножения

Y_s,j W

по соответствующим модулям, который подается на первый вход сумматора 14. i, на второй вход окторого подается код нуля с регистра 15. i, так как регистр был обнулен. В результате после сложения результат записывается в регистр 15. i по импульсу 19.After some time, determined by the delay on the switching units, these codes are fed to the input of unit 13. i multiplication. By this time, at the output of blocks 5. i, the value of the first coefficient W ₁ appears, since simultaneously with the switching, a signal is supplied to input 7, the address of the first coefficient is supplied. After the response time of block 13. i of multiplication, the result of multiplication appears at its outputs

Y _{s, j} W

according to the corresponding modules, which is fed to the first input of the adder 14. i, the zero input from register 15. i is sent to the second input of the second, as the register has been reset. As a result, after addition, the result is recorded in register 15. i by pulse 19.

Работа устройства реализуется в конвейерном режиме, т. е. в момент поступления кода входного отсчета

Х

на входы соответствующих блоков 13. i умножения, код второго отсчета подается на входную шину 3 данных и далее весь процесс повторяется аналогично выше описанному (по шине 7 подается адрес второго коэффициента W₂ и т. д. ). Процесс выполнения свертки реализуется до тех пор, пока N-м тактовым импульсом не будет записан результат вычисления свертки

Y

=

Y

\binom{′}{s}

_,j·W

в соответствующие регистры 15.The operation of the device is implemented in the conveyor mode, i.e., at the time of receipt of the input code

X

to the inputs of the corresponding blocks 13. i multiplication, the code of the second sample is supplied to the input data bus 3 and then the whole process is repeated similarly to the above (bus 7 is the address of the second coefficient W ₂ , etc.). The convolution process is implemented until the result of the convolution calculation is recorded by the Nth clock pulse

Y

=

Y

\binom{′}{s}

_{, j}

to the respective registers 15.

Этап округления начинается по завершении последней (N-й) операции умножения по модулю p_i в соответствующих блоках 13. i, на управляющие входы блоков 12. j коммутации (j=

) подается сигнал, под действием которого происходит коммутация информационного входа мультиплексора 4 на вход коммутатора 12. i, а также подается сигнал на соответствующие управляющие входы 9 мультиплексора и на блок 5. n+1 хранения констант, при этом вступает в работу мультиплексор 4, подключая к коммутаторам 12. j(j=

), а также к первому входу блока 20 умножения первый канал. Значение вычисленной свертки y_s,i подается на первые входы блока 13. i (i=

) умножения через блоки 12. i коммутации, на второй вход которого с подачей определенного адреса на шину 10 с выхода блока 5. i выдается

С_к+1

, а на выходе блока 5. n+1-K. k+1, далее производится сложение в сумматоре 14. j (j=

) со значением Y_s,j (j=

). Далее процесс повтоярется в конвейерном режиме с подачей другого управляющего сигнала на управляющую шину 9 мультиплексора 4, на информационный выход подключается второй каналы и т. д. до р. После прохождения р тактов производится анализ необходимости коррекции полученного результата, т. е. происходит накопление в 2. n+1 арифметическом блоке в регистре 22. На (p+1)-м такте поступает разрешающий сигнал 24 на вход блока 23 хранения констант, на выходе которого появляется результат необходимости коррекции

K_k+j, с приходом р+1 на вход мультиплексора на выход его подается результат коррекции

K_k+j и происходит сложение с полученными результатами. На вход блоков умножения в этом случае подается код 1 (единицы).The rounding stage begins when the last (Nth) operation of multiplication modulo p _i in the corresponding blocks 13. i, to the control inputs of the blocks 12. j switching (j =

) a signal is supplied under which the information input of the multiplexer 4 is switched to the input of the switch 12. i, as well as a signal is sent to the corresponding control inputs 9 of the multiplexer and to the block 5. n + 1 storage of constants, while the multiplexer 4 comes into operation, connecting to the switches 12. j (j =

), as well as to the first input of the multiplication unit 20, the first channel. The value of the calculated convolution y _{s, i} is fed to the first inputs of block 13. i (i =

) multiplication through blocks 12. i switching, the second input of which with the supply of a specific address to the bus 10 from the output of block 5. i is issued

C _{to + 1}

, and at the output of block 5. n + 1-K. k + 1, then addition is done in the adder 14. j (j =

) with the value Y _{s, j} (j =

) Next, the process is repeated in the conveyor mode with the supply of another control signal to the control bus 9 of the multiplexer 4, the second channels are connected to the information output, and so on. After passing p cycles, the analysis of the need to correct the result is performed, that is, accumulation occurs in 2. n + 1 arithmetic units in register 22. At the (p + 1) -th cycle, an enable signal 24 is received at the input of the constant storage unit 23, the output of which appears the result of the need for correction

K _{k + j} , with the arrival of p + 1, the correction result is fed to the input of the multiplexer;

K _{k + j} and addition occurs with the results. In this case, code 1 (units) is fed to the input of the multiplication blocks.

После выполнения всей процедуры округления в регистрах 15. i остается результат выполнения всего алгоритма

Y

(i=

).After completing the whole rounding procedure in registers 15. i the result of the entire algorithm remains

Y

(i =

)

Для преобразования результата из непозиционной в позиционную систему счисления подается сигнал на блок 6 восстановления результата. Через время срабатывания этого блока на выходе появляется искомый результат. To convert the result from a non-positional to a positional number system, a signal is sent to the result recovery unit 6. After the response time of this block, the desired result appears at the output.

Claims

DEVICE FOR FAST FOURIER TRANSFORM, containing a residue determination unit, n arithmetic blocks (where n is the number of modules), a coefficient storage unit, a result recovery unit, the input data bus of the device being connected to the input of the residue determination unit, the i-th output of which is connected to the information the input of the i-th arithmetic block (where i = 1,2,..., n - 2), (n - 1) and n-th outputs of the residue determination block are connected to the first information inputs of the (n - 1) -th and n-th arithmetic blocks, respectively, control inputs from the first p of the nth arithmetic blocks are connected respectively to the first through nth outputs of the coefficient storage block, the first to nth address inputs of which are connected to the address bus of the device coefficients, the outputs of the (n - 1) th and nth arithmetic blocks are connected to the first and second information inputs of the result recovery unit, the output of which is connected to the output data bus of the device, characterized in that the device additionally contains a multiplexer and an (n + 1) -th arithmetic unit, the control input of which is connected to the (n + 1) -th output blo coefficient storage, the (n + 1) -th input of which is connected to the first control bus of the device, the second and third control buses of which are connected to the control inputs of the result recovery unit and multiplexer, respectively, the outputs of the i-th and (n + 1) -th arithmetic blocks are connected respectively to the i-th and (n - 2) -th information inputs of the multiplexer, the output of which is connected to the second information inputs of the (n - 1) -th and n-th arithmetic blocks and the information input of the (n + 1) -th arithmetic block.