WO2013006030A1 - Système et procédé de distribution parallèle de bits avec un réseau bi-delta - Google Patents
Système et procédé de distribution parallèle de bits avec un réseau bi-delta Download PDFInfo
- Publication number
- WO2013006030A1 WO2013006030A1 PCT/MY2012/000141 MY2012000141W WO2013006030A1 WO 2013006030 A1 WO2013006030 A1 WO 2013006030A1 MY 2012000141 W MY2012000141 W MY 2012000141W WO 2013006030 A1 WO2013006030 A1 WO 2013006030A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- network
- control
- delta network
- delta
- bits
- Prior art date
Links
- 238000009826 distribution Methods 0.000 title claims abstract description 21
- 238000000034 method Methods 0.000 title claims abstract description 19
- 230000002441 reversible effect Effects 0.000 claims abstract description 17
- 238000011144 upstream manufacturing Methods 0.000 claims description 5
- 238000000151 deposition Methods 0.000 claims description 4
- 230000003247 decreasing effect Effects 0.000 claims description 3
- 238000010586 diagram Methods 0.000 description 5
- 240000003537 Ficus benghalensis Species 0.000 description 3
- 230000001174 ascending effect Effects 0.000 description 2
- 238000013478 data encryption standard Methods 0.000 description 2
- 101150034459 Parpbp gene Proteins 0.000 description 1
- 101100206695 Rattus norvegicus Thoc6 gene Proteins 0.000 description 1
- 230000002457 bidirectional effect Effects 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/06—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
- H04L9/0618—Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09C—CIPHERING OR DECIPHERING APPARATUS FOR CRYPTOGRAPHIC OR OTHER PURPOSES INVOLVING THE NEED FOR SECRECY
- G09C1/00—Apparatus or methods whereby a given sequence of signs, e.g. an intelligible text, is transformed into an unintelligible sequence of signs by transposing the signs or groups of signs or by replacing them by others according to a predetermined system
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/12—Details relating to cryptographic hardware or logic circuitry
- H04L2209/125—Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations
Definitions
- the present invention relates to bit permutation.
- the present invention relates to a system for distributing bits through a bi-dclta network.
- Bit permutation is one way of handling general bit-level processing. Bit permutation operations are therefore becoming more prevalent in the field of cryptography and baseband processing. For example, they are used in Data Encryption Standard (DES) and bit puncturing and interleaving in Forward Error CotTection (FBC).
- DES Data Encryption Standard
- FBC Forward Error CotTection
- bit puncturing requires a number of shifts, bitwise OR and bitwise AND operations, whilst in a general bit permutation require even larger number of operations. For example, to perform general 128-bit permutation, it would require 512 (i.e. 4 x 128) operations. That would be too slow for high-speed wireless broadband baseband processing.
- Extract and Parallel Deposit Instructions proposed a parallel deposit (PDEP) instruction consisting of three operand registers, which include a data bits source register, a bit mask register and a target register. Bits from lower part of the source register are distributed to locations in the result register marked with 1 -bits in a mask register and bits from upper part of the same source register to locations in the same result register marked with 0-bits in the same mask register. Relative positions of the source bits within the two target groups are retained.
- the disclosure provides the uses of complete butterfly network, controls from population count (adder) prefix circuits and left-rolate and mask circuits, and data bits marked with 0-bits are placed in normal order
- FIG. 1 illustrates a known butterfly network of an 8-bit PDRP operation. Control signals for the switches are generated using population count prefix circuit and so-called Left-Rotate and Complement (LRO ' fC) circuits.
- LRO ' fC Left-Rotate and Complement
- PERMUTATIONS USING PERMUTATION INSTRUCTIONS BASED ON MODIFIED OMEGA AND FTJP STAGES discloses permutation instructions which can be used in software executed in a programmable processor for solving permutation problems in cryptography, multimedia and other applications.
- the peraiute instructions are based on an omega-flip network comprising at least two stage. It uses single stages of omega and flip networks and controls directly from mask. The bit distribution requires generally many operations.
- PERMUTATIONS USING PERMUTATION INSTRUCTIONS RASED ON BUTTERFLY NETWORKS discloses permutation instructions which can be used in software executed in a programmable processor for solving permutation problems in cryptography, multimedia and other applications.
- the permute instructions arc based on a Bcncs network comprising two butterfl networks. Summary
- a bi-della network for distributing bits through bit distribution (BDST) instruction from an input to an output of the bi -d lta network.
- the network comprises a control delta network constituting a forward path of the bi-dclta network for receiving an n-bits data from a bitmask register; a data delta network constituting a reverse path of the bi-dclta network for receiving n-hits data from a source register; and a control generation network having a plurality of stages between the control delta network and the data delta network, wherein each stage operationally receives inputs from inputs of stage m of the control delta network and generates control signals for switches in stage m of both control and data delta networks, whereby, m represents a stage number starting from 1 to log 2 (n).
- both the control delta network and the data delta network comprises log:(n) stages respectively.
- the stages in the control delta network may be denoted in ati incremental manner from input to output direction from 1 to log 2 (n), whilst the stages in the data delta network may be denoted in a decremental manner from input to output direction from log2(n) to 1 ,
- the stages in the control generation network may further be denoted in an incremental manner from input to output direction from I to log?(n).
- each stage of the cuntrol generation network may comprise 2 m -l address groups, and each group is adapted to control n/2' n switch elements at the respective stage of the control generation network.
- a method for distributing bits through a bi-delta network with bit distribution (BDST) instruction from an input to an output comprises loading source bits into a source register; determining target bits in a result register lo move oonesponding lower pari of the source register in preserved relative bit orde by setting 1-bit markers at the corresponding bit positions in a bitmask register; determining the target bits in same result register to move the corresponding upper part of same source register in reverse relative bit order by setting 0-bit markers at the corresponding bit positions in the bilmask register; and executing the BDST instruction over the source and bitmask registers and subsequently depositing the result of the bits distribution into the result register.
- BDST bit distribution
- executing the BDST instructions may further comprise feeding n-bits data from bitmask register to a control delta network and the control generation network, wherein both the control delta network and the control generation network comprises log2(n) stages respectively, and the stages are numbered increasingly from I to log 2 (n) in the input to output direction; feeding n-bils data from source register tu the dala delta network, wherein the data delta network comprises iog 2 (n) stages, has reverse structure of the control delta network and the stages arc numbered decreasing! ⁇ ' from log 7 (n) to I in the i put to output direction; and depositing n-bit output of the data delta network in the result register.
- the stage m of the control generation network (hat acquires inputs from the corresponding stage m of the control data nelwork may further comprise dividing the control generation stage m into 2 m ⁇ address groups, wherein each group taking n/2 m" ' inputs associated with target address bits (a m .
- Tbe bi-delta network adapted to carry out parallel bit distribution.
- Tbe bi-delta network comprises a control delta, network; and bit markers that are operable to be distributed in a forward manner in the control delta network for configuring switches of the control delta network.
- User bits arc operationally distributed in a backward manner from the downstream of the control delta network, in both order and reverse-order in accordance with the bit markers, to output the distributed user bits at the upstream of the control delta network.
- a method for carrying out parallel bit distribution in a bi-delta network comprise distributing bit markers in a forward manner of a control delta network for configuring switches of the control delta network; and distributing user bits in a backward manner from the downstream of the control delta network, in both order and rcvcrsc-ordcr in accordance with the bit inarkei's, to output the distributed user bits at the upstream of the control delta nelwork.
- FIG. I illustrates a known butterfly network of an 8 bit PDEP operation
- FIG. 2 exemplifies a BDST operation in accordance with one embodime t of the present invention
- FIG. 4 illustrates a bi-dcita network in accordance with an embodiment of the prese t invention
- FIG. SA exemplifies another bi-della network in accordance with another embodiment of the present invention.
- FIG. 5B illustrates target address bit in routing I -bits in the flip network of FIG. 5 ⁇ ;
- FIG. 7 illustrates a BDST execution unit (BDEU) in accordance with one embodiment of the present invention
- FIG. 8 illustrates a block diagram of a BDEU implementation in accordance with one embodiment of the present invention.
- FIG. 9 illustrates a simplified operation of the BDST Rxeoulion unit of
- FIG. 7 in accordance with an embodiment of the present invention.
- the present invention provides a system and method for parallel bit distribution using bi-delta network which could perform arbitrary bits distribution/placement in one operation.
- Tn nne embodiment of the present invention there is provided a new instruction fur distributing bits.
- the bit distribution is herein alter referred to as "BDST".
- the BDST distributes bits in a normal order, i.e. from lower part of a source register to bit locations in a result register correspondingly marked with 1 -bits in a mask register, and in a reverse order, i.e. from upper part of the source register to bit locations in the result register correspondingly marked with 0-bits in the mask register.
- the BDST uses bi-delta network, in particular a flip network, using both directions to realize the instruction.
- FIG. 2 exemplifies a BDST operation far 8-bit data in accordance with one embodiment of the present invention.
- the BDST operation is an 8- bil operation with a sel ⁇ - routing bi-delta network.
- the BDST operation includes three register operands: a source register 202, a result register 204 and a mask register 206. The bits in the lower part, i.e. bo, bt, and b .
- the BDST operation is implemented in a network that connects to a set of 2*2 switches as shown in FrG. 3. It is understood that the same operation can be applied accordingly in other networks configurations without departing the scope of the present invention.
- the switch shown in FIG. 3 is a crossba switch with two input terminals, iO and il, and two output terminals, oO, o l. The inputs may be configured to have straight connections or cross connections to the outputs.
- ⁇ typical delta network which is also known as digit controlled network, is a type of network with a path from each input to a respective output and ull paths leading to the same output make use of the same path descriptor.
- the delta network forms a subset of the more general class of banyan networks that offers a self- routing property.
- ⁇ banyan network is a multi-stage network of interconnect crossbar switches.
- a bi-delta network herein referred, on the other hand is a type of network having a forward delta network and a reverse delta network.
- FIG. 4 illustrates a bi-delta network with 16-bit inputs and 16-bit outputs, i.e., 16x16 network, in accordance with an embodiment of the present invention.
- the dashed lines illustrate all paths with, a same path descriptor from all inputs to the same output 7.
- path descriptor from input 2 to output 7 and all other paths leading to output 7, is found lo be (I, I, I, 0). Note that the path descriptor is equivalent to the address of the output, least significant bit (Isb) first.
- FIG. 4 also illustrates "intermediate" path descriptor with the labels reversed, for example, ' ⁇ ⁇ for switch I of stage 3 suggests the partial final output address as well as the "digit controlled" nature of the network.
- the top input is labeled ' ⁇
- binury digit 2 of the target address (first 'x' from right in the label) is used to decide whether the bit from the input is passed lo input labeled ' ⁇ at switch I or to input labeled ' ⁇ ⁇ ⁇ ⁇ ' at switch 5 of stage 4.
- N*N RBN remains non-blocking if the target addresses for the inputs arc divided into two groups: one with ascending target addresses and another descending addresses, modulo N.
- H is possible to assign all 1-bils in the mask increasing target addresses from 0 onward and all 0-bits decreasing target addresses from N-l downward.
- Kannan, et al devised a method to generate the address bits in a distributed fashion with an R N network.
- address bit t associated with the 1-bits alternates between 0 and 1.
- stage m there are 2 m_1 address groups wherein the target address bits a m associated with 1-bit inputs alternating between 0 and I in the delta network.
- the address bits alternate between 1 and 0. as the addresses for 0-bits arc assigned from N- 1 down.
- the address bils are then used lo control corresponding switches 502.
- the presenL embodiment is possible to be implemented through one single circuit, instead of a dedicated circuit for 1-bits and 0-bits as proposed by Kannan, et al. using a chain of XOR gates to generate control lines tor the switch elements 502.
- FT !. 6 illustrates a circuit diagram for an 8-bit input address group of an address counting circuit in accordance with a further embodiment of the present invention.
- the circuit diagram comprises 8 inputs . , 3 XOR gales and 4 switches.
- the single circuit will count both 1-bits (from 0) and 0-bits in the inputs correctly.
- the XOR circuit in FiGL 6 is a serial implementation of prefix compulation.
- the circuit has a depth and, thus, latency, of n levels. There arc known methods on parallel prefix circuits with depth as small as log2(n).
- FIG. 7 illustrates a BDST execution unit (BDEU) 700 in accordance with one embodiment of the present invention.
- the BDST execution unit 700 is adapted to handle 128-bit BDST instructions.
- the figure depicts a top-level view of the BDEU 700.
- a 128-bit bitmask mO, consisting of 1-bit and 0-bit markers for distribution locations, and a 128-bit input data xO that is to be distributed are inputs to the BDEU 700 while a completely separated 128-biL bitmask m7 (i.e., all I -bit markers are in the less significant bits in order and all the 0-bit markers arc in the more significant bits in reverse order) and a 128-bit distributed data x.7 are outputs from the BDEU 801).
- the BDEU 700 reads from a 128-bit register file (not shown) and writes the outputs to the same 128-bit register tile. It is also possible that the BDEU 700 reads and writes to the memory instead.
- a person skilled in the art should understand thai reading of inputs and writing of outputs from a computing unit in a processor are performed at different stages in the processor pipeline.
- control stage 7 of the control flip network becomes superfluous. By removing the final stage, a number of switches in other stages in the control flip network may also become unnecessary. In particular, last switch element associated with each address group may be removed.
- PrG. 9 illustrates a simplified operation of the RDST Rxecutioii unit 700 of FIG. 7 in accordance with an embodiment of the present invention.
- the bi-delta network can be implemented without using memory elements and bidirectional switches by having two unidirectional delta networks one in the forward direction for routing control bits mid the other in the backward direufiun for routing user bits, as required by a typical ASIC implementation.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
L'invention concerne un réseau bi-delta de distribution de bits par instruction de distribution de bits (BDST) d'une entrée à une sortie du réseau bi-delta. Le réseau comprend un réseau delta de commande constituant un trajet avant d'un réseau delta pour recevoir n bits de données à partir d'un registre de masque de bits; un réseau delta de données constituant un trajet inverse dudit réseau delta pour n-bits de données d'un registre de source; et une pluralité d'étages de génération de commandes entre le réseau delta de commande et le réseau delta de données, chaque étage recevant de manière fonctionnelle des entrées à partir des entrées de l'étage m du réseau delta de commande, et émettant des signaux de commande pour les commutateurs de l'étage m des deux réseaux, m représentant un numéro d'étage compris entre 1 et log2 (n). L'invention concerne également un procédé de distribution de bits.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MYPI2011003193 | 2011-07-06 | ||
MYPI2011003193 | 2011-07-06 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013006030A1 true WO2013006030A1 (fr) | 2013-01-10 |
Family
ID=46801614
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/MY2012/000141 WO2013006030A1 (fr) | 2011-07-06 | 2012-06-22 | Système et procédé de distribution parallèle de bits avec un réseau bi-delta |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2013006030A1 (fr) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4752777A (en) * | 1985-03-18 | 1988-06-21 | International Business Machines Corporation | Delta network of a cross-point switch |
WO2001089131A2 (fr) * | 2000-05-05 | 2001-11-22 | Lee Ruby B | Procede et systeme pour effectuer des permutations au moyen d'instructions de permutation basees sur un etage omega et un etage commute modifies |
US6922472B2 (en) | 2000-05-05 | 2005-07-26 | Teleputers, Llc | Method and system for performing permutations using permutation instructions based on butterfly networks |
US20090138534A1 (en) * | 2007-05-23 | 2009-05-28 | The Trustees Of Princeton University | Microprocessor Shifter Circuits Utilizing Butterfly and Inverse Butterfly Routing Circuits, and Control Circuits Therefor |
US20090187746A1 (en) * | 2008-01-22 | 2009-07-23 | Arm Limited | Apparatus and method for performing permutation operations on data |
-
2012
- 2012-06-22 WO PCT/MY2012/000141 patent/WO2013006030A1/fr active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4752777A (en) * | 1985-03-18 | 1988-06-21 | International Business Machines Corporation | Delta network of a cross-point switch |
WO2001089131A2 (fr) * | 2000-05-05 | 2001-11-22 | Lee Ruby B | Procede et systeme pour effectuer des permutations au moyen d'instructions de permutation basees sur un etage omega et un etage commute modifies |
US6922472B2 (en) | 2000-05-05 | 2005-07-26 | Teleputers, Llc | Method and system for performing permutations using permutation instructions based on butterfly networks |
US6952478B2 (en) | 2000-05-05 | 2005-10-04 | Teleputers, Llc | Method and system for performing permutations using permutation instructions based on modified omega and flip stages |
US20090138534A1 (en) * | 2007-05-23 | 2009-05-28 | The Trustees Of Princeton University | Microprocessor Shifter Circuits Utilizing Butterfly and Inverse Butterfly Routing Circuits, and Control Circuits Therefor |
US20090187746A1 (en) * | 2008-01-22 | 2009-07-23 | Arm Limited | Apparatus and method for performing permutation operations on data |
Non-Patent Citations (2)
Title |
---|
DIAS DANIEL: "Analysis and simulation of buffered Delta Networks", IEEE TRANSACTION C-30 VOL 20, 1 April 1981 (1981-04-01), pages 273 - 282, XP055044115, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/ielx5/12/35202/01675775.pdf?tp=&arnumber=1675775&isnumber=35202> [retrieved on 20121113] * |
HILEWITZ; LEE, FAST BIT COMPRESSION AND EXPANSION WITH PARALLEL EXTRACT AND PARALLEL DEPOSIT INSTRUCTIONS, 2006 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7332300B2 (ja) | ハードウェアオフロードを有するアクセラレートされたquicパケット処理のための技術 | |
US6922472B2 (en) | Method and system for performing permutations using permutation instructions based on butterfly networks | |
US6952478B2 (en) | Method and system for performing permutations using permutation instructions based on modified omega and flip stages | |
US7142669B2 (en) | Circuit for generating hash values | |
JP4991223B2 (ja) | データ処理装置 | |
TWI333615B (en) | Lookup table addressing system and method | |
US20090019269A1 (en) | Methods and Apparatus for a Bit Rake Instruction | |
EP2235622A1 (fr) | Dispositif et procédé pour mettre en uvre des opérations de permutation sur des données | |
EP2359234A2 (fr) | Procédé et appareil permettant de réaliser des opérations sur un ensemble redondant de disques indépendants (raid) | |
CN105103454A (zh) | 模块化且可扩展的循环冗余校验计算电路 | |
TW201342220A (zh) | 用以執行groestl雜湊法之指令 | |
WO2014071658A1 (fr) | Procédé et dispositif permettant de générer un code crc imbriqué utilisé pour le contrôle des erreurs dans la transmission de données | |
CN104704780A (zh) | 减小复杂性的散列 | |
CN108183790B (zh) | 一种aes加密装置、芯片及系统 | |
US8775750B2 (en) | Interleaver with parallel address queue arbitration dependent on which queues are empty | |
CN116266106A (zh) | 除法和模运算 | |
US9270460B2 (en) | Instructions to perform JH cryptographic hashing in a 256 bit data path | |
WO2013006030A1 (fr) | Système et procédé de distribution parallèle de bits avec un réseau bi-delta | |
US20140369499A1 (en) | Cryptographic device, cryptographic processing method, and cryptographic processing program | |
WO2019024607A1 (fr) | Procédé de codage sécurisé, et procédé et dispositif de décodage | |
EP1766854A1 (fr) | Appareil et procede pour realiser un condense de md5 | |
ES2653951T3 (es) | Procesador digital de señales y método para direccionar una memoria en un procesador digital de señales | |
Purwita et al. | Optimized 8-level turbo encoder algorithm and VLSI architecture for LTE | |
Cabral et al. | Implementation of the SHA-3 family using AVX512 instructions | |
WO2013009162A1 (fr) | Procédé de transmission de signaux en vue de réaliser une permutation de bits |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12756275 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 12756275 Country of ref document: EP Kind code of ref document: A1 |