US20130094313A1 - Collision prevention in a dual port memory - Google Patents
- Publication number
- US20130094313A1 (application US 13/275,920; US201113275920A)
- Authority
- US
- United States
- Prior art keywords
- wordline
- read
- given row
- signal
- write
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C7/00—Arrangements for writing information into, or reading information out from, a digital store
- G11C7/10—Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
- G11C7/1075—Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers for multiport memories each having random access ports and serial ports, e.g. video RAM
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Static Random-Access Memory (AREA)
Description
- 1. Technical Field
- This disclosure relates to memories, and more particularly to collision prevention in a memory.
- 2. Description of the Related Art
- Many devices use memory arrays that include dual port bit cells in which the bit cells have separate read and write wordlines and separate read and write bitlines to allow simultaneous read and write access through the read and write ports, as long as the read is directed to a different address than the write. This is particularly true in memories used as register files. However, when the read and write addresses are the same, a collision would result if both the read and the write are allowed to proceed. These collisions can be problematic for a variety of reasons. For example, the time it takes a bit cell to recover from a write operation may increase significantly because the write operation must overwrite opposite data while that data is being read out. In addition, collisions may cause additional current drain, as well as erroneous data being read from or written to the affected bit cell.
- Accordingly, there are conventional mechanisms to prevent these collisions. One such mechanism uses comparators to detect a matching address up front. This type of address detection may require many exclusive-OR (XOR) gates and a “tree” of NAND/NOR gates to combine the many address bits into a single “collision” signal that stops a given wordline from being activated and causing slow write behavior. This conventional approach can be slow and requires many additional gates. In addition, some bit cells may be designed to withstand the contention that arises from a collision. More particularly, another conventional approach increases the size of the n-type pulldown transistors to be greater than the combined size of the wordline pass transistors. This is not considered to be an optimal approach.
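- As a rough illustration of the comparator-based scheme (not the approach disclosed here), the following Python sketch models the XOR-per-bit comparison and the combining tree behaviorally; the bit width and signal names are assumptions for illustration only.

```python
def collision_detect(read_addr: int, write_addr: int, width: int = 8) -> bool:
    """Behavioral model of comparator-based collision detection.

    Each pair of address bits feeds an XOR gate; the XOR outputs are then
    combined by a NAND/NOR tree into a single collision flag, asserted only
    when every bit pair matches.
    """
    xor_bits = [((read_addr >> i) & 1) ^ ((write_addr >> i) & 1) for i in range(width)]
    return not any(xor_bits)  # all XORs zero -> addresses identical -> collision

# The collision flag would then be used to block the read wordline, e.g.:
# read_allowed = read_requested and not collision_detect(rd_addr, wr_addr)
```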
- Various embodiments of a mechanism for preventing collisions in a dual port memory are disclosed. Broadly speaking, a mechanism for preventing collisions in a dual port memory is contemplated in which the read wordline signal for a given row may be selectively inhibited based upon address information that is indicative of whether a write operation will be performed to the given row. More particularly, a dual port memory includes read and write wordlines for each row of bit cells in the memory array to accommodate simultaneous reads and writes to different rows. Rather than performing an address comparison between a read and a write and then waiting on the result, the read wordline signal for a given row may be inhibited in response to decoding a write address to the given row. The read wordline signal may be inhibited irrespective of whether a read operation will actually be performed.
- In one embodiment, a memory includes dual port bit cells arranged in rows and columns and each bit cell stores a data bit. The memory also includes a wordline unit that may provide a respective write wordline signal and a respective read wordline signal to each row of bit cells. The wordline unit may also selectively inhibit the read wordline signal for a given row based upon address information that is indicative of whether a write operation will be performed to the given row.
- In one specific implementation, the wordline unit may inhibit the read wordline signal for a given row in response to the address information indicating that a write operation will be performed to the given row irrespective of whether a read operation will actually be performed.
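- As a rough behavioral illustration of this row-level inhibit (a simplified model with illustrative signal names, not the patented circuit), the following Python sketch shows a read wordline being suppressed purely because the write decode selects the same row, with no read/write address comparison involved.

```python
def wordlines(num_rows: int, rd_row: int, wr_row: int, rd_en: bool, wr_en: bool):
    """Per-row read/write wordline generation with read inhibit on a write hit."""
    rwl = [False] * num_rows
    wwl = [False] * num_rows
    for row in range(num_rows):
        write_hits_row = wr_en and (wr_row == row)
        wwl[row] = write_hits_row
        # The read wordline is inhibited whenever this row is being written,
        # irrespective of whether a read to it was actually requested.
        rwl[row] = rd_en and (rd_row == row) and not write_hits_row
    return rwl, wwl

# Example: a simultaneous read and write to row 2 asserts wwl[2] but keeps rwl[2] low.
# rwl, wwl = wordlines(4, rd_row=2, wr_row=2, rd_en=True, wr_en=True)
```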
- FIG. 1 is a block diagram of one embodiment of a processor.
- FIG. 2 is a block diagram of one embodiment of a memory including a collision avoidance mechanism.
- FIG. 3 is a block diagram of one embodiment of the collision avoidance mechanism of FIG. 2.
- FIG. 4 is a block diagram of one embodiment of a system.
- Specific embodiments are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description are not intended to limit the claims to the particular embodiments disclosed, even where only a single embodiment is described with respect to a particular feature. On the contrary, the intention is to cover all modifications, equivalents and alternatives that would be apparent to a person skilled in the art having the benefit of this disclosure. Examples of features provided in the disclosure are intended to be illustrative rather than restrictive unless stated otherwise.
- As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include,” “including,” and “includes” mean including, but not limited to.
- Various units, circuits, or other components may be described as “configured to” perform a task or tasks. In such contexts, “configured to” is a broad recitation of structure generally meaning “having circuitry that” performs the task or tasks during operation. As such, the unit/circuit/component can be configured to perform the task even when the unit/circuit/component is not currently on. In general, the circuitry that forms the structure corresponding to “configured to” may include hardware circuits. Similarly, various units/circuits/components may be described as performing a task or tasks, for convenience in the description. Such descriptions should be interpreted as including the phrase “configured to.” Reciting a unit/circuit/component that is configured to perform one or more tasks is expressly intended not to invoke 35 U.S.C. §112, paragraph six, interpretation for that unit/circuit/component.
- The scope of the present disclosure includes any feature or combination of features disclosed herein (either explicitly or implicitly), or any generalization thereof, whether or not it mitigates any or all of the problems addressed herein. Accordingly, new claims may be formulated during prosecution of this application (or an application claiming priority thereto) to any such combination of features. In particular, with reference to the appended claims, features from dependent claims may be combined with those of the independent claims and features from respective independent claims may be combined in any appropriate manner and not merely in the specific combinations enumerated in the appended claims.
- Turning now to FIG. 1, a block diagram of one embodiment of a processor is shown. The processor 10 includes an instruction cache (ICache) 14 that is coupled to a fetch control unit 12. The processor also includes a decode unit 16 that is coupled to the fetch control unit 12 and to a register file 22, which is in turn coupled to an execution core 24. The execution core 24 is coupled to an interface unit 34, which may be coupled to an external interface of the processor 10, as desired.
- In one embodiment, the fetch control unit 12 is configured to provide a program counter address (PC) for fetching from the instruction cache 14. The instruction cache 14 is configured to provide instructions (with PCs) back to the fetch control unit 12 to be fed into the decode unit 16. The decode unit 16 may generally be configured to decode the instructions into instruction operations (ops) and to provide the decoded ops to the execution core 24. The decode unit 16 may also provide decoded operands to the register file 22, which may provide operands to the execution core 24. The decode unit 16 may also be configured to schedule each instruction and provide the correct register values for the execution core 24 to use.
- The register file 22 may also receive results from the execution core 24 that are to be written into the register file 22. Accordingly, the register file 22 may generally include any set of registers usable to store operands and results. Thus, the register file 22 may be implemented using a variety of storage types such as flip-flop type storages, random access memory (RAM), and the like. In one embodiment, the register file 22 may be implemented using a dual port static RAM (SRAM). As mentioned above, in such embodiments it may be important to prevent simultaneous writes and reads to the same bit cells of a dual port memory. As described in greater detail below in conjunction with the description of FIG. 2 and FIG. 3, a collision avoidance mechanism may prevent these occurrences by inhibiting a read wordline signal from accessing a row of bit cells during a write operation to that row.
- The instruction cache 14 may include control logic and memory arrays. The memory arrays may be used to store the cached instructions to be executed by the processor 10 and the associated cache tags. Instruction cache 14 may have any capacity and construction (e.g. direct mapped, set associative, fully associative, etc.). Instruction cache 14 may include any cache line size.
- It is contemplated that the processor 10 may implement any suitable instruction set architecture (ISA), such as ARM™, PowerPC™, or x86 ISAs, combinations thereof, etc. In some embodiments, the processor 10 may implement an address translation scheme in which one or more virtual address spaces are made visible to executing software. Memory accesses within the virtual address space are translated to a physical address space corresponding to the actual physical memory available to the system, for example using a set of page tables, segments, or other virtual memory translation schemes. In embodiments that employ address translation, processor 10 may store a set of recent and/or frequently used virtual-to-physical address translations in a translation lookaside buffer (TLB), such as instruction TLB (ITLB) 30.
- The execution core 24 may perform the various operations (e.g., MOV, ADD, SHIFT, LOAD, STORE, etc.) indicated by each instruction. In the illustrated embodiment, the execution core 24 includes data cache 26, which may be a cache memory for storing data to be processed by the processor 10. Like instruction cache 14, data cache 26 may have any suitable capacity, construction, or line size (e.g. direct mapped, set associative, fully associative, etc.). Moreover, data cache 26 may differ from the instruction cache 14 in any of these details. As with instruction cache 14, in some embodiments, data cache 26 may be partially or entirely addressed using physical address bits. Correspondingly, data TLB (DTLB) 32 may be provided to cache virtual-to-physical address translations for use in accessing data cache 26 in a manner similar to that described above with respect to ITLB 30. It is noted that although ITLB 30 and DTLB 32 may perform similar functions, in various embodiments they may be implemented differently. For example, they may store different numbers of translations and/or different translation information.
- Interface unit 34 may generally include the circuitry for interfacing processor 10 to other devices on the external interface. The external interface may include any type of interconnect (e.g. bus, packet, etc.). The external interface may be an on-chip interconnect, if processor 10 is integrated with one or more other components (e.g. a system on a chip configuration). The external interface may be an off-chip interconnect to external circuitry, if processor 10 is not integrated with other components. In various embodiments, processor 10 may implement any instruction set architecture.
- Referring to FIG. 2, a block diagram of one embodiment of a register file memory is shown. The register file memory 22 includes a wordline unit 201 that is coupled to an array 203 that includes a number of bit cells (e.g., 203-214). As shown, the bit cells are arranged into rows and columns and are coupled to the wordline unit 201 by read wordlines (e.g., rwl<0>-rwl<n>) and write wordlines (e.g., wwl<0>-wwl<n>), where each row of the array (e.g., bit cells 203-206) is coupled to a respective read and write wordline pair (e.g., rwl<0> and wwl<0>). In addition, each of the bit cell columns (e.g., bit cells 203-211) is coupled to a differential read and write bitline pair (e.g., rbl<0>, wbl<0>; rblb<0>, wblb<0>).
- As shown, the bit cells, and thus the array 203, are configured as dual port bit cells since each bit cell is coupled to separate read and write wordlines, and separate read and write bitlines. Accordingly, the array 203 can accommodate a simultaneous read and write to different rows.
- In one embodiment, the wordline unit 201 is configured to receive read and write address information, and to generate the appropriate wordline signals to access the bit cells. More particularly, when a read and/or a write address is received, the address is decoded using, for example, a decoder 221 that may perform a number of partial decode operations to create a number of partial decode select signals (shown in FIG. 3). These partial decode select signals may be used to generate the appropriate wordlines to access the row of bit cells that corresponds to the received address.
- As mentioned above, read and write collisions may occur in dual port memories unless precautions are taken to avoid or prevent them. As described in greater detail below in conjunction with the description of FIG. 3, the wordline unit 201 is configured to avoid or prevent a collision between a read and a write to the same row by inhibiting the read wordline signal to a given row during a write to that row.
- It is noted that in the embodiment shown in FIG. 2, the memory corresponds to the register file 22 of FIG. 1. However, it is contemplated that in other embodiments, the memory may be any type of memory that is implemented in a dual port configuration.
- Turning to FIG. 3, a block diagram of one embodiment of the wordline unit including a collision avoidance mechanism of FIG. 2 is shown. The wordline unit 201 includes a write wordline circuit 303 that includes transistors T1 through T5, and an inverter I2, which are connected together to form a dynamic logic gate, which is also referred to as a clocked precharge gate. In addition, the wordline unit 201 includes a write wordline enable circuit that includes the NAND gate N1 that is coupled to the input of the inverter I1. Similarly, the wordline unit 201 includes a read wordline circuit 305 that includes transistors T6 through T10, and an inverter I3, which are also connected together to form a dynamic logic gate. Further, the wordline unit 201 includes write detection logic 301 that includes a read wordline enable circuit that includes a NAND gate N3 that is coupled to the input of the inverter I4, as well as a NAND gate N2, which is coupled to one input of the NAND gate N3. It is noted that the embodiment of the wordline unit 201 shown in FIG. 3 represents a portion of the wordline unit 201. More particularly, the circuit shown in FIG. 3 corresponds to one wordline circuit pair as denoted by the wwl<n> and rwl<n>, where n may be any whole number. It is further noted that there may be at least one such circuit for each row of bit cells of the register file 22.
- As described above in conjunction with the description of FIG. 2, the read and write address information received by the wordline unit 201 is decoded by the decoder 221 into partial decode select signals. Accordingly, in one embodiment, the partial decode select signals correspond to the wpreda, wpredb, rpreda, and rpredb signals shown in FIG. 3. More particularly, the decoder 221 may generate a pair of partial decode select signals for each wordline circuit for each row. The wordline circuits use these partial decode select signals to assert the respective wordline signals of the respective array rows.
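- The exact split of the address bits is not specified above; as a hedged illustration only, the sketch below assumes the row address is divided into two fields that are one-hot decoded separately, so that a row's wordline circuit is selected only when its preda and predb lines (the wpreda/wpredb or rpreda/rpredb pair of FIG. 3) are both high.

```python
def partial_decode(row_addr: int, low_bits: int = 2, high_bits: int = 2):
    """Hypothetical partial decode: split a row address into two fields and
    one-hot decode each field into its own group of select lines."""
    low = row_addr & ((1 << low_bits) - 1)
    high = (row_addr >> low_bits) & ((1 << high_bits) - 1)
    preda = [i == low for i in range(1 << low_bits)]    # e.g., the wpreda group
    predb = [i == high for i in range(1 << high_bits)]  # e.g., the wpredb group
    return preda, predb

# The wordline circuit of the row with field values (lo, hi) is selected only
# when preda[lo] and predb[hi] are both True.
```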
- Accordingly, each of the write wordline circuit 303 and the read wordline circuit 305 is configured to generate a respective wordline signal to the row to which it is connected. More particularly, in the write wordline circuit the transistors T1 and T5 and the inverter I2 form a precharge and hold circuit, while the transistors T2 and T3 form an n-tree logic circuit and the transistor T4 serves as the evaluate device. When there is an asserted enable signal at the top input of the NAND gate N1, the Wr CLK signal is passed from the bottom input of the NAND gate N1, through the inverter I1, to the gates of transistors T1 and T4. When Wr CLK is at a logic value of zero, the transistor T4 is cut off, and the transistor T1 conducts, charging the input of the inverter I2, which drives the wwl<n> wordline low to a logic value of zero. In this state, since there is no path from circuit ground to the inverter I2, the output wordline wwl<n> stays low and the corresponding row of bit cells is not being written. The transistor T5 is a weak pull-up holding transistor, which maintains the logic value of one at the inverter I2 input as long as that node is not discharged to a value of zero via the stronger T2-T4 transistor tree. When Wr CLK transitions to a logic value of one while the Wr_en signal is also at a logic value of one, the transistor T1 is turned off, and the transistor T4 conducts. If either of the signals wpreda and wpredb is at a logic value of zero, there is no path from circuit ground to the inverter I2, and the output wordline wwl<n> stays low. If, however, wpreda, wpredb, and Wr_en are all at a logic value of one when Wr CLK transitions to a logic value of one, then a path from circuit ground to the inverter I2 exists, and the input to the inverter I2 drains to circuit ground, causing the output wordline wwl<n> to transition to a logic value of one, which causes the data on the write bitlines to be written to the cells of the corresponding row.
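- Reducing the dynamic gate just described to a per-cycle boolean evaluation gives the sketch below; this is a simplification, not the circuit itself, and the precharge, keeper (T5), and pulldown timing are abstracted away so the model only captures when wwl<n> ends up asserted.

```python
def write_wordline(wr_clk: bool, wr_en: bool, wpreda: bool, wpredb: bool) -> bool:
    """Boolean abstraction of the write wordline dynamic gate.

    While Wr CLK is low the node feeding inverter I2 is precharged and wwl<n>
    stays low; when Wr CLK is high, the T2/T3/T4 pulldown path discharges that
    node (asserting wwl<n>) only if wpreda, wpredb, and Wr_en are all high.
    """
    return wr_clk and wr_en and wpreda and wpredb
```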
- The read wordline circuit 305 operates similarly to the write wordline circuit 303 in that the read wordline circuit 305 is precharged, and rwl<n> stays low and the corresponding row of bit cells is not being read whenever the output of the inverter I4 is low. However, for the read wordline rwl<n> to be asserted, both of the rpreda and rpredb signals must be at a logic value of one, and the wpreda, wpredb, and Wr_en signals cannot all be at a logic value of one. More particularly, similar to the enable logic of the write wordline circuit 303, the enable logic of the read wordline circuit 305 also includes a NAND gate (e.g., N3) coupled to an inverter (e.g., I4), with the Rd CLK input on the bottom input of the NAND gate N3. However, to allow the Rd CLK to pass through to the transistors T10 and T7, the masking signal must be at a logic value of one. By inspection, it can be seen that the NAND gate N2 of the write detection logic 301 causes the masking signal to be at a logic value of zero whenever the wpreda, wpredb, and Wr_en signals are all at a logic value of one. Accordingly, in one embodiment, whenever a given row is being written, all of the wpreda, wpredb, and Wr_en signals will be at a logic value of one, which effectively inhibits or disables the read wordline circuit 305 from asserting the rwl<n> signal (even if the rpreda and rpredb signals are at a logic value of one). By disabling the read wordline circuit 305, the row cannot be simultaneously read and written, thereby avoiding a collision.
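- The read path can be abstracted in the same way, with the write detection NAND (N2) generating the masking signal; assuming the same boolean-per-cycle simplification:

```python
def read_wordline(rd_clk: bool, rpreda: bool, rpredb: bool,
                  wpreda: bool, wpredb: bool, wr_en: bool) -> bool:
    """Boolean abstraction of the read wordline path with write-detection masking."""
    mask = not (wpreda and wpredb and wr_en)  # N2: low whenever this row is being written
    clk_pass = rd_clk and mask                # N3 + I4: Rd CLK reaches T7/T10 only while mask is high
    return clk_pass and rpreda and rpredb     # pulldown evaluation asserts rwl<n>

# With wpreda = wpredb = wr_en = True the function returns False even when rpreda
# and rpredb are both True, which is the collision-inhibit case described above.
```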
- In various embodiments, since the read is not actually performed, the processor 10 may simply discard any read data on the read bitlines, or simply retry the read later. In one embodiment, the processor 10 may include system logic that can detect a collision by comparing the actual read and write addresses. However, in such an embodiment, since the comparison is slow, the write may be performed and a read to the same address is inhibited at the wordline unit 201 as described above. If the comparison later indicates that there was no collision, no time was wasted since the write has already completed. Similarly, if the comparison indicates that there was a collision, since the write has already completed, the read operation may be retried and the data read will be the data that was just written. Accordingly, this collision avoidance mechanism may add only a small number of transistors with a minimal impact on speed.
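- A hedged sketch of that system-level flow is shown below, with the memory modeled as a dictionary and the late full-address comparison folded into one function; the structure and names are illustrative assumptions, not taken from the patent.

```python
def access_cycle(mem: dict, rd_addr: int, wr_addr: int, wr_data,
                 rd_req: bool, wr_req: bool):
    """Illustrative flow: the write is never stalled; the read result is kept
    only if the (slower) full-address comparison finds no collision, and the
    read is retried otherwise."""
    if wr_req:
        mem[wr_addr] = wr_data                           # write completes immediately
    if not rd_req:
        return None
    inhibited = wr_req and (rd_addr == wr_addr)          # fast per-row inhibit at the wordline unit
    bitline_data = None if inhibited else mem.get(rd_addr)  # nothing valid is read when inhibited
    if inhibited:                                        # late compare confirms the collision
        return mem.get(rd_addr)                          # retried read returns the data just written
    return bitline_data                                  # no collision: original read data is valid
```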
- Referring to FIG. 4, a block diagram of one embodiment of a system is shown. The system 400 includes at least one instance of an integrated circuit 410 coupled to one or more peripherals 407 and an external system memory 405. The system 400 also includes a power supply 401 that may provide one or more supply voltages to the integrated circuit 410 as well as one or more supply voltages to the memory 405 and/or the peripherals 407.
- In one embodiment, the integrated circuit 410 may be a system on a chip including one or more instances of a processor and various other circuitry such as a memory controller, video and/or audio processing circuitry, on-chip peripherals and/or peripheral interfaces to couple to off-chip peripherals, etc. More particularly, the integrated circuit 410 may include one or more instances of a processor such as processor 10 from FIG. 1. As such, the integrated circuit 410 may include one or more instances of a register file memory such as register file memory 22 of FIG. 1. Accordingly, embodiments that include the register file memory 22 include the collision avoidance mechanism described above.
- The peripherals 407 may include any desired circuitry, depending on the type of system. For example, in one embodiment, the system 400 may be included in a mobile device (e.g., personal digital assistant (PDA), smart phone, etc.) and the peripherals 407 may include devices for various types of wireless communication, such as WiFi, Bluetooth, cellular, global positioning system, etc. The peripherals 407 may also include additional storage, including various types of RAM storage, solid-state storage, or disk storage. As such, the peripherals 407 may also include SRAM that includes the collision avoidance mechanism described above. The peripherals 407 may include user interface devices such as a display screen, including touch display screens or multitouch display screens, keyboard or other input devices, microphones, speakers, etc. In other embodiments, the system 400 may be included in any type of computing system (e.g. desktop personal computer, laptop, workstation, net top, etc.).
- The external system memory 405 may include any type of memory. For example, the external memory 405 may be in the DRAM family such as synchronous DRAM (SDRAM), double data rate (DDR, DDR2, DDR3, etc.), or any low power version thereof. However, the external memory 405 may also be implemented in static RAM (SRAM) or other types of RAM, etc.
- Although the embodiments above have been described in considerable detail, numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/275,920 US8432756B1 (en) | 2011-10-18 | 2011-10-18 | Collision prevention in a dual port memory |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/275,920 US8432756B1 (en) | 2011-10-18 | 2011-10-18 | Collision prevention in a dual port memory |
Publications (2)
Publication Number | Publication Date |
---|---|
US20130094313A1 (en) | 2013-04-18 |
US8432756B1 US8432756B1 (en) | 2013-04-30 |
Family
ID=48085899
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/275,920 Active US8432756B1 (en) | 2011-10-18 | 2011-10-18 | Collision prevention in a dual port memory |
Country Status (1)
Country | Link |
---|---|
US (1) | US8432756B1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9666269B2 (en) | 2015-02-13 | 2017-05-30 | Qualcomm Incorporated | Collision detection systems for detecting read-write collisions in memory systems after word line activation, and related systems and methods |
US10447461B2 (en) * | 2015-12-01 | 2019-10-15 | Infineon Technologies Austria Ag | Accessing data via different clocks |
US10726909B1 (en) | 2019-03-20 | 2020-07-28 | Marvell International Ltd. | Multi-port memory arrays with integrated wordline coupling mitigation structures and method |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2965043B2 (en) | 1990-04-10 | 1999-10-18 | 三菱電機株式会社 | Dual port memory |
JPH05266654A (en) * | 1992-03-17 | 1993-10-15 | Mitsubishi Electric Corp | Multiport memory |
US5781480A (en) | 1997-07-29 | 1998-07-14 | Motorola, Inc. | Pipelined dual port integrated circuit memory |
US6188633B1 (en) * | 1998-04-28 | 2001-02-13 | Hewlett-Packard Company | Multi-port computer register file having shared word lines for read and write ports and storage elements that power down or enter a high-impedance state during write operations |
US7630272B2 (en) | 2007-02-19 | 2009-12-08 | Freescale Semiconductor, Inc. | Multiple port memory with prioritized word line driver and method thereof |
KR101475346B1 (en) * | 2008-07-02 | 2014-12-23 | 삼성전자주식회사 | Develop level clipping circuit for clipping a develop level of a bitline pair, column path circuit including the same, and multi-port semiconductor memory device |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160141020A1 (en) * | 2014-11-18 | 2016-05-19 | Mediatek Inc. | Static random access memory free from write disturb and testing method thereof |
WO2016195881A1 (en) * | 2015-06-04 | 2016-12-08 | Intel Corporation | Read and write apparatus and method for a dual port memory |
US9812189B2 (en) | 2015-06-04 | 2017-11-07 | Intel Corporation | Read and write apparatus and method for a dual port memory |
US20220283952A1 (en) * | 2020-03-12 | 2022-09-08 | Micron Technology, Inc. | Memory access collision management on a shared wordline |
US11698864B2 (en) * | 2020-03-12 | 2023-07-11 | Micron Technology, Inc. | Memory access collision management on a shared wordline |
Also Published As
Publication number | Publication date |
---|---|
US8432756B1 (en) | 2013-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8553481B2 (en) | Sense amplifier latch with integrated test data multiplexer | |
JP6230189B2 (en) | Multiport memory with match address and data line control | |
US9158683B2 (en) | Multiport memory emulation using single-port memory devices | |
US8432756B1 (en) | Collision prevention in a dual port memory | |
US7817492B2 (en) | Memory device using SRAM circuit | |
US8988107B2 (en) | Integrated circuit including pulse control logic having shared gating control | |
US8649240B2 (en) | Mechanism for peak power management in a memory | |
US8493811B2 (en) | Memory having asynchronous read with fast read output | |
US8472267B2 (en) | Late-select, address-dependent sense amplifier | |
US8837226B2 (en) | Memory including a reduced leakage wordline driver | |
US20100309731A1 (en) | Keeperless fully complementary static selection circuit | |
US7843760B2 (en) | Interface circuit and method for coupling between a memory device and processing circuitry | |
CN103928049B (en) | Multiport memory with match address control |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: APPLE INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SULLIVAN, STEVEN C.;MILLER, WILLIAM V.;REEL/FRAME:027139/0835 Effective date: 20111018 |
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
FPAY | Fee payment |
Year of fee payment: 4 |
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 12 |